ham-fisted.api

Fast mutable and immutable associative data structures based on bitmap trie hashmaps. Mutable pathways implement the java.util.Map or Set interfaces including in-place update features such as compute or computeIfPresent.

Mutable maps or sets can be turned into their immutable counterparts via the Clojure persistent! call. This allows working in a mutable space for convenience and performance, then switching to an immutable pathway when necessary. Note: after calling persistent!, never mutate the original map or set through a retained reference, as this breaks the contract of immutability. Immutable data structures also support conversion to transient via transient.
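A minimal sketch of the mutable-then-persistent workflow, assuming the namespace is aliased as hamf and using mut-map from this API. The BiFunction is built with reify so the sketch does not rely on any automatic function conversion; the names counts and m are illustrative only.

```clojure
(require '[ham-fisted.api :as hamf])
(import '[java.util Map]
        '[java.util.function BiFunction])

(def counts
  (let [^Map m (hamf/mut-map)]
    ;; Plain java.util.Map mutation while building.
    (.put m :a 1)
    ;; In-place update through Map.compute; v is nil when the key is absent.
    (.compute m :a (reify BiFunction
                     (apply [_ k v] (inc (long (or v 0))))))
    ;; Freeze the result. m must not be mutated after this point.
    (persistent! m)))

(get counts :a) ;; expected to be 2
```

The mutable reference never escapes the let, which is one simple way to honor the no-mutation-after-persistent! rule.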

Map keysets (.keySet) are full PersistentHashSets of keys.

Maps and sets support metadata, but setting the metadata on a mutable object returns a new mutable object that shares the backing store, which can lead to aliasing issues. Metadata is transferred to the persistent versions of the mutable/transient objects upon persistent!.

Very fast versions of union, difference and intersection are provided for maps and sets. The map versions of union and intersection (map-union and map-intersection) require an extra argument, a java.util.function.BiFunction or an IFn taking 2 arguments, to merge the left and right sides into the final map. These implementations of union, difference, and intersection are the fastest implementations of these operations we know of on the JVM.

A fast value update pathway is also provided, enabling quick updates of all the values in a given map. Additionally, a new map primitive is provided:

  • mapmap - allows transforming a given map into a new map quickly by mapping across all the entries.

Unlike the standard Java objects, mutation-via-iterator is not supported.

->collection

(->collection item)

Ensure item is an implementation of java.util.Collection.

->random-access

(->random-access item)

Ensure item is derived from java.util.List and java.util.RandomAccess and thus supports constant time random addressing.

->reducible

(->reducible item)

Ensure item either implements IReduceInit or java.util.Collection. For arrays this will return an object that has a much more efficient reduction pathway than the base Clojure reducer.

add-all!

(add-all! l1 l2)

Add all items from l2 to l1. l1 is expected to be a java.util.List implementation. Returns l1.

apply-concat

(apply-concat args)

Faster lazy noncaching version of (apply concat)

apply-concatv

(apply-concatv data)

argsort

(argsort comp coll)(argsort coll)

Sort a collection of data returning an array of indexes. The collection must be random access and the return value is an integer array of indexes which will read the input data in sorted order. Faster implementations are provided when the collection is an integer, long, or double array. See also reindex.
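A small sketch of how argsort and reindex fit together, assuming the hamf alias. The vector data and the var names are illustrative; the results are converted with vec only to make them easy to inspect.

```clojure
(require '[ham-fisted.api :as hamf])

(def data [30 10 20])

;; Indexes that read data in ascending order.
(def idx (hamf/argsort data))

;; reindex applies the permutation, yielding the values in sorted order.
(def sorted (hamf/reindex data idx))

(vec idx)    ;; expected [1 2 0]
(vec sorted) ;; expected [10 20 30]
```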

array-list

(array-list data)(array-list)

Create an implementation of java.util.ArrayList.

assoc!

(assoc! obj k v)

assoc! that works on transient collections, implementations of java.util.Map and RandomAccess java.util.List implementations. Be sure to keep track of return value as some implementations return a different return value than the first argument.

boolean-array

(boolean-array)(boolean-array data)

boolean-array-list

(boolean-array-list)(boolean-array-list data)

byte-array

(byte-array)(byte-array data)

byte-array-list

(byte-array-list)(byte-array-list data)

char-array

(char-array)(char-array data)

char-array-list

(char-array-list)(char-array-list data)

clear!

(clear! map-or-coll)

Mutably clear a map, set, list or implementation of java.util.Collection.

clear-memoized-fn!

(clear-memoized-fn! memoized-fn)

Clear a memoized function backing store.

concata

(concata)(concata v1)(concata v1 v2)(concata v1 v2 & args)

Non-lazily concat a set of items returning an object array. This always returns an object array and may return an empty array, whereas concat may return nil.

concatv

(concatv)(concatv v1)(concatv v1 v2)(concatv v1 v2 & args)

non-lazily concat a set of items returning a persistent vector.

conj!

(conj! obj val)

conj! that works on transient collections, implementations of java.util.Set and RandomAccess java.util.List implementations. Be sure to keep track of return value as some implementations return a different return value than the first argument.

constant-count

(constant-count data)

Constant time count. Returns nil if input doesn't have a constant time count.

constant-countable?

(constant-countable? data)

Return true if data has a constant time count.

custom-counted-ireduce

macro

(custom-counted-ireduce n-elems rfn acc & code)

Custom implementation of IReduceInit and nothing else. This can be the most efficient way to pass data to other interfaces. Also see custom-ireduce if the object does not need to be counted and see reduced-> for implementation helper.

custom-ireduce

macro

(custom-ireduce rfn acc & code)

Custom implementation of IReduceInit and nothing else. This can be the most efficient way to pass data to other interfaces. Also see custom-counted-ireduce if the object should also implement ICounted. See reduced-> for implementation helper.

darange

(darange end)(darange start end)(darange start end step)

Return a double array holding the values of the range. Use wrap-array to get an implementation of java.util.List that supports the normal Clojure interfaces.

dbl-ary-cls

difference

(difference map1 map2)

Take the difference of two maps (or sets) returning a new map (or set). The return value is map1 (or set1) without the keys present in map2.

dnth

macro

(dnth obj idx)

nth operation returning a primitive double. Efficient when obj is a double array.

double-array

macro

(double-array)(double-array data)

double-array-list

(double-array-list)(double-array-list cap-or-data)

An array list that is as fast as java.util.ArrayList for add, get, etc. but includes many accelerated operations such as fill and an accelerated addAll when the src data is an array list.

drop

(drop n coll)

Drop the first N items of the collection. If item is random access, the return value is random-access.

drop-last

(drop-last n)(drop-last n coll)

Drop the last N values from a collection. If the input is random access, the result will be random access.

drop-min

(drop-min n comp values)(drop-min n values)

Drop the min n values of a collection. This is not an order-preserving operation.

dvec

macro

(dvec)(dvec data)

Create a persistent-vector-compatible list backed by a double array.

empty-map

Constant persistent empty map

empty-set

Constant persistent empty set

empty-vec

Constant persistent empty vec

empty?

(empty? coll)

evict-memoized-call

(evict-memoized-call memo-fn fn-args)

filterv

(filterv pred coll)

Filter a collection into a vector.

first

(first coll)

Get the first item of a collection.

float-array

macro

(float-array)(float-array data)

float-array-list

(float-array-list)(float-array-list data)

fnth

macro

(fnth obj idx)

nth operation returning a primitive float. Efficient when obj is a float array.

freq-reducer

(freq-reducer options)(freq-reducer)

Return a hamf parallel reducer that performs a frequencies operation.

frequencies

(frequencies coll)(frequencies options coll)

Faster implementation of clojure.core/frequencies.

fvec

macro

(fvec)(fvec data)

Create a persistent-vector-compatible list backed by a float array.

group-by

(group-by f options coll)(group-by f coll)

Group items in collection by the grouping function f. Returns persistent map of keys to persistent vectors.

Options are the same as for group-by-reduce, but this reduction defaults to an ordered reduction.

group-by-consumer

(group-by-consumer key-fn reducer coll)(group-by-consumer key-fn reducer options coll)

Perform a group-by-reduce passing in a reducer. Same options as group-by-reduce. This uses a slightly different pathway, computeIfAbsent, in order to preserve order. In this case the return value of the reduce fn is ignored. This allows things like the linked hash map to preserve the initial order of keys. It may also be slightly more efficient because the map itself does not need to check the return value of rfn, something that the .compute primitive does need to do.

Options:

  • :skip-finalize? - skip finalization step.

group-by-reduce

(group-by-reduce key-fn init-val-fn rfn merge-fn options coll)(group-by-reduce key-fn init-val-fn rfn merge-fn coll)

Group by key. Apply the reduce-fn with the new value and the return of init-val-fn. Maps produced separately due to multithreading will be merged with merge-fn in a way similar to preduce.

This type of reduction can be both faster and more importantly use less memory than a reduction of the forms:

  (->> group-by map into)
  ;; or
  (->> group-by mapmap)

Options (which are passed to preduce):

  • :map-fn Function which takes no arguments and must return an instance of java.util.Map that supports computeIfAbsent. Some examples:
    • (constantly (java.util.concurrent.ConcurrentHashMap. ...)) Very fast update especially in the case where the keyspace is large.
    • mut-map - Fast merge, fast update, in-place immutable conversion via persistent!.
    • java-hashmap - fast merge, fast update, just a simple java.util.HashMap-based reduction.
    • #(LinkedHashMap.) - When used with options {:ordered? true} the result keys will be in order and the result values will be reduced in order.
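A hedged sketch of the five-argument arity, assuming the hamf alias and assuming init-val-fn is called to produce the starting accumulator for each new key (constantly is used so the sketch works regardless of how many arguments it receives). The key function and var names are illustrative only.

```clojure
(require '[ham-fisted.api :as hamf])

(def totals
  (hamf/group-by-reduce
    #(if (even? %) :even :odd) ;; key-fn: group by parity
    (constantly 0)             ;; init-val-fn: starting accumulator per key
    +                          ;; rfn: fold each value into its group's accumulator
    +                          ;; merge-fn: combine partial results from different threads
    (range 10)))

(get totals :even) ;; 0+2+4+6+8, expected 20
(get totals :odd)  ;; 1+3+5+7+9, expected 25
```

Compared with group-by followed by a second pass over the groups, no intermediate vector per key is built here; each group holds only its running accumulator.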

group-by-reducer

(group-by-reducer key-fn reducer coll)(group-by-reducer key-fn reducer options coll)

Perform a group-by-reduce passing in a reducer. Same options as group-by-reduce.

Options:

  • :skip-finalize? - skip finalization step.

hash-map

(hash-map)(hash-map a b)(hash-map a b c d)(hash-map a b c d e f)(hash-map a b c d e f g h)(hash-map a b c d e f g h i j)(hash-map a b c d e f g h i j k l)(hash-map a b c d e f g h i j k l m n)(hash-map a b c d e f g h i j k l m n o p)(hash-map a b c d e f g h i j k l m n o p & args)

Drop-in replacement for Clojure's hash-map function.

iarange

(iarange end)(iarange start end)(iarange start end step)

Return an integer array holding the values of the range. Use ->collection to get a list implementation wrapping for generic access.

immut-list

(immut-list)(immut-list data)

Create a persistent list. Object arrays will be treated as if this new object owns them.

immut-map

(immut-map)(immut-map data)(immut-map options data)

Create an immutable map. This object supports conversion to a transient map via Clojure's transient function. Duplicate keys are treated as if by assoc.

If data is an object array it is treated as a flat key-value list which is distinctly different than how conj! treats object arrays. You have been warned.

If you know you will have consistently more key/val pairs than 8 you should just use (persistent! (mut-map data)) as that avoids the transition from an arraymap to a persistent hashmap.

Examples:

ham-fisted.api> (immut-map (obj-ary :a 1 :b 2 :c 3 :d 4))
{:a 1, :b 2, :c 3, :d 4}
ham-fisted.api> (type *1)
ham_fisted.PersistentArrayMap
ham-fisted.api> (immut-map (obj-ary :a 1 :b 2 :c 3 :d 4 :e 5))
{:d 4, :b 2, :c 3, :a 1, :e 5}
ham-fisted.api> (type *1)
ham_fisted.PersistentHashMap
ham-fisted.api> (immut-map [[:a 1][:b 2][:c 3][:d 4][:e 5]])
{:d 4, :b 2, :c 3, :a 1, :e 5}
ham-fisted.api> (type *1)
ham_fisted.PersistentHashMap

immut-set

(immut-set)(immut-set data)(immut-set options data)

Create an immutable hashset based on a hash table. This object supports conversion to transients via transient.

Options:

  • :hash-provider - An implementation of BitmapTrieCommon$HashProvider. Defaults to the default-hash-provider.

in-fork-join-task?

(in-fork-join-task?)

True if you are currently running in a fork-join task

inc-consumer

(inc-consumer)(inc-consumer init-value)

Return a consumer that simply increments a long. See java/ham_fisted/Consumers.java for definition.

inc-consumer-reducer

A hamf reducer that works with inc-consumers

int-array

macro

(int-array)(int-array data)

int-array-list

(int-array-list)(int-array-list cap-or-data)

An array list that is as fast as java.util.ArrayList for add, get, etc. but includes many accelerated operations such as fill and an accelerated addAll when the src data is an array list.

intersect-sets

(intersect-sets sets)

Given a sequence of sets, efficiently perform the intersection of them. This algorithm is usually faster and has a more stable runtime than (reduce clojure.set/intersection sets) which degrades depending on the order of the sets and the pairwise intersection of the initial sets.

intersection

(intersection s1 s2)

Intersect the keyspace of s1 and s2 returning a new set. Also works if s1 is a map and s2 is a set, in which case the map is trimmed to the intersecting keyspace of s1 and s2.

inth

macro

(inth obj idx)

nth operation returning a primitive int. Efficient when obj is an int array.

into

(into container data)(into container xform data)

Like clojure.core/into, but also designed to handle editable collections, transients, and base java.util.Map, List and Set containers.

into-array

(into-array aseq)(into-array ary-type aseq)(into-array ary-type mapfn aseq)

Faster version of clojure.core/into-array.

ivec

macro

(ivec)(ivec data)

Create a persistent-vector-compatible list backed by an int array.

java-concurrent-hashmap

(java-concurrent-hashmap)(java-concurrent-hashmap data)

Create a java concurrent hashmap which is still the fastest possible way to solve a few concurrent problems.

java-hashmap

(java-hashmap)(java-hashmap data)(java-hashmap xform data)(java-hashmap xform options data)

Create a java.util.HashMap. Duplicate keys are treated as if map was created by assoc.

java-hashset

(java-hashset)(java-hashset data)

Create a java hashset which is still the fastest possible way to solve a few problems.

java-linked-hashmap

(java-linked-hashmap)(java-linked-hashmap data)

Linked hash maps perform identically or very nearly so to java.util.HashMaps but they retain the order of insertion and modification.

keys

(keys m)

Return the keys of a map. This version allows parallel reduction operations on the returned sequence.

larange

(larange end)(larange start end)(larange start end step)

Return a long array holding the values of the range. Use ->collection to get a list implementation for generic access.

last

(last coll)

Get the last item in the collection. Constant time for random access lists.

linked-hashmap

(linked-hashmap)(linked-hashmap data)

Linked hash map using clojure's equiv pathways. At this time the node link order reflects insertion order. Modification and access do not affect the node link order.

lnth

macro

(lnth obj idx)

nth operation returning a primitive long. Efficient when obj is a long array.

long-array

macro

(long-array)(long-array data)

long-array-list

(long-array-list)(long-array-list cap-or-data)

An array list that is as fast as java.util.ArrayList for add, get, etc. but includes many accelerated operations such as fill and an accelerated addAll when the src data is an array list.

lvec

macro

(lvec)(lvec data)

Create a persistent-vector-compatible list backed by a long array.

make-map-entry

macro

(make-map-entry k-code v-code)

Create a dynamic implementation of clojure's IMapEntry class.

map-intersection

(map-intersection bfn map1 map2)

Intersect the keyspace of map1 and map2 returning a new map. Each value is the result of bfn applied to the map1-value and map2-value, respectively. See documentation for map-union.

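A merge-style rule in which the right-hand value wins can be expressed as shown below. Unlike Clojure's merge, only keys present in both maps are kept: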

(map-intersection (fn [lhs rhs] rhs) map1 map2)

map-union

(map-union bfn map1 map2)

Take the union of two maps returning a new map. bfn is a function that takes 2 arguments, map1-val and map2-val and returns a new value. Has fallback if map1 and map2 aren't backed by bitmap tries.

  • bfn - A function taking two arguments and returning one. + is a fine choice.
  • map1 - the lhs of the union.
  • map2 - the rhs of the union.

Returns a persistent map if the input is a persistent map. Otherwise, if map1 is a mutable map, map1 is returned with overlapping entries merged. In this way you can pass in a normal java hashmap, a linked java hashmap, or a persistent map and get back a result that matches the input.

If map1 and map2 are the same returns map1.
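A short sketch of map-union with two different merge functions, assuming the hamf alias. The input maps are ordinary Clojure maps with illustrative contents, so this relies on the fallback behaviour described above.

```clojure
(require '[ham-fisted.api :as hamf])

(def left  {:a 1 :b 2})
(def right {:b 3 :c 4})

;; Values for keys present in both maps are combined with bfn.
(def summed (hamf/map-union + left right))

;; A bfn that returns its second argument gives merge-like behaviour.
(def merged (hamf/map-union (fn [lhs rhs] rhs) left right))

summed ;; expected {:a 1, :b 5, :c 4}
merged ;; expected {:a 1, :b 3, :c 4}
```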

map-union-java-hashmap

(map-union-java-hashmap bfn lhs rhs)

Take the union of two maps returning a new map. See documentation for map-union. Returns a java.util.HashMap.

mapmap

(mapmap map-fn src-map)

Clojure's missing piece. Map over the data in src-map, which must be a map or sequence of pairs, using map-fn. map-fn must return either a new key-value pair or nil. Nil results are removed and a new map is returned. If map-fn returns more than one pair with the same key, the later pair overwrites the earlier pair.

Logically the same as:

(into {} (comp (map map-fn) (remove nil?)) src-map)
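A small sketch comparing mapmap with the equivalent into form, assuming the hamf alias. The helper scale-even and the sample map are illustrative only, and the pair is returned as a two-element vector on the assumption that this counts as a key-value pair, as it does for into.

```clojure
(require '[ham-fisted.api :as hamf])

(def src {:a 1 :b 2 :c 4})

;; Keep only even values, scaled by 10. Returning nil drops the entry.
(defn scale-even [[k v]]
  (when (even? v)
    [k (* 10 v)]))

(def via-mapmap (hamf/mapmap scale-even src))
(def via-into   (into {} (comp (map scale-even) (remove nil?)) src))

via-mapmap ;; expected {:b 20, :c 40}
```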

mapv

(mapv map-fn coll)(mapv map-fn c1 c2)(mapv map-fn c1 c2 c3)(mapv map-fn c1 c2 c3 & args)

Produce a persistent vector from a collection.

mean

(mean coll)(mean options coll)

Return the mean of the collection. Returns double/NaN for empty collections. See options for sum.

memoize

(memoize memo-fn)(memoize memo-fn {:keys [write-ttl-ms access-ttl-ms soft-values? weak-values? max-size record-stats? eviction-fn]})

Efficient thread-safe version of clojure.core/memoize.

Also see clear-memoized-fn! evict-memoized-call and memoize-cache-as-map to mutably clear the backing store, manually evict a value, and get a java.util.Map view of the cache backing store.

ham-fisted.api> (def m (memoize (fn [& args] (println "fn called - " args) args)
                                {:write-ttl-ms 1000 :eviction-fn (fn [args rv cause]
                                                                   (println "evicted - " args rv cause))}))
#'ham-fisted.api/m
ham-fisted.api> (m 3)
fn called -  (3)
(3)
ham-fisted.api> (m 4)
fn called -  (4)
(4)evicted -  [3] (3) :expired
ham-fisted.api> (dotimes [idx 4] (do (m 3) (evict-memoized-call m [3])))
fn called -  (3)
fn called -  (3)
fn called -  (3)
fn called -  (3)
nil
ham-fisted.api> (dotimes [idx 4] (do (m 3) #_(evict-memoized-call m [3])))
fn called -  (3)
nil

Options:

  • :write-ttl-ms - Time that values should remain in the cache after write in milliseconds.
  • :access-ttl-ms - Time that values should remain in the cache after access in milliseconds.
  • :soft-values? - When true, the cache will store SoftReferences to the data.
  • :weak-values? - When true, the cache will store WeakReferences to the data.
  • :max-size - When set, the cache will behave like an LRU cache.
  • :record-stats? - When true, the LoadingCache will record access statistics. You can get those via the undocumented function memo-stats.
  • :eviction-fn - Function that receives 3 arguments, [args v cause], when a value is evicted. cause is one of the keywords :collected, :expired, :explicit, :replaced or :size. See the caffeine documentation for cause definitions.

memoize-cache-as-map

(memoize-cache-as-map memoized-fn)

Return the memoize backing store as an implementation of java.util.Map.

merge

(merge)(merge m1)(merge m1 m2)(merge m1 m2 & args)

Merge 2 maps with the rhs values winning any intersecting keys. Uses map-union with BitmapTrieCommon/rhsWins.

Returns a new persistent map.

merge-with

(merge-with f)(merge-with f m1)(merge-with f m1 m2)(merge-with f m1 m2 & args)

Merge (union) any number of maps using f as the merge operator. f gets passed two arguments, lhs-val and rhs-val and must return a new value.

Returns a new persistent map.

mmax-idx

(mmax-idx f data)

Like mmax-key but returns the index of the maximum item. f should be a function from obj->long.

mmax-key

(mmax-key f data)

Faster and nil-safe version of #(apply max-key %1 %2)

mmin-idx

(mmin-idx f data)

Like mmin-key but returns the index of the minimum item. f should be a function from obj->long.

mmin-key

(mmin-key f data)

Faster and nil-safe version of #(apply min-key %1 %2)

mode

(mode data)

Return the most common occurrence in the data.

mut-hashtable-map

(mut-hashtable-map)(mut-hashtable-map data)(mut-hashtable-map xform data)(mut-hashtable-map xform options data)

Create a mutable implementation of java.util.Map. This object efficiently implements ITransient map so you can use assoc! and persistent! on it but you can additionally use the various members of the java.util.Map interface such as put, compute, computeIfAbsent, replaceAll and merge.

If data is an object array it is treated as a flat key-value list which is distinctly different than how conj! treats object arrays. You have been warned.

mut-list

(mut-list)(mut-list data)

Create a mutable java list that is in-place convertible to a persistent list.

mut-long-hashtable-map

(mut-long-hashtable-map)(mut-long-hashtable-map data)(mut-long-hashtable-map xform data)(mut-long-hashtable-map xform options data)

Create a mutable implementation of java.util.Map. This object efficiently implements ITransient map so you can use assoc! and persistent! on it but you can additionally use the various members of the java.util.Map interface such as put, compute, computeIfAbsent, replaceAll and merge.

If data is an object array it is treated as a flat key-value list which is distinctly different than how conj! treats object arrays. You have been warned.

mut-long-map

(mut-long-map)(mut-long-map data)(mut-long-map xform data)(mut-long-map xform options data)

Create a mutable implementation of java.util.Map specialized to long keys. This object efficiently implements ITransient map so you can use assoc! and persistent! on it but you can additionally use the various members of the java.util.Map interface such as put, compute, computeIfAbsent, replaceAll and merge. Attempting to store any non-numeric value will result in an exception.

If data is an object array it is treated as a flat key-value list which is distinctly different than how conj! treats object arrays. You have been warned.

mut-map

(mut-map)(mut-map data)(mut-map xform data)(mut-map xform options data)

Create a mutable implementation of java.util.Map. This object efficiently implements ITransient map so you can use assoc! and persistent! on it but you can additionally use the various members of the java.util.Map interface such as put, compute, computeIfAbsent, replaceAll and merge.

If data is an object array it is treated as a flat key-value list which is distinctly different than how conj! treats object arrays. You have been warned.

mut-map-rf

(mut-map-rf cons-fn)(mut-map-rf cons-fn finalize-fn)

mut-map-union!

(mut-map-union! merge-bifn l r)

Very fast union that may simply update lhs and return it. Both lhs and rhs must be mutable maps. See docs for map-union.

mut-set

(mut-set)(mut-set data)(mut-set options data)

Create a mutable hashset based on the hashtable. You can create a persistent hashset via the clojure persistent! call.

Options:

  • :hash-provider - An implementation of BitmapTrieCommon$HashProvider. Defaults to the default-hash-provider.

mutable-map?

(mutable-map? m)

obj-ary

(obj-ary)(obj-ary v0)(obj-ary v0 v1)(obj-ary v0 v1 v2)(obj-ary v0 v1 v2 v3)(obj-ary v0 v1 v2 v3 v4)(obj-ary v0 v1 v2 v3 v4 v5)(obj-ary v0 v1 v2 v3 v4 v5 v6)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8 v9)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14)(obj-ary v0 v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13 v14 v15)

As quickly as possible, produce an object array from these inputs. Very fast for arities <= 16.

object-array

(object-array item)

Faster version of object-array for java collections and strings.

object-array-list

(object-array-list)(object-array-list cap-or-data)

An array list that is as fast as java.util.ArrayList for add, get, etc. but includes many accelerated operations such as fill and an accelerated addAll when the src data is an object array based list.

ovec

macro

(ovec)(ovec data)

Return an immutable persistent vector like object backed by a single object array.

persistent!

(persistent! v)

If object is an ITransientCollection, call clojure.core/persistent!. Else return collection.

pgroups

(pgroups n-elems body-fn options)(pgroups n-elems body-fn)

Run y index groups across n-elems. Y is common pool parallelism.

body-fn gets passed two longs, startidx and endidx.

Returns a sequence of the results of body-fn applied to each group of indexes.

Before using this primitive please see if ham-fisted.reduce/preduce will work.

You must wrap this in something that realizes the results if you need the parallelization to finish by a particular point in the program - (dorun (hamf/pgroups ...)).

Options:

  • :pgroup-min - when provided n-elems must be more than this value for the computation to be parallelized.
  • :batch-size - max batch size. Defaults to 64000.
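A hedged sketch of pgroups summing a long array by index range, assuming the hamf alias and assuming endidx is exclusive. The helper range-sum and the data are illustrative only; reduce is what forces the results to be realized.

```clojure
(require '[ham-fisted.api :as hamf])

(defn range-sum
  "Sum data from sidx (inclusive) up to eidx (assumed exclusive)."
  [^longs data ^long sidx ^long eidx]
  (loop [idx sidx
         acc 0]
    (if (< idx eidx)
      (recur (inc idx) (+ acc (aget data idx)))
      acc)))

(def data (long-array (range 1000)))

;; Each group returns a partial sum; reducing realizes and combines them.
(def total
  (reduce + (hamf/pgroups 1000 (fn [sidx eidx] (range-sum data sidx eidx)))))

total ;; sum of 0..999, expected 499500
```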

pmap

(pmap map-fn & sequences)

pmap using the commonPool. This is useful for interacting with other primitives, namely pgroups which are also based on this pool. This is a change from Clojure's base pmap in that it uses the ForkJoinPool/commonPool for parallelism as opposed to the agent pool - this makes it compose with pgroups and dtype-next's parallelism system.

Before using this primitive please see if ham-fisted.reduce/preduce will work.

Is guaranteed to not trigger the need for shutdown-agents.

pmap-opts

(pmap-opts opts map-fn & sequences)

pmap but takes an extra option map as the first argument. This is useful if you, for instance, want to control exactly the parallel options arguments such as :n-lookahead. See docs for ham-fisted.reduce/options->parallel-options.

range

(range)(range end)(range start end)(range start end step)

When given arguments returns a range that implements random access java list interfaces so nth, reverse and friends are efficient.

reduced->

macro

(reduced-> rfn acc & data)

Helper macro to implement reduce chains, checking whether the accumulator is reduced before calling the next expression in data.

(defrecord YMC [year-month ^long count]
  clojure.lang.IReduceInit
  (reduce [this rfn init]
    (let [init (reduced-> rfn init
                   (clojure.lang.MapEntry/create :year-month year-month)
                   (clojure.lang.MapEntry/create :count count))]
      (if (and __extmap (not (reduced? init)))
        (reduce rfn init __extmap)
        init))))

reindex

(reindex coll indexes)

Permute coll by the given indexes. Result is random-access and the same length as the index collection. Indexes are expected to be in the range [0, count(coll)).

repeat

(repeat v)(repeat n v)

When called with no arguments, produce an infinite sequence of v. When called with 2 arguments, produce a random access list that produces v at each index.

rest

(rest coll)

Version of rest that uses subvec if the collection is random access. This preserves the ability to reduce in parallel over the collection.

reverse

(reverse coll)

Reverse a collection or sequence. Constant time reverse is provided for any random access list.

short-array

(short-array)(short-array data)

short-array-list

(short-array-list)(short-array-list data)

shuffle

(shuffle coll)(shuffle coll opts)

shuffle values returning random access container. If you are calling this repeatedly on the same collection you should call ->random-access on the collection before you start as shuffle internally only works on random access collections.

Options:

  • :seed - If instance of java.util.Random, use this. If integer, use as seed. If not provided a new instance of java.util.Random is created.

sort

(sort coll)(sort comp coll)

Replica of clojure.core/sort that does not wrap the final object array in a seq, since doing so loses the fact that the result is countable and random access. Faster implementations are provided when the input is an integer, long, or double array.

The default comparison is nan-last meaning null-last if the input is an undefined container and nan-last if the input is a double or float specific container.

sort-by

(sort-by keyfn coll)(sort-by keyfn comp coll)

Sort a collection by keyfn. Typehinting the return value of keyfn will somewhat increase the speed of the sort :-).

sorta

(sorta coll)(sorta comp coll)

Sort returning an object array.

splice

(splice v1 idx v2)

Splice v2 into v1 at idx. Returns a persistent vector.

subvec

(subvec m sidx eidx)(subvec m sidx)

More general version of subvec. Works for any java list implementation including persistent vectors and any array.

sum

(sum coll)(sum options coll)

Very stable high performance summation. Uses both threading and Kahan compensated summation.

Options:

  • :nan-strategy - defaults to :remove. Options are :keep, :remove and :exception.
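A brief sketch of sum with and without options, assuming the hamf alias and assuming the option is passed under the keyword :nan-strategy. The var names are illustrative only.

```clojure
(require '[ham-fisted.api :as hamf])

;; Plain summation of a small collection.
(def plain (hamf/sum [1 2 3]))

;; With the default strategy (:remove), NaN values are skipped.
(def nan-removed (hamf/sum [1.0 Double/NaN 2.0]))

;; With :keep, the NaN takes part in the sum and so the result is NaN.
(def nan-kept (hamf/sum {:nan-strategy :keep} [1.0 Double/NaN 2.0]))
```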

sum-fast

(sum-fast coll)

Fast simple double summation. Does not do any nan checking or summation compensation.

sum-stable-nelems

(sum-stable-nelems coll)(sum-stable-nelems options coll)

Stable sum returning map of {:sum :n-elems}. See options for sum.

take

(take n)(take n coll)

Take the first N values from a collection. If the input is random access, the result will be random access.

take-last

(take-last n)(take-last n coll)

Take the last N values of the collection. If the input is random-access, the result will be random-access.

take-min

(take-min n comp values)(take-min n values)

Take the min n values of a collection. This is not an order-preserving operation.

transient

(transient v)

transient-map-rf

(transient-map-rf cons-fn)(transient-map-rf cons-fn finalize-fn)

union

(union s1 s2)

Union of two sets or two maps. When two maps are provided the right hand side wins in the case of an intersection - same as merge.

Result is either a set or a map, depending on if s1 is a set or map.

union-reduce-maps

(union-reduce-maps bfn maps)

Do an efficient union reduction across many maps using bfn to update values. If the first map is mutable the union is done mutably into the first map and it is returned.

update-vals

(update-vals data f)

update-values

(update-values map bfn)

Immutably (or mutably) update all values in the map returning a new map. bfn takes 2 arguments, k,v and returns a new v. Returning nil removes the key from the map. When passed a vector the keys are indexes and no nil-removal is done.
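A small sketch of update-values including nil-removal, assuming the hamf alias. The input is built with immut-map from a vector of pairs, as in the immut-map examples; the stock data and var names are illustrative only.

```clojure
(require '[ham-fisted.api :as hamf])

(def stock (hamf/immut-map [[:apple 3] [:pear 0] [:plum 5]]))

;; bfn receives the key and the value. Returning nil drops the entry.
(def doubled-in-stock
  (hamf/update-values stock (fn [k v] (when (pos? v) (* 2 v)))))

doubled-in-stock ;; expected {:apple 6, :plum 10}
stock            ;; the persistent input is left unchanged
```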

upgroups

(upgroups n-elems body-fn options)(upgroups n-elems body-fn)

Run y index groups across n-elems. Y is common pool parallelism.

body-fn gets passed two longs, startidx and endidx.

Returns a sequence of the results of body-fn applied to each group of indexes.

Before using this primitive please see if ham-fisted.reduce/preduce will work.

You must wrap this in something that realizes the results if you need the parallelization to finish by a particular point in the program - (dorun (hamf/upgroups ...)).

Options:

  • :pgroup-min - when provided n-elems must be more than this value for the computation to be parallelized.
  • :batch-size - max batch size. Defaults to 64000.

upmap

(upmap map-fn & sequences)

Unordered pmap using the commonPool. This is useful for interacting with other primitives, namely pgroups which are also based on this pool.

Before using this primitive please see if ham-fisted.reduce/preduce will work.

Like pmap this uses the commonPool so it composes with this api's pmap, pgroups, and dtype-next's parallelism primitives but it does not impose an ordering constraint on the results and thus may be significantly faster in some cases.

vals

(vals m)

Return the values of a map. This version allows parallel reduction operations on the returned sequence. Returned sequence is in same order as (keys m).

vec

(vec data)(vec)

Produce a persistent vector. Optimized pathways exist for object arrays and java List implementations.

vector

(vector)(vector a)(vector a b)(vector a b c)(vector a b c d)(vector a b c d e)(vector a b c d e f)(vector a b c d e f g)(vector a b c d e f g h)(vector a b c d e f g h i)(vector a b c d e f g h i j)(vector a b c d e f g h i j k)(vector a b c d e f g h i j k & args)

wrap-array

(wrap-array ary)

Wrap an array with an implementation of IMutList

wrap-array-growable

(wrap-array-growable ary ptr)(wrap-array-growable ary)

Wrap an array with an implementation of IMutList that supports add and addAllReducible. ptr is the numeric put pointer and defaults to the array length. Pass in zero for a preallocated but empty growable wrapper.