ham-fisted.lazy-noncaching

Lazy, noncaching implementations of many clojure.core functions. Carefully constructed lazy noncaching versions have several benefits:

  1. No locking - better multithreading/green thread performance.
  2. Higher performance generally.
  3. More datatype flexibility - if map is passed a single randomly addressable or generically parallelizable container, the result is still randomly addressable or generically parallelizable. For instance, (map key {:a 1 :b 2}) returns, in the generic case, something that can still be parallelized because the entry set of a map implements spliterator.
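
The noncaching and random-access points above can be seen in a small sketch. It assumes ham-fisted is on the classpath and that the namespace is aliased as lznc, as in the examples later on this page; calls, counting-inc and view are hypothetical names used only for illustration.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; Count how many times the mapping function actually runs.
(def calls (atom 0))
(defn counting-inc [x] (swap! calls inc) (inc x))

;; A vector is random access, so the mapped view is expected to be too.
(def view (lznc/map counting-inc [1 2 3]))
(def view-random-access? (instance? java.util.RandomAccess view))

;; Nothing is cached: each traversal invokes counting-inc once per element.
(def first-sum (reduce + 0 view))
(def second-sum (reduce + 0 view))
(def total-calls @calls)
```

With clojure.core/map the second traversal would reuse the cached lazy seq; here the function runs again, which is the trade made to avoid locking and intermediate storage.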

->collection

(->collection item)

Ensure an item implements java.util.Collection. This is inherently true for seqs and any implementation of java.util.List but not true for object arrays. For maps this returns the entry set.
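
A minimal sketch of the behaviour described above, assuming ham-fisted is on the classpath; arr, coll and entries are hypothetical names.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; A raw object array is not a java.util.Collection...
(def arr (clojure.core/object-array [1 2 3]))
(def array-is-collection? (instance? java.util.Collection arr))

;; ...but the converted value is.
(def coll (lznc/->collection arr))
(def converted-is-collection? (instance? java.util.Collection coll))

;; For a map the result is the entry set, so it has one element per entry.
(def entries (lznc/->collection {:a 1 :b 2}))
(def entry-count (.size ^java.util.Collection entries))
```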

->iterable

(->iterable a)

->random-access

(->random-access item)

->reducible

(->reducible item)

apply-concat

(apply-concat)(apply-concat data)(apply-concat opts data)

A more efficient form of (apply concat ...) that doesn't force data to be a clojure seq.
See concat-opts for opts definition.
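
A short sketch comparing the two forms, assuming ham-fisted is on the classpath; nested is a hypothetical input.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; A collection of collections to flatten by one level.
(def nested [[1 2] [3 4] [5]])

;; clojure.core form: apply forces nested into a seq of arguments.
(def core-flat (vec (apply clojure.core/concat nested)))

;; lazy-noncaching form: nested is passed as a single data argument.
(def lznc-flat (vec (lznc/apply-concat nested)))
```

Both produce [1 2 3 4 5]; the difference is only in how the outer collection is consumed.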

as-random-access

(as-random-access item)

If item implements RandomAccess, return it typed as a java.util.List.

cartesian-map

(cartesian-map f)(cartesian-map f a)(cartesian-map f a b)(cartesian-map f a b & args)

Create a new sequence that is the cartesian join of the input sequences passed through f. Unlike map, f is passed the arguments as a single persistent vector. This is to enable much higher efficiency in the higher-arity applications. For tight numeric loops, see ham-fisted.hlet/let.

The argument vector is mutably updated between function calls, so do not hold a reference to it across calls. To keep the arguments as they are, copy them with (into [] args) or some variation thereof.

user> (hamf/sum-fast (lznc/cartesian-map
                      #(h/let [[a b c d](lng-fns %)]
                         (-> (+ a b) (+ c) (+ d)))
                      [1 2 3]
                      [4 5 6]
                      [7 8 9]
                      [10 11 12 13 14]))
3645.0
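
The copying advice above can be sketched as follows, assuming ham-fisted is on the classpath. The order in which combinations are produced is not stated here, so the sketch compares the result as a set; pairs is a hypothetical name.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; f receives one argument vector per combination. Because that vector may be
;; reused between calls, copy it with (into [] args) before keeping it.
(def pairs
  (into [] (lznc/cartesian-map (fn [args] (into [] args))
                               [1 2]
                               [:a :b])))

;; Compare as a set so the sketch does not depend on iteration order.
(def pair-set (set pairs))
```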

concat

(concat)(concat a)(concat a & args)

concat-opts

(concat-opts opts a)(concat-opts opts a & args)

Concat where the first argument is an options map. This variation lets you set :cat-parallelism when the concatenation is created, since at that point you may already know the best way to parallelize it.

Options:

:cat-parallelism - Set the type of parallelism - either :elem-wise or :seq-wise - this overrides settings later passed into calls such as reduce/preduce - see reduce/options->parallel-options for definition.
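
A minimal usage sketch, assuming ham-fisted is on the classpath. The option only affects how a later parallel reduction is split; a plain serial traversal, as here, sees the same elements in the same order as concat.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; Options map first, then the collections to concatenate.
(def joined (lznc/concat-opts {:cat-parallelism :seq-wise}
                              [1 2]
                              [3 4]))

;; Realize the result serially.
(def joined-vec (vec joined))
```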

constant-count

(constant-count data)

Constant time count. Returns nil if input doesn't have a constant time count.
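
A small sketch of both outcomes, assuming ham-fisted is on the classpath. The second input is a clojure.core lazy seq, which can only be counted by walking it.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; A vector knows its size without traversal.
(def vec-count (lznc/constant-count [1 2 3]))

;; A lazy seq would have to be realized to be counted, so nil is expected.
(def seq-count (lznc/constant-count (clojure.core/filter odd? (range 10))))
```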

constant-countable?

(constant-countable? data)

empty-vec

every?

(every? pred coll)

Faster (in most circumstances) implementation of clojure.core/every?. This can be much faster for primitive arrays of values. Type-hinted functions are best if coll is a primitive array - see example.

user> (type data)
[J
user> (count data)
100
user> (def vdata (vec data))
#'user/vdata
user> (crit/quick-bench (every? (fn [^long v] (> v 80)) data))
             Execution time mean : 40.248868 ns
nil
user> (crit/quick-bench (lznc/every? (fn [^long v] (> v 80)) data))
             Execution time mean : 7.601190 ns
nil
user> (crit/quick-bench (every? (fn [^long v] (< v 80)) vdata))
             Execution time mean : 1.269582 µs
nil
user> (crit/quick-bench (lznc/every? (fn [^long v] (< v 80)) vdata))
             Execution time mean : 211.645613 ns
nil

filter

(filter pred)(filter pred coll)

into-array

(into-array aseq)(into-array ary-type aseq)(into-array ary-type mapfn aseq)

make-readonly-list

macro

(make-readonly-list n idxvar read-code)(make-readonly-list cls-type-kwd n idxvar read-code)

Implement a readonly list. If cls-type-kwd is provided it must be, at compile time, either :int64, :float64 or :object and the getLong, getDouble or get interface methods will be filled in, respectively. In those cases read-code must return the appropriate type.
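
A sketch of the typed arity, assuming ham-fisted is on the classpath. Here idx is the symbol the macro binds to the requested index and (* idx idx) is the read code, which must yield a long for :int64; squares is a hypothetical name.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; A five-element read-only list whose i-th element is computed as i*i on demand.
(def squares (lznc/make-readonly-list :int64 5 idx (* idx idx)))

;; Nothing is stored: each read evaluates the read code for that index.
(def squares-vec (vec squares))
(def fourth (.get ^java.util.List squares 3))
(def n-squares (.size ^java.util.List squares))
```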

map

(map f)(map f arg)(map f arg & args)

map-indexed

(map-indexed map-fn coll)

map-reducible

(map-reducible f r)

Map a function over r - r need only be reducible. Returned value does not implement seq but is countable when r is countable.
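
A minimal sketch using an eduction as the reducible-only input, assuming ham-fisted is on the classpath. Because the result does not implement seq, it is consumed with reduce and into rather than first or rest.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; An eduction is reducible but is not a seq.
(def odds (eduction (clojure.core/filter odd?) (range 10)))

;; Map inc over it without creating an intermediate sequence.
(def mapped (lznc/map-reducible inc odds))

(def total (reduce + 0 mapped))
(def items (into [] mapped))
```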

object-array

(object-array item)

Faster version of object-array for eductions, java collections and strings.

partition-all

(partition-all n)(partition-all n coll)(partition-all n step coll)

Lazy noncaching version of partition-all. When input is random access returns random access result.

If input is not random access then, similar to partition-by, each sub-collection must be entirely iterated through before requesting the next sub-collection.

user> (crit/quick-bench (mapv hamf/sum-fast (lznc/partition-all 100 (range 100000))))
             Execution time mean : 335.821098 µs
nil
user> (crit/quick-bench (mapv hamf/sum-fast (partition-all 100 (range 100000))))
             Execution time mean : 6.831242 ms
nil
user> (crit/quick-bench (into [] (comp (partition-all 100)
                                       (map hamf/sum-fast))
                              (range 100000)))
             Execution time mean : 1.645954 ms
nil
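
For reference, a small sketch of the output shape, assuming ham-fisted is on the classpath. Each sub-collection is fully consumed by vec before the next is requested, so the pattern is also safe for inputs that are not random access.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; Eight elements in groups of three; the last group holds the remainder.
(def groups (mapv vec (lznc/partition-all 3 [0 1 2 3 4 5 6 7])))
```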

partition-by

(partition-by f)(partition-by f coll)(partition-by f options coll)

Lazy noncaching version of partition-by. For reducing partitions into a singular value please see apply-concat. Return value most efficiently implements reduce with a slightly less efficient implementation of Iterable.

Unlike clojure.core/partition-by this does not store intermediate elements nor does it build up intermediate containers. This makes it somewhat faster in most contexts.

Each sub-collection must be iterated through entirely before calling the next method of the parent iterator, otherwise the result will not be correct.

Options:

  • :ignore-leftover? - When true leftover items in the previous iteration do not cause an exception. Defaults to false.
  • :binary-predicate - When provided, use this for equality semantics. Defaults to equiv semantics but in a numeric context it may be useful to have (== ##NaN ##NaN).

user> ;;incorrect - inner items not iterated and non-caching!
user> (into [] (lznc/partition-by identity [1 1 1 2 2 2 3 3 3]))
Execution error at ham_fisted.lazy_noncaching.PartitionBy/reduce (lazy_noncaching.clj:514).
Sub-collection was not entirely consumed.

user> ;;correct - transducing form of into calls vec on each sub-collection
user> ;;thus iterating through it entirely.
user> (into [] (map vec) (lznc/partition-by identity [1 1 1 2 2 2 3 3 3]))
[[1 1 1] [2 2 2] [3 3 3]]
user> ;;group runs of NaN separately from runs of ordinary numbers
user> (lznc/map hamf/vec
                (lznc/partition-by
                 identity
                 {:binary-predicate
                  (hamf-fn/binary-predicate
                   x y
                   ;; same partition when both values are NaN or neither is
                   (= (Double/isNaN (double x))
                      (Double/isNaN (double y))))}
                 [1 2 3 ##NaN ##NaN 3 4 5]))
([1 2 3] [NaN NaN] [3 4 5])

user> (def init-data (vec (lznc/apply-concat (lznc/map #(repeat 100 %) (range 1000)))))
#'user/init-data
user> (crit/quick-bench (mapv hamf/sum-fast (lznc/partition-by identity init-data)))
             Execution time mean : 366.915796 µs
  ...
nil
user> (crit/quick-bench (mapv hamf/sum-fast (clojure.core/partition-by identity init-data)))
             Execution time mean : 6.699424 ms
  ...
nil
user> (crit/quick-bench (into [] (comp (clojure.core/partition-by identity)
                                       (map hamf/sum-fast)) init-data))
             Execution time mean : 1.705864 ms
  ...

reindex

(reindex coll indexes)

Permute coll by the given indexes. Result is random-access and the same length as the index collection. Indexes are expected to be in the range [0, count(coll)).
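
A short sketch, assuming ham-fisted is on the classpath. Indexes may repeat or omit positions, so the result length follows the index collection rather than coll.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; Element i of the result is (nth coll (nth indexes i)).
(def picked (vec (lznc/reindex [:a :b :c :d] [3 0 0 2])))
```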

remove

added in 1.0

(remove pred coll)(remove pred)

Returns a lazy sequence of the items in coll for which (pred item) returns logical false. pred must be free of side-effects. Returns a transducer when no collection is provided.

repeatedly

(repeatedly f)(repeatedly n f)

When called with one argument, produce an infinite sequence of calls to f. When called with two arguments, produce a noncaching random access list of length n of calls to f.
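
A minimal sketch of the two-argument form, assuming ham-fisted is on the classpath. A constant function is used so the output does not depend on how many times f is invoked.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; A list of length 3 whose elements come from calling f with no arguments.
(def xs (lznc/repeatedly 3 (constantly :x)))

(def xs-count (count xs))
(def xs-vec (vec xs))
```

Because the list is noncaching, an f with side effects or random output may give different values each time the same index is read.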

seed->random

(seed->random seed)

shift

(shift n coll)

Shift a collection forward or backward, repeating either the first or the last entry to fill the vacated positions. Returns a random access list of the same length as coll.

Example:

ham-fisted.api> (shift 2 (range 10))
[0 0 0 1 2 3 4 5 6 7]
ham-fisted.api> (shift -2 (range 10))
[2 3 4 5 6 7 8 9 9 9]

shuffle

(shuffle coll)(shuffle coll opts)

Shuffle values, returning a random access container.

Options:

  • :seed - If instance of java.util.Random, use this. If integer, use as seed. If not provided a new instance of java.util.Random is created.
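
A sketch of seeded shuffling, assuming ham-fisted is on the classpath. No particular permutation is asserted; the sketch only relies on an integer seed giving a repeatable order.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

(def data (vec (range 10)))

;; The same integer seed is expected to give the same permutation.
(def shuffled-a (vec (lznc/shuffle data {:seed 42})))
(def shuffled-b (vec (lznc/shuffle data {:seed 42})))

;; A shuffle only reorders, so sorting recovers the input.
(def resorted (vec (sort shuffled-a)))
```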

tuple-map

(tuple-map f c1)(tuple-map f c1 c2)(tuple-map f c1 c2 & cs)

Lazy noncaching map but f simply gets a single random-access list of arguments. The argument list may be mutably updated between calls.
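
A minimal sketch, assuming ham-fisted is on the classpath. The function reduces the argument list immediately instead of keeping it, which sidesteps the mutation caveat.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; f receives one list holding the i-th element of each input collection.
(def sums (vec (lznc/tuple-map (fn [args] (reduce + 0 args))
                               [1 2 3]
                               [10 20 30])))
```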

type-single-arg-ifn

(type-single-arg-ifn ifn)

Categorize the return type of a single argument ifn. May be :float64, :int64, or :object.
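
A sketch of the three categories, assuming ham-fisted is on the classpath and that classification follows the primitive return hint on the function, which is an assumption about the implementation rather than something stated here.

```clojure
(require '[ham-fisted.lazy-noncaching :as lznc])

;; Primitive long return type.
(def long-kind (lznc/type-single-arg-ifn (fn ^long [^long x] (inc x))))

;; Primitive double return type.
(def double-kind (lznc/type-single-arg-ifn (fn ^double [^double x] (* 2.0 x))))

;; No primitive hints, so the function returns Object.
(def object-kind (lznc/type-single-arg-ifn (fn [x] x)))
```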

type-zero-arg-ifn

(type-zero-arg-ifn ifn)

Categorize the return type of a zero argument ifn. May be :float64, :int64, or :object.