tech.v3.datatype.functional

Arithmetic and statistical operations based on the Buffer interface. These operators and functions all implement vectorized interfaces, so passing in something convertible to a reader returns a reader. Arithmetic operations are performed lazily. These functions generally incur a large dispatch cost: for example, each call to '+' checks all of its arguments to decide whether to dispatch to an iterable implementation or to a reader implementation. For tight loops or operations like map and filter, using the specific operators will result in far faster code than using the '+' function itself.
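As a minimal sketch of the vectorized behaviour described above, the following assumes the namespace is aliased as dfn and uses primitive double arrays as the reader-convertible inputs. The var names are illustrative only.

```clojure
(require '[tech.v3.datatype.functional :as dfn])

;; A primitive array is convertible to a reader.
(def xs (double-array [1 2 3]))

;; A scalar argument is applied against every element of the reader.
(def shifted (dfn/+ xs 10))

;; Elementwise addition of two inputs of equal length.
(def summed (dfn/+ xs (double-array [10 20 30])))

;; The results are lazy; vec realizes them for inspection.
(vec shifted) ;; => [11.0 12.0 13.0]
(vec summed)  ;; => [11.0 22.0 33.0]
```

Because the result is lazy, each element is recomputed on every read unless the result is copied into a concrete container first.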

*

(* x y)(* x y & args)

+

(+ x)(+ x y)(+ x y & args)

-

(- x)(- x y)(- x y & args)

/

(/ x)(/ x y)(/ x y & args)

<

(< x y z)(< x y)

<=

(<= x y z)(<= x y)

>

(> x y z)(> x y)

>=

(>= x y z)(>= x y)

abs

(abs x options)(abs x)

acos

(acos x options)(acos x)

and

(and x y)

asin

(asin x options)(asin x)

atan

(atan x options)(atan x)

atan2

(atan2 x y)(atan2 x y & args)

bit-and

(bit-and x y)(bit-and x y & args)

bit-and-not

(bit-and-not x y)(bit-and-not x y & args)

bit-clear

(bit-clear x y)(bit-clear x y & args)

bit-flip

(bit-flip x y)(bit-flip x y & args)

bit-not

(bit-not x options)(bit-not x)

bit-or

(bit-or x y)(bit-or x y & args)

bit-set

(bit-set x y)(bit-set x y & args)

bit-shift-left

(bit-shift-left x y)(bit-shift-left x y & args)

bit-shift-right

(bit-shift-right x y)(bit-shift-right x y & args)

bit-test

(bit-test x y)

bit-xor

(bit-xor x y)(bit-xor x y & args)

bool-reader->indexes

(bool-reader->indexes options x)(bool-reader->indexes x)

Given a reader, produce a filtered list of indexes filtering out 'false' values.
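A possible usage sketch, assuming the dfn alias and a boolean reader produced by the vectorized > comparison listed above. The concrete container type of the returned indexes is not stated here, so the example only relies on it being iterable.

```clojure
(require '[tech.v3.datatype.functional :as dfn])

(def data (double-array [1 5 2 7]))

;; Vectorized comparison: one boolean per element of data.
(def mask (dfn/> data 3))

;; Indexes at which mask is true, here positions 1 and 3.
(def idxs (dfn/bool-reader->indexes mask))

;; Convert to plain longs for printing or comparison.
(mapv long idxs)
```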

cbrt

(cbrt x options)(cbrt x)

ceil

(ceil x options)(ceil x)

cos

(cos x options)(cos x)

cosh

(cosh x options)(cosh x)

cummax

(cummax x options)(cummax x)

Cumulative running max; returns result in double space.

Options:

  • :nan-strategy - one of :keep, :remove, :exception. Defaults to :remove.

cummin

(cummin x options)(cummin x)

Cumulative running min; returns result in double space.

Options:

  • :nan-strategy - one of :keep, :remove, :exception. Defaults to :remove.

cumprod

(cumprod x options)(cumprod x)

Cumulative running product; returns result in double space.

Options:

  • :nan-strategy - one of :keep, :remove, :exception. Defaults to :remove.

cumsum

(cumsum x options)(cumsum x)

Cumulative running summation; returns result in double space.

Options:

  • :nan-strategy - one of :keep, :remove, :exception. Defaults to :remove.
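The four cumulative operations above can be illustrated with a short sketch. It assumes the dfn alias and uses the single-argument arities only, so the default :nan-strategy applies. Note that results are doubles even when the conceptual input is integral.

```clojure
(require '[tech.v3.datatype.functional :as dfn])

(def xs (double-array [1 3 2 5]))

;; Each call returns one running value per input element, in double space.
(vec (dfn/cumsum xs))  ;; => [1.0 4.0 6.0 11.0]
(vec (dfn/cumprod xs)) ;; => [1.0 3.0 6.0 30.0]
(vec (dfn/cummax xs))  ;; => [1.0 3.0 3.0 5.0]
(vec (dfn/cummin xs))  ;; => [1.0 1.0 1.0 1.0]
```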

descriptive-statistics

(descriptive-statistics x stats-names stats-data options)(descriptive-statistics x stats-names options)(descriptive-statistics x stats-names)(descriptive-statistics x)

Calculate a set of descriptive statistics on a single reader.

Available stats: #{:min :quartile-1 :sum :mean :mode :median :quartile-3 :max :variance :standard-deviation :skew :n-elems :kurtosis}

Options:

  • :nan-strategy - defaults to :remove, one of :keep :remove :exception. The fastest option is :keep but this may result in your results having NaN's in them. You can also pass in a double predicate to filter custom double values.

distance

(distance x y)

distance-squared

(distance-squared x y)

dot-product

(dot-product x y)

eq

(eq x y)

equals

(equals x y & args)

even?

(even? x options)(even? x)

exp

(exp x options)(exp x)

expm1

(expm1 x options)(expm1 x)

fill-range

(fill-range x max-span)

Given a reader of numeric data and a max span amount, produce a new reader where the difference between any two consecutive elements is less than or equal to the max span amount. Also return a bitmap of the added indexes. Uses linear interpolation to fill in gaps and operates in double space. Returns a map with the keys :result and :missing.
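A hedged usage sketch, assuming the dfn alias. It does not assert exactly which values are interpolated; it only checks the documented property that no consecutive gap in the result exceeds the max span.

```clojure
(require '[tech.v3.datatype.functional :as dfn])

;; The gap between 2 and 10 exceeds the max span of 2, so values are added.
(def filled (dfn/fill-range (double-array [1 2 10 11]) 2))

;; :result holds the filled data, :missing the bitmap of added indexes.
(def result (vec (:result filled)))

;; Differences between consecutive elements of the result.
(def gaps (map - (rest result) result))

;; True when every gap is within the requested max span.
(every? #(<= % 2.0) gaps)
```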

finite?

(finite? x options)(finite? x)

fixed-rolling-window

(fixed-rolling-window x window-size window-fn options)(fixed-rolling-window x window-size window-fn)

Return a lazily evaluated rolling window of window-fn applied to each window. The iterable or sequence is padded so that the result has the same number of values as the input, with repeated elements padding the beginning and end of the original sequence. If the input is an iterator, the output is a lazy sequence. If the input is a reader, the output is a reader.

Options:

  • :relative-window-position - Defaults to :center - controls the window's relative positioning in the sequence.
  • :edge-mode - Defaults to :clamp. Either :zero, in which case window values off the edge are zero for numeric types or nil for object types, or :clamp, in which case window values off the edge of the data are bound to the first or last values respectively.

Example (all results are same length):

user> (require '[tech.v3.datatype :as dtype])
nil
user> (require '[tech.v3.datatype.rolling :as rolling])
nil
user> (require '[tech.v3.datatype.functional :as dfn])
nil
user> (rolling/fixed-rolling-window (range 20) 5 dfn/sum {:relative-window-position :left})
[0 1 3 6 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85]
user> (rolling/fixed-rolling-window (range 20) 5 dfn/sum {:relative-window-position :center})
[3 6 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 89 92]
user> (rolling/fixed-rolling-window (range 20) 5 dfn/sum {:relative-window-position :right})
[10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 89 92 94 95]

floor

(floor x options)(floor x)

get-significand

(get-significand x options)(get-significand x)

hypot

(hypot x y)(hypot x y & args)

identity

(identity x options)(identity x)

ieee-remainder

(ieee-remainder x y)(ieee-remainder x y & args)

infinite?

(infinite? x options)(infinite? x)

kendalls-correlation

(kendalls-correlation x y options)(kendalls-correlation x y)

kurtosis

(kurtosis x options)(kurtosis x)

linear-regressor

(linear-regressor x y)

Create a simple linear regressor. Returns a function that given a (double) 'x' predicts a (double) 'y'. The function has metadata that contains the regressor and some regressor info, notably slope and intercept.

Example:

tech.v3.datatype.functional> (def regressor (linear-regressor [1 2 3] [4 5 6]))
#'tech.v3.datatype.functional/regressor
tech.v3.datatype.functional> (regressor 1)
4.0
tech.v3.datatype.functional> (regressor 2)
5.0
tech.v3.datatype.functional> (meta regressor)
{:regressor
  #object[org.apache.commons.math3.stat.regression.SimpleRegression 0x52091e82 "org.apache.commons.math3.stat.regression.SimpleRegression@52091e82"],
 :intercept 3.0,
 :slope 1.0,
 :mean-squared-error 0.0}

log

(log x options)(log x)

log10

(log10 x options)(log10 x)

log1p

(log1p x options)(log1p x)

logistic

(logistic x options)(logistic x)

magnitude

(magnitude x)

magnitude-squared

(magnitude-squared x)

mathematical-integer?

(mathematical-integer? x options)(mathematical-integer? x)

max

(max x)(max x y)(max x y & args)

mean

(mean x options)(mean x)

Double mean of x.

mean-fast

(mean-fast x)

Take the mean of x. This operation is not NaN-aware, so it is a bit faster than the base mean fn.

median

(median x options)(median x)

min

(min x)(min x y)(min x y & args)

mode

(mode data)

Return the value of the most common occurrence in the data.

nan?

(nan? x options)(nan? x)

neg?

(neg? x options)(neg? x)

next-down

(next-down x options)(next-down x)

next-up

(next-up x options)(next-up x)

normalize

(normalize x)

not

(not x options)(not x)

not-eq

(not-eq x y)

odd?

(odd? x options)(odd? x)

or

(or x y)

pearsons-correlation

(pearsons-correlation x y options)(pearsons-correlation x y)

percentiles

(percentiles x percentages options)(percentiles x percentages)

Create a reader of percentile values, one for each percentage passed in. Estimation types are in the set of #{:r1,r2...legacy} and are described here: https://commons.apache.org/proper/commons-math/javadocs/api-3.3/index.html.

nan-strategy can be one of :keep :remove :exception and defaults to :exception.

pos?

(pos? x options)(pos? x)

pow

(pow x y)(pow x y & args)

quartile-1

(quartile-1 x options)(quartile-1 x)

quartile-3

(quartile-3 x options)(quartile-3 x)

quartile-outlier-fn

(quartile-outlier-fn x & args)

Create a function that, given a floating point value, returns true if that value is an outlier and false otherwise. A value is an outlier when the following holds (the default range-mult is 1.5):

  (or (< val (- q1 (* range-mult iqr)))
      (> val (+ q3 (* range-mult iqr))))

Options:

  • :range-mult - the multiplier used.
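The rule above can be sketched in plain Clojure. make-outlier-pred is a hypothetical helper, not part of this namespace, and it assumes that iqr is the interquartile range q3 - q1 and that the quartiles have already been computed.

```clojure
;; Hypothetical helper: builds the outlier predicate from known quartiles.
;; Assumes iqr = q3 - q1.
(defn make-outlier-pred
  [q1 q3 range-mult]
  (let [iqr  (- q3 q1)
        low  (- q1 (* range-mult iqr))
        high (+ q3 (* range-mult iqr))]
    (fn [v] (or (< v low) (> v high)))))

;; With q1 = 2, q3 = 4 the fences are -1.0 and 7.0.
(def outlier? (make-outlier-pred 2.0 4.0 1.5))

(outlier? 3.0)  ;; => false
(outlier? 8.0)  ;; => true
(outlier? -1.5) ;; => true
```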

quartiles

(quartiles x)(quartiles x options)

Return the min, the 25th, 50th, and 75th percentiles, and the max of x.

quot

(quot x y)(quot x y & args)

reduce-*

(reduce-* x)

reduce-+

(reduce-+ x)

reduce-max

(reduce-max x)

reduce-min

(reduce-min x)

rem

(rem x y)(rem x y & args)

rint

(rint x options)(rint x)

round

(round x options)(round x)

Vectorized implementation of Math/round. Operates in double space but returns a long or long reader.

shift

(shift x n)

Shift by n and fill in with the first element for n>0 or last element for n<0.

Examples:

user> (dfn/shift (range 10) 2)
[0 0 0 1 2 3 4 5 6 7]
user> (dfn/shift (range 10) -2)
[2 3 4 5 6 7 8 9 9 9]

signum

(signum x options)(signum x)

sin

(sin x options)(sin x)

sinh

(sinh x options)(sinh x)

skew

(skew x options)(skew x)

spearmans-correlation

(spearmans-correlation x y options)(spearmans-correlation x y)

sq

(sq x options)(sq x)

sqrt

(sqrt x options)(sqrt x)

standard-deviation

(standard-deviation x options)(standard-deviation x)

sum

(sum x options)(sum x)

Double sum of data using Kahan compensated summation.
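For readers unfamiliar with the technique, here is a minimal plain-Clojure sketch of Kahan compensated summation. kahan-sum is an illustrative name and this is not the library's implementation; it only shows the idea of carrying a running correction term.

```clojure
;; Illustrative only: textbook Kahan summation over a sequence of numbers.
(defn kahan-sum
  [xs]
  (loop [xs  (seq xs)
         sum 0.0   ;; running total
         c   0.0]  ;; compensation for lost low-order bits
    (if xs
      (let [y (- (double (first xs)) c) ;; apply the pending correction
            t (+ sum y)]                ;; low-order bits of y may be lost here
        ;; (t - sum) - y recovers the rounding error of this step.
        (recur (next xs) t (- (- t sum) y)))
      sum)))

(kahan-sum [1 2 3]) ;; => 6.0
```

The compensation term keeps the accumulated rounding error roughly independent of the number of elements, at the cost of a few extra floating point operations per element.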

sum-fast

(sum-fast x)

Find the sum of the data. This operation is not NaN-aware and does not implement Kahan compensation, although through parallelization it implements pairwise summation compensation. For a slightly slower but far more accurate sum operator, use sum.

tan

(tan x options)(tan x)

tanh

(tanh x options)(tanh x)

to-degrees

(to-degrees x options)(to-degrees x)

to-radians

(to-radians x options)(to-radians x)

ulp

(ulp x options)(ulp x)

unsigned-bit-shift-right

(unsigned-bit-shift-right x y)(unsigned-bit-shift-right x y & args)

variance

(variance x options)(variance x)

zero?

(zero? x options)(zero? x)