API docs

Entropy Functions

spkit.entropy(x, alpha=1, ignoreZero=False, base=2, normalize=False)[source]

Rényi entropy of order α, for alpha in [0, inf]:

alpha=0 : Max-entropy, H(x) = log(N), where N = number of bins

alpha=1 : Shannon entropy, H(x) = -sum{Px*log(Px)}

alpha=2 : Collision entropy (Rényi entropy of order 2), H(x) = (1/(1-α))*log{sum{Px^α}}

alpha=inf : Min-entropy, H(x) = -log(max(Px))

base: base of the log

: base 2 gives entropy in bits, base e in nats, base 10 in bans

ignoreZero: if True, zero-valued probabilities are omitted before computation

: it does not make much of a difference, since 0*log(0) is taken as 0
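Example: a minimal usage sketch based on the signature above (exact values depend on the internal binning of x):

    import numpy as np
    import spkit as sp

    np.random.seed(1)
    x = np.random.randn(1000)      # any 1d signal

    H1 = sp.entropy(x, alpha=1)    # Shannon entropy, in bits (base=2)
    H2 = sp.entropy(x, alpha=2)    # collision (Renyi order-2) entropy
    H0 = sp.entropy(x, alpha=0)    # max-entropy, log2(number of bins)
    print(H0, H1, H2)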

spkit.mutual_Info(x, y, ignoreZero=False, base=2)[source]

I(X;Y) = H(X)+H(Y)-H(X,Y)
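Example: a short sketch checking the identity above against entropy and entropy_joint (documented below); agreement assumes consistent internal binning:

    import numpy as np
    import spkit as sp

    np.random.seed(2)
    x = np.random.randn(1000)
    y = 0.7*x + 0.3*np.random.randn(1000)    # y correlated with x

    I_xy = sp.mutual_Info(x, y)
    I_chk = sp.entropy(x) + sp.entropy(y) - sp.entropy_joint(x, y)
    print(I_xy, I_chk)                       # expected to (approximately) match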

spkit.dispersion_entropy(x, classes=10, scale=1, emb_dim=2, delay=1, mapping_type='cdf', de_normalize=False, A=100, Mu=100, return_all=False, warns=True)[source]

Calculate dispersion entropy of signal x (multiscale)

input:

x : input signal - 1d-array of shape=(n,)
classes : number of classes - levels of quantization of amplitude (default=10)
emb_dim : embedding dimension (default=2)
delay : time delay (default=1)
scale : downsampling factor for a low-resolution version of the signal (default=1) - used for multiscale dispersion entropy
mapping_type : mapping method for discretizing the signal (default='cdf')

: options = {'cdf', 'a-law', 'mu-law', 'fd'}

A : factor for A-law, used if mapping_type='a-law'
Mu : factor for μ-law, used if mapping_type='mu-law'

de_normalize : (int/bool) whether to normalize the entropy, to make it comparable across signals with different numbers of classes and embedding dimensions; default=0 (False) - no normalization

if de_normalize=1:
  • dispersion entropy is normalized by log(Npp), where Npp = total number of possible patterns. This is the classical way to normalize entropy, since max{H(x)} <= np.log(N) for N possible outcomes. However, for a signal (sequence) of limited length it may not be possible to observe all possible patterns, so normalizing by log(Npp) can be misleading when len(x) < Npp, i.e. len(x) < classes**emb_dim. For example, for a discretized signal x of 2048 samples with classes=10 and emb_dim=4, the number of possible patterns is Npp = 10000, which can never occur in a sequence shorter than 10000 + (emb_dim-1)*delay samples. To handle this, the alternative normalization below (de_normalize=2) is recommended.

  • select this when classes**emb_dim < (N-(emb_dim-1)*delay)

de_normalize=2: (recommended when classes**emb_dim > len(x)/scale)
  • dispersion entropy is normalized by log(Npf), where Npf = len(x) - (emb_dim-1)*delay is the total number of patterns found in the given sequence. This is a much better normalizing factor: in the extreme cases, a very random signal may produce Npf distinct patterns, achieving maximum entropy, while a constant signal repeats a single pattern Npf times, achieving zero entropy.

  • select this when classes**emb_dim > (N-(emb_dim-1)*delay)

de_normalize=3:
  • dispersion entropy is normalized by log(Nup), the total number of unique patterns found (NOT RECOMMENDED - it does not make much sense, at least not to the author)

de_normalize=4:
  • auto-select the normalizing factor

  • if classes**emb_dim > (N-(emb_dim-1)*delay), then de_normalize=2

  • else (classes**emb_dim <= (N-(emb_dim-1)*delay)), de_normalize=1

output

disp_entr : dispersion entropy of the signal
prob : probability distribution of patterns

if return_all=True, also returns

patterns_dict : dictionary of patterns and their respective frequencies
x_discrete : discretized signal x
(Npf, Npp, Nup) : Npf - total number of patterns found, Npp - total number of possible patterns, Nup - total number of unique patterns found

: Npf is the total number of patterns in the discretized signal (not the number of unique patterns)
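Example: a minimal sketch based on the signature above; with return_all=True the extra outputs are assumed to follow the order listed:

    import numpy as np
    import spkit as sp

    np.random.seed(3)
    x = np.random.randn(2048)

    # dispersion entropy normalized by log(Npf), i.e. de_normalize=2
    de, prob = sp.dispersion_entropy(x, classes=10, emb_dim=2, delay=1, de_normalize=2)
    print(de)

    # with return_all=True, the patterns, discretized signal and counts are returned too
    out = sp.dispersion_entropy(x, classes=10, emb_dim=2, delay=1, return_all=True)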

spkit.dispersion_entropy_multiscale_refined(x, classes=10, scales=[1, 2, 3, 4, 5], emb_dim=2, delay=1, mapping_type='cdf', de_normalize=False, A=100, Mu=100, return_all=False, warns=True)[source]

Calculate multiscale refined dispersion entropy of signal x

Computes dispersion entropy at different scales (defined by the argument 'scales') and combines the patterns found at all scales to compute the final dispersion entropy.

input:

x : input signal - 1d-array of shape=(n,)
classes : number of classes - levels of quantization of amplitude (default=10)
emb_dim : embedding dimension (default=2)
delay : time delay (default=1)
scales : list or 1d array of scales over which to refine the dispersion entropy

mapping_type : mapping method for discretizing the signal (default='cdf')

: options = {'cdf', 'a-law', 'mu-law', 'fd'}

A : factor for A-law, used if mapping_type='a-law'
Mu : factor for μ-law, used if mapping_type='mu-law'

de_normalize : (int/bool) whether to normalize the entropy, to make it comparable across signals with different numbers of classes and embedding dimensions; default=0 (False) - no normalization

if de_normalize=1:
  • dispersion entropy is normalized by log(Npp), where Npp = total number of possible patterns. This is the classical way to normalize entropy, since max{H(x)} <= np.log(N) for N possible outcomes. However, for a signal (sequence) of limited length it may not be possible to observe all possible patterns, so normalizing by log(Npp) can be misleading when len(x) < Npp, i.e. len(x) < classes**emb_dim. For example, for a discretized signal x of 2048 samples with classes=10 and emb_dim=4, the number of possible patterns is Npp = 10000, which can never occur in a sequence shorter than 10000 + (emb_dim-1)*delay samples. To handle this, the alternative normalization below (de_normalize=2) is recommended.

de_normalize=2: (recommended when classes**emb_dim > len(x)/scale)
  • dispersion entropy is normalized by log(Npf), where Npf = len(x) - (emb_dim-1)*delay is the total number of patterns found in the given sequence. This is a much better normalizing factor: in the extreme cases, a very random signal may produce Npf distinct patterns, achieving maximum entropy, while a constant signal repeats a single pattern Npf times, achieving zero entropy.

de_normalize=3:
  • dispersion entropy is normalized by log(Nup), the total number of unique patterns found (NOT RECOMMENDED - it does not make much sense, at least not to the author)

output

disp_entr : dispersion entropy of the signal
prob : probability distribution of patterns

if return_all=True, also returns

patterns_dict : dictionary of patterns and their respective frequencies
x_discrete : discretized signal x
(Npf, Npp, Nup) : Npf - total number of patterns found, Npp - total number of possible patterns, Nup - total number of unique patterns found

: Npf is the total number of patterns in the discretized signal (not the number of unique patterns)

Information Theory Module

Information Theory techniques

Author: Nikesh Bajaj
Date: 18 Apr 2019
Version: 0.0.3
Github: https://github.com/Nikeshbajaj/spkit
Contact: n.bajaj@qmul.ac.uk

spkit.core.infotheory.A_law(x, A=255, companding=True)[source]

A-Law for companding and expanding

  • A way to map a Gaussian or Laplacian distributed signal to a uniformly distributed one

  • for companding - smaller amplitude values are stretched (since they have high probability) and larger amplitude values are compressed (since they have low probability)

  • for expanding - it reverts the mapping

for -1 ≤ x ≤ 1 and A > 1

Companding:

y(n) = sign(x) * A|x| / (1 + log(A))              if |x| < 1/A
     = sign(x) * (1 + log(A|x|)) / (1 + log(A))   otherwise

Expanding:

y(n) = sign(x) * |x|(1 + log(A)) / A              if |x| < 1/(1 + log(A))
     = sign(x) * exp(|x|(1 + log(A)) - 1) / A     otherwise

  • An alternative to μ-Law

  • The μ-law algorithm provides a slightly larger dynamic range than the A-law at the cost of worse proportional distortions for small signals.

input

x : 1d (or nd) array of signal, must satisfy -1 ≤ x ≤ 1
A : scalar (A ≥ 1) - factor for the companding-expanding mapping

: A=1 gives the identity function; higher values of A stretch and compress more

companding: (bool) - if True, companding is applied, else expanding

output

y: mapped output of same shape as x
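Example: a small round-trip sketch based on the description above (expanding reverts companding); Mu_law below works analogously:

    import numpy as np
    import spkit as sp

    x = np.linspace(-1, 1, 11)                    # must satisfy -1 <= x <= 1
    y = sp.A_law(x, A=255, companding=True)       # compand
    x_rec = sp.A_law(y, A=255, companding=False)  # expand - reverts the mapping
    print(np.max(np.abs(x - x_rec)))              # expected to be ~0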

spkit.core.infotheory.Mu_law(x, Mu=255, companding=True)[source]

μ-Law for companding and expanding

  • A way to map a Gaussian or Laplacian distributed signal to a uniformly distributed one

  • for companding - smaller amplitude values are stretched (since they have high probability) and larger amplitude values are compressed (since they have low probability)

  • for expanding - it reverts the mapping

for -1 ≤ x ≤ 1

Companding:

y(n) = sign(x) * log(1 + μ|x|) / log(1 + μ)

Expanding:

y(n) = sign(x) * ((1 + μ)**|x| - 1) / μ

  • An alternative to A-Law

  • The μ-law algorithm provides a slightly larger dynamic range than the A-law at the cost of worse proportional distortions for small signals.

input

x : 1d (or nd) array of signal, must satisfy -1 ≤ x ≤ 1
Mu : scalar (Mu ≥ 0) - factor for the companding-expanding mapping

: Mu ≈ 0 gives (approximately) the identity function; higher values of Mu stretch and compress more
: use 1e-5 instead of Mu=0 to avoid a zero-division error

companding: (bool) - if True, companding is applied, else expanding

output

y: mapped output of same shape as x

spkit.core.infotheory.Quantize(x, scale=1, min_bins=2, A=None, Mu=None, cdf_map=False, nbins=None)[source]

input

output

spkit.core.infotheory.TD_Embed(x, order=3, delay=1)[source]

Time-delay Embedding Matrix

input

x : 1d-array, time series of shape (n,)
order : int, embedding dimension (order)
delay : int, delay

output

X: Embedded Matrix: ndarray

Embedded time-series, of shape (n - (order - 1) * delay, order)
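Example: a quick sketch of the documented output shape, using the module path given above:

    import numpy as np
    from spkit.core.infotheory import TD_Embed

    x = np.arange(10)
    X = TD_Embed(x, order=3, delay=1)
    print(X.shape)    # (n - (order-1)*delay, order) -> (8, 3)
    # rows are delayed snippets: [0,1,2], [1,2,3], ..., [7,8,9]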

spkit.core.infotheory.binSize_FD(x)[source]

Compute bin width using Freedman–Diaconis rule

bin_width = 2*IQR(x)/(n**(1/3))

input

x: 1d-array

output

bw : bin width

spkit.core.infotheory.bin_width(x, method='fd')[source]

Compute bin width using different methods

‘fd’ (Freedman Diaconis Estimator)
  • Robust (resilient to outliers) estimator that takes into account data variability and data size.

‘doane’
  • An improved version of Sturges’ estimator that works better with non-normal datasets.

‘scott’
  • Less robust estimator that takes into account data variability and data size.

‘stone’
  • Estimator based on leave-one-out cross-validation estimate of the integrated squared error.

    Can be regarded as a generalization of Scott’s rule.

‘rice’
  • Estimator does not take variability into account, only data size.

    Commonly overestimates number of bins required.

‘sturges’
  • Only accounts for data size. Only optimal for Gaussian data, and underestimates the number of bins for large non-Gaussian datasets.

‘sqrt’
  • Square root (of data size) estimator, used by Excel and other programs for its speed and simplicity.

input

x : 1d-array (or n-d array)
method : method used to compute the bin width and the number of bins

output

bw : bin width
k : number of bins
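Example: a short comparison of the estimators listed above (bin_width is also exposed at the top level as spkit.bin_width):

    import numpy as np
    import spkit as sp

    np.random.seed(4)
    x = np.random.randn(1000)

    for method in ['fd', 'doane', 'scott', 'rice', 'sturges', 'sqrt']:
        bw, k = sp.bin_width(x, method=method)
        print(method, bw, k)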

spkit.core.infotheory.entropy(x, alpha=1, ignoreZero=False, base=2, normalize=False)[source]

Rényi entropy of order α, for alpha in [0, inf]:

alpha=0 : Max-entropy, H(x) = log(N), where N = number of bins

alpha=1 : Shannon entropy, H(x) = -sum{Px*log(Px)}

alpha=2 : Collision entropy (Rényi entropy of order 2), H(x) = (1/(1-α))*log{sum{Px^α}}

alpha=inf : Min-entropy, H(x) = -log(max(Px))

base: base of the log

: base 2 gives entropy in bits, base e in nats, base 10 in bans

ignoreZero: if True, zero-valued probabilities are omitted before computation

: it does not make much of a difference, since 0*log(0) is taken as 0

spkit.core.infotheory.entropy_approx(X, m, r)[source]

Approximate Entropy Ref: https://en.wikipedia.org/wiki/Approximate_entropy

Return type

float

spkit.core.infotheory.entropy_cond(x, y, ignoreZero=False, base=2)[source]

H(X|Y) = H(X,Y) - H(Y)

0 <= H(X|Y) <= H(X)

spkit.core.infotheory.entropy_cross(x, y, base=2)[source]

Cross entropy H_xy = - sum{Px*log(Py)}

spkit.core.infotheory.entropy_joint(x, y, ignoreZero=False, base=2)[source]

H(X,Y) = -sum {P(x,y)*log(P(x,y))}

Joint probability is computed using histogram2d from numpy

max{H(X),H(Y)} <= H(X,Y) <= H(X) + H(Y)

spkit.core.infotheory.entropy_kld(x, y, base=2)[source]

H_xy = sum{Px*log(Px/Py)} : Kullback–Leibler divergence (relative entropy)

spkit.core.infotheory.entropy_permutation(x, order=3, delay=1, base=2, normalize=False)[source]

Permutation Entropy.

spkit.core.infotheory.entropy_sample(X, m, r)[source]

Sample Entropy Ref: https://en.wikipedia.org/wiki/Sample_entropy

spkit.core.infotheory.entropy_spectral(x, fs, method='fft', alpha=1, base=2, normalize=True, axis=-1, nperseg=None, bining=True)[source]

Spectral Entropy

Measure of the uncertainty of the frequency components in a signal. A uniformly distributed signal and a Gaussian distributed signal have quite different amplitude-domain entropy, but in the spectral domain both have the same entropy.

x : 1d array
fs : sampling frequency
method : 'fft' uses a periodogram, 'welch' uses the Welch method
base : base of log, 2 for bits, e for nats, 10 for bans
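Example: a sketch of the contrast described above; both noise signals have flat spectra, so their spectral entropies should be close even though their amplitude-domain entropies differ:

    import numpy as np
    import spkit as sp

    np.random.seed(5)
    fs = 128
    x_uni = np.random.rand(fs*8) - 0.5    # uniformly distributed noise
    x_gau = np.random.randn(fs*8)         # Gaussian distributed noise

    print(sp.entropy(x_uni), sp.entropy(x_gau))                            # differ
    print(sp.entropy_spectral(x_uni, fs), sp.entropy_spectral(x_gau, fs))  # close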

spkit.core.infotheory.entropy_svd(x, order=3, delay=1, base=2, normalize=False)[source]

Singular Value Decomposition entropy.

spkit.core.infotheory.mutual_Info(x, y, ignoreZero=False, base=2)[source]

I(X;Y) = H(X)+H(Y)-H(X,Y)

spkit.core.infotheory.plotJointEntropyXY(x, y, Venn=True, DistPlot=True, JointPlot=True, printVals=True)[source]

Analyse Entropy-Venn Diagrams of X and Y

input

x : 1d-array
y : 1d-array of the same size as x
Venn : if True, plot the Venn diagram
DistPlot : if True, plot the distributions of x and y
JointPlot : if True, plot the joint distribution
printVals : if True, print the values

output

Nothing is returned, just plots and prints

spkit.core.infotheory.quantize_FD(x, scale=1, min_bins=2, keep_amp_scale=True)[source]

Discretize (quantize) input signal x using optimal bin-width (FD)

input

output

Wavelet Analysis

spkit.cwt.ScalogramCWT(x, t, wType='Gauss', fs=128, PlotPSD=False, PlotW=False, fftMeth=True, interpolation='sinc', **Parameters)

Compute the scalogram using the Continuous Wavelet Transform for the wavelet type (wType) and the given scale range

Parameters

x : array-like, input signal
t : array-like, time array corresponding to x, same length as x
fs : sampling rate
PlotPSD : bool, if True, plot the scalogram
PlotW : bool, if True, plot the wavelets in time and frequency at the different scales
fftMeth : if True, the FFT method is used, else the convolution method; the FFT method is faster
interpolation : str or None, interpolation used while plotting the scalogram

Parameters for different wavelet functions

Common Parameters for all the Wavelet functions

f : array of the frequency range to be analysed, e.g. np.linspace(-10,10,2*N-1), where N = len(x)

: if None, the frequency range of the signal is used, from -fs/2 to fs/2, i.e. fs/n1*(np.arange(n1)-n1/2)

A list of wavelets will be generated for each value of scale (e.g. f0, sigma, n etc)

  1. Gauss: (wType ==’Gauss’)

    f0 = array of center frequencies for the wavelets, default: np.linspace(0.1,10,100) [scale values]
    Q = float or array of q-factors, one per wavelet, e.g. 0.5 (default) or np.linspace(0.1,5,100)

    : if array, should be of the same size as f0

    t0 = float (default 0), time shift of the wavelet, or phase shift in frequency; changing it is not suggested

  2. For Morlet: (wType ==’Morlet’)

    sig = array of sigma values for the Morlet wavelet, default: np.linspace(0.1,10,100) [scale values]
    fw = array of the frequency range, e.g. np.linspace(-10,10,2*N-1), where N = len(x)
    ref: https://en.wikipedia.org/wiki/Morlet_wavelet

  3. For Gabor: (wType ==’Gabor’)

    Gauss and Gabor wavelets are essentially the same
    f0 = array of center frequencies for the wavelets, default: np.linspace(1,40,100) [scale values]
    a = float, oscillation parameter, default 0.5

    : could be an array (not suggested), similar to Gauss, e.g. np.linspace(0.1,1,100) or np.logspace(0.001,0.5,100)

    t0 = float (default 0), time shift of the wavelet, or phase shift in frequency; changing it is not suggested

  4. For Poisson: (wType==’Poisson’)

    n = array of integers, default np.arange(100) [scale values]
    method = 1, 2, or 3, different implementations of the Poisson function, default 3

    : keep method=3; the other methods are under development and not yet fully compatible with the framework

    ref: https://en.wikipedia.org/wiki/Poisson_wavelet

  5. For Complex MexicanHat: (wType=='cMaxican')

    f0 = array of center frequencies for the wavelets, default: np.linspace(1,40,100) [scale values]
    a = float, oscillation parameter, default 1.0; could be an array (not suggested)

    ref: https://en.wikipedia.org/wiki/Complex_Mexican_hat_wavelet

  6. For Complex Shannon: (wType==’cShannon’)

    f0 = array of center frequencies for the wavelets, default: 0.1*np.arange(10) [scale values]
    fw = bandwidth of each wavelet, default 0.5; could be an array (not suggested)

    ref: https://en.wikipedia.org/wiki/Shannon_wavelet

Returns

XW : complex-valued time-scale matrix (scalogram) of shape (len(S), len(x)) - scale vs time
S : scale values

Examples

import numpy as np
import matplotlib.pyplot as plt
from spkit.cwt import ScalogramCWT

# Example 1 - EEG Signal
import spkit as sp
from spkit.cwt import compare_cwt_example
x,fs = sp.load_data.eegSample_1ch()
t = np.arange(len(x))/fs
print(x.shape, t.shape)
compare_cwt_example(x,t,fs=fs)

# Example 2.1 - different wavelets
XW,S = ScalogramCWT(x,t,fs=fs,wType='Gauss',PlotPSD=True)

# Example 2.2 - set scale values and number of points
nS = 100
f0 = np.linspace(0.1,10,nS)   # range of scale values - frequency
Q = np.linspace(0.1,5,nS)     # different q-factor for each scale value
# Q = 0.5
XW,S = ScalogramCWT(x,t,fs=fs,wType='Gauss',PlotPSD=True,f0=f0,Q=Q)

# Example 2.3 - plot the scaled wavelets too
XW,S = ScalogramCWT(x,t,fs=fs,wType='Gauss',PlotPSD=True,PlotW=True,f0=f0,Q=Q)

# Example 3
t = np.linspace(-5, 5, 10*100)
x = (np.sin(2*np.pi*0.75*t*(1-t) + 2.1) +
     0.1*np.sin(2*np.pi*1.25*t + 1) +
     0.18*np.cos(2*np.pi*3.85*t))
xn = x + np.random.randn(len(t)) * 0.5
XW,S = ScalogramCWT(xn,t,fs=100,wType='Gauss',PlotPSD=True)

# Example 4
f0 = np.linspace(0.1,30,100)
Q = np.linspace(0.1,5,100)    # or Q = 0.5
XW,S = ScalogramCWT(xn,t,fs=128,wType='Gauss',PlotPSD=True,f0=f0,Q=Q)

Analysis & Synthesis Models

spkit.dft_analysis(x, window='blackmanharris', N=None, scaling_dB=True, normalize_win=True, plot=False, fs=None)[source]

Analysis of a signal x using the Discrete Fourier Transform

input

x : 1d-array, input signal of shape (n,)
window : window type (default = 'blackmanharris')

: if None, a rectangular window is used

N : FFT size, should be >= len(x) and a power of 2

: if None, then N = 2**np.ceil(np.log2(len(x)))

scaling_dB : bool, if False, the spectrum is returned in linear scale, else in dB

: default True

normalize_win : bool (default True), whether to normalize the window (recommended)
plot : int (default 0, no plot)

: 1 to plot the magnitude and phase spectrum
: 2 to plot the signal along with the spectrum

fs : sampling frequency, only used to plot the signal when plot=2

: if not provided, fs=1 is used
: it does not affect any computations

output

mX : magnitude spectrum, of shape (int(N/2)+1,) - positive spectrum
pX : phase spectrum, same shape as mX
N : N-point FFT used for the computation

spkit.dft_synthesis(mX, pX, M=None, scaling_dB=True, window=None)[source]

Synthesis of a signal using the Discrete Fourier Transform from positive spectra

input

mX : magnitude spectrum - 1d-array of shape (int(N/2)+1,), for an N-point FFT
pX : phase spectrum - same size as mX
M : length of the output signal; if None, then M = N = 2*(len(mX)-1)
window : if provided, the synthesized signal is rescaled with the corresponding window function

: undoing the scaling applied during analysis

output

y: output signal of shape (M,)
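Example: a round-trip sketch based on the two signatures above; passing the same window to dft_synthesis undoes the analysis scaling:

    import numpy as np
    import spkit as sp

    fs = 128
    t = np.arange(0, 2, 1/fs)
    x = np.sin(2*np.pi*8*t) + 0.5*np.sin(2*np.pi*20*t)

    mX, pX, N = sp.dft_analysis(x, window='blackmanharris')
    y = sp.dft_synthesis(mX, pX, M=len(x), window='blackmanharris')
    print(mX.shape)               # int(N/2)+1 bins - positive spectrum
    print(np.max(np.abs(x - y)))  # expected to be small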

spkit.stft_analysis(x, winlen, window='blackmanharris', nfft=None, overlap=None)[source]

Analysis of a signal using the Short-Time Fourier Transform

input

x : 1d-array signal - shape (n,)
winlen : window length for analysis (an odd number is a good choice)

: the window size is chosen based on the required frequency resolution
: winlen >= Bs*fs/del_f
: where Bs=4 for a hamming window, Bs=6 for blackmanharris
: del_f is the difference between the two frequencies to be resolved
: the longer the window, the better the frequency resolution, but the poorer the time resolution

overlap : overlap of windows

: if None, winlen//2 is used (50% overlap)
: a shorter overlap can improve time resolution - up to an extent

window : analysis window (default = 'blackmanharris')

: if None, a rectangular window is used

nfft : FFT size, should be >= winlen and a power of 2

: if None, nfft = 2**np.ceil(np.log2(winlen))

output

mXt : magnitude spectra, of shape (number of frames, int(nfft/2)+1)
pXt : phase spectra, same shape as mXt

spkit.stft_synthesis(mXt, pXt, winlen, overlap)[source]

Synthesis of signal from Short-Time Fourier Transform

input

mXt : magnitude spectra of the signal - 2d-array of shape (number of frames, int(nfft/2)+1)
pXt : phase spectra, same size as mXt
winlen : window length used during analysis
overlap : window overlap used during analysis

output

y : 1d-array - synthesized signal, of shape (nFrames*overlap + winlen,)
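Example: a minimal analysis-synthesis sketch based on the signatures above; winlen and overlap must match between the two calls:

    import numpy as np
    import spkit as sp

    np.random.seed(6)
    x = np.random.randn(128*4)

    winlen = 127                                    # odd window length, as suggested
    mXt, pXt = sp.stft_analysis(x, winlen=winlen)   # default overlap = winlen//2
    y = sp.stft_synthesis(mXt, pXt, winlen=winlen, overlap=winlen//2)
    print(mXt.shape)    # (number of frames, int(nfft/2)+1)
    print(len(y))       # ~ nFrames*overlap + winlen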

spkit.sineModel_analysis(x, fs, winlen, overlap=None, window='blackmanharris', nfft=None, thr=-10, maxn_sines=100, minDur=0.01, freq_devOffset=20, freq_devSlope=0.01)[source]

Analysis of a signal x using the sinusoidal model

  • Decomposing a signal x into sine waves tracks over the time

input

x : input signal - 1d-array of shape (n,)
fs : sampling frequency

winlen : window length for analysis
overlap : overlap of windows

: if None, overlap = winlen//4

window : window type (default = 'blackmanharris'), e.g. 'hann', 'hamming'

nfft : FFT points used for analysis, should be >= winlen and a power of 2

: if None, then nfft = 2**np.ceil(np.log2(winlen))

thr : threshold in negative dB for selecting sine tracks
maxn_sines : maximum number of sines per frame
minDur : minimum duration of sine tracks in seconds
freq_devOffset : minimum frequency deviation
freq_devSlope : slope increase of minimum frequency deviation

output

fXt : frequencies of the sinusoidal tracks
mXt : magnitudes of the sinusoidal tracks
pXt : phases of the sinusoidal tracks

spkit.sineModel_synthesis(fXt, mXt, pXt, fs, overlap, crop_end=False)[source]

Synthesis of signal x using the Sinusoidal Model

Synthesizing a signal from given sinusoidal tracks - frequencies, magnitudes, and phases

input

fXt : frequency tracks - 2d-array of shape (number of frames, number of sinusoidal tracks)
mXt : magnitudes - array of the same size as fXt
pXt : phases of the sinusoids
fs : sampling frequency
overlap : overlap of consecutive frames (in samples)

output

y : 1d-array - synthesized signal

Ref: https://www.coursera.org/learn/audio-signal-processing
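Example: a sketch of the analysis-synthesis pair, reusing the default analysis overlap (winlen//4) for synthesis; thr and maxn_sines here are arbitrary illustrative choices:

    import numpy as np
    import spkit as sp

    fs = 128
    t = np.arange(0, 2, 1/fs)
    x = np.sin(2*np.pi*8*t) + 0.5*np.sin(2*np.pi*20*t)

    winlen = 101
    fXt, mXt, pXt = sp.sineModel_analysis(x, fs, winlen=winlen, thr=-30, maxn_sines=10)
    y = sp.sineModel_synthesis(fXt, mXt, pXt, fs, overlap=winlen//4)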

spkit.frft(x, alpha=0.1, method=1)[source]

Fractional Fourier Transform

input

x : real signal
alpha : fractional order of the transform
method : 1 (default; other methods to be implemented)

output

Y: complex signal

spkit.ifrft(x, alpha=0.1, method=1, verbose=0)[source]

Inverse Fractional Fourier Transform

spkit.ffrft(x, alpha)[source]

Fast Fractional Fourier Transform

  • perfect reconstruction with iffrft

input

x : real signal
alpha : fractional order of the transform

output

Y: complex signal

spkit.iffrft(x, alpha)[source]

Inverse Fast Fractional Fourier Transform
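Example: a sketch of the transform pairs above; per the note on ffrft, the fast variant reconstructs with iffrft (taking the real part is an assumption, since the output is complex):

    import numpy as np
    import spkit as sp

    np.random.seed(7)
    x = np.random.randn(256)

    Y = sp.frft(x, alpha=0.5)        # fractional Fourier transform
    x_rec = sp.ifrft(Y, alpha=0.5)   # inverse

    Yf = sp.ffrft(x, alpha=0.5)      # fast variant
    x_recf = sp.iffrft(Yf, alpha=0.5)
    print(np.max(np.abs(x - x_recf.real)))   # expected to be ~0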

spkit.RFB(x, Pmax=10, Rcq=10, Rav=2, Th=0.2, Penalty=None, return_filters=False, apply_averaging=True)[source]

Ramanujan Filter Banks for Estimation and Tracking of Periodicity

input

x : 1d array, input sequence (signal)
Pmax : the largest expected period
Rcq : number of repeats in each Ramanujan filter
Rav : number of repeats in each averaging filter
Th : outputs of the RFB are thresholded to zero for all values less than Th*max(output)
Penalty : penalty for each period, shape=(Pmax,)

: if None, set to 1, meaning no penalty

output

y = 2d array of shape = (len(x),Pmax), time vs period matrix, normalized

if return_filters==True: also returns

FR = list of Ramanujan filters
FA = list of averaging filters

References: [1] S.V. Tenneti and P. P. Vaidyanathan, “Ramanujan Filter Banks for Estimation and Tracking of Periodicity”, Proc. IEEE Int. Conf. Acoust. Speech, and Signal Proc., Brisbane, April 2015.

[2] P.P. Vaidyanathan and S.V. Tenneti, “Properties of Ramanujan Filter Banks”, Proc. European Signal Processing Conference, France, August 2015.

The Python implementation is based on the MATLAB version from http://systems.caltech.edu/dsp/students/srikanth/Ramanujan/
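Example: an illustrative sketch on a synthetic periodic sequence (the impulse train and its period are arbitrary choices for the demo; columns of y are assumed to index periods 1..Pmax):

    import numpy as np
    import spkit as sp

    np.random.seed(8)
    x = np.zeros(300)
    x[::7] = 1                           # impulse train of period 7
    x = x + 0.1*np.random.randn(len(x))

    y = sp.RFB(x, Pmax=20, Rcq=10, Rav=2, Th=0.2)
    print(y.shape)                       # (len(x), Pmax) - time vs period
    print(1 + np.argmax(y.sum(axis=0)))  # strongest period, expected ~7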

spkit.RFB_prange(x, Pmin=1, Pmax=10, skip=1, Rcq=10, Rav=2, thr=0.2, Penalty=None, return_filters=False, apply_averaging=True)[source]

Ramanujan Filter Banks for Estimation and Tracking of Periodicity

  • for range of period given by Pmin and Pmax.

input

x : 1d array, input sequence (signal)
Pmin : the smallest expected period (default=1)
Pmax : the largest expected period
skip : int >= 1, period skipping step (default=1 -> no skipping; >1 is not recommended)
Rcq : number of repeats in each Ramanujan filter
Rav : number of repeats in each averaging filter
thr : outputs of the RFB are thresholded to zero for all values less than thr*max(output)
Penalty : penalty for each period, shape=(Pmax,)

: if None, set to 1, meaning no penalty

apply_averaging : bool, if False, no averaging is applied (default=True)
return_filters : bool, if True, also return FR - Ramanujan filters and FA - averaging filters

output

y = 2d array of shape = (len(x),Pmax), time vs period matrix, normalized

if return_filters==True: also returns

FR = list of Ramanujan filters
FA = list of averaging filters

References: [1] S.V. Tenneti and P. P. Vaidyanathan, “Ramanujan Filter Banks for Estimation and Tracking of Periodicity”, Proc. IEEE Int. Conf. Acoust. Speech, and Signal Proc., Brisbane, April 2015.

[2] P.P. Vaidyanathan and S.V. Tenneti, “Properties of Ramanujan Filter Banks”, Proc. European Signal Processing Conference, France, August 2015.

The Python implementation is based on the MATLAB version from http://systems.caltech.edu/dsp/students/srikanth/Ramanujan/

spkit.Create_Dictionary(Nmax, rowSize, method='Ramanujan')[source]

Creating Dictionary

input

Nmax : maximum expected period
rowSize : number of rows (e.g. number of samples in the signal)
method : 'Ramanujan', 'random', 'NaturalBasis', or 'DFT'

output

A : Matrix of (rowSize, q)

Reference: [1] S.V. Tenneti and P. P. Vaidyanathan, "Nested Periodic Matrices and Dictionaries: New Signal Representations for Period Estimation", IEEE Transactions on Signal Processing, vol. 63, no. 14, pp. 3736-3750, July 2015.

The Python implementation is based on the MATLAB version from http://systems.caltech.edu/dsp/students/srikanth/Ramanujan/

spkit.PeriodStrength(x, Pmax, method='Ramanujan', lambd=1, L=1, cvxsol=False)[source]

Computing strength of periods

for a given signal x, using the chosen method and its respective loss function (e.g. l1, l2)

inputs

x : one-dimensional sequence (signal)
Pmax : largest expected period in the signal
method : type of dictionary used to create the transform matrix A

: 'Ramanujan', 'NaturalBasis', 'random', or 'Farray' (DFT)

lambd : weight of the penalty vector, to force the solution towards lower (usually) or higher periods

: if 0, the penalty vector is all ones, meaning no penalization
: if >0, lambd is multiplied with the penalty vector

L : regularization: L=1 minimizes ||s||_1, L=2 minimizes ||s||_2

cvxsol : bool, whether to use the cvxpy solver or the matrix decomposition approach

: the matrix decomposition approach works only for L=2
: for L=1, use cvxpy as the solver

output

period_energy : vector of shape (Pmax,) - strength of each period

Reference: [1] S.V. Tenneti and P. P. Vaidyanathan, "Nested Periodic Matrices and Dictionaries: New Signal Representations for Period Estimation", IEEE Transactions on Signal Processing, vol. 63, no. 14, pp. 3736-3750, July 2015.

The Python implementation is based on the MATLAB version from http://systems.caltech.edu/dsp/students/srikanth/Ramanujan/
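Example: a sketch using the matrix decomposition path (L=2, cvxsol=False) on a synthetic period-10 sequence; the mean is removed so the strength does not concentrate at period 1:

    import numpy as np
    import spkit as sp

    np.random.seed(9)
    x = np.zeros(200)
    x[::10] = 1                          # impulse train of period 10
    x = x - x.mean() + 0.05*np.random.randn(len(x))

    pe = sp.PeriodStrength(x, Pmax=30, method='Ramanujan', L=2, cvxsol=False)
    print(pe.shape)                      # (Pmax,)
    print(1 + np.argmax(pe))             # estimated period, expected ~10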

spkit.RFB_example_1(period=10, SNR=0, seed=10)[source]
spkit.RFB_example_2(periods=[3, 7, 11], signal_length=100, SNR=10, seed=15)[source]
spkit.filterDC(X, alpha=256, return_background=False, initialize_zero=True)[source]

Filter out DC component - Removing drift using a recursive (IIR type) filter

y[n] = ((alpha-1)/alpha) * (x[n] - x[n-1] + y[n-1])

where x[-1] = x[0] and y[-1] = 0, so that y[0] = 0

The implementation works for single-channel (1d array) or multi-channel (2d array) signals.

input

X : (array) input signal, single channel (n,) or multi-channel with channel axis 1, shape ~ (n,ch)

alpha : (scalar) filter coefficient; the higher it is, the more the DC component (0-frequency component) is suppressed

: with alpha=256, the DC component is suppressed by 20 dB

initialize_zero : (bool) if True, the running background b is initialized with x[0], resulting in y[0] = 0;

if False, b = 0, resulting in y[0] ~ x[0] with a slow drift towards the zero line - setting it to True is recommended

output

Y : output vector, shape same as input X (n,) or (n,ch)
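Example: a drift-removal sketch based on the description above:

    import numpy as np
    import spkit as sp

    fs = 128
    t = np.arange(0, 10, 1/fs)
    x = np.sin(2*np.pi*5*t) + 0.5*t      # sinusoid with a slow linear drift

    y = sp.filterDC(x, alpha=256, initialize_zero=True)
    print(x.mean(), y.mean())            # the drift/DC component is suppressed in y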

spkit.filterDC_sGolay(X, window_length=127, polyorder=3, deriv=0, delta=1.0, mode='interp', cval=0.0, return_background=False)[source]

Filter out DC component - Removing drift using the Savitzky-Golay filter

Savitzky-Golay filter for multi-channels signal: From Scipy library

input

X : (array) input signal, single channel (n,) or multi-channel with channel axis 1, shape ~ (n,ch)
window_length : should be an odd number
the other input parameters are the same as in scipy.signal.savgol_filter

: (polyorder=3, deriv=0, delta=1.0, mode='interp', cval=0.0)

output

Y : corrected signal
Xm : removed background - returned only if return_background is True

spkit.filter_X(X, fs=128.0, band=[0.5], btype='highpass', order=5, ftype='filtfilt', verbose=1, use_joblib=False)[source]

Butterworth filtering - basic filtering

X : (array) input signal, single channel (n,) or multi-channel with channel axis 1, shape ~ (n,ch)

band : cutoff frequency; for lowpass and highpass, band is a list of one number, for bandpass a list of two
btype : filter type
order : order of the filter
ftype : filtering approach, 'filtfilt' or 'lfilter'

: lfilter is a causal filter, which introduces a delay; filtfilt introduces no delay, but is non-causal

Xf: filtered signal of same size as X
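Example: two typical calls based on the parameters above:

    import numpy as np
    import spkit as sp

    fs = 128.0
    np.random.seed(10)
    X = np.random.randn(int(fs)*10, 3)   # 10 s of 3-channel noise, shape (n,ch)

    # highpass at 0.5 Hz (band is a list of one number)
    Xh = sp.filter_X(X, fs=fs, band=[0.5], btype='highpass', ftype='filtfilt', verbose=0)

    # bandpass 4-8 Hz (band is a list of two numbers)
    Xb = sp.filter_X(X, fs=fs, band=[4, 8], btype='bandpass', ftype='filtfilt', verbose=0)
    print(Xh.shape, Xb.shape)            # same shape as X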

spkit.eeg.ATAR(X, wv='db3', winsize=128, thr_method='ipr', IPR=[25, 75], beta=0.1, k1=10, k2=100, est_wmax=100, theta_a=inf, bf=2, gf=0.8, OptMode='soft', wpd_mode='symmetric', wpd_maxlevel=None, factor=1.0, verbose=False, window=['hamming', True], hopesize=None, ReconMethod='custom', packetwise=False, WPD=True, lvl=[], fs=128.0, use_joblib=False)[source]

ATAR - Automatic and Tunable Artifact Removal Algorithm

Apply ATAR on short windows of a signal (multiple channels, if provided, on axis 1). The signal is decomposed into smaller overlapping windows and reconstructed after correction using the overlap-add method.

For more details, see: Bajaj, Nikesh, et al. "Automatic and tunable algorithm for EEG artifact removal using wavelet decomposition with applications in predictive modeling during auditory tasks." Biomedical Signal Processing and Control 55 (2020): 101624.

input

X: input multi-channel signal of shape (n,ch)

Wavelet family: wv = [‘db3’…..’db38’, ‘sym2…..sym20’, ‘coif1…..coif17’, ‘bior1.1….bior6.8’, ‘rbio1.1…rbio6.8’, ‘dmey’]

:’db3’(default)

Threshold computation method: thr_method : 'ipr' (default), None

: None : a fixed threshold theta_a is applied
: 'ipr' : threshold computed from theta_a, bf, gf, beta, k1, k2 and OptMode
: theta_b = bf*theta_a
: theta_g = gf*theta_a

Operating modes: OptMode = ['soft', 'elim', 'linAtten']

: default 'soft'
: use 'elim' with a global threshold

Wavelet Decomposition modes: wpd_mode = [‘zero’, ‘constant’, ‘symmetric’, ‘periodic’, ‘smooth’, ‘periodization’]

default ‘symmetric’

Reconstruction method (overlap-add): ReconMethod : None, 'custom', 'HamWin'

: for 'custom': window[0] is used, and it is applied after denoising if window[1] is True, else windowing is applied before denoising

output

XR: corrected signal of same shape as input X
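Example: a minimal sketch with random noise standing in for real EEG (an assumption for the demo; shapes follow the description above):

    import numpy as np
    import spkit as sp

    fs = 128.0
    np.random.seed(11)
    X = np.random.randn(int(fs)*10, 4)   # stand-in for 10 s of 4-channel EEG, shape (n,ch)

    XR = sp.eeg.ATAR(X, wv='db3', winsize=128, thr_method='ipr',
                     OptMode='soft', verbose=False, fs=fs)
    print(XR.shape)                      # same shape as X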

spkit.eeg.ATAR_1Ch(x, wv='db3', winsize=128, thr_method='ipr', IPR=[25, 75], beta=0.1, k1=None, k2=100, est_wmax=100, theta_a=inf, bf=2, gf=0.8, OptMode='soft', factor=1.0, wpd_mode='symmetric', wpd_maxlevel=None, verbose=False, window=['hamming', True], hopesize=None, ReconMethod='custom', packetwise=False, WPD=True, lvl=[], fs=128.0)[source]

ATAR - Automatic and Tunable Artifact Removal Algorithm

Apply ATAR on short windows of a signal (single channel):

The signal is decomposed into smaller overlapping windows and reconstructed after correction using the overlap-add method.

Internally uses Wfilter(x, wv='db3', method=None, IPR=[25,75], beta=0.1, k1=None, k2=100, theta_a=np.inf, bf=2, gf=0.8, OptMode='soft', factor=1.0, showPlot=False, wpd_mode='symmetric', wpd_maxlevel=None)

input

x : input single-channel signal of shape (n,)

Threshold computation method: thr_method : 'ipr' (default), None

: used with theta_a, bf, gf
: theta_b = bf*theta_a
: theta_g = gf*theta_a

Operating modes: OptMode = ['soft', 'elim', 'linAtten']

: default 'soft'
: use 'elim' with a global threshold

Wavelet Decomposition modes: wpd_mode = [‘zero’, ‘constant’, ‘symmetric’, ‘periodic’, ‘smooth’, ‘periodization’]

default ‘symmetric’

Wavelet family: wv = [‘db3’…..’db38’, ‘sym2…..sym20’, ‘coif1…..coif17’, ‘bior1.1….bior6.8’, ‘rbio1.1…rbio6.8’, ‘dmey’]

:’db3’(default)

Reconstruction method: ReconMethod : None, 'custom', 'HamWin'

: for 'custom': window[0] is used, and it is applied after denoising if window[1] is True, else windowing is applied before denoising

output

XR: corrected signal of same shape as input X

spkit.eeg.ATAR_mCh(X, wv='db3', winsize=128, thr_method='ipr', IPR=[25, 75], beta=0.1, k1=10, k2=100, est_wmax=100, theta_a=inf, bf=2, gf=0.8, OptMode='soft', wpd_mode='symmetric', wpd_maxlevel=None, factor=1.0, verbose=False, window=['hamming', True], hopesize=None, ReconMethod='custom', packetwise=False, WPD=True, lvl=[], fs=128.0, use_joblib=False)[source]

ATAR - Automatic and Tunable Artifact Removal Algorithm

Apply ATAR on short windows of a signal (multiple channels):

The signal is decomposed into smaller overlapping windows and reconstructed after correction using the overlap-add method.

For more details, see: Bajaj, Nikesh, et al. "Automatic and tunable algorithm for EEG artifact removal using wavelet decomposition with applications in predictive modeling during auditory tasks." Biomedical Signal Processing and Control 55 (2020): 101624.

input

X: input multi-channel signal of shape (n,ch)

Wavelet family: wv = [‘db3’…..’db38’, ‘sym2…..sym20’, ‘coif1…..coif17’, ‘bior1.1….bior6.8’, ‘rbio1.1…rbio6.8’, ‘dmey’]

:’db3’(default)

Threshold computation method: thr_method : 'ipr' (default), None

: None : a fixed threshold theta_a is applied
: 'ipr' : threshold computed from theta_a, bf, gf, beta, k1, k2 and OptMode
: theta_b = bf*theta_a
: theta_g = gf*theta_a

Operating modes: OptMode = ['soft', 'elim', 'linAtten']

: default 'soft'
: use 'elim' with a global threshold

Wavelet Decomposition modes: wpd_mode = [‘zero’, ‘constant’, ‘symmetric’, ‘periodic’, ‘smooth’, ‘periodization’]

default ‘symmetric’

Reconstruction method (overlap-add): ReconMethod : None, 'custom', 'HamWin'

: for 'custom': window[0] is used, and it is applied after denoising if window[1] is True, else windowing is applied before denoising

output

XR: corrected signal of same shape as input X

spkit.eeg.ICA_filtering(X, winsize=128, ICA_method='extended-infomax', kur_thr=2, corr_thr=0.8, AF_ch_index=[0, 13], F_ch_index=[1, 2, 11, 12], verbose=True, window=['hamming', True], hopesize=None, winMeth='custom')[source]

input

X : input signal (n,ch) with n samples and ch channels
winsize : window size to process; if None, the entire signal is processed at once
ICA_method : one of ['fastICA', 'infomax', 'extended-infomax', 'picard']

(1) Kurtosis based artifacts - mostly for motion artifacts

kur_thr : (default 2) threshold on the kurtosis of independent components; components exceeding it are removed. The higher the threshold, the more peaky a component must be to be selected

: positive value

(2) Correlation Based Index (CBI) for eye movement artifacts

To apply the CBI method, the indices of the prefrontal (AF - first layer of electrodes towards the frontal lobe) and frontal (F - second layer of electrodes) channels need to be provided. For the 14-channel Emotiv Epoc:

ch_names = ['AF3','F7','F3','FC5','T7','P7','O1','O2','P8','T8','FC6','F4','F8','AF4']
Prefrontal channels = ['AF3','AF4'], Frontal channels = ['F7','F3','F4','F8']
AF_ch_index = [0,13] : (AF - first layer of electrodes towards the frontal lobe)
F_ch_index = [1,2,11,12] : (F - second layer of electrodes)

if AF_ch_index or F_ch_index is None, CBI is not applied

(3) Correlation of any independent component with many EEG channels

If any independent component is correlated with corr_thr (80%) of the electrodes, it is considered artifactual - similar to CBI, except the comparison is over all channels rather than only the frontal and prefrontal ones.

corr_thr : (default 0.8) correlation threshold; the higher the value, the fewer ICs are removed, and vice versa

: float in [0,1]
: if None, this criterion is not applied
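Example: a sketch using the 14-channel Emotiv Epoc layout described above, with random noise standing in for real EEG (an assumption for the demo); the corrected signal is assumed to be returned, mirroring ATAR:

    import numpy as np
    import spkit as sp

    np.random.seed(12)
    X = np.random.randn(128*10, 14)      # stand-in for (n, 14) EEG

    XR = sp.eeg.ICA_filtering(X, winsize=128, ICA_method='extended-infomax',
                              kur_thr=2, corr_thr=0.8,
                              AF_ch_index=[0, 13], F_ch_index=[1, 2, 11, 12],
                              verbose=False)
    print(XR.shape)                      # same shape as X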

spkit.eeg.ICAremoveArtifact(x, ICA_method='extended-infomax', corr_thr=0.8, kur_thr=2.0, AF_ch_index=[0, 13], F_ch_index=[1, 2, 11, 12])[source]
spkit.ml.LR(X, y, alpha=0.01, lambd=0, polyfit=True, degree=3, FeatureNormalize=False)[source]
spkit.ml.LogisticRegression(alpha=0.01, lambd=0, polyfit=False, degree=3, FeatureNormalize=False, penalty='l2', tol=0.01, rho=0.9, C=1.0, fit_intercept=True)[source]
spkit.ml.NaiveBayes()[source]

The Gaussian Naive Bayes classifier, based on Bayes' rule.

X : shape (n, nf), n samples with nf features
y : shape (n,) or (n,1)

spkit.ml.ClassificationTree(min_samples_split=2, min_impurity=1e-07, max_depth=inf, thresholdFromMean=False)[source]

Classification tree (shares a common decision-tree super class with RegressionTree).

spkit.ml.RegressionTree(min_samples_split=2, min_impurity=1e-07, max_depth=inf, thresholdFromMean=False)[source]

Regression tree (shares a common decision-tree super class with ClassificationTree).

List of all functions

spkit.entropy
spkit.mutual_Info
spkit.entropy_joint
spkit.entropy_cond
spkit.entropy_cross
spkit.entropy_kld
spkit.entropy_spectral
spkit.entropy_sample
spkit.entropy_approx
spkit.entropy_svd
spkit.entropy_permutation
spkit.dispersion_entropy
spkit.dispersion_entropy_multiscale_refined
spkit.cwt.ScalogramCWT
spkit.cwt.compare_cwt_example
spkit.dft_analysis
spkit.dft_synthesis
spkit.stft_analysis
spkit.stft_synthesis
spkit.sineModel_analysis
spkit.sineModel_synthesis
spkit.f0_detection
spkit.peak_detection
spkit.peak_interp
spkit.TWM_f0
spkit.TWM_algo



spkit.frft
spkit.ifrft
spkit.ffrft
spkit.iffrft

spkit.HistPlot
spkit.Mu_law
spkit.A_law
spkit.bin_width
spkit.binSize_FD
spkit.Quantize
spkit.plotJointEntropyXY


spkit.low_resolution
spkit.cdf_mapping


spkit.SVD
spkit.ICA
spkit.infomax


spkit.filterDC
spkit.filterDC_sGolay
spkit.filter_X
spkit.Periodogram
spkit.getStats
spkit.getQuickStats
spkit.OutLiers

spkit.wavelet_filtering
spkit.wavelet_filtering_win
spkit.WPA_temporal
spkit.WPA_coeff
spkit.WPA_plot


spkit.RFB
spkit.RFB_prange
spkit.Create_Dictionary
spkit.PeriodStrength
spkit.RFB_example_1
spkit.RFB_example_2