spkit.data.gaussian

spkit.data.gaussian(N=[100, 100], ndist=3, means='random', sigmas='random', return_para=False, **kwargs)

Generate a 2-class dataset by sampling from a mixture of Gaussians.

Parameters:
N: list of two int, default=[100, 100]
  • vector that fixes the number of samples drawn from each class

  • example: N = [100, 100] gives 100 samples for each class

ndist: scalar, default=3
  • number of Gaussians for each class

means: array of shape (2*ndist, 2), default='random'
  • array of shape (2*ndist, 2) holding the mean of each Gaussian, one row per Gaussian

sigmas: array, default='random'
  • a sequence of covariance matrices of size (2*ndist, 2)

return_para: bool, default=False
  • if True, return the parameters

    New in version 0.0.9.7: Added to return parameters

Returns:
X: 2d-array
  • data matrix with a sample for each row

  • shape (n, 2)

    Changed in version 0.0.9.7: shape is changed to (n, 2)

y: 1d-array
  • vector with the labels

    Changed in version 0.0.9.7: shape is changed to (n, )

(ndist, means, sigmas): tuple
  • the parameters used to generate the data, returned only if return_para is True
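
To make the parameter and return shapes above concrete, here is a minimal NumPy sketch of sampling a 2-class dataset from a mixture of Gaussians. It is not spkit's implementation. The function name `sample_two_class_mixture` is hypothetical, and the sketch assumes that the first ndist rows of means and sigmas belong to class 0 and the rest to class 1, that each row of sigmas holds per-axis standard deviations, and that every sample picks one of its class's Gaussians uniformly at random. The ranges used for the random means and sigmas are also arbitrary choices.

```python
import numpy as np


def sample_two_class_mixture(N=(100, 100), ndist=3, means=None, sigmas=None, seed=None):
    """Hypothetical sketch of 2-class Gaussian-mixture sampling (not spkit's code)."""
    rng = np.random.default_rng(seed)
    n_comp = 2 * ndist  # ndist Gaussians per class, two classes

    # Assumed layout: rows [0, ndist) -> class 0, rows [ndist, 2*ndist) -> class 1
    if means is None:
        means = rng.uniform(-5.0, 5.0, size=(n_comp, 2))   # arbitrary range
    if sigmas is None:
        sigmas = rng.uniform(0.3, 1.0, size=(n_comp, 2))   # per-axis std (assumption)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)

    X_parts, y_parts = [], []
    for label, n_samples in enumerate(N):
        # Pick one of this class's ndist Gaussians for every sample
        comp = rng.integers(0, ndist, size=n_samples) + label * ndist
        # Draw each sample from its chosen axis-aligned Gaussian
        X_parts.append(rng.normal(means[comp], sigmas[comp]))
        y_parts.append(np.full(n_samples, label))

    # X has one sample per row, y holds the matching class labels
    return np.vstack(X_parts), np.concatenate(y_parts)


X, y = sample_two_class_mixture(N=(100, 100), ndist=3, seed=3)
print(X.shape, y.shape)  # (200, 2) (200,)
```

With N = [100, 100] the sketch yields X of shape (200, 2) and y of shape (200,), matching the (n, 2) and (n, ) shapes documented under Returns.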

Examples

# sp.data.gaussian
import numpy as np
import matplotlib.pyplot as plt
import spkit as sp

# Fix the seed so the generated data is reproducible
np.random.seed(3)
X, y = sp.data.gaussian(N=[100, 100], ndist=3, means='random', sigmas='random')
np.random.seed(None)  # reseed from system entropy afterwards

# Plot the two classes as separate point clouds
plt.figure()
plt.plot(X[y == 0, 0], X[y == 0, 1], 'o')
plt.plot(X[y == 1, 0], X[y == 1, 1], 'o')
plt.xlabel('x1')
plt.ylabel('x2')
plt.title('Gaussian Data')
plt.show()