API Reference

blockdiagonalBMD

class bmdcluster.blockdiagonalBMD(n_clusters, max_iter=100, use_bootstrap=False, b=None, init_ratio=1.0, seed=None)[source]
__init__(n_clusters, max_iter=100, use_bootstrap=False, b=None, init_ratio=1.0, seed=None)[source]

Run the block-diagonal form of the BMD algorithm.

Parameters:
  • n_clusters (int) – number of data clusters
  • max_iter (int, optional) – maximum number of optimization iterations, by default 100
  • use_bootstrap (bool, optional) – use bootstrap cluster initialization, by default False
  • b (int, optional) – number of bootstrapped samples to use, by default None
  • init_ratio (float, optional) – fraction of points to randomly initialize, by default 1.0
  • seed (int, optional) – random initialization seed, by default None
Raises:
  • ValueError – If use_bootstrap is set to True but b is not specified
  • ValueError – If neither B_ident nor f_clusters is specified
fit(W, verbose=False)[source]

Fit the model.

Parameters:
  • W (np.array) – binary data matrix
  • verbose (bool, optional) – print progress during optimization, by default False
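
Since fit() expects W to be a binary matrix, raw data usually has to be binarized first. The following is a minimal sketch of one way to do that, assuming a simple presence/absence threshold suits the data. The helper binarize and the toy counts table are hypothetical and not part of bmdcluster.

```python
def binarize(rows, threshold=0):
    """Map each entry to 1 if it exceeds `threshold`, otherwise 0.

    Hypothetical helper for illustration; not part of bmdcluster.
    """
    return [[1 if value > threshold else 0 for value in row] for row in rows]


# Toy count table: 3 data points (rows) by 4 features (columns).
counts = [
    [3, 0, 1, 0],
    [0, 2, 0, 0],
    [5, 1, 0, 7],
]

W = binarize(counts)
# Every entry of W is now 0 or 1. Wrap it with np.array(W) before passing it to fit().
```

Whether a threshold of zero is appropriate depends on the data; it is only a placeholder choice here.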
fit_predict(W, verbose=False)[source]

Fit the model and return final value of objective function and cluster assignment labels for the data and features.

Parameters:
  • W (np.array) – binary data matrix
  • verbose (bool, optional) – print progress during optimization, by default False
Returns:

  • float – final value of objective function
  • np.array – data cluster labels
  • np.array – feature cluster labels
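
A minimal usage sketch of fit_predict, based only on the signatures documented above. The toy matrix and the choice of n_clusters=2 and seed=0 are illustrative assumptions, and the import is guarded so the snippet still runs where bmdcluster is not installed.

```python
import importlib.util

# Toy binary matrix: rows 0-2 mostly use features 0-1, rows 3-5 mostly use features 2-3.
W_rows = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

result = None
if importlib.util.find_spec("bmdcluster") is not None:
    import numpy as np
    from bmdcluster import blockdiagonalBMD

    model = blockdiagonalBMD(n_clusters=2, seed=0)
    # fit_predict returns three values in the documented order:
    # objective value, data cluster labels, feature cluster labels.
    cost, data_labels, feature_labels = model.fit_predict(np.array(W_rows))
    result = (cost, data_labels, feature_labels)
```

No expected output is shown because the labels depend on the random initialization.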

fit_transform(W, verbose=False)[source]

Fit the model and return final value of objective function and final values of the data and feature cluster assignment matrices A and B, whose entries are cluster affinity scores.

Parameters:
  • W (np.array) – binary data matrix
  • verbose (bool, optional) – print progress during optimization, by default False
Returns:

  • float – final value of the objective function
  • np.array – final value of data cluster assignment matrix A
  • np.array – final value of feature cluster assignment matrix B

get_data_labels()[source]

Get data cluster labels after .fit(). Outliers will be labeled -1.

Returns: data cluster labels
Return type: np.array
get_feature_labels()[source]

Get feature cluster labels after .fit(). Outliers will be labeled -1.

Returns: feature cluster labels
Return type: np.array
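
Because get_data_labels() and get_feature_labels() mark outliers with -1, downstream code usually needs to separate them from the real clusters. The sketch below shows one way to do that on the consumer side. The helper group_by_label and the example label list are hypothetical and not part of bmdcluster.

```python
from collections import defaultdict


def group_by_label(labels):
    """Split indices into per-cluster lists and a separate list of outliers (-1).

    Hypothetical helper for illustration; not part of bmdcluster.
    """
    clusters = defaultdict(list)
    outliers = []
    for index, label in enumerate(labels):
        if label == -1:
            outliers.append(index)
        else:
            clusters[int(label)].append(index)
    return dict(clusters), outliers


# Made-up labels standing in for the array returned by get_data_labels().
example_labels = [0, 1, -1, 0, 1, -1]
clusters, outliers = group_by_label(example_labels)
```

The same helper can be applied to the feature labels, since both arrays use the same -1 convention.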
predict(W)[source]

Predict cluster labels of new data.

Parameters: W (np.array) – binary data matrix
Returns: predicted cluster labels
Return type: np.array
transform(W)[source]

Predict the cluster assignment matrix of new data.

Parameters: W (np.array) – binary data matrix
Returns: predicted cluster assignment matrix
Return type: np.array

generalBMD

class bmdcluster.generalBMD(n_clusters, f_clusters=None, B_ident=True, max_iter=100, use_bootstrap=False, b=None, init_ratio=1.0, seed=None)[source]
__init__(n_clusters, f_clusters=None, B_ident=True, max_iter=100, use_bootstrap=False, b=None, init_ratio=1.0, seed=None)[source]

Run the general form of the BMD algorithm.

Parameters:
  • n_clusters (int) – number of data clusters
  • f_clusters (int, optional) – number of feature clusters, by default None
  • B_ident (bool, optional) – initialize feature cluster assignment matrix to the identity, by default True
  • max_iter (int, optional) – maximum number of optimization iterations, by default 100
  • use_bootstrap (bool, optional) – use bootstrap cluster initialization, by default False
  • b (int, optional) – number of bootstrapped samples to use, by default None
  • init_ratio (float, optional) – fraction of points to randomly initialize, by default 1.0
  • seed (int, optional) – random initialization seed, by default None
Raises:
  • ValueError – If use_bootstrap is set to True but b is not specified
  • ValueError – If neither B_ident nor f_clusters is specified
  • ValueError – If B_ident=True and f_clusters is also set

Caution

B_ident=True and f_clusters are mutually exclusive options; setting both results in an error. Because B_ident defaults to True, pass B_ident=False when specifying f_clusters.
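
The three error conditions listed under Raises can be sketched as a standalone check. This is not the library's code: validate_general_args and is_valid are hypothetical helpers, and reading "neither B_ident nor f_clusters is specified" as B_ident=False with f_clusters=None is an assumption.

```python
def validate_general_args(f_clusters=None, B_ident=True, use_bootstrap=False, b=None):
    """Mirror the documented ValueError conditions (hypothetical, not bmdcluster code)."""
    if use_bootstrap and b is None:
        raise ValueError("use_bootstrap=True requires b to be specified")
    # Assumption: "not specified" means B_ident=False together with f_clusters=None.
    if not B_ident and f_clusters is None:
        raise ValueError("specify f_clusters or set B_ident=True")
    if B_ident and f_clusters is not None:
        raise ValueError("B_ident=True and f_clusters are mutually exclusive")


def is_valid(**kwargs):
    """Return True if the argument combination passes the checks above."""
    try:
        validate_general_args(**kwargs)
    except ValueError:
        return False
    return True


defaults_ok = is_valid()                                      # B_ident=True, no f_clusters
clash = is_valid(f_clusters=3)                                # B_ident still defaults to True
explicit_ok = is_valid(f_clusters=3, B_ident=False)
missing_b = is_valid(use_bootstrap=True)
```

Under these assumptions, the second call is rejected because B_ident is left at its default of True while f_clusters is set.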

fit(W, verbose=False)[source]

Fit the model.

Parameters:
  • W (np.array) – binary data matrix
  • verbose (bool, optional) – print progress during optimization, by default False
fit_predict(W, verbose=False)[source]

Fit the model and return final value of objective function and cluster assignment labels for the data and features.

Parameters:
  • W (np.array) – binary data matrix
  • verbose (bool, optional) – print progress during optimization, by default False
Returns:

  • float – final value of objective function
  • np.array – data cluster labels
  • np.array – feature cluster labels

fit_transform(W, verbose)[source]

Fit the model and return final value of objective function and final values of the data and feature cluster assignment matrices A and B, whose entries are cluster affinity scores.

Parameters:
  • W (np.array) – binary data matrix
  • verbose (bool, optional) – print progress during optimization, by default False
Returns:

  • float – final value of the objective function
  • np.array – final value of data cluster assignment matrix A
  • np.array – final value of feature cluster assignment matrix B
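
A minimal usage sketch of generalBMD.fit_transform, based only on the signatures documented above. The toy matrix and the values n_clusters=2, f_clusters=2 and seed=0 are illustrative assumptions, and the import is guarded so the snippet still runs where bmdcluster is not installed.

```python
import importlib.util

# Toy binary matrix: 6 data points (rows) by 4 features (columns).
W_rows = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

result = None
if importlib.util.find_spec("bmdcluster") is not None:
    import numpy as np
    from bmdcluster import generalBMD

    # B_ident defaults to True, so it is switched off explicitly when f_clusters is given.
    model = generalBMD(n_clusters=2, f_clusters=2, B_ident=False, seed=0)
    # verbose is passed explicitly because this signature lists it without a default.
    cost, A, B = model.fit_transform(np.array(W_rows), verbose=False)
    result = (cost, A, B)
```

No expected output is shown because the affinity scores in A and B depend on the random initialization.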

get_data_labels()[source]

Get data cluster labels after .fit(). Outliers will be labeled -1.

Returns: data cluster labels
Return type: np.array
get_feature_labels()[source]

Get feature cluster labels after .fit(). Outliers will be labeled -1.

Returns: feature cluster labels
Return type: np.array
predict(W)[source]

Predict cluster labels of new data.

Parameters: W (np.array) – binary data matrix
Returns: predicted cluster labels
Return type: np.array
transform(W)[source]

Predict the cluster assignment matrix of new data.

Parameters: W (np.array) – binary data matrix
Returns: predicted cluster assignment matrix
Return type: np.array