calour.analysis.correlation

calour.analysis.correlation(exp: calour.experiment.Experiment, field, method='spearman', nonzero=False, transform=None, numperm=1000, alpha=0.1, fdr_method='dsfdr', random_seed=None)

Find features correlated with a numeric metadata field.

P-values are computed by permutation and corrected for multiple hypothesis testing.

Note

This function is also available as a class method Experiment.correlation()
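As a rough illustration of the method form, the sketch below wraps a call to Experiment.correlation() using only keyword arguments that appear in the signature above. The helper name find_correlated, the seed value, and any field name passed to it are hypothetical; exp is assumed to be an Experiment that has already been loaded elsewhere.

```python
def find_correlated(exp, field, seed=2018):
    """Call the method form of correlation() on an already-loaded Experiment.

    `exp` is assumed to be a calour Experiment; `field` is the name of a
    numeric sample metadata column. The helper name and the seed value are
    illustrative only.
    """
    return exp.correlation(
        field,
        method='spearman',   # or 'pearson', or a custom function
        numperm=1000,        # number of permutations
        alpha=0.1,           # desired FDR control level
        random_seed=seed,    # fixed seed so repeated runs shuffle identically
    )
```

Passing a fixed random_seed is a design choice for reproducibility: the p-values come from random permutations, so without a seed the set of reported features may vary slightly between runs.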

Parameters:
  • exp (Experiment) – Input experiment object.
  • field (str) – The field to test by. Values are converted to numeric.
  • method (str or function) – the method used to calculate the statistic. Options:
      ‘spearman’ : Spearman correlation (numeric)
      ‘pearson’ : Pearson correlation (numeric)
      function : use this function to calculate the statistic (input is data, labels; output is an array of float)
  • nonzero (bool, optional) – True to calculate the correlation only over samples where the feature is present (>0). False (default) to calculate the correlation over all samples.
      Note: setting nonzero to True slows down the calculation.
      Note: nonzero can be set to True only with ‘spearman’ or ‘pearson’, not with a custom function.
  • transform (str or None) – transformation to apply to the data before calculating the statistic. Options:
      ‘rankdata’ : rank transform the reads of each OTU
      ‘log2data’ : calculate log2 for each OTU, using a minimal cutoff of 2
      ‘normdata’ : normalize the data to a constant sum per sample
      ‘binarydata’ : convert to binary absence/presence
  • alpha (float) – the desired FDR control level
  • numperm (int) – number of permutations to perform
  • fdr_method (str) – method used to compute the FDR (default ‘dsfdr’). Allowed methods include “”, “”
  • random_seed (int or None, optional) – integer used to set the numpy random seed before running the random permutation test. None to leave the numpy random seed unset.
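To make the interplay of method, nonzero, numperm and random_seed more concrete, here is a minimal pure-Python sketch of a permutation test for a single feature. It is not calour's implementation: the function names, the two-sided comparison, the (hits + 1) / (numperm + 1) p-value estimate, and the use of Python's random module instead of the numpy random seed are all assumptions made for illustration.

```python
import random


def rank(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when it is undefined."""
    n = len(x)
    if n < 2:
        return 0.0
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def permutation_pvalue(feature, labels, numperm=1000, nonzero=False,
                       random_seed=None):
    """Return (observed correlation, permutation p-value) for one feature."""
    rng = random.Random(random_seed)
    if nonzero:
        # Keep only samples where the feature is present (>0).
        kept = [(f, l) for f, l in zip(feature, labels) if f > 0]
        feature = [f for f, _ in kept]
        labels = [l for _, l in kept]
    observed = spearman(feature, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(numperm):
        rng.shuffle(shuffled)  # break any real feature/label association
        if abs(spearman(feature, shuffled)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (numperm + 1)
```

A call such as permutation_pvalue(abundances, ages, numperm=1000, random_seed=1) returns the observed coefficient together with the fraction of shuffles that matched or exceeded it. Because the smallest p-value this estimate can produce is 1 / (numperm + 1), numperm limits how small a p-value can be resolved.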
Returns:

The experiment with only correlated features, sorted according to correlation coefficient

Return type:

Experiment
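The shape of the return value (only significant features, ordered by coefficient) can be sketched as follows. This example substitutes the Benjamini-Hochberg procedure for the default ‘dsfdr’ method, operates on a plain dict rather than an Experiment, and sorts in ascending order; all three are assumptions for illustration, as are the function names.

```python
def benjamini_hochberg(pvalues, alpha=0.1):
    """Return a list of booleans marking the hypotheses rejected at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank k with p_(k) <= alpha * k / m.
    for rank_pos, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank_pos / m:
            cutoff_rank = rank_pos
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def keep_correlated(results, alpha=0.1):
    """Filter and sort features.

    `results` maps a feature name to a (coefficient, p-value) pair.
    Returns the names that pass FDR control, sorted by coefficient
    (ascending order is an assumption of this sketch).
    """
    names = list(results)
    pvals = [results[name][1] for name in names]
    rejected = benjamini_hochberg(pvals, alpha)
    kept = [name for name, is_rejected in zip(names, rejected) if is_rejected]
    return sorted(kept, key=lambda name: results[name][0])
```

Sorting by the signed coefficient places negatively correlated features at one end and positively correlated features at the other, which is one plausible reading of "sorted according to correlation coefficient".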