calour.experiment.Experiment.diff_abundance

Experiment.diff_abundance(field, val1, val2=None, method='meandiff', transform='rankdata', numperm=1000, alpha=0.1, fdr_method='dsfdr', random_seed=None)

Differential abundance test between 2 groups of samples for all the features.

It uses a permutation-based nonparametric test and then applies a multiple-hypothesis correction. For each feature, a chosen statistic is computed on the real group labels and compared to the distribution of the same statistic computed over many random permutations of the labels.
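The permutation idea can be sketched with NumPy alone. This is a minimal illustration, not calour's implementation: the function name `permutation_pvals`, the (features x samples) data layout, the 0/1 label coding and the two-sided comparison are all assumptions made for the example.

```python
import numpy as np

def permutation_pvals(data, labels, numperm=1000, seed=None):
    """Two-sided permutation p-values for the per-feature mean difference.

    data   : array of shape (n_features, n_samples) (layout is an assumption)
    labels : array of 0/1 group labels, one per sample (coding is an assumption)
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)

    def meandiff(lab):
        # mean of group 0 minus mean of group 1, one value per feature
        return data[:, lab == 0].mean(axis=1) - data[:, lab == 1].mean(axis=1)

    observed = np.abs(meandiff(labels))
    # count permutations whose statistic is at least as extreme as the observed one
    exceed = np.zeros(data.shape[0])
    for _ in range(numperm):
        exceed += np.abs(meandiff(rng.permutation(labels))) >= observed
    # add 1 so the observed labeling counts as one of the permutations
    return (exceed + 1) / (numperm + 1)

# toy data: 3 features, 20 samples, feature 0 shifted in the first group
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 20))
data[0, :10] += 5
labels = np.array([0] * 10 + [1] * 10)
pvals = permutation_pvals(data, labels, numperm=999, seed=1)
```

The resulting per-feature p-values are what a multiple-hypothesis correction would then be applied to.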

Parameters:
  • field (str) – The field from sample metadata to group samples.
  • val1 (str or list of str) – The values in the field column for the first group.
  • val2 (str or list of str or None (optional)) – The values in the field column to select the second group. None (default) to compare to all other samples (excluding val1).
  • method (str or function) –

    The method used to compute the statistic. Options:

    • 'meandiff' : mean(A)-mean(B)
    • 'stdmeandiff' : (mean(A)-mean(B))/(std(A)+std(B))
    • callable : use this function to calculate the statistic (input is data, labels; output is an array of float)
  • transform (str or None) –

    The transformation to apply to the data before calculating the statistic.

    • 'rankdata' : rank transform the reads of each OTU
    • 'log2data' : calculate log2 for each OTU using a minimal cutoff of 2
    • 'normdata' : normalize the data to a constant sum per sample
    • 'binarydata' : convert to binary absence/presence
  • alpha (float (optional)) – the desired FDR control level
  • numperm (int (optional)) – number of permutations to perform
  • fdr_method (str (optional)) –

    The method used to control the False Discovery Rate. Options are:

    • 'dsfdr' : the discrete FDR control method
    • 'bhfdr' : Benjamini-Hochberg FDR method
    • 'byfdr' : Benjamini-Yekutieli FDR method
    • 'filterBH' : Benjamini-Hochberg FDR method following removal of all features whose minimal possible p-value is greater than alpha (e.g. a feature that appears in only 1 sample can obtain a minimal p-value of 0.5 and will therefore be removed when, say, alpha=0.1)

  • random_seed (int or None (optional)) – If an int, the numpy random seed is set to this number before running the random permutation test. If None, the numpy random seed is not set.
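As an illustration of the callable option for method, here is a hedged sketch of a custom statistic. The name `mediandiff` is hypothetical, and the sketch assumes that data arrives as a (features x samples) array and that labels is a 0/1 array with 1 marking the first group; neither detail is stated above, so both should be checked against the library before use.

```python
import numpy as np

def mediandiff(data, labels):
    """Per-feature difference of group medians.

    Assumes data has shape (n_features, n_samples) and labels is a 0/1
    array with one entry per sample (both are assumptions of this sketch).
    """
    labels = np.asarray(labels)
    med1 = np.median(data[:, labels == 1], axis=1)
    med0 = np.median(data[:, labels == 0], axis=1)
    return med1 - med0  # array of float, one value per feature

# toy check: feature 0 differs between the groups, feature 1 does not
data = np.array([[1.0, 2.0, 3.0, 10.0, 11.0, 12.0],
                 [5.0, 5.0, 5.0, 5.0, 5.0, 5.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
stat = mediandiff(data, labels)

# Hypothetical call on an existing Experiment `exp` with a sample
# metadata field named 'group' (both names are placeholders):
# res = exp.diff_abundance('group', 'A', 'B', method=mediandiff)
```

A median-based statistic is less sensitive to a few extreme samples than 'meandiff', which can matter when transform=None is used instead of the default rank transform.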
Returns:

A new experiment containing only the features with a significant difference (FDR <= alpha), sorted according to the effect size.

The new experiment contains additional feature_metadata_fields that include:

  • '_calour_diff_abundance_pval' : the p-value for the feature,
  • '_calour_diff_abundance_effect' : the effect size (t-statistic),
  • '_calour_diff_abundance_group' : the value (in field) where the statistic is higher

Return type:

Experiment
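For readers unfamiliar with the procedure named by the 'bhfdr' option, the standard Benjamini-Hochberg step-up rule can be sketched as follows. This is a generic textbook version written for illustration; the helper name `bh_reject` is hypothetical and the code is not taken from calour.

```python
import numpy as np

def bh_reject(pvals, alpha=0.1):
    """Boolean mask of hypotheses rejected by the Benjamini-Hochberg procedure."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    # the i-th smallest p-value is compared to alpha * i / m
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = pvals[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting its threshold
        reject[order[:k + 1]] = True       # reject everything up to that rank
    return reject

mask = bh_reject([0.001, 0.04, 0.03, 0.5], alpha=0.1)
```

In this toy input the three smallest p-values fall under their thresholds (0.025, 0.05, 0.075), so only the feature with p = 0.5 is kept as non-significant.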