calour.experiment.Experiment

class calour.experiment.Experiment(data, sample_metadata, feature_metadata=None, exp_metadata={}, description='', sparse=True)

Bases: object

This class contains the data for an experiment or a meta-experiment.

The data set includes a data table (OTU table, gene table, metabolomic table, or all of those tables combined), a sample metadata table, and a feature metadata table.
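The relationship between the three tables can be sketched in plain Python. This is not calour's implementation: `ToyExperiment` and its fields are hypothetical names used only for illustration, and the layout (rows are samples, columns are features) is an assumption taken from the `to_pandas` and `exp[k]` entries listed under Methods.

```python
from dataclasses import dataclass


@dataclass
class ToyExperiment:
    """Hypothetical stand-in for illustration; NOT calour's Experiment."""

    data: list              # 2D list: one row per sample, one column per feature
    sample_metadata: dict   # sample id -> {field: value}, in row order
    feature_metadata: dict  # feature id -> {field: value}, in column order

    def __post_init__(self):
        # Map each id to its row/column position in the data table.
        self._sample_pos = {sid: i for i, sid in enumerate(self.sample_metadata)}
        self._feature_pos = {fid: j for j, fid in enumerate(self.feature_metadata)}
        # The metadata tables must line up with the data table.
        if len(self.data) != len(self._sample_pos):
            raise ValueError("one data row is required per sample")
        if any(len(row) != len(self._feature_pos) for row in self.data):
            raise ValueError("one data column is required per feature")

    @property
    def shape(self):
        # (number of samples, number of features)
        return (len(self._sample_pos), len(self._feature_pos))

    def __getitem__(self, key):
        # Look up one abundance value by (sample id, feature id).
        sample_id, feature_id = key
        return self.data[self._sample_pos[sample_id]][self._feature_pos[feature_id]]


toy = ToyExperiment(
    data=[[10, 0], [3, 7], [0, 5]],
    sample_metadata={"S1": {"group": "a"}, "S2": {"group": "a"}, "S3": {"group": "b"}},
    feature_metadata={"F1": {"taxonomy": "unknown"}, "F2": {"taxonomy": "unknown"}},
)
abundance = toy["S2", "F2"]
```

The point of the sketch is only that sample and feature identifiers index the rows and columns of a single data table, so metadata and abundances stay aligned when either axis is filtered or reordered.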

Parameters:
Variables:

See also

AmpliconExperiment

Methods

copy.deepcopy(exp) Implement deepcopy, since pandas has a problem deep-copying an empty DataFrame.
exp1 == exp2 Check equality.
exp[k] Get the abundance at (sampleid, featureid)
exp1 != exp2 Return self!=value.
repr(exp) Return a string representation of this object.
add_sample_metadata_as_features(fields[, …]) Add covariates from sample metadata to the data table as features for machine learning.
add_terms_to_features(dbname[, …]) Add a field to the feature metadata, with most common term for each feature
aggregate_by_metadata(field[, agg, axis, …]) Aggregate all samples/features that have the same value in the given field.
binarize([threshold, inplace]) Binarize the data with a threshold.
center_log_ratio([method, centralize, inplace]) Performs a clr transform to normalize each sample.
classify(fields, estimator[, cv, predict, …]) Evaluate classification during cross validation.
cluster_data([transform, axis, metric, inplace]) Cluster the samples/features.
cluster_features([min_abundance, inplace]) Cluster features.
copy() Copy the object (deeply).
correlation(field[, method, nonzero, …]) Find features with correlation to a numeric metadata field
diff_abundance(field, val1[, val2, method, …]) Differential abundance test between 2 groups of samples for all the features.
diff_abundance_kw(field[, transform, …]) Test the differential expression between multiple sample groups using the Kruskal-Wallis test.
downsample(field[, axis, num_keep, inplace, …]) Downsample the data set.
enrichment(features, dbname, *args, **kwargs) Get the list of enriched annotation terms in features compared to all other features in exp.
export_html([sample_field, feature_field, …]) Export an interactive html heatmap for the experiment.
filter_abundance([cutoff]) Filter features with sum abundance across all samples less than the cutoff.
filter_by_data(predicate[, axis, field, …]) Filter samples or features by the data matrix.
filter_by_metadata(field, select[, axis, …]) Filter samples or features by metadata.
filter_ids(ids[, axis, negate, inplace]) Filter samples or features based on a list of IDs.
filter_mean_abundance([cutoff, field]) Filter features, keeping those with a mean of at least cutoff of the mean total abundance per sample.
filter_prevalence(fraction[, cutoff, field]) Filter features, keeping only those present in more than a certain fraction of all samples.
filter_sample_categories(field[, …]) Filter sample categories that have too few samples.
filter_samples(field, values[, negate, inplace]) Shortcut for filtering samples.
filter_samples_([cutoff, inplace])
from_pandas(df[, exp]) Convert a Pandas DataFrame into an experiment.
get_data([sparse, copy]) Get the data as a 2d array
heatmap([sample_field, feature_field, …]) Plot a heatmap for the experiment.
join_experiments(other[, field_name, prefixes]) Combine two Experiment objects into one.
join_experiments_featurewise(other[, …]) Combine two Experiment objects into one.
join_metadata_fields(field1, field2[, …]) Join two sample/feature metadata fields into a single new field
log_n([n, inplace]) Log transform the data
normalize([total, axis, inplace]) Normalize each sample (axis=0) or feature (axis=1) so that it sums to total.
normalize_by_subset_features(features[, …]) Normalize each sample by its total sum, computed without a given list of features.
normalize_compositional([min_frac, total, …]) Normalize each sample, ignoring the features with mean>=min_frac across the whole experiment.
plot([title, barx_fields, barx_width, …]) Plot the interactive heatmap and its associated axes.
plot_abund_prevalence(field[, log, …]) Plot abundance against prevalence.
plot_core_features([field, steps, cutoff, …]) Plot the percentage of core features shared across an increasing number of samples.
plot_diff_abundance_enrichment([max_show, …]) Plot the term enrichment of differentially abundant bacteria
plot_enrichment(enriched[, max_show, …]) Plot a horizontal bar plot for enriched terms
plot_feature_matrix(fields, feature_ids[, …]) Plot an array of scatter plots of each feature against the specified sample metadata.
plot_hist([ax]) Plot histogram of all the values in data.
plot_stacked_bar([field, sample_color_bars, …]) Plot the stacked bar for feature abundances.
random_permute_data([normalize]) Independently shuffle the reads of each feature.
regress(field, estimator[, cv, params]) Evaluate regression during cross validation.
reorder(new_order[, axis, inplace]) Reorder according to indices in the new order.
rescale([total, axis, inplace]) Rescale the data so that the mean sum of all samples (axis=0) or features (axis=1) equals total.
save(prefix[, fmt]) Save the experiment data to disk.
save_biom(f[, fmt, add_metadata]) Save experiment to biom format
save_fasta(f[, seqs]) Save a list of sequences to fasta.
save_metadata(f[, axis]) Save sample/feature metadata to file.
scale([axis, inplace]) Standardize a dataset along an axis
sort_abundance([subgroup]) Sort features based on their abundance in a subset of the samples.
sort_by_data([axis, subset, key, inplace, …]) Sort features based on their mean frequency.
sort_by_metadata(field[, axis, inplace]) Sort samples or features based on metadata values in the field.
sort_centroid([transform, inplace]) Sort the features based on the center of mass
sort_ids(ids[, axis, inplace]) Sort the features or samples by the given ids.
sort_samples(field, **kwargs) Sort samples by field. A convenience function for sort_by_metadata.
split_train_test(test_size[, train_size, …]) Split experiment into train experiment and test experiment.
subsample_count(total[, replace, inplace, …]) Randomly subsample each sample to the same number of counts.
to_pandas([sample_field, feature_field, sparse]) Get a pandas DataFrame of the abundances. Samples are rows, features are columns.
transform([steps, inplace]) Chain transformations together.
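
To make two of the summaries above concrete, here is a plain-Python sketch of what per-sample normalization and prevalence filtering compute. It does not call calour and is not its implementation: the helper names `normalize_samples` and `prevalent_features`, the default values, and the exact comparisons (`>= cutoff` for "present", `> fraction` for "more than a certain fraction") are assumptions made for illustration. Rows are samples and columns are features.

```python
def normalize_samples(data, total=100):
    """Scale each row (sample) so that its values sum to `total`.

    Rows that sum to zero are left as all zeros (an assumption of this sketch).
    """
    normalized = []
    for row in data:
        row_sum = sum(row)
        if row_sum == 0:
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([value * total / row_sum for value in row])
    return normalized


def prevalent_features(data, fraction, cutoff=1):
    """Return column indices of features present in more than `fraction` of samples.

    A feature counts as present in a sample when its value is >= `cutoff`
    (assumed comparison; check the library documentation for the real rule).
    """
    n_samples = len(data)
    n_features = len(data[0]) if data else 0
    keep = []
    for j in range(n_features):
        n_present = sum(1 for row in data if row[j] >= cutoff)
        if n_samples and n_present / n_samples > fraction:
            keep.append(j)
    return keep


# Three samples (rows) by three features (columns).
counts = [
    [10, 0, 90],
    [50, 50, 0],
    [0, 0, 20],
]

normalized = normalize_samples(counts, total=100)
kept = prevalent_features(counts, fraction=0.5, cutoff=1)
```

In this toy table, the third sample has only 20 reads, so normalization scales its single non-zero feature up to the full total, while the prevalence filter drops the middle feature because it appears in only one of the three samples.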

Attributes

shape
sparse