calour.io.read

calour.io.read(data_file, sample_metadata_file=None, feature_metadata_file=None, description='', sparse=True, data_file_type='biom', sample_metadata_kwargs=None, feature_metadata_kwargs=None, cls=<class 'calour.experiment.Experiment'>, table_sample_id_proc=None, table_feature_id_proc=None, data_table_sep=', ', sample_in_row=False, *, normalize)

Read the files for the experiment.

Note

The rows of the sample and feature metadata tables are reordered to align with the biom table.
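As a rough illustration of what this reordering means, here is a sketch in plain Python. It is not calour's actual implementation. The helper `align_metadata` and the sample data are hypothetical names introduced only for this example, and the handling of IDs missing from the metadata is an assumption.

```python
def align_metadata(table_ids, metadata):
    """Return (id, fields) pairs ordered to follow table_ids.

    metadata maps an ID to a dict of fields. IDs that are absent from
    metadata get an empty dict (an assumption of this sketch; calour's
    behaviour may differ).
    """
    return [(id_, metadata.get(id_, {})) for id_ in table_ids]


# Sample order as it appears in the data table.
table_ids = ["S2", "S1", "S3"]
# Metadata as it appears in the mapping file (different order).
metadata = {"S1": {"ph": 7}, "S2": {"ph": 5}, "S3": {"ph": 6}}

aligned = align_metadata(table_ids, metadata)
```

After this step the metadata rows follow the table's sample order, so row `i` of the metadata describes sample `i` of the data.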

Parameters:
  • data_file (str) – file path to the biom table.
  • sample_metadata_file (None or str, optional) – None (default) to just use sample names (no additional metadata). If not None, the file path to the sample metadata (aka the mapping file in QIIME).
  • feature_metadata_file (None or str, optional) – file path to the feature metadata.
  • description (str) – description of the experiment
  • sparse (bool) – read the biom table into sparse or dense array
  • data_file_type (str, optional) – the format of data_file. Options:
      • 'biom': a biom table (biom-format.org) (default)
      • 'tsv': a tab-separated table with samples in columns and features in rows
      • 'openms': an OpenMS bucket table csv (rows are features, columns are samples)
      • 'openms_transpose': an OpenMS bucket table csv (columns are features, rows are samples)
      • 'gnps_ms': an OpenMS bucket table tsv with samples as columns (exported from GNPS)
      • 'qiime2': a qiime2 biom table artifact (requires qiime2 to be installed)
  • sample_metadata_kwargs, feature_metadata_kwargs (dict or None, optional) – keyword arguments passed to pandas.read_table() when reading the sample metadata or feature metadata. For example, you can set sample_metadata_kwargs={'dtype': {'ph': int}, 'encoding': 'latin-8'} to read the ph column in the sample metadata as int and parse the file as latin-8 instead of utf-8. By default, the first column in each metadata file is assumed to hold the sample/feature IDs and is read in as the row index. To avoid this, provide {'index_col': False}.
  • cls (class, optional) – what class object to read the data into (Experiment by default)
  • table_sample_id_proc, table_feature_id_proc (None or callable, optional) – if not None, modify each sample/feature ID in the table using the callable. The callable accepts a list of str and returns a list of str (the sample/feature IDs after processing). This is useful in metabolomics experiments, where the sample IDs in the data table contain additional information compared to the mapping file (using a '_' separator), which must be removed to sync the sample IDs between the table and the mapping file.
  • sample_in_row (bool, optional) – False if the data table columns are samples, True if the rows are samples
  • normalize (int or None) – normalize each sample to the specified read count. None to not normalize
Returns:

the new object created

Return type:

Experiment
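
The ID-processing callables described above can be sketched as follows. This is a minimal example under stated assumptions: `strip_suffix` and `load_experiment` are hypothetical helpers written for illustration, the file paths are supplied by the caller, and the choice of data_file_type='openms' is arbitrary. Per the signature above, normalize is keyword-only and has no default, so it is passed explicitly.

```python
def strip_suffix(ids):
    """Keep only the part of each ID before the first '_'.

    Matches the documented contract: takes a list of str and
    returns a list of str.
    """
    return [id_.split("_", 1)[0] for id_ in ids]


def load_experiment(table_path, metadata_path):
    """Hypothetical wrapper showing how the callable might be passed."""
    # Imported lazily so strip_suffix can be tried without calour installed.
    import calour

    return calour.io.read(
        table_path,
        sample_metadata_file=metadata_path,
        data_file_type="openms",
        table_sample_id_proc=strip_suffix,
        normalize=None,  # keyword-only; None means no normalization
    )


# Quick check of the callable on its own.
cleaned = strip_suffix(["S1_run1", "S2_run2", "S3"])
```

IDs without a '_' separator pass through unchanged, so the same callable can be used on tables that mix both styles.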