File Formats

Emperor uses two main files as inputs, a mapping file and a coordinates file, here we will describe their structure.

Mapping File

A mapping file refers to the additional information collected about your samples, some common examples are body site, collection time, pH, latitude, longitude, etc. This information is used by Emperor to color and modify visual aspects of your plot in a systematic way.

Mapping files should conform to the format as described by QIIME. Note that the only required column is SampleID, if this column is not present the file will not be parsed correctly and the script will bring up an error.

Coordinates File

Coordinates files describe the position in which each of the samples should be placed in space, as of version 0.9.4-dev there are two different file formats that are accepted by Emperor:

Classic Format

This format is tab-delimited and must include the following information ( versions of QIIME <= 1.8.0 generate this file when you execute

  • The first line is the header for each of the dimensions.
  • For each of the samples in your dataset you’ll find a line with a sample identifier and its value at each dimension. If you have 7 samples you should expect to see 7 lines with coordinates spanning through 7 dimensions.
  • Two empty lines.
  • A line with an eigenvalue for each dimension.
  • A line with the percentages of variation explained at each dimension.

See the example below that presents a dataset where 5 samples were included.

pc vector number    1       2       3       4       5
PC.636      0.333514620475  0.081687        0.25081 0.103746        2.50106743521e-09
PC.635      0.258145479886  -0.22437        -0.25446        -0.0290044      2.50106743521e-09
PC.356      -0.270663791669 -0.23983        0.18965 -0.118931       2.50106743521e-09
PC.481      -0.0437436047686        0.30751 -0.077078       -0.20627        2.50106743521e-09
PC.354      -0.277252703922 0.074945        -0.10898        0.245681        2.50106743521e-09

eigvals     0.329912543767  0.2147172       0.1815366       0.1261003       -3.127915774e-17
% variation explained       38.68853        25.18276        21.2717 14.85772        3.667478e-15

scikit-bio Ordination Results

Emperor is also capable of parsing files generated by scikit-bio’s OrdinationResults class.