emperor.Emperor

class emperor.Emperor(ordination, mapping_file, feature_mapping_file=None, dimensions=5, remote=True, jackknifed=None, procrustes=None, ignore_missing_samples=False)

Display principal coordinates analysis plots

Use this object to interactively display a PCoA plot using the Emperor GUI. IPython provides a rich display system that lets you display a plot inline, without creating a temporary file or writing to disk.

Parameters:

ordination: skbio.OrdinationResults

Object containing the computed values for an ordination method in scikit-bio. Currently supports skbio.stats.ordination.PCoA and skbio.stats.ordination.RDA results.

mapping_file: pd.DataFrame

DataFrame object with the metadata associated with the samples in the ordination object. It should have an index set, and that index should match the identifiers in the ordination object.

feature_mapping_file: pd.DataFrame, optional

DataFrame object with the metadata associated with the features in the ordination object. It should have an index set, and that index should match the identifiers in the ordination.features object.

dimensions: int, optional

Number of dimensions to keep from the ordination data, defaults to 5. Be aware that this value will determine the number of dimensions for all computations.

remote: bool or str, optional

This parameter can have one of the following three behaviors according to the value: (1) str - load the resources from a user-specified remote location, (2) False - load the resources from the nbextensions folder in the Jupyter installation or (3) True - load the resources from the GitHub repository. This parameter defaults to True. See the Notes section for more information.

jackknifed: list of OrdinationResults, optional

A list of the OrdinationResults objects with the same sample identifiers as the identifiers in ordination.

procrustes: list of OrdinationResults, optional

A list of the OrdinationResults objects with the same sample identifiers as the identifiers in ordination.

ignore_missing_samples: bool, optional

If set to True, samples and features without metadata are included by setting all of their metadata values to "This element has no metadata". By default an exception is raised if missing elements are encountered. Note that this flag only takes effect if there is at least one overlapping element.

Raises:

ValueError

If the remote argument is not of bool or str type. If none of the samples in the ordination matrix are in the metadata. If the data is one-dimensional.

KeyError

If there are samples in the ordination matrix that are not in the metadata.
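
The sample-matching rules above, combined with ignore_missing_samples, can be pictured with plain Python containers. The sketch below is illustrative only: check_samples, its arguments, and the dictionary-based metadata are hypothetical stand-ins rather than Emperor's own code, and the real implementation may differ.

```python
PLACEHOLDER = 'This element has no metadata'


def check_samples(ordination_ids, metadata, ignore_missing_samples=False):
    """Hypothetical helper mirroring the documented checks (not Emperor's code).

    metadata maps a sample id to a dict of column name -> value.
    Returns a new mapping that covers every id in ordination_ids.
    """
    ordination_ids = list(ordination_ids)
    missing = [i for i in ordination_ids if i not in metadata]

    # No overlap at all is an error regardless of the flag.
    if len(missing) == len(ordination_ids):
        raise ValueError('None of the ordination samples are in the metadata')

    # Partial overlap is an error unless the caller opts in.
    if missing and not ignore_missing_samples:
        raise KeyError('Samples without metadata: ' + ', '.join(missing))

    columns = list(next(iter(metadata.values())))
    filled = {i: dict(metadata[i]) for i in ordination_ids if i in metadata}
    for i in missing:
        # Every column of a missing sample gets the placeholder text.
        filled[i] = {column: PLACEHOLDER for column in columns}
    return filled


metadata = {'PC.354': {'Treatment': 'Control'},
            'PC.607': {'Treatment': 'Fast'}}
result = check_samples(['PC.354', 'PC.607', 'PC.999'], metadata,
                       ignore_missing_samples=True)
print(result['PC.999'])
```

Here 'PC.999' is a made-up identifier with no metadata. With the flag left at its default, the same call would raise KeyError instead.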

Notes

This object does not currently support the full range of actions that the GUI supports and should be considered experimental at the moment.

The remote parameter covers different use cases. Use the first option, “(1) - URL”, when you want to load the resources from a location other than the GitHub repository or your Jupyter notebook resources, i.e. a custom URL. The second option, “(2) - False”, loads resources from your local Jupyter installation. Note that you need to execute nbinstall at least once or the application will error. This option is ideal for developers modifying the JavaScript source code and for environments with a limited internet connection. Finally, the third option, “(3) - True”, should be used if you intend to embed an Emperor plot in a notebook and then publish it using http://nbviewer.jupyter.org.
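
One way to picture the three behaviors of remote is as a small dispatch on the argument's type. This is a minimal sketch, not Emperor's implementation: describe_remote, the returned labels, and the example URL are all hypothetical.

```python
def describe_remote(remote):
    """Hypothetical helper: classify the `remote` argument (not Emperor's code)."""
    if isinstance(remote, bool):
        # (3) True  -> resources come from the GitHub repository
        # (2) False -> resources come from the local nbextensions folder
        return 'github' if remote else 'nbextensions'
    if isinstance(remote, str):
        # (1) str -> resources come from a user-specified location
        return 'custom:' + remote
    # Mirrors the documented ValueError for any other type.
    raise ValueError('remote must be of bool or str type')


print(describe_remote(True))
print(describe_remote(False))
print(describe_remote('https://example.org/emperor-resources'))
```

The bool check runs first so that True and False are always treated as the built-in modes, and anything that is neither a bool nor a str is rejected.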

References

[R3] Vazquez-Baeza Y, Pirrung M, Gonzalez A, Knight R. EMPeror: a tool for visualizing high-throughput microbial community data. Gigascience. 2013 Nov 26;2(1):16.

Examples

Create an Emperor object and display it from the Jupyter notebook:

>>> import pandas as pd, numpy as np
>>> from emperor import Emperor
>>> from skbio import OrdinationResults

Ordination plots are almost invariably associated with a set of data that relates each sample to its scientific context. We refer to this as the sample metadata and represent it using Pandas DataFrames. For this example we need some metadata, so we start by creating our metadata object:

>>> data = [['PC.354', 'Control', '20061218', 'Control_mouse_I.D._354'],
...         ['PC.355', 'Control', '20061218', 'Control_mouse_I.D._355'],
...         ['PC.356', 'Control', '20061126', 'Control_mouse_I.D._356'],
...         ['PC.481', 'Control', '20070314', 'Control_mouse_I.D._481'],
...         ['PC.593', 'Control', '20071210', 'Control_mouse_I.D._593'],
...         ['PC.607', 'Fast', '20071112', 'Fasting_mouse_I.D._607'],
...         ['PC.634', 'Fast', '20080116', 'Fasting_mouse_I.D._634'],
...         ['PC.635', 'Fast', '20080116', 'Fasting_mouse_I.D._635'],
...         ['PC.636', 'Fast', '20080116', 'Fasting_mouse_I.D._636']]
>>> columns = ['SampleID', 'Treatment', 'DOB', 'Description']
>>> mf = pd.DataFrame(columns=columns, data=data)

Before we can use this mapping file in Emperor, we should set the index to be SampleID.

>>> mf.set_index('SampleID', inplace=True)

Then let’s create some artificial ordination data:

>>> ids = ('PC.636', 'PC.635', 'PC.356', 'PC.481', 'PC.354', 'PC.593',
...        'PC.355', 'PC.607', 'PC.634')
>>> eigvals = np.array([0.47941212, 0.29201496, 0.24744925,
...                     0.20149607, 0.18007613, 0.14780677,
...                     0.13579593, 0.1122597, 0.])
>>> eigvals = pd.Series(data=eigvals, index=ids)
>>> n = eigvals.shape[0]
>>> samples = np.random.randn(n, n)
>>> samples = pd.DataFrame(data=samples, index=ids)
>>> p_explained = np.array([0.26688705, 0.1625637, 0.13775413, 0.11217216,
...                         0.10024775, 0.08228351, 0.07559712, 0.06249458,
...                         0.])
>>> p_explained = pd.Series(data=p_explained, index=ids)
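
The p_explained values above are consistent with dividing each eigenvalue by the sum of all eigenvalues, which is the usual definition of proportion explained in PCoA. The following standard-library sketch checks that relationship for the numbers in this example. It is an observation about these numbers, not a statement of how scikit-bio or Emperor compute the values internally.

```python
eigvals = [0.47941212, 0.29201496, 0.24744925,
           0.20149607, 0.18007613, 0.14780677,
           0.13579593, 0.1122597, 0.]
p_explained = [0.26688705, 0.1625637, 0.13775413, 0.11217216,
               0.10024775, 0.08228351, 0.07559712, 0.06249458,
               0.]

# Proportion explained: each eigenvalue over the sum of all eigenvalues.
total = sum(eigvals)
computed = [value / total for value in eigvals]

# Compare against the listed values, allowing for rounding in the docs.
matches = all(abs(c - p) < 1e-6 for c, p in zip(computed, p_explained))
print(matches)
```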

And encapsulate it inside an OrdinationResults object:

>>> ores = OrdinationResults(eigvals, samples=samples,
...                          proportion_explained=p_explained)

Finally, create the Emperor object and display it using Jupyter. Note that this call has no effect in a regular Python session:

>>> Emperor(ores, mf)

Attributes

jackknifed: list List of OrdinationResults objects in the same sample-order as self.ordination.
procrustes: list List of OrdinationResults objects in the same sample-order as self.ordination.
procrustes_names: list A list of names that will be used to distinguish samples from each ordination in a procrustes plot. The GUI will display a category labeled __Procrustes_Names__.
width: str Width of the plot when displayed in the Jupyter notebook (in CSS units).
height: str Height of the plot when displayed in the Jupyter notebook (in CSS units).
settings: dict A dictionary of settings that is loaded when a plot is displayed. Settings generated from the graphical user interface are stored as JSON files that can be loaded and assigned directly to this attribute. Alternatively, each aspect of the plot can be changed with dedicated methods, for example color_by, set_background_color, etc. This attribute can also be serialized as a JSON string and loaded from the GUI.
feature_mf: pd.DataFrame DataFrame object with the metadata associated with the features in the ordination object. It should have an index set, and that index should match the identifiers in the ordination.features property.
custom_axes: list of str, optional Custom axes to embed in the ordination.
jackknifing_method: {"IQR", "sdev"}, optional Used only when plotting ellipsoids for jackknifed beta diversity (i.e. using a directory of coord files instead of a single coord file). Valid values are "IQR" (for inter-quartile ranges) and "sdev" (for standard deviation). This argument is ignored if self.jackknifed is None or an empty list.
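
To give a feel for the difference between the two jackknifing_method values, the sketch below computes both spread measures for one sample's coordinate along a single axis across several jackknifed replicates. The function spread and the replicate values are hypothetical, and the quartile convention used by statistics.quantiles may not match the one Emperor uses, so treat this as an illustration of the two statistics rather than of Emperor's ellipsoid computation.

```python
import statistics


def spread(values, method='IQR'):
    """Hypothetical helper: spread of one coordinate across replicates."""
    if method == 'IQR':
        # quantiles(n=4) returns the three quartile cut points.
        q1, _, q3 = statistics.quantiles(values, n=4)
        return q3 - q1
    if method == 'sdev':
        # Sample standard deviation across the replicates.
        return statistics.stdev(values)
    raise ValueError("method must be 'IQR' or 'sdev'")


# Made-up positions of one sample on one axis in five jackknifed replicates.
replicates = [0.10, 0.12, 0.11, 0.15, 0.09]
iqr = spread(replicates, 'IQR')
sdev = spread(replicates, 'sdev')
print(iqr, sdev)
```

The inter-quartile range ignores the extremes of the replicate distribution, so it tends to be less sensitive to a single outlying replicate than the standard deviation.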

Methods

animations_by(gradient, trajectory, colors) Set the animation settings for the plot elements
color_by(category[, colors, colormap, ...]) Set the coloring settings for the plot elements
copy_support_files([target]) Copies the support files to a target directory
make_emperor([standalone]) Build an emperor plot
opacity_by(category[, opacities, ...]) Set the opacity settings for the plot elements
scale_by(category[, scales, global_scale, ...]) Set the scaling settings for the plot elements
set_axes([visible, invert, color, view_type]) Change visual aspects about visible dimensions in a plot
set_background_color([color]) Changes the background color of the plot
shape_by(category[, shapes]) Set the shape settings for the plot elements
visibility_by(category[, visibilities, negate]) Set the visibility settings for the plot elements