This study included 18 patients with cystic fibrosis (CF). The hypothesis was that two main communities are at play in the CF lung: one that thrives at low pH and one that thrives at high pH. To test this, sputum samples were divided among 8 tubes, and each tube was perturbed to a different pH. Here we will calculate balances and test how these balances change with respect to pH, using linear mixed effects models.

First, we'll load the datasets we want to process into QIIME 2.

In [1]:
!qiime tools import \
--input-path otu_table.biom \
--output-path cfstudy.biom.qza \
--type FeatureTable[Frequency]

!qiime tools import \
--input-path cfstudy_taxonomy.txt \
--output-path cfstudy_taxonomy.qza \
--type FeatureData[Taxonomy]


Again, we'll want to filter out low-abundance OTUs, here those with fewer than 500 total counts across all samples. This will not only remove potential confounders, but could also alleviate the issue with zeros.

In [2]:
!qiime feature-table filter-features \
--i-table cfstudy_common.biom.qza \
--o-filtered-table cfstudy_common_filt500.biom.qza \
--p-min-frequency 500

Saved FeatureTable[Frequency] to: cfstudy_common_filt500.biom.qza
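
To make the filtering rule concrete, here is a minimal Python sketch of what a minimum-frequency filter does. The toy table and the `filter_features` helper are hypothetical and only illustrate the idea; they are not the study data or the QIIME 2 implementation.

```python
# Hypothetical toy table: feature -> counts per sample (not the study data).
table = {
    "OTU_a": [300, 250, 10],
    "OTU_b": [5, 0, 2],
    "OTU_c": [0, 499, 0],
}
MIN_FREQUENCY = 500


def filter_features(table, min_frequency):
    """Keep features whose total count across all samples is >= min_frequency."""
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(counts) >= min_frequency
    }


filtered = filter_features(table, MIN_FREQUENCY)
print(sorted(filtered))  # only OTU_a (total 560) survives
```

Rare features such as `OTU_b` contribute mostly zeros, which is why dropping them also eases the zero problem mentioned above.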


Again, we will create the tree using pH. Note that we'll also want to reorder the OTU table for the balance calculations.

In [3]:
!qiime gneiss gradient-clustering \
--i-table cfstudy_common_filt500.biom.qza \
--o-clustering ph_tree.nwk.qza \
--p-weighted

Saved Hierarchy to: ph_tree.nwk.qza
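
The intuition behind a weighted gradient clustering can be sketched as follows: each OTU gets an abundance-weighted mean pH, and OTUs with similar mean pH end up close together in the tree. This is a rough illustration under that assumption, not the gneiss implementation; the toy counts, the pH values and the `weighted_mean_gradient` helper are all hypothetical.

```python
def weighted_mean_gradient(counts, gradient):
    """Abundance-weighted mean of the gradient (here pH) for one feature."""
    total = sum(counts)
    return sum(c * g for c, g in zip(counts, gradient)) / total


# Hypothetical pH of four samples and toy counts of three OTUs in them.
ph = [5.0, 6.0, 7.0, 8.0]
table = {
    "OTU_mid": [0, 50, 50, 0],
    "OTU_high": [0, 0, 20, 80],
    "OTU_low": [90, 10, 0, 0],
}

mean_ph = {otu: weighted_mean_gradient(c, ph) for otu, c in table.items()}

# Ordering OTUs by mean pH places low-pH and high-pH organisms at opposite ends.
order = sorted(mean_ph, key=mean_ph.get)
print(order)
```

A hierarchical clustering on these mean values would then split the OTUs into low-pH and high-pH groups, which is what makes the resulting balances interpretable along the gradient.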


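Before running the linear mixed effects models, we'll want to replace zeros with a pseudocount. Balances are built from log ratios, which are undefined for zero counts, and the pseudocount serves as a rough approximation of the uncertainty in those zeros.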

In [4]:
!qiime composition add-pseudocount \
--i-table cfstudy_common_filt500.biom.qza \
--p-pseudocount 1 \
--o-composition-table cf_composition.qza

Saved FeatureTable[Composition] to: cf_composition.qza
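
As a small illustration of why the pseudocount is needed, the sketch below adds 1 to every count of a hypothetical sample so that log ratios become defined. The `add_pseudocount` helper and the sample values are made up for this example.

```python
import math


def add_pseudocount(counts, pseudocount=1):
    """Add a constant to every count so that no entry is zero."""
    return [c + pseudocount for c in counts]


sample = [0, 3, 6]  # hypothetical counts for three OTUs in one sample

# Without a pseudocount, a log ratio involving the first OTU is undefined:
# math.log(0) raises ValueError.
shifted = add_pseudocount(sample)

# After shifting, every log ratio is finite.
log_ratio = math.log(shifted[1] / shifted[0])
print(shifted, log_ratio)
```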

In [5]:
!qiime gneiss ilr-transform \
--i-table cf_composition.qza \
--i-tree ph_tree.nwk.qza \
--o-balances cf_balances.qza

Saved FeatureTable[Balance] to: cf_balances.qza
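
For intuition, a single balance at an internal node of the tree is a scaled log ratio of the geometric means of the two groups of OTUs below that node. The sketch below computes one such balance by hand; the `balance` helper and the input values are hypothetical, and the real command computes one balance per internal node of `ph_tree.nwk.qza`.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(numerator, denominator):
    """Isometric log-ratio balance between two groups of abundances.

    b = sqrt(r * s / (r + s)) * log(g(numerator) / g(denominator)),
    where r and s are the group sizes and g is the geometric mean.
    """
    r, s = len(numerator), len(denominator)
    scale = math.sqrt(r * s / (r + s))
    return scale * math.log(geometric_mean(numerator) / geometric_mean(denominator))


# Hypothetical abundances (after adding the pseudocount) for one sample.
b = balance([4, 4], [1, 1])
print(b)  # positive: the numerator group dominates in this sample
```

A balance near zero means the two groups are equally abundant on average, while its sign shows which side of the split dominates.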


Now we can run the linear mixed effects models. pH is the only covariate being tested, and each patient is accounted for by passing host_subject_id into groups. The microbial differences between patients are much larger than the pH effects, so we need to correct for this patient-to-patient variation by treating each patient as its own group. This is why the linear mixed effects strategy is chosen.

In [6]:
!qiime gneiss lme-regression \
--p-formula "ph" \
--i-table cf_balances.qza \
--i-tree ph_tree.nwk.qza \
--p-groups host_subject_id \
--o-visualization cf_linear_mixed_effects_model

Saved Visualization to: cf_linear_mixed_effects_model.qzv
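
To see why grouping by patient matters, consider the simplified sketch below. It does not fit a mixed model; it uses plain within-patient centering, a simpler stand-in technique, to show how removing each patient's baseline exposes a pH effect that is small relative to the differences between patients. The data and the helper names are hypothetical.

```python
def center(values):
    """Subtract the mean so that only within-group variation remains."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical balances for two patients measured at the same four pH values.
# Both share a slope of 0.5 per pH unit but have very different baselines.
ph = [5.0, 6.0, 7.0, 8.0]
balances = {
    "patient_A": [0.5 * x + 0.0 for x in ph],
    "patient_B": [0.5 * x + 10.0 for x in ph],
}

# Center pH and balance within each patient, then pool for a least squares slope.
num = 0.0
den = 0.0
for patient, y in balances.items():
    xc, yc = center(ph), center(y)
    num += sum(a * b for a, b in zip(xc, yc))
    den += sum(a * a for a in xc)

slope = num / den
print(slope)
```

Here the gap between the two baselines (10) is far larger than the change across the whole pH range (1.5), yet the slope is recovered once each patient's baseline is removed. A mixed model addresses the same problem by estimating patient-level effects alongside the pH effect.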


These summary results can be viewed with the QIIME 2 visualization framework. Check out view.qiime2.org.

Let's further summarize the results of the linear mixed effects model. We'll plot how one of the top balances changes with respect to pH.

In [7]:
!qiime gneiss balance-taxonomy \
--i-balances cf_balances.qza \
--i-tree ph_tree.nwk.qza \
--i-taxonomy cfstudy_taxonomy.qza \
--p-taxa-level 4 \
--p-balance-name 'y2' \
--o-visualization y2_taxa_summary.qzv

Saved Visualization to: y2_taxa_summary.qzv