In [1]:
import calour as ca
In [2]:
ca.set_log_level(11)
In [3]:
%matplotlib notebook
We will use the chronic fatigue syndrome data from:
Giloteaux, L., Goodrich, J.K., Walters, W.A., Levine, S.M., Ley, R.E. and Hanson, M.R., 2016.
Reduced diversity and altered composition of the gut microbiome in individuals with myalgic encephalomyelitis/chronic fatigue syndrome.
Microbiome, 4(1), p.30.
In [4]:
cfs=ca.read_amplicon('data/chronic-fatigue-syndrome.biom',
'data/chronic-fatigue-syndrome.sample.txt',
normalize=10000,min_reads=1000)
2018-07-26 13:09:44 INFO loaded 87 samples, 2129 features
2018-07-26 13:09:44 WARNING These have metadata but do not have data - dropped: {'ERR1331814'}
2018-07-26 13:09:44 INFO After filtering, 87 remaining
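As a rough illustration of what the min_reads and normalize arguments do conceptually, the sketch below drops samples with too few reads and rescales the remaining ones to a constant total. It is plain Python on made-up counts and is not calour's implementation; the helper name filter_and_normalize is hypothetical.

```python
def filter_and_normalize(counts, min_reads=1000, normalize=10000):
    """Drop low-depth samples, then rescale each remaining sample to a fixed total.

    counts maps sample id -> list of per-feature read counts (toy data).
    """
    result = {}
    for sample, reads in counts.items():
        total = sum(reads)
        if total < min_reads:
            # too few reads to trust this sample
            continue
        result[sample] = [r * normalize / total for r in reads]
    return result


toy_counts = {
    'S1': [500, 1500, 0],   # 2000 reads, kept
    'S2': [10, 20, 30],     # 60 reads, dropped
}
normalized = filter_and_normalize(toy_counts)
print(normalized)
```

After this step every kept sample sums to the same total, so abundances are comparable across samples.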
Remove non-interesting bacteria, cluster the bacteria, and sort the samples by disease status.
In [5]:
cfs=cfs.filter_abundance(10)
2018-07-26 13:09:45 INFO After filtering, 1100 remaining
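The filtering idea can be sketched in a few lines, assuming the cutoff is applied to the abundance of each bacterium summed over all samples (an assumption about the default behaviour, not a statement of calour's code). The table and the helper name filter_sum_abundance are toy stand-ins.

```python
def filter_sum_abundance(table, cutoff):
    """Keep features whose abundance summed over all samples is >= cutoff.

    table maps feature id -> list of per-sample abundances (toy data).
    """
    return {feature: values for feature, values in table.items()
            if sum(values) >= cutoff}


toy_table = {
    'seq_A': [0, 3, 9],    # sum 12, kept
    'seq_B': [1, 0, 2],    # sum 3, dropped
    'seq_C': [10, 0, 0],   # sum 10, kept (boundary case)
}
kept = filter_sum_abundance(toy_table, cutoff=10)
print(sorted(kept))
```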
In [6]:
cfs=cfs.cluster_features()
2018-07-26 13:09:45 INFO After filtering, 1100 remaining
In [7]:
cfs=cfs.sort_samples('Subject')
In the interactive heatmap, clicking on a bacterium shows a list of all database results for the selected bacterium.
We can choose which databases to use with the databases=['dbbact',...] parameter. The possible databases depend on which database modules are installed.
Currently, supported microbiome database interfaces include:
By default, calour uses the dbBact database for microbiome data
In [8]:
cfs.plot(sample_field='Subject',gui='jupyter')
Out[8]:
<calour.heatmap.plotgui_jupyter.PlotGUI_Jupyter at 0x1a10abebe0>
By selecting a set of bacteria (using shift+click or ctrl+click) and choosing the “Enrichment” button, we can get a list of terms that are significantly enriched in the selected bacteria compared to the rest of the bacteria in the plot (only possible using the gui='qt5' GUI).
To add a new annotation to the selected set of bacteria, choose the “Annotate” button.
Detailed instructions are available at the dbBact.org website.
We now find the bacteria that differ significantly between samples with ‘Control’ (healthy) and ‘Patient’ (sick) in the ‘Subject’ field.
In [9]:
dd=cfs.diff_abundance(field='Subject',val1='Control',val2='Patient', random_seed=2018)
2018-07-26 13:09:57 INFO 87 samples with both values
2018-07-26 13:09:57 INFO After filtering, 1100 remaining
2018-07-26 13:09:57 INFO 39 samples with value 1 (['Control'])
2018-07-26 13:09:58 INFO method meandiff. number of higher in ['Control'] : 38. number of higher in ['Patient'] : 16. total 54
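The log above names the statistic as meandiff. To make the idea concrete, here is a minimal permutation test on the difference of group means for a single bacterium. It is a sketch on invented abundances: the function names are hypothetical, and it does not attempt the multiple-testing correction that an analysis over 1100 bacteria needs.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_pvalue(group1, group2, n_perm=1000, seed=2018):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group1) - mean(group2))
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    hits = 0
    for _ in range(n_perm):
        # shuffle the group labels by shuffling the pooled values
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n1]) - mean(pooled[n1:]))
        if stat >= observed:
            hits += 1
    # add one to numerator and denominator so the p-value is never zero
    return (hits + 1) / (n_perm + 1)


control = [120, 135, 150, 160, 142, 138, 155, 149]
patient = [10, 25, 5, 30, 18, 22, 12, 8]
pval = permutation_pvalue(control, patient)
print(pval)
```

Fixing the seed, as random_seed=2018 does in the call above, makes the shuffles and therefore the result reproducible.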
When clicking on a bacterium, we get information from each database listed in the databases parameter (here dbBact and SpongeEMP).
In [10]:
dd.plot(sample_field='Subject', gui='jupyter', databases=['dbbact','sponge'])
Out[10]:
<calour.heatmap.plotgui_jupyter.PlotGUI_Jupyter at 0x1a16f33f98>
diff_abundance_enrichment
We can ask what is special about the bacteria significantly higher in the Control vs. the Patient group and vice versa.
In [11]:
ax, enriched=dd.plot_diff_abundance_enrichment()
2018-07-26 13:10:03 INFO Getting dbBact annotations for 54 sequences, please wait...
2018-07-26 13:10:08 INFO Got 2328 annotations
2018-07-26 13:10:08 INFO Added annotation data to experiment. Total 705 annotations, 54 terms
2018-07-26 13:10:08 INFO removed 0 terms
The enriched terms are in a calour experiment class (terms are features, bacteria are samples), so we can see the list of enriched terms with the p-value (pvals) and effect size (odif).
In [12]:
enriched.feature_metadata
Out[12]:
(index) | odif | pvals | term |
---|---|---|---|
little physical activity {*single exp 63*} | -18.562500 | 0.000999 | little physical activity {*single exp 63*} |
LOWER IN physical activity {*single exp 63*} | -18.562500 | 0.000999 | LOWER IN physical activity {*single exp 63*} |
LOWER IN rural community | -18.518092 | 0.000999 | LOWER IN rural community |
LOWER IN control | -17.807566 | 0.000999 | LOWER IN control |
LOWER IN small village | -17.452303 | 0.000999 | LOWER IN small village |
LOWER IN tunapuco {*single exp 276*} | -16.430921 | 0.000999 | LOWER IN tunapuco {*single exp 276*} |
LOWER IN peru {*single exp 276*} | -16.430921 | 0.000999 | LOWER IN peru {*single exp 276*} |
crohn's disease | -15.587171 | 0.000999 | crohn's disease |
chronic fatigue syndrome {*single exp 12*} | -15.187500 | 0.000999 | chronic fatigue syndrome {*single exp 12*} |
LOWER IN adult | -14.254934 | 0.001998 | LOWER IN adult |
mus musculus | -13.588816 | 0.000999 | mus musculus |
age > 1 year | -12.478618 | 0.004995 | age > 1 year |
kingdom of denmark {*single exp 273*} | -12.434211 | 0.006993 | kingdom of denmark {*single exp 273*} |
mouse | -11.945724 | 0.000999 | mouse |
state of oklahoma | -11.501645 | 0.007992 | state of oklahoma |
age 1 year | -11.368421 | 0.000999 | age 1 year |
LOWER IN plant diet {*single exp 74*} | -11.368421 | 0.002997 | LOWER IN plant diet {*single exp 74*} |
stroke {*single exp 333*} | -11.235197 | 0.001998 | stroke {*single exp 333*} |
age one year {*single exp 273*} | -11.190789 | 0.003996 | age one year {*single exp 273*} |
research facility | -10.968750 | 0.005994 | research facility |
infant | -10.657895 | 0.018981 | infant |
LOWER IN male | -10.347039 | 0.002997 | LOWER IN male |
LOWER IN age 30-40 {*single exp 330*} | -10.125000 | 0.000999 | LOWER IN age 30-40 {*single exp 330*} |
msw {*single exp 344*} | -10.036184 | 0.007992 | msw {*single exp 344*} |
finland | -10.036184 | 0.024975 | finland |
heterosexual {*single exp 344*} | -10.036184 | 0.007992 | heterosexual {*single exp 344*} |
LOWER IN age <1 year {*single exp 240*} | -9.858553 | 0.019980 | LOWER IN age <1 year {*single exp 240*} |
age | -9.858553 | 0.031968 | age |
animal product diet | -9.769737 | 0.005994 | animal product diet |
obsolete_juvenile stage | -9.547697 | 0.038961 | obsolete_juvenile stage |
... | ... | ... | ... |
LOWER IN state of oklahoma | 14.210526 | 0.000999 | LOWER IN state of oklahoma |
msm {*single exp 344*} | 14.388158 | 0.000999 | msm {*single exp 344*} |
gay {*single exp 344*} | 14.388158 | 0.000999 | gay {*single exp 344*} |
homosexual {*single exp 344*} | 14.388158 | 0.000999 | homosexual {*single exp 344*} |
cron diet {*single exp 293*} | 15.009868 | 0.000999 | cron diet {*single exp 293*} |
caloric restriction diet {*single exp 293*} | 15.009868 | 0.000999 | caloric restriction diet {*single exp 293*} |
LOWER IN united states of america | 15.631579 | 0.000999 | LOWER IN united states of america |
sus scrofa | 15.942434 | 0.000999 | sus scrofa |
pig | 15.942434 | 0.000999 | pig |
right colon {*single exp 256*} | 16.164474 | 0.000999 | right colon {*single exp 256*} |
left colon {*single exp 256*} | 16.164474 | 0.000999 | left colon {*single exp 256*} |
LOWER IN city | 16.342105 | 0.000999 | LOWER IN city |
influent {*single exp 53*} | 17.230263 | 0.000999 | influent {*single exp 53*} |
sewage {*single exp 53*} | 17.230263 | 0.000999 | sewage {*single exp 53*} |
LOWER IN effluent | 17.230263 | 0.000999 | LOWER IN effluent |
wastewater treatment plant | 17.452303 | 0.000999 | wastewater treatment plant |
LOWER IN finland | 17.629934 | 0.000999 | LOWER IN finland |
tanzania {*single exp 190*} | 17.763158 | 0.000999 | tanzania {*single exp 190*} |
hadza {*single exp 190*} | 17.763158 | 0.000999 | hadza {*single exp 190*} |
egypt {*single exp 62*} | 18.118421 | 0.000999 | egypt {*single exp 62*} |
amerindian {*single exp 75*} | 19.184211 | 0.000999 | amerindian {*single exp 75*} |
venezuela {*single exp 75*} | 19.184211 | 0.000999 | venezuela {*single exp 75*} |
south america {*single exp 53*} | 20.250000 | 0.000999 | south america {*single exp 53*} |
peru {*single exp 276*} | 21.315789 | 0.000999 | peru {*single exp 276*} |
tunapuco {*single exp 276*} | 21.315789 | 0.000999 | tunapuco {*single exp 276*} |
rural community | 21.937500 | 0.000999 | rural community |
hunter gatherer | 22.026316 | 0.000999 | hunter gatherer |
el salvador {*single exp 53*} | 22.026316 | 0.000999 | el salvador {*single exp 53*} |
LOWER IN infant | 23.269737 | 0.000999 | LOWER IN infant |
small village | 23.758224 | 0.000999 | small village |
135 rows × 3 columns
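Many rows in the table above report a pvals value of 0.000999. That number equals 1/1001, which is the smallest value a permutation p-value of the form (k + 1) / (n + 1) can take with n = 1000 permutations. Whether calour computes it exactly this way is an assumption. The sketch below checks the arithmetic and splits a few rows copied from the table by the sign of odif.

```python
# smallest p-value reachable with n permutations under the (k + 1) / (n + 1) convention
n_perm = 1000
min_pval = 1 / (n_perm + 1)
print(round(min_pval, 6))

# a few (term, odif, pvals) rows copied from the table above
rows = [
    ('LOWER IN control', -17.807566, 0.000999),
    ("crohn's disease", -15.587171, 0.000999),
    ('infant', -10.657895, 0.018981),
    ('hunter gatherer', 22.026316, 0.000999),
    ('small village', 23.758224, 0.000999),
]

# split by the sign of the effect size, strongest effects first
negative = sorted((r for r in rows if r[1] < 0), key=lambda r: r[1])
positive = sorted((r for r in rows if r[1] > 0), key=lambda r: -r[1])
print([r[0] for r in negative])
print([r[0] for r in positive])
```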
We can plot the enriched terms heatmap to see the term scores for each bacterium.
Note that the rows are now the bacteria and the columns are the terms.
In [16]:
enriched.plot(gui='jupyter', databases=[], feature_field='term',sample_field='group',
yticklabel_kwargs={'rotation': 0, 'size': 7})
Out[16]:
<calour.heatmap.plotgui_jupyter.PlotGUI_Jupyter at 0x1a1c678c50>
We want to see all the annotations where a given term appears, and which bacteria from either group (CFS or healthy) appear in those annotations. To do this, we use dbbact.show_term_details_diff(). The output of this function is an experiment where each column is a bacterium and each row is an annotation, showing whether each bacterium appears in the annotation. Color indicates the annotation type.
In [38]:
dbbact=ca.database._get_database_class('dbbact')
In [40]:
term_info_exp = dbbact.show_term_details_diff('small village',dd,gui='jupyter')
2018-07-26 13:24:01 INFO found 12 annotations with term
2018-07-26 13:24:01 WARNING Do you forget to normalize your data? It is required before running this function
2018-07-26 13:24:01 INFO After filtering, 12 remaining
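The shape of that output (annotations as rows, bacteria as columns) can be pictured with a small presence matrix. The annotations and sequence names below are invented, and presence_matrix is a hypothetical helper, not part of calour or dbBact.

```python
def presence_matrix(annotations, bacteria):
    """Rows are annotations, columns are bacteria; 1 if the bacterium is in the annotation."""
    return [[1 if b in members else 0 for b in bacteria]
            for members in annotations.values()]


# hypothetical annotations containing a term, each listing the sequences it mentions
toy_annotations = {
    'common in small village A': {'seq1', 'seq3'},
    'high in city compared to small village': {'seq2'},
}
toy_bacteria = ['seq1', 'seq2', 'seq3']
matrix = presence_matrix(toy_annotations, toy_bacteria)
for name, row in zip(toy_annotations, matrix):
    print(row, name)
```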
Each annotation comes from a single experiment (as opposed to terms, which can come from annotations in multiple experiments).
In [17]:
ax, enriched=dd.plot_diff_abundance_enrichment(term_type='annotation')
2018-07-26 13:12:53 INFO removed 0 terms
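The difference between a term and an annotation can be shown with a toy data structure: each annotation belongs to one experiment and carries several terms, so a single term may be supported by many experiments. The records below are invented and the field names exp_id and terms are assumptions made for illustration.

```python
from collections import defaultdict

# hypothetical annotations: each belongs to exactly one experiment and carries several terms
toy_annotations = [
    {'exp_id': 1, 'terms': ['feces', 'homo sapiens', 'small village']},
    {'exp_id': 2, 'terms': ['feces', 'homo sapiens', 'city']},
    {'exp_id': 2, 'terms': ['feces', 'infant']},
]

# collect, for every term, the set of experiments whose annotations mention it
experiments_per_term = defaultdict(set)
for annotation in toy_annotations:
    for term in annotation['terms']:
        experiments_per_term[term].add(annotation['exp_id'])

for term, exps in sorted(experiments_per_term.items()):
    print(term, sorted(exps))
```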
In [18]:
enriched.feature_metadata
Out[18]:
(index) | odif | pvals | term |
---|---|---|---|
higher in individuals with low physical activity ( high in little physical activity compared to physical activity in feces homo sapiens united states of america | -18.562500 | 0.000999 | higher in individuals with low physical activi... |
high in united states of america city state of oklahoma compared to peru small village tunapuco rural community in feces homo sapiens adult | -16.430921 | 0.000999 | high in united states of america city state o... |
high in children with Crohn's disease compared to healthy adult controls ( high in crohn's disease child obsolete_juvenile stage compared to control adult in feces homo sapiens glasgow | -15.187500 | 0.000999 | high in children with Crohn's disease compared... |
high in chronic fatigue syndrome compared to control in feces homo sapiens new york county | -15.187500 | 0.000999 | high in chronic fatigue syndrome compared to... |
high in female compared to male in feces homo sapiens united states of america | -15.187500 | 0.000999 | high in female compared to male in feces ho... |
Higher in animal product diet compared to plant diet ( high in diet animal product diet compared to plant diet in feces homo sapiens united states of america | -11.368421 | 0.000999 | Higher in animal product diet compared to plan... |
common feces, homo sapiens, infant, kingdom of norway, oslo, age 1 year, | -10.125000 | 0.000999 | common feces, homo sapiens, infant, kingdom o... |
high in infant age 1 year compared to adult age 30-40 in feces homo sapiens kingdom of norway oslo | -10.125000 | 0.000999 | high in infant age 1 year compared to adult ... |
higher in stroke patients compared to healthy controls ( high in stroke compared to control in feces homo sapiens china adult guangzhou city prefecture | -10.125000 | 0.001998 | higher in stroke patients compared to healthy ... |
lower in infants age<1 year compared to 1-3 years in baby feces ( high in age age > 1 year compared to age <1 year in feces homo sapiens infant finland | -9.858553 | 0.016983 | lower in infants age<1 year compared to 1-3 ye... |
lower in gay (msm) individuals compared to heterosexual (msw) ( high in heterosexual msw compared to gay homosexual msm in feces homo sapiens united states of america state of colorado denver | -9.414474 | 0.002997 | lower in gay (msm) individuals compared to het... |
high in age 1 year compared to age 2 months in feces homo sapiens female infant state of california | -9.414474 | 0.001998 | high in age 1 year compared to age 2 months ... |
higher in lean participants in human feces ( high in low bmi compared to high bmi in feces homo sapiens united states of america adult | -8.437500 | 0.001998 | higher in lean participants in human feces ( h... |
common feces, homo sapiens, infant, kingdom of denmark, age one year, | -8.437500 | 0.000999 | common feces, homo sapiens, infant, kingdom o... |
high in age age one month compared to age one week in feces homo sapiens infant kingdom of denmark | -7.726974 | 0.006993 | high in age age one month compared to age on... |
high in healthy dogs compared to EPI dogs without treatment ( high in control compared to exocrine pancreatic insufficiency in feces united states of america canis lupus familiaris dog | -7.726974 | 0.006993 | high in healthy dogs compared to EPI dogs with... |
common feces, homo sapiens, china, city, adult, | -7.371711 | 0.061938 | common feces, homo sapiens, china, city, adult, |
high in age age > 1 year compared to age < 1 year in feces homo sapiens united states of america infant | -7.016447 | 0.019980 | high in age age > 1 year compared to age < 1... |
negatively correlated with age (30-80 years) ( high in age age 30-50 years compared to age 50-80 years in feces homo sapiens south korea | -7.016447 | 0.017982 | negatively correlated with age (30-80 years) (... |
common feces, homo sapiens, diarrhea, state of michigan, clostridium difficile intestinal infectious disease, | -6.750000 | 0.005994 | common feces, homo sapiens, diarrhea, state o... |
high in city compared to small village rural community in feces homo sapiens china adult | -6.750000 | 0.006993 | high in city compared to small village rural... |
high in old (14-28 days) compared to young (0-3 day) chickens ( high in age old age compared to young age in united states of america caecum gallus gallus chicken | -6.750000 | 0.010989 | high in old (14-28 days) compared to young (0-... |
common feces, united states of america, canis lupus familiaris, iowa, | -6.750000 | 0.002997 | common feces, united states of america, canis... |
common in infants age <3 years (common feces, homo sapiens, infant, finland, age < 3 years, | -6.750000 | 0.006993 | common in infants age <3 years (common feces,... |
positively correlated with bmi ( high in body mass index high bmi compared to low bmi in feces homo sapiens united kingdom | -6.305921 | 0.053946 | positively correlated with bmi ( high in body ... |
common feces, united states of america, canis lupus familiaris, dog, | -6.305921 | 0.035964 | common feces, united states of america, canis... |
common feces, homo sapiens, china, crohn's disease, adult, | -6.305921 | 0.044955 | common feces, homo sapiens, china, crohn's di... |
higher in feces of individuals with kidney stones ( high in nephrolithiasis compared to control in feces homo sapiens china adult nanning city prefecture age 50-60 years | -5.328947 | 0.057942 | higher in feces of individuals with kidney sto... |
higher in babies from finland compared to estonia ( high in finland compared to estonia in feces homo sapiens infant age < 3 years | -5.062500 | 0.014985 | higher in babies from finland compared to esto... |
common united states of america, colon, canis lupus familiaris, dog, | -5.062500 | 0.020979 | common united states of america, colon, canis... |
... | ... | ... | ... |
common feces, ethiopia, monkey, chlorocebus djamdjamensis, bale monkey, | 9.236842 | 0.018981 | common feces, ethiopia, monkey, chlorocebus d... |
common in feces of homosexual males (common feces, homo sapiens, united states of america, state of colorado, denver, gay, homosexual, msm, | 9.414474 | 0.025974 | common in feces of homosexual males (common f... |
common feces, homo sapiens, brazil, | 9.592105 | 0.042957 | common feces, homo sapiens, brazil, |
common duodenum, jejunum, ileum, sus scrofa, united kingdom, pig, | 9.680921 | 0.014985 | common duodenum, jejunum, ileum, sus scrofa, ... |
high in healthy adult controls compared to children with Crohn's disease ( high in control adult compared to crohn's disease child obsolete_juvenile stage in feces homo sapiens glasgow | 9.947368 | 0.005994 | high in healthy adult controls compared to chi... |
higher in caloric restriction (CRON) diet compared to american diet ( high in diet cron diet caloric restriction diet compared to american diet in feces homo sapiens united states of america adult | 9.947368 | 0.004995 | higher in caloric restriction (CRON) diet comp... |
common feces, homo sapiens, adult, india, | 9.947368 | 0.007992 | common feces, homo sapiens, adult, india, |
low in diarrhea compared to recovery period ( high in control compared to diarrhea in feces homo sapiens adult bangladesh | 10.036184 | 0.016983 | low in diarrhea compared to recovery period ( ... |
common feces, homo sapiens, united states of america, adult, cron diet, caloric restriction diet, | 10.125000 | 0.013986 | common feces, homo sapiens, united states of ... |
common feces, chlorocebus aethiops, ethiopia, monkey, grivet monkey, | 10.657895 | 0.009990 | common feces, chlorocebus aethiops, ethiopia,... |
common feces, chlorocebus aethiops, ethiopia, monkey, vervet monkey, | 10.657895 | 0.004995 | common feces, chlorocebus aethiops, ethiopia,... |
higher in gay (msm) individuals compared to heterosexual (msw) ( high in gay homosexual msm compared to heterosexual msw in feces homo sapiens united states of america state of colorado denver | 10.657895 | 0.007992 | higher in gay (msm) individuals compared to he... |
high in wet season compared to dry season in feces homo sapiens tanzania hunter gatherer hadza | 10.657895 | 0.003996 | high in wet season compared to dry season i... |
common feces, homo sapiens, united states of america, child, obsolete_juvenile stage, | 11.190789 | 0.003996 | common feces, homo sapiens, united states of ... |
high in control compared to chronic fatigue syndrome in feces homo sapiens new york county | 11.368421 | 0.003996 | high in control compared to chronic fatigue ... |
higher in babies from russia compared to finland ( high in russia compared to finland in feces homo sapiens infant age < 3 years | 11.812500 | 0.004995 | higher in babies from russia compared to finla... |
common feces, homo sapiens, city, lima, shantytown, | 11.812500 | 0.004995 | common feces, homo sapiens, city, lima, shant... |
common feces, homo sapiens, tanzania, hunter gatherer, hadza, | 12.789474 | 0.000999 | common feces, homo sapiens, tanzania, hunter ... |
lower in small intestine compared to colon in pigs ( high in caecum left colon right colon compared to duodenum jejunum ileum in sus scrofa united kingdom pig | 12.789474 | 0.000999 | lower in small intestine compared to colon in ... |
high in adult age 30-40 compared to infant age 1 year in feces homo sapiens kingdom of norway oslo | 13.233553 | 0.002997 | high in adult age 30-40 compared to infant a... |
high in peru small village tunapuco rural community compared to united states of america city state of oklahoma in feces homo sapiens adult | 14.210526 | 0.001998 | high in peru small village tunapuco rural com... |
common caecum, left colon, right colon, sus scrofa, united kingdom, pig, | 14.654605 | 0.000999 | common caecum, left colon, right colon, sus s... |
common feces, homo sapiens, child, egypt, obsolete_juvenile stage, | 15.009868 | 0.000999 | common feces, homo sapiens, child, egypt, obs... |
high in male compared to female in feces homo sapiens united states of america | 15.631579 | 0.000999 | high in male compared to female in feces ho... |
lower in babies from finland compared to estonia ( high in estonia compared to finland in feces homo sapiens infant age < 3 years | 16.342105 | 0.000999 | lower in babies from finland compared to eston... |
lower in wastewater plant effluent compared to influent and sewer in south america ( high in sewage influent compared to effluent in city wastewater treatment plant south america | 17.230263 | 0.000999 | lower in wastewater plant effluent compared to... |
common feces, homo sapiens, venezuela, amerindian, hunter gatherer, | 19.184211 | 0.000999 | common feces, homo sapiens, venezuela, amerin... |
common feces, homo sapiens, adult, peru, small village, tunapuco, rural community, | 19.184211 | 0.000999 | common feces, homo sapiens, adult, peru, smal... |
common feces, homo sapiens, city, el salvador, small village, | 20.605263 | 0.000999 | common feces, homo sapiens, city, el salvador... |
high in adult compared to infant age < 1 year in feces homo sapiens india | 22.470395 | 0.000999 | high in adult compared to infant age < 1 yea... |
79 rows × 3 columns
In [19]:
ax, enriched=dd.plot_diff_abundance_enrichment(term_type='combined')
2018-07-26 13:13:05 INFO removed 0 terms
In [20]:
enriched.feature_metadata
Out[20]:
(index) | odif | pvals | term |
---|---|---|---|
higher in individuals with low physical activity ( high in little physical activity compared to physical activity in feces homo sapiens united states of america | -18.562500 | 0.000999 | higher in individuals with low physical activi... |
LOWER IN physical activity {*single exp 63*} | -18.562500 | 0.000999 | LOWER IN physical activity {*single exp 63*} |
little physical activity {*single exp 63*} | -18.562500 | 0.000999 | little physical activity {*single exp 63*} |
LOWER IN rural community | -18.518092 | 0.000999 | LOWER IN rural community |
LOWER IN control | -17.807566 | 0.000999 | LOWER IN control |
LOWER IN small village | -17.452303 | 0.000999 | LOWER IN small village |
LOWER IN peru {*single exp 276*} | -16.430921 | 0.000999 | LOWER IN peru {*single exp 276*} |
high in united states of america city state of oklahoma compared to peru small village tunapuco rural community in feces homo sapiens adult | -16.430921 | 0.000999 | high in united states of america city state o... |
LOWER IN tunapuco {*single exp 276*} | -16.430921 | 0.000999 | LOWER IN tunapuco {*single exp 276*} |
crohn's disease | -15.587171 | 0.000999 | crohn's disease |
high in children with Crohn's disease compared to healthy adult controls ( high in crohn's disease child obsolete_juvenile stage compared to control adult in feces homo sapiens glasgow | -15.187500 | 0.000999 | high in children with Crohn's disease compared... |
high in chronic fatigue syndrome compared to control in feces homo sapiens new york county | -15.187500 | 0.000999 | high in chronic fatigue syndrome compared to... |
high in female compared to male in feces homo sapiens united states of america | -15.187500 | 0.000999 | high in female compared to male in feces ho... |
chronic fatigue syndrome {*single exp 12*} | -15.187500 | 0.000999 | chronic fatigue syndrome {*single exp 12*} |
LOWER IN adult | -14.254934 | 0.001998 | LOWER IN adult |
mus musculus | -13.588816 | 0.000999 | mus musculus |
age > 1 year | -12.478618 | 0.000999 | age > 1 year |
kingdom of denmark {*single exp 273*} | -12.434211 | 0.001998 | kingdom of denmark {*single exp 273*} |
mouse | -11.945724 | 0.000999 | mouse |
state of oklahoma | -11.501645 | 0.007992 | state of oklahoma |
LOWER IN plant diet {*single exp 74*} | -11.368421 | 0.001998 | LOWER IN plant diet {*single exp 74*} |
age 1 year | -11.368421 | 0.000999 | age 1 year |
Higher in animal product diet compared to plant diet ( high in diet animal product diet compared to plant diet in feces homo sapiens united states of america | -11.368421 | 0.001998 | Higher in animal product diet compared to plan... |
stroke {*single exp 333*} | -11.235197 | 0.001998 | stroke {*single exp 333*} |
age one year {*single exp 273*} | -11.190789 | 0.004995 | age one year {*single exp 273*} |
research facility | -10.968750 | 0.011988 | research facility |
infant | -10.657895 | 0.014985 | infant |
LOWER IN male | -10.347039 | 0.001998 | LOWER IN male |
LOWER IN age 30-40 {*single exp 330*} | -10.125000 | 0.000999 | LOWER IN age 30-40 {*single exp 330*} |
common feces, homo sapiens, infant, kingdom of norway, oslo, age 1 year, | -10.125000 | 0.000999 | common feces, homo sapiens, infant, kingdom o... |
... | ... | ... | ... |
high in male compared to female in feces homo sapiens united states of america | 15.631579 | 0.000999 | high in male compared to female in feces ho... |
pig | 15.942434 | 0.001998 | pig |
sus scrofa | 15.942434 | 0.001998 | sus scrofa |
left colon {*single exp 256*} | 16.164474 | 0.000999 | left colon {*single exp 256*} |
right colon {*single exp 256*} | 16.164474 | 0.000999 | right colon {*single exp 256*} |
lower in babies from finland compared to estonia ( high in estonia compared to finland in feces homo sapiens infant age < 3 years | 16.342105 | 0.000999 | lower in babies from finland compared to eston... |
LOWER IN city | 16.342105 | 0.000999 | LOWER IN city |
sewage {*single exp 53*} | 17.230263 | 0.000999 | sewage {*single exp 53*} |
LOWER IN effluent | 17.230263 | 0.000999 | LOWER IN effluent |
influent {*single exp 53*} | 17.230263 | 0.000999 | influent {*single exp 53*} |
lower in wastewater plant effluent compared to influent and sewer in south america ( high in sewage influent compared to effluent in city wastewater treatment plant south america | 17.230263 | 0.000999 | lower in wastewater plant effluent compared to... |
wastewater treatment plant | 17.452303 | 0.000999 | wastewater treatment plant |
LOWER IN finland | 17.629934 | 0.000999 | LOWER IN finland |
hadza {*single exp 190*} | 17.763158 | 0.000999 | hadza {*single exp 190*} |
tanzania {*single exp 190*} | 17.763158 | 0.000999 | tanzania {*single exp 190*} |
egypt {*single exp 62*} | 18.118421 | 0.000999 | egypt {*single exp 62*} |
venezuela {*single exp 75*} | 19.184211 | 0.000999 | venezuela {*single exp 75*} |
amerindian {*single exp 75*} | 19.184211 | 0.000999 | amerindian {*single exp 75*} |
common feces, homo sapiens, venezuela, amerindian, hunter gatherer, | 19.184211 | 0.000999 | common feces, homo sapiens, venezuela, amerin... |
common feces, homo sapiens, adult, peru, small village, tunapuco, rural community, | 19.184211 | 0.000999 | common feces, homo sapiens, adult, peru, smal... |
south america {*single exp 53*} | 20.250000 | 0.000999 | south america {*single exp 53*} |
common feces, homo sapiens, city, el salvador, small village, | 20.605263 | 0.000999 | common feces, homo sapiens, city, el salvador... |
tunapuco {*single exp 276*} | 21.315789 | 0.000999 | tunapuco {*single exp 276*} |
peru {*single exp 276*} | 21.315789 | 0.000999 | peru {*single exp 276*} |
rural community | 21.937500 | 0.000999 | rural community |
el salvador {*single exp 53*} | 22.026316 | 0.000999 | el salvador {*single exp 53*} |
hunter gatherer | 22.026316 | 0.000999 | hunter gatherer |
high in adult compared to infant age < 1 year in feces homo sapiens india | 22.470395 | 0.000999 | high in adult compared to infant age < 1 yea... |
LOWER IN infant | 23.269737 | 0.000999 | LOWER IN infant |
small village | 23.758224 | 0.000999 | small village |
215 rows × 3 columns
If our experiment is already in dbBact, or if there are other experiments in dbBact we do not want to include in the enrichment analysis, we can specify them using the ignore_exp=[expID,...] parameter.
In our case, the cfs experiment is already in dbBact, so let’s ignore its annotations when doing the analysis. By looking at dbBact.org we know its experimentID is 12.
Alternatively, we can use ignore_exp=True to automatically detect the current experimentID if it exists in dbBact (using the md5 hash of the data and mapping files).
In [21]:
ax, enriched=dd.plot_diff_abundance_enrichment(term_type='combined', ignore_exp=[12])
2018-07-26 13:13:12 INFO removed 0 terms
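Conceptually, ignoring an experiment amounts to discarding its annotations before any enrichment is computed. A minimal sketch, using invented annotation records and a hypothetical helper drop_experiments:

```python
def drop_experiments(annotations, ignore_exp):
    """Return only the annotations whose experiment id is not in ignore_exp."""
    ignored = set(ignore_exp)
    return [a for a in annotations if a['exp_id'] not in ignored]


# hypothetical annotation records; exp_id 12 stands for the CFS experiment
toy_annotations = [
    {'exp_id': 12, 'text': 'high in chronic fatigue syndrome compared to control'},
    {'exp_id': 63, 'text': 'higher in individuals with low physical activity'},
    {'exp_id': 276, 'text': 'high in peru small village compared to city'},
]
remaining = drop_experiments(toy_annotations, ignore_exp=[12])
print([a['exp_id'] for a in remaining])
```

Leaving out the experiment under study avoids the circular result of an experiment being enriched for its own annotations.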
add_terms_to_features
We can attach to each bacterium the most common dbBact term associated with it.
The terms are selected from all of the dbBact terms, or can be selected from a supplied list.
In [22]:
cfs=cfs.add_terms_to_features(dbname='dbbact',use_term_list=['feces','saliva','skin','mus musculus'])
2018-07-26 13:13:20 INFO Getting dbBact annotations for 1100 sequences, please wait...
2018-07-26 13:13:32 INFO Got 24053 annotations
2018-07-26 13:13:32 INFO Added annotation data to experiment. Total 2151 annotations, 1100 terms
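The selection of a single term per bacterium can be sketched as a frequency count restricted to the supplied list. This is an illustration on invented terms, not the library's algorithm; most_common_term is a hypothetical helper.

```python
from collections import Counter


def most_common_term(terms, use_term_list=None):
    """Return the most frequent term, optionally restricted to use_term_list."""
    if use_term_list is not None:
        allowed = set(use_term_list)
        terms = [t for t in terms if t in allowed]
    if not terms:
        return None  # no candidate term for this bacterium
    return Counter(terms).most_common(1)[0][0]


# hypothetical terms collected from all annotations of each sequence
toy_terms = {
    'seq1': ['feces', 'feces', 'homo sapiens', 'saliva'],
    'seq2': ['skin', 'homo sapiens', 'homo sapiens'],
    'seq3': ['soil'],
}
allowed_terms = ['feces', 'saliva', 'skin', 'mus musculus']
common = {seq: most_common_term(t, allowed_terms) for seq, t in toy_terms.items()}
print(common)
```

Restricting the candidates to a short list keeps ubiquitous terms such as 'homo sapiens' from dominating every bacterium.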
In [23]:
tt=cfs.sort_by_metadata('common_term',axis='feature')
In [24]:
tt.plot(sample_field='Subject', feature_field='common_term', gui='jupyter')
Out[24]:
<calour.heatmap.plotgui_jupyter.PlotGUI_Jupyter at 0x1a1cbe0e48>
Instead of just comparing the bacteria enriched in the two groups (and then comparing terms between them), we can compute a weighted term average for each group using all bacteria (weighting the terms of each bacterium by its frequency in the sample). This can work when we don’t have a strong set of bacteria separating the two groups.
In [25]:
dbbact=ca.database._get_database_class('dbbact')
In [32]:
enriched=dbbact.sample_enrichment(cfs,'Subject','Control','Patient',
term_type='combined',ignore_exp=[12])
2018-07-26 13:17:22 INFO 87 samples with both values
2018-07-26 13:17:22 WARNING Do you forget to normalize your data? It is required before running this function
2018-07-26 13:17:22 INFO After filtering, 2704 remaining
2018-07-26 13:17:22 INFO 39 samples with value 1 (['Control'])
2018-07-26 13:17:24 INFO method meandiff. number of higher in ['Control'] : 455. number of higher in ['Patient'] : 51. total 506
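The weighting idea can be sketched as follows: each sample gets a score per term equal to the summed frequency of the bacteria carrying that term, and the resulting term scores can then be compared between groups like any other feature. The data are invented and weighted_term_scores is a hypothetical helper; the exact scoring used by sample_enrichment may differ.

```python
def weighted_term_scores(sample_freqs, bacteria_terms):
    """Score each term in a sample as the summed frequency of bacteria carrying it."""
    scores = {}
    for bacterium, freq in sample_freqs.items():
        for term in bacteria_terms.get(bacterium, ()):
            scores[term] = scores.get(term, 0) + freq
    return scores


# hypothetical term sets per bacterium and normalized frequencies for two samples
toy_bacteria_terms = {
    'seq1': {'feces', 'small village'},
    'seq2': {'feces', 'city'},
}
control_sample = {'seq1': 8000, 'seq2': 2000}
patient_sample = {'seq1': 1000, 'seq2': 9000}

control_scores = weighted_term_scores(control_sample, toy_bacteria_terms)
patient_scores = weighted_term_scores(patient_sample, toy_bacteria_terms)
print(control_scores['small village'], patient_scores['small village'])
```

Because every bacterium contributes in proportion to its frequency, no prior step of picking significantly different bacteria is needed.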
In [27]:
enriched.feature_metadata
Out[27]:
(index) | term | num_features | _calour_diff_abundance_effect | _calour_diff_abundance_pval | _calour_diff_abundance_group |
---|---|---|---|---|---|
enzyme supplement | enzyme supplement | 20 | -1.467864 | 0.000999 | Patient |
-no enzyme supplement | -no enzyme supplement | 20 | -1.252388 | 0.000999 | Patient |
high in EPI dogs with enzyme supplement compared to no supplement ( high in enzyme supplement compared to no enzyme supplement in feces united states of america exocrine pancreatic insufficiency canis lupus familiaris dog | high in EPI dogs with enzyme supplement compar... | 20 | -1.252388 | 0.000999 | Patient |
-gastric bypass | -gastric bypass | 4 | -1.009475 | 0.000999 | Patient |
lower in people with Roux-en-Y gastric bypass compared to controls ( high in control compared to gastric bypass in feces homo sapiens united states of america | lower in people with Roux-en-Y gastric bypass ... | 4 | -1.009475 | 0.000999 | Patient |
-physical activity | -physical activity | 49 | -0.964541 | 0.001998 | Patient |
higher in individuals with low physical activity ( high in little physical activity compared to physical activity in feces homo sapiens united states of america | higher in individuals with low physical activi... | 49 | -0.964541 | 0.001998 | Patient |
little physical activity | little physical activity | 49 | -0.931222 | 0.002997 | Patient |
high in children with Crohn's disease compared to healthy adult controls ( high in crohn's disease child obsolete_juvenile stage compared to control adult in feces homo sapiens glasgow | high in children with Crohn's disease compared... | 53 | -0.874353 | 0.000999 | Patient |
-age 30-40 | -age 30-40 | 16 | -0.832852 | 0.000999 | Patient |
high in infant age 1 year compared to adult age 30-40 in feces homo sapiens kingdom of norway oslo | high in infant age 1 year compared to adult ... | 16 | -0.832852 | 0.000999 | Patient |
salmune vaccination | salmune vaccination | 12 | -0.821224 | 0.001998 | Patient |
-salmune vaccination | -salmune vaccination | 28 | -0.812220 | 0.001998 | Patient |
-vaccination | -vaccination | 28 | -0.812220 | 0.001998 | Patient |
higher in non-vaccinated chickens ( high in control compared to vaccination salmune vaccination in united states of america caecum gallus gallus chicken | higher in non-vaccinated chickens ( high in co... | 28 | -0.812220 | 0.001998 | Patient |
pulsed antibiotic treatment, macrolide tylosin tartrate | pulsed antibiotic treatment, macrolide tylosin... | 9 | -0.771024 | 0.004995 | Patient |
exocrine pancreatic insufficiency | exocrine pancreatic insufficiency | 33 | -0.728781 | 0.002997 | Patient |
highfreq feces, acinonyx jubatus, namibia, | highfreq feces, acinonyx jubatus, namibia, | 7 | -0.661063 | 0.000999 | Patient |
common united states of america, caecum, gallus gallus, chicken, age 14-28 days, | common united states of america, caecum, gall... | 23 | -0.626557 | 0.001998 | Patient |
higher in stroke patients compared to healthy controls ( high in stroke compared to control in feces homo sapiens china adult guangzhou city prefecture | higher in stroke patients compared to healthy ... | 31 | -0.610551 | 0.001998 | Patient |
high in old (14-28 days) compared to young (0-3 day) chickens ( high in age old age compared to young age in united states of america caecum gallus gallus chicken | high in old (14-28 days) compared to young (0-... | 36 | -0.601091 | 0.003996 | Patient |
high in control compared to diarrhea in feces felis catus state of california | high in control compared to diarrhea in fec... | 10 | -0.592252 | 0.006993 | Patient |
higher in babies from finland compared to estonia ( high in finland compared to estonia in feces homo sapiens infant age < 3 years | higher in babies from finland compared to esto... | 26 | -0.589219 | 0.012987 | Patient |
canis mesomelas | canis mesomelas | 33 | -0.569207 | 0.001998 | Patient |
smj: higher in female mice feces treated with antibiotics ( high in antibiotic pulsed antibiotic treatment, macrolide tylosin tartrate compared to control in feces united states of america female research facility mus musculoides nyulmc nod/shiltj (no. 001976, jackson labs) | smj: higher in female mice feces treated with ... | 9 | -0.563986 | 0.011988 | Patient |
common feces, namibia, canis mesomelas, | common feces, namibia, canis mesomelas, | 33 | -0.559916 | 0.002997 | Patient |
acinonyx jubatus | acinonyx jubatus | 17 | -0.553638 | 0.002997 | Patient |
-dust day | -dust day | 21 | -0.550642 | 0.012987 | Patient |
higher in dust storm compared to clear day in israel air ( high in clear day compared to dust day in air dust israel size < 10um | higher in dust storm compared to clear day in ... | 21 | -0.550642 | 0.012987 | Patient |
namibia | namibia | 40 | -0.538952 | 0.001998 | Patient |
... | ... | ... | ... | ... | ... |
peru | peru | 234 | 0.851086 | 0.000999 | Control |
tunapuco | tunapuco | 229 | 0.856866 | 0.000999 | Control |
-tibetan pig | -tibetan pig | 26 | 0.866270 | 0.001998 | Control |
-tibetan swine | -tibetan swine | 26 | 0.866270 | 0.001998 | Control |
high in sus scrofa pig compared to tibetan pig tibetan swine in china farm caecum tibet autonomous region cecal content | high in sus scrofa pig compared to tibetan p... | 26 | 0.866270 | 0.001998 | Control |
high in male compared to female in feces homo sapiens united states of america | high in male compared to female in feces ho... | 129 | 0.878053 | 0.000999 | Control |
lower in lean participants in human feces ( high in high bmi compared to low bmi in feces homo sapiens united states of america adult | lower in lean participants in human feces ( hi... | 25 | 0.914211 | 0.001998 | Control |
lower in babies from finland compared to estonia ( high in estonia compared to finland in feces homo sapiens infant age < 3 years | lower in babies from finland compared to eston... | 110 | 0.943118 | 0.000999 | Control |
lower in small intestine compared to colon in pigs ( high in caecum left colon right colon compared to duodenum jejunum ileum in sus scrofa united kingdom pig | lower in small intestine compared to colon in ... | 137 | 0.962579 | 0.000999 | Control |
-irritable bowel syndrome | -irritable bowel syndrome | 65 | 0.969604 | 0.000999 | Control |
high in control compared to irritable bowel syndrome in feces homo sapiens adult kingdom of spain | high in control compared to irritable bowel ... | 65 | 0.969604 | 0.000999 | Control |
-united states of america | -united states of america | 195 | 0.979835 | 0.002997 | Control |
-camp hukamako | -camp hukamako | 6 | 0.993227 | 0.000999 | Control |
lower in Hadza camp Hukamako compared to hadza camp Sengeli ( high in camp sengeli compared to camp hukamako in feces homo sapiens tanzania hunter gatherer hadza | lower in Hadza camp Hukamako compared to hadza... | 6 | 0.993227 | 0.000999 | Control |
common feces, ethiopia, monkey, theropithecus gelada, | common feces, ethiopia, monkey, theropithecus... | 17 | 1.005345 | 0.001998 | Control |
plant based diet | plant based diet | 4 | 1.033398 | 0.000999 | Control |
-little physical activity | -little physical activity | 84 | 1.051961 | 0.001998 | Control |
higher in individuals with high physical activity ( high in physical activity compared to little physical activity in feces homo sapiens united states of america | higher in individuals with high physical activ... | 84 | 1.051961 | 0.001998 | Control |
high in male compared to female in feces homo sapiens toronto | high in male compared to female in feces ho... | 11 | 1.053019 | 0.000999 | Control |
physical activity | physical activity | 84 | 1.072794 | 0.001998 | Control |
-city | -city | 184 | 1.084257 | 0.000999 | Control |
highfreq caecum, left colon, right colon, sus scrofa, united kingdom, pig, | highfreq caecum, left colon, right colon, sus... | 10 | 1.108316 | 0.000999 | Control |
hiv infection | hiv infection | 67 | 1.112100 | 0.000999 | Control |
-state of oklahoma | -state of oklahoma | 177 | 1.132083 | 0.000999 | Control |
high in peru small village tunapuco rural community compared to united states of america city state of oklahoma in feces homo sapiens adult | high in peru small village tunapuco rural com... | 177 | 1.132083 | 0.000999 | Control |
camp sengeli | camp sengeli | 6 | 1.177868 | 0.000999 | Control |
-msw | -msw | 82 | 1.255350 | 0.000999 | Control |
-heterosexual | -heterosexual | 82 | 1.255350 | 0.000999 | Control |
higher in gay (msm) individuals compared to heterosexual (msw) ( high in gay homosexual msm compared to heterosexual msw in feces homo sapiens united states of america state of colorado denver | higher in gay (msm) individuals compared to he... | 82 | 1.255350 | 0.000999 | Control |
high in hiv infection compared to control in feces homo sapiens united states of america | high in hiv infection compared to control i... | 65 | 1.518273 | 0.000999 | Control |
516 rows × 5 columns
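In the table above, the negative effect values are labelled Patient and the positive ones Control. As a minimal sketch of how such a result could be split by group and ranked, the snippet below rebuilds a few of the rows shown above in a small pandas DataFrame. The column names (`term`, `count`, `effect`, `pval`, `group`) and the helper `top_terms` are assumptions made for illustration, not the names calour uses, and the meaning of the count column is not stated in this output; inspect `.columns` on the real result before adapting it.

```python
import pandas as pd

# A few rows copied from the table above. The column names are
# hypothetical; the real DataFrame may label its columns differently.
rows = [
    ("-gastric bypass", 4, -1.009475, 0.000999, "Patient"),
    ("little physical activity", 49, -0.931222, 0.002997, "Patient"),
    ("physical activity", 84, 1.072794, 0.001998, "Control"),
    ("hiv infection", 67, 1.112100, 0.000999, "Control"),
]
enriched = pd.DataFrame(rows, columns=["term", "count", "effect", "pval", "group"])


def top_terms(df, group, n=2):
    """Return the n rows of one group with the largest absolute effect."""
    subset = df[df["group"] == group]
    # Rank by absolute effect so the helper works for either sign.
    order = subset["effect"].abs().sort_values(ascending=False).index
    return subset.loc[order].head(n)


patient_top = top_terms(enriched, "Patient")
control_top = top_terms(enriched, "Control")

print(patient_top[["term", "effect", "pval"]])
print(control_top[["term", "effect", "pval"]])
```

Ranking by absolute value keeps the same helper usable for both groups, since the sign of the effect only indicates which group the term is associated with.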