predict_metagenomes.py – This script produces the actual metagenome functional predictions for a given OTU table.

Description:

Usage: predict_metagenomes.py [options]

Input Arguments:

[REQUIRED]

-i, --input_otu_table
The input OTU table in BIOM format
-o, --output_metagenome_table
The output file for the predicted metagenome

[OPTIONAL]

-t, --type_of_prediction
Type of functional predictions. Valid choices are: ko, cog, rfam [default: ko]
-g, --gg_version
Version of GreenGenes that was used for OTU picking. Valid choices are: 13_5, 18may2012 [default: 13_5]
-c, --input_count_table
Precalculated function predictions on a per-OTU basis in BIOM format (can be gzipped). Note: using this option overrides --type_of_prediction and --gg_version. [default: None]
-a, --accuracy_metrics
If provided, calculate accuracy metrics for the predicted metagenome. NOTE: requires that per-genome accuracy metrics were calculated using predict_traits.py during genome prediction (e.g. there are “NSTI” values in the genome .biom file metadata)
--suppress_subset_loading
Normally, only counts for OTUs present in the sample are loaded. If this flag is passed, the full biom table is loaded. This makes no difference for the analysis, but may result in faster load times (at the cost of more memory usage)
--load_precalc_file_in_biom
Instead of loading the precalculated file in tab-delimited format (with OTU ids as row ids and traits as columns), load the data in BIOM format (with OTUs as SampleIds and traits as ObservationIds) [default: False]
--input_variance_table
Precalculated table of variances corresponding to the precalculated table of function predictions. As with the count table, these are on a per-OTU basis and in BIOM format (can be gzipped). Note: using this option overrides --type_of_prediction and --gg_version. [default: None]
--with_confidence
Calculate 95% confidence intervals for metagenome predictions. By default, this uses the confidence intervals for the precalculated table of genes for GreenGenes OTUs. If you pass a custom count table with -c and select this option, you must also specify a corresponding table of confidence intervals for the gene content prediction using --input_variance_table (these are generated by running predict_traits.py with the --with_confidence option). If this flag is set, three additional output files will be generated, named the same as the metagenome prediction output, but with .variance, .upper_CI or .lower_CI appended immediately before the file extension [default: False]
-f, --format_tab_delimited
Output the predicted metagenome table in tab-delimited format [default: False]
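
The effect of subset loading on a tab-delimited precalculated table (OTU ids as row ids, traits as columns) can be illustrated with a minimal sketch. This is not the script's own loader: the function name load_trait_rows, the header label otu_id, and the sample values are hypothetical and chosen only for illustration. The sketch keeps only rows whose OTU id is in the wanted set, and keeps every row when no set is given, which corresponds in spirit to --suppress_subset_loading.

```python
import csv
import io


def load_trait_rows(handle, wanted_otu_ids=None):
    """Read a tab-delimited trait table into {otu_id: {trait_id: count}}.

    If wanted_otu_ids is given, rows for other OTUs are skipped while
    reading, so they never take up memory. Hypothetical helper, not part
    of the script itself.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    trait_ids = header[1:]  # first column holds the OTU id
    table = {}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        otu_id = row[0]
        if wanted_otu_ids is not None and otu_id not in wanted_otu_ids:
            continue
        table[otu_id] = dict(zip(trait_ids, (float(v) for v in row[1:])))
    return table


# Hypothetical table contents, for illustration only.
sample_text = (
    "otu_id\tK00001\tK00002\n"
    "otu1\t2\t0\n"
    "otu2\t1\t3\n"
    "otu3\t4\t4\n"
)

subset = load_trait_rows(io.StringIO(sample_text), {"otu1", "otu3"})
full = load_trait_rows(io.StringIO(sample_text))
```

Both calls give the same values for otu1 and otu3, which matches the statement above that the choice makes no difference for the analysis; only the amount of data held in memory differs.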

Output:

Output is a table of function counts (e.g. KEGG KOs) by sample ids.
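
As a rough illustration of how such a table relates to the inputs, the sketch below multiplies each OTU's abundance in a sample by that OTU's per-OTU function counts and sums the products per function. It is a simplified sketch, not the script's implementation: the function name predict_function_counts, the plain-dictionary data layout, and all sample values are assumptions made for illustration.

```python
def predict_function_counts(otu_counts, trait_counts):
    """Combine OTU abundances with per-OTU function counts.

    otu_counts:   {sample_id: {otu_id: abundance}}
    trait_counts: {otu_id: {function_id: copies per genome}}
    Returns       {sample_id: {function_id: predicted count}}
    Hypothetical helper, for illustration only.
    """
    predicted = {}
    for sample_id, otus in otu_counts.items():
        totals = {}
        for otu_id, abundance in otus.items():
            # OTUs missing from the trait table contribute nothing here.
            for function_id, copies in trait_counts.get(otu_id, {}).items():
                totals[function_id] = totals.get(function_id, 0.0) + abundance * copies
        predicted[sample_id] = totals
    return predicted


# Hypothetical inputs, for illustration only.
otu_counts = {
    "sampleA": {"otu1": 10, "otu2": 5},
    "sampleB": {"otu1": 0, "otu2": 4},
}
trait_counts = {
    "otu1": {"K00001": 2, "K00002": 0},
    "otu2": {"K00001": 1, "K00002": 3},
}

result = predict_function_counts(otu_counts, trait_counts)
print(result["sampleA"])  # {'K00001': 25.0, 'K00002': 15.0}
```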

Predict KO abundances for a given OTU table picked against the newest version of GreenGenes.

predict_metagenomes.py -i normalized_otus.biom -o predicted_metagenomes.biom

Change output format to plain tab-delimited:

predict_metagenomes.py -f -i normalized_otus.biom -o predicted_metagenomes.txt

Predict COG abundances for a given OTU table.

predict_metagenomes.py -i normalized_otus.biom -t cog -o cog_predicted_metagenomes.biom

Output confidence intervals for each prediction.

predict_metagenomes.py -i normalized_otus.biom -o predicted_metagenomes.biom --with_confidence
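
To locate the three additional files afterwards, the naming rule described for --with_confidence (suffix inserted immediately before the file extension) can be sketched as follows. The helper name confidence_output_paths is hypothetical, and the sketch assumes a single trailing extension such as .biom; how the script itself treats other file names is not confirmed here.

```python
import os


def confidence_output_paths(output_path):
    """Return the expected extra file names for a given output path.

    Inserts each suffix between the base name and the final extension.
    Hypothetical helper; assumes one trailing extension such as ".biom".
    """
    base, ext = os.path.splitext(output_path)
    suffixes = ("variance", "upper_CI", "lower_CI")
    return {suffix: "{0}.{1}{2}".format(base, suffix, ext) for suffix in suffixes}


paths = confidence_output_paths("predicted_metagenomes.biom")
for suffix, path in paths.items():
    print(suffix, path)
```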

Predict metagenomes using a custom trait table in tab-delimited format.

predict_metagenomes.py -i otu_table_for_custom_trait_table.biom -c custom_trait_table.tab -o output_metagenome_from_custom_trait_table.biom

Predict metagenomes, variances, and 95% confidence intervals for each gene category using a custom trait table in tab-delimited format.

predict_metagenomes.py -i otu_table_for_custom_trait_table.biom --input_variance_table custom_trait_table_variances.tab -c custom_trait_table.tab -o output_metagenome_from_custom_trait_table.biom --with_confidence

Change the version of GreenGenes used to pick OTUs.

predict_metagenomes.py -i normalized_otus.biom -g 18may2012 -o predicted_metagenomes.biom