predict_metagenomes.py – This script produces the actual metagenome functional predictions for a given OTU table.
Description:
Usage: predict_metagenomes.py [options]
Input Arguments:
[REQUIRED]
- -i, --input_otu_table
- The input otu table in biom format
- -o, --output_metagenome_table
- The output file for the predicted metagenome
[OPTIONAL]
- -t, --type_of_prediction
- Type of functional predictions. Valid choices are: ko, cog, rfam [default: ko]
- -g, --gg_version
- Version of GreenGenes that was used for OTU picking. Valid choices are: 13_5, 18may2012 [default: 13_5]
- -c, --input_count_table
- Precalculated function predictions on a per-OTU basis in biom format (can be gzipped). Note: using this option overrides --type_of_prediction and --gg_version. [default: None]
- -a, --accuracy_metrics
- If provided, calculate accuracy metrics for the predicted metagenome. NOTE: requires that per-genome accuracy metrics were calculated using predict_traits.py during genome prediction (e.g. there are “NSTI” values in the genome .biom file metadata)
- --suppress_subset_loading
- Normally, only counts for OTUs present in the sample are loaded. If this flag is passed, the full biom table is loaded. This makes no difference for the analysis, but may result in faster load times (at the cost of more memory usage)
- --load_precalc_file_in_biom
- Instead of loading the precalculated file in tab-delimited format (with OTU ids as row ids and traits as columns), load the data in biom format (with OTUs as SampleIds and traits as ObservationIds) [default: False]
- --input_variance_table
- Precalculated table of variances corresponding to the precalculated table of function predictions. As with the count table, these are on a per-OTU basis and in BIOM format (can be gzipped). Note: using this option overrides --type_of_prediction and --gg_version. [default: None]
- --with_confidence
- Calculate 95% confidence intervals for metagenome predictions. By default, this uses the confidence intervals for the precalculated table of genes for GreenGenes OTUs. If you pass a custom count table with -c and select this option, you must also specify a corresponding table of confidence intervals for the gene content prediction using --input_variance_table (these are generated by running predict_traits.py with the --with_confidence option). If this flag is set, three additional output files will be generated, named the same as the metagenome prediction output, but with .variance, .upper_CI, or .lower_CI appended immediately before the file extension. [default: False]
- -f, --format_tab_delimited
- Output the predicted metagenome table in tab-delimited format [default: False]
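The file naming described for --with_confidence can be illustrated with a short sketch. The helper name `confidence_output_paths` is hypothetical and not part of the script; it only shows what "appended immediately before the file extension" would yield for a simple single-extension file name, and it makes no attempt to handle compound extensions such as `.biom.gz`.

```python
import os


def confidence_output_paths(metagenome_path):
    """Return the three extra file names expected next to the main output.

    Hypothetical helper: it mirrors the naming rule stated in the option
    description, not the script's actual code.
    """
    base, ext = os.path.splitext(metagenome_path)
    # Each suffix is placed immediately before the file extension.
    return [base + suffix + ext
            for suffix in (".variance", ".upper_CI", ".lower_CI")]


for path in confidence_output_paths("predicted_metagenomes.biom"):
    print(path)
```

Under this reading, an output named `predicted_metagenomes.biom` would be accompanied by `predicted_metagenomes.variance.biom`, `predicted_metagenomes.upper_CI.biom`, and `predicted_metagenomes.lower_CI.biom`.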
Output:
Output is a table of function counts (e.g. KEGG KOs) by sample ids.
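To make the shape of the output concrete, the following is a minimal conceptual sketch of how a table of function counts by sample can be derived from an OTU table and a per-OTU table of function predictions: each function count is the sum, over the OTUs in a sample, of OTU abundance times the predicted copies of that function. The function name `predict_metagenome`, the dictionary layout, and the toy numbers are assumptions made for illustration; this is not the script's implementation, which operates on biom tables.

```python
def predict_metagenome(otu_counts, trait_counts):
    """Combine OTU abundances with per-OTU function predictions.

    otu_counts:   {sample_id: {otu_id: abundance}}
    trait_counts: {otu_id: {function_id: predicted_copies}}
    Returns {sample_id: {function_id: predicted_count}}.
    """
    result = {}
    for sample, otus in otu_counts.items():
        totals = {}
        for otu, abundance in otus.items():
            # OTUs without a trait prediction contribute nothing.
            for function, copies in trait_counts.get(otu, {}).items():
                totals[function] = totals.get(function, 0) + abundance * copies
        result[sample] = totals
    return result


# Toy data, invented for this example.
otu_counts = {
    "sample1": {"otu_a": 10, "otu_b": 5},
    "sample2": {"otu_a": 2},
}
trait_counts = {
    "otu_a": {"K00001": 1, "K00002": 2},
    "otu_b": {"K00001": 3},
}

print(predict_metagenome(otu_counts, trait_counts))
```

For `sample1`, K00001 receives 10 × 1 from `otu_a` plus 5 × 3 from `otu_b`, giving 25, and K00002 receives 10 × 2 = 20.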
Predict KO abundances for a given OTU table picked against the default version of GreenGenes (13_5).
predict_metagenomes.py -i normalized_otus.biom -o predicted_metagenomes.biom
Change output format to plain tab-delimited:
predict_metagenomes.py -f -i normalized_otus.biom -o predicted_metagenomes.txt
Predict COG abundances for a given OTU table.
predict_metagenomes.py -i normalized_otus.biom -t cog -o cog_predicted_metagenomes.biom
Output confidence intervals for each prediction.
predict_metagenomes.py -i normalized_otus.biom -o predicted_metagenomes.biom --with_confidence
Predict metagenomes using a custom trait table in tab-delimited format.
predict_metagenomes.py -i otu_table_for_custom_trait_table.biom -c custom_trait_table.tab -o output_metagenome_from_custom_trait_table.biom
Predict metagenomes, variances, and 95% confidence intervals for each gene category using a custom trait table in tab-delimited format.
predict_metagenomes.py -i otu_table_for_custom_trait_table.biom --input_variance_table custom_trait_table_variances.tab -c custom_trait_table.tab -o output_metagenome_from_custom_trait_table.biom --with_confidence
Change the version of GreenGenes used to pick OTUs.
predict_metagenomes.py -i normalized_otus.biom -g 18may2012 -o predicted_metagenomes.biom
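When the commands above need to be generated for many OTU tables, it can help to assemble the argument list programmatically. The sketch below is one possible way to do that; `build_command` is a hypothetical helper, it uses only flags and valid choices listed on this page, and it only builds the argument list without running anything.

```python
VALID_TYPES = ("ko", "cog", "rfam")
VALID_GG_VERSIONS = ("13_5", "18may2012")


def build_command(otu_table, output, prediction_type="ko", gg_version="13_5",
                  tab_delimited=False, with_confidence=False):
    """Build an argument list for predict_metagenomes.py (hypothetical helper)."""
    if prediction_type not in VALID_TYPES:
        raise ValueError("unknown prediction type: %s" % prediction_type)
    if gg_version not in VALID_GG_VERSIONS:
        raise ValueError("unknown GreenGenes version: %s" % gg_version)
    cmd = ["predict_metagenomes.py", "-i", otu_table, "-o", output,
           "-t", prediction_type, "-g", gg_version]
    if tab_delimited:
        cmd.append("-f")
    if with_confidence:
        cmd.append("--with_confidence")
    return cmd


print(" ".join(build_command("normalized_otus.biom",
                             "cog_predicted_metagenomes.biom",
                             prediction_type="cog")))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a system where predict_metagenomes.py is installed and on the PATH.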