make_test_datasets.py – Generates test datasets for cross-validation studies of PICRUSt’s accuracy

Description:

Usage: make_test_datasets.py [options]

Input Arguments:

Note

[REQUIRED]

-i, --input_trait_table
The input trait table.
-t, --input_tree
The input tree in Newick format

[OPTIONAL]

-o, --output_dir
The output directory. Duplicate trees, trait tables, expected values and prediction files will be saved here.[default:./test_datasets/]
--min_dist
The minimum phylogenetic distance to use with the holdout method, if applicable. Usually 0.0.[default:0.0]
--suppress_tree_modification
If passed, modify only the trait table, not the tree . [default: False]
--dist_increment
The phylogenetic distance increment to use with the holdout method, if applicable.[default:0.03]
--max_dist
The maximum phylogenetic distance to use with the holdout method, if applicable.[default:0.45]
--limit_to_tips
If specified, limit test dataset generation to specified tips (comma-separated).[default:]
-m, --method
The test method to use in generating test data. Valid choices are:exclude_tips_by_distance,randomize_tip_labels_by_distance,collapse_tree_by_distance [default: exclude_tips_by_distance]

Output:

Generate holdout test trees from genome_tree.newick, and save results in the directory ./test_holdout_trees/.

make_test_datasets.py -t genome_tree.newick -o ./test_holdout_trees