run_genome_evaluations.py – Runs genome evaluations on PICRUSt.

Description:

Using the files created by make_test_datasets.py, this script runs each test dataset through ancestral state reconstruction (ancestral_state_reconstruction.py) and then genome prediction (predict_traits.py).

Usage: run_genome_evaluations.py [options]

Input Arguments:

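[REQUIRED]

-i: Directory containing one or more test datasets created by make_test_datasets.py
-t: The original reference tree used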

[OPTIONAL]

-o, --output_dir: The output directory [default: <input_dir>]
-j, --parallel_method: Method for parallelization. Valid choices are: sge, torque, multithreaded [default: multithreaded]
-m, --prediction_method: Method for trait prediction. See predict_traits.py for full documentation. Valid choices are: asr_and_weighting, nearest_neighbor, random_neighbor [default: asr_and_weighting]
--with_confidence: If set, calculate confidence intervals with ace_ml or ace_reml, and use confidence intervals in trait prediction
--with_accuracy: If set, calculate accuracy using the NSTI (nearest sequenced taxon index) during trait prediction
-a, --asr_method: Method for ancestral_state_reconstruction. See ancestral_state_reconstruction.py for full documentation. Valid choices are: ace_ml, ace_reml, ace_pic, wagner [default: wagner]
-w, --weighting_method: Method for weighting during trait prediction. See predict_traits.py for full documentation. Valid choices are: linear, exponential, equal [default: exponential]
-n, --num_jobs: Number of jobs to be submitted when running in parallel [default: 100]
--tmp-dir: Location to store intermediate files [default: <output_dir>]
--force: Run all jobs even if output files exist [default: False]
--check_for_null_files: Check whether any pre-existing output files are null. If so, remove them and re-run. [default: False]
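
The options above can be combined in one invocation. As a minimal sketch, the Python snippet below assembles such a command line and rejects values that are not among the valid choices listed above. The helper build_command and the CHOICES table are hypothetical and not part of PICRUSt; only the option names and their valid values come from this page.

```python
import shlex

# Valid choices copied from the option list above.
CHOICES = {
    "--parallel_method": {"sge", "torque", "multithreaded"},
    "--prediction_method": {"asr_and_weighting", "nearest_neighbor", "random_neighbor"},
    "--asr_method": {"ace_ml", "ace_reml", "ace_pic", "wagner"},
    "--weighting_method": {"linear", "exponential", "equal"},
}


def build_command(input_dir, ref_tree, output_dir=None, flags=(), **options):
    """Hypothetical helper: return an argv list for run_genome_evaluations.py."""
    argv = ["run_genome_evaluations.py", "-i", input_dir, "-t", ref_tree]
    if output_dir is not None:
        argv += ["-o", output_dir]
    # Long options that take a value, e.g. asr_method="ace_ml".
    for name, value in sorted(options.items()):
        opt = "--" + name
        if opt in CHOICES and value not in CHOICES[opt]:
            raise ValueError(f"{value!r} is not a valid choice for {opt}")
        argv += [opt, str(value)]
    # Boolean switches, e.g. "with_confidence" or "force".
    for flag in flags:
        argv.append("--" + flag)
    return argv


cmd = build_command(
    "test_datasets_dir",
    "reference_tree_fp",
    output_dir="output_dir",
    flags=["with_confidence"],
    asr_method="ace_ml",
    num_jobs=4,
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run once the script is on the PATH. Validating choices before launching is only a convenience here; the script itself reports invalid values as well.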

Output:

Predictions from predict_traits.py for each test dataset.

Minimum Requirements:

Provide a directory that contains one or more datasets created by make_test_datasets.py, along with the original reference tree used:

run_genome_evaluations.py -i test_datasets_dir -t reference_tree_fp

Specify output directory:

run_genome_evaluations.py -i test_datasets_dir -t reference_tree_fp -o output_dir

Force the launching of jobs that already seem done by overwriting existing output files:

run_genome_evaluations.py --force -i test_datasets_dir -t reference_tree_fp -o output_dir
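
How --force and --check_for_null_files interact with existing output can be pictured with a small decision rule. The Python sketch below is an illustration only, not the script's actual code: the function should_run is hypothetical, and it assumes that a "null" file means a zero-byte file, which this page does not state.

```python
import os
import tempfile


def should_run(output_fp, force=False, check_for_null_files=False):
    """Hypothetical rule: decide whether a job must be (re)launched."""
    # With --force, or when no output exists yet, always run the job.
    if force or not os.path.exists(output_fp):
        return True
    # Assumption: a "null" output file is one with zero bytes.
    if check_for_null_files and os.path.getsize(output_fp) == 0:
        os.remove(output_fp)  # discard the null file so the job can redo it
        return True
    # Otherwise the job already seems done and is skipped.
    return False


with tempfile.TemporaryDirectory() as tmp:
    done_fp = os.path.join(tmp, "done.tab")
    with open(done_fp, "w") as handle:
        handle.write("placeholder prediction\n")
    empty_fp = os.path.join(tmp, "empty.tab")
    open(empty_fp, "w").close()

    print(should_run(done_fp))                              # False
    print(should_run(done_fp, force=True))                  # True
    print(should_run(empty_fp))                             # False
    print(should_run(empty_fp, check_for_null_files=True))  # True
```

Under this reading, a finished job is skipped by default, --force reruns it regardless, and --check_for_null_files reruns only those jobs whose existing output is empty.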