run_genome_evaluations.py – Runs genome evaluations on PICRUSt.

Description:

Using the files created by make_test_datasets.py, this script runs each test dataset through ancestral state reconstruction (ancestral_state_reconstruction.py) and genome prediction (predict_traits.py).

Usage: run_genome_evaluations.py [options]

Input Arguments:

[REQUIRED]

-i, --input_dir
Directory containing one or more test datasets
-t, --ref_tree
Reference tree that was used with make_test_datasets

[OPTIONAL]

-o, --output_dir
The output directory [default: <input_dir>]
-j, --parallel_method
Method for parallelization. Valid choices are: sge, torque, multithreaded [default: multithreaded]
-m, --prediction_method
Method for trait prediction. See predict_traits.py for full documentation. Valid choices are: asr_and_weighting, nearest_neighbor, random_neighbor [default: asr_and_weighting]
--with_confidence
If set, calculate confidence intervals with ace_ml or ace_reml, and use confidence intervals in trait prediction
--with_accuracy
If set, calculate accuracy using the NSTI (nearest sequenced taxon index) during trait prediction
-a, --asr_method
Method for ancestral_state_reconstruction. See ancestral_state_reconstruction.py for full documentation. Valid choices are: ace_ml, ace_reml, ace_pic, wagner [default: wagner]
-w, --weighting_method
Method for weighting during trait prediction. See predict_traits.py for full documentation. Valid choices are: linear, exponential, equal [default: exponential]
-n, --num_jobs
Number of jobs to submit when running in parallel (see --parallel_method). [default: 100]
--tmp-dir
Location to store intermediate files [default: <output_dir>]
--force
Run all jobs even if output files exist [default: False]
--check_for_null_files
Check whether any pre-existing output files are null files. If so, remove them and re-run the corresponding jobs. [default: False]

Output:

Predictions from predict_traits.py for each test dataset.

Minimum Requirements:

Provide a directory that contains one or more datasets created by make_test_datasets.py, along with the original reference tree that was used:

run_genome_evaluations.py -i test_datasets_dir -t reference_tree_fp

Specify output directory:

run_genome_evaluations.py -i test_datasets_dir -t reference_tree_fp -o output_dir

Force the launching of jobs that already seem done, overwriting existing output files:

run_genome_evaluations.py --force -i test_datasets_dir -t reference_tree_fp -o output_dir
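
Build a command line programmatically:

The options documented under Input Arguments can also be assembled from a script, which makes it easy to catch an invalid choice before any job is submitted. The following is a minimal sketch, not part of PICRUSt: the helper name build_command and the CHOICES table are hypothetical, and the valid values are simply copied from the option list in this document. It only builds and prints the command; it does not run it.

```python
import shlex

# Valid choices copied from the option list in this document.
CHOICES = {
    "--parallel_method": ("sge", "torque", "multithreaded"),
    "--prediction_method": ("asr_and_weighting", "nearest_neighbor", "random_neighbor"),
    "--asr_method": ("ace_ml", "ace_reml", "ace_pic", "wagner"),
    "--weighting_method": ("linear", "exponential", "equal"),
}


def build_command(input_dir, ref_tree, output_dir=None, force=False, **options):
    """Return an argv list for run_genome_evaluations.py (hypothetical helper).

    Keyword options map to long flags, e.g. asr_method="ace_pic" becomes
    "--asr_method ace_pic". Only the flags listed in CHOICES are accepted.
    """
    cmd = ["run_genome_evaluations.py", "-i", input_dir, "-t", ref_tree]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if force:
        cmd.append("--force")
    # Sort the options so the resulting command is deterministic.
    for name, value in sorted(options.items()):
        flag = "--" + name
        if flag not in CHOICES:
            raise ValueError("unsupported option: " + flag)
        if value not in CHOICES[flag]:
            raise ValueError("%s must be one of: %s" % (flag, ", ".join(CHOICES[flag])))
        cmd += [flag, value]
    return cmd


command = build_command(
    "test_datasets_dir",
    "reference_tree_fp",
    output_dir="output_dir",
    asr_method="ace_pic",
    weighting_method="linear",
)
# Print a shell-quoted version of the command for inspection.
print(shlex.join(command))
```

Assuming the PICRUSt scripts are on the PATH, the resulting list could then be passed to subprocess.run(command, check=True) to launch the evaluation.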