NAME
ompa-pa.pl - Extract seqs from BLAST/HMMER interactively or in batch mode
VERSION
version 0.260260
USAGE
ompa-pa.pl <infiles> --database=<file> [optional arguments]
REQUIRED ARGUMENTS
OPTIONAL ARGUMENTS
- --report-type=<str>
-
Type of the reports used as infiles [default: blastxml]. Currently, the following types are available:
- blastxml (XML BLAST reports generated with -outfmt 5)
- hmmertbl (tabular HMMER reports generated with -domtblout)
- --database=<file>
-
Path to the sequence database used to generate the reports. For efficiency, this argument must always be the basename of a BLAST database, even when the reports were obtained using hmmsearch on a FASTA file. To build such a database, use one of the following commands:
$ makeblastdb -in database.fasta -out database -dbtype prot -parse_seqids
$ makeblastdb -in database.fasta -out database -dbtype nucl -parse_seqids
This argument is required when the option --extract-seqs is enabled.
- --colorize=<scheme>
-
When specified, sequence points are colored according to their taxon using the specified CLS file. This requires enabling taxonomic annotation and thus a local mirror of the NCBI Taxonomy database (see option --taxdir).
- --taxdir=<dir>
-
Path to local mirror of the NCBI Taxonomy database.
To build such a directory, use the following command:
$ setup-taxdir.pl --taxdir=taxdir
- --max-hits=<n>
-
Maximum number of hits to read from the report [default: 200000]. This limit is implemented for efficiency. It applies before any other filter.
- --min-cov=<n>
-
Minimum BLAST query or HMMER model coverage for selected hits [default: 0.7].
- --max-copy=<n>
-
Maximum gene copy number per organism for selected hits [default: 3].
- --extract-seqs
-
Sequence extraction switch [default: no]. When specified, selected sequences are stored into a FASTA file using the same basename as other output files. This requires a BLAST database (see option --database above).
- --extract-tax
-
Taxonomy extraction switch [default: no]. When specified, NCBI taxons of selected sequences are stored into a file using the same basename as other output files. This requires a local mirror of the NCBI Taxonomy database.
- --print-plots
-
When specified, plots are printed in PDF format [default: no].
- --gnuplot-term=<str>
-
gnuplot terminal to use for the interactive mode [default: x11]. Other possible choices include qt, but the option is open to experiment. On macOS, to avoid the font warning, use --gnuplot-term='qt font "Arial"'. If needed, the gnuplot executable can be specified through the environment variable OUM_GNUPLOT_EXEC.
- --restore-params-from=<file>
-
Batch-mode switch for selecting parameters [default: no]. When specified, parameters are restored from the user-specified JSON file. This option takes precedence over any option specified on the command line, such as --max-hits, --min-cov and --max-copy.
- --restore-last-params
-
Batch-mode switch for selecting parameters [default: no]. When specified, parameters are restored from the last saved JSON file for each report. This option takes precedence over all other command-line options.
- --skip-config=<file>
-
Path to an optional configuration file specifying the reports to skip based on their raw taxonomic content [default: none]. The assessment is made before any filtering other than --max-hits. The configuration file follows the classifier format (often YAML) of classify-ali.pl. This requires enabling taxonomic annotation and thus a local mirror of the NCBI Taxonomy database.
- --batch-classify
-
Batch-mode switch for skipping reports [default: no]. This option is meant to complement the option --skip-config. When specified, reports are sorted into subdirectories named after the categories of the config file, in a way similar to what classify-ali.pl does. No GUI is presented to the user.
- --version
- --usage
- --help
- --man
-
Print the usual program information.
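EXAMPLE
The options documented above can be combined as in the following minimal sketch. It is a dry run that only lists the arguments ompa-pa.pl would receive, so the shell quoting of --gnuplot-term can be checked before anything is executed. The report names (report1.tbl, report2.tbl), the database basename (database) and the helper show_args are hypothetical placeholders; only the flags themselves come from this documentation.

```shell
#!/bin/sh
# Hypothetical database basename; adjust to your own BLAST database.
db="database"

# Hypothetical helper: print each argument it receives on its own line.
show_args() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

# Dry run: list the arguments that ompa-pa.pl would receive.
# The single quotes keep the gnuplot terminal, with its embedded double
# quotes, as one single argument.
args=$(show_args ompa-pa.pl report1.tbl report2.tbl \
    --report-type=hmmertbl \
    --database="$db" \
    --extract-seqs \
    --gnuplot-term='qt font "Arial"')
printf '%s\n' "$args"

# To run for real (assumes makeblastdb and ompa-pa.pl are installed and on PATH):
#   makeblastdb -in database.fasta -out "$db" -dbtype prot -parse_seqids
#   ompa-pa.pl report1.tbl report2.tbl --report-type=hmmertbl \
#       --database="$db" --extract-seqs --gnuplot-term='qt font "Arial"'
```

Each argument is printed on its own line. The last line should read --gnuplot-term=qt font "Arial", which shows that the terminal specification reaches the program as a single argument.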
AUTHOR
Denis BAURAIN <denis.baurain@uliege.be>
CONTRIBUTOR
Amandine BERTRAND <amandine.bertrand@doct.uliege.be>
COPYRIGHT AND LICENSE
This software is copyright (c) 2013 by University of Liege / Unit of Eukaryotic Phylogenomics / Denis BAURAIN.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.