NAME

MS::Reader::PepXML - A simple but complete pepXML parser

SYNOPSIS

use MS::Reader::PepXML;

my $search = MS::Reader::PepXML->new('search.pep.xml');

# for single search files

while (my $result = $search->next_result) {
    # $result is an MS::Reader::PepXML::Result object
}

# for multi-search files

my $n = $search->n_lists;

for (0..$n-1) {

    $search->goto_list($_);
    while (my $result = $search->next_result) {
        # $result is an MS::Reader::PepXML::Result object
    }

}

DESCRIPTION

MS::Reader::PepXML is a parser for the pepXML file format for spectral search results. It aims to provide complete access to the data contents while not being overburdened by detailed class infrastructure. Convenience methods are provided for accessing commonly used data. Users who want to extract data not accessible through the available methods should examine the data structure of the parsed object. The dump() method of MS::Reader::XML, from which this class inherits, provides an easy way of doing so.
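
As a minimal sketch of inspecting the parsed structure, the core Data::Dumper module can be applied to a single result object. This assumes only that a result is an ordinary Perl data structure. The file name is a placeholder, and the depth limit is an arbitrary choice to keep the output short.

```perl
use strict;
use warnings;
use Data::Dumper;
use MS::Reader::PepXML;

# 'search.pep.xml' is a placeholder file name
my $search = MS::Reader::PepXML->new('search.pep.xml');

# Limit nesting depth and sort hash keys so the output stays readable
local $Data::Dumper::Maxdepth = 3;
local $Data::Dumper::Sortkeys = 1;

# Dump the first result of the current list, if there is one
if (my $result = $search->next_result) {
    print Dumper($result);
}
```

The printed keys show where a given field lives in the structure, which is useful when no convenience method covers it.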

INHERITANCE

MS::Reader::PepXML is a subclass of MS::Reader::XML, which in turn inherits from MS::Reader, so the methods of both parent classes are available. Please see the documentation for those classes for details of methods not described below.

METHODS

new

my $search = MS::Reader::PepXML->new( $fn,
    use_cache => 0,
    paranoid  => 0,
);

Takes an input filename (required) and optional argument hash and returns an MS::Reader::PepXML object. This constructor is inherited directly from MS::Reader. Available options include:

  • use_cache — cache fetched records in memory for repeat access (default: FALSE)

  • paranoid — when loading index from disk, recalculates MD5 checksum each time to make sure raw file hasn't changed. This adds (typically) a few seconds to load times. By default, only file size and mtime are checked.

next_result

while (my $s = $search->next_result) {
    # do something with $s
}

Returns an MS::Reader::PepXML::Result object representing the next result (pepXML element <spectrum_query>) in the current result list, or undef if the end of the list has been reached. In a multi-list file (i.e. one with multiple <msms_run_summary> elements), call goto_list() for each list and then iterate over that list's records.

fetch_result

my $s = $search->fetch_result($idx);

Takes a single argument (zero-based record index) and returns an MS::Reader::PepXML::Result object representing the record at that index. Throws an exception if the index is out of range.

result_count

my $n = $search->result_count;

Returns the number of result records in the current result list (not the same as the number of results in the file if it contains multiple runs/lists).
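
The following sketch combines result_count() and fetch_result() for random access. It assumes a single-list file, that fetch_result() indices are relative to the current list, and a placeholder file name; none of these is confirmed beyond the method descriptions above.

```perl
use strict;
use warnings;
use MS::Reader::PepXML;

# 'search.pep.xml' is a placeholder file name
my $search = MS::Reader::PepXML->new('search.pep.xml');

my $n = $search->result_count;

# Fetch the last record of the current list, if there is one
# (indices are zero-based, so the last valid index is $n - 1)
my $last = $n > 0 ? $search->fetch_result($n - 1) : undef;

# An out-of-range index is documented to throw, so trap it with eval
my $missing = eval { $search->fetch_result($n) };
if (! defined $missing) {
    warn "no record at index $n: $@";
}
```

Wrapping the call in eval keeps a bad index from terminating a longer-running script.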

n_lists

my $n = $search->n_lists;

Returns the number of result lists (pepXML <msms_run_summary> elements) in the file. If this number is greater than one, individual lists can be iterated over using goto_list() and next_result().

goto_list

$search->goto_list($idx);

Takes a single argument (zero-based list index) and sets the record pointer to the first result from that list.

raw_file

my $path = $search->raw_file($idx);

Takes a single argument (zero-based list index) and returns the raw file path associated with that list. If no index is provided, index 0 is assumed.
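
As an illustration of how the list-level methods fit together, the sketch below prints a one-line summary per result list. The file name is a placeholder, and the fallback to 'unknown' is a defensive assumption in case no raw file path is recorded for a list.

```perl
use strict;
use warnings;
use MS::Reader::PepXML;

# 'search.pep.xml' is a placeholder file name
my $search = MS::Reader::PepXML->new('search.pep.xml');

for my $i (0 .. $search->n_lists - 1) {

    # Make list $i the current list so result_count() refers to it
    $search->goto_list($i);

    my $raw = $search->raw_file($i);
    my $n   = $search->result_count;

    printf "list %d: %s (%d results)\n", $i, $raw // 'unknown', $n;

}
```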

CAVEATS AND BUGS

The API is in alpha stage and is not guaranteed to be stable.

Please report bugs or feature requests through the issue tracker at https://github.com/jvolkening/p5-MS/issues.

SEE ALSO

AUTHOR

Jeremy Volkening <jdv@base2bio.com>

COPYRIGHT AND LICENSE

Copyright 2015-2016 Jeremy Volkening

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.