NAME
Protocol::HTTP::RequestParser - HTTP request parser
SYNOPSIS
use Protocol::HTTP::RequestParser;
my $parser = Protocol::HTTP::RequestParser->new;
my $buffer =
    "GET / HTTP/1.0\r\n".
    "Host: crazypanda.ru\r\n".
    "Langs: Perl, c++\r\n".
    "\r\n";
my ($req, $state, $pos, $err) = $parser->parse($buffer);
if ($err) {
    die "http error: $err";
}
if ($state < Protocol::HTTP::Message::STATE_DONE) {
    # not finished yet: wait for more data and call parse() again
}
else {
    process($req);
}
DESCRIPTION
This class implements a parser for client HTTP requests. The parser is incremental, so you don't need to pass the whole HTTP message at once.
The parser is implemented as a finite-state machine (FSM), which keeps parsing fast.
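A minimal sketch of incremental parsing, assuming a request that arrives in three pieces. The fragment boundaries and the host name are made up for illustration; in real code the fragments would come from a socket.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# Hypothetical fragments, split at arbitrary points of one request.
my @fragments = (
    "GET /index.html HT",
    "TP/1.1\r\nHost: exam",
    "ple.com\r\n\r\n",
);

my ($req, $state, $pos, $err);
for my $fragment (@fragments) {
    # Each call continues from where the previous call stopped.
    ($req, $state, $pos, $err) = $parser->parse($fragment);
    die "http error: $err" if $err;
    last if $state == Protocol::HTTP::Message::STATE_DONE();
}
```

Only the newly arrived data is passed on each call, because an unfinished parse consumes the whole buffer it was given.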
METHODS
new()
Constructs a new request parser instance.
parse($buffer)
my ($request, $state, $position, $error) = $parser->parse($buffer);
Parses a (possibly partial) HTTP request.
The first value returned is a Protocol::HTTP::Request object. This object is always returned, whether or not parsing of the request has completed. Its properties are partially or fully filled with values, depending on the state of parsing.
The second value returned is the state of parsing. The state may be one of:
- Protocol::HTTP::Message::STATE_HEADERS
  This is the initial state; the parser won't leave it until all headers have arrived. After leaving this state, the properties uri(), method() (or code() and message() when parsing a response), http_version() and headers() are fully populated. The next state may be STATE_BODY, STATE_CHUNK or STATE_DONE, depending on the headers received.
- Protocol::HTTP::Message::STATE_BODY
  The parser wants more data for the message body (for messages without HTTP chunks). During this state the body property gets filled. You don't have to wait until the whole body arrives to process it. It is okay to read whatever is there, process it, clear it and wait for the next data part:
  my $data_part = $message->body;
  # process or write $data_part somewhere
  $message->body(""); # if you don't do this, next time you'll get the previous data part plus the one that has just arrived
- Protocol::HTTP::Message::STATE_CHUNK
  The parser is waiting for a chunk header (for messages with HTTP chunks).
- Protocol::HTTP::Message::STATE_CHUNK_BODY
  The parser wants more data for the chunk body (for messages with HTTP chunks). The parser acts exactly as in the STATE_BODY case, continuously collecting the body property.
- Protocol::HTTP::Message::STATE_CHUNK_TRAILER
  The parser is waiting for the chunk trailer (for messages with HTTP chunks).
- Protocol::HTTP::Message::STATE_DONE
  The parser has finished parsing the current message.
- Protocol::HTTP::Message::STATE_ERROR
  The parser encountered an HTTP protocol error. In this case the message object is still valid and its properties are left as they were at the moment the error occurred, so you can still inspect what the message might look like (for example, if the error was in the headers, uri() would be ok).
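The body-streaming pattern described for STATE_BODY can be sketched as follows. This is an illustration under assumptions: the request, its Content-Length and the fragment split are hypothetical, and $received stands in for whatever sink (file, socket, digest) real code would write to.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# Hypothetical POST whose 10-byte body arrives in two pieces.
my @fragments = (
    "POST /upload HTTP/1.1\r\nHost: example.com\r\nContent-Length: 10\r\n\r\nhello",
    "world",
);

my $received = '';   # stand-in for a real data sink
my ($req, $state, $pos, $err);
for my $fragment (@fragments) {
    ($req, $state, $pos, $err) = $parser->parse($fragment);
    die "http error: $err" if $err;
    next if $state == Protocol::HTTP::Message::STATE_HEADERS();  # no body yet

    $received .= $req->body;  # take the part that has arrived so far
    $req->body("");           # clear it so the next part is not duplicated
}
```

Clearing the body after each read keeps memory use bounded for large uploads, at the cost of the request object no longer holding the full body at the end.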
The next value returned is the position in $buffer at which the parsing process stopped.
In case of an error, the position points to the character that caused the error.
In case of STATE_DONE, the position points to the next character after the end of the message. Everything left after this position should probably be passed to parse() again (HTTP pipelining).
Otherwise (no errors and not yet done), the position is always equal to the length of $buffer.
The last, fourth value is optional and is only returned if there was an error during parsing. It is an XS::STL::ErrorCode object, which is the Perl API for the C++ std::error_code subsystem. Possible errors are described in Protocol::HTTP::Error.
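As an illustration of how the returned position supports HTTP pipelining, here is a sketch that parses two requests from a single buffer. The requests themselves and the @requests array are hypothetical.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# Two hypothetical pipelined requests in one buffer.
my $buffer =
    "GET /first HTTP/1.1\r\nHost: example.com\r\n\r\n" .
    "GET /second HTTP/1.1\r\nHost: example.com\r\n\r\n";

my @requests;
while (length $buffer) {
    my ($req, $state, $pos, $err) = $parser->parse($buffer);
    die "http error at offset $pos: $err" if $err;

    # Drop what the parser consumed; the rest belongs to the next message.
    substr($buffer, 0, $pos, '');

    # Not finished: the whole buffer was consumed, wait for more data.
    last if $state != Protocol::HTTP::Message::STATE_DONE();
    push @requests, $req;
}
```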
parse_shift($buffer)
my ($request, $state, $err) = $parser->parse_shift($buffer);
Parses an HTTP request (same as parse()) and then deletes from $buffer everything that has been consumed during parsing.
The effect is similar to
my ($request, $state, $position, $error) = $parser->parse($buffer);
substr($buffer, 0, $position, '');
and thus $buffer can't be a read-only value, for example
$parser->parse_shift("constant string"); # WRONG! will die with "modification of read-only value ..."
The meaning and behaviour of the returned values are the same as in parse(), except that the position is not returned.
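A sketch of a receive loop built on parse_shift(), assuming data arrives in fragments that do not line up with message boundaries. The fragments and the @requests array are hypothetical; in real code the fragments would be read from a socket.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# Hypothetical fragments: the first one ends in the middle of the second request.
my @fragments = (
    "GET /a HTTP/1.1\r\nHost: example.com\r\n\r\nGET /b HT",
    "TP/1.1\r\nHost: example.com\r\n\r\n",
);

my $buffer = '';
my @requests;
for my $fragment (@fragments) {
    $buffer .= $fragment;
    while (length $buffer) {
        # parse_shift() removes the consumed part of $buffer itself.
        my ($req, $state, $err) = $parser->parse_shift($buffer);
        die "http error: $err" if $err;
        push @requests, $req
            if $state == Protocol::HTTP::Message::STATE_DONE();
    }
}
```

The inner loop always terminates: a finished message removes at least its own bytes from $buffer, and an unfinished one consumes the whole buffer.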
reset()
Resets the internal parser state, so that the parser is ready to parse new requests.
The parser automatically resets itself after each successfully parsed message, so you only need to call this method if you plan to reuse the parser after an error, or if you decide to abandon a message that has not been fully parsed and start parsing another one.
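A sketch of reusing one parser after a failed or abandoned message. The malformed input is a hypothetical example of data that is not a valid request line.

```perl
use strict;
use warnings;
use Protocol::HTTP::RequestParser;

my $parser = Protocol::HTTP::RequestParser->new;

# Hypothetical malformed input: control characters where a method is expected.
my ($req, $state, $pos, $err) = $parser->parse("\x01\x02 not http\r\n\r\n");

# After an error, or when giving up on an unfinished message,
# the parser has to be reset by hand before it is reused.
if ($err || $state != Protocol::HTTP::Message::STATE_DONE()) {
    $parser->reset;
}

# The same parser instance can now parse a fresh request.
($req, $state, $pos, $err) =
    $parser->parse("GET / HTTP/1.1\r\nHost: example.com\r\n\r\n");
die "http error: $err" if $err;
```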
NOTE
Internally (in the C++ API) this is a zero-copy parser. However, since vectorized strings are neither convenient nor efficient to use from Perl, a single copy occurs at the XS->Perl boundary when you get the body as a single string.