NAME
JSON::Parse - Read JSON into a Perl variable
SYNOPSIS
use JSON::Parse 'parse_json';
my $json = '["golden", "fleece"]';
my $perl = parse_json ($json);
# Same effect as $perl = ['golden', 'fleece'];
Convert JSON into Perl.
DESCRIPTION
JSON means "JavaScript Object Notation" and it is specified in "RFC 4627".
JSON::Parse converts JSON into the nearest equivalent Perl. The function "parse_json" takes one argument, a string containing JSON, and returns a Perl reference. The input to parse_json must be a complete JSON structure.
The module differs from Perl's standard JSON module by simplifying the handling of Unicode. If its input is marked as Unicode characters, the strings in its output are also marked as Unicode characters.
JSON::Parse also provides two high speed validation functions, "valid_json" and "assert_valid_json", and a function to read JSON from a file, "json_file_to_perl".
FUNCTIONS
parse_json
use JSON::Parse 'parse_json';
my $perl = parse_json ('{"x":1, "y":2}');
This function converts JSON into a Perl structure, either an array reference or a hash reference.
If the first argument does not contain a complete valid JSON text, parse_json throws a fatal error ("dies"). If the first argument is the undefined value, an empty string, or a string containing only whitespace, parse_json returns the undefined value.
If the argument contains valid JSON, the return value is either a hash or an array reference. If the input JSON text is a serialized object, a hash reference is returned:
my $perl = parse_json ('{"a":1, "b":2}');
print ref $perl, "\n";
# Prints "HASH".
If the input JSON text is a serialized array, an array reference is returned:
my $perl = parse_json ('["a", "b", "c"]');
print ref $perl, "\n";
# Prints "ARRAY".
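The three outcomes described above (a fatal error, an undefined return value, or a reference) can be told apart with a small wrapper. The following is only a sketch; "describe_json" is a hypothetical helper name and is not part of JSON::Parse.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# Hypothetical helper, not part of JSON::Parse: report what
# parse_json did with its input.
sub describe_json
{
    my ($json) = @_;
    my $perl;
    # The trailing 1 makes the eval return true only if parse_json
    # did not die.
    my $ok = eval { $perl = parse_json ($json); 1 };
    return 'invalid' unless $ok;
    return 'empty' unless defined $perl;
    return ref $perl;
}

print describe_json ('{"a":1}'), "\n";
# Prints "HASH".
print describe_json ('["a"]'), "\n";
# Prints "ARRAY".
print describe_json ('["a"'), "\n";
# Prints "invalid", since the closing bracket is missing.
```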
json_file_to_perl
use JSON::Parse 'json_file_to_perl';
my $p = json_file_to_perl ('filename');
This is exactly the same as "parse_json" except that it reads the JSON from the specified file rather than a scalar. The file must be in the UTF-8 encoding, and is opened as a character file using the ":encoding(utf8)" I/O layer. The output is marked as character strings.
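As a rough illustration of the behaviour just described, the same effect might be obtained by reading the file by hand and passing its contents to "parse_json". This is a sketch under assumptions, not the module's actual implementation; "read_json_file" is a hypothetical name.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# Hypothetical helper approximating the description above; the real
# json_file_to_perl may differ in its details.
sub read_json_file
{
    my ($file) = @_;
    # Open as a character file, decoding the bytes from UTF-8.
    open my $in, '<:encoding(utf8)', $file
        or die "Cannot open '$file': $!";
    # Read the whole file into one scalar.
    my $json = do { local $/; <$in> };
    close $in or die "Cannot close '$file': $!";
    return parse_json ($json);
}
```

It would be called in the same way as "json_file_to_perl", for example my $p = read_json_file ('filename');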
valid_json
use JSON::Parse 'valid_json';
if (valid_json ($json)) {
# do something
}
valid_json returns 1 if its argument is valid JSON and 0 if not. It also returns 0 if the input is undefined or the empty string.
This is a high-speed validator which runs between roughly two and eight times faster than "parse_json".
valid_json does not supply the actual errors which caused invalidity. Use "assert_valid_json" to get error messages when the JSON is invalid.
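One way to combine the two validators is to use "valid_json" as the quick check and to call "assert_valid_json" only when a message is wanted. The sketch below assumes nothing beyond the behaviour documented here; "json_error" is a hypothetical helper name.

```perl
use strict;
use warnings;
use JSON::Parse qw(valid_json assert_valid_json);

# Hypothetical helper: returns an empty string for valid JSON,
# otherwise the error message thrown by assert_valid_json.
sub json_error
{
    my ($json) = @_;
    # Fast path: nothing to report.
    return '' if valid_json ($json);
    # Slow path: run the validation again to capture the message.
    my $ok = eval { assert_valid_json ($json); 1 };
    return $ok ? '' : "$@";
}

my $error = json_error ('["xyz"');
print "Your JSON was invalid: $error\n" if $error;
```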
assert_valid_json
use JSON::Parse 'assert_valid_json';
eval {
assert_valid_json ('["xyz"]');
};
if ($@) {
print "Your JSON was invalid: $@\n";
}
This is the underlying function for "valid_json". It runs at the same high speed, but throws an error if the JSON is wrong, rather than returning 1 or 0. See "DIAGNOSTICS" for the error format, which is identical to that of "parse_json".
If you send it uninitialized input it will print a warning message:
use warnings;
my $undef = undef;
assert_valid_json ($undef);
# "Use of uninitialized value in subroutine entry"
If you call it with no arguments you will get a compile-time error:
assert_valid_json ();
# "Not enough arguments for JSON::Parse::assert_valid_json"
OLD INTERFACE
The following alternative function names are accepted. These are the names used for the functions in old versions of this module. These names are not deprecated and will never be removed from the module.
json_to_perl
This is exactly the same function as "parse_json".
validate_json
This is exactly the same function as "assert_valid_json".
Mapping from JSON to Perl
JSON elements are mapped to Perl as follows:
JSON numbers
JSON numbers become Perl numbers, either integers or double-precision floating point numbers, or possibly strings containing the number if parsing of a number by the usual methods fails somehow.
JSON does not allow leading zeros, or leading plus signs, so numbers like +100 or 0123 will cause an error.
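The rule about leading zeros and plus signs can be checked with "valid_json". In this short sketch the first two inputs should be reported as valid and the last two as invalid, going by the description above.

```perl
use strict;
use warnings;
use JSON::Parse 'valid_json';

# Two well-formed numbers, then a leading plus sign and a leading zero.
for my $json ('[100]', '[-1.5e3]', '[+100]', '[0123]') {
    my $status = valid_json ($json) ? 'valid' : 'invalid';
    printf "%-10s %s\n", $json, $status;
}
```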
JSON strings
JSON strings become Perl strings. The JSON escape characters such as \t for the tab character (see section 2.5 of "RFC 4627") are mapped to the equivalent ASCII character.
Handling of Unicode
If the input to "parse_json" is marked as Unicode characters, the output strings will be marked as Unicode characters. If the input is not marked as Unicode characters, the output strings will not be marked as Unicode characters. Thus,
# The scalar $sasori looks like Unicode to Perl
use utf8;
my $sasori = '["蠍"]';
my $p = parse_json ($sasori);
print utf8::is_utf8 ($p->[0]);
# Prints 1.
but
# The scalar $ebi does not look like Unicode to Perl
no utf8;
my $ebi = '["海老"]';
my $p = parse_json ($ebi);
print utf8::is_utf8 ($p->[0]);
# Prints nothing.
Escapes of the form \uXXXX (see page three of "RFC 4627") are mapped to ASCII if XXXX is less than 0x80, or to UTF-8 if XXXX is greater than or equal to 0x80.
Strings containing \uXXXX escapes greater than 0x80 are also upgraded to character strings, regardless of whether the input is a character string or a byte string. Thus, regardless of whether Perl thinks the input string is Unicode, escapes like \u87f9 are converted into the equivalent UTF-8 bytes, and the particular string in which they occur is marked as a character string:
no utf8;
# 蟹
my $kani = '["\u87f9"]';
my $p = parse_json ($kani);
print utf8::is_utf8 ($p->[0]), "\n";
# Prints 1, because it's upgraded regardless of the input string's
# flags.
This is modelled on the behaviour of Perl's chr:
no utf8;
my $kani = '87f9';
print utf8::is_utf8 ($kani), "\n";
# prints a blank line
$kani = chr (hex ($kani));
print utf8::is_utf8 ($kani), "\n";
# prints 1
Surrogate pairs in the form \uD834\uDD1E are also handled.
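For readers unfamiliar with surrogate pairs, the standard UTF-16 arithmetic which turns a pair into a single code point is sketched below in plain Perl. This illustrates the general rule only and makes no claim about how JSON::Parse implements it internally; "surrogates_to_code_point" is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: combine a UTF-16 surrogate pair into one
# Unicode code point.
sub surrogates_to_code_point
{
    my ($high, $low) = @_;
    die "Not a surrogate pair"
        unless $high >= 0xD800 && $high <= 0xDBFF
            && $low >= 0xDC00 && $low <= 0xDFFF;
    # Each half carries ten bits of the value above 0x10000.
    return 0x10000 + (($high - 0xD800) << 10) + ($low - 0xDC00);
}

printf "U+%X\n", surrogates_to_code_point (0xD834, 0xDD1E);
# Prints "U+1D11E".
```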
JSON arrays
JSON arrays become Perl array references. The elements of the Perl array are in the same order as they appear in the JSON.
Thus
my $p = parse_json ('["monday", "tuesday", "wednesday"]');
has the same result as a Perl declaration of the form
my $p = [ 'monday', 'tuesday', 'wednesday' ];
JSON objects
JSON objects become Perl hash references. The members of the JSON object become key and value pairs in the Perl hash. The string part of each object member becomes the key of the Perl hash. The value part of each member is mapped to the value of the Perl hash.
Thus
my $j = <<EOF;
{"monday":["blue", "black"],
"tuesday":["grey", "heart attack"],
"friday":"Gotta get down on Friday"}
EOF
my $p = parse_json ($j);
has the same result as a Perl declaration of the form
my $p = {
monday => ['blue', 'black'],
tuesday => ['grey', 'heart attack'],
friday => 'Gotta get down on Friday',
};
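Since objects and arrays both come back as references, nested values are reached with the usual Perl dereferencing syntax. A short sketch, using a cut-down version of the JSON above:

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

my $p = parse_json ('{"monday":["blue", "black"], "friday":"Gotta get down on Friday"}');
# A member whose value is a string.
print $p->{friday}, "\n";
# Prints "Gotta get down on Friday".
# A member whose value is an array; take its second element.
print $p->{monday}[1], "\n";
# Prints "black".
# The keys of the hash are the member names of the JSON object.
for my $day (sort keys %$p) {
    print "$day\n";
}
```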
null
The JSON null literal is mapped to a readonly scalar $JSON::Parse::null containing the undefined value.
true
The JSON true literal is mapped to a readonly scalar $JSON::Parse::true containing the value 1.
false
The JSON false literal is mapped to a readonly scalar $JSON::Parse::false containing the value 0.
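Given the mappings above, the three literals can be distinguished in the parsed output with "defined" and an ordinary truth test. A brief sketch:

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

my $p = parse_json ('[null, true, false]');
for my $value (@$p) {
    if (! defined $value) {
        # null is the undefined value.
        print "null\n";
    }
    elsif ($value) {
        print "true\n";
    }
    else {
        print "false\n";
    }
}
```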
RESTRICTIONS
This module imposes the following restrictions on its input.
- JSON only
JSON::Parse is a strict parser. It only accepts input which exactly meets the criteria of "RFC 4627". That means, for example, JSON::Parse does not accept single quotes (') instead of double quotes ("), or numbers with leading zeros, like 0123. JSON::Parse does not accept control characters (0x00 - 0x1F) in strings, missing commas between array or hash elements like ["a" "b"], or trailing commas like ["a","b","c",]. It also does not accept trailing non-whitespace, like the second "]" in ["a"]].
- No incremental parsing
JSON::Parse does not do incremental parsing. JSON::Parse only parses fully-formed JSON strings which include all opening and closing brackets.
- UTF-8 only
Although JSON may come in various encodings of Unicode, JSON::Parse only parses the UTF-8 format. If input is in a different Unicode encoding than UTF-8, convert the input before handing it to this module. For example, for the UTF-16 format,
use Encode 'decode';
my $input_utf8 = decode ('UTF-16', $input);
my $perl = parse_json ($input_utf8);
or, for a file,
open my $input, "<:encoding(UTF-16)", 'some-json-file';
JSON::Parse does not determine the nature of the octet stream, as described in part 3 of "RFC 4627".
This restriction to UTF-8 applies regardless of whether Perl thinks that the input string is a character string or a byte string. Non-UTF-8 input will cause a fatal error.
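The conversion step on its own can be tried out with the core Encode module. In this sketch the UTF-16 input is manufactured with "encode" purely for the sake of a self-contained example; in practice the bytes would come from a file or a network connection.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $text = '["golden", "fleece"]';
# Simulated UTF-16 input (Encode prepends a byte order mark).
my $utf16_bytes = encode ('UTF-16', $text);
# Decode the UTF-16 bytes into Perl characters.
my $characters = decode ('UTF-16', $utf16_bytes);
# If a byte string is wanted instead, re-encode the characters as UTF-8.
my $utf8_bytes = encode ('UTF-8', $characters);
print "$utf8_bytes\n";
# Prints ["golden", "fleece"]
```

The decoded string can then be handed to "parse_json" as in the UTF-16 example above.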
DIAGNOSTICS
"valid_json" does not produce error messages. "parse_json" and "assert_valid_json" die on encountering invalid input.
Error messages have the line number and the byte number of the input which caused the problem. The line number is formed simply by counting the number of "\n" (linefeed, ASCII 0x0A) characters in the whitespace part of the JSON. If you find the error message unclear, please report that as a bug.
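To make the line-counting rule concrete, here is a sketch in plain Perl which derives a line number from a byte offset by counting linefeeds. It is not taken from the module; "line_of_offset" is a hypothetical helper, and numbering the first line as 1 and the first byte as offset 0 are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: line number of a byte offset, assuming lines
# are counted from 1 and offsets from 0.
sub line_of_offset
{
    my ($json, $offset) = @_;
    my $before = substr ($json, 0, $offset);
    # tr with an empty replacement list counts without modifying.
    my $linefeeds = ($before =~ tr/\n//);
    return $linefeeds + 1;
}

# A comma is missing between "b" and "c" on the third line.
my $json = "[\n\"a\",\n\"b\" \"c\"\n]";
print line_of_offset ($json, index ($json, '"c"')), "\n";
# Prints 3.
```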
Parsing errors are fatal, so to continue after an error occurs, put the parsing into an eval block:
my $p;
eval {
$p = parse_json ($j);
};
if ($@) {
# handle error
}
At the moment the exact content of the diagnostics is not documented so please review the source code if you need more details on potential outputs.
Alternatively, the likely syntax of the various error messages is also visible in the file t/valid-json.t.
SPEED
On the author's computer, the module's parsing speed is approximately the same as, or slightly faster than, that of JSON::XS, with small variations depending on the type of input. Some special types of input, such as floating point numbers containing an exponential part, like "1e09", are about two or three times faster to parse with this module than with JSON::XS. This is because in JSON::Parse, parsing of exponentials is done by the system's strtod function, but JSON::XS contains its own parser for exponentials.
For validation, "valid_json" is faster than any other module known to the author.
There is some benchmarking code in the GitHub repository under the directory "benchmarks" for those wishing to test these claims. The script benchmarks/bench is an adaptation of the similar script in the JSON::XS distribution.
Here is an example with "benchmarks/long.json", originally downloaded from http://dist.schmorp.de/misc/json/long.json:
Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     |   9691.538 |  0.0051591 |
JSON::Parse   |   3313.665 |  0.0150890 |
JSON::XS      |   3318.751 |  0.0150659 |
--------------+------------+------------+
Here JP::valid is the running time of JSON::Parse's "valid_json". A higher number in the second column, or a smaller number in the third column, is faster.
Here is an example with "benchmarks/words-array.json":
Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     | 169535.327 |  0.0002949 |
JSON::Parse   |  22533.061 |  0.0022190 |
JSON::XS      |  21185.493 |  0.0023601 |
--------------+------------+------------+
Here is an example with "benchmarks/exp.json", containing floating point numbers:
Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     |  83352.623 |  0.0005999 |
JSON::Parse   |  30826.870 |  0.0016220 |
JSON::XS      |  13545.743 |  0.0036912 |
--------------+------------+------------+
Here is an example with "benchmarks/literals.json", containing JSON literals:
Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     | 182519.756 |  0.0002739 |
JSON::Parse   |  31564.600 |  0.0015841 |
JSON::XS      |  17939.709 |  0.0027871 |
--------------+------------+------------+
Here is an example with "benchmarks/cpantesters.json", a 250K file:
Repetitions: 5 x 10 = 50
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     |    876.736 |  0.0057030 |
JSON::Parse   |    140.631 |  0.0355539 |
JSON::XS      |    131.866 |  0.0379171 |
--------------+------------+------------+
SEE ALSO
- RFC 4627
JSON is specified in http://www.ietf.org/rfc/rfc4627.txt.
- JSON, JSON::XS
These modules allow both reading and writing of JSON.
TEST RESULTS
The ActiveState test results are at http://code.activestate.com/ppm/JSON-Parse/.
EXPORTS
The module exports nothing by default. All of the functions, "parse_json", "json_file_to_perl", "valid_json" and "assert_valid_json", as well as the old function names "validate_json" and "json_to_perl", can be exported on request.
All of the functions can be exported using the tag ':all':
use JSON::Parse ':all';
SUPPORT
There is a mailing list at <json-parse@googlegroups.com> for announcements and discussions about the module. You can read it on the web at https://groups.google.com/forum/#!forum/json-parse. Membership is open to the public.
AUTHOR
Ben Bullock, <bkb@cpan.org>
LICENSE
JSON::Parse can be used, copied, modified and redistributed under the same terms as Perl itself.