NAME

JSON::Parse - Convert JSON into a Perl variable

SYNOPSIS

use JSON::Parse 'parse_json';
my $json = '["golden", "fleece"]';
my $perl = parse_json ($json);
# Same effect as $perl = ['golden', 'fleece'];

Convert JSON into Perl.

DESCRIPTION

JSON means "JavaScript Object Notation" and it is specified in "RFC 4627".

JSON::Parse converts JSON into the nearest equivalent Perl. The function "parse_json" takes one argument, a string containing JSON, and returns a Perl reference. The input to parse_json must be a complete JSON structure.

The module differs from the JSON module by simplifying the handling of Unicode. If its input is marked as Unicode characters, the strings in its output are also marked as Unicode characters.

JSON::Parse also provides two high speed validation functions, "valid_json" and "assert_valid_json", and a function to read JSON from a file, "json_file_to_perl".

FUNCTIONS

parse_json

use JSON::Parse 'parse_json';
my $perl = parse_json ('{"x":1, "y":2}');

This function converts JSON into a Perl structure, either an array reference or a hash reference.

If the first argument does not contain a complete valid JSON text, parse_json throws a fatal error ("dies"). If the first argument is the undefined value or an empty string or a string containing only whitespace, parse_json returns the undefined value.

If the argument contains valid JSON, the type of the returned reference depends on the top-level JSON value. If the input JSON text is a serialized object, a hash reference is returned:

my $perl = parse_json ('{"a":1, "b":2}');
print ref $perl, "\n";
# Prints "HASH".

If the input JSON text is a serialized array, an array reference is returned:

my $perl = parse_json ('["a", "b", "c"]');
print ref $perl, "\n";
# Prints "ARRAY".

json_file_to_perl

use JSON::Parse 'json_file_to_perl';
my $p = json_file_to_perl ('filename');

This is exactly the same as "parse_json" except that it reads the JSON from the specified file rather than a scalar. The file must be in UTF-8.
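As a minimal sketch of reading JSON from a file, the following writes a small UTF-8 file and reads it back. The temporary file and its contents are made up for illustration, and File::Temp is used only to keep the example self-contained.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';
use JSON::Parse 'json_file_to_perl';

# Write a small UTF-8 JSON file to a temporary location (illustrative data).
my ($fh, $filename) = tempfile (UNLINK => 1);
binmode $fh, ':encoding(UTF-8)';
print {$fh} '{"fruit":["apple","pear"],"count":2}';
close $fh or die "Cannot close $filename: $!";

# Read the file back as a Perl structure.
my $data = json_file_to_perl ($filename);
print "$data->{fruit}[1]\n";
# Prints "pear".
```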

valid_json

use JSON::Parse 'valid_json';
if (valid_json ($json)) {
    # do something
}

valid_json returns 1 if its argument is valid JSON and 0 if not. It also returns 0 if the input is undefined or the empty string.

This is a high-speed validator which runs between three and ten times faster than "parse_json".

valid_json does not supply the actual errors which caused invalidity. Use "assert_valid_json" to get error messages when the JSON is invalid.
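Because "valid_json" returns a plain true or false value, it can be used directly as a filter. A minimal sketch, with made-up candidate strings:

```perl
use strict;
use warnings;
use JSON::Parse 'valid_json';

# Illustrative inputs: two valid JSON texts and two invalid ones.
my @candidates = (
    '["a", "b"]',
    '{"x": 1}',
    '["a" "b"]',   # missing comma between elements
    '',            # empty string
);

# Keep only the strings which are valid JSON.
my @valid = grep { valid_json ($_) } @candidates;
print scalar @valid, " of ", scalar @candidates, " inputs are valid JSON\n";
# Prints "2 of 4 inputs are valid JSON".
```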

assert_valid_json

use JSON::Parse 'assert_valid_json';
eval {
    assert_valid_json ('["xyz"]');
};
if ($@) {
    print "Your JSON was invalid: $@\n";
}

This is the underlying function for "valid_json". It runs at the same high speed, but throws an error if the JSON is wrong, rather than returning 1 or 0. See "DIAGNOSTICS" for the error format, which is identical to that of "parse_json".

If you send it uninitialized input it will print a warning message:

use warnings;
my $undef = undef;
assert_valid_json ($undef);
# "Use of uninitialized value in subroutine entry"

If you call it with no arguments you will get a compile-time error:

assert_valid_json ();
# "Not enough arguments for JSON::Parse::assert_valid_json"

OLD INTERFACE

The following alternative function names are accepted. These are the names used for the functions in old versions of this module. These names are not deprecated and will never be removed from the module.

json_to_perl

This is exactly the same function as "parse_json".

validate_json

This is exactly the same function as "assert_valid_json".

Mapping from JSON to Perl

JSON elements are mapped to Perl as follows:

JSON numbers

JSON numbers become Perl numbers, either integers or double-precision floating point numbers. If a number cannot be converted by the usual methods, it may instead be returned as a string containing the number.

JSON does not allow leading zeros or leading plus signs, so numbers like +100 or 0123 cause an error.
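The following sketch shows parsed numbers being used in ordinary Perl arithmetic, and checks the two rejected forms with "valid_json". The input values are made up for illustration.

```perl
use strict;
use warnings;
use JSON::Parse qw(parse_json valid_json);

# Integers, decimals and exponentials all become Perl numbers.
my $numbers = parse_json ('[100, -2.5, 1e3]');
print $numbers->[0] + 1, "\n";
# Prints 101.
print $numbers->[1] * 2, "\n";
# Prints -5.
print "exponential equals 1000\n" if $numbers->[2] == 1000;

# A leading plus sign or a leading zero makes the JSON invalid.
for my $bad ('[+100]', '[0123]') {
    print "$bad is rejected\n" unless valid_json ($bad);
}
```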

JSON strings

JSON strings become Perl strings. The JSON escape characters such as \t for the tab character (see section 2.5 of "RFC 4627") are mapped to the equivalent ASCII character.

Handling of Unicode

If the input to "parse_json" is marked as Unicode characters, the output strings will be marked as Unicode characters. If the input is not marked as Unicode characters, the output strings will not be marked as Unicode characters. Thus,

# The scalar $sasori looks like Unicode to Perl
use utf8;
my $sasori = '["蠍"]';
my $p = parse_json ($sasori);
print utf8::is_utf8 ($p->[0]);
# Prints 1.

but

# The scalar $ebi does not look like Unicode to Perl
no utf8;
my $ebi = '["海老"]';
my $p = parse_json ($ebi);
print utf8::is_utf8 ($p->[0]);
# Prints nothing.

Escapes of the form \uXXXX (see page three of "RFC 4627") are mapped to ASCII if XXXX is less than 0x80, or to UTF-8 if XXXX is greater than or equal to 0x80.

Strings containing \uXXXX escapes of 0x80 or greater are also upgraded to character strings, regardless of whether the input is a character string or a byte string. Thus, regardless of whether Perl thinks the input string is Unicode, escapes like \u87f9 are converted into the equivalent UTF-8 bytes, and the particular string in which they occur is marked as a character string:

no utf8;
# 蟹
my $kani = '["\u87f9"]';
my $p = parse_json ($kani);
print utf8::is_utf8 ($p->[0]);
# Prints 1, because it's upgraded regardless of the input string's
# flags.

This is modelled on the behaviour of Perl's chr:

no utf8;
my $kani = '87f9';
print utf8::is_utf8 ($kani), "\n";
# prints a blank line
$kani = chr (hex ($kani));
print utf8::is_utf8 ($kani), "\n";
# prints 1

Surrogate pairs in the form \uD834\uDD1E are also handled.
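As a small sketch of surrogate pair handling, the pair \uD834\uDD1E encodes the single code point U+1D11E (a musical G clef), so the resulting Perl string is expected to contain one character. Only numbers are printed here, to avoid "Wide character" warnings on output.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# Single quotes keep the backslashes literal, so the parser sees the escapes.
my $p = parse_json ('["\uD834\uDD1E"]');
my $clef = $p->[0];

# Report the length in characters and the code point of the first character.
printf "length %d, code point U+%X\n", length ($clef), ord ($clef);
```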

JSON arrays

JSON arrays become Perl array references. The elements of the Perl array are in the same order as they appear in the JSON.

Thus

my $p = parse_json ('["monday", "tuesday", "wednesday"]');

has the same result as a Perl declaration of the form

my $p = [ 'monday', 'tuesday', 'wednesday' ];

JSON objects

JSON objects become Perl hashes. The members of the JSON object become key and value pairs in the Perl hash. The string part of each object member becomes the key of the Perl hash. The value part of each member is mapped to the value of the Perl hash.

Thus

my $j = <<EOF;
{"monday":["blue", "black"],
 "tuesday":["grey", "heart attack"],
 "friday":"Gotta get down on Friday"}
EOF

my $p = parse_json ($j);

has the same result as a Perl declaration of the form

my $p = {
    monday => ['blue', 'black'],
    tuesday => ['grey', 'heart attack'],
    friday => 'Gotta get down on Friday',
};

null

The JSON null literal is mapped to a scalar $JSON::Parse::null containing the undefined value.

true

The JSON true literal is mapped to a scalar $JSON::Parse::true containing the value 1.

false

The JSON false literal is mapped to a scalar $JSON::Parse::false containing the value 0.
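The three literals can be tested with ordinary Perl truth and definedness checks. A minimal sketch, using made-up key names:

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# Hypothetical record using all three JSON literals.
my $p = parse_json ('{"deleted":null, "active":true, "admin":false}');

print "deleted is undefined\n" unless defined $p->{deleted};
print "active is true\n" if $p->{active};
print "admin is false\n" unless $p->{admin};
# Prints all three lines.
```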

RESTRICTIONS

This module imposes the following restrictions on its input.

JSON only

JSON::Parse is a strict parser. It only accepts input which exactly meets the criteria of "RFC 4627". That means, for example, JSON::Parse does not accept single quotes (') instead of double quotes ("), or numbers with leading zeros, like 0123. JSON::Parse does not accept control characters (0x00 - 0x1F) in strings, missing commas between array or hash elements like ["a" "b"], or trailing commas like ["a","b","c",]. It also does not accept trailing non-whitespace, like the second "]" in ["a"]].
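The rejected forms listed above can be checked with "valid_json". This is a minimal sketch; the labels in the hash are only for display.

```perl
use strict;
use warnings;
use JSON::Parse 'valid_json';

# Each value is one of the malformed inputs described above.
my %rejected = (
    'single quotes'  => q(['a']),
    'leading zero'   => '[0123]',
    'missing comma'  => '["a" "b"]',
    'trailing comma' => '["a","b","c",]',
    'trailing text'  => '["a"]]',
);

for my $reason (sort keys %rejected) {
    my $verdict = valid_json ($rejected{$reason}) ? 'accepted' : 'rejected';
    print "$reason: $verdict\n";
}
# Every line ends in "rejected".
```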

No incremental parsing

JSON::Parse does not do incremental parsing. JSON::Parse only parses fully-formed JSON strings which include all opening and closing brackets.

UTF-8 only

Although JSON may come in various encodings of Unicode, JSON::Parse only parses the UTF-8 format. If input is in a different Unicode encoding than UTF-8, convert the input before handing it to this module. For example, for the UTF-16 format,

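use Encode 'decode';
use JSON::Parse 'parse_json';
# decode returns a Perl character string, which parse_json accepts.
my $input_chars = decode ('UTF-16', $input);
my $perl = parse_json ($input_chars);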

or, for a file,

open my $input, "<:encoding(UTF-16)", 'some-json-file' or die "Cannot open some-json-file: $!";
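A fuller sketch of the file case follows. It first writes a UTF-16 file so that the example is self-contained, then reads the whole file through the encoding layer and parses the resulting character string. The temporary file and its contents are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';
use JSON::Parse 'parse_json';

# Create an illustrative UTF-16 file containing a non-ASCII character.
my ($out, $filename) = tempfile (UNLINK => 1);
binmode $out, ':encoding(UTF-16)';
print {$out} qq(["caf\x{e9}"]);
close $out or die "Cannot close $filename: $!";

# Read the file back, decoding UTF-16 into Perl characters.
open my $in, '<:encoding(UTF-16)', $filename
    or die "Cannot open $filename: $!";
my $text = do { local $/; <$in> };
close $in or die "Cannot close $filename: $!";

my $p = parse_json ($text);
print length ($p->[0]), "\n";
# Prints 4, the number of characters in the decoded string.
```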

JSON::Parse does not determine the nature of the octet stream, as described in part 3 of "RFC 4627".

This restriction to UTF-8 applies regardless of whether Perl thinks that the input string is a character string or a byte string. Non-UTF-8 input will cause a fatal error.

DIAGNOSTICS

"valid_json" does not produce error messages. "parse_json" and "assert_valid_json" die on encountering invalid input.

Error messages have the line number and the byte number of the input which caused the problem. The line number is formed simply by counting the number of "\n" (linefeed, ASCII 0x0A) characters in the whitespace part of the JSON. If you find the error message unclear, please report that as a bug.

Parsing errors are fatal, so to continue after an error occurs, put the parsing into an eval block:

my $p;
eval {
    $p = parse_json ($j);
};
if ($@) {
    # handle error
}
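The eval pattern can be wrapped in a small helper. The function name try_parse_json below is hypothetical and not part of JSON::Parse. Since the exact message text is not documented, the sketch prints whatever message it receives.

```perl
use strict;
use warnings;
use JSON::Parse 'parse_json';

# Hypothetical helper, not part of the module: returns (data, undef)
# on success and (undef, message) on failure.
sub try_parse_json
{
    my ($json) = @_;
    my $data;
    my $ok = eval { $data = parse_json ($json); 1 };
    return (undef, $@) unless $ok;
    return ($data, undef);
}

# The second line of this input is incomplete, so parsing fails.
my ($data, $error) = try_parse_json (qq({"a":1,\n"b":}));
print defined $error ? "Error: $error\n" : "Parsed OK\n";
```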

At the moment the exact content of the diagnostics is not documented so please review the source code if you need more details on potential outputs. The name of the exception-throwing function is currently "failburger", so search through the C files in the top directory of the distribution for "failburger":

cd JSON-Parse
grep "failburger" Json3-*.c

This name is likely to be changed in future releases of the module.

Alternatively, the likely syntax of the various error messages is also visible in the file t/valid-json.t.

SPEED

On the author's computer, the module's parsing speed is approximately the same as, or slightly faster than, that of JSON::XS, with small variations depending on the type of input. Some special types of input, such as floating point numbers containing an exponential part, like "1e09", are about two or three times faster to parse with this module than with JSON::XS. This is because JSON::Parse parses exponentials with the system's strtod function, whereas JSON::XS contains its own parser for exponentials.

For validation, "valid_json" is faster than any other module known to the author.

There is some benchmarking code in the GitHub repository under the directory "benchmarks" for those wishing to test these claims. The script benchmarks/bench is an adaptation of the similar script in the JSON::XS distribution.

Here is an example with "benchmarks/long.json", originally downloaded from http://dist.schmorp.de/misc/json/long.json:

Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     |   9691.538 |  0.0051591 |
JSON::Parse   |   3313.665 |  0.0150890 |
JSON::XS      |   3318.751 |  0.0150659 |
--------------+------------+------------+

Here JP::valid is the running time of JSON::Parse's "valid_json". A higher number in the second column, or a smaller number in the third column, is faster.

Here is an example with "benchmarks/words-array.json":

Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     | 169535.327 |  0.0002949 |
JSON::Parse   |  22533.061 |  0.0022190 |
JSON::XS      |  21185.493 |  0.0023601 |
--------------+------------+------------+

Here is an example with "benchmarks/exp.json", containing floating point numbers:

Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     |  83352.623 |  0.0005999 |
JSON::Parse   |  30826.870 |  0.0016220 |
JSON::XS      |  13545.743 |  0.0036912 |
--------------+------------+------------+

Here is an example with "benchmarks/literals.json", containing JSON literals:

Repetitions: 50 x 100 = 5000
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     | 182519.756 |  0.0002739 |
JSON::Parse   |  31564.600 |  0.0015841 |
JSON::XS      |  17939.709 |  0.0027871 |
--------------+------------+------------+

Here is an example with "benchmarks/cpantesters.json", a 250K file:

Repetitions: 5 x 10 = 50
--------------+------------+------------+
module        |      1/min |        min |
--------------|------------|------------|
JP::valid     |    876.736 |  0.0057030 |
JSON::Parse   |    140.631 |  0.0355539 |
JSON::XS      |    131.866 |  0.0379171 |
--------------+------------+------------+
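For a rough comparison of "valid_json" and "parse_json" without the benchmarks directory, something like the following sketch can be used. It relies on Perl's core Benchmark module rather than the benchmarks/bench script, and the generated input is made up, so its results are not comparable with the tables above and will vary between machines.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';
use JSON::Parse qw(parse_json valid_json);

# Build an illustrative JSON array of 1000 small objects.
my $json = '[' . join (',', map { qq({"id":$_,"name":"item $_"}) } 1 .. 1000) . ']';

# Run each function for at least one CPU second and print a comparison table.
cmpthese (-1, {
    parse_json => sub { parse_json ($json) },
    valid_json => sub { valid_json ($json) },
});
```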

SEE ALSO

RFC 4627

JSON is specified in http://www.ietf.org/rfc/rfc4627.txt.

JSON, JSON::XS

These modules allow both reading and writing of JSON.

TEST RESULTS

The ActiveState test results are at http://code.activestate.com/ppm/JSON-Parse/.

EXPORTS

The module exports nothing by default. All of the functions, "parse_json", "json_file_to_perl", "valid_json" and "assert_valid_json", as well as the old function names "validate_json" and "json_to_perl", can be exported on request.
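Since nothing is exported by default, each function has to be named in the import list. A minimal sketch importing two current names and one old name together:

```perl
use strict;
use warnings;
use JSON::Parse qw(parse_json valid_json json_to_perl);

my $json = '{"a":[1,2,3]}';
if (valid_json ($json)) {
    # The old and new names are documented to be the same function.
    my $new = parse_json ($json);
    my $old = json_to_perl ($json);
    print scalar @{$new->{a}}, " ", scalar @{$old->{a}}, "\n";
    # Prints "3 3".
}
```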

SUPPORT

There is a mailing list at <json-parse@googlegroups.com> for announcements and discussions about the module. You can read it on the web at https://groups.google.com/forum/#!forum/json-parse. Membership is open to the public.

AUTHOR

Ben Bullock, <bkb@cpan.org>

LICENSE

JSON::Parse can be used, copied, modified and redistributed under the same terms as Perl itself.