NAME
JSON::Repair - reformat JSON to strict compliance
SYNOPSIS
use utf8;
use JSON::Repair 'repair_json';
my $bad_json = <<EOF;
{'very bad':0123,
"
naughty":'json',
value: 00000.00001,
}
// garbage
EOF
print repair_json ($bad_json);
produces output
{"very bad":123,
"\nnaughty":"json",
"value": 0.00001
}
(This example is included as synopsis.pl in the distribution.)
VERSION
This documents version 0.08 of JSON::Repair corresponding to git commit 0d223c0746505268a2a620e28a5917f0de928c3f released on Fri Jan 1 20:19:15 2021 +0900.
DESCRIPTION
Given some "relaxed" JSON text containing such things as trailing commas, comments, or strings containing tab characters or newlines, this module uses heuristics to convert these into strictly compliant JSON.
JSON::Repair is an example of the use of the machine-readable error messages in JSON::Parse.
FUNCTIONS
repair_json
my $repaired = repair_json ($json, %options);
This applies various repairs to $json to make it compliant with the JSON specification and returns the repaired text. If $json cannot be repaired, it prints an error message and returns the undefined value.
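Because a failed repair yields the undefined value, callers will usually want to test the result before using it. The following is a minimal sketch of that check. The helper name repair_or_default is hypothetical and not part of JSON::Repair; valid_json is imported from JSON::Parse, and the input is the trailing-comma example used elsewhere in this document.

```perl
use strict;
use warnings;
use JSON::Repair 'repair_json';
use JSON::Parse 'valid_json';

# Hypothetical helper (not part of JSON::Repair): return the repaired
# text, or $default if repair_json gave up and returned undef.
sub repair_or_default
{
    my ($json, $default) = @_;
    # Input which is already valid JSON is passed through untouched.
    return $json if valid_json ($json);
    my $repaired = repair_json ($json);
    return defined $repaired ? $repaired : $default;
}

my $out = repair_or_default (q/{"answer":["bob dylan",42,],}/, '{}');
print "$out\n";
```

Checking with valid_json first is a design choice of this sketch, made so that compliant input never goes through the heuristics at all.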
Repairs applied
- Strip trailing commas
use JSON::Repair ':all';
print repair_json (q/{"answer":["bob dylan",42,],}/), "\n";
produces output
{"answer":["bob dylan",42]}
(This example is included as trailing-commas.pl in the distribution.)
- Change single quotes to double quotes in keys
use JSON::Repair ':all';
print repair_json ("{'answer':42}"), "\n";
produces output
{"answer":42}
(This example is included as single-quotes.pl in the distribution.)
- Add missing object-end, string-end and array-end markers
use JSON::Repair ':all';
print repair_json ( '{"stuff":["good' );
produces output
{"stuff":["good"]}
(This example is included as missing-ends.pl in the distribution.)
- Add quotes to unquoted keys
use JSON::Repair ':all';
print repair_json ( "{how many roads must a man walk down:42}" );
produces output
{"how many roads must a man walk down":42}
(This example is included as unquoted-keys.pl in the distribution.)
- Add missing commas to objects and arrays
The module can add missing commas between object members or array elements.
use JSON::Repair ':all';
print repair_json (q![1 2 3 4 {"six":7 "eight":9}]!), "\n";
produces output
[1, 2, 3, 4, {"six":7, "eight":9}]
(This example is included as missing-commas.pl in the distribution.)
- Remove comments
The module removes C and C++ comments and hash comments (Perl-style comments) from JSON.
This example uses the example from the synopsis of JSON::Relaxed:
use JSON::Repair ':all';
my $rjson = <<'(RAW)';
/* Javascript-like comments are allowed */
{
  // single or double quotes allowed
  a : 'Larry',
  b : "Curly",
  // nested structures allowed like in JSON
  c: [
    {a:1, b:2},
  ],
  // like Perl, trailing commas are allowed
  d: "more stuff",
}
(RAW)
print repair_json ($rjson);
produces output
{
  "a" : "Larry",
  "b" : "Curly",
  "c": [
    {"a":1, "b":2}
  ],
  "d": "more stuff"
}
(This example is included as comments.pl in the distribution.)
This example demonstrates removing hash comments:
use JSON::Repair 'repair_json';
print repair_json (<<'EOF');
{
  # specify rate in requests/second
  rate: 1000
}
EOF
produces output
{
  "rate": 1000
}
(This example is included as hash-comments.pl in the distribution.)
The facility to remove hash comments was added in version 0.02 of the module. The module currently uses "C::Tokenize" for the C/C++ comment regexes.
- Sort out broken numbers
JSON does not allow various kinds of numbers: decimals less than one without a leading zero, such as .123 (should be 0.123); decimals with an exponent but without a fraction, such as 1.e9 (should be 1.0e9); and integers with a leading zero, such as 0123 (should be 123). JSON::Repair adds or removes digits to make them parseable.
use JSON::Repair ':all';
print repair_json ('[.123,0123,1.e9]');
produces output
[0.123,123,1.0e9]
(This example is included as numbers.pl in the distribution.)
JSON::Repair strips leading zeros as in 0123 without converting the result to octal (base 8). It doesn't attempt to repair hexadecimal (base 16) numbers.
The facility to reinterpret numbers was added in version 0.02 of the module.
- Convert unprintable and whitespace characters to escapes in strings
Strings containing unprintable ASCII characters and some kinds of whitespace are not allowed in JSON. This converts them into valid escapes.
use JSON::Repair 'repair_json';
my $badstring = '"' . chr (9) . chr (0) . "\n" . '"';
print repair_json ($badstring), "\n";
produces output
"\t\u0000\n"
(This example is included as strings.pl in the distribution.)
This was added in version 0.04 of the module.
- Empty inputs are converted into the empty string
Completely empty inputs are converted into "".
Options
Valid options are
- verbose
my $okjson = repair_json ($json, verbose => 1);
Give a true value to make the module print messages about the operations applied. This facility is largely for debugging the module itself. The messages may be poorly formatted and opaque, and are not guaranteed to be the same in future versions of the module.
Here is the output of the synopsis run with the verbose option:
use utf8;
use JSON::Repair 'repair_json';
my $bad_json = <<EOF;
{'very bad':0123,
# comment
"
naughty":'json',
value: 00000.00001,
}
garbage
EOF
print repair_json ($bad_json, verbose => 1);
produces output
Unexpected character ''' at byte 2.
Changing single to double quote.
Unexpected character '1' at byte 14.
Leading zero in number?
Unexpected character '#' at byte 18.
Hash comments in object or array?
Deleting comment ' comment'.
Unexpected character ' ' at byte 20.
Changing bad byte 31 into \n.
Unexpected character ''' at byte 31.
Changing single to double quote.
Unexpected character 'v' at byte 39.
Unquoted key or value in object?
Adding quotes to key 'value'
Unexpected character '0' at byte 49.
Leading zero in number?
Unexpected character '}' at byte 57.
Removing a trailing comma.
Unexpected character 'g' at byte 58.
Trailing garbage 'garbage '?
{"very bad":123,
"\nnaughty":"json",
"value": 0.00001
}
(This example is included as synopsis-verbose.pl in the distribution.)
EXPORTS
"repair_json" is exported on demand. The tag ":all" exports all functions.
use JSON::Repair ':all';
DEPENDENCIES
- JSON::Parse
This module relies on "diagnostics_hash" in JSON::Parse to find the errors in the input. Most of the work of JSON::Repair is actually done by JSON::Parse's diagnostics, and then JSON::Repair applies a few heuristic rules to guess what might have caused the error, modify the input, and re-parse it repeatedly until either the input is compliant, or none of the rules can be applied to it.
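The parse, diagnose, patch and re-parse cycle described above starts from the diagnostics themselves. The following is a minimal sketch of how they might be inspected, assuming the object-oriented JSON::Parse interface (new, diagnostics_hash and check) and Perl 5.14 or later. The helper name json_diagnostics is hypothetical, and this is not the actual source of JSON::Repair. The contents of the hash are dumped rather than named, since they are documented in JSON::Parse and not in this module.

```perl
use strict;
use warnings;
use JSON::Parse;
use Data::Dumper;

# Hypothetical helper: return the diagnostics hash reference for $json,
# or undef if JSON::Parse accepts the input.
sub json_diagnostics
{
    my ($json) = @_;
    my $jp = JSON::Parse->new ();
    # Ask for errors as a hash reference rather than a string.
    $jp->diagnostics_hash (1);
    my $ok = eval { $jp->check ($json); 1 };
    # Copy $@ immediately so that nothing can overwrite it.
    my $error = $@;
    return $ok ? undef : $error;
}

my $diag = json_diagnostics ('{"answer":42,}');
if ($diag) {
    local $Data::Dumper::Sortkeys = 1;
    print Dumper ($diag);
}
```

A repair routine built on this would pick a heuristic based on the dumped fields, edit the text, and call the helper again until it returns undef or no heuristic applies.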
- C::Tokenize
This module uses the regular expression for C comments from C::Tokenize.
- Carp
Carp is used to report errors.
- Perl 5.14
Unfortunately "diagnostics_hash" in JSON::Parse is only available for Perl 5.14 or later, because it relies on croak_sv in perlapi, which was introduced in Perl 5.14. I'm not sure if there is a way to get the same behaviour with earlier versions of Perl.
SCRIPT
A script repairjson is installed with the module which runs "repair_json" on the files given as arguments:
repairjson file1.json file2.json
The output is the repaired JSON.
The script was added in version 0.02 of the module.
SEE ALSO
See the section "SEE ALSO" in JSON::Parse for a comprehensive list of JSON modules on CPAN and more information about JSON itself.
JSON-like formats
It's very likely that a non-compliant JSON format cannot be handled by this module, because the changes that need to be made to put one variety of JSON-like format into strict JSON are incompatible with the changes that need to be made to fix another. For example, it is impossible to correctly convert the "HJSON" format or the "YAML" format into compliant JSON without breaking other parts of the module. Thus, no comprehensive solution is possible.
Since it is infeasible to meaningfully convert every possible sequence of bytes into compliant JSON, JSON::Repair should be regarded as an example which demonstrates the use of the diagnostics provided by the "JSON::Parse" module to repair broken JSON inputs, rather than as a general solution.
- HJSON
See http://hjson.org. This format cannot be converted to strictly compliant JSON by this module.
- YAML
See http://yaml.org. This format cannot be converted to strictly compliant JSON by this module.
AUTHOR
Ben Bullock, <bkb@cpan.org>
COPYRIGHT & LICENCE
This package and associated files are copyright (C) 2016-2021 Ben Bullock.
You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.