NAME

JSON::Tokenize - tokenize a string containing JSON

SYNOPSIS

use JSON::Tokenize ':all';
my $input = '{"tuttie":["fruity", true, 100]}';
my $token = tokenize_json ($input);
print_tokens ($token, 0);

sub print_tokens
{
    my ($token, $depth) = @_;
    while ($token) {
        my $start = tokenize_start ($token);
        my $end = tokenize_end ($token);
        my $type = tokenize_type ($token);
        print "   " x $depth;
        my $value = substr ($input, $start, $end - $start);
        print ">>$value<< has type $type\n";
        my $child = tokenize_child ($token);
        if ($child) {
            print_tokens ($child, $depth+1);
        }
        my $next = tokenize_next ($token);
        $token = $next;
    }
}

This outputs

>>{"tuttie":["fruity", true, 100]}<< has type object
   >>"tuttie"<< has type string
   >>:<< has type colon
   >>["fruity", true, 100]<< has type array
      >>"fruity"<< has type string
      >>,<< has type comma
      >>true<< has type literal
      >>,<< has type comma
      >>100<< has type number

VERSION

This documents version 0.57_01 of JSON::Tokenize corresponding to git commit ab987c637a599d090166e789151d1cd972741be5 released on Tue Dec 29 22:26:12 2020 +0900.

DESCRIPTION

This module tokenizes a JSON string. It breaks the string into individual tokens without building any Perl data structures from it, so it can be used to pick out or search through parts of a large JSON structure without holding every part of that structure in memory as a separate Perl variable.

This module is an experimental part of JSON::Parse and its interface is likely to change. The tokenizing functions are currently written in a very primitive way.

FUNCTIONS

tokenize_json

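my $token = tokenize_json ($json);

Tokenize the JSON in $json and return the first token, which covers the whole top-level value, as the example in "SYNOPSIS" shows.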

tokenize_next

my $next = tokenize_next ($token);

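Return the token which follows $token at the same depth, or a false value if $token is the last one at that depth. Use this together with "tokenize_child" to walk the tree of tokens.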

tokenize_child

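my $child = tokenize_child ($token);

Return the first token nested inside $token, such as the first key of an object or the first element of an array, or a false value if $token has no children. Use this together with "tokenize_next" to walk the tree of tokens.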
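
As a minimal sketch of how the two walking functions combine, the following lists the types of the tokens directly inside the top-level object from "SYNOPSIS". It uses only functions documented here and assumes, as the loop in "SYNOPSIS" does, that "tokenize_next" returns a false value after the last token.

```perl
use strict;
use warnings;
use JSON::Tokenize ':all';

my $input = '{"tuttie":["fruity", true, 100]}';
# The first token covers the whole object.
my $top = tokenize_json ($input);

# Step down one level, then move sideways until no token is left.
my @types;
my $token = tokenize_child ($top);
while ($token) {
    push @types, tokenize_type ($token);
    $token = tokenize_next ($token);
}
print join (' ', @types), "\n";
```

Going by the output shown in "SYNOPSIS", this should print the three types string, colon and array. The tokens inside the array are not visited because the loop never calls "tokenize_child" on them.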

tokenize_start

my $start = tokenize_start ($token);

Get the start of the token as a byte offset from the start of the string. Note that this is a byte offset, not a character offset, so the two differ for any token which comes after a non-ASCII character.

tokenize_end

my $end = tokenize_end ($token);

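Get the end of the token as a byte offset from the start of the string. As the use of substr in "SYNOPSIS" shows, this is the offset just past the last byte of the token, so $end - $start is the length of the token in bytes. Note that this is a byte offset, not a character offset.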
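
The difference between byte and character offsets can be illustrated without the tokenizer at all. The following sketch uses only core Perl and assumes that the offsets count bytes of the UTF-8 encoding of the JSON. The values of $start and $end are worked out by hand with index as stand-ins for what "tokenize_start" and "tokenize_end" would return for the number token.

```perl
use strict;
use warnings;

# A character string in which \x{e9} (e with an acute accent) is one character.
my $chars = qq{["caf\x{e9}", 1]};

# The same JSON as UTF-8 bytes, where that character takes two bytes.
my $bytes = $chars;
utf8::encode ($bytes);

# Stand-ins for the byte offsets of the number token, computed by
# hand here rather than returned by the tokenizer.
my $start = index ($bytes, '1');
my $end = $start + 1;

# Applying the byte offsets to the bytes gives the right text.
my $from_bytes = substr ($bytes, $start, $end - $start);
# Applying them to the character string is off by one character.
my $from_chars = substr ($chars, $start, $end - $start);
print "bytes: $from_bytes\n";
print "chars: $from_chars\n";
```

This prints "bytes: 1" and "chars: ]". The accented character occupies two bytes but only one character position, so the byte offset lands one character too far when it is applied to the character string.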

tokenize_type

my $type = tokenize_type ($token);

Get the type of the token as a string. The output in "SYNOPSIS" also shows "colon" and "comma" for punctuation tokens, in addition to the following return values:

"invalid",
"initial state",
"string",
"number",
"literal",
"object",
"array",
"unicode escape"

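One way to see which types occur in a given input is to count them. The following is a sketch rather than part of the module: tally is a hypothetical helper which walks the tree in the same way as print_tokens in "SYNOPSIS" and counts each value returned by "tokenize_type".

```perl
use strict;
use warnings;
use JSON::Tokenize ':all';

my $input = '{"tuttie":["fruity", true, 100]}';
my %count;
my $top = tokenize_json ($input);
tally ($top);

# Hypothetical helper: count the type of $token, of the tokens which
# follow it at the same depth, and of everything nested inside them.
sub tally
{
    my ($token) = @_;
    while ($token) {
        $count{tokenize_type ($token)}++;
        my $child = tokenize_child ($token);
        tally ($child) if $child;
        $token = tokenize_next ($token);
    }
}

for my $type (sort keys %count) {
    print "$type: $count{$type}\n";
}
```

Going by the output shown in "SYNOPSIS", this input should give two each of string and comma, and one each of array, colon, literal, number and object.
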
tokenize_text

my $text = tokenize_text ($json, $token);

Given the JSON string $json and a token $token obtained by tokenizing it, return the text of $json which corresponds to the token. This is a convenience function written in Perl which uses "tokenize_start", "tokenize_end" and substr to extract the text from $json.
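
As an illustration of picking one part out of a JSON string, the following sketch looks up the raw text of the value stored under a key of the top-level object. value_text is a hypothetical helper, not part of JSON::Tokenize. It relies on the token order shown in "SYNOPSIS", where a key is followed by a colon token and then by its value, and it compares the key as raw text including its quotes, so it does not handle keys written with escapes.

```perl
use strict;
use warnings;
use JSON::Tokenize ':all';

my $input = '{"tuttie":["fruity", true, 100]}';
my $top = tokenize_json ($input);

# Hypothetical helper: return the raw text of the value stored under
# $key in the object token $object, or undef if $key is not found.
sub value_text
{
    my ($json, $object, $key) = @_;
    my $token = tokenize_child ($object);
    while ($token) {
        if (tokenize_type ($token) eq 'string'
            && tokenize_text ($json, $token) eq qq{"$key"}) {
            # A key is followed by a colon; a string value is not.
            my $colon = tokenize_next ($token);
            if ($colon && tokenize_type ($colon) eq 'colon') {
                my $value = tokenize_next ($colon);
                return $value ? tokenize_text ($json, $value) : undef;
            }
        }
        $token = tokenize_next ($token);
    }
    return undef;
}

my $found = value_text ($input, $top, 'tuttie');
print "$found\n";
```

Going by the output shown in "SYNOPSIS", this should print the text of the array, ["fruity", true, 100], without any of its elements having been turned into Perl values.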

AUTHOR

Ben Bullock, <bkb@cpan.org>

COPYRIGHT & LICENCE

This package and associated files are copyright (C) 2016-2020 Ben Bullock.

You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.