NAME

Text::Treesitter::Node - an element of a tree-sitter parse result

SYNOPSIS

Usually accessed indirectly, via Text::Treesitter::Tree.

   use Text::Treesitter;

   my $ts = Text::Treesitter->new(
      lang_name => "perl",
   );

   my $tree = $ts->parse_string( $input );

   my $root = $tree->root_node;

   foreach my $node ( $root->child_nodes ) {
      next if $node->is_extra;
      my $name = $node->is_named ? $node->type : '"' . $node->text . '"';

      printf "Node %s extends from line %d to line %d\n",
         $name,
         ( $node->start_point )[0] + 1,
         ( $node->end_point )[0] + 1;
   }

DESCRIPTION

The result of a parse operation is a tree of nodes represented by instances of this class, which are all stored in an instance of Text::Treesitter::Tree. Most of the work of handling the result of a parse operation is done by operating on these tree nodes.

Note that tree-sitter's TSNode type is a struct passed by value, not a pointer to one. Every time the Perl binding wraps a node, it must therefore create a new object instance. You cannot rely on the identity of these objects remaining stable as a means of keeping track of a particular tree node.
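Because object identity is not stable, one possible way to recognise a node again is to derive a key from its own properties. The following is a minimal sketch, not part of this module: node_key is a hypothetical helper, and MockNode is a stand-in class so the example runs without a grammar installed. It assumes that type plus byte extent is distinguishing enough for your purposes; nested nodes sharing both would collide.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Treesitter: derive a key from a
# node's type and byte extent instead of relying on object identity.
sub node_key {
   my ( $node ) = @_;
   return join ":", $node->type, $node->start_byte, $node->end_byte;
}

# Minimal stand-in class (an assumption for illustration); a real
# Text::Treesitter::Node provides the same three accessors.
package MockNode {
   sub new        { my ( $class, %args ) = @_; return bless {%args}, $class }
   sub type       { $_[0]{type} }
   sub start_byte { $_[0]{start_byte} }
   sub end_byte   { $_[0]{end_byte} }
}

# Two distinct wrapper objects describing the same underlying node
my $first  = MockNode->new( type => "identifier", start_byte => 4, end_byte => 7 );
my $second = MockNode->new( type => "identifier", start_byte => 4, end_byte => 7 );

my %seen;
$seen{ node_key( $first ) } = 1;

print "wrapper objects differ\n" if $first != $second;
print "same node by key\n"       if $seen{ node_key( $second ) };
```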

METHODS

tree

   $tree = $node->tree;

Returns the Text::Treesitter::Tree instance from which this node was obtained.

text

   $text = $node->text;

Returns the substring of the tree's stored text that is covered by this node.

type

   $type = $node->type;

Returns a description string giving the name of the grammar rule (or directly an input string for anonymous nodes).

start_byte

   $pos = $node->start_byte;

Returns the byte offset into the input string where this node's extent begins.

end_byte

   $pos = $node->end_byte;

Returns the offset into the input string just past where this node's extent finishes (i.e. the first byte of the input string that is not part of this node).

start_char

end_char

   $pos = $node->start_char;

   $pos = $node->end_char;

Return the start and end offsets counted in characters (suitable for use with substr, length, and so on) rather than in bytes.
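The difference between byte and character offsets only shows up when the input contains multi-byte characters. The following sketch illustrates it in plain Perl, assuming the text is encoded as UTF-8; byte_to_char_offset is a hypothetical helper for illustration and says nothing about how the module computes its values.

```perl
use strict;
use warnings;
use Encode qw( encode decode );

# Hypothetical helper: convert a byte offset within UTF-8 encoded text into a
# character offset by decoding the prefix and counting its characters.
sub byte_to_char_offset {
   my ( $bytes, $byte_offset ) = @_;
   my $prefix = substr( $bytes, 0, $byte_offset );
   return length( decode( "UTF-8", $prefix ) );
}

my $text  = "my \$caf\x{e9} = 1;";      # character string containing U+00E9
my $bytes = encode( "UTF-8", $text );

# "=" follows the two-byte encoding of U+00E9, so the offsets differ by one
my $char_pos = index( $text,  "=" );
my $byte_pos = index( $bytes, "=" );

printf "char offset %d, byte offset %d\n", $char_pos, $byte_pos;
printf "converted back: %d\n", byte_to_char_offset( $bytes, $byte_pos );
```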

start_point

   ( $line, $col ) = $node->start_point;

Returns the position in the input text where this node's extent begins, split into a line and column number (both 0-based; the string is considered to start at position (0, 0)). Note that the column is counted in bytes, not characters.
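To make the 0-based, byte-counted convention concrete, here is a sketch that derives such a position from a byte offset in plain Perl. byte_offset_to_point is a hypothetical helper, and it assumes lines are separated by "\n"; it is meant to illustrate the convention, not to reproduce the library's internals.

```perl
use strict;
use warnings;

# Hypothetical helper: compute a 0-based ( row, column ) pair for a byte
# offset, with the column counted in bytes from the start of the line.
sub byte_offset_to_point {
   my ( $bytes, $offset ) = @_;
   my $prefix = substr( $bytes, 0, $offset );
   my $row    = ( $prefix =~ tr/\n// );        # newlines before the offset
   my $bol    = rindex( $prefix, "\n" ) + 1;   # byte offset of the line start
   return ( $row, $offset - $bol );
}

my $source = "my \$x = 1;\nsay \$x;\n";

my ( $row, $col ) = byte_offset_to_point( $source, 0 );
print "start of input: ($row, $col)\n";

# Position of "say" at the beginning of the second line
( $row, $col ) = byte_offset_to_point( $source, index( $source, "say" ) );
print "say begins at: ($row, $col)\n";
```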

end_point

   ( $row, $col ) = $node->end_point;

Returns the position in the input text just past where this node's extent finishes, split into a row (line) and column number (both 0-based).

start_row

start_column

end_row

end_column

   $row = $node->start_row;
   $row = $node->end_row;

   $col = $node->start_column;
   $col = $node->end_column;

Since version 0.11.

Returns individual fields of the start or end position of the node's extent, all as 0-based indexes.

These are more efficient if you only need the row or column; use "start_point" or "end_point" if you need both.

is_named

   $bool = $node->is_named;

Returns true if the node represents a named rule in the grammar.

is_missing

   $bool = $node->is_missing;

Returns true if the node was inserted by the parser to recover from certain kinds of syntax error.

is_extra

   $bool = $node->is_extra;

Returns true if the node represents something which is not required by the grammar but could appear anywhere (for example, a comment).

has_error

   $bool = $node->has_error;

Returns true if the node or any of its descendants represents a syntax error.
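Since has_error also reports errors in descendants, it can be used to prune a search for the places where errors actually occur. The sketch below finds the innermost nodes reporting an error, using only has_error and child_nodes. innermost_errors is a hypothetical helper and MockNode is a stand-in class so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: return the deepest nodes for which has_error is true,
# skipping any subtree that reports no error at all.
sub innermost_errors {
   my ( $node ) = @_;
   return unless $node->has_error;
   my @inner = map { innermost_errors( $_ ) } $node->child_nodes;
   return @inner ? @inner : ( $node );
}

# Minimal stand-in class (an assumption for illustration)
package MockNode {
   sub new {
      my ( $class, $type, $has_error, @children ) = @_;
      return bless { type => $type, has_error => $has_error,
                     children => \@children }, $class;
   }
   sub type        { $_[0]{type} }
   sub has_error   { $_[0]{has_error} }
   sub child_nodes { @{ $_[0]{children} } }
}

my $root = MockNode->new( "source_file", 1,
   MockNode->new( "statement", 0 ),
   MockNode->new( "statement", 1,
      MockNode->new( "broken_expression", 1 ) ),
);

my @errors = innermost_errors( $root );
print $_->type, "\n" for @errors;
```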

parent

   $parent = $node->parent;

Returns the node's immediate parent; the node from which this node was obtained. Returns undef on the root node.

child_count

   $count = $node->child_count;

Returns the number of child nodes contained by this one.

child_nodes

   @nodes = $node->child_nodes;

Returns a list of child nodes. The length of the returned list will be the size given by "child_count".
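Calling child_nodes recursively is enough to visit a whole tree. The following is a minimal depth-first walk; walk_tree is a hypothetical helper and MockNode is a stand-in class, so that the sketch runs without a parser. With a real tree you would pass the result of root_node instead.

```perl
use strict;
use warnings;

# Hypothetical helper: visit a node and all its descendants depth-first,
# calling $visit with each node and its depth.
sub walk_tree {
   my ( $node, $visit, $depth ) = @_;
   $depth //= 0;
   $visit->( $node, $depth );
   walk_tree( $_, $visit, $depth + 1 ) for $node->child_nodes;
}

# Minimal stand-in class (an assumption for illustration)
package MockNode {
   sub new {
      my ( $class, $type, @children ) = @_;
      return bless { type => $type, children => \@children }, $class;
   }
   sub type        { $_[0]{type} }
   sub child_nodes { @{ $_[0]{children} } }
}

my $root = MockNode->new( "source_file",
   MockNode->new( "expression_statement",
      MockNode->new( "number" ) ),
   MockNode->new( "comment" ),
);

# Print an indented outline of node types
walk_tree( $root, sub {
   my ( $node, $depth ) = @_;
   print "  " x $depth, $node->type, "\n";
} );
```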

field_names_with_child_nodes

   @kvlist = $node->field_names_with_child_nodes;

Returns an even-length key/value list containing field names associated with child nodes. The list will be twice as long as the size given by "child_count" and consist of pairs. In each pair, the first value is either a field name or undef if the node has no field name, and the second is the child node itself.

On Perl version 5.36 or above, the multi-variable foreach list syntax may be useful to handle these:

   foreach my ($name, $child) ($node->field_names_with_child_nodes) {
      ...
   }

On earlier versions, the List::Util pair functions such as pairs might be used instead:

   use List::Util 'pairs';

   foreach (pairs $node->field_names_with_child_nodes) {
      my ($name, $child) = @$_;
      ...
   }
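If you want the children indexed by field name, note that a plain hash assignment would fold every undef key together and keep only the last child for any name that occurs more than once. The sketch below groups children into lists instead and skips unnamed ones. children_by_field is a hypothetical helper, and plain strings stand in for child nodes so the example is self-contained.

```perl
use strict;
use warnings;
use List::Util qw( pairs );

# Hypothetical helper: group the children of a key/value list by field name,
# skipping children that have no field name (undef keys).
sub children_by_field {
   my @kvlist = @_;
   my %by_field;
   foreach my $pair ( pairs @kvlist ) {
      my ( $name, $child ) = @$pair;
      next unless defined $name;
      push @{ $by_field{$name} }, $child;
   }
   return \%by_field;
}

# Plain strings stand in for child nodes here; with a real node this list
# would come from $node->field_names_with_child_nodes
my @kvlist = (
   left  => "(identifier)",
   undef, '"+"',
   right => "(number)",
);

my $fields = children_by_field( @kvlist );
print join( ", ", sort keys %$fields ), "\n";
```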

child_by_field_name

   $child = $node->child_by_field_name( $field_name );

Since version 0.07.

Returns the child node associated with the given field name. This would be the same as the value found by

   my %children = $node->field_names_with_child_nodes;
   $child = $children{ $field_name };

If the node does not have a child with the given field name, an exception is thrown.

try_child_by_field_name

   $child = $node->try_child_by_field_name( $field_name );

Since version 0.07.

Similar to "child_by_field_name" but returns undef if there is no such child rather than throwing an exception.

debug_sprintf

   $str = $node->debug_sprintf();

Returns a debugging text string that represents the node and all its child nodes, in a format similar to tree-sitter's usual S-expr notation.

Basic named nodes are printed with their type name in parens: (type). Anonymous nodes have their text string in quotes: "text". Child nodes of named nodes are included within the parens of the type name. Field names are printed as prefixes followed by a colon.

   (node)

   (node (children) (go) "here")

   (node left: (node) right: (node))
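The notation can be approximated using only the accessors documented above. The following sketch is a hypothetical re-creation for illustration; sexpr and MockNode are not part of this module, and the real debug_sprintf may differ in detail.

```perl
use strict;
use warnings;
use List::Util qw( pairs );

# Hypothetical re-creation of the notation described above
sub sexpr {
   my ( $node ) = @_;

   # Anonymous nodes are shown as their quoted text
   return '"' . $node->text . '"' unless $node->is_named;

   # Named nodes are shown as their type followed by their children,
   # each optionally prefixed by its field name
   my @parts = ( $node->type );
   foreach my $pair ( pairs $node->field_names_with_child_nodes ) {
      my ( $field, $child ) = @$pair;
      push @parts, ( defined $field ? "$field: " : "" ) . sexpr( $child );
   }
   return "(" . join( " ", @parts ) . ")";
}

# Minimal stand-in class (an assumption for illustration)
package MockNode {
   sub new {
      my ( $class, %args ) = @_;
      return bless { named => 1, kids => [], %args }, $class;
   }
   sub is_named { $_[0]{named} }
   sub type     { $_[0]{type} }
   sub text     { $_[0]{text} }
   sub field_names_with_child_nodes { @{ $_[0]{kids} } }
}

my $tree = MockNode->new( type => "binary", kids => [
   left  => MockNode->new( type => "identifier" ),
   undef, MockNode->new( named => 0, text => "+" ),
   right => MockNode->new( type => "number" ),
] );

print sexpr( $tree ), "\n";
```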

TODO

The following C library functions are currently unhandled:

   ts_node_child_by_field_id
   ts_node_next_sibling
   ts_node_prev_sibling
   ts_node_next_named_sibling
   ts_node_prev_named_sibling
   ts_node_first_child_for_byte
   ts_node_first_named_child_for_byte
   ts_node_descendant_for_byte_range
   ts_node_descendant_for_point_range
   ts_node_named_descendant_for_byte_range
   ts_node_named_descendant_for_point_range
   ts_node_edit
   ts_node_eq

AUTHOR

Paul Evans <leonerd@leonerd.org.uk>