NAME

C::Tokenize - reduce a C file to a series of tokens

REGULAR EXPRESSIONS

The regular expressions can be imported by name. For example,

use C::Tokenize '$cpp_re';

imports $cpp_re.

None of the regular expressions does any capturing. If you want to capture, add your own parentheses around the regular expression.
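As a minimal sketch of adding your own parentheses, the following wraps $trad_comment_re in a capture group. The input string is made up for illustration, and the snippet assumes the module is installed and exports $trad_comment_re as described above.

```perl
use strict;
use warnings;
use C::Tokenize '$trad_comment_re';

# Hypothetical input, invented for this illustration.
my $c = 'int x; /* counter */ int y;';

# The imported regex has no capture groups of its own, so the
# parentheses added here become group 1.
my $found;
if ($c =~ /($trad_comment_re)/) {
    $found = $1;
    print "Found comment: $found\n";
}
```

Because the imported expression does not capture, wrapping it this way does not shift the numbering of any other groups in a larger pattern.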

$trad_comment_re

Match /* */ comments.

$cxx_comment_re

Match // comments.

$comment_re

Match both /* */ and // comments.

$cpp_re

Match a C preprocessor directive.

$char_const_re

Match a character constant, such as 'a' or '\-'.

$operator_re

Match an operator such as + or --.

$number_re

Match a number, which may be an integer, a floating point number, or a hexadecimal number. Octal numbers are not yet supported.

$word_re

Match a word, such as a function or variable name or a keyword of the language.

$grammar_re

Match other syntactic characters such as { or [.

$single_string_re

Match a single C string constant such as "this".

$string_re

Match a full-blown C string constant, including compound strings "like" "this".
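The expressions above can be used like any other Perl regular expression. As a rough sketch, assuming $comment_re is imported as described earlier, the following deletes both kinds of comment from a made-up fragment of C. It is a naive substitution, so it would also delete comment-like text inside string constants.

```perl
use strict;
use warnings;
use C::Tokenize '$comment_re';

# Hypothetical C fragment, invented for this illustration.
my $c = "int x; // count\nint y; /* total */\n";

# Copy the input, then delete everything that $comment_re matches.
(my $stripped = $c) =~ s/$comment_re//g;

print $stripped;
```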

decomment

my $out = decomment ('/* comment */');
# $out = " comment ";

Remove the comment delimiters from a string, leaving only the contents of the comment.

tokenize

my $tokens = tokenize ($file);

Convert $file into a series of tokens.

Each token is a hash reference containing the following keys:

leading

Leading whitespace

name

The type of the token, e.g. 'comment'.

$name

The value of the type, e.g. $token->{comment} if $token->{name} equals 'comment'.

BUGS

Octal not parsed

It does not parse octal expressions.

Trigraphs

No handling of trigraphs.
