NAME

String::Comments::Extract - Extract comments from C/C++/JavaScript/Java source

VERSION

version 0.023

SYNOPSIS

use String::Comments::Extract;

my $source = <<_END_
/* A Hello World program

    Copyright Ty Coon
    // ...and Buckaroo Banzai
  "Yoyodyne"*/

void main() {
    printf("Hello, World.\n");
    printf("/* This is not a real comment */");
    printf("// Neither is this */");
    // But this is
}

// Last comment
_END_

my $comments = String::Comments::Extract::C->extract($source)
# ... returns the following result:

    /* A Hello World program
    
        Copyright Ty Coon
        // ...and Buckaroo Banzai
      "Yoyodyne"*/

      
        
        
        
        // But this is
    

    // Last comment

my @comments = String::Comments::Extract::C->collect($source)
# ... returns the following list:
    (
' A Hello World program

    Copyright Ty Coon
    // ...and Buckaroo Banzai
  "Yoyodyne"',
        ' But this is',
        ' Last comment',
    )

DESCRIPTION

String::Comments::Extract is a tool for extracting comments from C/C++/JavaScript/Java source. The extractor is implemented using an actual tokenizer (written in C via XS [adapted from JavaScript::Minifier::XS]). By using a tokenizer, it can correctly deal with notoriously problematic cases, such as comment-like structures embedded in strings:

std::cout << "This is not a // real C++ comment " << std::endl
printf("/* This is not a real C comment */\n");
# The extractor will ignore both of the above

String::Comments::Extract considers C/C++/JavaScript/Java comment structures the same, so, for now, it doesn't really matter which method you use (this means it will not complain about C++ style comments in C source).

The language agnostic interface to C/C++/JavaScript/Java comment extractor is accessible via String::Comments::Extract::SlashStar

# Can handle slash-star (/* */) and slash-slash (//) comments
String::Comments::Extract::SlashStar->extract
String::Comments::Extract::SlashStar->collect

METHODS

String::Comments::Extract::JavaScript->extract( <source> )

String::Comments::Extract::CPP->extract( <source> )

String::Comments::Extract::C->extract( <source> )

String::Comments::Extract::SlashStar->extract( <source> )

Returns a string representing the comments in <source>

Comment delimeters ( /* */ // ) are left in as-is

Whitespace of <source> is otherwise preserved, so you'll probably have to do some post-processing to get rid of some cruft.

String::Comments::Extract::JavaScript->collect( <source> )

String::Comments::Extract::CPP->collect( <source> )

String::Comments::Extract::C->collect( <source> )

String::Comments::Extract::SlashStar->collect( <source> )

Returns a list containing an item for each block- or line-comment in <source>

Comment delimeters ( /* */ // ) around the comment are removed

Whitespace outside of comments may not be preserved exactly

SEE ALSO

File::Comments

AUTHOR

Robert Krimen <robertkrimen@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2010 by Robert Krimen.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.