NAME

Bencher::Scenario::CSVParsingModules - Benchmark CSV parsing modules

VERSION

This document describes version 0.002 of Bencher::Scenario::CSVParsingModules (from Perl distribution Bencher-Scenario-CSVParsingModules), released on 2021-07-31.

SYNOPSIS

To run benchmark with default option:

% bencher -m CSVParsingModules

To run module startup overhead benchmark:

% bencher --module-startup -m CSVParsingModules

For more options (dump scenario, list/include/exclude/add participants, list/include/exclude/add datasets, etc), see bencher or run bencher --help.

DESCRIPTION

Packaging a benchmark script as a Bencher scenario makes it convenient to include/exclude/add participants/datasets (either via CLI or Perl code), send the result to a central repository, among others . See Bencher and bencher (CLI) for more details.

BENCHMARKED MODULES

Version numbers shown below are the versions used when running the sample benchmark.

Text::CSV_PP 2.01

Text::CSV_XS 1.46

BENCHMARK PARTICIPANTS

  • Text::CSV_PP (perl_code)

    Code template:

    my $csv = Text::CSV_PP->new({binary=>1}); open my $fh, "<", <filename>; my $rows = []; while (my $row = $csv->getline($fh)) { push @$rows, $row }
  • Text::CSV_XS (perl_code)

    Code template:

    my $csv = Text::CSV_XS->new({binary=>1}); open my $fh, "<", <filename>; my $rows = []; while (my $row = $csv->getline($fh)) { push @$rows, $row }
  • naive-split (perl_code)

    Code template:

    open my $fh, "<", <filename>; my $rows = []; while (defined(my $row = <$fh>)) { chomp $row; push @$rows, [split /,/, $row] }

BENCHMARK DATASETS

  • bench-100x100.csv

  • bench-10x10.csv

  • bench-1x1.csv

  • bench-5x5.csv

BENCHMARK SAMPLE RESULTS

Sample benchmark #1

Run on: perl: v5.34.0, CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (4 cores), OS: GNU/Linux LinuxMint version 19, OS kernel: Linux version 5.3.0-68-generic.

Benchmark command (default options):

% bencher -m CSVParsingModules

Result formatted as table (split, part 1 of 4):

#table1#
{dataset=>"bench-100x100.csv"}
| participant  | rate (/s) | time (ms) | pct_faster_vs_slowest | pct_slower_vs_fastest |  errors | samples |
|--------------+-----------+-----------+-----------------------+-----------------------+---------+---------|
| Text::CSV_PP |      32.7 |      30.6 |                 0.00% |              2073.95% |   2e-05 |      20 |
| Text::CSV_XS |     640   |       1.6 |              1854.86% |                11.21% | 6.2e-06 |      20 |
| naive-split  |     710   |       1.4 |              2073.95% |                 0.00% | 2.5e-06 |      20 |

The above result formatted in Benchmark.pm style:

                 Rate  Text::CSV_PP  Text::CSV_XS  naive-split 
 Text::CSV_PP  32.7/s            --          -94%         -95% 
 Text::CSV_XS   640/s         1812%            --         -12% 
 naive-split    710/s         2085%           14%           -- 

Legends:
  Text::CSV_PP: participant=Text::CSV_PP
  Text::CSV_XS: participant=Text::CSV_XS
  naive-split: participant=naive-split

The above result presented as chart:

Result formatted as table (split, part 2 of 4):

#table2#
{dataset=>"bench-10x10.csv"}
| participant  | rate (/s) | time (μs) | pct_faster_vs_slowest | pct_slower_vs_fastest |  errors | samples |
|--------------+-----------+-----------+-----------------------+-----------------------+---------+---------|
| Text::CSV_PP |      1600 |       620 |                 0.00% |              2364.58% | 1.1e-06 |      20 |
| Text::CSV_XS |     19000 |        54 |              1061.89% |               112.12% | 1.1e-07 |      20 |
| naive-split  |     39000 |        25 |              2364.58% |                 0.00% | 5.3e-08 |      20 |

The above result formatted in Benchmark.pm style:

                  Rate  Text::CSV_PP  Text::CSV_XS  naive-split 
 Text::CSV_PP   1600/s            --          -91%         -95% 
 Text::CSV_XS  19000/s         1048%            --         -53% 
 naive-split   39000/s         2380%          116%           -- 

Legends:
  Text::CSV_PP: participant=Text::CSV_PP
  Text::CSV_XS: participant=Text::CSV_XS
  naive-split: participant=naive-split

The above result presented as chart:

Result formatted as table (split, part 3 of 4):

#table3#
{dataset=>"bench-1x1.csv"}
| participant  | rate (/s) | time (μs) | pct_faster_vs_slowest | pct_slower_vs_fastest |  errors | samples |
|--------------+-----------+-----------+-----------------------+-----------------------+---------+---------|
| Text::CSV_PP |      7500 |     130   |                 0.00% |              2000.62% | 2.7e-07 |      20 |
| Text::CSV_XS |     42700 |      23.4 |               466.40% |               270.87% |   2e-08 |      20 |
| naive-split  |    160000 |       6.3 |              2000.62% |                 0.00% | 1.3e-08 |      20 |

The above result formatted in Benchmark.pm style:

                   Rate  Text::CSV_PP  Text::CSV_XS  naive-split 
 Text::CSV_PP    7500/s            --          -82%         -95% 
 Text::CSV_XS   42700/s          455%            --         -73% 
 naive-split   160000/s         1963%          271%           -- 

Legends:
  Text::CSV_PP: participant=Text::CSV_PP
  Text::CSV_XS: participant=Text::CSV_XS
  naive-split: participant=naive-split

The above result presented as chart:

Result formatted as table (split, part 4 of 4):

#table4#
{dataset=>"bench-5x5.csv"}
| participant  | rate (/s) | time (μs) | pct_faster_vs_slowest | pct_slower_vs_fastest |  errors | samples |
|--------------+-----------+-----------+-----------------------+-----------------------+---------+---------|
| Text::CSV_PP |      3370 |     296   |                 0.00% |              2429.52% | 2.5e-07 |      22 |
| Text::CSV_XS |     33000 |      30   |               878.84% |               158.42% | 5.3e-08 |      20 |
| naive-split  |     85400 |      11.7 |              2429.52% |                 0.00% | 3.2e-09 |      22 |

The above result formatted in Benchmark.pm style:

                  Rate  Text::CSV_PP  Text::CSV_XS  naive-split 
 Text::CSV_PP   3370/s            --          -89%         -96% 
 Text::CSV_XS  33000/s          886%            --         -61% 
 naive-split   85400/s         2429%          156%           -- 

Legends:
  Text::CSV_PP: participant=Text::CSV_PP
  Text::CSV_XS: participant=Text::CSV_XS
  naive-split: participant=naive-split

The above result presented as chart:

Sample benchmark #2

Benchmark command (benchmarking module startup overhead):

% bencher -m CSVParsingModules --module-startup

Result formatted as table:

#table5#
| participant         | time (ms) | mod_overhead_time | pct_faster_vs_slowest | pct_slower_vs_fastest |  errors | samples |
|---------------------+-----------+-------------------+-----------------------+-----------------------+---------+---------|
| Text::CSV_PP        |      20   |              13.9 |                 0.00% |               233.89% | 9.8e-05 |      20 |
| Text::CSV_XS        |      17   |              10.9 |                21.13% |               175.64% | 5.4e-05 |      20 |
| perl -e1 (baseline) |       6.1 |               0   |               233.89% |                 0.00% | 5.3e-05 |      20 |

The above result formatted in Benchmark.pm style:

                         Rate  T:C_P  T:C_X  perl -e1 (baseline) 
 T:C_P                 50.0/s     --   -15%                 -69% 
 T:C_X                 58.8/s    17%     --                 -64% 
 perl -e1 (baseline)  163.9/s   227%   178%                   -- 

Legends:
  T:C_P: mod_overhead_time=13.9 participant=Text::CSV_PP
  T:C_X: mod_overhead_time=10.9 participant=Text::CSV_XS
  perl -e1 (baseline): mod_overhead_time=0 participant=perl -e1 (baseline)

The above result presented as chart:

To display as an interactive HTML table on a browser, you can add option --format html+datatables.

CONTRIBUTOR

perlancar (on pc-office) <perlancar@gmail.com>

HOMEPAGE

Please visit the project's homepage at https://metacpan.org/release/Bencher-Scenario-CSVParsingModules.

SOURCE

Source repository is at https://github.com/perlancar/perl-Bencher-Scenario-CSVParsingModules.

BUGS

Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=Bencher-Scenario-CSVParsingModules

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

AUTHOR

perlancar <perlancar@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2021, 2019 by perlancar@cpan.org.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.