NAME
MySQL::Log::ParseFilter - Parse and filter MySQL slow, general and binary logs
SYNOPSIS
use MySQL::Log::ParseFilter;
# Parse all unique queries from logs given on command line
%params = (logs => \@ARGV,
queries => \%queries);
parse_slow_logs(%params);
parse_general_logs(%params);
parse_binary_logs(%params);
calc_final_values(%params);
DESCRIPTION
MySQL::Log::ParseFilter is a Perl module for parsing and filtering MySQL slow, general and binary logs. MySQL::Log::ParseFilter also parses and filters user-defined logs: logs with variable headers and SQL statement meta-properties.
Each MySQL log is formatted differently and poses many problems to accurate parsing. From the range of MySQL server versions to the vast extent of SQL syntax, parsing a MySQL log file is rarely a trivial task if done well.
MySQL::Log::ParseFilter handles all the heavy log chopping, hacking and filtering, allowing a script to simply extract the data that it wants.
Functions
The following functions are exported by default:
set_meta_filter($filter)
-
Set meta-property filter to
$filter
.$filter
is a scalar containing a single string of meta-property filter conditions. See "Meta-Property".Returns nothing.
set_statement_filter($filter)
-
Set SQL statement filter to
$filter
.$filter
is a scalar containing a single string of allowed or disallowed SQL statement types. See "SQL Statement".Returns nothing.
set_grep($pattern)
-
Set the grep pattern against which SQL statements must match. (This could be called the "grep filter.")
$pattern
is a scalar containing a single Perl regex pattern without m// or similar. For example:"^SELECT foo FROM (?:this|that)"
.Returns nothing.
parse_binary_logs(%params)
-
Parse output of mysqlbinlog.
%params
may contain the following key/values:logs
=> ref to array having log file names to parse (REQUIRED)queries
=> ref to hash in which to save unique queries (REQUIRED)all_queries
=> ref to array in which to save all queries (optional)NOTE: MySQL binary logs are, as the name suggests, binary--they are not text files. It is necessary to first "decode" a binary log with the MySQL-provided program mysqlbinlog. The log files given to
parse_binary_logs()
must be the text output from mysqlbinlog ran first on the binary log files (without the --short-form option).Returns nothing.
parse_general_logs(%params)
-
Parse MySQL general logs.
%params
may contain the following key/values:logs
=> ref to array having log file names to parse (REQUIRED)queries
=> ref to hash in which to save unique queries (REQUIRED)all_queries
=> ref to array in which to save all queries (optional)users
=> ref to hash in which to save unique users (optional)Returns nothing.
parse_slow_logs(%params)
-
Parse MySQL slow and microslow logs.
%params
may contain the following key/values:logs
=> ref to array having log file names to parse (REQUIRED)queries
=> ref to hash in which to save unique queries (REQUIRED)all_queries
=> ref to array in which to save all queries (optional)users
=> ref to hash in which to save unique users (optional)microslow
=> 0 = regular slow log, 1 = microslow log (optional)0 is default for
microslow
.Returns nothing.
set_udl_format($format_file)
-
Set the user-defined log format defined in
$format_file
.$format_file
is a scalar containing a single file name. See "USER-DEFINED LOGS".This function should be called before calling
parse_udl_logs()
.Returns nothing.
parse_udl_logs(%params)
-
Parse user-defined logs.
%params
may contain the following key/values:logs
=> ref to array having log file names to parse (REQUIRED)queries
=> ref to hash in which to save unique queries (REQUIRED)all_queries
=> ref to array in which to save all queries (optional)users
=> ref to hash in which to save unique users (optional)Returns nothing.
See "USER-DEFINED LOGS".
calc_final_values(%params, $grand_totals)
-
Calculate final meta-property and grand total values after parsing logs. This function should only be called after calling one of the parse_ functions.
This function calculates: number of unique queries, c_sum_p, averages, grand total sums, percent true for true/false (or yes/no) meta-properties, per-meta-property percentages of grand total sums, per-user percentages of all users, and number of unique users.
%params
is the same hashed passed earlier to one of the parse_ functions.$grand_totals
is a reference to a hash in which to save the grand total sums. Pass 0 orundef
if you do not want grand total sums.Returns total number of queries.
apply_final_meta_filters(%params, $total_queries)
-
Apply final meta-propertry filters for min, max, average, percent and sum meta-property values. This function should be called after calling
calc_final_values()
.%params
is the same hashed passed earlier tocalc_final_values()
or to one of the parse_ functions if you did not calculate final values for some reason.$total_queries
is a reference to a scalar having the total number of queries. Usually, this is obtained from the return value ofcalc_final_values()
.$total_queries
will be adjusted to account for queries which were removed by a filter. Pass 0 orundef
if you do not want an adjusted total queries.Returns total number of queries removed.
META-PROPERTIES
Every SQL statement has many meta-properties. These are values about the SQL statement such as its execution time, how many rows it examines, the MySQL connection ID it is associated with, etc. It is by these values that a MySQL log is filtered (and usually sorted).
Every type of MySQL log provides different meta-properties. The list of all meta-properties is very long so only the basics are given here. For the full list visit http://hackmysql.com/mysqlsla_filters.
But first, it is important to understand the naming scheme that MySQL::Log::ParseFilter uses for meta-properties.
Naming Scheme
Meta-properties are either numeric or string. For strings, the naming scheme does not change: db
is always just db
. For numeric values, however, several additional meta-properties are created and identified by consistent extensions to the base meta-property name.
Take for example t
from slow logs. In addition to this base meta-property, MySQL::Log::ParseFilter also creates: t_min
, t_max
, t_avg
, t_sum
and t_sum_p
(unless set_save_meta_values()
was disabled; see "OPTIONS").
These additional meta-properties are identified by their extensions: _min
, _max
, _avg
, _sum
, _sum_p
. These extensions are consistent and form the naming scheme for most numeric meta-properties. (cid
is a notable exception.) They tell you as well as MySQL::Log::ParseFilter what the additional meta-property value represents: the minimum, maximum, average and sum value of their base meta-property.
_sum_p
means percentage that the base meta-property constitutes of the grand total sum for all those base meta-properties (if grand total sum were calculated when calling calc_final_values()
).
There is another extension for true/false (yes/no) meta-properties: _t
and _t_p
. Currently, this type of meta-property is only found in microslow logs.
This naming scheme is very important when working with user-defined logs because it allows you to know in advance the names the of additional meta-properties that MySQL::Log::ParseFilter will create from the given bases meta-properties.
Slow Logs
c_sum
: Total number of times SQL statement appears in log
host
: Host name of MySQL connection
ip
: IP address of MySQL connection
l
: Time spent acquiring lock
l_sum
: Total time spent acquiring lock
re
: Number of rows examined
rs
: Number of rows sent
t
: Execution time
t_sum
: Total execution time
user
: User of MySQL connection
General Logs
c_sum
: Total number of times SQL statement appears in log
cid
: Connection ID of MySQL connection
host
: Host name of MySQL connection
user
: User of MySQL connection
Binary Logs
c_sum
: Total number of times SQL statement appears in log
cid
: Connection ID of MySQL connection
ext
: Execution time
err
: Error code (if any) caused by SQL statement
sid
: Server ID of MySQL server
SETTING FILTERS
All filters are inclusive: every condition for every filter must pass for the statement to be saved.
Meta-Property
The format of $filter
when calling set_meta_filter($filter)
is [CONDITION],[CONDITION]...
where each [CONDITION]
is [meta][op][val]
.
[meta]
is a meta-property name (listed above or from the full list at http://hackmysql.com/mysqlsla_filters). [op]
is either >, <, or =. And [val]
is the value against which [meta]
from the log must pass according to [op]
. [val]
is numeric or string according to [meta]
. For string values, only = is valid for [op]
.
SQL Statement
The format of $filter
when calling set_statement_filter($filter)
is [+-][TYPE],[TYPE]...
.
[+-] is give only once at the first start of the filter string. + means that the filter is positive: allow only the given [TYPE]
s. - means that the filter is negative: remove the given [TYPE]
s. [TYPE]
is a SQL statement type: SELECT, UPDATE, INSERT, DO, SET, CREATE, DROP, ALTER, etc.
EXAMPLES
Parsing General Logs
# Parse general logs given on command line extracting only SELECT queries
# using database foo and calculate grand total sums
my %queries;
my %grand_totals;
my %params = (
logs => \@ARGV,
queries => \%queries,
);
set_meta_filter("db=foo");
set_statement_filter("+SELECT");
parse_general_logs(%params);
calc_final_values(%params, \%grand_totals);
Parsing Slow Logs
# Parse slow logs given on command line removing SET statements and
# extracting only queries which took longer than 5 seconds to execute
my %queries;
my %params = (
logs => \@ARGV,
queries => \%queries,
);
set_meta_filter("t>5");
set_statement_filter("-SET");
parse_slow_logs(%params);
calc_final_values(%params, 0);
Parsing Binary Logs
# Parse output files from mysqlbinlog given on command line extracting
# only INSERT and UPDATE queries which account for more than 75% of all
# INSERT and UPDATE queries extracted
my %queries;
my %grand_totals;
my %params = (
logs => \@ARGV,
queries => \%queries,
);
set_meta_filter("c_sum_p>75");
set_statement_filter("+INSERT,UPDATE");
parse_binary_logs(%params);
calc_final_values(%params, \%grand_totals);
apply_final_meta_filters(%params, 0);
Complete Mini-Script (dump_type)
#!/usr/bin/perl -w
use strict;
use MySQL::Log::ParseFilter;
my %queries;
if(@ARGV != 2) {
print "dump_type dumps a unique sample of all statements of TYPE from general LOG.\n";
print "Usage: dump_type TYPE LOG\n";
exit;
}
set_statement_filter("+$ARGV[0]");
parse_general_logs( (logs => [ $ARGV[1] ], queries => \%queries) );
foreach(keys %queries) { print "$queries{$_}->{sample}\n"; }
exit;
OPTIONS
MySQL::Log::ParseFilter has six functions to set special options which can be imported with the :options
tag (use MySQL::Log::ParseFilter qw(:DEFAULT :options)
).
set_save_meta_values($val)
-
Save extra meta-property values. Default 1 (enabled). Can be set to 0 (disabled) which will result in only the following meta-properties being saved:
sample
,db
,cid
.Any meta-property value not check in
apply_final_meta_filters()
can still be used (t
,l
,host
,cid
, etc.) set_save_all_values($val)
-
Save "all values": arrays of every single value for certain meta-properties (
meta_all
). Default 0 (disabled). Can be set to 1 (enabled). This does not affect user-defined logs which has a seperate mechanism for saving all values (type nf).At present, enabling this option causes the following all values to be saved: for microslow (msl) patched slow log with InnoDB values:
iorops_all
,iorbytes_all
,iorwait_all
,reclwait_all
,qwait_all
,pages_all
; for binary logs:ext_all
. set_IN_abstraction($val)
-
Abstract IN () clauses further by grouping in groups of $val. Default 0 (disabled).
This is an experimental option. Normally, all IN clauses are condensed from
IN (N, N, N)
toIN (N3)
. This option furthers this abstraction by grouping the condensed IN clauses in groups of $val where $val is the "dividing line."Example: with $val=10
IN (N3)
becomesIN (N0-9)
. Therefore, any IN clause with 0 to 9 values will be condensed and then further abstracted toIN (N0-9)
. Likewise, any IN clauses with 10 to 19 values will be condensed and further abstracted toIN (10-19)
. set_VALUES_abstraction($val)
-
Abstract VALUES () clauses further by removing the number of condensed value sets. Default 0 (disabled). Can be set to 1 (enabled).
This is an experimental option. Normally, all VALUES clauses are condensed from
VALUES (NULL, 'foo'), (NULL, 'bar')
toVALUES (NULL, 'S')2
. This option furthers this abstractiong by removing that number of condensed value sets: 2.Example: two queries
INSERT INTO table VALUES ('S')
andINSERT INTO table VALUES ('S'), ('S'), ('S')
are first condensed toINSERT ... VALUES ('S')1
andINSERT ... VALUES ('S')3
then further abstracted to one single query:INSERT INTO table VALUES ('S')
. set_atomic_statements($val)
-
Treat multi-statement groups atomically when filtering. Default 0 (disabled). Can be set to 1 (enabled).
This is an experimental option. Normally, each statement in a multi-statement group is filtered individually: only those which fail a filter are removed and those which pass are kept. With this option enabled, if any one statement in a group fails, the entire group of statements is removed.
This option does not apply to general logs because general logs never group statements.
set_db_inheritance($val)
-
Allow queries to inherit the last database specified in the log. Default 0 (disabled). Can be set to 1 (enabled).
Normally, the log must explicitly specify the database for each statement. Or, in the case of general logs, the current database is tracked by other means. Sometimes, however, logs only specify the database explicitly once. If this option is enabled, all statements following an explicit database specification inherit that database.
USER-DEFINED LOGS
MySQL::Log::ParseFilter can parse user-defined logs which have variable headers and meta-property values. Accomplishing this is not a trivial task. Therefore the subject is not covered here but at http://hackmysql.com/udl.
HACKS
The following four functions can be imported with the :hacks
tag.
get_meta_filter()
-
Returns hash ref to internal meta filter hash which is structured:
meta => [ op, value ]
. get_statement_filter()
-
Returns hash ref to internal statement filter hash which is structured:
type => 0
. Also haspos_neg => 1 (positive) or 0 (negative)
. passes_meta_filter($meta, $val, $type)
-
$meta
is a meta-property name.$val
is the log value.$type
is 'n' (numeric) or 's' (string). Returns 1 on pass, 0 on fail. passes_statement_filter($type)
-
$type
is a SQL statement type (SELECT, CREATE, DROP, etc.), case insensitive. Returns 1 on pass, 0 on fail.
DEBUGGING
Calling MySQL::Log::ParseFilter::set_debug(1)
will enable debugging and cause MySQL::Log::ParseFilter to print a flood of debugging information to STDOUT. This may be necessary if you feel a function is not working correctly because, although they do not return errors, they print debugging messages.
BUGS
There are no known bugs. Please contact me if you find one. Expect that I will ask for at least a portion of your log because that makes finding and fixing the bug easier.
AUTHOR
Daniel Nichter <perl@hackmysql.com>
http://hackmysql.com/
SEE ALSO
- http://hackmysql.com/mlp
-
MySQL::Log::ParseFilter home page
- http://hackmysql.com/udl
-
Document describing how to make user-defined logs
- http://hackmysql.com/mysqlsla
-
mysqlsla uses every part of MySQL::Log::ParseFilter to analyze, sort and report data from MySQL logs. To study MySQL::Log::ParseFilter in all its glory, study mysqlsla. In fact, MySQL::Log::ParseFilter was born from mysqlsla.
- http://hackmysql.com/microsecond_slow_logs
-
Document summarizing microsecond resolution support for MySQL slow logs
- http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html
-
Official MySQL slow query log documentation.
- http://dev.mysql.com/doc/refman/5.0/en/query-log.html
-
Official MySQL general query log documentation.
- http://dev.mysql.com/doc/refman/5.0/en/binary-log.html
-
Official MySQL binary query log documentation.
- http://dev.mysql.com/doc/refman/5.0/en/mysqlbinlog.html
-
Official MySQL documentation for mysqlbinlog
VERSION
v1.00
COPYRIGHT AND LICENSE
Copyright 2008 Daniel Nichter
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.