Revision history for Hailo
0.33 2010-03-20 01:57:33
- Optimize Hailo::Engine::Default to use fewer method calls. On
t/hailo/real_workload.t (i.e. mass replies) this speeds up Hailo
by 8%:
               s/iter  System Hailo  lib Hailo
System Hailo     74.8            --        -7%
lib Hailo        69.4            8%         --
Furthermore replace the use of ->fetchall_hashref in a tight
loop with ->fetchall_arrayref. This sped up mass replies by
almost 60% (added to the 8% above):
               s/iter  System Hailo  lib Hailo
System Hailo     68.2            --       -36%
lib Hailo        43.6           57%         --
But aside from selective benchmarking this made Hailo around 5%
faster in the common case:
               s/iter  System Hailo  lib Hailo
System Hailo     21.5            --        -6%
lib Hailo        20.3            6%         --
0.32 2010-03-19 12:00:22
- t/storage/dbd-options.t wasn't updated to take into account the
renaming of modules done in 0.31. It would fail on machines that
didn't have an older version of Hailo installed when running
`make test'.
- t/hailo/non_standard_plugin.t whines with `Issuing rollback()
due to DESTROY without explicit disconnect()' on some systems
since it doesn't use the Hailo::Test framework.
Issuing rollbacks at the right time is an open issue with
Hailo. I haven't been able to make it do the right thing by
sprinkling around destructors in the main code; that'll cause
things to be destroyed prematurely (probably some silly race
condition).
- Re-add Data::Section dependency. We need it for the
Words-utf8-text.t test.
0.31 2010-03-18 21:45:25
- Optimization and cleanup release. Hailo is now much much
snappier and eats less memory. Here's how long it takes to run
the test suite under 0.30 and 0.31:
             s/iter  0.30 Hailo  0.31 Hailo
0.30 Hailo     20.2          --        -16%
0.31 Hailo     16.9         19%          --
- Split out Hailo::Storage::* into Hailo::Engine::* and
Hailo::Storage::*. This makes it possible to write pluggable
engines again (that ability was removed in 0.10). It's the
intent to write an XS version of the Default engine to make Hailo
even faster.
- In addition the storage backends have been moved
around. Hailo::Storage::DBD is now just Hailo::Storage and
DBD::Pg, DBD::mysql and DBD::SQLite are now directly under the
Hailo::Storage namespace as Hailo::Storage::PostgreSQL,
Hailo::Storage::MySQL and Hailo::Storage::SQLite.
For now "Pg" and "mysql" as short names for the storage backends
are supported for backwards compatibility but this support may
be removed in a future release.
- Rather than use the ad-hoc Data::Section + Template::Toolkit way
of generating our SQL just use an ugly pure-perl-based class.
Hailo now uses ~7.2MB of memory when starting up & replying
rather than ~10MB as it did before. The startup time is also
reduced from around 250ms to 140ms.
See http://blogs.perl.org/users/aevar_arnfjor_bjarmason/2010/03/benchmarking-dbixclass-vs-plain-dbi-on-hailo.html
for some of the other things that I tried before settling
on this hack.
- Don't manually use SQLite's `SELECT last_insert_rowid()' or
PostgreSQL's `INSERT ... RETURNING' in the engine. Instead use
DBI's `last_insert_id()' which uses those two automatically.
- Ditch Module::Pluggable: Hailo now can only load one of its
hardcoded core modules as a plugin or alternatively a foreign
module if it's prefixed with + before the module name. See
Hailo's main documentation for more info.
- Fix incorrect SYNOPSIS examples in the documentation for the
PostgreSQL, SQLite and MySQL backends.
0.30 2010-03-15 15:18:01
- Don't set EXLOCK on temporary files we create. This completely
broke Hailo tests on platforms like FreeBSD which aren't as
promiscuous as Linux about file locking.
- Use Dir::Self in hailo/Hailo::Command to work around the 0.29
bug in t/command/shell.t on some platforms like FreeBSD where
IPC::Run3 calling a script that called FindBin didn't work
as expected.
- Add more testing including a really basic test for DBIx::Class
debugging (from the dbix-class branch) and making TAP output
more verbose.
- Run all the tests Hailo::Test runs internally for each engine
one-by-one using the DBD::SQLite memory driver. This makes sure
the internal tests don't depend on each other in odd ways.
0.29 2010-03-13 10:32:43
- Remove Data::Random as a dependency. It fails the most tests of
all the dists we depend on and we don't really need it for
anything.
0.28 2010-03-13 10:05:57
- Update README.pod which hadn't been bumped since 0.25
- Fix example in Hailo.pm's SYNOPSIS that didn't work and add an
example for a bare ->reply().
- Fix some code perlcritic whined about.
0.27 2010-03-13 09:41:46
- Stop depending on Term::ReadLine::Gnu and use Term::ReadLine
instead. I tested Term::ReadLine once and found that it was
really bad (no history, C-p, C-n etc.) but now with
PERL_RL='Perl o=0' everything's magically awesome in it.
Term::ReadLine::Gnu was the #1 cause of our test failures so
it's good not to depend on it.
Also only set PERL_RL if it isn't set already.
0.26 2010-03-13 08:04:32
- Split the X::Getopt parts of Hailo into Hailo::Command. This way
the speed / memory penalty of loading all the command-line
related modules is only applicable if running the command-line
interface. Using Hailo takes 1MB less memory now and loads a
total of 56 modules instead of 74.
- Due to the split it was possible to rename the `brain_resource'
attribute to `brain'. The former is still provided for backwards
compatibility.
- A lot of miscellaneous cleanups in the code made possible by
splitting the core of Hailo from the command line UI.
- DEMOLISH was broken. It would build storage objects during
global destruction if they didn't exist.
- Add --examples switch to be used as --help --examples, examples
are now not part of --help by default since they took up most of
the terminal & obscured the option help output.
- A lot has been changed in the test suite. Below is a partial
summary:
- Test the ->run method in Hailo::Command completely. Previously
only a subset of its functionality was tested. The only thing
that isn't tested completely is the invocation of
Hailo::UI::ReadLine via ->run.
- Completely test the ->train and ->learn methods and make
->learn() die on unknown input like HashRefs.
- Test the --help output.
0.25 2010-03-12 17:45:42
- Improved documentation of the Tokenizer role and the DBD class
- Added more tests for the ReadLine UI
- Always run the Test::Script::Run tests
0.24 2010-03-12 01:38:56
- Repository metadata was wrong due to RT#55136 in
Dist::Zilla::Plugin::Repository
- Add some very exhaustive tests for the storage engine. This
brings our test coverage up to 94.1% up from 92.5%. The tests
aren't run by default due to the time they take.
- Capitalize the first word of /^but...no/
0.23 2010-03-11 20:08:27
- Increase test coverage, coverage is now up to 92.5%
- Random reply tests were disabled for MySQL for no
reason. They're now enabled.
- Rewording the Hailo UPGRADE section
- Re-arrange the Storage::DBD* code to be more Moosy and use roles
as they should be used
- Remove dead test code in Hailo::Test that was used for flat hash
backends that couldn't generate random replies
- Test the ->ready() storage method on all backends as part of
Hailo::Test
- Test Hailo::stats() on all backends as part of Hailo::Test
- Test the bin/hailo script directly if Test::Script::Run is
available.
0.22 2010-03-10 08:46:54
- A bug in Dist::Zilla ruined 0.21. The unpacked tarball contained
home/avar/g/hailo/Hailo-0.21 instead of just Hailo-0.21 at the top
level.
0.21 2010-03-09 18:25:46
- Word tokenizer: Various improvements to capitalization
- Use Sys::Prctl to set legacy process name under Linux
- Added documentation about upgrading Hailo
- Hailo now uses Dist::Zilla instead of Module::Install to build
releases
0.20 Sun Feb 28 00:29:32 GMT 2010
- Use Mouse instead of Moose by default but depend on both of
them.
This saves about 8MB of resident memory bringing Hailo's memory
usage with SQLite from 28MB to 20MB. Startup time is also
reduced with the test suite running around 46% faster.
- Run tests with Moose and Mouse during `make test`
- Drop MouseX::Role::Strict / MooseX::Role::Strict. Spotted when
switching to Mouse but mainly we just didn't care about using
it.
- Word tokenizer: Improve punctuation when words are split with '/'
0.19 Sat Feb 27 04:23:03 GMT 2010
- Move File::Slurp from 'requires' to 'test_requires'
- Make the default pragma logic a bit simpler
- MySQL backend: Don't make host a required storage_args argument,
MySQL will use localhost by default.
- MySQL backend: Document collation settings that have to be right
for Hailo not to blow up.
- Fix some capitalization/punctuation issues of words with dashes/quotes
- A new hailo-benchmark-replies utility and documentation in
Hailo.pm about its results.
0.18 Fri Feb 26 05:02:17 GMT 2010
- Don't keep the brain in memory by default anymore, but enable some
safety-sacrificing performance optimizations instead
- DBD::SQLite backend: It's now possible to set any PRAGMA SQLite
supports at the start of the connection by supplying C<pragma_*>
parameters in C<storage_args>. See
Hailo::Storage::DBD::SQLite documentation for more info.
- Issue #28: Implement a ready() method for backends. This
un-breaks the command-line interface with non-SQLite backends.
- Word tokenizer: Fix capitalization of the first word in some cases
- Add more exhaustive tests for the Word tokenizer.
- Un-break t/storage/DBD-mysql.t test
- Never test PostgreSQL / MySQL unless explicitly told to do so.
0.17 Tue Feb 23 04:06:50 GMT 2010
- Remove all storage engines that weren't DBD::*. I.e. the Perl
backend and the flat Perl::Flat & CHI::* backends.
These backends were added to experiment with alternate backends,
but between them they had no redeeming quality aside from
increasing our number of backends & tests. The downside is that
we constantly had to deal with errors in these backends that
weren't present in our primary DBD::* targets.
- Remove Log::Log4perl. We weren't using it for anything except
printing one log line. Maybe we'll add it in the future with
proper support. See Issue #15.
- Hailo now uses less memory by lazy-loading various modules that
it previously loaded even if they were redundant.
- Issue #12: --not-a-valid-option now prints the same help output
--help would. MooseX::Getopt::Basic is evil and hard to override
so this is done with some hackery.
- Renamed Hailo::Storage::Mixin::DBD to Hailo::Storage::DBD, since
mixin is really a misnomer for this base class.
- Improved documentation including a new SYNOPSIS with examples.
- Error on training with an undef argument.
- t/storage/dbd-switch-order.t didn't clean up the tempfile it was
using.
- More capitalization improvements.
0.16 Mon Feb 22 17:08:46 GMT 2010
- Don't seed a reply with a token which is too rare
- Make the Word tokenizer split "example.com" into 3 tokens, while still
keeping "3.14" as one token. Also accept ',' as a decimal point.
- Various improvements to capitalization in the Word tokenizer
- Don't run the ReadLine UI if --stats is supplied
- Allow keeping the entire SQLite database in memory while running
- Make that the default behavior to reduce IO
- Declare undeclared File::Slurp dependency
- Declare undeclared Test::Script dependency
- Fixed utf8 problems with ReadLine UI
- Optimize the SQL schema a bit, which shaves about 10% off the size of
the DB and cuts more than half of the time needed to generate a reply
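The Word tokenizer change in this release (split "example.com" into 3 tokens,
keep "3.14" as one, accept ',' as a decimal point) can be illustrated with a
small sketch. This is not the code in Hailo::Tokenizer::Words; the tokenize()
function and its regex are hypothetical and only reproduce the behavior
described in the entry above.

```perl
use strict;
use warnings;

# Hypothetical sketch, not Hailo's actual tokenizer.
sub tokenize {
    my ($text) = @_;
    # In list context, m//g returns every captured token and skips
    # anything that matches no alternative (here: whitespace).
    return $text =~ /
        ( \d+ (?: [.,] \d+ )?   # a number, with '.' or ',' as decimal point
        | \w+                   # a run of word characters
        | [^\w\s]               # any single punctuation character
        )
    /gx;
}

# prints: example|.|com|is|3.14|or|3,14
print join('|', tokenize('example.com is 3.14 or 3,14')), "\n";
```

Trying the number alternative first is what keeps "3.14" together, while the
dot in "example.com" falls through to the punctuation alternative.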
0.15 Thu Feb 18 23:55:19 GMT 2010
- Allow specifying SQLite's cache size with --storage-args
- Reduce likelihood of returning a reply which is identical to the input
- Instead of depending on version "0" of most modules (i.e. any
version) depend on the latest CPAN version. We know this works;
but we have no idea if the older modules work.
- Changed the Word tokenizer so that it doesn't return whitespace tokens,
instead returning a flag which dictates the whitespace policy of the
token in question. Changed the default Markov order to 2 to compensate.
- Fixed a bug with the DBD::* backends not retrieving the Markov order
from an existing database
- Removed Text::Trim dependency due to it not being acceptable for Debian
- Learn from the input when using the ReadLine UI
- Removed Test::Exit hard dependency for tests: Not in Debian yet
- Added a --stats option to print some statistics about the brain
0.14 Sat Feb 13 17:07:30 GMT 2010
- Fixed a bug in the test suite preventing optional backends from
being tested
- All tests now use Hailo::Test, consequently backends now go
through much more thorough testing. Lots of other test-related
changes since 0.13.
- utils/hailo-benchmark: Rewritten to use Hailo::Test, the
benchmark is now more representative of actual Hailo usage.
- Hailo->learn() can now take an arrayref
0.13 Sat Feb 13 09:19:52 GMT 2010
- Add missing .trn files to the test suite
- Fix number of tests in t/storage/all.t
0.12 Sat Feb 13 08:55:25 GMT 2010
- If asked for a reply before we've learned anything, return nothing
instead of spewing warnings
- Issue #19: Ability to ->train() from filehandle as well as from
a file
- Re-enable t/bug/tokens-repeat.t test disabled in
cc189bd7a2dc56561c71868f061307ee5068f904
- When replying to some input, pay more attention to rare tokens
- Hailo::Storage::Mixin::Hash would inevitably die due to not importing uniq()
- Allow Hailo->train() to take an arrayref, filename, or filehandle argument
0.11 Fri Feb 12 09:44:13 GMT 2010
- Corrected outdated documentation in some places
- Fixed a problem with the SQLite backend not reading some information
from an existing brain if reply() is called first
- Fix --reply option, its argument was being ignored
0.10 Fri Feb 12 02:31:34 GMT 2010
- Normalized the SQL schema some more. This breaks compatibility with old
brains of course, but training/learning is quite a bit faster now.
- Removed Hailo::Engine and moved most of its logic into the storage
backends
- Fixed module loader picking Perl::Flat when Perl was requested
- Always return a reply, even when input tokens are unknown or missing
0.09 Thu Feb 11 02:36:49 GMT 2010
- Disable SQLite's journal while training. Speeds up long imports.
- Add Perl::Flat backend which keeps things in a simple key-value
hash where key and value are both Str. It can be subclassed to
store data in e.g. BerkeleyDB, Cache or other key-value
backends.
- Add CHI backend with File, Memory, BerkeleyDB etc. backends
- Use MooseX::Role::Strict instead of Moose::Role
- Use Log::Log4perl for logging
- SQLite broke if using a :memory: brain if a :memory: file existed
- Use Module::Pluggable for finding plugins
0.08 Wed Feb 10 00:06:20 GMT 2010
- 0.07 broke the PostgreSQL and MySQL backend. Fixed them.
- Made it less likely that non-SQLite backends will be broken in
the future by moving the DB-specific SQL out of Pg.pm and
mysql.pm into macros in SQL.pm
- Use of $. in Hailo.pm broke file-based backends such as Cache.pm
- Make MySQL docs copy-pasteable
- Add a benchmark script as utils/hailo-benchmark
0.07 Tue Feb 9 15:23:44 GMT 2010
- Note: The storage backends for this release have been changed in such
a way that they are incompatible with brains created by older releases
- Add missing dependencies on Test::Script/MX::Getopt::Dashes
- The Words tokenizer now compresses whitespace when tokenizing as
well as whitespace-trimming the output it produces
- Make start/end expressions only start/end sentences most of the time
instead of all the time
- Issue #13: `hailo -b brain' will launch an interactive ReadLine
terminal
- Don't exit() on print_version=> in run(), just return()
- Add $VERSION to all .pm files
- Use namespace::clean everywhere
0.06 Sat Jan 30 19:21:28 GMT 2010
- Construct SQL's dbd_options with lazy_build, not default. This
makes it easy to add additional options in the individual
storage engines.
- Remove some dead code in Hailo::Storage::Perl
- Explicitly disconnect sqlite's dbh / sth handles. This should
fix some cpantesters FAILs we're getting which print "database
is locked" errors.
0.05 Sat Jan 30 13:55:18 GMT 2010
- Shuffle key tokens and don't reuse them. Should make for more random
replies.
- Check for definedness of $self->brain in Hailo::Storage::*
- Use autodie to catch open/close errors
- Hailo->learn() was broken when print_progress was false
- Add tests for Hailo invocation
- Use MooseX::StrictConstructor
0.04 Fri Jan 29 17:48:49 GMT 2010
- You know that bug we talked about being fixed in 0.03? It was
still there. Now it's actually fixed.
- Use Class::MOP::load_class() instead of eval { require $str } to load plugins
- Depend on Perl 5.10
- Added MySQL storage backend, don't use it.
0.03 Fri Jan 29 14:37:17 GMT 2010
- Fixed a fatal error in Hailo::Engine::Default that would
inevitably occur on any large brain. When Hailo was given
repeating input such as [ qw(badger ! badger !) ] where
the probability of all the given tokens following each other was
100% (i.e. there's nothing to break the loop) it would start
generating infinitely long replies.
This was fixed by adding a guard clause in Hailo::Engine::Default
which breaks the loop if we're up to C<$order * 10> tokens and the
number of unique tokens in the reply is less than the model
C<$order>.
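A minimal sketch of the kind of guard described above. It is not the actual
Hailo::Engine::Default code: reply_looks_stuck(), generate() and the
$next_token callback are hypothetical names, and the only details taken from
this entry are the C<$order * 10> length threshold and the comparison of the
unique token count against C<$order>.

```perl
use strict;
use warnings;

# Hypothetical helper: true once the reply has reached $order * 10 tokens
# and contains fewer unique tokens than the Markov order.
sub reply_looks_stuck {
    my ($order, @reply) = @_;
    return 0 if @reply < $order * 10;
    my %seen;
    $seen{$_}++ for @reply;
    return keys(%seen) < $order ? 1 : 0;
}

# Hypothetical generation loop. $next_token stands in for the engine's
# lookup of the next token; it returns undef when the reply should end.
sub generate {
    my ($order, $next_token, @reply) = @_;
    while (defined(my $token = $next_token->(@reply))) {
        push @reply, $token;
        last if reply_looks_stuck($order, @reply);
    }
    return @reply;
}

# A chain that alternates "badger" and "!" forever never ends on its own.
my $cycle = sub { my @r = @_; return $r[-1] eq 'badger' ? '!' : 'badger' };
my @reply = generate(3, $cycle, 'badger');
print scalar(@reply), " tokens\n";   # prints: 30 tokens
```

With an order of 3 the two-token cycle is cut off at 30 tokens instead of
growing without bound.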
0.02 Fri Jan 29 03:54:32 GMT 2010
- Fix typo in NAME in Hailo::Tokenizer::Words which caused the POD
not to be displayed on search.cpan.org
- Present options in --help output in reverse sort order
- Add facility to pass arguments to storage/engine/tokenizer from
the command line or via Hailo->new(). Make Hailo::Storage::Pg
use this facility for its database connection arguments.
- Fix spelling error in Hailo's POD
- --reply on the command line didn't work
0.01 Fri Jan 29 00:39:54 GMT 2010
- First CPAN release