##-*- Mode: Change-Log; coding: utf-8; -*-
##
## Change log for perl distribution DiaColloDB

v0.09.002 Tue, 26 Apr 2016 15:46:17 +0200 moocow
	* fixed comparison profile stringification for new pack()-encoded profiles,
	  regression from v0.09.001 "f2 bug" fix

v0.09.001 Tue, 26 Apr 2016 14:49:29 +0200 moocow
	* fixed double-counting f2 for multiple item1 targets with shared item2 collocates in Cofreqs::subprofile1() 1-pass mode
	* added auto-upgrade framework
	  - DiaColloDB::Upgrade - top-level API
	  - DiaColloDB::Upgrade::Base - subclass API & defaults
	  - added subclass ::v0_08_to_v0_09_multimap for v0.09.x multimap format change
	  - dcdb-upgrade.perl : top-level auto-upgrade script
	* added compatibility mode for multimaps as DiaColloDB::MultiMapFile::v0_08
	* fixed -nokeep option to dcdb-create.perl
	* TDF union: avoid storage of non-persistent object keys qw(docmeta wdmfile logas reusedir)
	* TDF union: fixed 'bus error' resulting from attempt to mmap() temporary data beyond EOF
	  - arose in dta+dwds trying to include 'pnd' metadata only indexed in dta
	  - temporary PackedFile tdf.d/mvals_pnd.pf had no entries for dwds data (pnd not indexed)
	  - readPdlFile(...,Dims=>[$NC]) choked with 'bus error'
	* Client::list overhaul
	  - new default fudge=>10 should be safe (but rather expensive)
	  - re-factored Client::list::profile() and compare() methods
	* improved Client and Client::list documentation
	  - added "incorrect independent collocate frequencies" section to Client::list documentation
	  - a milder form of this bug applies even to single native Cofreqs indices ("f2 bug", see above)
	* workaround for incorrect independent collocate frequency acquisition code in Cofreqs ("f2 bug")
	  - f2 were computed as marginals only over those (x1,x2,date) triples with f(x1,x2,date) > 0,
	    rather than over all (*,x2,date \in slice)
	  - results were in general underestimates of f2
	  - fix uses 2-pass acquisition strategy, ca. 10x slower for frequent targets (e.g. 'Mann')
	    ~ old subprofile() method refactored into subprofile1() and subprofile2()
	  - todo: possibly re-factor db structure to use tdf-style {tenum} rather than {xenum},
	    minimize group-key lookup & optimize for serial cofreqs dba2 file access
	  - added 'onepass' query option for fast, old, incorrect f2 frequency acquisition (Cofreqs only)
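	  - illustration of the "f2 bug" described above, as a toy sketch only: this is not
	    the Cofreqs code. All data and the names f_x2_date, cof, f2_onepass and f2_twopass
	    are invented, and the sketch assumes f2 is a sum of per-date collocate frequencies
	    over a date slice.

```python
from collections import defaultdict

# Hypothetical independent collocate frequencies f(x2, date).
f_x2_date = {("alt", 1900): 10, ("alt", 1901): 7, ("alt", 1902): 5}

# Hypothetical co-frequencies f(x1, x2, date); only nonzero triples are stored.
cof = {("Mann", "alt", 1900): 3, ("Mann", "alt", 1902): 1}


def f2_onepass(x1, date_slice):
    """Old behaviour: sum f(x2, date) only where f(x1, x2, date) > 0."""
    out = defaultdict(int)
    for (target, x2, date), f12 in cof.items():
        if target == x1 and date in date_slice and f12 > 0:
            out[x2] += f_x2_date[(x2, date)]
    return dict(out)


def f2_twopass(x1, date_slice):
    """Fixed behaviour: find the collocates first, then sum over all dates in the slice."""
    # Pass 1: collect every x2 that co-occurs with x1 somewhere in the slice.
    collocates = {
        x2
        for (target, x2, date), f12 in cof.items()
        if target == x1 and date in date_slice and f12 > 0
    }
    # Pass 2: marginalize over (*, x2, date in slice), ignoring x1 entirely.
    out = defaultdict(int)
    for (x2, date), f in f_x2_date.items():
        if x2 in collocates and date in date_slice:
            out[x2] += f
    return dict(out)


decade = range(1900, 1910)
onepass_result = f2_onepass("Mann", decade)
twopass_result = f2_twopass("Mann", decade)
```

	    In this toy data the one-pass sum skips 1901, where 'alt' occurs but never
	    together with 'Mann', so it underestimates f2 relative to the two-pass sum.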

v0.08.006 Thu, 10 Mar 2016 16:52:19 +0100 moocow
	* added dbexport() support for TDF relations
	* allow option pass-through for Profile::Multi::compile()
	* fixed utf8 handling in TDF::qinfo() query templates

v0.08.005 Mon, 07 Mar 2016 10:02:12 +0100 moocow
	* fixed pod =encoding typo in Profile.pod
	* added 'verbose' option to Profile::(Multi)Diff::saveHtmlFile
	  - include sub-profile frequencies in diff html output, used by www wrappers if 'debug' flag is set.
	* updated module-list and installation sketch in README

v0.08.004 Fri, 04 Mar 2016 13:25:20 +0100 moocow
	* remove temporary PDL headers created by DiaColloDB::PackedFile::toPdl(), used by TDF::union()
	* fixed buggy Profile::trim() call on undefined (empty) profiles in Profile::Diff::pretrim()
	* updated PODs for command-line utilities
	* updated & improved API module documentation

v0.08.003 Fri, 26 Feb 2016 15:14:43 +0100 moocow
	* added missing PODs to MANIFEST
	* added more DiaColloDB::Document subclasses:
	  - DiaColloDB::Document::JSON - raw JSON dump
	  - DiaColloDB::Document::TCF - CLARIN-D TCF (attributes {w,p,l} only; metadata from abused <source> element)
	  - DiaColloDB::Document::TEI - basic TEI-like XML (flexible but slow)

v0.08.002 Tue, 23 Feb 2016 10:51:02 +0100 moocow
	* added Document::DDCTabs options trimGenre, trimAuthor
	* added explicit PDL dependency in CONFIGURE_REQUIRES + PREREQ_PM: try to be cpantesters-friendly (see RT bug #112321)
	* added manual check for PDL in Makefile.PL: disable PDL-Utils/ subdir build if PDL isn't installed

v0.08.001 Fri, 29 Jan 2016 12:35:44 +0100 moocow
	* added co-occurrence profiles over (term x document) frequency matrix via DiaColloDB::Relation::TDF
	  - requires PDL, PDL::CCS, etc.: should be safe to omit, only loaded on demand
	* re-worked compile-time filtering; new options to dcdb-create.perl:
            -tfmin TFMIN : minimum global term frequency, regardless of DATE component (default=5)
            -lfmin LFMIN : minimum global lemma frequency (default=5)
	  - prunes enums too, which keeps them smaller and speeds up access

v0.07.015 Wed, 04 Nov 2015 14:18:20 +0100 moocow
	* added mi3 profiles a la Rychlý (2008)
	* report log-log-likelihood scores (extra log() for better scaling)
	* singularity checking for log-likelihood computations

v0.07.014 Tue, 03 Nov 2015 11:42:26 +0100 moocow
	* added 1-sided log-likelihood ratio profiles a la Evert (2008)

v0.07.013 2015-11-02 12:52:56 +0100 moocow
	* fix for Profile::empty(): a profile is empty if it contains no collocates, even if it has nonzero f1

v0.07.012 Wed, 28 Oct 2015 13:04:20 +0100 moocow
	* omit {pgood},{pbad} restrictions in Relation::qinfoData()
	  - these are too expensive for large corpora, resulting in timeouts for KWIC-links

v0.07.011 Tue, 29 Sep 2015 09:10:33 +0200 moocow
	* require perl >= v5.10.0 (for // operator)

v0.07.010 2015-09-24  moocow
	* moved DDC dependency and include to new CPAN-friendly DDC::Concordance
	* updated README
	* distcheck fixes
	* fixed fill/trim/alignment bug in ddc-diff ('fill' option wasn't being properly honored)

v0.07.009 2015-08-03  moocow
	* relation-wise dbinfo
	  - merged -r 15066:15067 diacollo-0.07.006+vsem into DiaColloDB.pm, DiaColloDB/Relation.pm

v0.07.008 2015-07-31  moocow
	* honor {xdmin},{xdmax} in DiaColloDB::xidsByDate()
	  - fixes 'cannot align non-trivial multi-profiles of unequal size' bug in corpora with bogus dates (e.g. zeitungen)
	* ignore Makefile.old

v0.07.007 2015-07-23  moocow
	* merged -r15021:15022 branch diacollo-0.07.006+vsem into Relation/DDC.pm
	  - fix for e.g. author-profiles
	* allow ddc queries without primary targets (=1), for 'subcorpus comparison'
	* merged -r 15013:15014 diacollo-0.07.006+vsem into DDC.pm
	  - fixes for pseudo-corpus comparison

v0.07.006 2015-07-20  moocow, tweaks
	* plots/*: pretty diff- and score-function plots
	* documented -diff option to dcdb-query.perl
	* Profile/Diff.pm pre-trimming tweaks, lavg fix
	* doc fixes; lf, lfm score-funcs
	* more diff documentation
	* added, documented -diff=OP option (adiff,diff,sum,min,max,avg,havg)

v0.07.005 2015-07-08  moocow
	* ddc groupby-request parsing tweak
	* groupby without token attributes
	* ddc tweak for groupby without a token field -- still not working (keys()-queries fail)

v0.07.004 2015-07-02  moocow
	* fixed bogus $DiaColloDB::MMCLASS = "DiaColloDB::MultiMapFile::MMap" (not yet written)
	* readme fixes
	* distribution, docs, readme, htmlifypods
	* fix mantis bug #804 : don't trim empty sub-profiles in diff mode

v0.07.003 2015-06-01  moocow
	* renamed 'local' profiling option to 'global' (for better web-wrapper transparency and defaults)

v0.07.002 2015-05-29  moocow
	* missing profile fix for diff (argh)
	* added misc/ddc-sample.txt: notes on #SAMPLE keyword
	* merged -r14464:HEAD diacollo-0.06+ddc into trunk

v0.05.001 2015-04-23  moocow
	* reverted trunk to current state of diacollo-0.05.001-pre-vsem branch
	* benchmark -iters for dcdb-query.perl
	* started trying to add DocClassify-based DSem to DiaColloDB: stuck on questions of modularity
	* 'logwhich' option: log multiple sub-classes

v0.05.001 2015-03-24  moocow
	* EnumFile fixes for missing keys
	* EnumFile::Tied : tied interface to EnumFile
	  - EnumFile and friends (except for FixedLen::MMap) now allow in-memory cache to override file contents for i2s(), s2i()

v0.05 2015-03-23  moocow
	* more verbose union messages
	* added wvi-doc2terms.perl: not very encouraging
	* woe is me: additive term-identities don't look kosher with word2vec
	* work on topic-doc matrix (WAY TOO BIG sentence-based model with k=200)
	* word2vec tweaks: a bit further along...
	* union tweaks
	* union() now uses temporary objects to map attribute indices (ai2u, xi2u)
	  - should improve memory usage a bit
	  - individual maps are still loaded to memory on a per-db basis
	    (at most 1 at any time) in Cofreqs::union and Unigrams::union
	* stricter request handling (die on unsupported attributes)
	* groupby and generic requests working via web-wrapper
	  - thought: should we model the query language on ddc (maybe even
	    use DDC::XS or similar) for max compatibility?
	* updated MANIFEST
	* parseRequest() for user queries working
	* added {maxExpand} option to kludge memory-hogging queries
	* factored out parseRequest() from groupby()
	  + TODO: implement generic target query using parseRequest() rather than named parameters
	* dbinfo for http (add url), list, file, http
	* dbinfo, timestamp, disk usage
	* remove MYMETA.yml from svn; ignore some other stuff
	* EnumFile: more fixes for perl 5.18.2
	* more groupby fixes
	* attrs/groupby hack for shared arrays
	* removed 'use bytes' pragmas almost everywhere
	  - deprecated in perl 5.18.2 (ubuntu 14.04.1 / kira)
	  - workaround is to use utf8::encode() and length(), on a temporary copy if needed
	* delete empty records for test-check-enum
	* added test-check-enum.perl
	* buggy diacollo : taz

v0.04 2015-03-09  moocow
	* 'having' filters, wip
	* adopt xdmin,xdmax for union
	* use lib qw(lib) for update-header
	* merged -r r14008:14041 branch diacollo-0.03+attrs into trunk : compile-time user-defined attributes

v0.03 2015-03-04  moocow
	* metadata parsing for Document/DDCTabs.pm
	* w2v test functionality now in w2v-compile.perl + w2v-query.perl
	* removed cofreqs debugging log stuff
	* utf8 parsing mode (improved filter regex matching)
	* removed generated Makefile from svn
	* tweaks for d* integration
	* added dump.mak from old Makefile r13904
	* export tweaks
	* cofreqs loading tweaks, timing
	* union tweaks and woes : seems basically working now
	* dump DiaColloDB::Persistent subclass files
	  - toArray(), fromArray() for PackedFile
	  - work-in-progress: DiaColloDB::union()
	* Client layer working and pretty much tested
	* dcdb-query.perl added to MANIFEST
	* added dcdb-query.perl : replaces dcdb-(profile|compare).perl
	* moved Client/Distributed.pm -> Client/list.pm

v0.02 2015-02-24  moocow
	* DiaColloDB/Client/Distributed.pm: error pass-through
	* distributed client stuff
	  - functionality is basically in place, but NOT CORRECT
	  - getting (fudge*k)-best items from sub-corpora wonks up the
	    results (e.g. 'gnädig' doesn't appear for Mann vs Frau in
	    distributed kern); other frequencies and scores are off too
	* Diff improvements: trimming via absolute value, add() support
	* utf8 tweaks
	* DiaColloDB::compare(): basically working ("diff" profiles)
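	* illustration of the (fudge*k)-best problem noted above for the distributed client,
	  as a toy sketch only: it merges raw frequencies rather than DiaColloDB scores, and
	  the data and the names corpus_a, corpus_b, merged_k_best and exact_k_best are invented.

```python
from collections import Counter

# Hypothetical per-sub-corpus collocate frequencies.
corpus_a = {"a": 10, "c": 9, "x": 1}
corpus_b = {"b": 10, "c": 9, "y": 1}


def merged_k_best(subcorpora, k, fudge):
    """Ask each sub-corpus for its (fudge*k)-best items only, then merge and re-rank."""
    total = Counter()
    for freqs in subcorpora:
        local_best = dict(Counter(freqs).most_common(fudge * k))
        total.update(local_best)  # adds counts item by item
    return total.most_common(k)


def exact_k_best(subcorpora, k):
    """Reference result: merge the complete frequency tables before ranking."""
    total = Counter()
    for freqs in subcorpora:
        total.update(freqs)
    return total.most_common(k)


exact = exact_k_best([corpus_a, corpus_b], k=1)
too_small = merged_k_best([corpus_a, corpus_b], k=1, fudge=1)
large_enough = merged_k_best([corpus_a, corpus_b], k=1, fudge=2)
```

	  Here "c" is only second-best in each sub-corpus but best overall, so it is lost
	  with fudge=1 and recovered with fudge=2. A larger fudge factor makes such misses
	  less likely at the cost of more data per sub-query, and an item returned by only
	  some sub-corpora would still get an underestimated merged frequency.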

v0.01 2015-02-20  moocow
	* initial version