RCS file: RCS/Similars.pm,v
Working file: Similars.pm
head: 1.27
branch:
locks: strict
access list:
symbolic names:
keyword substitution: kv
total revisions: 27; selected revisions: 27
description:
Possibly identical/similar file locator
----------------------------
revision 1.27
date: 2008/10/31 16:07:34; author: tong; state: Exp; lines: +3 -3
- cosmetic amend to module identification
----------------------------
revision 1.26
date: 2008/10/31 15:59:05; author: tong; state: Exp; lines: +4 -3
- cosmetic amend: show explicitly how the test is invoked.
----------------------------
revision 1.25
date: 2008/10/29 16:59:40; author: tong; state: Exp; lines: +4 -4
- update contact email
----------------------------
revision 1.24
date: 2008/10/29 16:47:22; author: tong; state: Exp; lines: +86 -37
- update test & output, therefore update pod as well.
----------------------------
revision 1.23
date: 2008/10/28 23:05:39; author: tong; state: Exp; lines: +5 -2
- version deprecated. Future versions are released as File::Find::Similars.
----------------------------
revision 1.22
date: 2006/12/25 23:59:33; author: tong; state: Exp; lines: +25 -3
- add process_stdin
- tested fine with:
    find . \( -type f -o -type l \) -follow -printf "%p\t%s\n" | fileSimilars.pl -
----------------------------
revision 1.21
date: 2006/12/25 22:29:19; author: tong; state: Exp; lines: +4 -4
- bug fix: $fc_level=0 (compare files between dirs) now works
----------------------------
revision 1.20
date: 2006/12/25 22:17:29; author: tong; state: Exp; lines: +6 -6
- s/process_files/process_entries/g
----------------------------
revision 1.19
date: 2006/12/25 22:08:29; author: tong; state: Exp; lines: +19 -20
- move sub process_files, nothing else
----------------------------
revision 1.18
date: 2003/08/13 04:30:46; author: tong; state: Exp; lines: +57 -20
- more accurate similarity check.
- fine tune config variables
- bug fix for negative fdsize
- bug fix for empty soundex array
- add 3rd array wash function, for language specific washing
- add Chinese support
----------------------------
revision 1.17
date: 2003/08/10 16:32:32; author: tong; state: Exp; lines: +6 -6
- document enhancement
- for public release v1.3
----------------------------
revision 1.16
date: 2003/08/10 16:25:51; author: tong; state: Exp; lines: +12 -2
- document enhancement
- for public release v1.3
----------------------------
revision 1.15
date: 2003/08/10 16:05:25; author: tong; state: Exp; lines: +100 -57
- quote the dir names
- new algorithm for similarity check
    p * soundex + (1-p) * fSize
- introduced hash variable %config
- move the pod to fileSimilars. Rewrite its own pod.
- bug fix: can now be invoked again without duplicate file entries
- add LICENSE
- for public release v1.3
----------------------------
revision 1.14
date: 2002/12/26 18:12:59; author: tong; state: Exp; lines: +53 -15
- minor bug fix
- mainly document enhancement
----------------------------
revision 1.13
date: 2002/12/26 01:52:16; author: tong; state: Exp; lines: +47 -19
- clean up debug messages
- quote the file names
- the most robust soundex handling so far; supports all cases, e.g. 'aa bb.cc'
- introduced two arrwash_ functions for customization
----------------------------
revision 1.12
date: 2002/12/25 23:24:04; author: tong; state: Exp; lines: +6 -4
- Chinese file name friendly. Previously failed with "Illegal division by zero" on Chinese file names.
----------------------------
revision 1.11
date: 2002/12/25 23:10:54; author: tong; state: Exp; lines: +19 -17
- bug fix, more robust. Fix 3 bugs for file names like aa.mp3, 1.mp3:
  * file extensions
  * decompose SuchKindOfWord
  * short file names
- clearer variable names and tech docs
----------------------------
revision 1.10
date: 2002/09/16 22:55:09; author: tong; state: Exp; lines: +10 -10
- change package name from File::Find::Similars to File::Searcher::Similars
----------------------------
revision 1.9
date: 2002/09/16 22:16:36; author: tong; state: Exp; lines: +2 -4
- remove dependency on fdispatch
----------------------------
revision 1.8
date: 2001/09/26 02:19:34; author: tong; state: Exp; lines: +6 -5
- explain columns
----------------------------
revision 1.7
date: 2001/09/26 01:53:27; author: tong; state: Exp; lines: +67 -151
- change to .pm module
----------------------------
revision 1.6
date: 2001/09/22 04:08:21; author: tong; state: Exp; lines: +41 -18
First public release
----------------------------
revision 1.5
date: 2001/09/21 01:38:55; author: tong; state: Exp; lines: +6 -5
- treat fileInfos[$ii] the same as fileInfos[$jj]
----------------------------
revision 1.4
date: 2001/09/21 01:33:17; author: tong; state: Exp; lines: +62 -37
- increase usability
- --level=1 checking
- more documentation
----------------------------
revision 1.3
date: 2001/09/20 01:18:01; author: tong; state: Exp; lines: +21 -14
- new diff algorithm, name difference is most significant now.
- output file by file size instead of unseen difference level
----------------------------
revision 1.2
date: 2001/09/20 00:07:33; author: tong; state: Exp; lines: +20 -11
All 1st release functionality ready.
----------------------------
revision 1.1
date: 2001/09/19 03:34:25; author: tong; state: Exp;
Initial revision
=============================================================================