NAME

File::Filename::Convention - test a filename against a file naming convention

SYNOPSIS

use File::Filename::Convention 'get_filename_hash';

my $filename = 'James Edward-2007.txt';
my $metadata = get_filename_hash($filename,[[qw(name year ext)]]);

for (keys %$metadata){
	print "$_ is $metadata->{$_}\n";
}

DESCRIPTION

get_filename_hash()

my $hash = get_filename_hash(
	$filename,
	['city','state'],
	{
		state => sub { qr/^MD$|^VA$|^NY$/ }, # only the states MD, VA, and NY are valid for us
	}
);

This returns a hash ref only if the filename consists of a city (any string) followed by one of the states MD, VA or NY.

The following filenames would match and return a hash ref:

Silver Spring-MD.txt
Waynesboro-VA.txt
Saratoga Springs-NY.txt

They would return:

{ city => 'Silver Spring', state => 'MD' },
{ city => 'Waynesboro', state => 'VA' },
{ city => 'Saratoga Springs', state => 'NY' },

The following filenames would NOT match:

Silver Spring.txt
VA-Waynesboro.txt
Saratoga Springs-NY-3.txt

For these filenames, get_filename_hash() returns undef instead of a hash ref.

EXAMPLE

Let's say that we have a file naming convention that has two kinds of files.

The first kind is a 'map' file. Its filename holds the date the file was created, the author, and a code that identifies the kind of file, which for these files is 'MAP'. For 'MAP' files the date always comes first, then the author, then the code, and finally the extension. So valid filenames would be:

20070731-John G Reggie-MAP.pdf
20070731.John G Reggie.MAP.pdf
20070731_John G Reggie_MAP.pdf
20070731#John G Reggie@MAP.pdf

The second kind is a 'layout' file. Its filename holds the author, a date, and a building code, and the code for this kind is 'LAY'. For 'LAY' files the author comes first, then the date, the building code, the file naming convention code, and finally the extension. So valid filenames would be:

John G Reggie-20070816-B34-LAY.pdf
John G Reggie.20070816.B34.LAY.pdf
John G Reggie_20070816_B34_LAY.pdf
John G Reggie#20070816@B34-LAY.pdf
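
The spellings listed above differ only in the separator between fields. The following is a minimal sketch of that idea, not the module's own code. It assumes the separators are exactly the characters - . _ # @ seen in the examples (the module's actual delimiter set is not stated here), and split_fields() is a hypothetical helper introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of File::Filename::Convention.
# Assumption: fields are separated by any one of - . _ # @
sub split_fields {
    my ($filename) = @_;
    return split /[-._#\@]/, $filename;
}

my @variants = (
    '20070731-John G Reggie-MAP.pdf',
    '20070731.John G Reggie.MAP.pdf',
    '20070731_John G Reggie_MAP.pdf',
    '20070731#John G Reggie@MAP.pdf',
);

# Every variant yields the same four fields: date, author, code, ext.
print join('|', split_fields($_)), "\n" for @variants;
```

Under this assumption a separator character inside a field, such as the hyphen in an author named 'Jean-Luc', would split that field in two, so a convention of this kind has to keep those characters out of the field values.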

Here's how we would enforce this file naming convention:

my $fields = [
	['date','author','code','ext'], # for MAP
	['author','date','building_code','code','ext'], # for LAY
	['date','ext'], # notes files
];

my $matchsubs = {
	date => sub { qr/^\d{6,8}$/ },
	code => sub { qr/^LAY$|^MAP$/ },
	building_code => sub { qr/^B\d+$/ },
};

Let's imagine we loaded a list of filenames:

my @filenames = (
	'20070731-John G Reggie-MAP.pdf',
	'20070731John G Reggie-MAP.pdf',
	'20070731-John G Reggie-MAP.pdf',
	'John G Reggie_20070816_B34_LAY.pdf',
	'John Jeff Notes 1.txt',
);

Now let's check those:

for (@filenames){
	my $hash = get_filename_hash($_, $fields, $matchsubs);

	# With Smart::Comments loaded, the next two lines dump the values.
	### $_
	### $hash
}

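As you will see, two files do not match our convention: '20070731John G Reggie-MAP.pdf' has no delimiter after the date, and 'John Jeff Notes 1.txt' has no date.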
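
The same check can be approximated without the module. The following is a rough sketch under stated assumptions, not the module's implementation: sketch_match() is a hypothetical stand-in, fields are assumed to be separated by any one of - . _ # @, and a filename is taken to match a layout when the number of fields agrees and every field that has a check passes it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; not the module's implementation.
# Assumption: any one of - . _ # @ separates fields.
sub sketch_match {
    my ($filename, $layouts, $checks) = @_;
    my @parts = split /[-._#\@]/, $filename;

    LAYOUT: for my $layout (@$layouts) {
        # A layout can only fit if the field counts agree.
        next LAYOUT unless @parts == @$layout;
        my %hash;
        for my $i (0 .. $#parts) {
            my $name  = $layout->[$i];
            my $check = $checks->{$name};
            # Each check is a sub that returns a compiled regex.
            next LAYOUT if $check && $parts[$i] !~ $check->();
            $hash{$name} = $parts[$i];
        }
        return \%hash;    # the first layout that fits wins
    }
    return;               # no layout fits
}

my $fields = [
    ['date', 'author', 'code', 'ext'],
    ['author', 'date', 'building_code', 'code', 'ext'],
    ['date', 'ext'],
];

my $matchsubs = {
    date          => sub { qr/^\d{6,8}$/ },
    code          => sub { qr/^LAY$|^MAP$/ },
    building_code => sub { qr/^B\d+$/ },
};

my @filenames = (
    '20070731-John G Reggie-MAP.pdf',
    '20070731John G Reggie-MAP.pdf',
    '20070731-John G Reggie-MAP.pdf',
    'John G Reggie_20070816_B34_LAY.pdf',
    'John Jeff Notes 1.txt',
);

for my $filename (@filenames) {
    my $hash = sketch_match($filename, $fields, $matchsubs);
    my $result = $hash
        ? join(', ', map { "$_=$hash->{$_}" } sort keys %$hash)
        : 'no match';
    print "$filename: $result\n";
}
```

In this sketch the second filename splits into only three fields, which fits no layout, and the last one fits the two-field layout by count but fails the date check.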

What if you want to require specific values for the code and the extension? Follow the field name with the value it must have:

my $fields = [
	['date','author','code' => 'MAP','ext' => 'pdf'], # for MAP
	['author','date','building_code','code' => 'LAY','ext' => 'pdf'], # for LAY
	['date','ext' => 'txt'], # notes files
];

These values are matched case-insensitively.
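
As an illustration of what a case-insensitive comparison against a required value means, here is a small sketch. literal_matches() is a hypothetical helper, not a function of the module, and how the module performs the comparison internally is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's API: compare a field value
# with a required literal, ignoring case.
sub literal_matches {
    my ($value, $literal) = @_;
    return lc($value) eq lc($literal);
}

# 'pdf', 'PDF' and 'Pdf' are accepted for a required 'pdf'; 'txt' is not.
for my $ext ('pdf', 'PDF', 'Pdf', 'txt') {
    my $verdict = literal_matches($ext, 'pdf') ? 'accepted' : 'rejected';
    print "$ext: $verdict\n";
}
```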

SEE ALSO

File::Name LEOCHARRE::DEBUG

REVISION

$Revision: 1.5 $

AUTHOR

Leo Charre leocharre at cpan dot org