NAME
FileSystem::LL::FAT - Perl extension for low-level access to FAT partitions
SYNOPSIS
use FileSystem::LL::FAT;
DESCRIPTION
MBR_2_partitions($sector)
($fields, @partitions) = MBR_2_partitions($sector) or die "Not an MBR";
Takes the first sector as a string and extracts the partition table and related information. Currently the only fields in the hash referenced by $fields are bootcode (a string of length 446) and signature (0xAA55).
Each element of @partitions is a hash reference with fields
raw is_active start_head start_sec_track type end_head end_sec_track
start_lba sectors start_sec end_sec start_track end_track
Returns an empty list unless signature is correct.
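As an illustration of what such decoding involves, here is a minimal self-contained sketch (not the module's own code) that builds a synthetic MBR and unpacks its four 16-byte partition entries with core Perl. The hash keys reuse a few names from the list above, but the exact meaning the module gives to each key is an assumption here.

```perl
use strict;
use warnings;

# Build a synthetic 512-byte MBR: 446 bytes of boot code, four 16-byte
# partition entries, and the 0xAA55 signature (stored little-endian).
my $entry = pack 'C8 V2',
    0x80,          # status byte: 0x80 marks the active partition
    1, 1, 0,       # CHS start: head, sector + high cylinder bits, cylinder
    0x0B,          # partition type (0x0B is FAT32 with CHS addressing)
    254, 63, 10,   # CHS end: head, sector + high cylinder bits, cylinder
    63,            # first sector of the partition (LBA)
    204800;        # number of sectors in the partition
my $sector = ("\0" x 446) . $entry . ("\0" x 48) . pack('v', 0xAA55);

# The signature occupies the last two bytes of the sector.
my $signature = unpack 'v', substr($sector, 510, 2);
die "Not an MBR" unless $signature == 0xAA55;

my @partitions;
for my $i (0 .. 3) {
    my $raw = substr $sector, 446 + 16 * $i, 16;
    my ($status, $h0, $sc0, $c0, $type, $h1, $sc1, $c1, $lba, $count)
        = unpack 'C8 V2', $raw;
    push @partitions, {
        raw       => $raw,
        is_active => $status == 0x80 ? 1 : 0,
        type      => $type,
        start_lba => $lba,
        sectors   => $count,
        start_sec => $sc0 & 0x3F,   # low 6 bits hold the sector number
    };
}

printf "type=0x%02X start_lba=%d sectors=%d\n",
    @{ $partitions[0] }{qw(type start_lba sectors)};
```

Entries whose type is 0 are unused slots; a caller would normally skip them.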
interpret_bootsector($bootsector)
Takes a string containing a 512-byte bootsector; returns a hash reference with the decoded fields. The keys include
jump oem sector_size sectors_in_cluster FAT_table_off num_FAT_tables
root_dir_entries total_sectors1 media_type sectors_per_FAT16
sectors_per_track heads hidden_sectors total_sectors2
machine_code FS_type boot_signature volume_label physical_drive
ext_boot_signature serial_number raw
bpb_ext_boot_signature guessed_FAT_flavor
total_sectors sectors_per_FAT pre_sectors last_cluster sector_of_cluster0
(the last line contains values calculated from the other entries; guessed_FAT_flavor is one of 12, 16, 32, and bpb_ext_boot_signature is the ext_boot_signature calculated assuming the FAT12 or FAT16 layout of the bootsector).
Additional flavor-dependent keys: in FAT32 case
sectors_per_FAT32 FAT_flags version rootdir_start_cluster
fsi_sector_sector bootcopy_sector_sector reserved1 reserved2
otherwise
extended_bpb head___dirty_flags
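The calculated values can be illustrated with a small self-contained sketch. It decodes the flavor-independent start of a synthetic bootsector (a 1.44MB floppy) and derives the first data sector and the FAT flavor from the cluster count, using the conventional thresholds of 4085 and 65525 clusters. This is an independent sketch, not the module's code; the variable names are hypothetical, and how exactly the module computes pre_sectors or guessed_FAT_flavor is an assumption.

```perl
use strict;
use warnings;

# Synthetic start of a bootsector for a 1.44MB floppy (first 36 bytes).
my $template = 'a3 a8 v C v C v v C v v v V V';
my $bpb = pack $template,
    "\xEB\x3C\x90", 'MSDOS5.0',   # jump, oem
    512, 1,                       # sector size, sectors per cluster
    1, 2,                         # reserved sectors, number of FATs
    224, 2880,                    # root dir entries, total sectors (16-bit)
    0xF0, 9,                      # media type, sectors per FAT
    18, 2,                        # sectors per track, heads
    0, 0;                         # hidden sectors, total sectors (32-bit)
my $bootsector = $bpb . ("\0" x (510 - length $bpb)) . pack('v', 0xAA55);

my ($jump, $oem, $sector_size, $sectors_in_cluster, $reserved, $num_FATs,
    $root_entries, $total16, $media, $spf16, $spt, $heads, $hidden, $total32)
    = unpack $template, $bootsector;

# The 16-bit sector count is 0 on large volumes; the 32-bit one is used then.
my $total_sectors = $total16 || $total32;

# Sectors taken by the fixed root directory (32 bytes per entry, rounded up).
my $root_sectors = int(($root_entries * 32 + $sector_size - 1) / $sector_size);

# Everything before the data area. On FAT32 the 16-bit sectors-per-FAT field
# is 0 and a 32-bit field further in the sector would be needed instead.
my $first_data_sector = $reserved + $num_FATs * $spf16 + $root_sectors;

my $clusters = int(($total_sectors - $first_data_sector) / $sectors_in_cluster);
my $flavor   = $clusters < 4085 ? 12 : $clusters < 65525 ? 16 : 32;

print "first data sector $first_data_sector, $clusters clusters, FAT$flavor\n";
```

Judging by the formula in EXAMPLES, the first data sector computed here plays the role of pre_sectors.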
check_bootsector($fields)
Takes a hash reference with decoded fields of a bootsector; returns TRUE if minimal sanity checks hold; die()s otherwise.
interpret_directory($dir, $is_FAT32, [$keep_del, [$keep_dots, [$keep_labels]]])
($res, $files) = interpret_directory($dir, $is_FAT32);
$files = interpret_directory($dir, $is_FAT32);
Takes the concatenation of the directory's cluster(s) as a string and extracts information about the files in the directory. Each element of the array referenced by $files is a hash reference with keys
raw basename ext attrib name_ext_case creation_01sec time_create
date_create date_access cluster_high time_write date_write cluster_low
size cluster name dos_name time_creation
is_readonly is_hidden is_system is_volume_label is_subdir is_archive is_device
and possibly lfn_name, lfn_name_UTF16, lfn_raw (if applicable). (The last row of keys above lists flags extracted from attrib.)
basename and ext are parts of the "DOS name" (lowercased if indicated by the flags); time_create has 0.01sec granularity (while time_creation has 2sec granularity). Entries for deleted files are filtered out unless $keep_del is TRUE; . and .. are also filtered out unless $keep_dots is TRUE; records representing volume labels are also filtered out unless $keep_labels is TRUE. If not filtered out, hashes for deleted files have an extra key deleted with a true value.
lfn_raw contains an array reference with all the partial entries which together hold the Long File Name. Each of them is a hash reference with keys
raw seq_number name_chars_1 attrib nt_reserved checksum_dosname
name_chars_2 cluster_low name_chars_3 name_chars
$res is 'end' if an end-of-directory entry is encountered; it is 'mid' if the directory ends in the middle of LFN info. Otherwise $res is not defined.
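For orientation, here is a hedged sketch of how one 32-byte short-name directory record can be decoded with unpack, including the packed date and time fields. It is an independent illustration rather than the module's implementation: decode_entry is a hypothetical helper, the output keys only loosely follow the names above, and the use of $is_FAT32 to decide whether the high cluster word is meaningful is an assumption about why that argument exists.

```perl
use strict;
use warnings;

# Synthetic record: README.TXT, 1234 bytes, first cluster 0x00012345,
# created and written on 2009-03-15 at 14:30:10.
my $date = ((2009 - 1980) << 9) | (3 << 5) | 15;
my $time = (14 << 11) | (30 << 5) | (10 >> 1);
my $entry = pack 'A8 A3 C C C v7 V',
    'README', 'TXT',       # space-padded 8.3 name
    0x20, 0, 0,            # attrib (archive), case flags, creation 0.01sec
    $time, $date, $date,   # time_create, date_create, date_access
    0x0001,                # high 16 bits of the cluster number
    $time, $date,          # time_write, date_write
    0x2345,                # low 16 bits of the cluster number
    1234;                  # size in bytes

# Hypothetical helper: decode one short-name record.
sub decode_entry {
    my ($raw, $is_FAT32) = @_;
    my ($base, $ext, $attrib, $case, $c01, $t_cr, $d_cr, $d_acc,
        $cl_hi, $t_wr, $d_wr, $cl_lo, $size) = unpack 'A8 A3 C C C v7 V', $raw;
    return {
        name      => (length($ext) ? "$base.$ext" : $base),
        size      => $size,
        cluster   => ($is_FAT32 ? (($cl_hi << 16) | $cl_lo) : $cl_lo),
        is_subdir => (($attrib & 0x10) ? 1 : 0),
        # Date: 7 bits year since 1980, 4 bits month, 5 bits day.
        # Time: 5 bits hours, 6 bits minutes, 5 bits of 2-second units.
        written   => sprintf('%04d-%02d-%02d %02d:%02d:%02d',
            1980 + ($d_wr >> 9), ($d_wr >> 5) & 0x0F, $d_wr & 0x1F,
            $t_wr >> 11, ($t_wr >> 5) & 0x3F, ($t_wr & 0x1F) * 2),
    };
}

my $file = decode_entry($entry, 1);
print "$file->{name} $file->{size} bytes, cluster $file->{cluster}, ",
      "written $file->{written}\n";
```

A real directory walk would also stop at a record whose first byte is 0x00, treat 0xE5 as a deleted entry, and collect records with attrib 0x0F as Long File Name fragments.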
FAT_2array($fat, $s, $w [, $offset [, $lim ] ] )
Takes a reference $s to a string which, at offset $offset, contains the string representation of the FAT table; the length of the FAT table in bytes is assumed to be $lim. $offset defaults to 0; $lim defaults to the rest of the string.
Appends to the array referenced by $fat a numeric array representing the FAT. $w is the bit width of an entry (one of 12, 16, 32).
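The only non-trivial case is the 12-bit width, where two entries share three bytes. A minimal sketch of the conversion follows; fat_to_array is a hypothetical stand-alone function (it returns a new array instead of appending), and masking the top 4 bits of 32-bit entries is a choice made here, not necessarily what the module does.

```perl
use strict;
use warnings;

# Hypothetical stand-alone converter: string representation -> numeric array.
sub fat_to_array {
    my ($s, $w, $offset, $lim) = @_;          # $s is a reference to a string
    $offset = 0 unless defined $offset;
    $lim = length($$s) - $offset unless defined $lim;
    my $raw = substr $$s, $offset, $lim;

    return [ unpack 'v*', $raw ] if $w == 16;
    # Only the low 28 bits of a FAT32 entry are a cluster number.
    return [ map { $_ & 0x0FFFFFFF } unpack 'V*', $raw ] if $w == 32;
    die "Unsupported width $w" unless $w == 12;

    my @fat;
    for (my $i = 0; $i + 3 <= length($raw); $i += 3) {
        my ($b0, $b1, $b2) = unpack 'C3', substr($raw, $i, 3);
        push @fat,
            $b0 | (($b1 & 0x0F) << 8),        # even entry: low 12 bits
            ($b1 >> 4) | ($b2 << 4);          # odd entry: high 12 bits
    }
    return \@fat;
}

# Nine bytes of a FAT12 table hold six entries.
my $bytes = pack 'C*', 0xF0, 0xFF, 0xFF, 0x03, 0x40, 0x00, 0xFF, 0x0F, 0x00;
my $fat = fat_to_array(\$bytes, 12);
print join(' ', map { sprintf '%03X', $_ } @$fat), "\n";
```

In the sample, entries 2, 3 and 4 form a chain (2 points to 3, 3 to 4, and 4 holds the end-of-chain value FFF).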
check_FAT_array($fat, $b [, $offset ])
$fat is a reference to a numeric array, or to the string containing the representation of the FAT at $offset (which defaults to 0). $b is a hash reference with keys guessed_FAT_flavor, media_type (e.g., the result of interpret_bootsector()).
Returns TRUE if the first two clusters satisfy the FAT conventions; otherwise die()s.
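The usual conventions for the first two entries are that entry 0 holds the media type in its low byte with all remaining bits set, and entry 1 holds an end-of-chain value (on FAT16 and FAT32 its top two bits may be cleared as status flags). The sketch below checks exactly that for the array form of the FAT; first_entries_ok is a hypothetical function, it returns a boolean instead of die()ing, and whether the module tests the same conditions is an assumption.

```perl
use strict;
use warnings;

# Hypothetical check of FAT entries 0 and 1 against the usual conventions.
sub first_entries_ok {
    my ($fat, $flavor, $media_type) = @_;
    my $mask = $flavor == 32 ? 0x0FFFFFFF : (1 << $flavor) - 1;

    # Entry 0: media type in the low byte, all other bits set.
    my $want0 = ($mask & ~0xFF) | $media_type;
    return 0 unless ($fat->[0] & $mask) == $want0;

    # Entry 1: an end-of-chain value. FAT16/FAT32 may clear the two top
    # bits ("clean shutdown" and "no I/O errors"), so force them on first.
    my $flags = $flavor == 12 ? 0 : 3 << ($flavor == 32 ? 26 : 14);
    my $eoc_min = $mask & ~7;                 # e.g. 0xFF8 for FAT12
    return ((($fat->[1] | $flags) & $mask) >= $eoc_min) ? 1 : 0;
}

print first_entries_ok([0xFF0, 0xFFF, 3], 12, 0xF0) ? "ok\n" : "bad\n";
```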
cluster_chain($cluster, $maxc, $fat, $b [, $compress [, $offset ] ])
($total, $chain) = cluster_chain($cluster, $maxc, $fat, $b, $compress, $offset);
$fat is a reference to a numeric array, or to the string containing the representation of the FAT at $offset (which defaults to 0). $cluster is the start cluster, $maxc is the maximal number of clusters to look for (0 meaning no limit). $b is a hash reference with keys guessed_FAT_flavor, last_cluster (e.g., the result of interpret_bootsector()).
$chain is an array reference with the clusters in the chain. $total is FALSE if no end-of-a-chain marker was seen; otherwise it contains the total number of clusters.
If $compress is TRUE (defaults to FALSE), the cluster chain is run-compressed: each contiguous run of clusters is converted to a pair of numbers: the starting cluster number, and the length in clusters. If $compress is a subroutine reference, then it is called with these numbers as arguments; otherwise these numbers are pushed into $chain.
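The chain walk and the run compression can be sketched in a few lines. The following is an independent illustration working on the array form of the FAT only: follow_chain is a hypothetical function, it takes the FAT flavor directly instead of the $b hash, it does not support the subroutine form of $compress, and its stopping rules for damaged chains are assumptions.

```perl
use strict;
use warnings;

# Smallest end-of-chain value for each FAT flavor.
my %EOC_MIN = (12 => 0xFF8, 16 => 0xFFF8, 32 => 0x0FFFFFF8);

# Hypothetical chain follower; returns ($total, $chain) as described above.
sub follow_chain {
    my ($cluster, $maxc, $fat, $flavor, $compress) = @_;
    my $eoc = $EOC_MIN{$flavor} or die "Unknown FAT flavor";
    # Never visit more clusters than the FAT has entries (guards cycles).
    my $limit = ($maxc && $maxc < @$fat) ? $maxc : scalar @$fat;
    my (@chain, $total);
    my $count = 0;
    while ($count < $limit) {
        # Clusters 0 and 1 are reserved; anything past the table is damage.
        last if $cluster < 2 or $cluster > $#$fat;
        if ($compress and @chain and $chain[-2] + $chain[-1] == $cluster) {
            $chain[-1]++;                     # extend the current run
        } elsif ($compress) {
            push @chain, $cluster, 1;         # start a new run
        } else {
            push @chain, $cluster;
        }
        $count++;
        my $next = $fat->[$cluster];
        if ($next >= $eoc) { $total = $count; last }
        $cluster = $next;
    }
    return ($total, \@chain);
}

# Entries 0 and 1 are reserved; the file occupies clusters 2,3,4 then 7,8.
my @fat = (0xFF0, 0xFFF, 3, 4, 7, 0, 0, 8, 0xFFF);
my ($total, $chain) = follow_chain(2, 0, \@fat, 12, 1);
print "total=$total runs=@$chain\n";
```

With run compression the sample chain becomes two (start, length) pairs; a mostly unfragmented file thus needs only a handful of numbers regardless of its size.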
read_FAT_data($fh, $how [, $offset, $b, $FAT ])
$hash = read_FAT_data($fh, $how [, $offset, $b, $FAT ]);
Extracts one or more of MBR, bootsector, FAT table, root directory from a file $fh containing "contents of a disk". $fh may be a reference to a file handle, or the name of a file. The optional argument $offset is the offset inside the file of the first entry to extract, or of the bootsector (default 0).
The hash reference $how contains extraction instructions. If the values of keys do_MBR, do_bootsector, do_FAT, do_rootdir are defined, the corresponding parts of the filesystem are read. If do_MBR's value is 'maybe' and do_bootsector's is defined, the sector read as the MBR is checked to determine whether it is an actual MBR or a bootsector. The actual value of the key do_FAT chooses the copy of the FAT to work with.
The value of key partition governs which partition of 0..3 to choose (only primaries are currently supported); if not defined, and the number of valid partitions differs from 1, the call die()s.
If the value of key FAT_separate is TRUE, $offset is the offset of the start of (the first) FAT in the file; otherwise it is the offset of the MBR or bootsector (offsets of other parts are calculated as needed). If the value of key rootdir_is_standalone is TRUE, the root directory is assumed to be the whole content of the file.
If the values of keys parse_MBR, parse_bootsector, parse_FAT, parse_rootdir are defined (or parsing is needed to process the remaining parts to extract), the corresponding parts are interpreted as in MBR_2_partitions(), interpret_bootsector(), FAT_2array(), interpret_directory().
The corresponding parsed values are put into $hash->{MBR}, $hash->{bootsector}; if not parsed, the values are hash references {raw => STRING}. $hash->{FAT} is suitable as the $fat argument of cluster_chain(): it is a reference either to the string representation of the FAT or to its array representation.
To avoid exhausting memory, the FAT is converted to an array only if parse_FAT is defined AND the number of clusters is below a certain limit. The limit is the value of parse_FAT unless that value is 0; if 0, the default limit of 3000000 is used (the corresponding memory usage for the array representation is about 60MB).
When the bootsector is read, $hash->{bootsector_offset} is the actual offset of the bootsector (useful if $offset actually references an MBR). Finally, if parse_rootdir's value is defined, $hash->{rootdir_files} is a reference to an array of the files in the root directory, and $hash->{rootdir_ended} is true if an end-of-directory marker was seen (i.e., the directory ends before the end of the allocated space); in any case, $hash->{rootdir_raw} is the string representation of the root directory.
The values of keys keep_del, keep_dots, keep_labels are passed as the corresponding arguments to interpret_directory(). If the value of raw_FAT is TRUE, or the value of parse_FAT is undefined, $hash->{FAT_raw} contains a reference to the string representation of the FAT.
write_dir($fh, $o_root, $d, $b, $FAT, [$how, $depth, $offset, $exists])
Recursively extracts the content of directory $d (a reference to the raw string representation of the directory as stored on disk). A $depth of zero means subdirectories are not extracted (give undef or an insanely large number, e.g., 1e100, for unlimited depth). $fh should be a file handle representing the disk content with the bootsector at $offset. $o_root is the output directory: the files in $d are put there.
If $exists is TRUE, $o_root exists. (The parent of $o_root should always exist.)
$how is an optional hash reference, with values for keys keep_del, keep_dots, keep_labels giving the arguments for the interpret_directory() call.
write_file($fh, $dir, $file, $b, $FAT [, $offset ] )
Extracts $file (which should be a hash reference representing a record from a directory) into the directory $dir. $fh should be a file handle representing the disk content with the bootsector at $offset.
EXPORT
None by default.
EXAMPLES
perl -MFileSystem::LL::FAT=interpret_directory -wle "
{local $/; binmode STDIN; $s = <STDIN>}
(undef,$f) = interpret_directory $s, 1;
print qq($_->{is_subdir} $_->{cluster}\t$_->{size}\t$_->{name}) for @$f"
< dir-clusters
outputs the content of a "directory converted to a file" (such files may be created by a disastrous chkdsk run), including the starting cluster of each entry.
Given the number of "pre-cluster sectors" and the size of a cluster, one can convert a starting cluster number to a starting sector number. Then one can extract the files by a raw read of the disk partition:
$sector = $bootsec->{pre_sectors}
+ ($cluster - 2)*$bootsec->{sectors_in_cluster}
= $bootsec->{sector_of_cluster0}
+ $cluster * $bootsec->{sectors_in_cluster}
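A worked instance of this formula, as a small sketch: the geometry values below are invented for illustration, cluster_to_sector is a hypothetical helper, and the assumption is that sector numbers count from the bootsector of the partition.

```perl
use strict;
use warnings;

# Invented geometry: 512-byte sectors, 8 sectors per cluster, and 2080
# sectors (reserved area, FATs, root directory) before the data area.
my %bootsec = (sector_size => 512, sectors_in_cluster => 8, pre_sectors => 2080);

# Data clusters are numbered from 2, so "cluster 0" lies 2 clusters earlier.
$bootsec{sector_of_cluster0}
    = $bootsec{pre_sectors} - 2 * $bootsec{sectors_in_cluster};

# Hypothetical helper implementing the first form of the formula.
sub cluster_to_sector {
    my ($bs, $cluster) = @_;
    die "Data clusters are numbered from 2" if $cluster < 2;
    return $bs->{pre_sectors} + ($cluster - 2) * $bs->{sectors_in_cluster};
}

my $cluster = 5;
my $sector  = cluster_to_sector(\%bootsec, $cluster);
my $same    = $bootsec{sector_of_cluster0}
            + $cluster * $bootsec{sectors_in_cluster};
my $byte_offset = $sector * $bootsec{sector_size};

print "cluster $cluster starts at sector $sector (byte $byte_offset)\n";
```

With these numbers both forms give sector 2104, i.e. byte offset 1077248 from the start of the partition; this offset is what one would pass to seek() or to the skip option of dd.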
Likewise, one can inspect a bootsector via
perl -MFileSystem::LL::FAT=interpret_bootsector,check_bootsector -wle
"{local $/; binmode STDIN; $s = <STDIN>}
$b = interpret_bootsector $s; check_bootsector $b;
print qq($_\t=>\t$b->{$_}) for sort keys %$b"
< disk.bootsector
On DOSish systems one can read bootsector of drive d: by reading the first 512 bytes of the file \\.\d:. E.g., with dd one could do it as
dd if=//./d: bs=512 count=1 of=disk.bootsector
On UNIXish systems one needs to find the corresponding device file (by calling mount or /sbin/mount?), and do
dd if=/dev/hda3 bs=512 count=1 of=disk.bootsector
Other DOSish conventions (see also diskext, bootpart, mkbt programs):
\\?\Device\Harddisk0\Partition0 # Partition0 is entire disk
//./physicaldrive0
/dev/fd0 # Floppy 0 under CygWin
/dev/sdc # physical HDs No. 2 (=c) under CygWin
/dev/sdc1 # Same, partition 1
Other programs may be used too:
D:\mkbt20>mkbt -x -c e: c:\bootstrap-e2
* Expert mode (-x)
* Copy bootsector mode (-c)
dd if=//./e: of=c:/bootstrap-e-dd count=16
dd --list
CygWin's dd may be flaky; you may want to try http://www.chrysocome.net/dd. You may need "elevated privileges" under Vista.
BUGS
When lowercasing non-LFN names, which codepage should one use (and how)?
We ignore LFN records with seq-number > 0x7F, unless it is 0xE5. When do they appear?
How to follow logical partitions?
Test suite is practically absent...
When recursing into a directory without FAT table present, we assume that subdirs have size of one cluster. To do otherwise, need to check that subsequent clusters are not directories; how to do it?
And how often are directories contiguous on disk?
SEE ALSO
See http://en.wikipedia.org/wiki/Fat32.
AUTHOR
Ilya Zakharevich, <ilyaz@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2009 by Ilya Zakharevich
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.