NAME
Win32::LongPath - provide functions to access long paths and Unicode in the Windows environment
SYNOPSIS
use File::Spec::Functions;
use Win32::LongPath;
use utf8;
# make a really long path w/Unicode from around the world
$path = 'c:';
while (length ($path) < 5000) {
  $path = catdir ($path, 'ελληνικά-русский-日本語-한국-中國的-עִברִית-عربي');
  if (!testL ('e', $path)) {
    mkdirL ($path) or die "unable to create $path ($^E)";
  }
}
print 'ShortPath: ' . shortpathL ($path) . "\n";
# next, create a file in the path
$file = catfile ($path, 'more interesting characters فارسی-தமிழர்-ພາສາລາວ');
openL (\$FH, '>:encoding(UTF-8)', $file)
  or die ("unable to open $file ($^E)");
print $FH "writing some more Unicode characters\n";
print $FH "דאס שרייבט אַ שורה אין ייִדיש.\n";
close $FH;
# now undo everything
unlinkL ($file) or die "unable to delete file ($^E)";
while ($path =~ /[\/\\]/) {
  rmdirL ($path) or die "unable to remove $path ($^E)";
  $path =~ s#[/\\][^/\\]+$##;
}
DESCRIPTION
Although Perl natively supports functions that can access files in Windows, these functions fail for Unicode or long file paths (i.e. longer than the Windows MAX_PATH value, which is 260 characters). Win32::LongPath overcomes these limitations by using Windows wide-character functions which support Unicode and extended-length paths. The end result is that you can process any file in the Windows environment without worrying about Unicode or path length.
Win32::LongPath provides replacement functions for most of the native Perl file functions. These functions attempt to imitate the native functionality and format as closely as possible and accept file paths which include Unicode characters and can be up to 32,767 characters long.
Some additional functions are also available to provide low-level features that are specific to Windows files.
Paths
File and directory paths can be provided containing any of the following components.
path separators: Both the forward (/) and reverse (\) slashes can be used to separate the path components.
Unicode: Unicode characters can be used anywhere in the path provided they are supported by the Windows file naming standard. If Unicode is used, the string must be internally identified as UTF-8. See perlunicode for more information on using Unicode with Perl.
drive letter: The path can begin with an upper or lower case letter from A to Z followed by a colon to indicate a drive letter path. For example, C:/path (fullpath) or c:path (relative path).
UNC: The path can begin with a UNC path in the form \\server\share or //server/share.
extended-length: The path can begin with an extended-length prefix in the form of \\?\ or //?/.
All input paths will be converted (normalized) to a fullpath using the extended-length format and wide characters. This allows paths to be up to 32,767 characters long and to include Unicode characters. The Microsoft specification still limits each path component (such as a directory name) to about 255 characters.
Output paths will be converted back (denormalized) to a UTF-8 fullpath that begins with a drive letter or UNC.
NOTE: See the Naming Files, Paths, and Namespaces topic in the Microsoft MSDN Library for more information about extended-length paths.
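To make the normalization step more concrete, here is a minimal sketch of how an absolute path could be rewritten into the extended-length form. The helper name to_extended_length is hypothetical and is not part of Win32::LongPath. The sketch only handles paths that are already absolute and does not do the wide-character conversion or relative-path resolution that the module performs.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Win32::LongPath: convert an absolute
# drive-letter or UNC path into the Windows extended-length form.
sub to_extended_length {
    my ($path) = @_;
    (my $p = $path) =~ tr{/}{\\};      # extended-length paths use backslashes
    return $p if $p =~ /^\\\\\?\\/;    # already has the \\?\ prefix
    return "\\\\?\\UNC\\$1" if $p =~ /^\\\\([^\\].*)$/s;    # \\server\share
    return "\\\\?\\$p"      if $p =~ /^[A-Za-z]:\\/;        # C:\path
    die "not an absolute path: $path\n";
}

print to_extended_length ('C:/path/file.txt'), "\n";
print to_extended_length ('//server/share/dir'), "\n";
```

A drive-letter path gains the \\?\ prefix, while a UNC path gains \\?\UNC\ in place of its leading double backslash.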
Return Values
Unless stated otherwise, all functions return true (a numeric value of 1) if successful or false (undef) if an error occurred. Generally, if a function fails it will set the $! value to the failure. However, $^E will have the more specific Windows error value.
FILE FUNCTIONS
This section lists the replacements for native Perl file functions. Since "openL" returns a native Perl file handle, functions that use open file handles (read, write, close, binmode, etc.) can be used as is and do not have replacement functions. In like manner, "sysopenL" also returns a native Perl file handle.
Functions that are specific to the Unix environment (chmod, chown, umask, etc.) do not have replacements.
- linkL OLDFILE,NEWFILE
-
If the Windows file system supports it, a hard link is created from NEWFILE to OLDFILE.
linkL ('goodbye', 'до свидания') or die ("unable to link file ($^E)");
- lstatL PATH
-
Does the same thing as the "statL" function but will retrieve the statistics for the link and not the file it links to.
- openL FILEHANDLEREF,MODE,PATH
-
open is a very powerful and versatile Perl function with many modes and capabilities. The openL replacement does not provide the full range of capability but does provide what is needed to open files in the Windows file system. It only supports the three-argument form of open.
FILEHANDLEREF cannot be a bareword file handle or a scalar variable. It must be a reference to a scalar value which will be set to be a Perl file handle. For example:
openL (\$fh, '<', $file) or die ("unable to open $file: ($^E)");
For the most part, MODE matches the native definition and can begin with <, >, >>, +<, +> and +>> to indicate read/write behavior. The |-, -|, <-, -, >- modes are not valid since they apply to pipes, STDIN and STDOUT. Read-only is assumed if the read/write symbols are not used. MODE can also include a colon followed by the I/O layer definition. For example:
openL (\$fh, '>:encoding(UTF-8)', $file);
PATH is the relative or fullpath name of the file. It cannot be undef for temporary files, a reference to a variable for in-memory files or a file handle.
# these are WRONG!
openL ($infh, '', $infile);
openL (INFILE, '', $infile);
openL (\$infh, '', undef);
openL (\$infh, '', \$memory);
openL (\$infh, '', INFILE);
openL (\$infh, '-|', "file<$infile");

# these are correct
# append infile to outfile
openL (\$infh, '', $infile)
  or die ("unable to open $infile: ($^E)");
openL (\$outfh, '>>', $outfile)
  or die ("unable to open $outfile: ($^E)");
while (<$infh>) {
  print $outfh $_;
}
eof ($infh) or print "terminated before EOF!\n";
close $infh;
close $outfh;
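The MODE rules described above can be illustrated with a small sketch that splits a MODE string into its read/write symbol and its I/O layers. The split_mode helper is hypothetical and is only an approximation of the documented behavior, not the parser that openL actually uses.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's own parser: split an openL-style
# MODE into its read/write symbol and optional I/O layer string.
sub split_mode {
    my ($mode) = @_;
    my ($rw, $layers) = $mode =~ /^\s*(\+?(?:<|>>|>))?\s*(:.*)?$/s
        or return;                      # unsupported, e.g. '-|' or '|-'
    $rw = '<' if !defined $rw;          # read-only is assumed by default
    return ($rw, defined $layers ? $layers : '');
}

for my $mode ('<', '+>>', '>:encoding(UTF-8)', '', '-|') {
    my @parts = split_mode ($mode);
    print "'$mode' => ",
        (@parts ? "rw='$parts[0]' layers='$parts[1]'" : 'rejected'), "\n";
}
```

Pipe and STDIN/STDOUT modes fall through the pattern and are rejected, which matches the restriction that openL only opens files.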
- readlinkL PATH
-
Returns the path that a junction/mount point or symbolic link points to. If PATH is not provided, $_ is used. It will fail for hard links.
# symlinks should always be equal
symlinkL ($orig, $slink) or die ("unable to symlink file ($^E)");
$rlink = readlinkL ($slink) or die ("unable to read link ($^E)");
die ("links not equal!") if ($rlink ne $orig);

# hard links should always be undef
linkL ($orig, $hlink) or die ("unable to link file ($^E)");
!readlinkL ($hlink) or die ("should have failed!");
- renameL OLDNAME,NEWNAME
-
Changes the name or moves OLDNAME to NEWNAME. Renames directories as well as files. Cannot move directories across volumes.
NOTE: See MoveFile in the Microsoft MSDN Library for more information.
# should work
renameL ('c:/file', 'c:/newfile');

# fails, can't move file to directory
renameL ('d:/file', '.');

# should work for files
renameL ('e:/file', 'f:/newfile');

# should work
renameL ('d:/dir', 'd:/topdir/subdir');

# fails, can't move directory across volumes
renameL ('c:/dir', 'd:/newdir');
- statL PATH
-
Returns an object with the statistics for the file. PATH must be a path to a file and cannot be a file or directory handle. If it is not provided, $_ is used. If there is an error gathering the statistics, undef is returned and the error variables are set. The definitions of the object elements are very similar to those of the native Perl stat function, although the access method is like File::stat.
atime: Last access time in seconds. NOTE: Different file systems have different time resolutions. For example, FAT has a resolution of 1 day for the access time. See the Microsoft MSDN Library for more information about file time.
attribs: File attributes as returned by the Windows GetFileAttributes () function. Use the following constants to retrieve the individual values. See the Microsoft MSDN Library for more information about the meaning of these values. Import these values into your environment if you do not want to refer to them with the Win32::LongPath:: prefix.
- FILE_ATTRIBUTE_ARCHIVE
- FILE_ATTRIBUTE_COMPRESSED
- FILE_ATTRIBUTE_DEVICE
- FILE_ATTRIBUTE_DIRECTORY
- FILE_ATTRIBUTE_ENCRYPTED
- FILE_ATTRIBUTE_HIDDEN
- FILE_ATTRIBUTE_INTEGRITY_STREAM
- FILE_ATTRIBUTE_NORMAL
- FILE_ATTRIBUTE_NOT_CONTENT_INDEXED
- FILE_ATTRIBUTE_NO_SCRUB_DATA
- FILE_ATTRIBUTE_OFFLINE
- FILE_ATTRIBUTE_PINNED
- FILE_ATTRIBUTE_READONLY
- FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
- FILE_ATTRIBUTE_RECALL_ON_OPEN
- FILE_ATTRIBUTE_REPARSE_POINT
- FILE_ATTRIBUTE_SPARSE_FILE
- FILE_ATTRIBUTE_SYSTEM
- FILE_ATTRIBUTE_TEMPORARY
- FILE_ATTRIBUTE_UNPINNED
- FILE_ATTRIBUTE_VIRTUAL
ctime: Although defined to be inode change time in seconds for native Perl, it will reflect the Windows creation time.
dev: The Windows serial number for the volume. See the Microsoft MSDN Library for more information.
gid: Is always zero.
ino: Is always zero.
mode: File mode (type and permissions). use Fcntl ':mode' can be used to extract the meaning of the mode. Regardless of the actual user and group permissions, the following bits are set.
- Directories: S_IFDIR, S_IRWXU, S_IRWXG and S_IRWXO
- Files: S_IFREG, S_IRUSR, S_IRGRP and S_IROTH
- Files without read-only attribute: S_IWUSR, S_IWGRP and S_IWOTH
- Files with BAT, CMD, COM and EXE extension: S_IXUSR, S_IXGRP and S_IXOTH
mtime: Last modify time in seconds. NOTE: Different file systems have different time resolutions. For example, FAT has a resolution of 2 seconds for the modification time. See the Microsoft MSDN Library for more information about file time.
nlink: Is always one.
rdev: Same as dev.
size: Total size of the file in bytes. Has a value of zero for directories.
uid: Is always zero.
use Fcntl ':mode';
use Win32::LongPath qw(:funcs :fileattr);

# get object
testL ('e', $file) or die "$file doesn't exist!";
$stat = statL ($file) or die ("unable to get stat for $file ($^E)");

# this test for directory
$stat->{mode} & S_IFDIR ? print "Directory\n" : print "File\n";

# is the same as this one
$stat->{attribs} & FILE_ATTRIBUTE_DIRECTORY ? print "Directory\n" : print "File\n";

# show file times as local time
printf "Created: %s\nAccessed: %s\nModified: %s\n",
  scalar localtime $stat->{ctime},
  scalar localtime $stat->{atime},
  scalar localtime $stat->{mtime};
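As an illustration of how the mode element relates to the file type, read-only attribute and extension, the following sketch rebuilds a mode value from those three inputs using the Fcntl constants. The synth_mode helper and the sample file names are hypothetical. This restates the rules listed for the mode element and is not the module's implementation.

```perl
use strict;
use warnings;
use Fcntl ':mode';

# Hypothetical helper that mirrors the documented mode rules; it is not
# the module's implementation.
sub synth_mode {
    my ($name, $is_dir, $read_only) = @_;
    return S_IFDIR | S_IRWXU | S_IRWXG | S_IRWXO if $is_dir;
    my $mode = S_IFREG | S_IRUSR | S_IRGRP | S_IROTH;
    $mode |= S_IWUSR | S_IWGRP | S_IWOTH if !$read_only;
    $mode |= S_IXUSR | S_IXGRP | S_IXOTH
        if $name =~ /\.(?:bat|cmd|com|exe)$/i;
    return $mode;
}

# print the permission bits (in octal) for a few sample entries
printf "%s: %04o\n", $_->[0], S_IMODE (synth_mode (@$_))
    for ['dir', 1, 0], ['notes.txt', 0, 0], ['locked.txt', 0, 1], ['run.exe', 0, 0];
```

Because the bits are derived from attributes rather than access control lists, they say nothing about what the current user is actually permitted to do.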
- symlinkL OLDFILE,NEWFILE
-
If the Windows OS, file system and user permissions support it, a symbolic link is created from NEWFILE to OLDFILE.
OLDFILE can be a relative or full path. If a relative path is used, it will not be converted to an extended-length path.
NOTE: See CreateSymbolicLink in the Microsoft MSDN Library for more information about symbolic links.
symlinkL ('no problem', '問題ない') or die ("unable to link file ($^E)");
symlinkL ('c:/', 'rootpath') or die ("unable to link file ($^E)");
- sysopenL FILEHANDLEREF,PATH,MODE
-
Performs the same function as the native Perl sysopen function but only supports the three-argument form of sysopen.
FILEHANDLEREF cannot be a bareword file handle or a scalar variable. It must be a reference to a scalar value which will be set to be a Perl file handle. For example:
sysopenL (\$fh, $file, O_CREAT | O_EXCL) or die ("unable to open $file: ($^E)");
PATH is the relative or fullpath name of the file.
MODE matches the native definition.
- testL TYPE,PATH
-
Used to replace the native -X functions. TYPE is the same value as the -X function. For example:
# these are equivalent
die 'unable to read!' if !-r $file;
die 'unable to read!' if !testL ('r', $file);
The supported TYPEs and their values are:
b: Block device. Always returns undef.
c: Character device. Always returns undef.
d: Directory.
e: Exists.
f: Plain file. Returns true if not a directory or Windows offline file.
l: Link file. Only returns true for junction/mount points and symbolic links.
o or O: Owned. Always returns true.
r or R: Read. Always returns true.
s: File has nonzero size (returns size in bytes).
w or W: Write. Returns true if the file does not have the read-only attribute.
x or X: Execute. Returns true if the file has one of the following extensions: bat, cmd, com, exe.
z: Zero size.
- unlinkL PATH[,...]
-
Deletes the list of files. If successful, it returns the number of files deleted. It will fail if the file has the read-only attribute set. It returns undef if an error occurs, and the error variable is set to the value of the last error encountered.
# if you do this you don't know which failed
die ("delete of some files failed!") if !unlinkL ($f1, $f2, $f3, $f4);

# this identifies the failures
foreach my $file ($f1, $f2, $f3, $f4) {
  unlinkL ($file) or print "Unable to delete $file ($^E)\n";
}
- utimeL [ATIME],[MTIME],PATH[,...]
-
Changes the access and modification times on each file. ATIME and MTIME are the numeric times from the time () function. If both are undef then the times will be changed to the current time. If only one is undef that one will use a time value of zero.
PATH must be the path to a file.
If successful, it returns the number of files changed. It returns undef if an error occurs, and the error variable is set to the value of the last error encountered.
NOTE: This function is not supported in Cygwin and will return an error.
NOTE: Different file systems have different time resolutions. For example, FAT has a resolution of 2 seconds for modification time and 1 day for the access time. See the Microsoft MSDN Library for more information about file time.
# set back 24 hours
$yesterday = time () - (24 * 60 * 60);
utimeL ($yesterday, $yesterday, $file)
  or die ("unable to change time on $file ($^E)");

# this is the same as the touch command
utimeL (undef, undef, $file)
  or die ("unable to change time on $file ($^E)");
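The handling of undefined ATIME and MTIME values can be restated as a short sketch. The resolve_times helper is hypothetical and only spells out the rule described for utimeL. It does not touch any file.

```perl
use strict;
use warnings;

# Hypothetical helper that restates the ATIME/MTIME rules for utimeL.
sub resolve_times {
    my ($atime, $mtime) = @_;
    if (!defined $atime && !defined $mtime) {
        my $now = time ();
        return ($now, $now);            # both undef: use the current time
    }
    # only one undef: that one falls back to zero
    return (defined $atime ? $atime : 0, defined $mtime ? $mtime : 0);
}

my ($a_time, $m_time) = resolve_times (undef, 1_000_000);
print "atime=$a_time mtime=$m_time\n";
```

Passing a single undef therefore resets that timestamp to the epoch rather than leaving it unchanged, so supply both values when only one should move.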
DIRECTORY FUNCTIONS
NOTE: Although extended-length paths are used, the Microsoft specification still limits each path component (such as a directory name) to about 255 characters.
- chdirL PATH
-
Changes the working directory. If PATH is missing it tries to change to $ENV{HOME} if it is set, or $ENV{LOGDIR} if that is set. If neither is set then it will do nothing and return.
Unlike other functions, the PATH cannot exceed MAX_PATH characters, although it can contain Unicode and be in the extended-path format.
chdirL ($path) or die ("unable to change to $path ($^E)");
- getcwdL
-
Returns the fullpath of the current working directory. This does not replace a native Perl function since none exists. It works like the getcwd function in Cwd.
print "The current directory is: ", getcwdL (), "\n";
- mkdirL PATH
-
Creates a directory which inherits the permissions of the parent. If PATH is not provided, $_ is used. An error is returned if the parent directory does not exist.
mkdirL ($dir) or die ("unable to create $dir ($^E)");
- rmdirL PATH
-
Deletes a directory. If PATH is not provided, $_ is used. An error is returned if the directory is not empty.
rmdirL ($dir) or die ("unable to delete $dir ($^E)");
OPENDIR FUNCTIONS
Unlike the "openL" function which returns a native handle, the open directory functions must create a directory object and then use that object to manipulate the directory. The native Perl rewinddir, seekdir and telldir functions are not supported.
- new
-
Creates a directory object.
$dir = Win32::LongPath->new ();
- closedirL
-
Closes the current directory for reading.
$dir->closedirL ();
- opendirL PATH
-
Opens a directory for reading. If the directory object is already open the existing directory will be closed before opening the new one.
my $path = 'c:/rootdir/very long directory name/First Level';
$dir->opendirL ($path) or die ("unable to open $path ($^E)");
- readdirL
-
Reads the next item in the directory. In list context returns all the items as a list. Otherwise returns the next item or undef if there are no more items or an error occurred.
NOTE: Only the item name is returned, not the whole path to the item.
use Win32::LongPath qw(:funcs :fileattr);

# search down the whole tree
search_tree ($rootdir);
exit 0;

sub search_tree {
  # open directory and read contents
  my $path = shift;
  my $dir = Win32::LongPath->new ();
  $dir->opendirL ($path) or die ("unable to open $path ($^E)");
  foreach my $file ($dir->readdirL ()) {
    # skip parent dir
    if ($file eq '..') {
      next;
    }

    # get file stats
    my $name = $file eq '.' ? $path : "$path/$file";
    my $stat = lstatL ($name) or die "unable to stat $name ($^E)";

    # recurse if dir (but not a junction or symbolic link)
    if (($file ne '.') && (($stat->{attribs}
      & (FILE_ATTRIBUTE_DIRECTORY | FILE_ATTRIBUTE_REPARSE_POINT))
      == FILE_ATTRIBUTE_DIRECTORY)) {
      search_tree ($name);
      next;
    }

    # output stats
    print "$name\t$stat->{attribs}\t$stat->{size}\t",
      scalar localtime $stat->{ctime}, "\t",
      scalar localtime $stat->{mtime}, "\n";
  }
  $dir->closedirL ();
  return;
}
MISCELLANEOUS FUNCTIONS
The following functions are not native Perl functions but are useful when working with Windows.
- abspathL PATH
-
Returns the absolute (fullpath) for PATH. If the path exists, it will replace the components with Windows' long path names. Otherwise, it returns a path that may contain short path names.
$short = '../SYSTEM~2.PPT';
$long = abspathL ($short);
print "$short = $long\n";

# if it exists it could print something like
# ../SYSTEM~2.PPT = c:\rootdir\subdir\System File.ppt
# if not, it might print
# ../SYSTEM~2.PPT = c:\rootdir\subdir\SYSTEM~2.PPT

# probably not the same because TMP is short path
chdirL ($ENV {TMP}) or die "unable to change to TMP dir!";
$curdir = getcwdL ();
if (abspathL ($curdir) ne $curdir) {
  print "not the same!\n";
}
- attribL ATTRIBS,PATH
-
Sets file attributes like the DOS attrib command.
ATTRIBS is a string that identifies the attributes to enable or disable. A plus sign (+) enables and a minus sign (-) disables the attributes that follow. If not provided, a plus sign is assumed.
The attributes are identified by letters which can be upper or lower case. The letters and their values are:
H: Hidden.
I: Not content indexed. This value may not be valid for all file systems.
R: Read-only.
S: System.
# sets System and hidden but disables read-only
# could also be '-r+sh', 's-r+h', '+hs-r', etc.
attribL ('sh-r', $file)
  or die "unable to set attributes for $file ($^E)";
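To show how an ATTRIBS string is read, here is a minimal sketch that turns it into a table of attributes to enable or disable. The parse_attribs helper is hypothetical. attribL interprets the string internally and its actual implementation may differ.

```perl
use strict;
use warnings;

# Hypothetical parser for illustration only; it is not part of
# Win32::LongPath.
sub parse_attribs {
    my ($spec) = @_;
    my %change;
    my $enable = 1;                     # '+' is assumed if no sign is given
    for my $ch (split //, uc $spec) {
        if    ($ch eq '+')        { $enable = 1 }
        elsif ($ch eq '-')        { $enable = 0 }
        elsif ($ch =~ /^[HIRS]$/) { $change{$ch} = $enable }
        else                      { die "unknown attribute '$ch'\n" }
    }
    return \%change;
}

my $change = parse_attribs ('sh-r');
print join (' ', map { ($change->{$_} ? '+' : '-') . $_ } sort keys %$change), "\n";
```

Each sign stays in effect until the next sign appears, which is why 'sh-r', '-r+sh' and '+hs-r' all describe the same change.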
- copyL FROM,TO
-
Copies the FROM file to the TO file. If the file exists it is overwritten unless it is hidden or read-only. If it does not exist it inherits the permissions of the parent directory. File attributes are copied with the file. If the FROM file is a symbolic link the target is copied and not the symbolic link. If the TO file is a symbolic link the target is overwritten.
copyL ($from, $to) or die "unable to copy $from to $to ($^E)";
- shortpathL PATH
-
Returns the short path of the file. It returns a blank string if it is unable to get the short path.
if (shortpathL ($file) eq '') {
  die "unable to get shortpath for $file";
}
- volinfoL PATH
-
Returns an object with the volume information for the PATH. PATH can be a relative or fullpath to any object on the volume. The object elements are:
maxlen: The maximum length of path components (the characters between the backslashes; usually directory names).
name: The name of the volume.
serial: The Windows serial number for the volume.
sysflags: System flags. Indicates the features that are supported by the file system. Use the following constants to retrieve the individual values. Import these values into your environment if you do not want to refer to them with the Win32::LongPath:: prefix.
- FILE_CASE_PRESERVED_NAMES
- FILE_CASE_SENSITIVE_SEARCH
- FILE_FILE_COMPRESSION
- FILE_NAMED_STREAMS
- FILE_PERSISTENT_ACLS
- FILE_READ_ONLY_VOLUME
- FILE_SEQUENTIAL_WRITE_ONCE
- FILE_SUPPORTS_ENCRYPTION
- FILE_SUPPORTS_EXTENDED_ATTRIBUTES
- FILE_SUPPORTS_HARD_LINKS
- FILE_SUPPORTS_OBJECT_IDS
- FILE_SUPPORTS_OPEN_BY_FILE_ID
- FILE_SUPPORTS_REPARSE_POINTS
- FILE_SUPPORTS_SPARSE_FILES
- FILE_SUPPORTS_TRANSACTIONS
- FILE_SUPPORTS_USN_JOURNAL
- FILE_UNICODE_ON_DISK
- FILE_VOLUME_IS_COMPRESSED
- FILE_VOLUME_QUOTAS
NOTE: See the Microsoft MSDN Library for more information about this feature.
use Win32::LongPath qw(:funcs :volflags);

$vol = volinfoL ($file) or die "unable to get volinfo for $file";
if (!($vol->{sysflags} & FILE_SUPPORTS_REPARSE_POINTS)) {
  die "symbolic links will not work on $vol->{name}!";
}
MODULE EXPORTS
All functions are automatically exported by default. The following tags export specific values:
:all: all values
:funcs: all functions
:fileattr: file attributes used by the "statL" and "lstatL" functions
:volflags: system flags used by the "volinfoL" function
LIMITATIONS
This module was developed for the Microsoft WinXP and greater environment. It also supports the Cygwin environment.
AUTHOR
Robert Boisvert <rdbprog@gmail.com>
CREDITS
Many thanks to Jan Dubois for getting Windows support started with Win32. It remains the number one module in use on almost every Windows installation of Perl.
A big thank you (どうもありがとうございました) to Yuji Shimada for Win32::Unicode. The concepts used there are the basis for much of Win32::LongPath.
LICENSE
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.