NAME
Tie::CharArray - Access Perl scalars as arrays of characters
SYNOPSIS
use Tie::CharArray;
my $foobar = 'a string';
tie my @foo, 'Tie::CharArray', $foobar;
$foo[0] = 'A'; # $foobar = 'A string'
push @foo, '!'; # $foobar = 'A string!'
print "@foo\n"; # prints: A s t r i n g !
tie my @bar, 'Tie::CharArray::Ord', $foobar;
$bar[0]--; # $foobar = '@ string!'
pop @bar; # $foobar = '@ string'
print "@bar\n"; # prints: 64 32 115 116 114 105 110 103
Alternative interface functions
use Tie::CharArray qw( chars codes );
my $foobar = 'another string';
my $chars = chars $foobar; # arrayref in scalar context
push @$chars, '?'; # $foobar = 'another string?'
$_ += 2 for codes $foobar; # tied array in list context
# $foobar = 'cpqvjgt"uvtkpiA'
my @array = chars $foobar; # WARNING: @array isn't tied!
DESCRIPTION
In low-level programming languages such as C, and to some extent Java, strings are not primitive data types but arrays of characters, which in turn are treated as integers. This closely matches the internal representation of strings in memory.
Perl, on the other hand, abstracts such internal details away behind the concept of scalars, which can be treated as either strings or numbers, and appear as primitive types to the programmer. This often better matches the way people think about the data, which facilitates programming by making common high-level manipulation tasks trivial.
Sometimes, though, the low-level view is better suited for the task at hand. Perl does offer functions such as ord()/chr(), pack()/unpack() and substr() that can be used to solve such tasks with reasonable efficiency. For someone used to the direct access to the internal representation offered by other languages, however, these functions may feel awkward. While this is often only a symptom of thinking in un-Perlish terms, sometimes being able to manipulate strings as character arrays really does simplify the code, making the intent more obvious by eliminating syntactic clutter.
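For comparison, the following sketch performs a few character-level edits using only the core functions named above. It does not use this module; the variable names are chosen here purely for illustration.

```perl
use strict;
use warnings;

my $str = 'a string';

# Replace the first character in place using lvalue substr().
substr($str, 0, 1) = 'A';                      # $str is now 'A string'

# Append a character, as push() would on a character array.
$str .= '!';                                   # $str is now 'A string!'

# Decrement the code of the first character via ord()/chr().
my $code = ord(substr($str, 0, 1));            # 65 for 'A'
substr($str, 0, 1) = chr($code - 1);           # 'A' becomes '@'

# View the whole string as a list of character codes.
my @codes = map { ord } split //, $str;

print "$str\n";                                # prints: @ string!
print "@codes\n";                              # prints: 64 32 115 116 114 105 110 103 33
```

Each step is short, but the nesting of substr(), ord() and chr() is the kind of syntactic clutter that an array view of the string is meant to remove.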
This module provides a way to manipulate Perl strings through tied arrays. The operations are implemented in terms of the aforementioned string manipulation functions, but the programmer normally need not be aware of this. As Perl has no primitive character type, two alternative representations are provided:
Strings as arrays of single-character strings
The first way is to represent characters as strings of length 1. In most cases this is the most convenient representation, as such "characters" can be printed without explicit transformations and written as ordinary Perl string literals.
This representation is provided by the main class Tie::CharArray. As the class maps most array operations directly to calls to substr(), several features of that function apply. (Below, @foo is an array tied to Tie::CharArray and $n is a positive integer.)
- $foo[@foo] is an empty string; $foo[@foo+$n] is undef.
- Assigning to $foo[@foo+$n] is a fatal error. So is splice() beyond the end of the array.
- If you assign an empty string (or undef) to an element, any later elements are shifted down.
- If you assign a string longer than one character to an element, any later elements are shifted up.
In general, if you only put one-character strings into the array, and don't go beyond its end, there should be no problems.
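Since these caveats are described as inherited from substr(), they can be illustrated with substr() alone. The sketch below does not use the tied class; it only shows the underlying behaviour of the core function, on the assumption that the class passes it through unchanged.

```perl
use strict;
use warnings;

my $str = 'abc';

# Reading exactly at the end yields an empty string, without a warning.
my $at_end = substr($str, length($str), 1);

# Reading past the end yields undef; the accompanying warning is silenced here.
my $past_end = do {
    no warnings 'substr';
    substr($str, length($str) + 1, 1);
};

# Assigning past the end is a fatal error, so it is trapped with eval.
my $ok = eval { substr($str, length($str) + 1, 1) = 'x'; 1 };

# Assigning an empty string removes the character; later ones shift down.
substr($str, 1, 1) = '';       # 'abc' becomes 'ac'

# Assigning a longer string inserts characters; later ones shift up.
substr($str, 0, 1) = 'xyz';    # 'ac' becomes 'xyzc'

print "$str\n";                # prints: xyzc
```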
Strings as arrays of small integers
While the representation described above is usually the most convenient one, it still does not allow direct arithmetic manipulation of the character code values. For tasks where this is needed, an alternative representation is provided by the subclass Tie::CharArray::Ord. Note that it is perfectly possible to manipulate a single string through both interfaces at the same time. As the array operations are still based on substr(), the first two of the above caveats apply here as well. Unicode support depends on whether and how the underlying perl implementation supports it.
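To make the idea concrete, here is a minimal sketch of how a tied array of character codes could be layered over substr(), ord() and chr(). The class name My::OrdArray is hypothetical and the code is not the source of Tie::CharArray::Ord; it implements only element access and size, and performs no bounds checking.

```perl
use strict;
use warnings;

{
    # Hypothetical class for illustration only; not the module's actual source.
    package My::OrdArray;

    # Keep a reference to the caller's scalar; @_ aliases the tie() arguments.
    sub TIEARRAY {
        my $class = shift;
        return bless { str => \$_[0] }, $class;
    }

    # The number of elements is the length of the string.
    sub FETCHSIZE {
        my ($self) = @_;
        return length ${ $self->{str} };
    }

    # Read one character and return its code.
    sub FETCH {
        my ($self, $i) = @_;
        return ord(substr(${ $self->{str} }, $i, 1));
    }

    # Overwrite one character, given its code.
    sub STORE {
        my ($self, $i, $code) = @_;
        substr(${ $self->{str} }, $i, 1) = chr($code);
        return;
    }
}

my $str = 'A string';
tie my @codes, 'My::OrdArray', $str;

$codes[0]--;               # 'A' (65) becomes '@' (64)
$codes[2] = ord('S');      # replace 's' with 'S'

print "$str\n";            # prints: @ String
```

Because the tied object holds a reference to the original scalar rather than a copy, every store through the array is immediately visible in $str.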
Alternative interface functions
Since using tie() can sometimes seem inconvenient, Tie::CharArray can also export two functions to perform the tying internally. The functions are reproduced below in their entirety.
sub chars ($) {
    tie my @chars, 'Tie::CharArray', $_[0];
    return wantarray ? @chars : \@chars;
}
sub codes ($) {
    tie my @codes, 'Tie::CharArray::Ord', $_[0];
    return wantarray ? @codes : \@codes;
}
When called in scalar context, they return a reference to a tied array through which the characters of the string given to them can be manipulated. In list context the functions return the tied array itself.
This is of rather limited use, since the tied array is only temporary, and assigning it to a permanent array only copies the values it contains but does not tie the permanent array. However, if the temporary array is passed to a subroutine or a foreach loop, perl will alias the elements directly to the temporary array instead of copying them. What that means in practice is that you can write:
foreach my $ch (chars $string) {
    # reverse bits in each character
    $ch = pack "b*", unpack "B*", $ch;
}
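The pack/unpack expression in the loop body may not be obvious at first sight. The following sketch applies it to a single character using core functions only, without the module; the sample character 'A' is chosen arbitrarily.

```perl
use strict;
use warnings;

my $ch = 'A';    # code 0x41, bits 01000001

# unpack "B*" lists the bits most significant first; pack "b*" reads them
# least significant first, so the round trip reverses the bit order.
my $reversed = pack "b*", unpack "B*", $ch;

printf "%08b -> %08b\n", ord($ch), ord($reversed);    # prints: 01000001 -> 10000010
```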
BUGS
Exposing the peculiarities of substr() to the user might be considered a bug. In any case, it is a feature that should probably not be relied on too heavily, as it might change in future revisions.
Some sort of a warning might be appropriate if chars() or codes() is called in list context and the return value won't be aliased. I just have no idea how to implement that in current versions of Perl.
CHANGES
- 1.00 (14 April 2001): Added exportable functions chars() and codes(). Removed use of Tie::Array. Only loads Carp if needed.
AUTHORS
Copyright 2000-2001, Ilmari Karonen. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
Address bug reports and comments to: perl@itz.pp.sci.fi