NAME
tr - translate or delete characters
SYNOPSIS
tr [ -cdsUC ] [ SEARCHLIST [ REPLACEMENTLIST ] ]
DESCRIPTION
The tr program copies the standard input to the standard output with substitution or deletion of selected characters. Input characters found in SEARCHLIST are mapped into the corresponding characters of REPLACEMENTLIST. When REPLACEMENTLIST is shorter than SEARCHLIST, it is padded to the length of SEARCHLIST by duplicating its last character.
Here are the options:
-c   Complement the SEARCHLIST.
-d   Delete found but unreplaced characters.
-s   Squash duplicate replaced characters.
-U   Translate to/from UTF-8.
-C   Translate to/from 8-bit char (octet).
In either string, the notation a-b means a range of characters from a to b in increasing ASCII order. Customary Perl escapes are honored, such as \n for newline, \012 for octal, and \x0A for hexadecimal codes.
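As an illustration of the range notation, here is a minimal Python sketch of how an a-b range could be expanded into an explicit character list. The function name expand_list is hypothetical, escapes are assumed to be decoded already, and a hyphen at the start or end of the list is assumed to be literal; this models the description above and is not the program's actual code.

```python
def expand_list(spec: str) -> str:
    """Expand a-b ranges in a tr-style list into an explicit string.

    Simplified model: escapes are assumed to be decoded already, and a
    hyphen at the start or end of the list is taken literally.
    """
    out = []
    i = 0
    while i < len(spec):
        # A range needs a character on both sides of the hyphen.
        if i + 2 < len(spec) and spec[i + 1] == "-":
            lo, hi = ord(spec[i]), ord(spec[i + 2])
            if lo > hi:
                raise ValueError(f"invalid range {spec[i]}-{spec[i + 2]}")
            # Ranges run in increasing code order, endpoints included.
            out.extend(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            out.append(spec[i])
            i += 1
    return "".join(out)
```

For instance, expand_list("A-Ca-c") yields "ABCabc".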
If the -c flag is specified, the SEARCHLIST character set is complemented. If the -d flag is specified, any characters in SEARCHLIST that have no corresponding character in REPLACEMENTLIST are deleted. (Note that this is slightly more flexible than the behavior of some tr programs, which delete anything they find in the SEARCHLIST, period.) If the -s flag is specified, sequences of characters that were transliterated to the same character are squashed down to a single instance of the character.
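The -s rule can be modeled with a short Python sketch. The name translate_squeeze and the dictionary-based table are assumptions made for illustration; only characters that were actually translated take part in squeezing, as the paragraph above describes.

```python
def translate_squeeze(text: str, table: dict) -> str:
    """Translate text through table, squeezing runs that map to the same character."""
    out = []
    last = None  # last character produced by a translation; None after an untouched one
    for ch in text:
        if ch in table:
            new = table[ch]
            if new != last:
                out.append(new)  # first character of a run of identical results
            last = new
        else:
            out.append(ch)       # characters outside the table pass through unchanged
            last = None
    return "".join(out)
```

With the table {"a": "x", "b": "x"}, the input "aabbcc" becomes "xcc": the four translated characters collapse into one x, while the untouched cc run is left alone.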
If the -d flag is used, the REPLACEMENTLIST is always interpreted exactly as specified. Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST, the final character is replicated until it is long enough. If the REPLACEMENTLIST is empty, the SEARCHLIST is replicated. The latter is useful for counting characters in a class or for squashing character sequences in a class.
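The padding and deletion rules can be summarized in a small Python model. The names build_map and apply_map are hypothetical, both lists are assumed to be fully expanded already, and letting the first occurrence of a repeated SEARCHLIST character win is an assumption taken from the documented behavior of Perl's tr operator.

```python
def build_map(search: str, repl: str, delete: bool = False) -> dict:
    """Map each SEARCHLIST character to its replacement, or to None to delete it."""
    if not delete:
        if not repl:
            repl = search  # empty REPLACEMENTLIST: SEARCHLIST is replicated
        elif len(repl) < len(search):
            repl += repl[-1] * (len(search) - len(repl))  # pad with final character
    table = {}
    for i, ch in enumerate(search):
        if ch in table:
            continue  # assumption: first occurrence wins, as in Perl's tr
        # With delete set, positions beyond the end of repl have no replacement.
        table[ch] = repl[i] if i < len(repl) else None
    return table


def apply_map(text: str, table: dict) -> str:
    """Apply a table from build_map, dropping characters mapped to None."""
    out = []
    for ch in text:
        if ch not in table:
            out.append(ch)
        elif table[ch] is not None:
            out.append(table[ch])
    return "".join(out)
```

In this model, apply_map("banana", build_map("an", "A", delete=True)) returns "bAAA": a is replaced by A, and n is deleted because it has no counterpart in the replacement list.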
The first -U or -C flag applies to the left side of the translation. The second one applies to the right side. If present, these flags override the current utf8 state.
EXAMPLES
The following command writes a list of all the words in file1, one per line, to file2, where a word is taken to be a maximal string of alphabetics.
tr -cs A-Za-z "\n" <file1 >file2
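For comparison, a rough Python equivalent of this command is sketched below, assuming ASCII text. The function name words_per_line is hypothetical. The complemented class [^A-Za-z] plays the role of -c, and replacing each whole run with a single newline plays the role of -s.

```python
import re


def words_per_line(text: str) -> str:
    """Replace every run of non-alphabetic characters with one newline."""
    return re.sub(r"[^A-Za-z]+", "\n", text)
```

Note that input beginning with a non-alphabetic character produces a leading empty line, just as the translated run would in the tr command.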
The following command strips the 8th bit from an input file:
tr "\200-\377" "\000-\177"
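The same mapping can be written in Python with a byte translation table, shown here as a sketch. The names STRIP_TABLE and strip_high_bit are hypothetical. Each byte in the range 0x80-0xFF is mapped to the byte 128 positions lower, which is the same as clearing the 8th bit.

```python
# Pair each byte in \200-\377 with the corresponding byte in \000-\177.
STRIP_TABLE = bytes.maketrans(bytes(range(0x80, 0x100)), bytes(range(0x00, 0x80)))


def strip_high_bit(data: bytes) -> bytes:
    """Clear the 8th bit of every byte; bytes below 0x80 pass through unchanged."""
    return data.translate(STRIP_TABLE)
```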
The following command translates Latin-1 to Unicode:
tr -CU "\0-\xFF" ""
The following command translates Unicode to Latin-1:
tr -UC "\0-\x{FF}" ""
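The effect of these two conversions can be approximated in Python as follows. This is a sketch under the assumption that "Unicode" here means UTF-8, as the description of the -U flag suggests; the function names are hypothetical. Unlike the tr command, the second function raises UnicodeEncodeError when the input contains a character above U+00FF.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret Latin-1 octets as characters and encode them as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def utf8_to_latin1(data: bytes) -> bytes:
    """Decode UTF-8 and encode as Latin-1 (fails for characters above U+00FF)."""
    return data.decode("utf-8").encode("latin-1")
```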
NOTE
This command is implemented using Perl's tr operator. See the documentation in perlop for details on its operation.
BUGS
There is no way to catch the file open error on ARGV handle processing, so the exit status does not reflect file open failures.
Not all systems have Unicode support yet, in which case the -U or -C flags would cause a fatal error.
AUTHOR
Tom Christiansen, tchrist@perl.com.
COPYRIGHT and LICENSE
This program is copyright (c) Tom Christiansen 1999.
This program is free and open software. You may use, modify, distribute, and sell this program (and any modified variants) in any way you wish, provided you do not restrict others from doing the same.