NAME

Lingua::DE::ASCII - Perl extension to convert German umlauts to and from ASCII

SYNOPSIS

use Lingua::DE::ASCII;
print to_ascii("Umlaute wie ä,ö,ü,ß oder auch é usw. " .
               "sind nicht im ASCII Format " .
               "und werden deshalb umgeschrieben");
print to_latin1("Dies muesste auch rueckwaerts funktionieren ma cherie");
               

DESCRIPTION

This module converts German texts to and from ASCII.

It provides two functions, to_ascii and to_latin1, which do exactly what their names say.

Please note that both functions take a single scalar as their argument, not a whole list.
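
Because each call accepts only one scalar, a list of strings has to be converted element by element. The following is a minimal sketch of one way to do that. The helper name convert_each is hypothetical and not part of this module; it takes the conversion function as a code reference.

```perl
use strict;
use warnings;

# convert_each is a hypothetical helper, not part of Lingua::DE::ASCII.
# It calls a one-argument converter once per string and returns the results.
sub convert_each {
    my ($converter, @strings) = @_;
    return map { scalar $converter->($_) } @strings;
}

# Assumed usage once the module is loaded:
#   use Lingua::DE::ASCII;
#   my @ascii = convert_each(\&to_ascii, @lines);
```

For a one-off conversion, writing map { to_ascii($_) } @lines inline should have the same effect.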

EXPORT

to_ascii($string) to_latin1($string)

BUGS

This is only a simple computer program faced with a very hard AI problem, so some words will always be hard to convert back from ASCII to Latin-1. A known example is the difference between "Maß(einheit)" and "Masseentropie" or similar. Other examples are "flösse" and "Flöße", "(Der Schornstein) ruße" and "Russe", "Geheimtuer(isch)" and "Geheimtür", and "anzu-ecken" and "anzücken". It is also hard to find the right spelling for the prefixes "miss-" and "miß-". When in doubt, I tried to use the more common word. I tested the module against a huge list of German words, but please tell me if you find a serious back-translation bug.
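
One way to look for such back-translation problems in your own word list is a round-trip check: convert each word to ASCII and back, and report the words that do not come back unchanged. The following is only a sketch. The helper name roundtrip_failures is hypothetical, and the two converters are passed in as code references.

```perl
use strict;
use warnings;

# roundtrip_failures is a hypothetical helper, not part of Lingua::DE::ASCII.
# It returns every word that differs from its original form after being
# converted forward with $forward and back again with $backward.
sub roundtrip_failures {
    my ($forward, $backward, @words) = @_;
    return grep { $backward->($forward->($_)) ne $_ } @words;
}

# Assumed usage once the module is loaded:
#   use Lingua::DE::ASCII;
#   my @suspects = roundtrip_failures(\&to_ascii, \&to_latin1, @wordlist);
```

Words returned by such a check are candidates for a bug report, although some of them may simply be ambiguous cases like the ones listed above.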

This module is intended for ANSI encoding, which differs from, for example, the Windows encoding.

Misspelled words will cause the program to make many additional mistakes. When in doubt, it is better to write using the new German spelling rules (neue Rechtschreibung).

The to_latin1 function is not very fast, because it is written to handle as many exceptions as possible.
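
If the same strings are converted many times, caching the results may offset some of this cost. The following is a minimal sketch, assuming that the conversion always returns the same result for the same input string. The helper name make_cached is hypothetical and not part of this module.

```perl
use strict;
use warnings;

# make_cached is a hypothetical helper, not part of Lingua::DE::ASCII.
# It wraps a one-argument converter in a closure that remembers results,
# so each distinct input string is converted only once.
sub make_cached {
    my ($converter) = @_;
    my %cache;
    return sub {
        my ($text) = @_;
        $cache{$text} = $converter->($text) unless exists $cache{$text};
        return $cache{$text};
    };
}

# Assumed usage once the module is loaded:
#   use Lingua::DE::ASCII;
#   my $fast_to_latin1 = make_cached(\&to_latin1);
#   my $text = $fast_to_latin1->("Dies muesste auch rueckwaerts funktionieren");
```

The cache is keyed on the whole input string, so it only helps when identical strings recur, and it grows with the number of distinct inputs.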

I avoided locale-dependent character handling (so it should work on every computer), but the price is that in some rare cases, words with multiple umlauts (like "Häkeltülle") can be converted incorrectly. Please tell me if you find such words.

AUTHOR

Janek Schleicher, <bigj@kamelfreund.de>

SEE ALSO

Lingua::DE::Sentence (another cool module)
