The algorithm is as follows:

1) Read the 'post' (PS name) table from the new font and use it to generate VOLT names for every glyph. The PSnames are the basis of VOLT's glyph names as follows: if the glyph has a PSname other than '.notdef', and that PSname is strictly legal (a letter followed only by letters and digits) and unique (used only once in the font), then the PSname becomes the VOLT name; otherwise the VOLT "generic" name "GLYPH_n" (where n is replaced with the glyph ID) is used. (If the PSname is non-standard or not unique, a warning is issued.)
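The naming rule in step 1 can be sketched as follows. This is a Python illustration, not the program's own code: the function name `volt_names` and the warning text are hypothetical, and it assumes the PS names have already been read from the 'post' table into a list indexed by glyph ID. It also assumes that '.notdef' falls back to the generic name without a warning, which the text above does not spell out.

```python
import re
from collections import Counter

# "Strictly legal" as described above: a letter followed by letters/digits only.
LEGAL_PSNAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def volt_names(psnames):
    """Return (names, warnings); psnames[i] is the PS name of glyph ID i, or None."""
    counts = Counter(n for n in psnames if n)
    names, warnings = [], []
    for gid, ps in enumerate(psnames):
        usable = (
            ps is not None
            and ps != ".notdef"
            and LEGAL_PSNAME.fullmatch(ps) is not None
            and counts[ps] == 1
        )
        if usable:
            names.append(ps)
        else:
            # Fall back to VOLT's generic name, built from the glyph ID.
            names.append(f"GLYPH_{gid}")
            # Assumption: no warning for a missing name or for '.notdef'.
            if ps and ps != ".notdef":
                warnings.append(f"glyph {gid}: PS name {ps!r} is non-standard or not unique")
    return names, warnings
```

For example, a name such as "a.alt" fails the strict test because of the period, so that glyph receives a generic name and a warning.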

2) Parse the MS cmap to get the Unicode-to-glyph mappings of the new font.

For each glyph in the new font, a structure (anonymous hash) is constructed that contains an 'ID' field and either or both of 'NAME' and '@UNICODES'. Eventually (after a later step) the structures will have a 'GDEF' entry that points to a GDEF structure (see below).

These structures are reached by looking them up in one of three hashes: %GlyphFromID, %GlyphFromName, or %GlyphFromCmapUnicode.
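A minimal sketch of these glyph structures and their three lookup tables, written in Python with dictionaries standing in for the Perl hashes. The function name `build_glyph_tables` and its inputs are assumptions made for illustration: a list of VOLT names indexed by glyph ID, and a cmap given as a mapping from code point to glyph ID. Whether the real program stores generic names under 'NAME' is not stated above; this sketch stores whatever name it is given.

```python
def build_glyph_tables(volt_names, cmap):
    """volt_names[i] is the VOLT name of glyph ID i; cmap maps code point -> glyph ID."""
    glyph_from_id = {}
    glyph_from_name = {}
    glyph_from_cmap_unicode = {}

    for gid, name in enumerate(volt_names):
        glyph = {"ID": gid}
        if name is not None:
            glyph["NAME"] = name
            glyph_from_name[name] = glyph
        glyph_from_id[gid] = glyph

    for codepoint, gid in sorted(cmap.items()):
        glyph = glyph_from_id.get(gid)
        if glyph is None:
            continue  # cmap entry points outside the glyph list; ignored in this sketch
        # 'UNICODES' plays the role of '@UNICODES' in the description above.
        glyph.setdefault("UNICODES", []).append(codepoint)
        glyph_from_cmap_unicode[codepoint] = glyph

    return glyph_from_id, glyph_from_name, glyph_from_cmap_unicode
```

All three tables point at the same structure for a given glyph, so a field added through one table (such as 'GDEF' later on) is visible through the others.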

3) Read lines from inVoltProj and, for each one, build a GDEF structure which contains one or more of 'LINE', 'NAME', 'UNICODE', 'UNICODEVALUES', 'ID', 'TYPE', or 'COMPONENTS'. Eventually the structure may also have 'NEWID'. Using the UNICODE and UNICODEVALUES strings, construct '@UNICODES'.

Similar to the Glyph structures, GDEF structures can be located by looking them up in one of three hashes: %GdefFromID, %GdefFromName, or %GdefFromGdefUnicode.

Sample GDEF lines follow:

DEF_GLYPH "U062bU062cIsol" ID 1097 UNICODE 64529 TYPE LIGATURE COMPONENTS 2 END_GLYPH

DEF_GLYPH "middot" ID 167 UNICODEVALUES "U+00B7,U+2219" TYPE BASE END_GLYPH

(Note: Here and in later examples of VOLT source, quotes on names are not present in some earlier versions of VOLT)
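As a rough illustration of step 3, the sketch below parses one DEF_GLYPH line into a GDEF-like dictionary. It is a Python approximation based only on the two sample lines above, not on the full VOLT syntax: the regular expression, the function name `parse_gdef`, and the assumption that the fields always appear in this order are all hypothetical. It treats UNICODE as a decimal value and UNICODEVALUES as a comma-separated list of "U+XXXX" hex values, as in the samples, and it accepts names with or without quotes.

```python
import re

# Field order is assumed from the two sample lines; real files may differ.
GDEF_RE = re.compile(
    r'^DEF_GLYPH\s+"?(?P<NAME>[^"\s]+)"?'
    r'\s+ID\s+(?P<ID>\d+)'
    r'(?:\s+UNICODE\s+(?P<UNICODE>\d+))?'
    r'(?:\s+UNICODEVALUES\s+"(?P<UNICODEVALUES>[^"]*)")?'
    r'(?:\s+TYPE\s+(?P<TYPE>\w+))?'
    r'(?:\s+COMPONENTS\s+(?P<COMPONENTS>\d+))?'
    r'\s+END_GLYPH\s*$'
)

def parse_gdef(line):
    """Return a GDEF-like dict for a DEF_GLYPH line, or None if it does not match."""
    m = GDEF_RE.match(line)
    if m is None:
        return None
    gdef = {k: v for k, v in m.groupdict().items() if v is not None}
    gdef["LINE"] = line
    gdef["ID"] = int(gdef["ID"])

    unicodes = []
    if "UNICODE" in gdef:
        unicodes.append(int(gdef["UNICODE"]))  # decimal in the sample line
    if "UNICODEVALUES" in gdef:
        for item in gdef["UNICODEVALUES"].split(","):
            item = item.strip()
            if item:
                unicodes.append(int(item[2:], 16))  # drop the leading "U+"
    if unicodes:
        gdef["UNICODES"] = unicodes  # stands in for '@UNICODES'
    return gdef
```

Applied to the second sample line, this yields a structure whose 'UNICODES' list holds the two code points U+00B7 and U+2219.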

After we have all the GDEF lines, we match up names and/or Unicode values in order to make up data for the new GDEF lines. The result is a 'NEWID' field added to each matched GDEF structure and a 'GDEF' field added to the corresponding Glyph structure.
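The matching can be pictured as two passes: names first, then Unicode values for whatever is still unmatched. The Python sketch below is an assumption-laden illustration rather than the program's code. The function name `match_gdefs` is hypothetical, and the inputs are the glyph lookup tables and a list of GDEF dictionaries with 'NAME' and optional 'UNICODES' entries. It returns the (old name, new name) pairs found by the Unicode pass, since those represent renamed glyphs.

```python
def match_gdefs(gdefs, glyph_from_name, glyph_from_cmap_unicode):
    """Link GDEFs to glyphs in place; return a list of (old_name, new_name) renames."""
    renamed = []

    # Pass 1: match on glyph name.
    for gdef in gdefs:
        glyph = glyph_from_name.get(gdef.get("NAME"))
        if glyph is not None and "GDEF" not in glyph:
            gdef["NEWID"] = glyph["ID"]
            glyph["GDEF"] = gdef

    # Pass 2: match still-unmatched GDEFs to still-unmatched glyphs by Unicode value.
    for gdef in gdefs:
        if "NEWID" in gdef:
            continue
        for codepoint in gdef.get("UNICODES", []):
            glyph = glyph_from_cmap_unicode.get(codepoint)
            if glyph is not None and "GDEF" not in glyph:
                gdef["NEWID"] = glyph["ID"]
                glyph["GDEF"] = gdef
                renamed.append((gdef.get("NAME"), glyph.get("NAME")))
                break

    return renamed
```

A GDEF that matches in neither pass is left without 'NEWID', which is one of the conditions a caller would want to report.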

Once the mappings are done, we can then process GSUB and Anchor definitions as we see them in the source.

A lot of error conditions are tested in the process.

### TODO: The current algorithm first matches glyphs and GDEFs based on names. Then a second pass is made to see if any as-yet-unmatched glyphs and GDEFs can be matched based on Unicode value (e.g., the font author has changed "overscore" to "macron"). Any such match represents a change in glyph name. Currently this code does NOT fix up OT lookups that reference such glyphs so that they reference the new name. (Anchor point records *are* fixed up because they are identified by GID, not name.) If there are references in lookups, VOLT should fail to compile because it won't be able to find the glyph, but it would be nice if this program fixed up the lookups itself. As it is, it simply warns of the condition.

### TODO: FIX UP FOLLOWING COMMENTS -- these were based on 1.1

After extracting the field values from the GDEF, including the OldVoltName, there are two cases:

a) If the OldVoltName is generic (i.e., of the form "GLYPH_n") then there is no way for us to migrate this GDEF to the new file. Therefore if the rest of the GDEF looks untouched then this GDEF is silently ignored, else a warning about possible loss of data is issued.

b) If the OldVoltName is not generic, look it up in %NewGID. If it is present, issue a warning if it has already been used; otherwise replace the GID in the line with the new GID and mark it used. Also, add a direct mapping from old GID to new GID to the %NewGID hash. A warning is issued if the OldVoltName isn't present in %NewGID.

4) GDEF lines must be written out in glyph order, which might have changed from the old source to the new. Any generic entries will have to have "untouched" GDEFs synthesized. At this point we can build a list of new glyphs to be appended to the log.

5) We also have to fix up Anchor definitions. A typical Anchor line is:

DEF_ANCHOR "Below" ON 553 COMPONENT 1 LOCKED AT POS DX 312 DY -540 END_POS END_ANCHOR

The ON field is a glyph number, so these have to be fixed up by mapping glyph number to name, and then to new glyph number. Anchors do not have to be in any order, so we can just fix them up as we see them (provided we have already seen the GDEFs).
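A sketch of the anchor fix-up, in Python and under stated assumptions: the function name `fix_anchor` and the regular expression are hypothetical and based only on the sample line above, and the lookup table is assumed to map old glyph IDs to GDEF structures that already carry 'NEWID' where a match was found.

```python
import re

# Captures everything up to the ON value, then the old glyph ID itself.
ANCHOR_ON = re.compile(r'^(DEF_ANCHOR\s+(?:"[^"]*"|\S+)\s+ON\s+)(\d+)\b')

def fix_anchor(line, gdef_from_id):
    """Rewrite the ON glyph ID of a DEF_ANCHOR line.

    Returns the line unchanged if it is not an anchor line, and None if the
    old glyph ID has no counterpart in the new font.
    """
    m = ANCHOR_ON.match(line)
    if m is None:
        return line
    gdef = gdef_from_id.get(int(m.group(2)))
    if gdef is None or "NEWID" not in gdef:
        return None  # caller decides whether to warn or drop the anchor
    return f"{m.group(1)}{gdef['NEWID']}{line[m.end():]}"
```

Because the rewrite only depends on the GDEF table, each anchor line can be handled as soon as it is read, in whatever order the anchors appear.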

Oh yes, one additional difficulty: VOLT source uses \013 (decimal 13, a carriage return) to separate lines, but input generated with some other editor might use a different line ending. So I've added code to automatically detect the convention used by inVoltFile, and I always use \013 for the output.
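One way such detection could work is to count each candidate line ending and split on the most common one. The Python sketch below is an illustration only: the function names are hypothetical, the counting heuristic is an assumption rather than the program's actual method, and it reads \013 as decimal 13, i.e. a carriage return.

```python
def split_volt_lines(text):
    """Detect the line-ending convention of text and split it into lines.

    Returns (separator, lines). Falls back to a carriage return when the
    text contains no line ending at all.
    """
    crlf = text.count("\r\n")
    counts = {
        "\r\n": crlf,
        "\r": text.count("\r") - crlf,  # bare CR only
        "\n": text.count("\n") - crlf,  # bare LF only
    }
    sep = max(counts, key=counts.get) if any(counts.values()) else "\r"
    lines = text.split(sep)
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after a trailing separator
    return sep, lines

def join_volt_lines(lines):
    """Write lines back out with carriage-return endings (assumed VOLT convention)."""
    return "".join(line + "\r" for line in lines)
```

Reading with the detected separator and always writing with one fixed separator keeps the output consistent no matter which editor produced the input.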