Changes for version 2.28 - 2000-03-27

  • Junked local (Expat.xs) declaration parsing and patched expat to handle XML declarations, element declarations, attlist declarations, and all entity declarations. By eliminating both shadow buffers and local declaration parsing in Expat.xs, I've eliminated the two most common sources of serious bugs in the expat interface.
      o thus fixed the segfault and parse position bugs reported by Ivan Kurmanov <iku@fnmail.com>
      o and the doctype bug reported by Kevin Lund <Kevin.Lund@westgroup.com>
      o The element declaration handler no longer receives a string, but an XML::Parser::ContentModel object that represents the parsed model, but still looks like a string if referred to as a string. This class is documented in the XML::Parser::Expat pod under "XML::Parser::ContentModel Methods".
      o The doctype declaration handler no longer receives the internal subset as a string, but in its place a true or undef value indicating whether or not there is an internal subset. Also, it's called prior to processing either the internal or external DTD subset (as suggested by Enno Derksen <enno@att.com>).
      o There is a new DoctypeFin handler that's called after finishing parsing all of the DOCTYPE declaration, including any internal or external DTD declarations.
      o One bit of lossage is that recognized_string, original_string, and default_current no longer work inside declaration handlers.
  • Added a handler that gets called after parsing external entities: ExternEntFin. Suggested by Jeff Horner <jhorner@netcentral.net>.
  • parsefile, file_ext_ent_handler, & lwp_ext_ent_handler now all set the base path. This problem has been raised more than once and I'm not sure to whom credit should be given.
  • The file_ext_ent_handler now opens a file handle instead of reading the entire entity at once.
  • Merged patches supplied by Larry Wall to (for perl 5.6 and beyond) tag generated strings as UTF-8, where appropriate.
  • Fixed a bug in xml_escape reported by Jerry Geiger <jgeiger@rios.de>. It failed when requesting escaping of perl regex meta-characters.
  • Laurent Caprani <caprani@pop.multimania.com> reported a bug in the Proc handler for the Debug style.
  • <chocolateboy@usa.net> sent in a patch for the element index mechanism. I was popping the stack too soon in the endElement fcn.
  • Jim Miner <jfm@winternet.com> sent in a patch to fix a warning in Expat.pm.
  • Kurt Starsinic pointed out that the eval used to check for string versus IO handle was leaving $@ dirty, thereby foiling higher level exception handlers.
  • An expat question by Paul Prescod <paul@prescod.net> helped me see that exceptions in the parse call bypass the Expat release method, causing memory leaks.
  • Mark D. Anderson <mda@discerning.com> noted that calling recognized_string from the Final method caused a dump. There are a bunch of methods that should not be called after parsing has finished. These now have protective if statements around them.
  • Updated canonical utility to conform to newer version of Canonical XML working draft.
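
The handler changes described above can be illustrated with a short sketch. It assumes an installed XML::Parser of this version or later; the document text, the @events array, and the message strings are made up for illustration. It registers Doctype, Element, and DoctypeFin handlers and records the order in which they fire, along with the internal-subset flag and the class of the content model argument.

```perl
use strict;
use warnings;
use XML::Parser;

my @events;    # illustrative log of handler calls, in order

my $parser = XML::Parser->new(
    Handlers => {
        # Called before either DTD subset is processed. The last argument
        # is a true/undef flag, not the text of the internal subset.
        Doctype => sub {
            my ($expat, $name, $sysid, $pubid, $internal) = @_;
            push @events, "Doctype $name internal=" . ($internal ? 1 : 0);
        },
        # $model is an object, but it still interpolates like a string.
        Element => sub {
            my ($expat, $name, $model) = @_;
            push @events, "Element $name " . ref($model) . " $model";
        },
        # Called once the whole DOCTYPE declaration has been parsed.
        DoctypeFin => sub {
            push @events, "DoctypeFin";
        },
    },
);

$parser->parse(<<'XML');
<!DOCTYPE doc [
  <!ELEMENT doc (item)*>
  <!ELEMENT item (#PCDATA)>
]>
<doc><item>x</item></doc>
XML

print "$_\n" for @events;
```

Code that needs more than the string form of the model can call the methods listed under "XML::Parser::ContentModel Methods" in the XML::Parser::Expat pod.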

Modules

Lowlevel access to James Clark's expat XML parser
A perl module for parsing XML documents
