XML::Parser is implemented as a Java XS module (XMLParserExpat.java) backed by JDK's built-in SAX parser (javax.xml.parsers.SAXParser). This replaces the native C/XS expat bindings with a pure-Java equivalent, dispatching SAX events to the same Perl callback interface.
- Java XS:
src/main/java/org/perlonjava/runtime/perlmodule/XMLParserExpat.java - Perl shim:
src/main/perl/lib/XML/Parser/Expat.pm(modified from upstream) - Backend: JDK SAX (Apache Xerces built into the JDK)
- SAX vs DOM: SAX chosen for streaming event model that maps naturally to expat's callback API
- Namespace dualvars: Namespace-qualified names use
DualVar(numericIndex, stringName)matching expat's behavior whereint($name)gives namespace index - BYTE_STRING encoding: ParseString uses
ISO_8859_1forBYTE_STRINGinput to avoid double-encoding raw UTF-8 bytes - SystemId un-resolution: SAX resolves relative systemIds to absolute
file:///URIs;unresolveSysId()strips the base to recover the original relative paths
Current: 45/45 test files pass (100%), 434/434 subtests pass (100%)
All XML::Parser 2.56 tests pass (excluding 2 Devel::CheckLib C compiler detection tests that are irrelevant to PerlOnJava).
Tests are stored in src/test/resources/module/XML-Parser/ and run via:
make test-bundled-modulesThis uses JUnit 5 (ModuleTestExecutionTest.java) with @Tag("module") to discover and execute all .t files under module/*/t/. The test runner chdirs to the module directory so relative paths resolve correctly.
Tests fixed: decl.t (44/46→46/46), foreign_dtd.t (0/5→5/5), checklib_findcc.t (2/3→3/3), checklib_tmpdir.t (1/3→3/3)
- NOTATION type format fix: Off-by-one bug in
attributeDecl()—substring(8)→substring(9)to strip the space SAX adds afterNOTATION - XMLDecl for text declarations: Added
fireTextDeclHandler()inresolveEntity()to fire the XMLDecl callback for text declarations in external parsed entities (withversion=undef), beforeconvertEncoding()rewrites the encoding - UseForeignDTD: When
UseForeignDTD => 1and no DOCTYPE exists, calls ExternEnt handler with(parser, base, undef, undef), reads DTD content, injects<!DOCTYPE _fdt SYSTEM "__perlonjava_foreign_dtd__">after the XML declaration, and resolves the synthetic system ID inresolveEntity() - "undefined entity" error message: SAX reports
"was referenced, but not declared"for undefined entities; mapped to expat's"undefined entity"format informatError() - Devel::CheckLib: Replaced 9-line stub with real upstream source from XML-Parser-2.56 tarball
Files: XMLParserExpat.java