package xml-light
Module XmlParser
Xml Light Parser
Basic parsing functions are available in the Xml module; this module provides a way to create, configure and run an Xml parser.
Abstract type for an Xml parser.
type source =
| SFile of string
| SChannel of in_channel
| SString of string
| SLexbuf of Lexing.lexbuf
Several kinds of resources can contain Xml documents.
This function enables or disables automatic DTD proving by the parser. Note that Xml documents with no reference to a DTD are never proved when parsed (but you can prove them later using the Dtd module). By default, prove is true.
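As a rough illustration of turning proving off and proving a document afterwards, here is a minimal sketch. The signatures are not shown on this page, so it assumes `XmlParser.make : unit -> t`, `XmlParser.prove : t -> bool -> unit`, `XmlParser.parse : t -> source -> Xml.xml`, and the `Dtd.parse_file`, `Dtd.check` and `Dtd.prove` functions as they appear in the xml-light distribution. The helper names `parse_unproved` and `prove_later`, and the `dtd_path` and `root` parameters, are hypothetical.

```ocaml
(* Parse a string with automatic DTD proving disabled.
   Assumes XmlParser.make, XmlParser.prove and XmlParser.parse
   have the signatures found in the xml-light sources. *)
let parse_unproved (s : string) : Xml.xml =
  let p = XmlParser.make () in
  XmlParser.prove p false;
  XmlParser.parse p (XmlParser.SString s)

(* Prove an already parsed document against a DTD loaded by hand.
   [dtd_path] is a hypothetical path to a DTD file, and [root] is the
   expected name of the root element. *)
let prove_later ~(dtd_path : string) ~(root : string) (doc : Xml.xml) : Xml.xml =
  let checked = Dtd.check (Dtd.parse_file dtd_path) in
  Dtd.prove checked root doc
```

Keeping the two steps separate can be useful when the DTD location is only known after the document has been read.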
When parsing an Xml document from a file using the Xml.parse_file function, the DTD file, if the Xml document declares one, has to be in the same directory as the xml file. When using other parsing functions, such as on a string or on a channel, the parser will always raise Xml.File_not_found if a DTD file is needed and prove is enabled. To enable loading of the DTD file, the user has to configure the Xml parser with a resolve function, which takes the DTD filename as argument and returns a checked DTD. The user can then implement any kind of DTD loading strategy, and can use the Dtd module functions to parse and check the DTD file. By default, the resolve function raises Xml.File_not_found.
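One possible DTD loading strategy is to look up every DTD in a fixed directory. The sketch below assumes `XmlParser.resolve : t -> (string -> Dtd.checked) -> unit` along with `XmlParser.make`, `XmlParser.prove` and `XmlParser.parse`, none of whose signatures are shown on this page. The function name `parse_string_with_dtds` and the `dtd_dir` parameter are hypothetical.

```ocaml
(* Parse an Xml document held in a string, resolving any DTD file it
   declares relative to [dtd_dir] (a hypothetical directory).
   The resolve callback receives the DTD filename and must return a
   checked DTD, here built with Dtd.parse_file and Dtd.check. *)
let parse_string_with_dtds ~(dtd_dir : string) (s : string) : Xml.xml =
  let p = XmlParser.make () in
  XmlParser.prove p true;
  XmlParser.resolve p (fun filename ->
    Dtd.check (Dtd.parse_file (Filename.concat dtd_dir filename)));
  XmlParser.parse p (XmlParser.SString s)
```

A document that declares no DTD never triggers the callback, so the same function can be used for both kinds of input.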
When an Xml document is parsed, the parser checks that the end of the document is reached, so for example parsing "<A/><B/>" fails instead of returning only the A element. You can turn off this check by setting check_eof to false (by default, check_eof is true).
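The effect of the end-of-document check can be sketched as follows. This assumes `XmlParser.check_eof : t -> bool -> unit`, and that a parse failure surfaces as the `Xml.Error` exception, which `Xml.error` converts to a message; the page itself only says that parsing "will fail". The helper name `parse_with_eof_check` is hypothetical.

```ocaml
(* Parse [s] with the end-of-document check switched on or off.
   Failures are assumed to be reported through Xml.Error and are
   turned into a printable message with Xml.error. *)
let parse_with_eof_check ~(check : bool) (s : string) : (Xml.xml, string) result =
  let p = XmlParser.make () in
  XmlParser.check_eof p check;
  try Ok (XmlParser.parse p (XmlParser.SString s))
  with Xml.Error e -> Error (Xml.error e)
```

With `~check:false`, the input "<A/><B/>" should yield only the A element, as described above; with `~check:true` the same input should produce an error.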
Once the parser is configured, you can run it on any kind of Xml document source to parse its contents into an Xml data structure.
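To show how the four source constructors fit together, here is a small sketch that runs a default parser on each kind of source. It assumes `XmlParser.make : unit -> t` and `XmlParser.parse : t -> source -> Xml.xml`, which are not spelled out on this page; the wrapper names (`parse_any`, `from_string` and so on) are hypothetical.

```ocaml
(* Run a freshly created parser, with its default configuration,
   on any kind of source. *)
let parse_any (src : XmlParser.source) : Xml.xml =
  let p = XmlParser.make () in
  XmlParser.parse p src

(* One thin wrapper per constructor of XmlParser.source. *)
let from_string (s : string) : Xml.xml = parse_any (XmlParser.SString s)
let from_file (path : string) : Xml.xml = parse_any (XmlParser.SFile path)
let from_channel (ic : in_channel) : Xml.xml = parse_any (XmlParser.SChannel ic)
let from_lexbuf (lb : Lexing.lexbuf) : Xml.xml = parse_any (XmlParser.SLexbuf lb)
```

For example, `from_string "<doc><item/></doc>"` should return an element tagged `doc` with a single child.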
When several PCData elements are separated by a \n (or \r\n), you can either split the PCData into two distinct PCData elements or merge them into one PCData with \n as separator. The default behavior is to concatenate the PCData, but this can be changed for a given parser with this flag.
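A quick way to compare the two behaviors is to parse the same multi-line text with the flag on and off and inspect the children of the root element. The sketch assumes the flag is set through `XmlParser.concat_pcdata : t -> bool -> unit`, a name taken from the xml-light sources rather than from this page; the helper `children_with` is hypothetical.

```ocaml
(* Return the children of the root element of [s], parsed with PCData
   concatenation enabled or disabled.
   XmlParser.concat_pcdata is assumed to have type t -> bool -> unit. *)
let children_with ~(concat : bool) (s : string) : Xml.xml list =
  let p = XmlParser.make () in
  XmlParser.concat_pcdata p concat;
  Xml.children (XmlParser.parse p (XmlParser.SString s))

(* Compare both settings on a text node that spans two lines. *)
let merged = children_with ~concat:true "<a>line one\nline two</a>"
let split = children_with ~concat:false "<a>line one\nline two</a>"
```

Comparing `List.length merged` with `List.length split` shows whether the installed version splits the text at the line break.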