SgmlReader is an XmlReader API over any SGML document. A command-line utility is also provided that writes out the well-formed XML result. Because HTML is an SGML grammar, you can use this tool to convert HTML into well-formed XML. The utility was created by Microsoft XML guru Chris Lovett.
This example demonstrates how to retrieve a remote Web page, parse it with the SgmlReader class, and then use XPath to access specific nodes in the well-formed HTML.
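The same three steps (fetch, tolerant parse into a well-formed tree, XPath query) can be sketched in a language-neutral way. The block below is only an illustration of the idea and does not use SgmlReader or any .NET API: it swaps in Python's standard library (html.parser plus ElementTree, which supports only a limited XPath subset). The names fetch_html, HtmlTreeBuilder, and html_to_tree are hypothetical helpers invented for this sketch, and the recovery rules are deliberately simpler than those of a real SGML parser.

```python
import urllib.request
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def fetch_html(url):
    """Step 1 (hypothetical helper): download a page and decode it to text."""
    with urllib.request.urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class HtmlTreeBuilder(HTMLParser):
    """Step 2: turn loosely written HTML into a well-formed element tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # tags opened but not yet closed

    def handle_starttag(self, tag, attrs):
        # Valueless attributes such as "disabled" arrive as None.
        self.builder.start(tag, {k: (v or "") for k, v in attrs})
        if tag in VOID_TAGS:
            self.builder.end(tag)  # close void elements immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close any unclosed children first, then the tag itself.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.builder.end(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # drop text outside the root element
            self.builder.data(data)


def html_to_tree(html):
    """Parse an HTML string and return the root of a well-formed tree."""
    parser = HtmlTreeBuilder()
    parser.feed(html)
    parser.close()
    while parser.open_tags:  # close whatever the document left open
        parser.builder.end(parser.open_tags.pop())
    return parser.builder.close()


if __name__ == "__main__":
    # A local string stands in for fetch_html(url) so the sketch runs offline.
    sample = "<html><body><p>One<p>Two<br><a href=x>link</a></body></html>"
    root = html_to_tree(sample)
    # Step 3: query the repaired tree with an XPath expression.
    for anchor in root.findall(".//a"):
        print(anchor.get("href"), anchor.text)
    print(ET.tostring(root, encoding="unicode"))
```

To run it against a live page, replace the sample string with the result of fetch_html and a URL of your choice. Note that this sketch nests an unclosed p inside the previous one instead of closing it implicitly, so the resulting tree can differ from what a DTD-aware parser such as SgmlReader would produce.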