How to stop anti-xml from downloading DTDs
Today I wrote a small Scala program to parse http://pragprog.com/titles.
However, it takes a long time because Java's XML infrastructure
downloads the DTDs referenced by the document on the fly!
I found a solution in "DTD download error while parsing XHTML document in XOM" and applied it to anti-xml.
import com.codecommit.antixml.SAXParser
import javax.xml.parsers.SAXParserFactory
import java.net.URL
import scala.io.Source

object Main {
  // Xerces feature that controls whether external DTDs are fetched
  val LOAD_EXTERNAL_DTD = "http://apache.org/xml/features/nonvalidating/load-external-dtd"

  def main(argv: Array[String]): Unit = {
    val source = Source.fromURL(new URL("http://pragprog.com/titles"), "UTF-8")
    // Build a non-validating, namespace-aware parser that never loads external DTDs
    val factory = () => {
      val f = SAXParserFactory.newInstance
      f.setNamespaceAware(true)
      f.setValidating(false)
      f.setFeature(LOAD_EXTERNAL_DTD, false)
      f.newSAXParser
    }
    val doc = new SAXParser(factory).fromSource(source)
    println(doc)
  }
}
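To see the feature flag in isolation, here is a minimal sketch that uses only the JDK's SAX API, without anti-xml and without network access. The object name NoDtdSketch, the helper elementNames, and the sample document (whose DOCTYPE points at a made-up, unresolvable host) are my own assumptions for illustration. The feature URI is Xerces-specific, so this assumes a Xerces-derived parser such as the one bundled with the JDK; other parsers may reject the feature with an exception.

```scala
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.{Attributes, InputSource}
import org.xml.sax.helpers.DefaultHandler
import scala.collection.mutable.ListBuffer

object NoDtdSketch {
  // Same Xerces feature URI as in the post
  val LoadExternalDtd =
    "http://apache.org/xml/features/nonvalidating/load-external-dtd"

  // Hypothetical sample: the DTD URL uses a host that can never resolve,
  // so a parser that tried to fetch it would fail instead of parsing.
  val sample: String =
    """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      |  "http://example.invalid/xhtml1-strict.dtd">
      |<html xmlns="http://www.w3.org/1999/xhtml"><body><p>hello</p></body></html>""".stripMargin

  // Parses the given XML and returns the local names of all elements
  // in document order, with external DTD loading switched off.
  def elementNames(xml: String): List[String] = {
    val factory = SAXParserFactory.newInstance
    factory.setNamespaceAware(true)
    factory.setValidating(false)
    factory.setFeature(LoadExternalDtd, false)

    val names = ListBuffer.empty[String]
    val handler = new DefaultHandler {
      override def startElement(uri: String, localName: String,
                                qName: String, attrs: Attributes): Unit = {
        names += localName
      }
    }
    factory.newSAXParser.parse(new InputSource(new StringReader(xml)), handler)
    names.toList
  }

  def main(args: Array[String]): Unit =
    println(elementNames(sample))
}
```

Because the DTD is never requested, the parse should finish immediately even though the DOCTYPE names an unreachable URL. Note that entity references defined only in the external DTD (for example &nbsp; in XHTML) are not expanded when the DTD is skipped.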
If you want to use the above code now,
clone anti-xml's repository and run publish-local
in sbt.
In anti-xml v0.3, SAXParser's constructor doesn't take a factory,
and anti-xml v0.4-SNAPSHOT doesn't provide a binary for Scala 2.9.1.
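Once the snapshot has been published locally, a project could depend on it with an sbt setting along these lines. This is only a sketch: the organization and artifact name ("com.codecommit" and "anti-xml") are my assumption based on the package name, so check them against the build definition in the cloned repository.

```scala
// build.sbt (sketch; the coordinates below are assumptions, verify against anti-xml's own build)
scalaVersion := "2.9.1"

// Resolved from the local Ivy repository after running publish-local in the anti-xml checkout
libraryDependencies += "com.codecommit" %% "anti-xml" % "0.4-SNAPSHOT"
```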