TOMÁŠ HUBÁLEK BLOG: BAVTE SE PŘIMĚŘENĚ…

XML: A CDATA section cannot contain just any character!

Written by: Tomáš Hubálek - Oct 30, 06

I have worked with XML for a few years, but I was recently surprised to learn that a CDATA section cannot contain just any character. I came across a document that contains ASCII character 27 (Escape) in a CDATA section, and parsing this document failed with an exception:

org.dom4j.DocumentException: Error on line 38508 of document file:///.../rm_N444148061030074509.xml :
An invalid XML character (Unicode: 0x1b) was found in the CDATA section.
Nested exception: An invalid XML character (Unicode: 0x1b) was found in the CDATA section.

If you look into the XML 1.0 specification, the valid characters in XML documents are:

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

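and ASCII 27 (#x1B) does not meet this requirement: it lies below #x20 and is not one of the three permitted control characters (tab #x9, line feed #xA, carriage return #xD). Wrapping the text in CDATA does not help, because CDATA only switches off markup parsing; it does not widen the set of allowed characters :-)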
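
One possible way to avoid producing such a document is to check the text against the Char production before writing it out. The following is only a minimal sketch, not part of dom4j: the class name XmlChars and the methods isValidXmlChar and stripInvalidXmlChars are hypothetical names chosen for this illustration. It also assumes that silently dropping the offending characters is acceptable for your data, which may not be the case.

```java
// Hypothetical helper, not a dom4j API: mirrors the XML 1.0 Char production.
final class XmlChars {
    private XmlChars() {}

    /** Returns true if the code point matches the XML 1.0 Char production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every non-Char code point dropped. */
    static String stripInvalidXmlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // Iterate by code point so surrogate pairs are handled as one unit.
            int cp = input.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Escape character (0x1B) between the two words is removed.
        String raw = "before\u001Bafter";
        System.out.println(stripInvalidXmlChars(raw));
    }
}
```

Applied to text before it is placed into a CDATA section, a filter like this would keep the Escape character out of the file. If the character carries meaning, encoding the value (for example as Base64) is probably a better choice than stripping it.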
