597

I have some XML text that I wish to render in an HTML page. This text contains an ampersand, which I want to render in its entity representation: &amp;.

How do I escape this ampersand in the source XML? I tried &amp;, but this is decoded as the actual ampersand character (&), which is invalid in HTML.

So I want to escape it in such a way that it will be rendered as &amp; in the web page that uses the XML output.

lospejos
  • 1,976
  • 3
  • 19
  • 35
AJM
  • 32,054
  • 48
  • 155
  • 243
  • 1
    The claim in the latest revision of this question that *"the actual ampersand character (&) ... is invalid in HTML."* is false. Indeed, even the accepted answer to the linked question provided as justification states *"HTML5 allows you to leave it unescaped, but only when the data that follows does not look like a valid character reference"*. – Mark Amery Aug 31 '16 at 21:03

10 Answers

472

When your XML contains &amp;amp;, this will result in the text &amp;.

When you use that in HTML, that will be rendered as &.
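The two decoding steps can be checked with a small Python sketch. Python is only my choice for illustration, the `<msg>` element is made up, and `html.unescape` merely stands in for what a browser does when it renders the text.

```python
import html
import xml.etree.ElementTree as ET

# XML source with a doubly escaped ampersand (hypothetical <msg> element).
xml_source = "<msg>Fish &amp;amp; chips</msg>"

# Step 1: the XML parser decodes one level of escaping.
text_from_xml = ET.fromstring(xml_source).text

# Step 2: an HTML renderer decodes the remaining entity.
# html.unescape is used here as a stand-in for the browser.
rendered = html.unescape(text_from_xml)

print(text_from_xml)
print(rendered)
```

After the XML step the text still contains the entity, which is what the HTML page needs; only the final rendering turns it into a bare ampersand.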

CodeCaster
  • 147,647
  • 23
  • 218
  • 272
Wim ten Brink
  • 25,901
  • 20
  • 83
  • 149
216

As per §2.4 of the XML 1.0 spec, you should be able to use &amp;.

> I tried &amp; but this isn't allowed.

Are you sure it isn't a different issue? XML explicitly defines this as the way to escape ampersands.

John Feminella
  • 303,634
  • 46
  • 339
  • 357
  • 5
    This was perfectly reasonable when posted, but changes (or perhaps clarifications) to the question since have made it seem nonsensical as an answer. For one thing, the quoted passage is no longer present in the question. – Mark Amery Aug 31 '16 at 20:58
164

The & character is itself an escape character in XML, so the solution is to follow it with the Unicode decimal code for & and a semicolon, which avoids XML parsing errors. That is, replace the character & with &#38;.
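As a rough check that the named, decimal, and hexadecimal forms of the ampersand all decode to the same character, here is a minimal Python sketch. The `parsed_text` helper and the `<v>` wrapper element are hypothetical names used only for this example.

```python
import xml.etree.ElementTree as ET


def parsed_text(fragment: str) -> str:
    """Wrap a fragment in a throwaway <v> element and return the parsed text."""
    return ET.fromstring(f"<v>{fragment}</v>").text


named = parsed_text("a &amp; b")         # predefined entity
decimal = parsed_text("a &#38; b")       # decimal character reference
hexadecimal = parsed_text("a &#x26; b")  # hexadecimal character reference

# A bare ampersand is not well-formed XML, so the parser rejects it.
try:
    parsed_text("a & b")
    raw_is_rejected = False
except ET.ParseError:
    raw_is_rejected = True

print(named, decimal, hexadecimal, raw_is_rejected)
```

All three escaped forms should come back as the same string, while the unescaped one fails to parse.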

Martin Schneider
  • 14,263
  • 7
  • 55
  • 58
trouble
  • 1,659
  • 1
  • 10
  • 3
  • 7
    I really prefer this solution! Should also be possible to use the hexadecimal notation: `&#x26;` – CodeManX Apr 26 '14 at 03:24
  • 2
    Logically, why would this work? Both strings have an ampersand, including the one with the character code on the end... – sijpkes Feb 25 '16 at 04:19
  • 7
    @sijpkes Because the ampersand here tells the parser that the following characters are used to represent another character, which in this case would be an ampersand. An ampersand isn't "illegal" in XML-- it just has a special meaning. It means "all of the characters after this until you hit a semicolon should be translated to something else". When you have an ampersand normally, without the descriptive characters and trailing semicolon, the parser gets confused. – Riley Major Aug 30 '16 at 14:37
  • 1
    This is the answer for me. Adding &#38; in the Location of my Response Header fixed it and is not showing the Ampersand on the Response Header. :D – iamjoshua Aug 02 '19 at 05:24
  • 7
    Stack Overflow is so great. Here is an almost 11 year old post that solves my problem. And it has been viewed over 690,000 times. – Bill May 30 '20 at 14:59
  • Yesss Bill, now 2023 and 826,000 times and solving the same problem, hahahaha, GREAT !!! – ddanone Aug 08 '23 at 17:31
  • The semicolon is necessary! – Markus Sep 01 '23 at 13:46
83

Use CDATA tags:

 <![CDATA[
   This is some text with ampersands & other funny characters. >>
 ]]>
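A minimal Python sketch of how a parser treats such a section, assuming a made-up `<note>` element: the raw characters come through as plain text, and when the tree is written back out, ElementTree escapes them instead of keeping the CDATA wrapper.

```python
import xml.etree.ElementTree as ET

# Hypothetical <note> element whose content is wrapped in CDATA.
xml_source = "<note><![CDATA[Tom & Jerry >> 1 < 2]]></note>"

root = ET.fromstring(xml_source)

# The parser hands back the raw characters, unescaped.
text = root.text

# ElementTree does not preserve CDATA; on output it escapes instead.
serialized = ET.tostring(root, encoding="unicode")

print(text)
print(serialized)
```

So CDATA and entity escaping are two spellings of the same text content; which one survives a round trip depends on the tool doing the serializing.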
Patrick Hofman
  • 153,850
  • 22
  • 249
  • 325
scragar
  • 6,764
  • 28
  • 36
  • 5
    This is a guess rather than an answer. – Bryan Oakley Aug 25 '09 at 14:24
  • 10
    It might be a guess; it is correct though. CDATA markers allow raw ampersands to be used. – Quentin Aug 25 '09 at 14:40
  • 19
    The original post never made clear where the & was to be used. CDATA tags cannot be used for attribute values, only for the actual content of the tags, hence the reason I included the '?'. – scragar Aug 25 '09 at 19:34
  • 1
    This is also great for characterizing xml data and this answer is helpful in many other scenarios concerning xml rendering. For me, it really helped in Camel XML DSL, when I needed to set the body or some header with some XML data, the Camel XML parser ignored the CDATA contents, reading them as a stream of characters. Without this the camel engine throws invalid xml structure exceptions – Kimutai Dec 01 '17 at 05:59
  • 2
    This is exactly the answer I needed, because in my case I'm not sure what characters might be coming in the XML, so I need to escape everything in that section. – Matt Jan 24 '19 at 16:22
55

&amp; should work just fine. Wikipedia has a list of predefined entities in XML.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
nikc.org
  • 16,462
  • 6
  • 50
  • 83
16

In my case I had to change it to %26.

I needed to escape & in a URL. So &amp; did not work out for me. The urlencode function changes & to %26. This way neither XML nor the browser URL mechanism complained about the URL.
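The answer does not say which environment its urlencode function comes from, so the following is only a Python sketch of the same idea using the standard library. The `example.com` URL, the parameter names, and the `<link>` element are invented for illustration. It also shows that the two escaping layers are separate: an ampersand inside a value becomes %26, while the ampersand that separates query parameters still needs XML escaping.

```python
from urllib.parse import quote, urlencode
from xml.sax.saxutils import escape

value = "Tom & Jerry"

# Percent-encode a single value: the ampersand becomes %26.
encoded_value = quote(value, safe="")

# Build a query string; urlencode encodes each value and joins pairs with "&".
query = urlencode({"q": value, "page": 2})
url = "https://example.com/search?" + query  # hypothetical URL

# The "&" separating the parameters must still be escaped for XML.
xml_link = "<link>" + escape(url) + "</link>"

print(encoded_value)
print(xml_link)
```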

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Serhat Akay
  • 536
  • 3
  • 10
  • 8
    Yes. Note though that the OP was about escaping in XML. Escaping in a URL is a different issue. The real fun begins when you have URLs in XML, or XML-fragments in URLs... – Oskar Berggren Nov 14 '13 at 12:10
  • urlencode() in what environment? [In PHP](https://www.php.net/manual/en/function.urlencode.php)? – Peter Mortensen Nov 09 '21 at 14:23
8

I have tried &amp, but it didn't work. Based on Wim ten Brink's answer I tried &amp;amp and it worked.

One of my fellow developers suggested that I use &#x26;, and that worked regardless of how many times it may be rendered.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
mcampos
  • 141
  • 1
  • 7
  • What about the semicolons? Code formatting can be used to work around formatting problems here (but it is also possible without - using "ironic" formatting). – Peter Mortensen Nov 09 '21 at 14:19
7

&amp; is the way to represent an ampersand in most sections of an XML document.

If you want to have XML displayed within HTML, you need to first create properly encoded XML (which involves changing & to &amp;) and then use that to create properly encoded HTML (which involves again changing & to &amp;). That results in:

&amp;amp;
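The two encoding passes can be sketched in Python with standard-library helpers. This is just one way to do it, and the sample string is made up.

```python
import html
from xml.sax.saxutils import escape as xml_escape

raw = "R&D"  # sample text containing a bare ampersand

# Pass 1: make the text safe to place inside an XML document.
as_xml_text = xml_escape(raw)

# Pass 2: make that XML source safe to display inside an HTML page.
as_html_text = html.escape(as_xml_text)

print(as_xml_text)
print(as_html_text)
```

Each pass adds one level of escaping, which is why the ampersand ends up escaped twice in the HTML source.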

For a more thorough explanation of XML encoding, see:

What characters do I need to escape in XML documents?

Community
  • 1
  • 1
Riley Major
  • 1,904
  • 23
  • 36
4

<xsl:text disable-output-escaping="yes">&amp;&nbsp;</xsl:text> will do the trick.

Isaac Truett
  • 8,734
  • 1
  • 29
  • 48
Rick
  • 57
  • 1
0

Consider if your XML looks like below.

<Employees Id="1" Name="ABC">
  <Query>
    SELECT * FROM EMP WHERE ID=1 AND RES<>'GCF'
  </Query>
</Employees>

You cannot use <> directly because the parser treats < as the start of a tag and throws an error. In that case, you can use &#60;&#62; in its place.

<Employees Id="1" Name="ABC">
  <Query>
    SELECT * FROM EMP WHERE ID=1 AND RES &#60;&#62; 'GCF'
  </Query>
</Employees>
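To confirm that the escaped form reads back as the intended SQL, here is a small Python check. It is only a sketch; the XML is the example above with the Query element closed properly.

```python
import xml.etree.ElementTree as ET

escaped_xml = """<Employees Id="1" Name="ABC">
  <Query>
    SELECT * FROM EMP WHERE ID=1 AND RES &#60;&#62; 'GCF'
  </Query>
</Employees>"""

# The parser turns the character references back into < and >.
query = ET.fromstring(escaped_xml).find("Query").text.strip()

# The unescaped version is rejected as not well-formed.
unescaped_xml = escaped_xml.replace("&#60;&#62;", "<>")
try:
    ET.fromstring(unescaped_xml)
    unescaped_is_rejected = False
except ET.ParseError:
    unescaped_is_rejected = True

print(query)
print(unescaped_is_rejected)
```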

14.1 How to use special characters in XML has all the codes.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Sarath Subramanian
  • 20,027
  • 11
  • 82
  • 86