Bawolff's rants: Write up DiceCTF 2022: Flare and stumbling upon a Traefik 0-day (CVE-2022-23632)

A while back, I participated in DiceCTF. During the contest I was the first person to solve the problem "Flare":

This was pretty exciting. Normally I'm happy just to be able to solve a problem — I'm never first. Even better though, my solution wasn't the intended solution, and I had instead found a 0-day in the load balancing software the contest was using!

The contest was a while ago so this post is severely belated. Nonetheless, given how exciting this all was, I wanted to write it up.

The Problem

The contest problem was short and sweet. You had a flask app behind Cloudflare. If you accessed it with an internal IP address (e.g. 192.168.0.2 or 10.0.0.2), you get the flag. The entire code is just 18 lines, mostly boilerplate:

import os
import ipaddress

from flask import Flask, request
from gevent.pywsgi import WSGIServer

app = Flask(__name__)
flag = os.getenv('FLAG', default='dice{flag}')

@app.route('/')
def index():
    ip = ipaddress.ip_address(request.headers.get('CF-Connecting-IP'))

    if isinstance(ip, ipaddress.IPv4Address) and ip.is_private:
        return flag

    return f'No flag for {format(ip)}'

WSGIServer(('', 8080), app).serve_forever()

Avenues of Attack

Simple to understand, but how to attack? The direct approach seems unlikely; The service is behind Cloudflare — if we can trick Cloudflare into thinking we are connecting to a site from an arbitrary IP address, then that would be a very significant vulnerability. If we could actually make a TCP connection to Cloudflare from a IP address that should not be globally routable, then something is seriously wrong with the internet.

Thinking about it, it seems like the following approaches are possible:

Hack Cloudflare to send the wrong CF-Connection-IP header (Seems impossible)
Use some sort of HTTP smuggling or header normalization issue to send something that flask thinks is a valid CF-Connection-IP header, but Cloudflare doesn't recognize, and hope that when faced with 2 conflicting headers, flask chooses ours instead of Cloudflare's. (Also seems impossible)
Find the backend server and connect to it directly, bypassing Cloudflare allowing us to send whatever header we want

With that in mind, I figured it had to be the third one. After all, this is a challenge, so it must have a solution, hence I figured it's probably neither impossible nor involving a high value vulnerability in a major CDN.

I was wrong of course. The intended solution was sort of a combination of the first possibility which i dismissed as impossible and python having an "interesting" definition of an "internal" IP, which there is no way I ever would have gotten. More on that later.

Trying to Find the Backend

So now that I determined my course of action, I started hunting for the backend server. I tried googling around for snippets from the page to see if any other sites came up in google with the same text. I tried looking at various "dns history" sites that were largely useless. I tried certificate transparency logs, but no. I even tried blindly typing in variations on the challenge's domain name.

The setup for the challenges were as follows: This challenge was on https://flare.mc.ax. The other challenges were all on *.mc.ax. This was the only challenge served by Cloudflare, the rest were served directly by the CTF infrastructure using a single IP address and a wildcard certificate.

With that in mind, I thought, maybe I could connect to the IP serving the other challenges, give the flare.mc.ax SNI and host header, and perhaps I will be directly connected to the backend. So I tried that, as well as the domain fronting version where you give the wrong SNI but the right host header. This did not work. However, to my surprise instead of getting a 404 response, I got a 421 Misdirected Redirect.

421 essentially means you asked for something that this server is not configured to give you, so you should re-check your DNS resolution to make sure you have the right IP address and try again. In HTTP/2, you are allowed to reuse a connection for other domains as long as it served a TLS certificate that would work for the other domain (This is called "Connection coalescing"). However, sometimes that back-fires especially with wildcard certs. Just because a server serves a TLS certificate for *.example.com, doesn't mean it knows how to literally handle everything under example.com since some subdomains might be served by a different server on a different IP. The new error code was created for such cases, to tell the browser it should stop with the connection coalescing, look up the DNS records for the domain name again, and open a separate connection. We needed a new code, because if the server just responded with a 404, the browser wouldn't know if its because the page just doesn't exist, or if its because the connection was inappropriately coalesced.

Looking back, I'm not sure I should have seen this as a sign. After all I was asking for a domain name that this server did not serve but had a correct certificate for, so this was the appropriate HTTP status code to give. However, the uniqueness of the error code and sudden change in behaviour around the domain name I was interested in, made me feel like I was on to something.

So I tried messing around with variations in headers and SNI. I tried silly things like having Host: flare.ac.mx/foo in the hopes that it would maybe confuse one layer, but another layer would strip it off, and get me the site i wanted or something like that.

Why settle for partial qualification?

Eventually I tried Host: flare.ac.mx. (note the extra dot at the end) with no SNI.

curl 'https://104.196.60.107' -vk --header 'host: flare.mc.ax.' --http1.1 --header 'CF-Connecting-IP: 127.0.0.1' --header 'X-Forwarded-For: 127.0.0.1' --header 'CF-ray: REDACTED' --header 'CF-IPCountry: CA' --header 'CF-Visitor: {"scheme":"https"}' --header 'CDN-Loop: cloudflare'

It worked.

Wait what?

What does a dot at the end of a domain name even mean?

In DNS, there is the concept of a partially qualified domain name (PQDN), and its opposite, the fully qualified domain name (FQDN). A partially qualified domain name is similar to a relative path - you can setup a default DNS search path in your DNS config (usually set by DHCP) that acts as a default suffix for partially qualified domain names. For example, if your default search path is example.com and you look up the host foo DNS will check foo.example.com.

I imagine this made more sense during the early internet, when it was more a "network of networks", and it was more common that you wanted to look up local resources on your local network.

In DNS, there is the root zone, which is represented by the empty string. This is the top of the DNS tree, which has TLDs like com or net, as its children.

If you add a period to the end of your DNS name, for example foo., this tells the DNS software that it isn't a partially qualified DNS name, but what you actually want, is to look up the foo domain in the root zone. So it does not lookup foo.example.com., but instead just foo..

For the most part, this is an obscure bit of DNS trivia. However, as far as web browsers are concerned, the PQDN example.net and FQDN example.net. are entirely separate domains. The same origin policy treats them differently, cookies set to one do not affect the other (TLS certificates seem to work for both though). In the past, people have used this trick to avoid advertisers on some websites.

So why did this work

So at this point, I solved the puzzle, obtained the flag and submitted it. Yay me!

But I still wasn't really sure what happened. I assumed there was some sort of misconfiguration involved, but I wasn't sure what. I did not have the configuration of the load-balancer. For that matter, I wasn't even sure yet which load balancing software was in use.

After I solved the problem, the competition organizers reached out and asked me what method I use. I imagine they wanted to see if I found the intended solution or stumbled upon something else. When I told them, they were very surprised. Apparently mutual TLS had been set up with cloudflare, and it should have been impossible to contact the backend if you did not have the correct client certificate, which I did not.

Wait what!?

The load balancing software in question was Traefik. In it, you can configure various requirements for different hosts. For example, you can say that a certain host needs a specific version of TLS, specific ciphers, specific server certificate or even a specific client certificate (mTLS). There is also a section for default options. In this case, they had one set of TLS settings for most of the domains, and a rule for the flare domain that you needed the correct client certificate to get access.

In normal operation, the SNI is checked and the appropriate requirements are applied. In the event that the SNI doesn't match the host header, and the host header matches a domain with different TLS requirements then the default requirements, a 421 status code is sent. This is all good.

However, if the host header has a period at the end to make it a FQDN, the code checking the TLS requirements doesn't think it matches the non-period version, so only the default requirements apply. However the rest of the code will still forward the request to the correct domain and process it as normal.

Thus, you can bypass any domain specific security requirements by not sending the SNI and adding an extra period to the host header.

This would be one thing for settings like min TLS version. However, it is an entirely different thing for access control settings such as mutual TLS as it allows an attacker to fully bypass the required client certificate, getting access to things they shouldn't be able to.

I reported this to the Traefik maintainers, and they fixed the issue in version 2.6.1. It was assigned CVE-2022-23632 and you can read more about it in their advisory. This was pretty exciting as well, as Traefik is used by quite a few people, and based on their github security advisory page, this is the first high-severity vulnerability that has been found in it.

What was the intended solution?

I found out later from the organizers of the CTF, that the intended solution was something very different. I definitely would never have came up with this myself.

The intended solution was to exploit two weird behaviours:

The python ipaddress library considers class E IP addresses (i.e. 240.0.0.0/4) to be "private". Class E addresses are reserved for future use. They are not globally routable, but they aren't private either, so it is odd that python considers them private. Python's own docs say that is_private is "True if the address is allocated for private networks" linking to IANA's official list, even though 240.0.0.0/4 is not listed as private use on that list.
Cloudflare has a feature where if your site does not support IPv6, you can enable "Pseudo IPv4", where the ipv6 connections will be forwarded as if they come from a class E IPv4 address. Cloudflare talks more about it in their blog post.

Which is a pretty fascinating combination.

Initially I discarded the possibility you could make Cloudflare give the wrong IP address, because I thought that would be such a major hack, that it wouldn't show up in this type of contest; people would either be reporting it or exploiting it, depending on the colour of their hat. However, my assumption was based on the idea that any sort of exploit would let you pick your IP. Being able to present as a random class E (which class E IP is based on an md5 hash of the top 64 bits of your IPv6 address, so you cannot choose it), is no where near as useful as being able to chose your IP (Everyone is just so trusting of 127.0.0.1). While this is a fascinating challenge, its hard to imagine a non-contrived situation where this would be a useful primitive. Making network access control in the real world that just blacklists all globally routable IPs instead of your own network seems silly. Even sillier would be to whitelist class E for some reason. Sure I guess an attacker could masquerade as one of these class E addresses to confuse anti-abuse systems, but if the site properly processes those connections, then it seems anti-abuse systems are likely to handle them just as easily as a normal IP. Since its still tied to your real IP, you can't hop between them unless you can hop between real IPs. If it ever really became an issue, Cloudflare lets you disable them, and there is also an additional header with the original IPv6 address you can use. At worst, maybe it makes reading logs after an incident more complicated, but this would be a really bad way to hide yourself in a world where VPNs are cheap and tor is free. In conclusion - its a fascinating behaviour, but practically speaking doesn't seem exploitable in the way that "make Cloudflare report your ip is something other than your ip" sounds like it would be exploitable at first glance.

Bawolff's rants

Followers

Blog Archive

About Me

Monday, July 18, 2022

Write up DiceCTF 2022: Flare and stumbling upon a Traefik 0-day (CVE-2022-23632)