Fuzzing for wolfSSL

Larry Stefonic of wolfSSL contacted me after he’d noticed my project for fuzzing cryptographic libraries called Cryptofuzz. We agreed that I would write a Cryptofuzz module for wolfSSL.

I activated the wolfSSL module for Cryptofuzz on Google’s OSS-Fuzz, where it has been running 24/7 since. So far, Cryptofuzz has found a total of 8 bugs in wolfCrypt.

Larry and Todd Ouska then asked me if I was interested in writing fuzzers for the broader wolfSSL library. I was commissioned for 80 hours of work.

I started by implementing harnesses for the TLS server and client. Both support five flavors of TLS: TLS 1.0, 1.1, 1.2, 1.3 and DTLS.

wolfSSL allows you to install your own IO handlers. Once these are in place, each time wolfSSL wants to either read or write some data over the network, these custom handlers are invoked instead of calling recv() and send() directly.

For fuzzing, this is ideal, because fuzzers are best suited to operate on data buffers rather than network sockets. Working with actual sockets in fuzzers is possible, but this tends to be slower and more complex than piping data in and out of the target directly using buffers.

Hence, by using wolfSSL’s IO callbacks, all actual network activity is sidestepped, and the fuzzers can interact directly with the wolfSSL code.
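To illustrate the idea, here is a simplified, self-contained sketch, not the actual harness: the real callbacks follow wolfSSL’s CallbackIORecv/CallbackIOSend signatures (which additionally take a WOLFSSL* argument) and are installed via functions like wolfSSL_CTX_SetIORecv(). The “network” is reduced to a pair of byte buffers:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

/* Sketch only: the "network" is just two byte buffers. */
struct FuzzConnection {
    std::vector<uint8_t> inbound;   /* bytes "sent" by the simulated peer */
    std::vector<uint8_t> outbound;  /* bytes the library wants to transmit */
};

/* Read callback: hand the library bytes from the fuzzer-controlled buffer. */
int FuzzRecv(char* buf, int sz, void* ctx) {
    auto* conn = static_cast<FuzzConnection*>(ctx);
    const int avail = std::min<int>(sz, static_cast<int>(conn->inbound.size()));
    if (avail == 0) {
        return -1; /* no data: signal an error/want-read condition */
    }
    memcpy(buf, conn->inbound.data(), avail);
    conn->inbound.erase(conn->inbound.begin(), conn->inbound.begin() + avail);
    return avail;
}

/* Write callback: capture outbound bytes instead of touching a socket. */
int FuzzSend(const char* buf, int sz, void* ctx) {
    auto* conn = static_cast<FuzzConnection*>(ctx);
    conn->outbound.insert(conn->outbound.end(), buf, buf + sz);
    return sz;
}
```

With callbacks of this shape installed, the fuzzer feeds its mutated input into `inbound` and inspects whatever the library emits into `outbound`.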

Emulating the network

In the write callback, I embedded some code that specifically checks the outbound data for uninitialized memory. By writing this data to /dev/null, it is evaluated by valgrind and MemorySanitizer, which will flag uninitialized bytes.

Furthermore, I ensured that my IO overloads mimic the behavior of a real network.

On a real network, a connection can be closed unexpectedly, either due to a transmission failure, a man-in-the-middle intervention or as a deliberate hangup by the peer.

It is interesting to explore the library’s behavior in the face of connection issues, as this can activate alternative code paths that normally are not traversed, so this strategy harbors the potential to find bugs that are missed otherwise.

For example, what if wolfSSL wants to read 50 bytes from a socket, but the remote peer sends only 20?

These are situations that are feasible if an attacker were to deliberately impose transfer throttling in their communication with an endpoint running wolfSSL.

Addition and subtraction have been shown to pose a challenge in programming, especially when they pertain to array sizes; many buffer overflows and infinite loops in software (not wolfSSL in particular) can be traced back to off-by-one calculations and integer overflows.

Networking software like wolfSSL needs to keep a tally of completed and pending transmissions, and in light of this it is a worthwhile experiment to observe what happens when it is faced with uncommon socket behavior.
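One way such throttling can be emulated, sketched here with hypothetical names (the real harness wires this logic into the wolfSSL receive callback), is to let one byte of fuzzer input decide how much of each read request is actually satisfied:

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

/* Hypothetical sketch: each read consumes one fuzzer-controlled decision
 * byte. A 0 simulates a hangup; otherwise at most 'decision' bytes are
 * delivered, which may be fewer than requested. */
struct ThrottledSource {
    std::vector<uint8_t> data;   /* bytes available "on the wire" */
    std::vector<uint8_t> fuzz;   /* fuzzer-controlled throttle decisions */
};

int ThrottledRead(ThrottledSource& src, char* buf, int requested) {
    if (src.fuzz.empty()) {
        return -1; /* out of decisions: treat as a connection error */
    }
    const uint8_t decision = src.fuzz.front();
    src.fuzz.erase(src.fuzz.begin());

    if (decision == 0) {
        return 0; /* simulate the peer closing the connection */
    }

    /* Deliver at most 'decision' bytes, never more than requested/available */
    int give = std::min<int>(requested, decision);
    give = std::min<int>(give, static_cast<int>(src.data.size()));
    memcpy(buf, src.data.data(), give);
    src.data.erase(src.data.begin(), src.data.begin() + give);
    return give;
}
```

This reproduces exactly the “wolfSSL wants 50 bytes, the peer sends only 20” situation, under the fuzzer’s control.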

Finding instances of resource exhaustion

Buffer overflows are not the only kind of bug software can suffer from.

For example, it would be unfortunate if an attacker could bring down a TLS server by sending a small, crafted packet.

Fuzzing can be helpful in finding denial of service bugs. Normally, fuzzers use code coverage as a feedback signal. By instead using the branch count or the peak memory usage as a signal, the fuzzer will tend to find slow inputs (many branches taken means a long execution time) or inputs that consume a lot of memory, respectively.

Several years ago I implemented some modifications to libFuzzer which allow me to easily implement fuzzers that find denial-of-service bugs. For my engagement with wolfSSL, I applied these techniques to each fuzzer that I wrote. I ended up providing three binaries per fuzzer:

  • a generic one that seeks to find memory bugs, using code coverage as a signal
  • one that tries to find slow inputs by using the branch count as a signal
  • one that finds inputs resulting in excessive heap allocation

Emulating allocation failures

Using wolfSSL_SetAllocators(), wolfSSL allows you to replace its default allocation functions. This opens up interesting possibilities for finding certain bugs.

One thing I did in my custom allocator was to return an invalid pointer for a malloc() or realloc() call requesting 0 bytes. This way, if wolfSSL tries to dereference this pointer, a segmentation fault occurs.

This special code is needed because even AddressSanitizer will not detect access to a 0-byte allocated region. It is nonetheless important to test for, as such behavior can lead to real crashes on systems like OpenBSD, which intentionally return an invalid pointer from malloc(0), just like my code does.
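A minimal sketch of such an allocator (the function names here are hypothetical; the real replacements are installed through wolfSSL_SetAllocators()):

```cpp
#include <cstdlib>

/* Sketch: return a deliberately invalid, but non-NULL, pointer for
 * zero-size requests, so that any dereference faults immediately. */
void* fuzz_malloc(size_t size) {
    if (size == 0) {
        /* Non-NULL so a caller's NULL check passes, but pointing at an
         * address that is normally unmapped, so a dereference crashes. */
        return reinterpret_cast<void*>(0x1);
    }
    return malloc(size);
}

void fuzz_free(void* ptr) {
    if (ptr == reinterpret_cast<void*>(0x1)) {
        return; /* our zero-size sentinel; nothing to free */
    }
    free(ptr);
}
```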

Another possibility opened up by implementing your own memory allocator is that it can be designed to fail sometimes.

On most desktop systems, malloc() always succeeds, but that may not be the case universally, especially not on resource-constrained systems which cannot resort to page swapping for acquiring additional memory.

Allocation failures activate code paths which are normally not accounted for by unit tests. I implemented this behavior for all fuzzers I wrote for wolfSSL.
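The failure-injection idea can be sketched as follows (hypothetical names; fuzzer-supplied bytes decide which allocations fail):

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

/* Hypothetical sketch: each allocation consumes one decision byte taken
 * from the fuzzer input; a 0 means "pretend we are out of memory". This
 * exercises error-handling paths that unit tests rarely reach. */
struct FailingAllocator {
    std::vector<uint8_t> decisions; /* taken from the fuzzer input */

    void* Malloc(size_t size) {
        if (!decisions.empty()) {
            const uint8_t d = decisions.front();
            decisions.erase(decisions.begin());
            if (d == 0) {
                return nullptr; /* simulated allocation failure */
            }
        }
        return malloc(size);
    }
};
```

Because the decisions come from the fuzzer input, the engine can learn which particular allocation has to fail in order to reach new code.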

In the TLS-specific code, 5 bugs were found.

Fuzzing auxiliary code

TLS is large and complex, and it can take fuzzers a while to traverse all its code paths, so in the interest of efficiency, I wrote several additional fuzzers specifically aimed at subsets of the library, like X509 certificate parsing (historically a wellspring of bugs across implementations), OCSP request and response handling (for which a subset of HTTP is implemented) and utility functions like base64 and base16 coders.

This approach found 9 additional bugs.

Testing the bignum library


wolfSSL comes with a bignum library that it uses for asymmetric cryptography. Because it is imperative that computations with bignums are sound, I took a project of mine called bignum-fuzzer (which has also found security bugs in other bignum libraries, like OpenSSL’s CVE-2019-1551) and appropriated it for use with wolfSSL. It is not only able to find memory bugs, but also incorrect calculation results.

I set out to test the following sub-libraries in wolfSSL:

  • Normal math
  • Single-precision math (--enable-sp)
  • Fastmath (--enable-fastmath)

5 instances of incorrect calculations were found. The other bugs involved invalid memory access and hangs.

wolfSSH

In addition to wolfSSL and wolfCrypt, I also spent some time looking at wolfSSH, which is the company’s SSH library offering.

In this component I uncovered 7 memory bugs, 1 memory leak and 1 crash bug.

Differential fuzzing of cryptographic libraries

Cryptofuzz

Cryptofuzz is a project that fuzzes cryptographic libraries and compares their output in order to find implementation discrepancies. It’s quite effective and has already found a lot of bugs.

Bugs in cryptographic libraries found with Cryptofuzz

It’s been running continually on Google’s OSS-Fuzz for a while and most of the recent bugs were found by their machines.

Not all of these are security vulnerabilities, but some can be, depending on the way the APIs are used by an application, the degree to which they allow and use untrusted input, and which output they store or send.

If there had been any previous white-hat testing or fuzzing effort of the same scope and depth, these bugs would have come to light sooner, so it’s clear this project is filling a gap.

Another automated cryptography testing suite, Project Wycheproof by Google, takes a directed approach, with tailored tests mindful of cryptographic theory and historic weaknesses. Cryptofuzz is more opportunistic and generic, but more thorough in terms of raw code coverage.

Currently supported libraries are: OpenSSL, LibreSSL, BoringSSL, Crypto++, cppcrypto, some Bitcoin and Monero cryptographic code, Veracrypt cryptographic code, libgcrypt, libsodium, the Whirlpool reference implementation and small portions of Boost.

This is a modular system and the inclusion of any library is optional. Additionally, no library features are mandatory. Cryptofuzz works with whatever is available.

What it does

  • Detect memory, hang and crash bugs. Many cryptographic libraries are written in C, C++ and assembly language, which makes them susceptible to memory bugs like buffer overflows and using uninitialized memory. With the aid of sanitizers, many of these bugs become apparent. Language-agnostic programming errors like large or infinite loops and assertion failures can be detected as well. For example: Memory corruption after EVP_CIPHER_CTX_copy() with AES-GCM in BoringSSL.
  • Internal consistency testing. Libraries often provide multiple methods for performing a specific task. Cryptofuzz asserts that the end result is always the same irrespective of the computation method. This is a variant of differential testing. A result is not checked against another library, but asserted to be equivalent across multiple methods within the same library. For example: CHACHA20_POLY1305 different results for chunked/non-chunked updating in OpenSSL.
  • Multi-library differential testing. Given multiple distinct implementations of the same cryptographic primitive, and assuming that at least one is fully compliant with the canonical specification, deviations will be detected. For example: Wrong Streebog result in LibreSSL.

What it doesn’t do

It does not detect timing or other side channels, misuse of the cryptographic API and misuse of cryptography. It will also not detect bugs involving very large inputs (eg. gigabytes). Asymmetric encryption is not yet supported.

How it works

Input splitting

The fuzzing engine (libFuzzer) provides a single data buffer to work with. This is fine for simple fuzzers, but Cryptofuzz needs to extract many variables, like operation ID, operation parameters (cleartext, key, etc), module ID’s and more.

For input splitting I often use my own C++ class that allows me to extract any basic type (int, bool, float, …) or more complex types easily.

The class can be instantiated with the buffer provided by the fuzzing engine.

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    Datasource ds(data, size);

You can then easily extract anything you want:

auto a = ds.Get<uint64_t>();
auto b = ds.Get<bool>();
auto c = ds.Get<char>();

And so forth. To extract a variable number of items:

    std::vector<uint8_t> v;
    do {
        v.push_back(ds.Get<uint8_t>());
    } while ( ds.Get<bool>() == true );

Internally, the library performs length-encoded deserialization from the raw buffer into these types. As soon as it is out of data, it throws an exception, which you catch.
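A stripped-down sketch of this mechanism (the real Datasource class is more elaborate, supporting variable-size buffers among other things, but the core idea is the same):

```cpp
#include <cstdint>
#include <cstring>
#include <exception>
#include <type_traits>

/* Simplified sketch of the input splitter: pull fixed-size values off the
 * front of the raw fuzzer buffer, and throw once it runs dry. */
class Datasource {
    const uint8_t* data;
    size_t left;
public:
    struct OutOfData : std::exception {};

    Datasource(const uint8_t* data, size_t size) : data(data), left(size) {}

    template <class T>
    T Get(void) {
        static_assert(std::is_trivially_copyable<T>::value, "POD types only");
        if (left < sizeof(T)) {
            throw OutOfData();
        }
        T ret;
        memcpy(&ret, data, sizeof(T));
        data += sizeof(T);
        left  -= sizeof(T);
        return ret;
    }
};
```

The exception doubles as a natural termination condition for loops like the variable-length extraction shown above.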

For more examples of this technique, see my recent fuzzers for LAME, PIEX and GFWX.

The idea of using multiple input data streams rather than just one, with each input type stored in its own corpus, is also prominently featured in my own fuzzer, which I hope to release as soon as I find the time for it.

Modules

The terms ‘module’ and ‘cryptographic library’ can be used interchangeably, for the most part; a module is a C++ class that is an interface for Cryptofuzz to pass data into the library code, and consume its output.

Operations

A fixed set of operations is supported. These currently include hashing (message digest), HMAC, CMAC, symmetric encryption, symmetric decryption and several KDF algorithms.

Each supported operation corresponds to a virtual function in the base class Module. A module can support an operation by overriding this function. If a module does not implement an operation, this is not a problem; Cryptofuzz is opportunistic, imposes few implementation constraints and only demands that if a result is produced, it is the correct result.

Cryptofuzz makes extensive use of std::optional. Each operation is implemented as a function that returns an optional value. This means that if a module cannot comply with a certain request, for instance when the request is “compute the SHA256 hash of the text loremipsum”, but SHA256 is not supported by the underlying library, the module can return std::nullopt.

Modifiers

A library sometimes offers multiple ways to achieve a task. Some examples:

  • libgcrypt optionally allows the use of ‘secure memory’ for each operation
  • OpenSSL offers encryption and decryption operation through the BIO and EVP interfaces
  • BoringSSL and LibreSSL can perform authenticated encryption through both the EVP interface as well as the EVP_AEAD interface
  • Often, libraries optionally allow in-place encryption, where cleartext and subsequent ciphertext reside at the same memory location (and the other way around for decryption)

Modifiers are buffers of data, extracted from the original input to the harness, and are passed with each operation. They allow the code to diversify internal operation handling. This is important in the interest of maximizing code coverage and increasing the odds of finding corner cases that either crash or produce an incorrect result.

By leveraging input splitting, modifiers can be used for choosing different code paths at runtime.

A practical example. Recent OpenSSL and LibreSSL can compute a HMAC with the EVP interface, or with the HMAC interface, and BoringSSL only provides the HMAC interface. This is how I branch based on the modifier and the underlying library:

std::optional<component::MAC> OpenSSL::OpHMAC(operation::HMAC& op) {
    Datasource ds(op.modifier.GetPtr(), op.modifier.GetSize());

    bool useEVP = true;
    try {
        useEVP = ds.Get<bool>();
    } catch ( fuzzing::datasource::Datasource::OutOfData ) {
    }

    if ( useEVP == true ) {
#if !defined(CRYPTOFUZZ_BORINGSSL)
        return OpHMAC_EVP(op, ds);
#else
        return OpHMAC_HMAC(op, ds);
#endif
    } else {
#if !defined(CRYPTOFUZZ_OPENSSL_102)
        return OpHMAC_HMAC(op, ds);
#else
        return OpHMAC_EVP(op, ds);
#endif
    }
}

Another example. A lot of the cryptographic code in OpenSSL, LibreSSL and BoringSSL is architected around so-called contexts, which are variables (structs) holding parameters relevant to a specific operation. For example, if you want to perform encryption using the EVP interface, you’re going to have to initialize an EVP_CIPHER_CTX variable and pass it to functions like EVP_EncryptUpdate and EVP_EncryptFinal_ex.

For each type of CTX, the libraries provide copy functions. For EVP_CIPHER_CTX this is EVP_CIPHER_CTX_copy. It does what it says it does: copy over the internal parameters from one place to another, such that the copied instance is semantically identical to the source instance. But in spite of seeming trivial, this is not just an alias for memcpy; there are different handlers for different ciphers and ciphermodes and copying may require deep-copying substructures.

I created a class that abstracts creating, copying and freeing contexts. Using C++ templates and method overloading I was able to easily generate a tailored class for each type of context (EVP_MD_CTX, EVP_CIPHER_CTX, HMAC_CTX and CMAC_CTX). The class furthermore provides a GetPtr() method with which you can access a pointer to the context. (Don’t stop reading here — it will become clear 😉 ).

template <class T>
class CTX_Copier {
    private:
        T* ctx = nullptr;
        Datasource& ds;

        T* newCTX(void) const;
        int copyCTX(T* dest, T* src) const;
        void freeCTX(T* ctx) const;

        T* copy(void) {
            bool doCopyCTX = true;
            try {
                doCopyCTX = ds.Get<bool>();
            } catch ( fuzzing::datasource::Datasource::OutOfData ) { }

            if ( doCopyCTX == true ) {
                T* tmpCtx = newCTX();
                if ( tmpCtx != nullptr ) {
                    if ( copyCTX(tmpCtx, ctx) == 1 ) {
                        /* Copy succeeded, free the old ctx */
                        freeCTX(ctx);

                        /* Use the copied ctx */
                        ctx = tmpCtx;
                    } else {
                        freeCTX(tmpCtx);
                    }
                }
            }

            return ctx;
        }

    public:
        CTX_Copier(Datasource& ds) :
            ds(ds) {
            ctx = newCTX();
            if ( ctx == nullptr ) {
                abort();
            }
        }

        T* GetPtr(void) {
            return copy();
        }

        ~CTX_Copier() {
            freeCTX(ctx);
        }
};

I instantiate the class with a reference to a Datasource, which is my input splitter. Each time I need to pass a pointer to the context to an OpenSSL function, I call GetPtr(). This extracts a bool from the input splitter, and decides whether or not to perform a copy operation.

Here’s my message digest code for the OpenSSL module:

std::optional<component::Digest> OpenSSL::OpDigest(operation::Digest& op) {
    std::optional<component::Digest> ret = std::nullopt;
    Datasource ds(op.modifier.GetPtr(), op.modifier.GetSize());

    util::Multipart parts;

    CF_EVP_MD_CTX ctx(ds);
    const EVP_MD* md = nullptr;

    /* Initialize */
    {
        parts = util::ToParts(ds, op.cleartext);
        CF_CHECK_NE(md = toEVPMD(op.digestType), nullptr);
        CF_CHECK_EQ(EVP_DigestInit_ex(ctx.GetPtr(), md, nullptr), 1);
    }

    /* Process */
    for (const auto& part : parts) {
        CF_CHECK_EQ(EVP_DigestUpdate(ctx.GetPtr(), part.first, part.second), 1);
    }

    /* Finalize */
    {
        unsigned int len = -1;
        unsigned char md[EVP_MAX_MD_SIZE];
        CF_CHECK_EQ(EVP_DigestFinal_ex(ctx.GetPtr(), md, &len), 1);

        ret = component::Digest(md, len);
    }

end:
    return ret;
}

I instantiate CF_EVP_MD_CTX as ctx once, passing a reference to ds, near the top of the function.

Then, each time I need access to the ctx, I call ctx.GetPtr(): once as a parameter to EVP_DigestInit_ex(), once to EVP_DigestUpdate() and once to EVP_DigestFinal_ex().

Each time ctx.GetPtr() is called, the actual context variable EVP_MD_CTX may or may not be copied. Whether it is copied or not depends on the modifier. The contents of the modifier is ultimately determined by the mutations performed by the fuzzing engine.

This approach is more versatile than copying the context only once, because it will catch bugs that depend on context copying at a specific place, or as part of a specific sequence, whereas copying it just once covers a potentially narrower state space.

Input partitioning

It is common for cryptographic libraries to allow the input to be passed in steps. If you want to compute a message digest of the text “loremipsum”, there is a large number of distinct hashing sequences you could perform. For example:

  • Pass lorem, then ipsum
  • Pass lor, then em, then ipsum
  • Pass lor, then em, then ip, then sum.

And so forth. To test whether an input always produces the same output regardless of how it’s partitioned, all current Cryptofuzz modules will attempt to pass input in chunks where it is supported.

Helper functions are provided to pseudo-randomly partition an input based on the modifier, and implementing this is often as easy as:

Datasource ds(op.modifier.GetPtr(), op.modifier.GetSize());
util::Multipart parts = util::ToParts(ds, op.cleartext);
for (const auto& part : parts) {
    CF_CHECK_EQ(HashUpdate(ctx, part.first, part.second), true);
}

Operation sequences

The same operation can be executed multiple times. This is even necessary for differential testing; to discover differences between libraries, the same operation must be executed by each library before their results can be compared.

Each time the same operation is run, a new modifier for it is generated.

So we end up with a sequence of identical operations but each run by a certain module, and each with a new modifier.

Fuzzing multiple cryptographic libraries with Cryptofuzz

Once the batch of operations has been run, Cryptofuzz filters out all empty results. In the remaining set, each result must be identical. If it is not, then Cryptofuzz prints the operation parameters (cleartext, cipher, key, IV, and so on) and the result for each module, and calls abort().
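In simplified form (hypothetical types; the real code also prints the operation parameters and each module’s result before aborting), the comparison step looks like this:

```cpp
#include <cstdint>
#include <cstdlib>
#include <optional>
#include <vector>

/* Sketch of the comparison step: skip empty results, then demand that
 * every remaining module computed the same bytes. A mismatch is turned
 * into a crash so the fuzzing engine records the offending input. */
using Result = std::optional<std::vector<uint8_t>>;

void CompareResults(const std::vector<Result>& results) {
    const std::vector<uint8_t>* first = nullptr;
    for (const auto& r : results) {
        if (r == std::nullopt) {
            continue; /* module did not support the operation */
        }
        if (first == nullptr) {
            first = &*r; /* first non-empty result is the reference */
        } else if (*r != *first) {
            abort(); /* implementation discrepancy found */
        }
    }
}
```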

Multi-module support

Modules, along with the cryptographic library that they implement, are linked statically into the fuzzer binary.

More libraries do not necessarily make it slower. Each run, Cryptofuzz picks a set of random libraries to run an operation on. Adding more libraries does not cause an increase in total operations; only a few operations will be executed each run, regardless of the total number of modules available.

Additional tests

Each time an encryption operation succeeds, Cryptofuzz will attempt to decrypt the result using the same module. If the decryption operation either fails, or the decryption output is not equivalent to the original cleartext, a crash will be enforced.
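The round-trip check can be sketched with a toy XOR “cipher” standing in for a real one (purely illustrative; Cryptofuzz runs this check against each module’s actual encrypt and decrypt implementations):

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

/* Toy stand-in cipher: XOR with a single key byte (its own inverse). */
std::vector<uint8_t> XorCrypt(const std::vector<uint8_t>& in, uint8_t key) {
    std::vector<uint8_t> out(in);
    for (auto& b : out) b ^= key;
    return out;
}

/* After a successful encryption, run the matching decryption and crash
 * unless it restores the original cleartext. */
void EncryptThenVerify(const std::vector<uint8_t>& cleartext, uint8_t key) {
    const auto ciphertext = XorCrypt(cleartext, key);
    const auto decrypted  = XorCrypt(ciphertext, key);
    if (decrypted != cleartext) {
        abort(); /* decryption failed to round-trip: report as a bug */
    }
}
```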

Cryptofuzz also allows “static” tests that operate on a single result. Here is a postprocessing hook that gets called after a SymmetricEncrypt operation, and detects IV sizes larger than 12 bytes in encrypt operations with ChaCha20-Poly1305, which is a violation of the specification (see also OpenSSL CVE-2019-1543 which was not found by me but upon which I based this test).

static void test_ChaCha20_Poly1305_IV(const operation::SymmetricEncrypt& op, const std::optional<component::Ciphertext>& result) {
    using fuzzing::datasource::ID;

    /*
     * OpenSSL CVE-2019-1543
     * https://www.openssl.org/news/secadv/20190306.txt
     */

    if ( op.cipher.cipherType.Get() != ID("Cryptofuzz/Cipher/CHACHA20_POLY1305") ) {
        return;
    }

    if ( result == std::nullopt ) {
        return;
    }

    if ( op.cipher.iv.GetSize() > 12 ) {
        abort();
    }
}

void test(const operation::SymmetricEncrypt& op, const std::optional<component::Ciphertext>& result) {
    test_ChaCha20_Poly1305_IV(op, result);
}

Funding

I’ve been working on this project full-time for several months. I suppose I could seize this opportunity to say that I do this to save the planet from the apocalypse due to a signed integer overflow in OpenSSL, but I’m really just addicted to staring at fuzzer statistics ;).

It’s a lot of fun, but also a lot of work.

I’m especially interested in exploring the fringes of an API’s legal use, and this requires a close reading of the documentation, getting my implementation exactly right and if I get a crash or odd result, working out whether this is due to a fault of my own or not.

Perhaps surprisingly, writing good bug reports is a lot of work. Ideally I want to present readily compilable proof of concept code to the library maintainers so that they won’t have to get bogged down in my fuzzer’s technical details in order to understand a bug in their own code.

I’m proud of the project as it stands, and I’d love to expand it to support more very widely used libraries like Go and Java’s cryptographic code. Considering what I’ve seen so far, I’d be surprised to not find more bugs in popular cryptographic software.

Google has rewarded me $1000 for initial integration of Cryptofuzz into OSS-Fuzz, for which I’m grateful. This is all the income this project has generated so far, and I’m not complaining because it was a hobby project from the outset, but going forward I will have to forgo Cryptofuzz enhancements in pursuit of more profitable ventures, for the simple reason that I need an income and can’t spend all my time on hobby projects forever.

I submitted a grant proposal to the Core Infrastructure Initiative about one month ago. This project seems to align largely with their objectives. Unfortunately, I have not yet received a response.

If you rely on some cryptographic library (Go? Java? Rust? Javascript?) and would like to fund its integration into Cryptofuzz, please get in touch.

Edit 26 September 2019: In addition to the $1000 initial integration reward, I’ve received an additional $5000 from Google for ideal integration. A private person has also donated two $100 Amazon gift cards, which I’ve partially spent on purchasing a Raspberry Pi 4 for testing cryptographic libraries on ARM. The Core Infrastructure Initiative never responded.

Edit 24 October 2019: 35 bugs found. Someone donated $1,000 (thanks again) and I’ve submitted a grant proposal to RIPE NCC’s Community Projects Fund. I’ve been working on elliptic curve cryptography support in a separate branch and will merge this soon.

Full disclosure: libsrtp multiple vulnerabilities

I wrote a fuzzer for libsrtp for purely recreational reasons. I reported the bugs I found to the libsrtp security mailing list several months ago, and those bugs finally seem to have been fixed in the git master tree. Apparently these findings and their fixes don’t prompt a new release; Cisco has stopped responding and I don’t know what the deal is. So I’ve decided to publish my fuzzers. I put considerable effort into them, but I’m now tired of this project because nobody really seems to care, and I am abandoning it. An underwhelming experience; the exception to the rule.

EDIT 23/03: Removed invalid information about Talos. My bad.

Security audit of SoftEther VPN finds 11 security vulnerabilities

A security audit of the widely used SoftEther VPN open source VPN client and server software [1] has uncovered 11 remote security vulnerabilities. The audit has been commissioned by the Max Planck Institute for Molecular Genetics [2] and performed by Guido Vranken [3]. The issues found range from denial-of-service resulting from memory leaks to memory corruption.

The 80 hour security audit has relied extensively on the use of fuzzers [4], an approach that has proven its worth earlier with the discovery of several remote vulnerabilities in OpenVPN in June of 2017 [5]. The modifications made to the SoftEther VPN source code to make it suitable for fuzzing and original code written for this project are open source [6]. The work will be made available to Google’s OSS-Fuzz initiative [7] for continued protection of SoftEther VPN against security vulnerabilities. An updated version of SoftEther VPN that resolves all discovered security vulnerabilities is available for download immediately [8].

[1] https://www.softether.org/
[2] https://www.molgen.mpg.de/2168/en
[3] https://guidovranken.wordpress.com/
[4] https://en.wikipedia.org/wiki/Fuzzing
[5] https://guidovranken.wordpress.com/2017/06/21/the-openvpn-post-audit-bug-bonanza/
[6] https://github.com/guidovranken/SoftEtherVpn-Fuzz-Audit
[7] https://github.com/google/oss-fuzz/blob/master/README.md
[8] http://www.softether.org/5-download/history

Thank you very much OSTIF

In May I started building fuzzers for OpenVPN because I liked engaging in the challenge of finding more vulnerabilities after two fresh audits. I never intended or expected to receive money for this. In addition to the money donated by people and companies to my Bitcoin address (thank you very much again), OSTIF reached out to me and offered to reward me with a bounty of $5000 for the vulnerabilities and for completing the fuzzers. Thank you so much!

The fuzzers can be found here: https://github.com/guidovranken/openvpn/tree/fuzzing There are still some small portions of code that remain un-fuzzed. I am very busy with contracting work, so I won’t be working on it anytime soon. You are welcome to extend it and find more vulnerabilities, and you might be eligible for bounties yourself.

OSTIF is currently running a fundraiser to get OpenSSL 1.1.1 audited. Check it out and spread the word.

One more OpenVPN vulnerability (CVE-2017-12166)

This concerns a remote buffer overflow vulnerability in OpenVPN. It has been fixed in OpenVPN 2.4.4 and 2.3.18. It is suspected that only a small number of users are vulnerable to this issue, because it requires having explicitly enabled the outdated ‘key method 1’.

The advisory can be found here: https://community.openvpn.net/openvpn/wiki/CVE-2017-12166

If you appreciate my discovery, you may donate some BTC to address 1BnLyXN2QwdMZLZTNqKqY48bU4hN2A3MwZ

In ssl.c, key_method_1_read() calls read_key(), which doesn’t perform adequate bounds checks. cipher_length and hmac_length are specified by the peer:

uint8_t cipher_length;
uint8_t hmac_length;

CLEAR(*key);
if (!buf_read(buf, &cipher_length, 1))
{
    goto read_err;
}
if (!buf_read(buf, &hmac_length, 1))
{
    goto read_err;
}

And this many bytes of data are then read into key->cipher and key->hmac:

if (!buf_read(buf, key->cipher, cipher_length))
{
    goto read_err;
}
if (!buf_read(buf, key->hmac, hmac_length))
{
    goto read_err;
}

In other words, it’s a classic example of lack of a bounds check resulting in a buffer overflow.
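A sketch of the kind of check that prevents this bug class (illustrative only, not the actual OpenVPN patch; read_key_field is a hypothetical name): before copying a peer-specified number of bytes into a fixed-size field, verify that the length fits both the destination and the remaining packet data.

```cpp
#include <cstdint>
#include <cstring>

/* Hypothetical bounds-checked read: reject peer-controlled lengths that
 * exceed either the destination buffer or the available packet bytes. */
bool read_key_field(uint8_t* dst, size_t dst_size,
                    const uint8_t* buf, size_t buf_len, size_t field_len) {
    if (field_len > dst_size) {
        return false; /* peer-controlled length exceeds the destination */
    }
    if (field_len > buf_len) {
        return false; /* not enough data left in the packet */
    }
    memcpy(dst, buf, field_len);
    return true;
}
```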

Bitcoin fuzzers

I got some requests to fuzz Bitcoin, so I did. They can be found here:

https://github.com/guidovranken/bitcoin/tree/fuzzing/fuzzers

I expect them to be merged into the main project soon.

So far only one issue has been found: https://github.com/bitcoin/bitcoin/pull/11081 . This code is currently unused and does not pose a security risk (forks of Bitcoin may want to check whether they are using it).

Judging by the number of issues found (1) after extensive fuzzing, the Bitcoin code appears to be exceptionally well-written. Which is also exceptionally good news, because this code is not only used by Bitcoin but also by many, many altcoins, and thus guards billions and billions of dollars.

I’m actively working on expanding the fuzzers and their code coverage (as much as time permits).

Tip jar: 1BnLyXN2QwdMZLZTNqKqY48bU4hN2A3MwZ

In other news, I have a new OpenVPN vulnerability coming up that’s the worst yet in terms of severity but only affects a small number of users. To be announced.


11 remote vulnerabilities (inc. 2x RCE) in FreeRADIUS packet parsers

“FreeRADIUS is the most widely deployed RADIUS server in the world. It is the basis for multiple commercial offerings. It supplies the AAA needs of many Fortune-500 companies and Tier 1 ISPs.” (http://freeradius.org)

FreeRADIUS asked me to fuzz their DHCP and RADIUS packet parsers in version 3.0.x (stable branch) and version 2.2.x (EOL, but receives security updates). 11 distinct issues that can be triggered remotely were found.

The following is excerpted from freeradius.org/security/fuzzer-2017.html which I advise you to consult for more detailed descriptions of the issues at hand.

There are about as many issues disclosed in this page as in the previous ten years combined.

v2, v3: CVE-2017-10978. No remote code execution is possible. A denial of service is possible.
v2: CVE-2017-10979. Remote code execution is possible. A denial of service is possible.
v2: CVE-2017-10980. No remote code execution is possible. A denial of service is possible.
v2: CVE-2017-10981. No remote code execution is possible. A denial of service is possible.
v2: CVE-2017-10982. No remote code execution is possible. A denial of service is possible.
v2, v3: CVE-2017-10983. No remote code execution is possible. A denial of service is possible.
v3: CVE-2017-10984. Remote code execution is possible. A denial of service is possible.
v3: CVE-2017-10985. No remote code execution is possible. A denial of service is possible.
v3: CVE-2017-10986. No remote code execution is possible. A denial of service is possible.
v3: CVE-2017-10987. No remote code execution is possible. A denial of service is possible.
v3: CVE-2017-10988. No remote code execution is possible. No denial of service is possible. Exploitation does not cross a privilege boundary in a correct and realistic product deployment.

Contact me if

  • you are a vendor of an (open source) C/C++ application and want to eliminate security issues in your product
  • you or your company relies on an (open source) C/C++ application and you want to ensure that it is secure to use
  • you’d like to organize a crowdfunding campaign to eliminate security issues in an open source C/C++ application for the benefit of all who rely on it
  • for any other reason

I almost always find security issues.

guidovranken at gmail com

libFuzzer-gv: new techniques for dramatically faster fuzzing

It’s not how long you let it run, it’s how you wiggle your fuzzer

Sun Tzu

I spent some time hacking libFuzzer and pondering its techniques. I’ve come up with some additions that I expect will dramatically speed up finding certain edge cases.

First of all, a huge vote of appreciation to Michał Zalewski and the people behind libFuzzer and the various sanitizers for their work. The remarkable ease with which fuzzers can be attached to arbitrary software to find world-class bugs that affect millions is at least as commendable as the technical underpinnings. The shoulders of giants.

You can find my fuzzer here: https://github.com/guidovranken/libfuzzer-gv

Remember that these features are very experimental. Developers of libFuzzer and other fuzzers are encouraged to merge them into their work if they like them.

Code coverage is just one way to guide the fuzzer

Code coverage is the chief metric that a fuzzer like libFuzzer uses to increase the likelihood of finding a code path that results in an error. But the exact course of code execution is determined by many more factors, which code coverage metrics alone do not account for. So I’ve implemented a number of additional program state signals that help reach faulty code quickly. Without these, certain bugs would be uncovered only after a very long time of fuzzing.

Stack-depth-guided fuzzing

void recur(size_t depth, size_t maxdepth)
{
    if (depth >= maxdepth) {
        return;
    }

    recur(depth + 1, maxdepth);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    size_t i, maxdepth = 0;

    for (i = 0; i < size; i++) {
        if (i % 3 == 0 && data[i] == 0xAA) {
            maxdepth += 1;
        }
    }

    maxdepth *= 400;
    recur(0, maxdepth);
    return 0;
}

Given enough 0xAA’s in the input, the program will crash due to a stack overflow (recursing too deeply). With -stack_depth_guided=1 -use_value_profile=1 it usually takes about 0.5–5 seconds to crash on my system.

With just -use_value_profile=1 (and ASAN_OPTIONS=coverage=1:coverage_counters=1), it takes about 5–10 minutes. I think this is pure chance, though; I’ve done runs where it was still busy after an hour.

static void getStackDepth(void) {
  size_t p;
  asm("movq %%rsp,%0" : "=r"(p)); /* read the current stack pointer */
  p = 0x8000000000000000 - p;     /* the stack grows downward, so this value grows with depth */
  if (p > fuzzer::stackDepthRecord) {
      fuzzer::stackDepthRecord = p;
      if (fuzzer::stackDepthBase == 0) {
          fuzzer::stackDepthBase = p;
      }
  }
}

(yes, this specific implementation works only on x86-64. If this doesn’t work for you, comment it out or change it to suit your architecture.)

If you need a fuzzer input that exceeds a certain stack depth as a file, you can lower the stack size with ulimit -s before running the fuzzer. It will crash, and libFuzzer will write the offending input to disk.

Crashes due to excessive recursion are, I think, an under-appreciated class of vulnerabilities. For server applications, it matters a lot if an untrusted client can trigger a stack overflow on the server. These vulnerabilities are relatively rare, but I did manage to find a remote, unauthenticated crasher in high-profile software (Apache httpd CVE-2015-0228).

A lot of applications that parse context-free grammars, such as

  • Programming languages (an expression can contain an expression can contain an expression..)
  • Serialization formats (JSON: an array can contain an array can contain an array ..)

are in theory susceptible to this.
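To make the vulnerable pattern concrete, here is a minimal hypothetical sketch (not taken from any real project): a recursive parser for nested arrays with no depth limit. Every '[' consumes a stack frame, so a sufficiently deep input crashes the process.

```cpp
#include <cstddef>
#include <string>

// Recursive parser for inputs like "[[[]]]". Each '[' adds a stack frame;
// nothing bounds the recursion, so a deep enough input overflows the stack.
// Returns the maximum nesting depth.
static size_t ParseArray(const std::string &s, size_t &pos) {
    size_t depth = 1;
    ++pos; // consume '['
    while (pos < s.size() && s[pos] == '[') {
        const size_t inner = ParseArray(s, pos); // unbounded recursion
        if (inner + 1 > depth) {
            depth = inner + 1;
        }
    }
    if (pos < s.size() && s[pos] == ']') {
        ++pos; // consume ']'
    }
    return depth;
}
```

The usual fix is a depth limit (or an iterative parser); stack-depth guidance is what steers the fuzzer toward deeply nested inputs in the first place.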

PS: you can use my tool to find call graph loops in binaries.

Intensity-guided fuzzing

This feature quantifies the number of instrumented locations that are hit in a single run: the aggregate of non-unique locations accessed.

So if a single iteration of a certain for loop causes the coverage callback to be called 5 times, 5 iterations of the same loop result in an aggregate value of 5*5=25.

Great for finding slow inputs.
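As a toy illustration of how the aggregate behaves (my own mock-up, not libFuzzer-gv’s actual instrumentation), model the coverage callback as a counter and give the loop body 5 instrumented locations:

```cpp
#include <cstddef>
#include <cstdint>

// Toy model of the intensity metric: every instrumented location bumps a
// global counter; the per-run aggregate is the guidance signal.
static size_t intensity = 0;
static void hit() { ++intensity; } // stands in for the coverage callback

// The number of 0xAA bytes in the input controls the iteration count.
static size_t run(const uint8_t *data, size_t size) {
    intensity = 0;
    size_t iterations = 0;
    for (size_t i = 0; i < size; i++) {
        if (data[i] == 0xAA) iterations++;
    }
    // Loop body with 5 instrumented locations, as in the example above.
    for (size_t n = 0; n < iterations; n++) {
        hit(); hit(); hit(); hit(); hit();
    }
    return intensity;
}
```

One iteration yields an aggregate of 5, five iterations yield 25; inputs that drive the counter up are favored, which is exactly what surfaces slow inputs.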

Allocation-guided fuzzing

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    size_t i, alloc = 0;
    void* p;

    for (i = 0; i < size; i++) {
        if (i % 3 == 0 && data[i] == 0xAA) {
            alloc += 1;
        }
    }

    if (alloc >= 1350)
        alloc = -1; /* size_t: wraps to SIZE_MAX */
    p = malloc(alloc);
    free(p);
    return 0;
}

Given enough 0xAA’s in the input, the program will attempt an allocation of -1 bytes (SIZE_MAX, since malloc() takes a size_t). AddressSanitizer does not tolerate this and will crash.

With -alloc_guided=1 -use_value_profile=1, it usually takes 10–25 seconds on my system until it crashes (which is what we want).

With just -use_value_profile=1 (and ASAN_OPTIONS=coverage=1:coverage_counters=1), it was still running after more than an hour. It has very little to go on, and it cannot figure out the logic.

I expect this feature will help find certain threshold-constrained issues. For instance, an application runs fine as long as fewer than 8192 elements of something are involved; beyond that threshold, it resorts to different, erroneous logic (maybe an incorrect use of realloc()). This feature guides the fuzzer toward that pivot.

Aside from finding crashes, this feature is great at providing insight into the peak memory usage of an application, and it automatically finds the worst-case input in terms of heap usage (because fuzzing is guided by the malloc()s). If you can discover an input that makes a server application reserve 50MB of memory while the average memory usage for normal requests is 100KB, that is not a vulnerability in the traditional sense (although it may be a very cheap DoS opportunity), but it might make you consider refactoring some code.

Custom-guided fuzzing

libFuzzer expects LLVMFuzzerTestOneInput to return 0 and halts if it returns anything else. The return value isn’t used for anything beyond that at the moment, so I thought I’d put it to good use. Use -custom_guided=1.

You can now connect libFuzzer to literally anything. I’m experimenting with connecting to a remote server in LLVMFuzzerTestOneInput, hashing what the server returns, and returning the number of unique hashes produced so far. So I am in fact fuzzing a remote, uninstrumented application.
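A sketch of what such a harness could look like (hypothetical; query_target stands in for the real network round-trip):

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <string>

// Stand-in for querying the remote, uninstrumented target. In a real setup
// this would send the input over a socket and read back the response; here,
// as a toy, the "target" echoes the count of 0xAA bytes it received.
static std::string query_target(const uint8_t *data, size_t size) {
    size_t n = 0;
    for (size_t i = 0; i < size; i++) {
        if (data[i] == 0xAA) n++;
    }
    return std::to_string(n);
}

static std::set<size_t> unique_responses;

// With -custom_guided=1, the return value acts as the guidance signal:
// inputs that raise it (i.e. provoke a never-before-seen response) are
// considered interesting and kept in the corpus.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    const std::string response = query_target(data, size);
    unique_responses.insert(std::hash<std::string>{}(response));
    return (int)unique_responses.size();
}
```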

Disable coverage-guided fuzzing

Use -no_coverage_guided=1 to disable coverage-guided fuzzing. This is useful if you want to rely purely on, say, allocation guidance.

Techniques tried and discarded

Favoring efficient mutators

I’ve tried keeping a histogram of mutator efficacy: each time a certain mutator (like EraseBytes, InsertBytes, …) was responsible for an increase in code coverage, I incremented its histogram value. Then, when the mutator for the next iteration had to be selected, I favored the most efficient mutator (less efficient mutators could still be chosen, just with a smaller likelihood).

Upon class construction I created a lin-log look-up table. For 5 mutators, it looks like this:

LUT = [0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4]

Every iteration, I sorted the histogram and saved the order of the indices. So if the histogram looks like this:

Mutator 0: 100 hits
Mutator 1: 1000 hits
Mutator 2: 500 hits
Mutator 3: 1200 hits
Mutator 4: 10 hits

The sequence of indices, sorted ascending by hit count, is then:

LUT2 = [4, 0, 2, 1, 3]

To choose a new mutator:

curMutator = LUT2[ LUT[ rand() % numMutators ] ]

So mutator 3 is now strongly favored (chance of 1 in 3), but there is still a 1 in 15 chance that mutator 4 gets chosen.
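Putting the two tables together, the selection logic could be sketched like this (my reconstruction of the description above, not the exact libFuzzer-gv code; the random value is passed in as a parameter to keep the sketch deterministic):

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <numeric>
#include <vector>

constexpr size_t kNumMutators = 5;

// Lin-log look-up table: for 5 mutators, [0,1,1,2,2,2,3,3,3,3,4,4,4,4,4].
// Higher slots appear more often, so higher-ranked mutators win more draws.
static std::vector<size_t> BuildLUT(size_t n) {
    std::vector<size_t> lut;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j <= i; j++) lut.push_back(i);
    }
    return lut;
}

static size_t ChooseMutator(const std::array<size_t, kNumMutators> &hist,
                            size_t rnd) {
    static const std::vector<size_t> lut = BuildLUT(kNumMutators);
    // LUT2: mutator indices sorted ascending by hit count, so the last
    // entry is the most effective mutator.
    std::array<size_t, kNumMutators> lut2;
    std::iota(lut2.begin(), lut2.end(), 0);
    std::sort(lut2.begin(), lut2.end(),
              [&](size_t a, size_t b) { return hist[a] < hist[b]; });
    return lut2[lut[rnd % lut.size()]];
}
```

With the example histogram, the top mutator (index 3) is drawn with probability 5/15 = 1/3, while the least effective one (index 4) still gets 1/15.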

Unfortunately, this effort was in vain: it appeared to only slow down fuzzing. Apparently the fuzzer needs mutator diversity in order to reach new coverage. Or maybe I’ve been overlooking something, in which case feel free to comment ;).

Unique call graph traversal

I figured that an approach embedding both stack-depth guidance and intensity guidance would be to keep an array of the code locations hit by the application in one run, hash the array, and use the number of unique hashes as guidance. Unfortunately, this number increments for nearly every input, and memory is soon exhausted. Maybe a less granular coverage instrumentation could work.

Fuzzing tips du jour

  • Sanitizers and fuzzers are distinct technologies: you can fuzz without sanitizers (and sanitize without fuzzing). Fuzzing without sanitizers speeds up corpus generation by an order of magnitude; afterwards, test the corpus with sanitizers enabled.
  • Developers: you can use fuzzing to verify application logic. Where you would normally just print a debug message for a failed check that you believe can never fail, put an abort() instead. Now fuzz it.
  • Sometimes optimizations and compiler versions matter. gcc + ASAN detects an issue in the following program with -O0, but not with -O1 and higher: int main(){char* b;int l=strlen(b);}. clang doesn’t find it with any optimization flag. The reverse (crashes with -O3, not with -O1) can also happen (see my OpenSSH CVE-2016-10012). Security that relies on specific compiler versions and flags is probably a great way to contribute backdoored code to open-source software, if you are so inclined. Had I been a bad hombre, this is what I would do. Maintainers testing your code with their clang -O2 build system + regression tests + fuzzing rig will probably not detect your malicious code hiding in plain sight, but it will nonetheless creep into some percentage of binaries.
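To illustrate the abort() tip, here is a hypothetical length-prefixed record parser: the final consistency check is believed unreachable, so it aborts instead of logging, and the fuzzer gets to hunt for a counterexample.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical parser: each record is a 1-byte length prefix followed by
// that many payload bytes. The "consumed == size" invariant at the end is
// believed to always hold; abort() turns a silent logic error into a
// crash the fuzzer can find.
static int ParseRecords(const uint8_t *data, size_t size) {
    size_t consumed = 0;
    while (consumed < size) {
        const size_t len = data[consumed]; // 1-byte length prefix
        if (consumed + 1 + len > size) {
            return -1; // truncated record: reject the input
        }
        consumed += 1 + len; // skip header + payload
    }
    if (consumed != size) {
        abort(); // "can't happen" -- let the fuzzer try to prove otherwise
    }
    return 0;
}
```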

Work

There’s been a lot of commercial interest in my activities after OpenVPN. Yes, I am available for contracting work.

I’ve recently completed work for a well-respected open-source application. I had a wonderful run: about 10 remote vulnerabilities in one week (release 17 Jul 2017).

I love to go full-out on software and exploit every technique known to me to squeeze out every vulnerability. I’ve got a lot of lesser-known tricks up my sleeve that I like to use.

Feel free to contact me: guidovranken @ gmail com and inquire about the possibilities.

fuzzing is literally magic