A Code Signature Plugin for IDA

When reversing embedded code, it is often the case that completely different devices are built around a common code base, either due to code re-use by the vendor, or through the use of third-party software; this is especially true of devices running the same Real Time Operating System.

For example, I have two different routers, manufactured by two different vendors, and released about four years apart. Both devices run VxWorks, but the firmware for the older device included a symbol table, making it trivial to identify most of the original function names:

VxWorks Symbol Table

VxWorks Symbol Table

The older device with the symbol table is running VxWorks 5.5, while the newer device (with no symbol table) runs VxWorks 5.5.1, so they are pretty close in terms of their OS version. However, even simple functions contain a very different sequence of instructions when compared between the two firmwares:

strcpy from the VxWorks 5.5 firmware

strcpy from the VxWorks 5.5 firmware

strcpy from the VxWorks 5.5.1 firmware

strcpy from the VxWorks 5.5.1 firmware

Of course, binary variations can be the result of any number of things, including differences in the compiler version and changes to the build options.

Despite this, it would still be quite useful to take the known symbol names from the older device, particularly those of standard and common subroutines, and apply them to the newer device in order to facilitate the reversing of higher level functionality.

Continue reading

Mucking About With SquashFS

SquashFS is an incredibly popular file system for embedded Linux devices. Unfortunately, it is also notorious for being hacked up by vendors, causing the standard SquashFS tools (i.e., unsquashfs) to fail when extracting these file systems.

While projects like the Firmware-Mod-Kit (FMK) have amassed many unsquashfs utilities to work with a wide range of SquashFS variations found in the wild, this approach has several draw backs, most notably that each individual unsquashfs tool only supports its one particular variation. If you run into a SquashFS image that is mostly compatible with a given unsquashfs tool, but has some minor modification, you can’t extract it – and worse, you probably don’t know why.

So what are these “minor modifications” that cause unsquashfs to fail?

Continue reading

WRT120N fprintf Stack Overflow

With a good firmware disassembly and JTAG debug access to the WRT120N, it’s time to start examining the code for more interesting bugs.

As we’ve seen previously, the WRT120N runs a Real Time Operating System. For security, the RTOS’s administrative web interface employs HTTP Basic authentication:

401 Unauthorized

401 Unauthorized

Most of the web pages require authentication, but there are a handful of URLs that are explicitly allowed to bypass authentication:

bypass_file_list("/cgi-bin/login /images/ /login...");

bypass_file_list(“/cgi-bin/login /images/ /login…”);

Full list of bypass files

Full list of bypass files

Any request whose URL starts with one of these strings will be allowed without authentication, so they’re a good place to start hunting for bugs.

Continue reading

Encryption vs Compression, Part 2

I’ve recently been examining the feasibility of differentiating compressed data from encrypted data based on variations in the entropy of the data. Initial results showed some promise, but were tested against too small of a sample set to draw any hard conclusions. Since then, I’ve been experimenting with larger data sets (more files and more varied types of encryption / compression) with quite satisfactory results.

The TL;DR is that 98% of the compressed files tested were correctly identified as compressed, and 100% of the encrypted files were identified as not compressed (i.e., encrypted).

In general the entropy of compressed data shows significant variances from that of encrypted data, and can be reliably identified with very few false positives. While identification of certain compression algorithms (namely LZMA) still present some practical concerns depending on your situation, even those compressions were reliably distinguishable from encrypted data during testing due to non-random data in the file’s header structure (see ‘Analysis’ below).

What is particularly exciting, at least for me, is that compression formats which would have been otherwise unknown (e.g., files that weren’t signatured by file or binwalk) were easily identified as compressed through entropy analysis.

Continue reading

Differentiate Encryption From Compression Using Math

When working with binary blobs such as firmware images, you’ll eventually encounter unknown data. Particularly with regards to firmware, unknown data is usually either compressed or encrypted. Analysis of these two types of data is typically approached in very different manners, so it is useful to be able to distinguish one from the other.

The entropy of data can tell us a lot about the data’s contents. Encrypted data is typically a flat line with no variation, while compressed data will often have at least some variation:

Entropy graph of an AES encrypted file

Entropy graph of an AES encrypted file

Entropy graph of a gzip compressed file

Entropy graph of a gzip compressed file

But not all compression algorithms are the same, and some compressed data can be very difficult to visually distinguish from encrypted data:

Entropy graph of an LZMA compressed file

Entropy graph of an LZMA compressed file

However, there are a few tests that can be performed to quantify the randomness of data. The two that I have found most useful are chi square distribution and Monte Carlo pi approximation. These tests can be used to measure the randomness of data and are more sensitive to deviations in randomness than a visual entropy analysis.

Continue reading

Reverse Engineering Serial Ports

Given the name of this blog and the number of requests that I’ve had, I think it’s high time we discussed serial ports; specifically, serial ports in embedded systems.

My goal here is to describe the techniques that I’ve found effective in identifying and reverse engineering embedded serial ports through the use of definitive testing and educated guesses, and without the need for expensive equipment.


Introduction

Serial ports are extremely useful to embedded developers, who commonly use them for:

  • Accessing the boot loader
  • Observing boot and debug messages
  • Interacting with the system via a shell

Needless to say, this functionality is also useful to hackers, so finding a serial port on an embedded device can be very advantageous. As a case study, we’ll be examining the PCB of a Westell 9100EM FiOS router for possible serial ports:

Westell 9100EM PCB

Continue reading

Writing a bFLT Loader for IDA

I was recently working on some uClinux-based devices and needed to disassemble some of the binaries in the firmware. Unfortunately, IDA doesn’t have a loader for the bFLT file format used by uClinux:

No bFLT Loader

Fortunately, I was able to find a bFLT loader over at rockbox.org. Unfortunately this bFLT loader doesn’t process the relocation or global offset tables, which means that string and data cross-references aren’t properly resolved in the disassembled code:

Rockbox bFLT Loader

Fortunately, writing our own IDA loader (especially for a simple file format like bFLT) is pretty easy. Let’s start by taking a look at the layout of a bFLT file.

Continue reading

Emulating NVRAM in Qemu

Being able to emulate embedded applications in Qemu is incredibly useful, but not without pitfalls. Probably the most common issue that I’ve run into are binaries that try to read configuration data from NVRAM; since the binary is running in Qemu and not on the target device, there is obviously no NVRAM to read from.

Embedded applications typically interface with NVRAM through a shared library. The library in turn interfaces with the MTD partition that contains the device’s current configuration settings. Many programs will fail to run properly without the NVRAM configuration data, requiring us to intercept the NVRAM library calls and return valid data in order to properly execute the application in Qemu.

Here’s a Web server extracted from a firmware update image that refuses to start under Qemu:

It looks like httpd can’t start because it doesn’t know what IP address to bind to. The IP can’t be set via a command line argument, so it must be getting this data from somewhere else. Let’s fire up IDA and get cracking!

Continue reading

Exploiting Embedded Systems – Part 4

So far in this series we’ve found that we can log in to our target TEW-654TR router by either retrieving the plain text administrator credentials via TFTP, or through SQL injection in the login page. But the administrative web interface is just too limited – we want a root shell!

We’ll be doing a lot of disassembly and debugging, so having IDA Pro Advanced is recommended. But for those of you who don’t have access to IDA, I’ve included lots of screenshots and an HTML disassembly listing courtesy of IDAnchor.

Continue reading

Exploiting Embedded Systems – Part 3

In part 2 of this series we found a SQL injection vulnerability using static analysis. However, it is often advantageous to debug a target application, a capability that we’ll need when working with more complex exploits later on.

In this segment we won’t be discovering any new vulnerabilities, but instead we will focus on configuring and using our debugging environment. For this we will be using Qemu and the IDA Pro debugger. If you don’t have IDA you can use insight/ddd/gdb instead, but in my experience IDA is far superior when it comes to embedded debugging.

Continue reading