From China, With Love

Lest anyone think that D-Link is the only vendor who puts backdoors in their products, here’s one that can be exploited with a single UDP packet, courtesy of Tenda.

After extracting the latest firmware for Tenda’s W302R wireless router, I started looking at /bin/httpd, which turned out to be the GoAhead webserver:

Server header string in /bin/httpd

But Tenda has made a lot of special modifications themselves. Just before entering the HTTP receive loop, main calls InitMfgTask, which spawns the MfgThread function as a separate thread:

pthread_create(&var_10, 0, MfgThread, 0);

Hmmm…InitMfgTask and MfgThread? Related to manufacturing tasks perhaps? Iiiiiinteresting…

Some IDA Plugins

I’ve posted a few of my IDA plugins on GitHub. Though simple, I’ve found their functionality quite useful when reversing firmware and RISC architectures:

  • Defining ASCII strings not defined during IDA’s auto analysis
  • Defining undefined bytes in the data segment as DWORDs (allowing IDA to resolve function/jump table pointers, etc)
  • Defining undefined bytes in the code segment as code/functions
  • Finding references to any highlighted text (such as registers and immediate values) within the current function
  • Auto-naming MIPS stack variables generated by the compiler for storing registers ($s0-$s7, $gp, etc)

Hopefully others will find them useful as well.
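
To give a feel for the first item in the list, here is a minimal standalone sketch of locating NUL-terminated printable ASCII strings in a raw byte blob. This is not the plugin's code and does not use the IDA API; the function name, the minimum length of 4 and the sample blob are all assumptions made for illustration.

```python
import re

def find_ascii_strings(blob, min_len=4):
    """Return (offset, text) for each NUL-terminated run of printable ASCII."""
    # Printable ASCII (space through tilde) or tab, at least min_len bytes long,
    # immediately followed by a NUL terminator. min_len=4 is an arbitrary choice.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}(?=\x00)" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(blob)]

# Made-up sample data: two real strings, one run that is too short to count.
sample = b"\x01\x02GoAhead-Webs\x00\xff\xfeabc\x00MfgThread\x00"
found = find_ascii_strings(sample)
```

Inside IDA, each offset found this way would still need to be turned into a defined string; that step is specific to the IDA API and is left out here.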

Encryption vs Compression, Part 2

I’ve recently been examining the feasibility of differentiating compressed data from encrypted data based on variations in the entropy of the data. Initial results showed some promise, but were tested against too small a sample set to draw any hard conclusions. Since then, I’ve been experimenting with larger data sets (more files and more varied types of encryption / compression) with quite satisfactory results.

The TL;DR is that 98% of the compressed files tested were correctly identified as compressed, and 100% of the encrypted files were identified as not compressed (i.e., encrypted).

In general, the entropy of compressed data varies significantly from that of encrypted data, and compressed data can be reliably identified with very few false positives. While identification of certain compression algorithms (namely LZMA) still presents some practical concerns depending on your situation, even those were reliably distinguishable from encrypted data during testing due to non-random data in the file’s header structure (see ‘Analysis’ below).

What is particularly exciting, at least for me, is that compression formats which would have been otherwise unknown (e.g., files that weren’t signatured by file or binwalk) were easily identified as compressed through entropy analysis.
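
For reference, the kind of entropy measurement discussed above can be sketched in a few lines. This is only an illustration of Shannon entropy computed per block, not the code used for the tests described here; the function names and the 1024-byte block size are assumptions.

```python
import math
from collections import Counter

def shannon_entropy(block):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    # Sum p * log2(p) over every byte value that actually occurs in the block.
    return -sum((n / total) * math.log2(n / total) for n in Counter(block).values())

def entropy_profile(data, block_size=1024):
    """Entropy of each consecutive block; block_size is an arbitrary choice."""
    return [shannon_entropy(data[i:i + block_size])
            for i in range(0, len(data), block_size)]
```

Plotting the list returned by entropy_profile against the block offset gives an entropy graph; it is the variation between neighboring blocks, rather than the average, that is of interest here.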

Differentiate Encryption From Compression Using Math

When working with binary blobs such as firmware images, you’ll eventually encounter unknown data. Particularly with regard to firmware, unknown data is usually either compressed or encrypted. Analysis of these two types of data is typically approached in very different manners, so it is useful to be able to distinguish one from the other.

The entropy of data can tell us a lot about the data’s contents. The entropy graph of encrypted data is typically a flat line with no variation, while that of compressed data will often show at least some variation:

Entropy graph of an AES encrypted file

Entropy graph of a gzip compressed file

But not all compression algorithms are the same, and some compressed data can be very difficult to visually distinguish from encrypted data:

Entropy graph of an LZMA compressed file

However, there are a few tests that can be performed to quantify the randomness of data. The two that I have found most useful are the chi-square distribution and Monte Carlo pi approximation. These tests can be used to measure the randomness of data and are more sensitive to deviations in randomness than a visual entropy analysis.
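
Both tests are simple to sketch. The code below is a minimal illustration rather than the exact implementation behind the results discussed here: the function names are made up, and the choice of 3-byte coordinates (6 bytes per point) for the Monte Carlo test is an assumption.

```python
import math

def chi_square(data):
    """Chi-square statistic of the byte histogram against a uniform distribution."""
    if not data:
        raise ValueError("need at least one byte")
    expected = len(data) / 256.0          # each byte value is equally likely
    counts = [0] * 256
    for byte in data:
        counts[byte] += 1
    return sum((observed - expected) ** 2 / expected for observed in counts)

def monte_carlo_pi(data):
    """Estimate pi by treating each 6-byte group as an (x, y) point in a square."""
    radius = 2 ** 24 - 1                  # largest value of a 3-byte coordinate
    inside = total = 0
    for i in range(0, len(data) - 5, 6):
        x = int.from_bytes(data[i:i + 3], "big")
        y = int.from_bytes(data[i + 3:i + 6], "big")
        total += 1
        if x * x + y * y <= radius * radius:   # point falls inside the quarter circle
            inside += 1
    return 4.0 * inside / total if total else float("nan")

def pi_error_percent(data):
    """Percentage error of the Monte Carlo estimate relative to math.pi."""
    return abs(monte_carlo_pi(data) - math.pi) / math.pi * 100.0
```

For truly random bytes the chi-square statistic should hover around 255 (the number of degrees of freedom) and the pi estimate should land close to 3.14159; data with structure pushes both values away from those targets. Where to draw the line between "random enough" and "not random" is a judgment call that this sketch does not make.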

IDAScript For Linux and OSX

Being able to run IDA scripts from the command line is very useful, but can be a bit kludgy. Fortunately, idascript was written to simplify this process. Unfortunately (for me), it was written for Windows.

Since I work primarily in a Linux environment, I rewrote the idascript utility in Python. I also added a few features to the idascript Python module, for convenience:

  • Script arguments are accessible via the normal sys.argv
  • The script can be terminated via the normal sys.exit function
  • The directory to your collection of IDA scripts (specified during install) is added to sys.path

Installation is straightforward:

eve@eve:~/idascript$ sudo ./install.py 
Absolute path to your IDA install directory: /opt/ida/bin
 
Absolute path to the directory where you usually keep all your IDA scripts: /opt/ida/scripts
 
IDA_INSTALL_PATH = /opt/ida/bin
IDA_SCRIPT_PATH = /opt/ida/scripts
IDA_OUT_FILE = /tmp/idaout.txt

Using existing IDAPython scripts with idascript is as easy as importing the idascript module:

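import idascript

print "Cross references to strcpy:"

# XrefsTo comes from IDAPython's idautils module; LocByName and GetDisasm from idc.
# Print the address and disassembly of every cross reference to strcpy.
for xref in XrefsTo(LocByName("strcpy")):
    print "0x%.8X  %s" % (xref.frm, GetDisasm(xref.frm))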

And usage of idascript itself is the same as the original idascript utility:

eve@eve:~$ idascript ./target.idb ./strcpy.py 
Cross references to strcpy:
0x00407F68  jalr    $t9 ; strcpy
0x0040B9B8  jalr    $t9 ; strcpy
0x0040E5BC  jr      $t9 ; strcpy
0x0041D448  jalr    $t9 ; strcpy
0x00422C04  jalr    $t9 ; strcpy
0x00422D04  jalr    $t9 ; strcpy
0x00424C4C  jalr    $t9 ; strcpy
0x00425400  jalr    $t9 ; strcpy
0x00430358  jalr    $t9 ; strcpy
0x0043045C  jalr    $t9 ; strcpy
0x00434118  jalr    $t9 ; strcpy
0x00436A30  jalr    $t9 ; strcpy
0x0043CE48  jalr    $t9 ; strcpy
0x00407F58  la      $t9, strcpy
0x0040B9AC  la      $t9, strcpy
0x0040E598  la      $t9, strcpy
0x0041D440  la      $t9, strcpy
0x00422BF8  la      $t9, strcpy
0x00422CF8  la      $t9, strcpy
0x00422D74  la      $t9, strcpy
0x00424C44  la      $t9, strcpy
0x004253F0  la      $t9, strcpy
0x004302D8  la      $t9, strcpy
0x00430454  la      $t9, strcpy
0x00434110  la      $t9, strcpy
0x00436A28  la      $t9, strcpy
0x0043CE40  la      $t9, strcpy
0x00498ECC  .word strcpy
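
Because the results land on the terminal as plain text, they are easy to post-process. As a small sketch (not part of idascript; the function name and the regular expression are assumptions based only on the output format shown above), the cross references can be tallied by instruction mnemonic:

```python
import re
from collections import Counter

# An address in hex followed by whitespace and the instruction mnemonic.
XREF_LINE = re.compile(r"^(0x[0-9A-Fa-f]+)\s+(\S+)")

def count_by_mnemonic(output):
    """Count cross-reference lines per mnemonic, ignoring any other lines."""
    counts = Counter()
    for line in output.splitlines():
        match = XREF_LINE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts

# A few lines copied from the output above.
sample_output = """Cross references to strcpy:
0x00407F68  jalr    $t9 ; strcpy
0x0040E5BC  jr      $t9 ; strcpy
0x00407F58  la      $t9, strcpy
0x00498ECC  .word strcpy
"""
tally = count_by_mnemonic(sample_output)
```

This separates the actual call sites (jalr and jr) from the instructions that merely load the address of strcpy.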

Jailbreaking the NeoTV

Today we’ll be jailbreaking the Netgear NTV300 set top box…with a TV remote.

The Netgear NeoTV 300

Netgear’s NeoTV set top boxes are designed to compete with the popular Roku, and can stream video from all the usual sources (Netflix, Hulu Plus, YouTube, etc). The NTV300 is one of the least expensive NeoTV models, and while a GPL release is available, it contains only copies of the various standard open source utilities used by the NTV300. All the interesting bits – such as Netflix streaming, or the ability to build a custom firmware image – are not included.

Inside the NTV300 we find a Mediatek ARM SoC, a 128MB NAND flash chip and 256MB of RAM:

Inside the NTV300

Exploiting a MIPS Stack Overflow

Although D-Link’s CAPTCHA login feature has a history of implementation flaws and has been proven to not protect against the threat it was intended to thwart, they continue to keep this feature in their products. Today we’ll be looking at the CAPTCHA implementation in the D-Link DIR-605L, which is a big-endian MIPS system running Linux 2.4.

A pre-authentication vulnerability exists in the DIR-605L’s processing of the user-supplied CAPTCHA data from the Web-based login page. The formLogin function in the Boa Web server is responsible for handling the login data, and obtains the value of the FILECODE POST variable using the websGetVar function. The FILECODE value contains a unique string identifying the CAPTCHA image displayed on the login page, and is saved to the $s1 register:

$s1 = FILECODE

If the CAPTCHA feature is enabled, this value is later passed as the second argument to the getAuthCode function:

FILECODE value being passed to getAuthCode

The getAuthCode function saves the FILECODE value back to the $s1 register:

$s1 = $a1

Which in turn is passed as the third argument to sprintf (note the ‘%s’ in the sprintf format string):

sprintf’s are bad, mmmk?

The result of the sprintf is saved to the address contained in $s0, which is the address of the stack variable var_80:

$a0 = var_80

This is a classic stack-based buffer overflow, and overflowing var_80 allows us to control all of the register values saved onto the stack by getAuthCode’s function prologue, including the saved return address and the saved values of the $s0 – $s3 registers:

getAuthCode stack layout
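
The effect of the unbounded sprintf can be modeled in a few lines of Python. This is a toy simulation and not an exploit for the device: the buffer size, the order of the saved registers, the original register values and the format string prefix are all hypothetical, since none of them are spelled out in the text. Only the big-endian byte order and the set of saved registers ($s0 – $s3 and the return address) come from the description above.

```python
import struct

BUF_SIZE = 0x70                           # hypothetical size of var_80
SAVED = ["s0", "s1", "s2", "s3", "ra"]    # saved registers (order assumed)
PREFIX = b"prefix_"                       # hypothetical text preceding the %s

def build_frame():
    """Model the frame: the buffer, then five saved 32-bit big-endian values."""
    frame = bytearray(BUF_SIZE)
    # Made-up register contents, with 0x00400000 standing in for the return address.
    for value in (0x11111111, 0x22222222, 0x33333333, 0x44444444, 0x00400000):
        frame += struct.pack(">I", value)
    return frame

def unbounded_sprintf(frame, value):
    """Mimic sprintf(buf, "prefix_%s", value): no length check on the copy."""
    out = PREFIX + value + b"\x00"
    frame[:len(out)] = out                # writes past BUF_SIZE if out is too long
    return frame

def saved_registers(frame):
    """Read back the saved registers that would be restored in the epilogue."""
    values = struct.unpack(">5I", bytes(frame[BUF_SIZE:BUF_SIZE + 20]))
    return dict(zip(SAVED, values))

def build_payload(new_ra):
    """Filler up to the end of the buffer, then values for $s0-$s3 and $ra."""
    filler = b"A" * (BUF_SIZE - len(PREFIX))
    return filler + struct.pack(">4I", *[0x42424242] * 4) + struct.pack(">I", new_ra)
```

Calling unbounded_sprintf(build_frame(), build_payload(0xDEADBEEF)) and then saved_registers on the result shows every saved register replaced by attacker-chosen values. Note that a real '%s' copy stops at the first NUL byte, so the values placed in the payload cannot contain zero bytes.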

Reverse Engineering a DTV Converter

I have an old DTV converter sitting around gathering dust, so I thought it would be interesting to take a look inside:

Inside the DTV Converter

As you can see, there’s not much there: a Thomson TV tuner, an IR receiver, 32MB of RAM and a 2MB flash chip (on the underside of the board). What really makes this interesting though is the LGDT1111 SoC; this is a DTV chip manufactured by LG, so it’s a little different than the Broadcom/Atheros/Ralink/etc SoCs found in a lot of other consumer devices. It is very popular with many DTV converters though, so determining its CPU architecture and reversing the underlying firmware could be interesting.

Digging around on the Internet turned up a nice block diagram of the LGDT1111 (courtesy of MVPtek):

LGDT1111 Block Diagram

The MVPtek web site states that the SoC uses an “AMR926EJ-S™” controller…could they mean an ARM926EJ-S™? Hmmm…

Hacking the Linksys WMB54G

Today we’re going to take a look at an interesting little device, the Linksys WMB54G wireless music bridge.

WMB54G

This is a pretty specialized device, so it’s likely a fairly minimalistic system. Even the administrative interface is small and simple:

WMB54G Administrative Interface

The Linksys support page doesn’t have any firmware updates available, so let’s take a peek at the hardware.

Opening the case reveals an expectedly limited system, with just 2MB of flash, 8MB of RAM and a small processor covered up by a heat sink:

WMB54G Internals

There are two connectors on the right hand side of the board, labelled J5 and J9. J5 appears to be a JTAG connector, while J9 shows promise of being a serial port:

J5 and J9 Connectors

Emulating NVRAM in Qemu

Being able to emulate embedded applications in Qemu is incredibly useful, but not without pitfalls. Probably the most common issue that I’ve run into is binaries that try to read configuration data from NVRAM; since the binary is running in Qemu and not on the target device, there is obviously no NVRAM to read from.

Embedded applications typically interface with NVRAM through a shared library. The library in turn interfaces with the MTD partition that contains the device’s current configuration settings. Many programs will fail to run properly without the NVRAM configuration data, requiring us to intercept the NVRAM library calls and return valid data in order to properly execute the application in Qemu.
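
Conceptually, whatever stands in for the NVRAM library only has to answer key lookups with sensible values. The sketch below models just that lookup logic in Python; it is not the interception mechanism itself, and the nvram_get name, the key=value file format and the example keys are all assumptions for illustration.

```python
def load_nvram(text):
    """Parse key=value lines into a dict, skipping blanks and # comments."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        table[key.strip()] = value.strip()
    return table

def nvram_get(table, key, default=None):
    """Return the stored value for key, or default when the key is missing."""
    return table.get(key, default)

# Hypothetical settings that a Web server might ask for at startup.
settings = load_nvram("""
# fake NVRAM contents
lan_ipaddr=127.0.0.1
http_port = 8080
""")
```

With a table like this, the emulated binary's requests for configuration values can be answered from a plain text file instead of an MTD partition that does not exist under Qemu.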

Here’s a Web server extracted from a firmware update image that refuses to start under Qemu:

It looks like httpd can’t start because it doesn’t know what IP address to bind to. The IP can’t be set via a command line argument, so it must be getting this data from somewhere else. Let’s fire up IDA and get cracking!
