Binwalk v1.0, Now With Python!

Binwalk 1.0 has just been released and has been completely re-written as a Python module. This means that not only does it feature smarter scanning and signature processing features that were much, much easier to implement in Python, but it is now fully scriptable.

Aside from a few new options (and the removal of a few deprecated ones), the command line usage is pretty much the same. My personal favorite combination of options is ‘-re’, which, besides being a nod to reverse engineering, attempts to extract data from the target file and cleans up after itself (very handy when a lot of false positive LZMA files get extracted!):

$ binwalk -re firmware.bin

Scripting with binwalk is pretty straightforward. To perform a simple scan (equivalent to running binwalk with no command line options):

import pprint
from binwalk import Binwalk

binwalk = Binwalk()
pprint.PrettyPrinter().pprint(binwalk.scan('firmware.bin'))
binwalk.cleanup()

Check out the wiki for more command line usage and API examples.

Binwalk 0.5 Release

In celebration of the world not ending, a new version of Binwalk has been released. Notable changes:

  • Much improved signatures for several common file types, particularly JFFS2
  • “Smart signature” keyword support, for more reliable and faster scans
  • Ability to invoke external applications to process extracted files

The last feature is probably of most interest, and is implemented as an extension of the pre-existing --dd option:

$ binwalk --dd='gzip:gz:gunzip %e' firmware.bin

The above command instructs Binwalk to extract any file whose description contains the text ‘gzip’, save it to disk with a ‘gz’ file extension, and to then run the ‘gunzip %e’ command (the %e is a placeholder that will be replaced with the actual name of the extracted file). This allows for auto extraction and decompression of gzipped files.
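The rule format described above can be sketched in a few lines of Python. This is only an illustration of the ‘type:ext:cmd’ idea, not Binwalk's actual implementation: the function names are hypothetical, and both the case-insensitive match and the ‘<hex offset>.<ext>’ file name are assumptions based on how extracted files are described elsewhere on this page.

```python
import shlex


def split_rule(rule):
    """Split a 'type:ext:cmd' rule into three fields; cmd may be empty."""
    fields = rule.split(":", 2)
    while len(fields) < 3:
        fields.append("")
    return tuple(fields)


def plan_extraction(rule, description, offset):
    """Return (filename, command) if the rule matches the description, else None.

    Assumption: a rule matches when its type text appears anywhere in the
    description, ignoring case.
    """
    rule_type, extension, command = split_rule(rule)
    if rule_type.lower() not in description.lower():
        return None
    # Assumption: extracted files are named by their hexadecimal offset.
    filename = "%X.%s" % (offset, extension)
    # Replace the %e placeholder with the (shell-quoted) extracted file name.
    filled = command.replace("%e", shlex.quote(filename)) if command else None
    return filename, filled


print(plan_extraction("gzip:gz:gunzip %e", "gzip compressed data, from Unix", 0x1C))
```

A rule without a third field simply yields no command, which corresponds to extracting the file without running anything on it.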

Although multiple --dd options may be specified, there are probably several common file types that you always want extracted whenever they are encountered. Binwalk 0.5 allows you to place multiple --dd arguments into the $HOME/.binwalk/extract.conf file:

# Extract and decompress gzip and lzma files
gzip:gz:gunzip %e
lzma:7z:7zip -d %e

# Extract private keys, but don't run anything
private key:key

The extract rules from this file are applied whenever the --extract option is specified:

$ binwalk --extract firmware.bin
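For anyone who wants to reuse an extract.conf file from their own scripts, a file in this format is easy to read. The sketch below assumes only what the sample shows: one rule per line, ‘#’ comment lines, blank lines ignored, and an optional command as the third field. The function name and dictionary keys are hypothetical and not part of Binwalk.

```python
SAMPLE_CONF = """\
# Extract and decompress gzip and lzma files
gzip:gz:gunzip %e
lzma:7z:7zip -d %e

# Extract private keys, but don't run anything
private key:key
"""


def parse_extract_conf(text):
    """Parse extract.conf-style text into a list of rule dictionaries."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Split on the first two colons only, so the command may contain ':'.
        fields = line.split(":", 2)
        rules.append({
            "type": fields[0],
            "extension": fields[1] if len(fields) > 1 else "",
            "command": fields[2] if len(fields) > 2 else None,
        })
    return rules


for rule in parse_extract_conf(SAMPLE_CONF):
    print(rule)
```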

Binwalk also ships with several default extract rules. These are stored in /usr/local/etc/binwalk/extract.conf, and will be updated whenever the --update option is specified. Note that many of these rules expect the firmware-mod-kit to be installed to /opt/firmware-mod-kit, but they can be overridden by rules in the $HOME/.binwalk/extract.conf file.

This means that a Binwalk scan can now not only identify embedded files, but also extract and decompress them for you automatically:

$ binwalk --extract firmware.bin 

DECIMAL   	HEX       	DESCRIPTION
-------------------------------------------------------------------------------------------------------
0         	0x0       	TRX firmware header, little endian, header size: 28 bytes,  image size: 13533184 bytes, CRC32: 0x15289B44 flags/version: 0x10000
28        	0x1C      	gzip compressed data, was "piggy", from Unix, last modified: Mon Dec  3 13:09:06 2012, max compression
2005108   	0x1E9874  	Squashfs filesystem, little endian, non-standard signature,  version 3.1, size: 11525877 bytes, 2743 inodes, blocksize: 131072 bytes, created: Mon Dec  3 13:49:31 2012 

$ ls
1C  1E9874.squashfs  firmware.bin  squashfs-root/
$ ls squashfs-root
bin  dev  etc  home  JNAP  lib  libexec  linuxrc  mnt  opt  proc  root  sbin  sys  tmp  usr  var  www

IDAScript For Linux and OSX

Being able to run IDA scripts from the command line is very useful, but can be a bit kludgy. Fortunately, idascript was written to simplify this process. Unfortunately (for me), it was written for Windows.

Since I work primarily in a Linux environment, I re-wrote the idascript utility in Python. I also added a few features to the idascript Python module, for convenience:

  • Script arguments are accessible via the normal sys.argv
  • The script can be terminated via the normal sys.exit function
  • The directory to your collection of IDA scripts (specified during install) is added to sys.path

Installation is straightforward:

eve@eve:~/idascript$ sudo ./install.py 
Absolute path to your IDA install directory: /opt/ida/bin
 
Absolute path to the directory where you usually keep all your IDA scripts: /opt/ida/scripts
 
IDA_INSTALL_PATH = /opt/ida/bin
IDA_SCRIPT_PATH = /opt/ida/scripts
IDA_OUT_FILE = /tmp/idaout.txt

Using existing IDAPython scripts with idascript is as easy as importing the idascript module:

import idascript

print "Cross references to strcpy:"

for xref in XrefsTo(LocByName("strcpy")):
    print "0x%.8X  %s" % (xref.frm, GetDisasm(xref.frm))

And usage of idascript itself is the same as the original idascript utility:

eve@eve:~$ idascript ./target.idb ./strcpy.py 
Cross references to strcpy:
0x00407F68  jalr    $t9 ; strcpy
0x0040B9B8  jalr    $t9 ; strcpy
0x0040E5BC  jr      $t9 ; strcpy
0x0041D448  jalr    $t9 ; strcpy
0x00422C04  jalr    $t9 ; strcpy
0x00422D04  jalr    $t9 ; strcpy
0x00424C4C  jalr    $t9 ; strcpy
0x00425400  jalr    $t9 ; strcpy
0x00430358  jalr    $t9 ; strcpy
0x0043045C  jalr    $t9 ; strcpy
0x00434118  jalr    $t9 ; strcpy
0x00436A30  jalr    $t9 ; strcpy
0x0043CE48  jalr    $t9 ; strcpy
0x00407F58  la      $t9, strcpy
0x0040B9AC  la      $t9, strcpy
0x0040E598  la      $t9, strcpy
0x0041D440  la      $t9, strcpy
0x00422BF8  la      $t9, strcpy
0x00422CF8  la      $t9, strcpy
0x00422D74  la      $t9, strcpy
0x00424C44  la      $t9, strcpy
0x004253F0  la      $t9, strcpy
0x004302D8  la      $t9, strcpy
0x00430454  la      $t9, strcpy
0x00434110  la      $t9, strcpy
0x00436A28  la      $t9, strcpy
0x0043CE40  la      $t9, strcpy
0x00498ECC  .word strcpy

Binwalk 0.4.5 Release

Binwalk 0.4.5 is now available. This release includes a couple of bug fixes, including a (small) memory leak, and a signature parsing bug which prevented certain signatures from loading properly.

A new command line option has been added as well: --dd. This feature instructs Binwalk to extract embedded files that it finds automatically. For example, to extract all ‘gzip’ files and save them with the extension ‘gz’:

$ binwalk firmware.bin --dd=gzip:gz

To extract all gzip files but only the first JFFS2 entry:

$ binwalk firmware.bin --dd=gzip:gz --dd=jffs2:jffs2:1

To extract every file that Binwalk identifies, use the ‘all’ keyword:

$ binwalk firmware.bin --dd=all:dat

All string matches are case insensitive. Extracted files are named by their respective hexadecimal offsets in the original file. The extracted files will contain all data from the offset where the signature was found to EOF.
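In other words, extraction here is a plain "carve to end of file". A rough Python equivalent of that behavior is sketched below, assuming the upper-case hex naming seen in Binwalk's output; the carve function and its parameters are hypothetical and only illustrate the naming and offset-to-EOF semantics.

```python
import os


def carve(path, offset, extension, out_dir="."):
    """Copy everything from `offset` to EOF into '<HEX offset>.<extension>'."""
    with open(path, "rb") as src:
        src.seek(offset)
        data = src.read()  # read from the signature offset through EOF
    out_path = os.path.join(out_dir, "%X.%s" % (offset, extension))
    with open(out_path, "wb") as dst:
        dst.write(data)
    return out_path


# Example (assumes firmware.bin exists and a gzip signature was found at offset 28):
# carve("firmware.bin", 28, "gz")   # writes ./1C.gz
```

Because the carved file runs to EOF, it will usually carry trailing data that does not belong to the embedded file; many decompressors tolerate this, but not all do.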

Get Binwalk 0.4.5 here.

A Better Way to TFTP

Working with embedded devices, I end up using TFTP quite a bit. While most operating systems offer TFTP clients, they tend to be a bit archaic and lack simple features that we hacker types might find useful. So of course, I rolled my own.

Tfcp is a TFTP client utility written in Python using the excellent tftpy module. Usage is simple and mimics that of scp:

Uploading file ‘foo.txt’ to ‘/tmp/bar’:

$ tfcp ./foo.txt 192.168.1.1:/tmp/bar

Downloading ‘/tmp/bar’ to your current working directory:

$ tfcp 192.168.1.1:/tmp/bar .
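The scp-style syntax boils down to checking which argument carries a ‘host:’ prefix. The following is a minimal sketch of that decision, not tfcp's actual code; the function names and the returned tuple layout are made up for illustration.

```python
def parse_endpoint(arg):
    """Return (host, path); host is None when arg is a local path."""
    host, sep, path = arg.partition(":")
    # Treat 'host:path' as remote only if the part before ':' is not a path.
    if sep and host and "/" not in host:
        return host, path
    return None, arg


def plan_transfer(src, dst):
    """Decide the transfer direction from scp-style source/destination arguments."""
    src_host, src_path = parse_endpoint(src)
    dst_host, dst_path = parse_endpoint(dst)
    if src_host and not dst_host:
        return ("download", src_host, src_path, dst_path)
    if dst_host and not src_host:
        return ("upload", dst_host, dst_path, src_path)
    raise ValueError("exactly one of source/destination must be host:path")


print(plan_transfer("./foo.txt", "192.168.1.1:/tmp/bar"))
print(plan_transfer("192.168.1.1:/tmp/bar", "."))
```

The resulting tuple (direction, host, remote path, local path) holds everything a TFTP library such as tftpy needs to perform the transfer.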

There are two key features that I like about tfcp:

  1. It is non-interactive, which means it’s easily scriptable and all tfcp commands get stored in your command history
  2. It allows you to specify both the local and remote file names

Although these are simple, seemingly innocuous features, they are severely lacking in most TFTP client utilities, and as we’ll soon see, they can be key features when analyzing/exploiting embedded systems.

You can grab tfcp from the Google Code page; you’ll need to install tftpy first, either from source, or through apt-get (python-tftpy).

Hardware Hacking With Python

In preparation for our Embedded Device Exploitation classes, I’ve just released my latest project, the Gumbi board:

New Gumbi boards, fresh off the press

The Gumbi board provides a flexible USB interface to the real world in the form of 64 digital I/O pins – all controllable from the comfort of your Python shell, allowing you to rapidly prototype and create new tools for interfacing with external devices.

Take flashbin for example, an open source flash programmer I’ve written for working with external parallel flash chips.

Although popular for firmware storage, parallel NOR flash chips are particularly difficult for hobbyists/hackers to work with because their interface typically requires 30 to 40 I/O pins (or more!). This tends to result in error-prone wiring that has to be re-wired whenever you need to interface with a different chip.

Using the Gumbi board however, everything can be defined (and re-defined) in software. Just plug the chip in, create a flashbin config file that defines the pin configuration for your target chip, and you’re ready to go:

A 4MB NOR flash chip connected to the Gumbi board via a ZIF socket adapter

Dumping firmware from the 4MB flash chip with flashbin


Writing a bFLT Loader for IDA

I was recently working on some uClinux-based devices and needed to disassemble some of the binaries in the firmware. Unfortunately, IDA doesn’t have a loader for the bFLT file format used by uClinux:

No bFLT Loader

Fortunately, I was able to find a bFLT loader over at rockbox.org. Unfortunately this bFLT loader doesn’t process the relocation or global offset tables, which means that string and data cross-references aren’t properly resolved in the disassembled code:

Rockbox bFLT Loader

Fortunately, writing our own IDA loader (especially for a simple file format like bFLT) is pretty easy. Let’s start by taking a look at the layout of a bFLT file.
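As a rough preview of that layout, the fixed-size header at the start of a bFLT file can be read with Python's struct module. The sketch below assumes the version 4 header as I recall it from uClinux's flat.h: a 4-byte ‘bFLT’ magic followed by big-endian 32-bit fields, 64 bytes in total. The field names and the GOT flag value follow that header and should be checked against the headers for your target before relying on them.

```python
import struct
from collections import namedtuple

# Assumed layout: magic[4], ten 32-bit fields, five reserved 32-bit words.
BFLT_HEADER = struct.Struct(">4s10I5I")  # big-endian, 64 bytes

BfltHeader = namedtuple(
    "BfltHeader",
    "magic rev entry data_start data_end bss_end "
    "stack_size reloc_start reloc_count flags build_date",
)

FLAT_FLAG_GOTPIC = 0x2  # assumed flag bit: binary carries a global offset table


def parse_bflt_header(data):
    """Parse the first 64 bytes of a bFLT file into a BfltHeader."""
    if len(data) < BFLT_HEADER.size:
        raise ValueError("file too small to hold a bFLT header")
    values = BFLT_HEADER.unpack_from(data)
    header = BfltHeader(*values[:11])  # drop the five reserved words
    if header.magic != b"bFLT":
        raise ValueError("not a bFLT file")
    return header


# Example (assumes ./target.bflt exists):
# with open("target.bflt", "rb") as f:
#     hdr = parse_bflt_header(f.read(BFLT_HEADER.size))
#     print(hex(hdr.entry), hdr.reloc_count, bool(hdr.flags & FLAT_FLAG_GOTPIC))
```

The reloc_start and reloc_count fields locate the relocation table, and the GOT flag indicates whether a global offset table needs processing; those are the two pieces the Rockbox loader skips.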


Binwalk 0.4.2 Release

Binwalk v0.4.2 has just been released. One of the major drawbacks to binwalk in the past has been scan time, which can take quite a while on larger files. Thanks to some user-supplied suggestions, I’m happy to say that scan times have improved by more than an order of magnitude; scans that previously took 10+ minutes now finish in just 30 seconds!

Some new search options have been added as well, one of my favorites being --raw-bytes. This option allows you to specify a sequence of bytes to search for without having to create a custom entry in the magic file:

$ binwalk --raw-bytes="abcdefg" firmware.bin
$ binwalk --raw-bytes="\x00\x01\x02\x03" firmware.bin
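Conceptually, a raw byte search just reports every offset at which the sequence occurs. A small Python sketch of the same idea follows; it is not how Binwalk implements the option, and the function name is hypothetical.

```python
def find_offsets(data, needle):
    """Return every offset in `data` at which the byte string `needle` occurs."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        # Resume one byte later so overlapping matches are reported too.
        start = data.find(needle, start + 1)
    return offsets


# In-memory demo; for a real file, read it with open(path, "rb").read().
blob = b"\x00\x01\x02\x03abc\x00\x01\x02\x03"
for offset in find_offsets(blob, b"\x00\x01\x02\x03"):
    print("%-10d0x%X" % (offset, offset))
```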

Get the 0.4.2 release here.