Writing a bFLT Loader for IDA

I was recently working on some uClinux-based devices and needed to disassemble some of the binaries in the firmware. Unfortunately, IDA doesn’t have a loader for the bFLT file format used by uClinux:

No bFLT Loader

Fortunately, I was able to find a bFLT loader over at rockbox.org. Unfortunately, this bFLT loader doesn’t process the relocation table or the global offset table, which means that string and data cross-references aren’t properly resolved in the disassembled code:

Rockbox bFLT Loader

Fortunately, writing our own IDA loader (especially for a simple file format like bFLT) is pretty easy. Let’s start by taking a look at the layout of a bFLT file.

bFLT (binary flat file) is a minimalist file format used by uClinux executables in place of the more common (but more bloated) ELF format. bFLT files have a pretty simple layout:

bFLT File Layout

The file starts with a bFLT header (more on this later). The header is followed by some NULL padding, typically 8 or 16 bytes.

Next is the .text section of the file, which contains the executable code. The entry point of the application is at the first byte of the .text section.

After the .text section comes the .data section. If the binary was built as a position-independent executable (-fPIC), the first part of the .data section will contain the global offset table.

Finally comes the .bss section. In the file itself, the relocation table typically sits at the offset where .bss begins, and the end of .bss reported in the header typically extends beyond the physical size of the binary.

The file header contains all of the information (stored in big endian format) we need in order to locate the beginning and end of each of these sections:

struct bflt_header
{
        uint8_t magic[4];             // "bFLT"
        uint32_t version;             // version 4 is the most common
        uint32_t entry;               // beginning of .text
        uint32_t data_start;          // end of .text, beginning of .data
        uint32_t data_end;            // end of .data, beginning of .bss
        uint32_t bss_end;             // end of .bss
        uint32_t stack_size;
        uint32_t reloc_start;         // beginning of data relocation entries
        uint32_t reloc_count;         // number of data relocation entries
        uint32_t flags;               // RAM | GZIP | GOTPIC
        uint32_t build_date;
        uint32_t filler[5];                
};
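Before touching IDA, it can help to sanity check the header layout on its own. The following is a minimal sketch, assuming the struct above is complete; `BfltHeader`, `BFLT_HEADER_FORMAT` and `parse_bflt_header` are hypothetical names used only for illustration, not part of any existing loader:

```python
import struct
from collections import namedtuple

# Field names mirror the bflt_header struct; BfltHeader itself is a hypothetical helper.
BfltHeader = namedtuple("BfltHeader", [
    "magic", "version", "entry", "data_start", "data_end", "bss_end",
    "stack_size", "reloc_start", "reloc_count", "flags", "build_date",
])

# 4-byte magic, ten big-endian uint32 fields, then the 5 filler words (20 pad bytes)
BFLT_HEADER_FORMAT = ">4s10I20x"


def parse_bflt_header(data):
    """Unpack the leading header bytes of a bFLT file; raise ValueError on a bad magic."""
    size = struct.calcsize(BFLT_HEADER_FORMAT)
    header = BfltHeader(*struct.unpack(BFLT_HEADER_FORMAT, data[:size]))
    if header.magic != b"bFLT":
        raise ValueError("not a bFLT file")
    return header
```

The format string works out to 0x40 bytes, which is where the BFLT_HEADER_SIZE constant used in the loader comes from. The size of .text is then simply data_start - entry, and the size of .data is data_end - data_start.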

Knowing this, we can write a very simple IDA loader in Python that checks to see if the target file is a bFLT file then loads the file into the IDA database, creating the .text, .data and .bss segments:

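import struct
from idaapi import *    # add_segm
from idc import *       # PatchDword, MakeRptCmt, MakeName

BFLT_MAGIC = "bFLT"     # use b"bFLT" if li.read() returns bytes (Python 3)
BFLT_VERSION = 4
BFLT_HEADER_SIZE = 0x40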

# Returns 0 if the file is not a bFLT file
# Returns a string description if the file is a bFLT file
def accept_file(li, n):

        retval = 0

        if n == 0:
                # Seek to the beginning of the file
                li.seek(0)

                # Make sure this is a bFLT v4 file
                if li.read(4) == BFLT_MAGIC and struct.unpack(">I", li.read(4))[0] == BFLT_VERSION:
                        retval = "%s v%d executable" % (BFLT_MAGIC, BFLT_VERSION)

        return retval

# Load the file into the IDA database
# Return 1 on success, 0 on failure
def load_file(li, neflags, format):

        # Read in the bFLT header fields
        li.seek(0)
        (magic, version, entry, data_start, data_end, bss_end, stack_size, reloc_start, reloc_count, flags) = struct.unpack(">IIIIIIIIII", li.read(4*10))

        # Load the file data (sans header) into IDA
        li.file2base(entry, entry, data_end, True)

        # Define the .text .data and .bss segments
        add_segm(0, entry, data_start, ".text", "CODE")
        add_segm(0, data_start, data_end, ".data", "DATA")
        add_segm(0, data_end, bss_end, ".bss", "BSS")

        return 1

This is what the rockbox bFLT loader does. However, as mentioned before, this loader does not process the data relocation entries. Without this, data references will not be properly resolved.

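See, each data reference in the bFLT file is off by BFLT_HEADER_SIZE (0x40) bytes: the pointers stored in the file don't count the header, but we loaded the file contents at their raw file offsets, header included. We need to find each of these data references and “relocate” them to point to the proper address.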

The bFLT header tells us where the relocation entries start and how many entries there are. Each relocation entry is a 32-bit pointer to an address that contains a data reference that needs to be relocated (i.e., it’s a pointer to a pointer). The relocation entries are stored in big endian format and do not take into account the size of the bFLT header. Some simple code to process the first relocation entry might look like:

# Read in the first relocation entry and add BFLT_HEADER_SIZE to get a pointer to the data pointer we need to relocate
li.seek(reloc_start)
reloc_offset = struct.unpack(">I", li.read(4))[0] + BFLT_HEADER_SIZE

# Read in the data pointer. Add BFLT_HEADER_SIZE in order to "relocate" the pointer and make it point to the right place.
li.seek(reloc_offset)
reloc_offset_patched = struct.unpack(">I", li.read(4))[0] + BFLT_HEADER_SIZE

# Replace pointer at reloc_offset with the relocated address
PatchDword(reloc_offset, reloc_offset_patched)

Wrap this logic into a simple loop that patches all the relocation entries and voila! We have proper string references:

Relocation Addresses Patched
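For reference, the loop might look something like the sketch below. To keep it runnable outside of IDA, it patches a bytearray holding the whole file rather than calling PatchDword; `apply_relocations` and its parameters are hypothetical names, and it assumes (as in the snippet above) that both the relocation entries and the pointers they refer to are big endian:

```python
import struct

BFLT_HEADER_SIZE = 0x40


def apply_relocations(image, reloc_start, reloc_count, header_size=BFLT_HEADER_SIZE):
    """Patch every relocated pointer in `image` (a bytearray of the whole file) in place."""
    for i in range(reloc_count):
        # Each entry is a big-endian offset that does not include the header
        (entry,) = struct.unpack_from(">I", image, reloc_start + 4 * i)
        pointer_offset = entry + header_size

        # Read the pointer stored there and shift it by the header size
        (pointer,) = struct.unpack_from(">I", image, pointer_offset)
        struct.pack_into(">I", image, pointer_offset, pointer + header_size)
    return image
```

Inside an actual loader, the pack_into call would be replaced with PatchDword(pointer_offset, pointer + header_size), with reloc_start and reloc_count taken straight from the header.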

This loader works great as long as the bFLT binary is not position independent. If it is position independent (compiled with the -fPIC flag), then any GOT data references won’t be resolved:

bFLT Built With -fPIC
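A loader can tell which case it is dealing with from the flags field in the header. Here is a small sketch, assuming the flag bit values from the uClinux flat.h header (RAM = 0x1, GOTPIC = 0x2, GZIP = 0x4); the helper names `describe_flags` and `needs_got_fixup` are made up for this example:

```python
# Flag bits, assumed to match the uClinux flat.h definitions
FLAT_FLAG_RAM = 0x0001     # load the program into RAM
FLAT_FLAG_GOTPIC = 0x0002  # position independent, has a GOT
FLAT_FLAG_GZIP = 0x0004    # everything after the header is compressed


def describe_flags(flags):
    """Return the names of the known flag bits set in the header's flags field."""
    names = []
    if flags & FLAT_FLAG_RAM:
        names.append("RAM")
    if flags & FLAT_FLAG_GOTPIC:
        names.append("GOTPIC")
    if flags & FLAT_FLAG_GZIP:
        names.append("GZIP")
    return names


def needs_got_fixup(flags):
    """True if the binary was built position independent and carries a GOT."""
    return bool(flags & FLAT_FLAG_GOTPIC)
```

If the GZIP bit is set, the file would need to be decompressed before any of the offsets in the header can be used to index into it.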

The GOT is an array of 32-bit addresses that starts at the beginning of the .data section and ends with the address 0xFFFFFFFF. Unfortunately, each address in the GOT is off by BFLT_HEADER_SIZE bytes. All we need to do is walk through these GOT addresses and add BFLT_HEADER_SIZE to each entry:

# Add a repeatable comment and name the offset so that all references to GOT are obvious
MakeRptCmt(data_start, "GLOBAL_OFFSET_TABLE")
MakeName(data_start, "GOT")

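# Get the first GOT entry; the GOT begins at the start of .data.
# Note that GOT entries are read as little endian here, unlike the header fields.
data_offset = data_start
li.seek(data_offset)
got_entry = struct.unpack("<I", li.read(4))[0]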

if got_entry > 0:
        # The actual data is located at GOT entry + BFLT_HEADER_SIZE
        new_entry = got_entry + BFLT_HEADER_SIZE

        # Patch the GOT entry with the new address
        PatchDword(data_offset, new_entry)

Loop through all the GOT entries with this code and all the GOT offsets will be fixed up and pointing to the proper offsets.
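A rough sketch of that loop is shown below. As with the relocation sketch, it works on a bytearray so it can be tried outside of IDA; `patch_got`, `GOT_TERMINATOR` and the `endian` parameter are hypothetical, and the little-endian default simply mirrors the "<I" used in the snippet above:

```python
import struct

BFLT_HEADER_SIZE = 0x40
GOT_TERMINATOR = 0xFFFFFFFF


def patch_got(image, data_start, data_end, header_size=BFLT_HEADER_SIZE, endian="<"):
    """Add header_size to each non-zero GOT entry; return the number of entries walked."""
    offset = data_start
    # Stop at the terminator, or at the end of .data if no terminator is found
    while offset + 4 <= data_end:
        (entry,) = struct.unpack_from(endian + "I", image, offset)
        if entry == GOT_TERMINATOR:
            break
        if entry > 0:
            struct.pack_into(endian + "I", image, offset, entry + header_size)
        offset += 4
    return (offset - data_start) // 4
```

Bounding the walk by data_end is a defensive choice on my part: it keeps a missing terminator from sending the loop off the end of the .data section.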

Now, any strings that are referenced through the GOT still won’t be automatically resolved by IDA because there are still several levels of indirection. However, it is now very clear where and how the GOT is being referenced. For example, after processing the GOT we see that the address of the GOT is loaded into R10, then an address is loaded from R10+0x0C into R3:

GOT Entries Patched

If we go to GOT + 0x0C, we find a pointer to our string:

Pointer at GOT + 0x0C

String at 0x5E00

The full source for the bFLT loader can be downloaded here, along with a couple other tools that I’ve written or found useful:

  • Flthdr, a utility for manipulating and decompressing bFLT files (from the elf2flt project)
  • Readbflt, a readelf-like utility for bFLT files


4 Responses to Writing a bFLT Loader for IDA

  1. Julio Cruz says:

    Hello Craig,

    Nice post.

    Do you use this python script with IDAPython plugin?

    Thanks

    • Craig says:

      Just drop it in IDA’s loaders directory; IDA will list it as one of your loader options when you open a BFLT binary in IDA.

  2. Julio Cruz says:

    Thanks Craig. That’s worked.
    I did a simple uClinux application (print hello world) and later I disassembled it, to test the loader.
    Any idea/advice to execute this application from the obtained assembler code?
    I used the option “Produce File…/Create ASM file…”?
    Thanks

  3. JC says:

    Hi Craig,

    I have the following sample code (main.c):

    int main(int argc, char **argv)
    {
    return 0;
    }

    After passing this code through the toolchain (i) gcc -S main.c, ii) as main.o, iii) ld.exe -elf2flt crti.o crtn.o crt1.o libc.a main.o), the program runs without problems on a Coldfire board.

    After that, I disassemble the main.flt (obtained from the previous process) with your bFLT loader and IDA (with default options). Later, I produce the assembler code (mainR.s) and I do the following tasks: i) as mainR.s, ii) ld.exe -elf2flt, and iii) Execute the mainR.flt.

    In this last task, I get the following:

    “Unable to read code data bss, errno 22”

    The header FLT is (of mainR.flt):

    Magic: bFLT
    Rev: 4
    Build Date: Mon Feb 18 18:02:34 2013
    Entry: 0x50
    Data Start: 0x480
    Data End: 0x3f
    BSS End: 0x3f
    Stack Size: 0x1000
    Reloc Start: 0x3f
    Reloc Count: 0xb
    Flags: 0x1 ( Load-to-Ram )

    And the original, before use the loader and IDA is (main.flt):

    Magic: bFLT
    Rev: 4
    Build Date: Mon Feb 18 17:46:13 2013
    Entry: 0x44
    Data Start: 0x400
    Data End: 0x43c
    BSS End: 0x470
    Stack Size: 0x1000
    Reloc Start: 0x43c
    Reloc Count: 0x2f
    Flags: 0x1 ( Load-to-Ram )

    I’m not sure where the problem is.

    Please, if you can give any advice.

    Thanks

    JC
