Sonic Forces/Formats/BINA

From Sonic Retro

Revision as of 19:35, 8 February 2019 by Radfordhound (talk | contribs) (Removed a bit of information that wasn't true (why did I think that? lol))
Sonic Community Hacking Guide

NOTE: This page is heavily based on my page on the Sonic Colors formats: SCHG:Sonic_Colors

The BINA format is a generic container format used by Sonic Team in titles such as Sonic the Hedgehog (2006 game), Sonic Colors, Sonic Lost World, and now in Sonic Forces.

Generally speaking, any file in the BINA Format is structured as follows:

  • BINA Header
  • DATA Node
    • Data (entirely specific to the type of file; e.g. gedit files contain gedit data)
    • String Table
    • Offset Table (aka BINA Final Table or BINA Footer)

Each of the BINA-specific components is detailed below.

Header

struct Header
{
    char[4] BINASignature = "BINA";
    char[3] VersionNumber = "210";
    char EndianFlag; // "B" if the file is big-endian, "L" if the file is little-endian.
    uint FileSize;
    ushort NodeCount; // How many Nodes are present in the file. In practice this is almost always 1, as practically every BINA file contains only one node: the "DATA Node".
    ushort Unknown1 = 0; // Probably just padding? Possibly related to the "Footer" found in some Colors formats?

    Node[NodeCount] Nodes; // All nodes present in the file, as described below.
}
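
As a rough illustration of the layout above, here is a minimal Python sketch that reads the 16 fixed header bytes. The names `BinaHeader` and `parse_bina_header` are made up for this example, and it assumes the endian flag controls how the integer fields after it are stored:

```python
import struct
from collections import namedtuple

# Hypothetical container for the fields of the Header struct above.
BinaHeader = namedtuple(
    "BinaHeader", "signature version endian file_size node_count unknown1"
)

def parse_bina_header(buf):
    """Parse the 16 fixed bytes at the start of a BINA file."""
    signature, version, endian_flag = struct.unpack_from("<4s3sc", buf, 0)
    if signature != b"BINA":
        raise ValueError("not a BINA file")
    if endian_flag not in (b"B", b"L"):
        raise ValueError("unexpected endian flag")
    # Assumption: the flag decides the byte order of the integers that follow.
    prefix = ">" if endian_flag == b"B" else "<"
    file_size, node_count, unknown1 = struct.unpack_from(prefix + "IHH", buf, 8)
    return BinaHeader(
        signature,
        version.decode("ascii"),
        endian_flag.decode("ascii"),
        file_size,
        node_count,
        unknown1,
    )

# A hand-built little-endian header, used only to exercise the parser.
sample_header = b"BINA" + b"210" + b"L" + struct.pack("<IHH", 0x1234, 1, 0)
header = parse_bina_header(sample_header)
```

The nodes themselves start right after these 16 bytes.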

Nodes

BINA files are made up of a series of BINA Nodes. Each node starts off as a simple structure like this:

struct Node
{
    char[4] Signature; // Used to describe what type of node this is.
    uint Length; // The length of this node and all of its contents, from the beginning of Signature until the end of the node.
}

DATA Node

While it's clear the system was designed with multiple nodes of varying types in mind (each with a different structure), in all of our findings thus far we've only ever seen one type of node in BINA files, the "DATA Node":

struct DataNode : Node
{
    Signature = "DATA";
    Length = (FileSize - 16); // 16 is the size of the Header. Only true if the DATA node is the only node in the file, which has always been the case in our findings.

    uint StringTableOffset; // The offset to the BINA String Table explained below. It is not absolute: it is relative to the beginning of the Data array below, which has always been at 0x40 in our findings.
    uint StringTableLength; // The length of the BINA String Table explained below.

    uint OffsetTableLength; // The length of the BINA Offset Table explained below.
    ushort AdditionalDataLength = 0x18;
    ushort Padding = 0; // Just two nulls to pad-out AdditionalDataLength to 4 bytes.
    byte[AdditionalDataLength] AdditionalData; // Exact purpose unknown. Seems to always just contain nulls?

    byte[Length - (0x18 + AdditionalDataLength)] Data; // Specific to each type of file. E.G. gedit files contain gedit data.
}
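
To make the arithmetic behind that 0x40 concrete, here is a small Python sketch that reads the fixed DATA Node fields and works out where the Data array and the string table begin. `parse_data_node`, `data_start` and `string_table_start` are hypothetical helper names, and the byte order is passed in by the caller rather than taken from a real header:

```python
import struct
from collections import namedtuple

BINA_HEADER_SIZE = 16    # size of the Header struct before the first node
NODE_FIELDS_SIZE = 0x18  # Signature through Padding in DataNode

# Hypothetical container for the fixed DATA Node fields.
DataNodeInfo = namedtuple(
    "DataNodeInfo",
    "length string_table_offset string_table_length "
    "offset_table_length additional_data_length",
)

def parse_data_node(buf, endian_prefix="<"):
    """Read the fixed DATA Node fields that follow the 16-byte header."""
    (signature,) = struct.unpack_from("<4s", buf, BINA_HEADER_SIZE)
    if signature != b"DATA":
        raise ValueError("expected a DATA node")
    fields = struct.unpack_from(
        endian_prefix + "IIIIHH", buf, BINA_HEADER_SIZE + 4
    )
    return DataNodeInfo(*fields[:5])  # the sixth value is the Padding field

def data_start(node):
    """Absolute position of the Data array."""
    return BINA_HEADER_SIZE + NODE_FIELDS_SIZE + node.additional_data_length

def string_table_start(node):
    """Absolute position of the string table (its offset is Data-relative)."""
    return data_start(node) + node.string_table_offset

# Hand-built sample: empty header bytes, then a DATA node with made-up sizes.
sample_node = (
    bytes(BINA_HEADER_SIZE)
    + b"DATA"
    + struct.pack("<IIIIHH", 0x100, 0x80, 0x20, 0x10, 0x18, 0)
    + bytes(0x18)
)
node = parse_data_node(sample_node)
print(hex(data_start(node)), hex(string_table_start(node)))  # 0x40 0xc0
```

With the usual AdditionalDataLength of 0x18, this gives 16 + 0x18 + 0x18 = 0x40 for the start of the Data array.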

String Table

The DATA Node's string table is simply an array of null-terminated strings, which are referenced via offsets throughout the file. The only other noteworthy detail is that the end of the table (after the last string) must be padded with nulls to a 4-byte (0x4) boundary.
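
Here is a short Python sketch of both directions: reading one string out of a table, and building a table with the padding described above. The helper names are invented for this example, and UTF-8 is an assumption, since the encoding isn't covered on this page:

```python
def read_string(buf, pos):
    """Read one null-terminated string starting at `pos`."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8")  # assumption: strings are UTF-8

def build_string_table(strings):
    """Pack strings back to back, then pad the table to a multiple of 4."""
    table = bytearray()
    offsets = {}
    for text in strings:
        if text not in offsets:  # store each distinct string only once
            offsets[text] = len(table)
            table += text.encode("utf-8") + b"\x00"
    table += b"\x00" * (-len(table) % 4)
    return bytes(table), offsets

table, offsets = build_string_table(["Brad", "Sonic"])
print(len(table), offsets["Sonic"], read_string(table, offsets["Sonic"]))
# 12 5 Sonic
```

"Brad" and "Sonic" take 5 + 6 = 11 bytes with their terminators, so one extra null brings the table up to 12 bytes.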

Offset Table

This is the most complicated part of the BINA format by far, so brace yourselves!!
To understand the purpose of offset tables, you need to understand how the game actually loads BINA files. You see, BINA files are read via "direct memory loading". Put simply, this means the game just grabs all the data from the file, casts it to a struct/class, and uses it, no additional parsing required!

This comes with a lot of obvious plus sides, such as load times being about as fast as they can get. However, it also unfortunately comes with a couple of downsides. If anything's wrong with the file, even an extremely minor error that wouldn't matter to code that parses the data, it's pretty much guaranteed to be enough to cause the game to just crash without any error message or anything (which is kind of a pain for us modders, haha).

The biggest problem by far, however, comes from handling structs with pointers (aka "offsets") in them, like this:

struct Whatever
{
    char* Name = "Brad";
    int VeryImportantInt = 7;
}


In memory, "Name" is of course going to be the location (or "address") of the beginning of the "Brad" text in-memory. That's how pointers always work; simple, and to the point (sorry)!

Problem is, unlike data in files, data in memory is kinda random. We don't know where that "Brad" text (or anything from the file for that matter) is going to be in memory until we actually go to read the data from the file. So, what do we do?

If only we knew the position of every offset in the file... then we could just read the file into memory, get the position in-memory where that data was just read to, go to each offset, and add that position we just got to its value.

ENTER: The BINA Offset Table! In concept, it's actually very simple; it's literally just an array that lists the position of every offset in the file (not counting the offsets in the header, such as the offset to the string table) so we can fix the problem with direct memory loading pointers as described above.
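
The fix-up step itself can be sketched in a few lines of Python. This is only a simulation of the idea, not the game's actual loader: `fix_up_offsets` is a made-up name, the offsets are assumed to be 4 bytes wide, and the "address" is just a number we pick:

```python
import struct

def fix_up_offsets(data, offset_positions, base_address, endian_prefix="<"):
    """Turn file-relative offsets into absolute addresses.

    `offset_positions` lists where each offset lives inside `data`, which is
    exactly the information the offset table provides.
    """
    fixed = bytearray(data)
    for pos in offset_positions:
        (relative,) = struct.unpack_from(endian_prefix + "I", fixed, pos)
        struct.pack_into(endian_prefix + "I", fixed, pos, relative + base_address)
    return bytes(fixed)

# The "Whatever" struct from above: Name points 8 bytes in, where "Brad" sits.
blob = struct.pack("<Ii", 8, 7) + b"Brad\x00\x00\x00\x00"
loaded = fix_up_offsets(blob, [0], base_address=0x1000)
name_pointer, very_important_int = struct.unpack_from("<Ii", loaded, 0)
print(hex(name_pointer), very_important_int)  # 0x1008 7
```

Only the value at position 0 changes, because that is the only position listed; VeryImportantInt is left alone.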

However, in execution, BINA offset tables are quite complicated, as they use a few clever techniques to ensure the offset tables are as small as possible.

As an example, let's use some values (in hex) from an actual BINA offset table, namely the first 4 bytes in w5a01_obj_area01.gedit's offset table:

44 48 42 42


The first value in this table (44 in hex) is represented as the following in binary:

0100 0100


So basically, here's how this works. The first two bits represent how long this offset is, corresponding to one of the values in the table below:

00 = This offset is 0 bits long, you've reached the end of the offset table!
01 = This offset is 6 bits long.
10 = This offset is 14 bits long.
11 = This offset is 30 bits long.


In this case, since the first two bits are 01, the offset is 6 bits long, so we read the next 6 bits (00 0100) to get the value for the offset. Simple, right?
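
In Python, pulling those two pieces out of the first byte could look like this (a minimal sketch; `split_first_byte` is a made-up helper name):

```python
# Number of value bits for each 2-bit length code, as listed above.
ENTRY_BITS = {0b00: 0, 0b01: 6, 0b10: 14, 0b11: 30}

def split_first_byte(byte):
    """Return (value bit count, low 6 bits) for the first byte of an entry."""
    code = byte >> 6       # the top two bits select the length
    low_bits = byte & 0x3F  # the remaining six bits belong to the value
    return ENTRY_BITS[code], low_bits

print(split_first_byte(0x44))  # (6, 4)
```

For a 6-bit entry, those low six bits are the whole stored value.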

Now here's where things get a little tricky.

You see, these values aren't simply the positions of each offset. Rather, each value represents the distance between this offset and the previous offset we read. If we haven't actually read an offset yet, the distance is measured from the end of the header data instead, that is, the 16-byte BINA Header plus the 0x30 bytes of DATA Node fields and AdditionalData, for 64 bytes (0x40) in total.

Beyond that, due to how binary works, the last two bits in a value can only represent 0-3, which wouldn't be very helpful here, as each offset, at a minimum, must be 4 bytes long! The scheme described next relies on those two bits always being 0.

So, to make better use of the space, all of the values have actually been bit-shifted to the right by two (01 0100 is stored in the file as 00 0101, for example). Therefore, you also need to bit-shift these values back to the left by 2 to counteract this.
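
Seen from the writing side, the same rules can be sketched as a small Python encoder. `encode_distance` is a made-up name, and for the 14-bit and 30-bit forms this assumes the remaining bits simply follow in the next bytes, most significant first, which isn't spelled out on this page:

```python
def encode_distance(distance):
    """Encode the distance to the next offset as one offset table entry."""
    if distance % 4:
        raise ValueError("distances must be multiples of 4")
    value = distance >> 2  # drop the two bits that are always zero
    if value < (1 << 6):
        return bytes([0x40 | value])                  # 01 + 6 bits
    if value < (1 << 14):
        # Assumption: multi-byte entries are stored most significant byte first.
        return (0x8000 | value).to_bytes(2, "big")      # 10 + 14 bits
    if value < (1 << 30):
        return (0xC0000000 | value).to_bytes(4, "big")  # 11 + 30 bits
    raise ValueError("distance too large for a single entry")

print(encode_distance(20).hex())     # 45
print(encode_distance(0x100).hex())  # 8040
```

A distance of 20 is 01 0100 in binary, which is stored as 00 0101 behind the 01 length code, giving 0x45.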

Example

Phew, alright, we covered a lot just then! Still with us? Good!
Let's go over how you'd read each offset in the above example table.

The first value in our example offset table is 44 in hex and 0100 0100 in binary. Since the first two bits are 01, it means we need to read the next 6 bits (00 0100) to get our value. But remember, these values are actually all shifted by 2 to the right, so we need to shift it to the left by 2 to get the actual value: 0001 0000 (or 16).

Now we just add that to the position of the last offset we read, except in this case we haven't actually read an offset yet, so instead we use the 64 bytes (0x40) of header data that come before the Data array. 64 + 16 = 80. So, the position of our first offset is 80! Now we just have to repeat this process for each offset in the table.

The next offset in this table is 0x48 (0100 1000). The first two bits are 01, so we use the next 6 bits (00 1000) as the value for the offset. Left-shift it by two as described above, and we get 0010 0000 (32). Add that to the position of the last offset we read (80), and presto, we get the position of our second offset (112)!
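
Putting the whole walkthrough together, here is a minimal Python sketch of a decoder. `decode_offset_table` is a made-up name, and as before the 14-bit and 30-bit entries are assumed to store their remaining bits in the following bytes, most significant first:

```python
def decode_offset_table(table, start=0x40):
    """Return the position of every offset listed in a BINA offset table."""
    entry_sizes = {0b01: 1, 0b10: 2, 0b11: 4}  # entry size in bytes per code
    positions = []
    current = start  # 64 bytes of header data come before the first offset
    i = 0
    while i < len(table):
        code = table[i] >> 6
        if code == 0b00:
            break  # end of the offset table
        size = entry_sizes[code]
        if i + size > len(table):
            raise ValueError("truncated offset table entry")
        # Assumption: multi-byte entries are stored most significant byte first.
        raw = int.from_bytes(table[i:i + size], "big")
        value = raw & ((1 << (size * 8 - 2)) - 1)  # strip the 2-bit code
        current += value << 2  # undo the right shift by two
        positions.append(current)
        i += size
    return positions

print(decode_offset_table(bytes([0x44, 0x48, 0x42, 0x42])))
# [80, 112, 120, 128]
```

The first two results match the positions worked out by hand above; the two 0x42 entries each add another 8 bytes.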

See? I think you get it now. Happy modding!
