Advice on the PE Analyzer

For Bro 2.5, I’d like to add some more functionality to the Windows Portable Executable analyzer. I think there’s a lot of valuable data that could be extracted, but the format is rather challenging to work with. Some protocol pseudocode would be:

0000: import_address_table is at 0010
0010: entry #1 is at 0030
0020: entry #2 is at 0050
0030: entry #1 address is at 0100
0040: entry #1 name is at 0110
0050: entry #2 address is at 0140
0060: entry #2 name is at 0120

Much of the data is simply pointers to offsets in the file where the actual data resides. My thought for parsing this is to write two helper functions in C++:

jump_to_next_interesting_offset()

// Skips to the offset of the next thing I would like to parse

get_data_context_at_current_offset()

// Get contextual information for how to parse the current data
// (e.g. this is the address of entry #1 in the import address table).

Does anyone have suggestions for what data structures I should use to store the necessary data? Storing the offsets shouldn’t be very difficult, but each offset will need context associated with it in order to know how to parse it once the analyzer gets there, and how to associate the data residing there with data that’s already been parsed.

Thanks,

–Vlad

> For Bro 2.5, I'd like to add some more functionality to the Windows
> Portable Executable analyzer. I think there's a lot of valuable data that
> could be extracted, but the format is rather challenging to work with. Some
> protocol pseudocode would be:

> 0000: import_address_table is at 0010
> 0010: entry #1 is at 0030
> 0020: entry #2 is at 0050
> 0030: entry #1 address is at 0100
> 0040: entry #1 name is at 0110
> 0050: entry #2 address is at 0140
> 0060: entry #2 name is at 0120

> Much of the data is simply pointers to offsets in the file where the actual
> data resides. My thought for parsing this is to write two helper functions
> in C++:

> jump_to_next_interesting_offset()
> // Skips to the offset of the next thing I would like to parse

> get_data_context_at_current_offset()
> // Get contextual information for how to parse the current data
> // (e.g. this is the address of entry #1 in the import address table).

I think this generally sounds like a reasonable approach. One thing that
might help, in case you are not already aware of it: it is a tad tricky to
get binpac to use your PE subclass of the file analyzer instead of the base
BroFileAnalyzer class. The SSL analyzer shows an example of how to get
binpac to do this.

> Does anyone have suggestions for what data structures I should use to store
> the necessary data? Storing the offsets shouldn't be very difficult, but
> each offset will need context associated with it in order to know how to
> parse it once the analyzer gets there, and how to associate the data
> residing there with data that's already been parsed.

This might be a tad naive, but would a std::priority_queue make sense? It
would hold objects of a class that stores the offset (which is used for
sorting) as well as all the contextual information that you have to keep.
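
To make that a bit more concrete, here is a minimal sketch of the idea. All of the names in it (PendingKind, PendingParse, LaterOffset, OffsetSchedule, drain_in_file_order) are made up for illustration and are not existing Bro or binpac API, and the context fields are only a guess at what you would actually need to store. The offsets are the ones from your pseudocode.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical tags describing what kind of data lives at a pending offset.
enum class PendingKind { ImportTable, ImportEntry, EntryAddress, EntryName };

// One unit of deferred work: where to parse, and how to interpret what is there.
struct PendingParse {
    uint64_t offset;   // file offset where the data starts
    PendingKind kind;  // how to parse the bytes found at that offset
    uint32_t entry;    // import entry this data belongs to (0 = none)
};

// std::priority_queue is a max-heap by default; this comparator inverts the
// ordering so that the smallest offset ends up on top.
struct LaterOffset {
    bool operator()(const PendingParse& a, const PendingParse& b) const {
        return a.offset > b.offset;
    }
};

class OffsetSchedule {
public:
    // Remember that something of the given kind still has to be parsed.
    void add(uint64_t offset, PendingKind kind, uint32_t entry = 0) {
        pending_.push(PendingParse{offset, kind, entry});
    }

    bool empty() const { return pending_.empty(); }

    // Pops the pending item with the lowest offset. The returned struct
    // carries both the offset to jump to and the context for parsing it.
    PendingParse next() {
        assert(!pending_.empty());
        PendingParse p = pending_.top();
        pending_.pop();
        return p;
    }

private:
    std::priority_queue<PendingParse, std::vector<PendingParse>, LaterOffset> pending_;
};

// Registers the pointers in the order they might be discovered and returns
// the offsets in the order the schedule hands them back.
std::vector<uint64_t> drain_in_file_order() {
    OffsetSchedule schedule;
    schedule.add(0x0140, PendingKind::EntryAddress, 2);
    schedule.add(0x0100, PendingKind::EntryAddress, 1);
    schedule.add(0x0120, PendingKind::EntryName, 2);
    schedule.add(0x0110, PendingKind::EntryName, 1);

    std::vector<uint64_t> order;
    while (!schedule.empty())
        order.push_back(schedule.next().offset);
    return order;
}
```

The point of sorting by offset is that next() always hands back the lowest outstanding offset, so the parser only ever has to skip forward, even when the pointers were discovered out of order (entry #2's address at 0140 is registered before its name at 0120 here). The entry field is what would let you tie the data found there back to what has already been parsed.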

Generally - I think it would be really cool to have this feature in Bro :)

Johanna