suggestion request

Hi,

Among people working on machine learning, the KDD Cup 99 data set is widely used for testing proposed algorithms (http://kdd.ics.uci.edu/databases/kddcup99/task.html).

The data set is built from the features defined in Table 1, Table 2 and Table 3. I would now like to test my algorithm on real data, so I will collect traffic and convert it to the KDD Cup 99 format. I am looking for a method to do this.

How can I gather or calculate the properties in Table 2? The properties are:

number of "hot" indicators (meaning hidden directory creation)
number of failed login attempts
1 if successfully logged in; 0 otherwise
number of "compromised" conditions
1 if root shell is obtained; 0 otherwise
1 if "su root" command attempted; 0 otherwise
number of "root" accesses
number of file creation operations
number of shell prompts
number of operations on access control files
number of outbound commands in an ftp session
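
To make the question concrete, here is a rough Python sketch of the kind of payload analysis I have in mind. It assumes each connection has already been reassembled into plain text (for example one telnet session per string). The regular expressions, the dictionary keys and the function name content_features are my own guesses for illustration only; they are not the rules that were used to build the KDD Cup 99 data, and real prompts and login messages vary between systems.

```python
import re

# Hypothetical patterns (assumptions, not the original KDD Cup 99 rules).
FAILED_LOGIN = re.compile(r"login incorrect|login failed", re.IGNORECASE)
SU_ROOT = re.compile(r"\bsu\s+(-\s+)?root\b")
ROOT_PROMPT = re.compile(r"^\S*# ", re.MULTILINE)    # line starting with a "#" prompt
ANY_PROMPT = re.compile(r"^\S*[#$] ", re.MULTILINE)  # line starting with "#" or "$" prompt


def content_features(session_text):
    """Estimate a few Table 2 style properties from one session's payload text."""
    return {
        "num_failed_logins": len(FAILED_LOGIN.findall(session_text)),
        "su_attempted": 1 if SU_ROOT.search(session_text) else 0,
        "root_shell": 1 if ROOT_PROMPT.search(session_text) else 0,
        "num_shells": len(ANY_PROMPT.findall(session_text)),
    }


# Made-up session text, only to show how the function would be called.
sample = (
    "login: alice\nPassword:\nLogin incorrect\n"
    "login: alice\nPassword:\nLast login: Mon\n"
    "host$ su root\nPassword:\nhost# ls\nhost# exit\n"
)
print(content_features(sample))
```

Something like this only works on unencrypted sessions, and the other properties (file creations, access control files, "hot" indicators) would need their own patterns.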

So it seems payload analysis is required? Does anybody have experience with this, or any suggestions? I will be listening on a mirrored port and saving the traffic data to a database. Can Bro Time Machine help me with this?

Cheers.