There is a clarity issue with the current style of the new scripts I've been writing with where the state is stored before it's logged. My typical model has been to define two record types; Module::Info and Module::Log. An example looks like this....
type Log: record {
uri: string;
host: string;
};
type Info: record {
log: Log;
pending_requests: SomeOtherDataType;
};
The state is stored rather haphazardly in either the Log record or the Info record depending on if it's ever going to be logged. The Log value is what ends up being sent onward to the logging framework, so by default all of it's fields are logged. The $pending_requests field can't be logging in any reasonable form and never really has any need to be logged so it's stored in the parent ::Info record.
Robin an I discussed creating a new attribute to support this notion of data for use by the logging framework. If we make a &log attribute, that could transmit the intent of the field to the logging framework. Here's an example of how this would look...
type Info: record {
uri: string &log;
host: string &log;
pending_requests: SomeOtherDataType;
};
Although that attribute is annoying use case specific, I think it really makes for a much cleaner interface for creating these records that are meant for carrying data forward to the point of logging considering that in many cases there is other data that needs carried forward but is never logged (like pending_requests). I'll give some examples of the two styles in use now..
New style...
active_http_session$uri = unescaped_URI;
Those are sort of silly examples but I wanted to show them to point out how often the $log$ fragment is reused, but it really isn't anything but syntactical noise that shouldn't really be there. I think that the &log attribute would make the interfaces to these scripts quite a bit more palatable and enjoyable to use.
I think that the &log attribute would make the interfaces to these scripts
quite a bit more palatable and enjoyable to use.
Thoughts?
I haven't worked w/ the new upcoming log framework for my opinion to carry much weight, but it does look like a &log attribute might make more sense and simplify things for a script writer.
Although that attribute is annoying use case specific
I don't really get why it has to be use-case specific and not a general way to convey to the logging framework that "only these record fields" are meant to be logged. Is it just a problem because it requires more verbosity in the case that all fields are meant to be logged (i.e. script writer might forget to add the &log attribute) ? Or is there an example you can give to explain what you mean?
Is it just a problem because it requires more verbosity in the case that all fields are meant to be logged (i.e. script writer might forget to add the &log attribute) ?
Nah, I don't think that needing the &log attribute explicitly needed for each field is a problem. I think it's better than many alternatives.
Or is there an example you can give to explain what you mean?
It's a non-reusable attribute. It only applies in this one scenario, but I suppose the same can also be said for most of the file based attributes like &raw_output. I guess I don't have a really good example and the attribute really does make for clearer intentions when writing scripts compared to the current model of the two separate records.
It would be nice for documentation generation too because there isn't any question about what fields/records are intended for the logging framework. In the existing model, there would need to be some sort of indicator for a record to indicate that the record is intended for the logging framework because you don't want to guess based on the type name.
It's a non-reusable attribute. It only applies in this one scenario,
I think I understand now, tell me if this rephrasing is wrong -- in practice you'd only ever use &log in record types that get passed to the logging framework and it doesn't make sense to tag any other identifiers with it (which, if done, shouldn't actually do anything).
It would be nice for documentation generation too because there isn't
any question about what fields/records are intended for the logging
framework. In the existing model, there would need to be some sort of
indicator for a record to indicate that the record is intended for the
logging framework because you don't want to guess based on the type
name.
Yeah. It does seem to simplify (at least part of) this task as well.
type Info: record {
uri: string &log;
host: string &log;
This in general sounds better to me than the two-separate-records approach.
However, I wonder if this could be done in a non-use-case-specific manner.
You could picture something like:
type attribute &log; # declares "&log" as a new attribute,
# in this case without an associated value
and then (as I guess you already have in mind) the logging framework
can first test for whether $uri is present, and, if so, whether it
has &log set (via the existing ?$$ operator).
Ah! Nice. I like that too. Now you're going to get me thinking about other ways this can be used.
I generally like this for one reason already, it brings more of the "core" functionality (creating new attributes) to the scripting layer. I'm probably always going to be in favor of anything that does that.
If we did the script level implementation, it would only immediately create another script dependency with the core. Almost all of the new logging framework code that Robin has written is core code in C++ so we'd never actually be using the ?$ operator for check for the attribute in the script layer. It seems like we'd *have* to think of another use case for it to justify implementing the script level attribute creation unless it's really easy since since implementing yet another attribute is so easy.
I was about to say something similar: why don't we go with the &log
attribute for now and generalize to user-definable attributes if/when
we find more use cases?
I was about to say something similar: why don't we go with the &log
attribute for now and generalize to user-definable attributes if/when
we find more use cases?
I had (for some reason) thought that the processing of this attribute was
mostly at script-level. If it's indeed instead in the core [*], then I agree
it makes sense to not generalize at this point.
Vern
[*] Yes, you got me to use the term "core". "Event engine" doesn't sound
right for something like this which isn't at all about generating
events. (Which just makes me think it should in fact be implemented
at the script level.
One more question regarding this: Seth and I had discussed earlier
that if a record does not have any "&log" fields, the logging
framework could by default log all fields. We dismissed that as "too
much magic", but thinking about it again, I'd like to reconsider.
The reason is that if I pass a record in for logging, chances are that
I want something logged. If there's no explicit &log attribute, it
seems natural to assume that one wants to log everything. That also
looks nicer and is less cumbersome than having to add &log to every
single field to get the same effect. And I don't really see how one
would pass a record to logging with the intention of *not* logging
anything.
Haha! That does make sense. I still sort of think I'd rather have the attribute be required though.
Oh! I just thought of an example. Someone could write that it collects state, but none of it is ever logged because it's only intended to extended and built upon in further scripts. If we had the "absence of &log means log everything" convention, it would want to log our state tracking variables. It also becomes really confusing since we can extend record types in other locations and things that would formerly log are suddenly not being logged because the type was extended and a one of the extended values had the &log attribute.
Just for the record, I don't think this would be appropriate either due to the recent record extension mechanism. It would essentially prohibit someone from including non-logged state information in a script extending one of the logging records with the separate syntax indicating "log all".
Ok, convinced, I see the record-extension argumnet. The main point I'm
worried about is that we'll end up with lots of &log attributes in
these combined log/state records. But ok, let's see how it turns out.
I'll add the &log attribute as discussed.
The reason is that if I pass a record in for logging, chances are that
I want something logged. If there's no explicit &log attribute
What about explicitly associating an &log with the entire record, rather
than with each of it fields? (I'm not quite picturing the usage you have
in mind here, so this may or may not be appropriate.)
I can give a little detail about why I didn't like it.
type Info: record {
ts: time &log;
id: conn_id &log;
};
Given the above type, I wouldn't want someone giving &log to the entire record because if later, someone else comes along with a script that extends that record to include some extra fields like this...
redef record Info += {
did_something: bool &default=F &log;
ready_to_log: bool &default=F;
};
...the whole record would be logged and &ready_to_log might be a variable that they included for tracking if the whole record is prepared to be logged. This is a technique I use in a few scripts to let users decide which point they'd like to log at based on costs/benefits on early or late logging. The author of the extension script wouldn't want the $ready_to_log attribute logged since it's really just an internal tracking variable.
redef record Info += {
did_something: bool &default=F &log;
ready_to_log: bool &default=F;
};
What about requiring &log to be added to extensions (if one wants the
entire extension logged by default), otherwise they go back to the normal
behavior of only associating &log with fields for which it's explicitly
mentioned?
This is a bit different from how attributes currently propagate, but we
don't (I think) have other maybe-it's-the-whole-aggregate-maybe-it's-a-field
attributes, so the above wouldn't be inconsistent with existing practice.