I’ve been working with a colleague on mapping scanning activity. We are able to capture JA3 fingerprints and match them up with the cleartext User-Agent strings.
I’m considering throwing together a database with this information and wanted to get insight from others to see if it’s worth it. User-Agent strings can obviously change, so the mapping may be a bit weak.
Please let me know what the list thinks. Worth it or not?
To chime in here a bit: I think this can be useful, but please provide the
data in as detailed a format as possible. So, if possible, please do not
just include the JA3 hash and the User-Agent, but also include the parts
that make up the JA3 hash (and consider including more information).
That makes it possible to see, for example, how close several fingerprints
are to each other, which can be useful.
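To illustrate why the raw parts matter, here is a minimal sketch in Python.
It builds the JA3 string from its five fields (TLS version, ciphers,
extensions, elliptic curves, point formats), hashes it with MD5, and
compares two JA3 strings field by field. The function names and the
similarity score (an unweighted average of per-field Jaccard overlap) are
my own assumptions for illustration, not part of JA3 itself.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string: five comma-separated fields, values joined by '-'."""
    lists = (ciphers, extensions, curves, point_formats)
    fields = [str(version)] + ["-".join(str(v) for v in f) for f in lists]
    return ",".join(fields)


def ja3_hash(ja3):
    """The JA3 hash is the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


def parse_ja3_string(ja3):
    """Split a JA3 string back into (version, [four lists of ints])."""
    version, *rest = ja3.split(",")
    lists = [[int(v) for v in f.split("-")] if f else [] for f in rest]
    return int(version), lists


def jaccard(a, b):
    """Set overlap between two value lists (ignores ordering)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(ja3_a, ja3_b):
    """Hypothetical closeness score in [0, 1]; 1.0 means identical value sets."""
    v_a, lists_a = parse_ja3_string(ja3_a)
    v_b, lists_b = parse_ja3_string(ja3_b)
    scores = [1.0 if v_a == v_b else 0.0]
    scores += [jaccard(x, y) for x, y in zip(lists_a, lists_b)]
    return sum(scores) / len(scores)
```

With only the hash stored, two clients that differ by a single cipher look
completely unrelated; with the parts stored, a score like the one above
shows that they are near neighbours. Note that this score ignores ordering,
which JA3 itself does not.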
Also, as a more generic remark: one has to be quite careful about how to
interpret such fingerprints. In our experience, collisions (several pieces
of software that use the same underlying library, or have the same
fingerprint for different reasons) are quite common; in our measurements
for a recent paper (http://icir.org/johanna/papers/imc18tlsdeployment.pdf)
they were so common that we did not use fingerprints for a whole bunch of
data analysis that we had planned.
On a side note, we also published a list of TLS fingerprints that were
generated for that paper; it is accessible at
https://github.com/platonK/tls_fingerprints and might be of interest to
some people on the list.
However, the same caveat applies: one has to be a bit careful about how to
interpret the data.
Thank you all for the feedback. The goal of this work is to provide a more real-time aggregation of JA3 information and mappings.
I’ve spoken with Johanna previously and completely agree with aggregating the client hello details. This data set could be really great for research, as she said above.
My thought would be to try and host something like the SSL Notary. This would be continually growing while most JA3 databases are stagnant, or at most periodically updated. There are several different mappings that I’d be interested in tracking. I’d like to build a couple of scripts that could be distributed to feed back into a database. Some of the mappings I’d like to see are the following:
The data would then be made available via some type of API. We could provide a REST API or maybe a DNS-type lookup. It’d be quite an undertaking, but if others find it of interest and can contribute, I’d be able to get more cycles for it.
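As a rough sketch of what the lookup side could look like, the Python below keeps per-JA3 counts of observed User-Agent strings and returns them ranked by share, so collisions stay visible instead of being collapsed into a single label. Everything here is an assumption for illustration: the `JA3Store` class, its method names, and the `ja3.example.org` zone are hypothetical, and a real service would sit on top of a proper database.

```python
from collections import Counter, defaultdict


class JA3Store:
    """Hypothetical in-memory store of JA3 hash -> observed User-Agents."""

    def __init__(self):
        self._seen = defaultdict(Counter)

    def observe(self, ja3_hash, user_agent):
        """Record one sighting of a (JA3 hash, User-Agent) pair."""
        self._seen[ja3_hash.lower()][user_agent] += 1

    def lookup(self, ja3_hash):
        """Return [(user_agent, share)] sorted by how often each was seen."""
        counts = self._seen.get(ja3_hash.lower())
        if not counts:
            return []
        total = sum(counts.values())
        return [(ua, n / total) for ua, n in counts.most_common()]


def dns_query_name(ja3_hash, zone="ja3.example.org"):
    """Build a query name for a DNS-style lookup (the zone is a placeholder)."""
    return f"{ja3_hash.lower()}.{zone}"
```

A REST endpoint could return the output of `lookup` as JSON, while a DNS-type service could answer queries for the name built by `dns_query_name`; an MD5 hex digest is 32 characters, so it fits in a single DNS label.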
This reiterates some of the concerns we had and why I’m looking at building this project. He had some good ideas about clues which are available within the extension values. It gave me some ideas on how to apply it and extend JA3.
I’d love to hear what others think. Feel free to ping me directly.