I can try to summarize the current status of the plugin, to give this
discussion some additional context:
* A large stream of log output to NSQ is working. I was pushing about 1
billion log lines/day for months with no issues.
* ElasticSearch output stopped working with ElasticSearch version 2.0,
since they changed the delimiter rules. However, this should be
fixable with the change Seth introduced for 2.5 (and perhaps we
should update that to be the default?)
* A medium-to-large stream of log output to ElasticSearch requires a
lot of tuning, and I think it is still problematic. Memory seems to
creep up slowly in most cases (ElasticSearch starts garbage-collecting
and stops responding for a while). I haven't tested with ElasticSearch
2.0 to see how that affects this. Perhaps splitting out the logger
node will help with this? I'm not sure.
* I think that the main issue that Seth was referencing is that the log
writer doesn't check the response code from NSQ or ElasticSearch. If
the server responds with a 500 or other error code, it might make
sense to retry sending the messages a couple of times? Right now,
they just get dropped, so this can be a lossy log writer.
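To make the retry idea in that last bullet concrete, here's a rough
sketch. This is just my illustration, not the plugin's actual code:
`send_with_retry` and its status-code handling are invented names, and a
real implementation would also sleep with backoff between attempts.

```python
def send_with_retry(send, payload, max_retries=3):
    """Try to deliver one batch of log messages.

    `send(payload)` is whatever actually performs the HTTP POST and
    returns the server's status code. Returns True if the server
    accepted the batch, False if we gave up -- at which point the
    caller can decide whether to drop the messages or spool them.
    """
    for attempt in range(max_retries):
        status = send(payload)
        if 200 <= status < 300:
            return True       # accepted
        if status < 500:
            return False      # 4xx: resending the same payload won't help
        # 5xx: likely transient (e.g. ElasticSearch stuck in GC);
        # loop and try again (a real writer would back off here).
    return False              # retries exhausted
```

Even with retries, the failure question remains: once `send_with_retry`
returns False, something still has to decide whether the batch is
dropped (lossy, as today) or buffered to disk.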
So, I'm a bit hesitant to deprecate this, since I think it still works
for NSQ, and it still works (in some cases) for ElasticSearch.
Ironically, it works better for ElasticSearch in 2.5 than it would for
2.4.1, since the delimiter configuration option was introduced.
That being said, I'm also hesitant to take this on myself, simply
because we don't have an ElasticSearch cluster at NCSA.
I think it makes sense to generalize this as an HTTP/JSON log writer,
but we still need to tackle the question of what we do with messages
that fail to be delivered.
Generalizing it might be a bit tricky. For example, ElasticSearch needs
to post to http://1.2.3.4:9000/$log_name, while NSQ needs to add a
line containing the log_name before each log line.
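To show what that difference looks like in practice, here's a minimal
sketch of building the two request shapes described above. The function
names, the example endpoints, and the use of JSON-per-record are my
assumptions for illustration only:

```python
import json

def es_request(base_url, log_name, records):
    """ElasticSearch-style: log name goes in the URL path,
    one JSON document per line in the body."""
    url = f"{base_url}/{log_name}"        # e.g. http://1.2.3.4:9000/conn
    body = "\n".join(json.dumps(r) for r in records)
    return url, body

def nsq_request(base_url, log_name, records):
    """NSQ-style (per the description above): the URL stays fixed, and
    a line containing the log name precedes each log line."""
    body = "\n".join(f"{log_name}\n{json.dumps(r)}" for r in records)
    return base_url, body
```

A generalized HTTP/JSON writer would presumably take a per-backend
formatting hook like these, plus the delivery-failure policy, as
configuration.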
--Vlad
"Azoff, Justin S" <jazoff@illinois.edu> writes: