These are chat archives for Yelp/elastalert

6th
Sep 2017
Roman
@invizus
Sep 06 2017 09:39
I got another false positive flatline alert. I noticed that flatline queries twice every minute. Is that right?
Roman
@invizus
Sep 06 2017 10:23
all the rules start once, but flatline rules start twice in one run. I read a few issues and modified my flatline rule, will see if that worked.
sathishdsgithub
@sathishdsgithub
Sep 06 2017 15:09
I have a two-node ES cluster with a different IP address on each node. I need to query the data on the ES cluster. I believe elastalert has no option to add multiple host IP addresses in es_host:. I searched on Google and some people recommended putting HAProxy in front of ES. But the problem is that with HAProxy, elastalert will query only one node at a time; my requirement is to send each query to the entire ES cluster in a single request.
Andrew Rose
@andrewrosezen
Sep 06 2017 18:28
@sathishdsgithub when you query a node in the cluster for data that exists on another node, the nodes are intended to communicate with one another and give you the full result set
if Node A can't query Node B for Index C (Well, really it is looking at shards which by default will be distributed evenly-ish across both nodes), your cluster is not a cluster
as an example, all of my queries are run against an elasticsearch node that holds no data at all (a client node)
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:31
@andrewrosezen in elastalert we can specify only one es_host IP address. In my scenario the ES cluster has two nodes with two different IP addresses. How should I specify both IP addresses in es_host?
Andrew Rose
@andrewrosezen
Sep 06 2017 18:32
unless your goal is load balancing the queries, you do not need to do this
unless what you really have is two 1-node clusters
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:34
@andrewrosezen if I use DNS load balancing, will it send the query to both ES nodes?
Just an additional note: the two ES nodes hold different sets of logs...
Andrew Rose
@andrewrosezen
Sep 06 2017 18:35
if you load balance, it will select a backend to query based on the load balancing rules, usually round-robin or some stickiness
AH
ok
you do not have an elasticsearch cluster
you have two 1-node elasticsearch clusters
when clustered, you treat all the data as one big database, and query it on whichever host you like
and ES automatically routes requests for index data to the right place
all of this happens (or, is meant to happen) in the background, along the inter-node communication ports (default 93xx)
when you do a GET http://nodeIP:PORT/_cluster/health?pretty what do you get back?
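The health check asked about above can be sketched in Python. This is a minimal sketch, not part of elastalert: `fetch_cluster_health` and `looks_clustered` are hypothetical helper names, and the host, port, and expected node count are placeholders. It relies only on the `number_of_nodes` field that `_cluster/health` returns.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(host, port=9200, timeout=5):
    """Return the parsed JSON body of GET /_cluster/health from one node."""
    url = "http://{}:{}/_cluster/health".format(host, port)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def looks_clustered(health, expected_nodes):
    """True if this node reports at least the expected number of cluster members."""
    return health.get("number_of_nodes", 0) >= expected_nodes
```

Calling `looks_clustered(fetch_cluster_health("node1"), 2)` against each node in turn (with `node1` replaced by a real address) would show whether each one sees the other. If both nodes report `number_of_nodes: 1`, they are two separate one-node clusters.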
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:44
I'm using a Graylog Elasticsearch cluster, in which we have to specify the ES node IP addresses. elasticsearch_hosts = http://node1:9200,http://user:password@node2:19200
Andrew Rose
@andrewrosezen
Sep 06 2017 18:45
ok, but do your nodes know about each other?
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:46
The cluster name is the same for node1 and node2... Now I need to query these nodes from elastalert. In elastalert I have to specify es_host: <node IP address>
Andrew Rose
@andrewrosezen
Sep 06 2017 18:46
elastalert doesn't need to support having multiple hosts to query because an ES cluster handles routing of requests to hosts that have that data transparently, is what I'm getting at
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:47
Yes, the nodes know each other... The limitation of elastalert is that we can specify only one node IP address
Ok, let's say my cluster name is graylog... if I specify es_host: graylog, how will elastalert make a network socket connection to both node IP addresses?
Andrew Rose
@andrewrosezen
Sep 06 2017 18:48
elastalert absolutely does not need to
ok
so
lets say you have an index called greylog
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:49
Should I map both node IP addresses to a single DNS host entry?
Andrew Rose
@andrewrosezen
Sep 06 2017 18:49
no
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:50
Maybe I'm missing something... can you tell me the solution for this?
Andrew Rose
@andrewrosezen
Sep 06 2017 18:50
lets say you have two nodes, and one index called greylog, and let's assume that greylog has 5 shards and is told to make 1 replica (the default)
when data goes to elasticsearch for indexing
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:50
Replica is disabled in my scenario
I'm not duplicating the logs
Andrew Rose
@andrewrosezen
Sep 06 2017 18:51
kk
even with replicas=0
even with shards=1 and replicas=0
node1 decides that node2 should hold the actual data for greylog
so you send data to node1 (OR node2)
and the node that gets the data passes it along across the inter-node communication ports to node2, because the cluster has decided that node2 should hold it
good so far?
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:54
Ok
Andrew Rose
@andrewrosezen
Sep 06 2017 18:54
so now you decide you want to query the data
if your cluster is working properly, if you query node1 for greylog documents
node1 says "ok, so where did I stash greylog" and it looks up that it stored that data on node2
so then node1, on its own, asks node2 to look for the data that it has
node2 responds, passing it to node1
and then node1 returns it to you
if you asked node2 about greylog directly, it would figure out that greylog is local to itself, and would return the data to you directly
if your cluster is working, it does not matter which node you query for data
it plans a distributed search and passes everything along back to you from the node you query
this is what is meant by a cluster, in ES
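The routing described above can be illustrated with a toy model in Python. This is not how Elasticsearch is implemented (there are no shards, replicas, or network calls here); `ToyCluster` and `ToyNode` are hypothetical names, and the allocation of `greylog` to node2 simply mirrors the example in the conversation.

```python
class ToyCluster:
    """Toy model: remembers which node holds the data for each index."""

    def __init__(self):
        self.nodes = {}
        self.allocation = {}  # index name -> name of the node holding it

    def add_node(self, name):
        node = ToyNode(name, self)
        self.nodes[name] = node
        return node

    def allocate(self, index_name, node_name):
        self.allocation[index_name] = node_name

    def owner_of(self, index_name):
        return self.nodes[self.allocation[index_name]]


class ToyNode:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.local_docs = {}  # index name -> documents stored on this node

    def index(self, index_name, doc):
        # Whichever node receives the write passes it on to the owning node.
        owner = self.cluster.owner_of(index_name)
        owner.local_docs.setdefault(index_name, []).append(doc)

    def search(self, index_name):
        # Whichever node receives the query asks the owning node and
        # returns the result to the caller.
        owner = self.cluster.owner_of(index_name)
        return list(owner.local_docs.get(index_name, []))


cluster = ToyCluster()
node1 = cluster.add_node("node1")
node2 = cluster.add_node("node2")
cluster.allocate("greylog", "node2")  # the cluster decided node2 holds greylog

node1.index("greylog", {"message": "disk full"})     # sent to node1, stored on node2
node2.index("greylog", {"message": "login failed"})  # sent to node2, stored locally

from_node1 = node1.search("greylog")
from_node2 = node2.search("greylog")
```

In this model `from_node1` and `from_node2` hold the same two documents even though node1 stores nothing locally, which is the behaviour being described: it does not matter which node receives the query.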
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:58
What if both nodes have the same index name? Example: graylog_*
Andrew Rose
@andrewrosezen
Sep 06 2017 18:58
if node1 doesn't know about data in node2, and vice versa, your nodes are not clustered
when they are clustered, that is the same index
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:59
Ok , now what should I specify in es_host ? Either node1 or node2 IP ?
Andrew Rose
@andrewrosezen
Sep 06 2017 18:59
it breaks it up into segments called shards and distributes them across nodes
for smaller scale stuff, I'd just query the one that isn't attached to kibana
sathishdsgithub
@sathishdsgithub
Sep 06 2017 18:59
I'm not using kibana
Andrew Rose
@andrewrosezen
Sep 06 2017 18:59
but ultimately you want to use a client node, since it isn't using thread time to index data
oh, in that case just pick one
sathishdsgithub
@sathishdsgithub
Sep 06 2017 19:00
I use graylog for log collection and display
So either node1 or node2 IP address will work ? Correct me if I'm wrong
Andrew Rose
@andrewrosezen
Sep 06 2017 19:00
either will work
as long as your nodes are actually communicating with one another the way they are meant to
unless Graylog is doing something really weird
sathishdsgithub
@sathishdsgithub
Sep 06 2017 19:02
Ok I will specify node1 ip address in es_host and test it and update you
Andrew Rose
@andrewrosezen
Sep 06 2017 19:02
the nodes are just interfaces to one big networked database
sathishdsgithub
@sathishdsgithub
Sep 06 2017 19:02
Oh ok... thanks for your short Elasticsearch tutorial :smile:
@andrewrosezen will update you tomorrow about the test results.. thanks again
Andrew Rose
@andrewrosezen
Sep 06 2017 19:03
I tried not to get too deep into the technical part of how it all works, but as a rule of thumb, any node should behave like all the nodes together
sathishdsgithub
@sathishdsgithub
Sep 06 2017 19:05
One last question...
Are the node addresses only used to form the Elasticsearch cluster and for communication between the nodes?
Andrew Rose
@andrewrosezen
Sep 06 2017 19:05
nodes talk to each other over the network, so they need to bind to at least two ports
9200 is the default for the "front door" http/rest interface
sathishdsgithub
@sathishdsgithub
Sep 06 2017 19:07
If I add a third node, they all communicate with each other on port 9300, correct?
Sorry 9200
Andrew Rose
@andrewrosezen
Sep 06 2017 19:07
and then they bind to a port in the 9300-9400 range by default (the first free one, I think), which is where they actually talk to each other
you talk to ES on 9200, nodes talk to other nodes on 93xx
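A quick way to check that the two kinds of port are reachable is a plain TCP connect. This is a minimal sketch using only the Python standard library; `port_open` is a hypothetical helper name, and the hostnames and port numbers you pass in are assumptions about your own setup (9200 and 9300 are only the defaults mentioned above).

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hostnames.
        return False
```

For example, `port_open("node1", 9200)` run from the elastalert machine checks the HTTP port, and `port_open("node2", 9300)` run from node1 checks the inter-node port. A successful connect only shows that something is listening there; it does not prove the nodes have joined the same cluster.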
sathishdsgithub
@sathishdsgithub
Sep 06 2017 19:08
Cool thanks...
Andrew Rose
@andrewrosezen
Sep 06 2017 19:08
the Elastic docs about node communication are pretty good
sathishdsgithub
@sathishdsgithub
Sep 06 2017 19:09
Conclusion : the ES nodes are just interfaces to one big networked database..so we can use any node IP address for elastalert...
Andrew Rose
@andrewrosezen
Sep 06 2017 19:10
yep
if that doesn't work, your nodes aren't configured right
this functionality is more or less why ES is used. The full-text-search is also pretty great but ES is Lucene with magic distributed queries
felixrod78
@felixrod78
Sep 06 2017 20:10
Hi, is there any workaround to reduce the time spent reading all the alerts?
Or even, is it possible to save them in a DDBB?
Quentin Long
@Qmando
Sep 06 2017 22:40
@felixrod78 What's a DDBB, and what do you mean by time expended reading alerts?