These are chat archives for atomix/atomix

8th
Sep 2018
Jordan Halterman
@kuujo
Sep 08 2018 00:04

Yeah, I figured this out:

The key is not to block container start, since DNS is updated only once the container is up and the pod signals a ready state

I don’t know why all the container orchestration platforms do this
it’s a PITA but I guess it’s okay to just use the liveness check once the cluster has been bootstrapped
Maxim Manco
@mmanco
Sep 08 2018 00:05
Agreed, this is how it is now and we need to dance around it :)
Jordan Halterman
@kuujo
Sep 08 2018 00:05
we had basically the same problem with Docker Swarm
working on a POC a while back
thanks a lot
Maxim Manco
@mmanco
Sep 08 2018 00:08
With pleasure, thank you for Atomix!
Jordan Halterman
@kuujo
Sep 08 2018 00:08
so here’s my implementation for minikube: https://github.com/atomix/atomix-k8s/blob/master/atomix-local.yaml
basically it doesn’t use a readiness probe and blocks the startup script until all the hostnames are reachable: https://github.com/atomix/atomix-k8s/blob/master/docker/scripts/start-atomix#L161-L163
still a few issues but it seems to start and cluster fine now
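For illustration only (this is not the linked start-atomix script): a minimal Java sketch of the "block until every peer hostname is resolvable" idea. The class name `PeerDnsWait`, the retry parameters, and the choice to test DNS resolution rather than full reachability are all assumptions made for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper, not part of Atomix or atomix-k8s.
public class PeerDnsWait {

    /** Returns true if the host name currently resolves (DNS lookup only, no connection attempt). */
    public static boolean dnsResolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    /**
     * Polls until every host passes the check or the attempts run out.
     * The check is injectable so the loop can be exercised without real DNS.
     */
    public static boolean awaitAll(List<String> hosts, Predicate<String> check,
                                   int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (hosts.stream().allMatch(check)) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt and give up waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

A startup routine could call `PeerDnsWait.awaitAll(peers, PeerDnsWait::dnsResolves, 60, 1000)` with its own peer list before starting the node. This only helps when the pod has no readiness probe gating its DNS record, which is the situation described above.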
Maxim Manco
@mmanco
Sep 08 2018 00:12
Yep, looks like the right way to go
For our application we have to use readiness in order to take some instances out of traffic for some period of time.
Maxim Manco
@mmanco
Sep 08 2018 00:22
BTW going back to the issue I raised previously with MessageDecoder throwing exceptions on unexpected payloads which in our case are generated by a probing system.
Thinking it may be a good idea to read some identifier before the READ_SENDER_VERSION step. That will prevent the decoder from reading unexpected messages. WDYT?
Jordan Halterman
@kuujo
Sep 08 2018 00:26
So, actually the preamble is supposed to be used for that, but currently the preamble is read pretty far up the chain and the decoder doesn’t exit when the preamble is inconsistent. The version probably has to remain first since it’s used for backwards compatibility. But really the preamble should be written/read in the header and validated after the version, and either an unsupported version or invalid preamble should cause the decoder to stop processing messages altogether
maybe we’ll have to make a v2 encoder/decoder and move some of the bytes around
without that validation, it’s also possible for an invalid message to cause an OOM from allocating a message byte[] from some arbitrary number pulled from the bytes
Enabling SSL would be the answer for security, but it should still be handled before allocating any byte[]s to avoid this type of issue, which isn’t necessarily a security risk but can probably cause a node to crash
Actually a researcher provided a script that showed that as an attack vector in the past. But it can really be done just by any process accidentally sending bytes to the wrong port over TCP too
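For illustration only: a minimal sketch of the ordering described above (version first, then preamble, then a bounded length check before any byte[] is allocated). It uses a plain `java.nio.ByteBuffer` rather than the real decoder, and the field widths, the `SUPPORTED_VERSION`, `PREAMBLE`, and `MAX_LENGTH` values, and the class name are assumptions, not the Atomix wire format.

```java
import java.nio.ByteBuffer;

// Hypothetical header check, not the Atomix MessageDecoder.
public class HeaderCheck {
    static final int SUPPORTED_VERSION = 1;    // assumed value
    static final int PREAMBLE = 0x41544D58;    // assumed cluster identifier
    static final int MAX_LENGTH = 1 << 20;     // assumed 1 MiB cap on a message body

    /** Validates the header and returns the payload, or throws before allocating anything. */
    public static byte[] readPayload(ByteBuffer in) {
        if (in.remaining() < 12) {
            throw new IllegalArgumentException("truncated header");
        }
        int version = in.getInt();
        if (version != SUPPORTED_VERSION) {
            throw new IllegalArgumentException("unsupported version: " + version);
        }
        int preamble = in.getInt();
        if (preamble != PREAMBLE) {
            throw new IllegalArgumentException("invalid preamble");
        }
        int length = in.getInt();
        // Reject negative, oversized, or truncated lengths BEFORE allocating the array.
        if (length < 0 || length > MAX_LENGTH || length > in.remaining()) {
            throw new IllegalArgumentException("invalid length: " + length);
        }
        byte[] payload = new byte[length];
        in.get(payload);
        return payload;
    }
}
```

In a real connection handler the exception would translate into closing the channel, so a stray probe or a misdirected TCP client never gets far enough to trigger a large allocation.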
Maxim Manco
@mmanco
Sep 08 2018 01:19
I see, I hadn’t gone higher up the chain to see that the preamble is being read. It does make sense for the decoder to decode what it’s meant to decode and ignore other irrelevant stuff
Johno Crawford
@johnou
Sep 08 2018 14:27
@kuujo what we do in our application to combat that is to log at different log levels depending on the state of the decoder
The same thing occurs when enabling TLS too