These are chat archives for linkedin/pinot

10th Feb 2016
Hisham Mardam-Bey
@mardambey
Feb 10 2016 01:54
After the changes I mentioned above, I was just able to create the real-time table with com.linkedin.pinot.core.realtime.impl.kafka.KafkaConfluentAvroMessageDecoder
Jean-François Im
@jfim
Feb 10 2016 01:55
Nice, ^5
Hisham Mardam-Bey
@mardambey
Feb 10 2016 01:55
Now I gotta get some data pumped into Kafka, produced by a Confluent-compatible producer.
(=
Jean-François Im
@jfim
Feb 10 2016 01:55
sweet
Hisham Mardam-Bey
@mardambey
Feb 10 2016 01:56
Half of the work was mucking around with Docker networking bits and bobs. hehe
Jean-François Im
@jfim
Feb 10 2016 01:56
Haha I can imagine
I'm really looking forward to having a docker image and having the cluster management tools that we have as part of the open source release
It'll make it so much easier for people to deploy this
If it's not private information, are you running on your own hardware or on some cloud provider (eg. aws)?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 01:59
Some cloud provider (eg. aws).
(=
confluent-platform:
  image: socialorra/confluent-platform
  ports:
    - "8081:8081"
    - "2181:2181"
    - "9092:9092"
  environment:
    - KAFKA_ADVERTISED_HOST_NAME=localhost

pinot:
  image: mardambey/docker-pinot-confluent-platform
  ports:
    - "8098:8098"
    - "8099:8099"
    - "9002:9000"
  links:
    - confluent-platform
  environment:
    - ZK_ADDRESS=confluent-platform:2181
That's the docker-compose.yml file for this stuff.
Jean-François Im
@jfim
Feb 10 2016 02:00
oh nice
so the confluent platform provides zk out of the box?
that's nifty
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:01
The distribution has ZK, Kafka, Schema Registry, and other parts (Kafka Connect, etc.)
The Docker image used there gets their 2.0.0 distro, unpacks it, and runs ZK, Kafka, and the Schema Registry in a single container.
Jean-François Im
@jfim
Feb 10 2016 02:01
Oh I see
I assume though the zk isn't clustered so that's not for prod envs?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:02
Bingo.
Jean-François Im
@jfim
Feb 10 2016 02:02
gotcha
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:02
Both of the above are meant for development.
Jean-François Im
@jfim
Feb 10 2016 02:02
Makes sense
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:03
Once we're past this stage, if this is to go into prod, we'll need containers for each of Pinot's components.
Jean-François Im
@jfim
Feb 10 2016 02:03
Right
Well it's good to have a single container for dev purposes anyway :)
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:04
nod (=
Jean-François Im
@jfim
Feb 10 2016 02:04
Is the memory allocated to each node also configurable through env vars?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:05
Not right now. Very little is configurable through env vars, only the stuff I've needed so far to get running.
Jean-François Im
@jfim
Feb 10 2016 02:05
Sorry, my question wasn't clear
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:05
I can see a lot more env vars being made available though, definitely useful.
Jean-François Im
@jfim
Feb 10 2016 02:05
I assume Docker has some kind of per-container limit so that one container cannot bring the whole thing down
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:06
I've not looked into that, so I'll assume so for now (=
Jean-François Im
@jfim
Feb 10 2016 02:06
haha
okay
I assume it's configurable in Docker, it would make no sense for that not to be
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:06
mem_limit: 2000m
Jean-François Im
@jfim
Feb 10 2016 02:06
oh gotcha
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:06
docker-compose supports this.
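In context, that's just an extra key on a service entry in the docker-compose.yml shown earlier — a sketch (the 2000m value is arbitrary, not a recommendation):

```yaml
pinot:
  image: mardambey/docker-pinot-confluent-platform
  mem_limit: 2000m   # cap this container's memory usage
```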
Jean-François Im
@jfim
Feb 10 2016 02:07
makes sense
I wonder if it would be doable eventually to do full roundtrip between docker and Helix
Helix supports instance provisioning
So Helix could request more Docker containers and deploy them
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:08
Gotcha.
It all depends on where the Docker containers are running.
Jean-François Im
@jfim
Feb 10 2016 02:09
So you could basically say to Helix "create this table with n servers" and it will figure out it requires n servers and provisions them
Well, we'll look into that when we're there :)
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:09
Pretty much every system that manages containers has a way for you to ask it to scale stuff.
Jean-François Im
@jfim
Feb 10 2016 02:09
right
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:10
Kubernetes, Mesos/Marathon, Swarm, etc all have these types of features. (=
Jean-François Im
@jfim
Feb 10 2016 02:10
I assume it'll be just a matter of having some kind of Helix adapter for provisioning systems (yarn/mesos/swarm)
right
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:10
nod
Jean-François Im
@jfim
Feb 10 2016 02:11
Man, I can't wait until the only thing I have to do to provision new servers is change some config and push it to git
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:11
hehe
Jean-François Im
@jfim
Feb 10 2016 02:11
and that config file just applies the table changes and provisions new instances
when we're there, I can take a really long vacation
I checked with our SRE to see if it would be possible to open-source the management tools and we have a good idea of how to do it
though I don't have a timeline for that
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:13
That's a good start.
Jean-François Im
@jfim
Feb 10 2016 02:13
but basically our pinot config is stored in git and we'd like for the tool to be usable as a git post-commit hook
so creating new tables is just edit/git commit/code review/git push
It's not a git post-commit hook yet
Hisham Mardam-Bey
@mardambey
Feb 10 2016 02:16
nod
Hisham Mardam-Bey
@mardambey
Feb 10 2016 15:57
@jfim I'm starting to reaaaaally very quickly want table admin scripts, hehe. Especially to drop tables atm, since I'm experimenting with recreating tables, changing columns, etc.
hehe
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:19
Just got my app to publish into Kafka running in the Confluent Docker container, then Pinot picked up the message, decoded it via Confluent's schema registry, and indexed it into the real-time table that I previously created.
Used the web interface to see the data in the table; I don't have the system pulling from Pinot via the Java API yet.
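As an aside, the framing a Confluent-aware decoder has to strip is small: Confluent-serialized messages prepend a magic byte and a 4-byte schema-registry ID to the Avro payload. A minimal Python sketch of splitting that header (illustrative only — the actual decoder in play here is the Java KafkaConfluentAvroMessageDecoder):

```python
import struct

def split_confluent_frame(payload: bytes):
    """Split a Confluent-framed Kafka message into (schema_id, avro_bytes).

    Wire format: one magic byte (0x00), a 4-byte big-endian schema ID,
    then the Avro-encoded record body.
    """
    if len(payload) < 5:
        raise ValueError("payload too short for Confluent wire format")
    magic, schema_id = struct.unpack(">bI", payload[:5])
    if magic != 0:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, payload[5:]

# Example: schema ID 42 followed by a (fake) Avro body.
frame = b"\x00" + struct.pack(">I", 42) + b"avro-bytes"
print(split_confluent_frame(frame))  # (42, b'avro-bytes')
```

A real decoder would then fetch the writer schema for that ID from the schema registry and deserialize the remaining bytes with Avro.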
Jean-François Im
@jfim
Feb 10 2016 18:49
Nice :)
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:52
@jfim I just tried it from Pinot running in the Docker container; I haven't seen the data yet.
I know that Pinot is calling init() on my code, since I see my props printed out on the screen.
but the decode() method is not called.
I'll really have to put some debugging in there and rebuild the container and see if it's being called at all or not.
Jean-François Im
@jfim
Feb 10 2016 18:52
Have you pushed the event before creating the table?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:53
This is where stuff sucks: works outside the container, but not in the damn thing.
Actually, no, I don't think so.
I created first.
Jean-François Im
@jfim
Feb 10 2016 18:53
okay
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:53
Then pushed.
Jean-François Im
@jfim
Feb 10 2016 18:53
hmm
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:53
I was wondering if that mattered.
Jean-François Im
@jfim
Feb 10 2016 18:53
It does
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:53
I even tried restarting Pinot.
So, I can try to rebuild the containers from scratch, and push data first then create the table.
Jean-François Im
@jfim
Feb 10 2016 18:54
You can set the kafka offset that it uses
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:54
If I see nothing in Pinot, then there's an issue.
Then I'll have to debug the decoder more closely.
Jean-François Im
@jfim
Feb 10 2016 18:54
By default I think it's set to the end of the stream, not the beginning
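For what it's worth, that offset behaviour normally lives in the realtime table config rather than a ZK shell — a sketch of the relevant fragment (property names assumed from Pinot's Kafka high-level consumer settings at the time; verify against your version):

```json
"streamConfigs": {
  "streamType": "kafka",
  "stream.kafka.consumer.type": "highLevel",
  "stream.kafka.topic.name": "facebook_ads-realtime",
  "stream.kafka.decoder.class.name": "com.linkedin.pinot.core.realtime.impl.kafka.KafkaConfluentAvroMessageDecoder",
  "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
}
```

Setting `auto.offset.reset` to `smallest` makes a fresh consumer group start from the beginning of the topic instead of the end.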
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:54
@jfim set the Kafka offset via like a zk shell?
I published more events into Kafka after restarting Pinot, still didn't pick them up.
Jean-François Im
@jfim
Feb 10 2016 18:55
Oh
Then something's wrong :)
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:55
I was thinking, even if it didn't pick up the initial events, it should pick up subsequent ones.
Jean-François Im
@jfim
Feb 10 2016 18:55
right
Do you have zooinspector?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:56
Yea, that's why I wanna debug the decoder, and see if the decode method is called at all.
I don't have zooinspector.
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:56
nod; I'll get it going.
I'm going to rebuild the Pinot container with more logging.
Jean-François Im
@jfim
Feb 10 2016 18:57
You can check the state in ZK
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:57
I want to know if the decoder is being called at all.
Jean-François Im
@jfim
Feb 10 2016 18:57
If you look at the state of the helix cluster with zooinspector
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:57
nod
Jean-François Im
@jfim
Feb 10 2016 18:58
in EXTERNALVIEW you should see your realtime table and it should have one segment in ONLINE state
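If zooinspector isn't handy, the stock ZooKeeper CLI can show the same thing — a sketch, assuming the Helix cluster is named PinotCluster and ZK is on localhost:2181 (adjust paths for your setup):

```shell
# Inspect the Helix external view directly in ZK.
bin/zkCli.sh -server localhost:2181 <<'EOF'
ls /PinotCluster/EXTERNALVIEW
get /PinotCluster/EXTERNALVIEW/facebook_ads_REALTIME
EOF
```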
Hisham Mardam-Bey
@mardambey
Feb 10 2016 18:58
nod
I might not be able to try out zooinspector until a few hours from now. I'm rebuilding the container so I can know more either way, and I've gotta jump into a meeting soon.
Jean-François Im
@jfim
Feb 10 2016 18:59
okay :)
Ping me when you can look at the state in zk, we can then debug what's happening
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:10
@jfim ping (=
@jfim rebuilt with debugging in the decode method, nothing comes out on the screen. I've got zooinspector up and running now!
Jean-François Im
@jfim
Feb 10 2016 20:42
@mardambey Can you check what's in EXTERNALVIEW?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:43
Sure.
Jean-François Im
@jfim
Feb 10 2016 20:43
You should see yourtable_OFFLINE and yourtable_REALTIME
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:43
brokerResource
facebook_ads_OFFLINE
facebook_ads_REALTIME
flights_OFFLINE
flights_REALTIME
Jean-François Im
@jfim
Feb 10 2016 20:44
so if you look at your REALTIME table, it should have segments in ONLINE state
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:45
{
  "id":"facebook_ads_REALTIME"
  ,"simpleFields":{
    "BUCKET_SIZE":"0"
    ,"IDEAL_STATE_MODE":"CUSTOMIZED"
    ,"INSTANCE_GROUP_TAG":"facebook_ads_REALTIME"
    ,"MAX_PARTITIONS_PER_INSTANCE":"1"
    ,"NUM_PARTITIONS":"1"
    ,"REBALANCE_MODE":"CUSTOMIZED"
    ,"REPLICAS":"1"
    ,"STATE_MODEL_DEF_REF":"SegmentOnlineOfflineStateModel"
    ,"STATE_MODEL_FACTORY_NAME":"DEFAULT"
  }
  ,"listFields":{
  }
  ,"mapFields":{
    "facebook_ads_REALTIME__Server_172.17.0.4_8098__facebook_ads_REALTIME_1455132381884_0__0__1455132382308":{
      "Server_172.17.0.4_8098":"ONLINE"
    }
  }
}
Jean-François Im
@jfim
Feb 10 2016 20:45
That's the brokerResource :P
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:45
oops
Hold on (=
Edited, above.
Jean-François Im
@jfim
Feb 10 2016 20:46
Ah so yeah the consumer should be started
Do you see any exceptions in the server logs?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:47
Let me check, nothing on the console, I think the log files will be empty.
mardambey @mardambey checks
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:47
log files are all zero sized.
Jean-François Im
@jfim
Feb 10 2016 20:47
Oh wow
I guess log4j isn't set up properly then
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:48
This is how I start'em in the container:
bin/pinot-admin.sh StartServer $ZK_OPT
I don't point to a log4j config file or anything
Perhaps I need to include a -D on the command line?
Jean-François Im
@jfim
Feb 10 2016 20:50
hmm
mardambey @mardambey tries
Jean-François Im
@jfim
Feb 10 2016 20:50
I wonder if we have multiple log4j.properties files in the classpath
ugh
it's set to ERROR
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:52
indeed.
Jean-François Im
@jfim
Feb 10 2016 20:53
I'd set it to info
It'll start giving a lot more info :)
Hisham Mardam-Bey
@mardambey
Feb 10 2016 20:53
(-D wont work btw, as it's obviously not understood by the commands within pinot-admin)
I'll try.
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:17
So far nothing extra on the console, with s/ERROR/INFO/g
Jean-François Im
@jfim
Feb 10 2016 21:17
I think it should log in pinotServer.log
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:22
Still zero sized.
Jean-François Im
@jfim
Feb 10 2016 21:25
Hmm
let me check
Jean-François Im
@jfim
Feb 10 2016 21:31
Are there other log files?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:34
None that I can see, just 3 zero sized ones (broker, server, controller)
Something I see pop up on the console is:
Pool for (XXXXX_8098) : Setting up timeout job to run every 21600000
(replaced server name for XXXXX)
Jean-François Im
@jfim
Feb 10 2016 21:40
Is that the only thing on the console?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:40
Argh, damn it.
Now the data's in.
I suddenly saw ZK logs from the Confluent platform container, I refreshed the web interface for Pinot, data's in.
It's calling my code, and printing out the debug:
Asked to decode data of length 29
This is again, Pinot running outside Docker.
I copied the Pinot distribution from inside the container to my machine directly, and just ran it there, to verify whether the distro within the container is good or not. So now we know it's good.
Question remains: why do I never see this happening when Pinot is running within Docker?
ie: I never see: Pool for (XXXXX_8098) : Setting up timeout job to run every 21600000
And I never see Pinot actually consume from Kafka.
(just init the decoder)
Jean-François Im
@jfim
Feb 10 2016 21:44
hmm
so it works outside of the docker container but not inside?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:45
So far, yes.
Jean-François Im
@jfim
Feb 10 2016 21:45
Hmm
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:45
I'm going to repeat this within Docker one more time.
Jean-François Im
@jfim
Feb 10 2016 21:45
okay
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:47
Is there a relation between that logging message about the "Pool for ... to run every xxxx" and the Kafka consumption process?
Jean-François Im
@jfim
Feb 10 2016 21:48
No, I don't even know where that message comes from :)
Logging should be fixed in latest master
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:52
Sweet - I'll merge that up later today.
Just restarted things in Docker again, I saw:
pinot_1 | Pool for (3a8c1520acb2_8098) : Setting up timeout job to run every 21600000
Jean-François Im
@jfim
Feb 10 2016 21:53
oh so it works in docker now?
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:53
Nicely generated random host name, hehe
Jean-François Im
@jfim
Feb 10 2016 21:53
haha
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:53
I didn't see data get pulled in yet.
Just that message.
Jean-François Im
@jfim
Feb 10 2016 21:53
gotcha
It shouldn't take too long for realtime ingestion to start
Hisham Mardam-Bey
@mardambey
Feb 10 2016 21:57
Yea, it doesn't when I run locally... Same issue within Docker now. Not consuming.
This is the only thing I see relating to Pinot/Kafka/ZK after I create the table, and that "pool" log appears:
confluent-platform_1 | [2016-02-10 21:51:05,728] INFO Got user-level KeeperException when processing sessionid:0x152cd2768810010 type:create cxid:0x14 zxid:0xee txntype:-1 reqpath:n/a Error Path:/consumers/facebook_ads_REALTIME_1455141062796_0/owners/facebook_ads-realtime Error:KeeperErrorCode = NoNode for /consumers/facebook_ads_REALTIME_1455141062796_0/owners/facebook_ads-realtime (org.apache.zookeeper.server.PrepRequestProcessor)
confluent-platform_1 | [2016-02-10 21:51:05,764] INFO Got user-level KeeperException when processing sessionid:0x152cd2768810010 type:create cxid:0x15 zxid:0xef txntype:-1 reqpath:n/a Error Path:/consumers/facebook_ads_REALTIME_1455141062796_0/owners Error:KeeperErrorCode = NoNode for /consumers/facebook_ads_REALTIME_1455141062796_0/owners (org.apache.zookeeper.server.PrepRequestProcessor)
confluent-platform_1 | [2016-02-10 21:51:06,244] INFO Got user-level KeeperException when processing sessionid:0x152cd276881000e type:setData cxid:0x114 zxid:0xf3 txntype:-1 reqpath:n/a Error Path:/PinotCluster/INSTANCES/Server_172.17.0.4_8098/CURRENTSTATES/152cd276881000e/facebook_ads_REALTIME Error:KeeperErrorCode = NoNode for /PinotCluster/INSTANCES/Server_172.17.0.4_8098/CURRENTSTATES/152cd276881000e/facebook_ads_REALTIME (org.apache.zookeeper.server.PrepRequestProcessor)
Those are basically the logs from ZK, showing that Pinot is at least trying to do something; but it stops at this point.
I'm going to take a break and head to the gym, and get back to this in the evening... I'll also merge the logging changes you made, might end up revealing more.
Cheers so far bud (=
Jean-François Im
@jfim
Feb 10 2016 21:59
yeah
You could check if those nodes exist in zk
I'm glad to hear you're back to the gym now :)