These are chat archives for freeman-lab/discussion

1st
Mar 2016
Seth Vincent
@sethvincent
Mar 01 2016 00:59
@freeman-lab :ghost: :ghost: :ghost:
I'm planning to do a sprint of hexaworld stuff starting this weekend. here's my list of things I've got in mind: https://gist.github.com/sethvincent/42c73859d81a736b46cb
Jeremy Freeman
@freeman-lab
Mar 01 2016 01:02
@sethvincent awesome! good timing, i was planning on doing a push this week, so weekend will be a good time to jump in
list overall looks good but let's plan to chat later this week? thursday maybe?
Seth Vincent
@sethvincent
Mar 01 2016 01:12
@freeman-lab cool! thursday before noon or after 5 p.m. eastern time would work for me
Jason Wittenbach
@jwittenbach
Mar 01 2016 01:31
@freeman-lab just sent the first draft of the Spark Summit stuff your way
@andrewosh I sent it your way as well
Andrew Osheroff
@andrewosh
Mar 01 2016 01:39
@jwittenbach cool looking it over now
Jeremy Freeman
@freeman-lab
Mar 01 2016 01:43
@jwittenbach cool, can you pop it into a google doc?
and when is it due?
@sethvincent sweet, let's do thursday 6pm est
Jason Wittenbach
@jwittenbach
Mar 01 2016 01:44
sure thing
due tonight
I’m assuming midnight
though it doesn’t say
Jeremy Freeman
@freeman-lab
Mar 01 2016 01:45
oh oh, but tonight and not tomorrow =)
let's assume midnight
Jason Wittenbach
@jwittenbach
Mar 01 2016 01:47
ah yeah, tonight
sorry for the last minute-ness
Google Doc share sent
Jeremy Freeman
@freeman-lab
Mar 01 2016 01:48
sweet
first comment is shorten =)
but i'll try to make that more specific
Jeremy Freeman
@freeman-lab
Mar 01 2016 02:39
@jwittenbach posted an edit
in the doc
Jason Wittenbach
@jwittenbach
Mar 01 2016 02:53
@freeman-lab thanks! that flows really nicely
freeman-lab command line implementation 235x faster than hadoop
freeman-lab ogd: ^ reminds me of our gasket idea to support each
ogd lol
ogd we should start a 'data science that doesnt involve java' movement
freeman-lab hehe yeah i'd be happy to say goodbye to the jvm
Gitter Robot
@gitter-robot
Mar 01 2016 03:18
ogd also that works on windows
ogd i think our stuff should exist between the stuff in that post
Gitter Robot
@gitter-robot
Mar 01 2016 03:23
ogd e.g. between unix commands and hadoop
ogd but i think node is awesome for that
freeman-lab yeah totally, especially giving a smoother ux around xargs
ogd yea and npm
freeman-lab yup!
Gitter Robot
@gitter-robot
Mar 01 2016 03:28
freeman-lab @sofroniewn have we talked about gasket yet? https://github.com/datproject/gasket
freeman-lab we should try it for your workflows that chain together sequences of notebooks and python scripts
ogd i should just make this the readme but the spec is here datproject/gasket#17
Kyle Kelley
@rgbkrk
Mar 01 2016 03:45
@freeman-lab @ogd have you thought about wrapping https://github.com/PivotalRD/libhdfs3 for node?
Another means to stick it to Hadoop where people already have data. ;)
Gitter Robot
@gitter-robot
Mar 01 2016 03:46
ogd @rgbkrk WHOA cool
ogd looks like it would be annoying to write node bindings for https://github.com/PivotalRD/libhdfs3/tree/apache-rpc-9/src/client
ogd but maybe theres a cli or something
Gitter Robot
@gitter-robot
Mar 01 2016 03:57
freeman-lab @rgbkrk cool! didn't know about that, had seen this one though http://snakebite.readthedocs.org/en/latest/, has a cli
Seth Vincent
@sethvincent
Mar 01 2016 04:00
@freeman-lab :+1:
Kyle Kelley
@rgbkrk
Mar 01 2016 04:19
ah neat, I'd been checking out http://hdfs3.readthedocs.org/en/latest/ by @mrocklin and folks
Matthew Rocklin
@mrocklin
Mar 01 2016 05:57
We opted for libhdfs3 because it had some things that snakebite didn't, notably data-local reads and Kerberos. Libhdfs3 seems to be a full libhdfs-compliant client API. Snakebite was a bit more of a grab-bag of useful functionality.
They're both great though. Snakebite is easier to install if you're installing from source.