These are chat archives for FreeCodeCamp/DataScience

Aug 2016
Quincy Larson
Aug 09 2016 08:37
@maxmatthews @evaristoc yes - public API endpoints are on the road map, and may be available in 2016. We have a lot of competing priorities :)
Aug 09 2016 09:37


Just for general interest... it is an outdated reference though (so maybe not $2 per day anymore; probably other settings have changed too)

I think that after installing Hadoop, something like HBase or Hive (?) is required to manipulate the data.

Aug 09 2016 09:58
More about the differences between Hadoop and HBase - I think that having Hadoop will be enough for this exercise: if I understand correctly, skale (as well as Shark) evaluates the plain data stored in Hadoop in memory instead of working from files, but I am not sure:
Aug 09 2016 10:09

Then there is the fact that we can simply load data into HDFS/Hadoop from the command line:
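
A minimal sketch of what that could look like, assuming the standard `hdfs dfs` file system shell is on the PATH. The helper name `hdfs_put_commands` and both paths are made up for illustration; the snippet only builds and prints the commands, it does not contact a cluster:

```python
import shlex


def hdfs_put_commands(local_path, hdfs_dir):
    """Build the shell commands that copy one local file into an HDFS directory.

    Both arguments are hypothetical placeholders; nothing is executed here.
    """
    return [
        # Create the target directory (and any missing parents) in HDFS.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Copy the local file into that directory.
        ["hdfs", "dfs", "-put", local_path, hdfs_dir],
    ]


if __name__ == "__main__":
    for cmd in hdfs_put_commands("data/chat.json", "/user/fcc/raw"):
        # Print each command as it would be typed in a terminal.
        print(shlex.join(cmd))
```

On a machine with a working Hadoop install, each list could be passed to `subprocess.run(cmd, check=True)` instead of being printed.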

Now the data will be stored as plain files (very much like a simple MongoDB collection with no schema at all). HBase is one of the many ways to give a schema or ordering to the unstructured data. The existing options, whether they run on top of HDFS like HBase or bring their own storage like Cassandra, seem to offer different advantages and disadvantages.

Aug 09 2016 10:45
Sorry, not Shark but Spark

About how to set up a Spark system - it should be similar for skale, according to what I have read in the skale chatroom:

For this exercise the simplest option should be the one to use.

Aug 09 2016 10:47
you need to ask about @someone!