These are chat archives for FreeCodeCamp/DataScience
discussion on how we can use statistical methods to measure and improve the efficacy of http://freeCodeCamp.com
H2O is a company developing a Free Software platform very similar to Spark but with different algorithms and possibilities. Rather than competing against each other, H2O and Spark are becoming complementary. H2O is fine with hosting Spark code, but to get the most out of it you should know Scala or, better, Java (the language of the source code). H2O makes heavy use of the JVM and makes much of its code portable through jars.
I think it could be interesting to go through it. I was informed today that H2O can be used in Databricks (for those who completed the Spark code).
Currently very busy learning Angular2, and glad to be coming across one of the topics that motivated me to study JS and nodejs: Reactive Extensions, or Rx.
From what I have seen so far, Rx is currently used more for data manipulation/transformation than for analytics. One important reason is the nature of the stream data type as a "flux": Rx, as mentioned by the presenter in the video below, is no more than querying stream (in general async) data.
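To make "querying stream data" a bit more concrete, here is a minimal sketch using plain Python generators. This is not Rx and not any Rx library's API, only an analogy: `readings` and `slow_requests` are hypothetical names, the events are made up for illustration, and a real stream would arrive asynchronously rather than from a function that yields fixed values.

```python
from typing import Iterable, Iterator


def readings() -> Iterator[dict]:
    """Hypothetical event source standing in for an async stream."""
    yield {"user": "a", "ms": 120}
    yield {"user": "b", "ms": 950}
    yield {"user": "a", "ms": 1300}


def slow_requests(events: Iterable[dict], threshold_ms: int) -> Iterator[str]:
    """Filter, then map, one event at a time, without storing the stream."""
    for event in events:
        if event["ms"] > threshold_ms:
            yield f'{event["user"]}: {event["ms"]} ms'


# Each event is processed as it arrives; nothing looks at the whole dataset.
result = list(slow_requests(readings(), threshold_ms=900))
print(result)
```

The point is that the "query" (filter plus transform) is declared once and applied to each item as it flows past, which is the manipulation/transformation use mentioned above.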
Analysing static, sync data is for now the preferred way for many kinds of analyses because many analytical methods require getting the dataset properties that will become parameters or attributes for the analysis. With stream data, obtaining those properties is harder or even impossible.
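As a rough illustration of the "harder" case, some dataset properties can still be tracked incrementally. The sketch below uses Welford's online algorithm (my choice for the example, not something from the talk) to keep a running mean and sample variance without holding the data in memory. The function name `running_mean_variance` and the sample values are assumptions for illustration only.

```python
import statistics
from typing import Iterable, Iterator, Tuple


def running_mean_variance(
    stream: Iterable[float],
) -> Iterator[Tuple[int, float, float]]:
    """Yield (count, mean, sample variance) after each item (Welford)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
        # Sample variance is undefined for one item; report 0.0 here.
        variance = m2 / (count - 1) if count > 1 else 0.0
        yield count, mean, variance


# Made-up values; in a stream they would arrive one at a time.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n, mean, var = list(running_mean_variance(data))[-1]

# The streaming result agrees with the batch computation on the full dataset.
print(n, mean, var, statistics.variance(data))
```

Other properties are less friendly: an exact median or exact quantiles, for instance, need the whole dataset or some approximation, which is closer to the "even impossible" end.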
However, Rx could solve interesting problems. Here is an example: