These are chat archives for FreeCodeCamp/DataScience
discussion on how we can use statistical methods to measure and improve the efficacy of http://freeCodeCamp.com
@cuent : some people here started to talk about F# in this channel last week and I checked some exercises. It is gaining my interest more and more. In fact, recently I have been finding Python a bit too small and constraining...
t-SNE: a Dutch person seems to be behind the development of the technique? Reading about it... Is it a sort of clustering technique? Or mainly representational, like Multidimensional Scaling? Hmmm... seems more like the latter... Looks VERY simple conceptually too!
kNN again! Apparently with MC + binary search, then back to Student's t-distribution and concepts from information theory to measure the loss function...
Amazing how some simple stuff can be more effective than overly complex implementations in the general case... we were realising that in a discussion during the last meetup...
@cuent: from what I am reading, the key here is finding the "probability distribution between pairs of high-dimensional objects" and then the probability distribution of the similarity distance. Looks like a (double) Bayesian approach to me unless your approach is parametric... Also, this is where the sampling takes place: could it be a numerical approximation? It seems it involves a bit of tweaking if you get fancy.
Those are my first impressions, not sure if I am correct...
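For anyone who wants to poke at the idea, here is a minimal sketch of the pieces mentioned above, assuming the usual textbook description of t-SNE: a Gaussian kernel for the high-dimensional pairwise probabilities, a binary search on the kernel width to hit a target perplexity, a Student-t kernel for the low-dimensional similarities, and the KL divergence as the loss. All function names here (`conditional_probs`, `search_beta`, `student_t_similarities`, `kl_divergence`) are made up for illustration and are not taken from any library; it is plain Python, not an optimized or complete implementation.

```python
import math

def conditional_probs(dists_sq, beta):
    """Gaussian similarities p_{j|i} of one point to the others.

    dists_sq: squared distances from point i to every other point.
    beta: kernel precision, assumed here to be 1 / (2 * sigma**2).
    """
    d_min = min(dists_sq)  # shift for numerical stability
    weights = [math.exp(-beta * (d - d_min)) for d in dists_sq]
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(probs):
    """Perplexity = 2 ** Shannon entropy (in bits)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy

def search_beta(dists_sq, target, tol=1e-4, max_iter=200):
    """Binary search for the beta whose perplexity matches the target."""
    lo, hi, beta = 0.0, None, 1.0
    for _ in range(max_iter):
        perp = perplexity(conditional_probs(dists_sq, beta))
        if abs(perp - target) < tol:
            break
        if perp > target:
            # Distribution too flat: narrow the kernel (larger beta).
            lo = beta
            beta = beta * 2.0 if hi is None else (beta + hi) / 2.0
        else:
            # Distribution too peaked: widen the kernel (smaller beta).
            hi = beta
            beta = (lo + beta) / 2.0
    return beta

def student_t_similarities(points):
    """Low-dimensional q_ij using a Student-t kernel with 1 degree of freedom."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = 1.0 / (1.0 + d2)
                total += w[i][j]
    return [[v / total for v in row] for row in w]

def kl_divergence(p, q):
    """KL(P || Q) over all pairs; entries with p_ij == 0 contribute nothing."""
    return sum(
        pij * math.log(pij / qij)
        for row_p, row_q in zip(p, q)
        for pij, qij in zip(row_p, row_q)
        if pij > 0.0
    )

# Usage: tune the kernel for one point, then build q for a toy 2-D embedding.
dists_sq = [1.0, 4.0, 9.0, 16.0, 25.0]
beta = search_beta(dists_sq, target=3.0)
q = student_t_similarities([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)])
```

The binary search is the "tweaking" part: the perplexity works roughly as an effective number of neighbours, which is probably why it feels like kNN. In the full method, as far as I understand it, the low-dimensional points are then moved by gradient descent to reduce the KL divergence.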
Not sure if you are interested only in applying the technique or also in the concepts? I don't know of a course, but for the basic concepts I would suggest something in Computational Statistics if that is not your field: it could be valuable for understanding the implementation.
(particularly @darwinrc but maybe @cuent and @luishendrix92? @Lightwaves?)
Who wants to check and play with this? Maybe a very small project here on our own? Something perhaps for Latin American users?
I can share with you some advances already done by some companies here, trying to get involved.
Sorry about my lack of involvement in the skale project over the last few months. I am still interested but stuck with some basic stuff, to be honest. I still suggest keeping an eye on that project, who knows what comes. I'll keep you updated.
@evaristoc Yes, I am very interested in clustering algorithms.
I didn't know it was possible to build presentations with D3.js. Is there a tool to create presentations, or should we write the code from scratch?