These are chat archives for FreeCodeCamp/DataScience
Discussion on how we can use statistical methods to measure and improve the efficacy of http://freeCodeCamp.com
Hi, @qmikew1! The text mining is currently progressing.
I have been trying very simple techniques in verbose Python to check whether translating from Python to JS makes sense.
At the moment the text mining project is not considering sentiment analysis, only "speech acts": particularly finding questions and requests for help, but also the beginning and end of conversations. In the most recent part of the project (this week) I tried a lazy, instance-based analysis comparing "distances", a methodology similar to k-NN where k is the whole training dataset. The training dataset I have been using is a popular corpus. I used it as-is to compare composition, and a modified version of the same corpus to compare sentence structure, to get two measures. Then I quickly checked whether there were regions in the "scatter plot" of those distances that better grouped the targeted sentences.
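A minimal sketch of the two-distance idea, under several assumptions that are not from the chat: the corpus is a toy stand-in, "composition" is taken as Jaccard distance over word sets, "structure" as Jaccard distance over bigrams of coarse token classes, and each sentence is summarised by its mean distance to every training instance. All function names are hypothetical.

```python
import re

# Hypothetical toy corpus; the actual corpus used in the project is not named.
CORPUS = [
    "how do i fix this error ?",
    "can anyone help me with this challenge ?",
    "thanks , that worked",
    "i finished the basic algorithm section",
]

# Assumed coarse token classes standing in for "sentence structure".
QUESTION_WORDS = {"how", "what", "why", "who", "where", "when", "which"}
AUXILIARIES = {"can", "could", "do", "does", "did", "is", "are", "would", "should"}
PUNCTUATION = {"?", "!", ".", ","}


def tokenize(text):
    """Lowercase the text and split it into words and single punctuation marks."""
    return re.findall(r"[a-z']+|[?!.,]", text.lower())


def shape(tokens):
    """Replace each token with a coarse class: WH, AUX, the punctuation mark, or W."""
    classes = []
    for token in tokens:
        if token in QUESTION_WORDS:
            classes.append("WH")
        elif token in AUXILIARIES:
            classes.append("AUX")
        elif token in PUNCTUATION:
            classes.append(token)
        else:
            classes.append("W")
    return classes


def bigrams(seq):
    """Return adjacent pairs of a sequence."""
    return list(zip(seq, seq[1:]))


def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B|, with two empty sets counted as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def distances(sentence, corpus=CORPUS):
    """Return (composition, structure) mean distances to the whole corpus."""
    tokens = tokenize(sentence)
    pattern = bigrams(shape(tokens))
    comp = [jaccard_distance(tokens, tokenize(ref)) for ref in corpus]
    struct = [
        jaccard_distance(pattern, bigrams(shape(tokenize(ref)))) for ref in corpus
    ]
    return sum(comp) / len(corpus), sum(struct) / len(corpus)


if __name__ == "__main__":
    for message in ["could someone help me ?", "ok , see you tomorrow"]:
        print(message, distances(message))
```

Each message yields one (composition, structure) point; plotting those points for labelled messages is one way to look for regions that group questions and help requests. Averaging over the whole corpus is only one possible reading of "k is the whole training dataset".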
I was using a small sample with data from one of the campers mentioned in the DataScience list above and the results were promising, although still with some caveats.
The results haven't been published yet. I would like to find time to compare with another technique first. I am afraid I will have to use more Python capabilities for that...
@qmikew1 The general idea of the room is also to invite people to be proactive by presenting/joining small projects. Projects can be a week long, and the only requirement would be that JS should be used at some point. It is not compulsory: you can simply check the room's progress. You have been in the list for some time, so feel free to do as you want; "advising" is also fine.