These are chat archives for FreeCodeCamp/DataScience
discussion on how we can use statistical methods to measure and improve the efficacy of http://freeCodeCamp.com
@becausealice2 You asked about my interest in datadotworld. I am currently exploring a proposal to make use of this repository: https://github.com/freeCodeCamp/open-data .
One of the things I am evaluating is how to store datasets that are too big to be kept in the GitHub repository. datadotworld could be an option.
The repo is planned to host past, present and future projects that have used fCC data. That includes your projects, @becausealice2.
For those who are interested in Deep Learning and Machine Learning in general:
The big topic seems to be UNSUPERVISED LEARNING. Not that you have to focus on that, but keep that in mind.
The big, important contribution that developments in unsupervised models will bring is to allow a system to learn from scratch, with little or no prior data.
Then think about the following: how to produce new, emergent knowledge by combining things that were learned separately. Think of the semantic web. This is something that current systems are not doing properly.
This reminds me of an article I was reading recently about AlphaGo. A recent version of DeepMind's AlphaGo program put an unsupervised-style approach into practice (strictly speaking, reinforcement learning through self-play): instead of learning from previous games of Go played by humans, the new version played against itself, learning Go from scratch given only the rules of the game. The final results were:
What is the conclusion? MODELS are still very important. In this case, the model you provide (the set of rules of the game) is what lets the program develop so well.
Machine Learning is based on DATA, but it cannot provide answers beyond the data that the algorithm gets. Machine Learning is just an optimisation heuristic, most applicable when you don't have a model at hand. Compare that with, for example, Einstein's theoretical work: I think you would agree with me that his explanations of the surrounding world are still more valuable, because you can derive new knowledge from them.
That is why Unsupervised Learning is so important: the goal is to create a computer that is able to think of new solutions, not just optimise existing ones.
If you manage to find the MODEL that explains the phenomenon (think of maths here), you win more. So my advice is: keep working with models. Machine Learning won't remove the need for maths, I am afraid.