These are chat archives for FreeCodeCamp/DataScience

24th
Sep 2017
NDuma
@NDuma
Sep 24 2017 06:05
I rarely log in; however, I am pleased to see how much is happening here. Is there a list of current projects and past findings, including available data sets?
Please PM or mention me, as I'll likely log in again in the future to find a reference (e.g., data archaeology).
Thanks!
Also, Chingu Cohorts ... I'm glad they're doing it; however, it's a direct rip of my model -> seriously :/ ... lmk if you want to do it right and better; I have the original Slack chats.
evaristoc
@evaristoc
Sep 24 2017 08:26
@NDuma do you??? I am trying to get that data to complete this project!
https://evaristoc.github.io/fCC_R3c/
We should talk. I am currently working on the analytical part with someone in this channel too (@dmesquita). I want people to get more involved, as I would like the datasets generated from this exercise to be proposed as an exercise in Data Mining / Machine Learning for the DSR and fCC overall.
@mcbarlowe Nice job! Question: why ROC?
evaristoc
@evaristoc
Sep 24 2017 08:49

The logistic regression gives you a very good idea of which variables might be involved.

Good discussion! Only the charts titled "xG Distributions in Test Data" are generally unclear and not well discussed.

It would be nice to get a summary of your findings at the end. Example:

  • A logistic regression model predicts goal scoring reasonably well from these variables ....
  • The model predicts the data for individuals fairly well, but not the data for teams.
    (OBSERVATION: I am guessing that in the "prediction" section you are using a test dataset? Test_Fenwick_Data)

I would suggest finding out which shots the model fails on the most and suggesting what you would improve.
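
One possible way to look for the shots the model fails on the most is to rank them by per-shot log loss. This is only a rough sketch: the shot IDs, outcomes, and predicted probabilities below are made up for illustration and are not taken from the project's data, and `per_shot_log_loss` and `worst_predictions` are hypothetical helper names.

```python
import math

def per_shot_log_loss(outcome, predicted_xg, eps=1e-15):
    """Log loss of a single prediction; larger means the model was more wrong."""
    # Clip the probability so log(0) can never occur.
    p = min(max(predicted_xg, eps), 1 - eps)
    return -(outcome * math.log(p) + (1 - outcome) * math.log(1 - p))

def worst_predictions(shots, top_n=2):
    """Return the IDs of the top_n shots with the largest log loss."""
    scored = [(per_shot_log_loss(goal, xg), shot_id) for shot_id, goal, xg in shots]
    scored.sort(reverse=True)
    return [shot_id for _, shot_id in scored[:top_n]]

# Hypothetical shots: (shot_id, goal scored 1/0, model's predicted goal probability)
shots = [
    ("shot_a", 1, 0.05),  # a goal the model considered very unlikely
    ("shot_b", 0, 0.10),  # a miss the model expected
    ("shot_c", 0, 0.60),  # a miss the model rated as a likely goal
    ("shot_d", 1, 0.45),  # a goal the model was unsure about
]

worst = worst_predictions(shots)
print(worst)
```

Inspecting what the worst-ranked shots have in common (distance, angle, game state, and so on) could then hint at which variables the model is missing.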

Matthew Barlowe
@mcbarlowe
Sep 24 2017 14:35
@evaristoc I used ROC because I had read that it was a good way to test the accuracy of the model
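
For anyone following along, here is a minimal sketch of what the ROC curve and its AUC measure, written in plain Python rather than with any particular library. The labels and scores are made-up illustration data, and `roc_points` and `auc` are hypothetical helper names. Strictly speaking, AUC measures how well the model ranks positives above negatives rather than accuracy at one threshold.

```python
def roc_points(y_true, y_score):
    """Sweep the threshold from high to low and collect (FPR, TPR) points."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Highest scores first: lowering the threshold admits one score at a time.
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for idx, (score, label) in enumerate(pairs):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Tied scores share one threshold, so emit a point only after the tie ends.
        is_last = idx == len(pairs) - 1
        if is_last or pairs[idx + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Made-up example: 1 = goal, 0 = no goal, with the model's predicted probabilities.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

area = auc(roc_points(y_true, y_score))
print(area)  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so the 0.75 here means a randomly chosen goal outranks a randomly chosen non-goal three times out of four.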
evaristoc
@evaristoc
Sep 24 2017 17:41

Something about NN and optimization of gradient descent techniques: ADAM.

http://ruder.io/optimizing-gradient-descent/index.html#adam
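
A minimal sketch of the Adam update rule on a one-dimensional toy problem, in case it helps to see the moving parts. The objective f(x) = (x - 3)^2, the starting point, the step count, and the learning rate of 0.1 are arbitrary choices made for illustration; `adam_minimize` is a hypothetical helper, not code from the linked article.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a 1-D function with Adam, given its gradient."""
    x = x0
    m = 0.0  # running average of the gradient (first moment)
    v = 0.0  # running average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction: both averages start at zero and are too small early on.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

With these settings the result should land close to the true minimum at x = 3. Dividing by the square root of the second moment is what gives each parameter its own effective step size.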

evaristoc
@evaristoc
Sep 24 2017 18:08

@mcbarlowe Yes. I was trying to find reasons not to use it and didn't find many: the ROC is applicable to several cases. I have recently been more into using statistics, but in fact the choice of ROC and AUC vs other methods is more a question of culture. For example, ROC is widely used among those with a medical background. It has its followers in Marketing too.

I have always liked it, but for a moment, I don't know why, I started thinking it was applicable only to some specific cases. It seems the number of cases is wider than I thought. I made a wrong assumption actually.

That's ok then. I will check when it is more applicable and try to use it more frequently. Thanks!

CamperBot
@camperbot
Sep 24 2017 18:08
evaristoc sends brownie points to @mcbarlowe :sparkles: :thumbsup: :sparkles:
:cookie: 127 | @mcbarlowe |http://www.freecodecamp.com/mcbarlowe
evaristoc
@evaristoc
Sep 24 2017 18:12
If anyone else needs to inspect material about ROC and people supporting it, a few links: