These are chat archives for FreeCodeCamp/DataScience

18th
Aug 2016
Eric Leung
@erictleung
Aug 18 2016 03:45
Another data science resource, "Computational and Inferential Thinking: Foundations in Data Science". Uses Python3 to teach data science.
evaristoc
@evaristoc
Aug 18 2016 11:36
@erictleung Checking the resources you gave... useful, although I think you need to be registered to get full access? Not sure... PEOPLE: the training proposed by @erictleung will contain some statistics + Python... by the way: I am surprised how easily courses that in the past would have been called "Basic Statistical Analysis" are becoming "Basic Data Science" courses... :) ... anyway: it is more or less the same...

People

How are you progressing through "Big Data Analysis with Spark"?

The first week can take between 3 and 5 hours to finish, approx.
As I said, they are not giving any substantial content on Machine Learning / Statistics, although you can progress without it: the exercises are sometimes self-explanatory and copy/paste could be enough to complete them... However, you might complete the exercises without understanding what is going on, so if you feel that you want more explanations please leave some questions here, at Piazza or even the FCC Forum...
@Lightwaves: have you tried it already?
evaristoc
@evaristoc
Aug 18 2016 11:46

People

The What is Data Science Series

I think it is worth having a look... it is hard to follow as the sound is really bad, but you will get the perspective of 4 business practitioners (managers) talking about:
  • What Data Science is and the different forms of the role
  • What they expect from a recruited Data Scientist
  • What the regular job of a Data Scientist is
  • etc
evaristoc
@evaristoc
Aug 18 2016 11:53

People

Some Advice from Data Scientists for Trying the Exercises

I was in contact with some people at skale.me, privately and publicly; here are some suggestions:
  • try to put a strong focus on the ETL (Extract-Transform-Load) aspects of the job, something that is not given much attention in the different courses but is in fact a very important part of the whole (Big) Data analysis (I can say I agree with that claim...)
  • we can keep it simple and run everything locally; I would say it is our own decision whether we want to experiment with more complex settings
  • Victor Millan, the person in the skale.me video, got an error when instantiating Hadoop in one of the exercises that ended up affecting his computer, so we may have to be careful with Hadoop!!
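For anyone who wants to practise the ETL point locally, here is a minimal sketch of the three steps using only the Python standard library. It is not taken from the course or from skale.me; the column names (`name`, `quantity`), the `items` table, and the sample rows are all made up for illustration.

```python
import csv
import io
import sqlite3


def extract(source):
    """Extract: read rows from a CSV file-like object as dictionaries."""
    return list(csv.DictReader(source))


def transform(rows):
    """Transform: trim names, convert quantities to int, drop unparsable rows."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        try:
            quantity = int(row["quantity"])
        except ValueError:
            continue  # skip rows whose quantity is not a whole number
        cleaned.append((name, quantity))
    return cleaned


def load(records, connection):
    """Load: write the cleaned records into a SQLite table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS items (name TEXT, quantity INTEGER)"
    )
    connection.executemany("INSERT INTO items VALUES (?, ?)", records)
    connection.commit()


# Made-up sample input standing in for a real source file.
raw = io.StringIO("name,quantity\n apples ,3\npears,n/a\nplums,7\n")
conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
load(transform(extract(raw)), conn)
total = conn.execute("SELECT SUM(quantity) FROM items").fetchone()[0]
print(total)
```

Keeping each step as its own function makes it easy to swap the in-memory pieces for real files or a real database later, while everything still runs locally.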
Lightwaves
@Lightwaves
Aug 18 2016 17:46
OK, here is a question: did anyone run into this issue?
[image attachment]
Lightwaves
@Lightwaves
Aug 18 2016 17:52
Based on the output my code would have passed, but it's weird that the module isn't found. It's also weird getting used to how select is used in Spark. Trying to wrap my head around it.
Lightwaves
@Lightwaves
Aug 18 2016 18:00
nvm, I went too fast: the library wasn't installed
I was able to pass the test after installing the library with the instructions from lab 0, so back to going through it
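In case it helps anyone else hitting a "module not found" error, a quick way to check which libraries are missing before running a notebook is sketched below. This is a generic standard-library check, not something from the lab instructions; `missing_modules` is a hypothetical helper name, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately bogus.
print(missing_modules(["json", "surely_not_a_real_module_xyz"]))
```

An empty list means every listed module can be located; anything returned still needs to be installed.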
evaristoc
@evaristoc
Aug 18 2016 20:09
:+1: @Lightwaves
evaristoc
@evaristoc
Aug 18 2016 20:21

People

One e-book about Data Science recommended at Piazza

The book I was pointed to is exactly the same one that @erictleung suggested above: "Computational and Inferential Thinking: Foundations in Data Science". The e-book will take you through the lessons they gave at Berkeley, with videos and slides.
Lightwaves
@Lightwaves
Aug 18 2016 20:43
I just finished the lab; it was pretty interesting. I just need to submit it to the autograder.
Quincy Larson
@QuincyLarson
Aug 18 2016 20:44
@/all Stack Overflow released the full dataset from their survey! You can download the 5 MB .csv file here: https://drive.google.com/uc?export=download&id=0B0DL28AqnGsrV0VldnVIT1hyb0E

I am writing an article about the pros and cons of working remotely. I have two questions these data could answer:

  1. What is the difference in salaries between people who report working remotely and people who don't?

  2. What if the developer is in a different country than the company they work for - how does that impact salary?

Does anyone have a moment to find answers for these (and maybe even make a visualization)? I will credit you in my Medium article and link back to your personal webpage or Twitter account.
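A possible starting point for question 1 is to group respondents by their remote-work answer and compare median salaries. The sketch below uses only the standard library and a tiny made-up sample; the column names `remote` and `salary` and the answer labels are assumptions, so check them against the real header row of the .csv before using it.

```python
import csv
import io
import statistics
from collections import defaultdict


def median_salary_by_group(rows, group_col, salary_col):
    """Return {group value: median salary}, skipping rows with missing data."""
    salaries = defaultdict(list)
    for row in rows:
        group = (row.get(group_col) or "").strip()
        raw = (row.get(salary_col) or "").strip()
        if not group or not raw:
            continue  # respondent left one of the two questions blank
        try:
            salaries[group].append(float(raw))
        except ValueError:
            continue  # salary entry is not a plain number
    return {group: statistics.median(vals) for group, vals in salaries.items()}


# Made-up rows standing in for the survey file; column names are assumptions.
sample = io.StringIO(
    "remote,salary\n"
    "Full-time remote,60000\n"
    "Full-time remote,80000\n"
    "Never,50000\n"
    "Never,\n"
    "Never,70000\n"
)
result = median_salary_by_group(csv.DictReader(sample), "remote", "salary")
print(result)
```

To run it on the downloaded file, the `sample` object could be replaced with `open("survey.csv", newline="", encoding="utf-8")`, where `survey.csv` is a placeholder file name. The median is used rather than the mean because self-reported salaries tend to have a few extreme values.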

evaristoc
@evaristoc
Aug 18 2016 20:53
:+1: @QuincyLarson !
People: We can use this channel to see what comes from the analyses!
Quincy Larson
@QuincyLarson
Aug 18 2016 23:08
@evaristoc I have a feeling there are many, many insights that could be drawn from this dataset. 50k respondents and very few of the questions overlap with the New Coder Survey.