These are chat archives for FreeCodeCamp/DataScience

15th
Jan 2018
Alice Jiang
@becausealice2
Jan 15 2018 02:33
@rscales02 One that I do all. the. damn. time. is the wrong order for my scales, like [max, min]
and it's not even sometimes. It's very near every damn time
Lone programmer?
@MateoCzyzewski
Jan 15 2018 07:31
Hello everyone?
Can I ask some questions about machine learning, and also some questions about the math a real data scientist needs?
evaristoc
@evaristoc
Jan 15 2018 10:25
@MateoCzyzewski Hi! I would suggest you go ahead and ask, and see what comes of it.
evaristoc
@evaristoc
Jan 15 2018 12:10
@vaibhavn541 Sorry for coming back to you so late. I haven't used Theano, but I suggest reading the error carefully. It seems to be a very simple configuration issue. I can't help you more than that, sorry.
macnux
@macnux
Jan 15 2018 16:37
Python web scraping examples using multiple libraries
https://likegeeks.com/python-web-scraping/
evaristoc
@evaristoc
Jan 15 2018 18:52
@macnux thanks for sharing!!
CamperBot
@camperbot
Jan 15 2018 18:52
evaristoc sends brownie points to @macnux :sparkles: :thumbsup: :sparkles:
macnux
@macnux
Jan 15 2018 19:06
@evaristoc You are welcome!!
evaristoc
@evaristoc
Jan 15 2018 19:57

PEOPLE

The article about Emojis has been approved for editing! Wish me luck!!
Josh Goldberg
@GoldbergData
Jan 15 2018 20:06
Good luck! @evaristoc great work!
mstellaluna
@mstellaluna
Jan 15 2018 20:06
@evaristoc :clap:
evaristoc
@evaristoc
Jan 15 2018 20:27

@all
I want to mention everyone in this group to make some quick comments about the data plans for this year:

NEW CODERS SURVEY

There is a planned edition of the yearly New Coders Survey this year too. This is a flagship project. The last two editions took place at some point between May and July, and I would expect the same this year. The usual owner of this project has been @erictleung, but he is becoming a very busy man, so the project might need someone who can help with it. Whether Eric is available or not, I invite those interested in the R language to take a look at this project. He gave a small masterclass in R coding there. The project is mature enough to be accessible to less advanced users.


Meanwhile, I invite all of you to stay tuned to this chat or to Contributors: https://gitter.im/FreeCodeCamp/Contributors.

Eric Leung
@erictleung
Jan 15 2018 23:07

@evaristoc always curious about the yearly coders survey :smile:

As @evaristoc mentioned, I do have a lot of R code that can be used to manipulate and "clean" the raw data so that it is analysis-ready. I have been meaning to turn it into an R package so that it is easier to use and to test. One thing that might be good to discuss is adding or removing questions, depending on what would be useful to learn about the new coder community.

Josh Goldberg
@GoldbergData
Jan 15 2018 23:16
@erictleung can you share some examples of your R code for data munging?
I’m relatively versed in R. I’m not sure if this project overlaps with the other one we spoke about @evaristoc
Eric Leung
@erictleung
Jan 15 2018 23:18
@GoldbergData here's the code for the 2017 survey and here's the code for the 2016 survey. The main components of the code are unchanged, which is why I'd like to make this into a package: there would be less repetition and less guessing about what the functions I've created do.
Josh Goldberg
@GoldbergData
Jan 15 2018 23:19
this assumes the data doesn’t change y/y?
@erictleung
Eric Leung
@erictleung
Jan 15 2018 23:20
@GoldbergData right, the scripts I've created only fit the data for their corresponding year. Hence the headache of going through my code again every year and making sure it still works for the current survey data.
Josh Goldberg
@GoldbergData
Jan 15 2018 23:21
never seen select_. Is that different than select from dplyr? @erictleung
Eric Leung
@erictleung
Jan 15 2018 23:22
@GoldbergData it's used so that you can interpolate the column name from a variable. It is only slightly different from the normal select in dplyr: select_ takes the column names as strings.
Josh Goldberg
@GoldbergData
Jan 15 2018 23:22
ah okay. @erictleung. This is a lot of code. Any way to shorten it?
Eric Leung
@erictleung
Jan 15 2018 23:24
@GoldbergData yes, this is a bit unwieldy. There are two ideas for "shortening" it. One is to break the script up into smaller scripts, each grouping functions with a similar purpose. The other is to turn the script into its own R package, which inherently groups functions with a similar purpose and demands good documentation for each function declared.
@GoldbergData if you use RStudio, there's an option to view each section and subsection of the script to make it easier to parse as a human.
Josh Goldberg
@GoldbergData
Jan 15 2018 23:25
is this FCC’s data? If so maybe cleaning up the front end survey could make munging easier?
Yeah I use RStudio
Eric Leung
@erictleung
Jan 15 2018 23:27
@GoldbergData can you elaborate on what you mean by "cleaning up the front end survey"? The scripts I've written do clean up the coder survey, like renaming columns and making sure the data is somewhat consistent internally.
Josh Goldberg
@GoldbergData
Jan 15 2018 23:27
row 1064, that long list: I’d probably store it somewhere else and call it
I mean if we can retrieve better inputs, less back-end cleaning would be required? I’m not sure how the survey actually works. Just a high-level thought @erictleung
the code does not look too complicated. it’s just long. Nice work though @erictleung
Eric Leung
@erictleung
Jan 15 2018 23:33
@GoldbergData Typeform has been used to implement the survey, and their column names are really long. Hence that function at line 1064. Another way to bulk rename columns that might be easier is to put the old column name and new column name pairs into a list/dictionary, so that you can easily replace column names with their new names. Never got around to it, but it's something I've thought of.
@GoldbergData and thanks :smile:
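
A minimal sketch of the list/dictionary idea for bulk-renaming columns mentioned above. It is written in plain Python rather than the project's R, and the column names, `RENAMES`, `rename_columns`, and `unmapped_columns` are all made up for illustration; they are not taken from the survey scripts.

```python
# Hypothetical mapping from long Typeform-style headers to short names.
# These are NOT the survey's real columns, only placeholders for the idea.
RENAMES = {
    "How old are you?": "age",
    "Which country do you live in?": "country_live",
}


def rename_columns(columns, renames):
    """Replace every header found in `renames`; keep unknown headers as-is."""
    return [renames.get(name, name) for name in columns]


def unmapped_columns(columns, renames):
    """List headers with no entry in `renames`, e.g. questions added this year."""
    return [name for name in columns if name not in renames]


raw_columns = ["How old are you?", "Which country do you live in?", "Timestamp"]
clean_columns = rename_columns(raw_columns, RENAMES)
leftovers = unmapped_columns(raw_columns, RENAMES)
```

Keeping the pairs in one mapping means a new survey year would mostly need an updated mapping rather than an edited function, and the leftovers check could flag questions that were added or reworded.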
CamperBot
@camperbot
Jan 15 2018 23:33
erictleung sends brownie points to @goldbergdata :sparkles: :thumbsup: :sparkles:
:cookie: 129 | @goldbergdata |http://www.freecodecamp.org/goldbergdata
Josh Goldberg
@GoldbergData
Jan 15 2018 23:34
@erictleung I see. So are you completely off the project?