These are chat archives for FreeCodeCamp/DataScience

9th
Nov 2017
Josh Goldberg
@GoldbergData
Nov 09 2017 00:00
@evaristoc planning on working on more visualizations tonight. I sent some things I started on last night. @QuincyLarson, is there a particular format you want? Is there a particular visualization you’re interested in? I’ll do my own digging, but I’m curious whether there is something you’re interested in.
Josh Goldberg
@GoldbergData
Nov 09 2017 07:42
this is as far as I got…there is so much more to dig into. I apologize I didn’t make it far. I tried my best. Could do better with more time. @QuincyLarson @evaristoc
Dhrubajit-Das
@Dhrubajit-Das
Nov 09 2017 08:48
hello everyone,

Hi, this is the link to my blog:
http://dhrubajitdas44.blogspot.in/

It contains my machine learning/deep learning projects; only a few as of now, with more coming.
Any kind of criticism, suggestion, or correction is very welcome, as it will help me learn. I am a beginner.
If any experts/instructors would like to review the projects, that would be great. And students, follow the blog if you find it useful and want future updates.
Thank you

evaristoc
@evaristoc
Nov 09 2017 09:34

@GoldbergData

The charts are carefully prepared! It is obvious you have done data analysis before. Very nice! I hope you get more time from Quincy to cover other topics. The graphs are too focused on gender and age, IMO. But definitely very nicely done.

Can you please share here the references you might have used to prepare the graphs? Blogs? Tutorials? Others?

@Dhrubajit-Das your blog cannot be reached. However, I would like to invite you to include some data from our datasets. You could also take on specific assignments, as @GoldbergData and others did, and perhaps write something for the fCC Medium publication. That will surely help your visibility. We will be more than happy to help with that.
Josh Goldberg
@GoldbergData
Nov 09 2017 13:21

@evaristoc thank you for your kind words. That is very nice of you to say, and has made me feel better already about my insecurities. I’ll repost my GitHub link and you can see the code for yourself.

You see a lot of gender and age because that’s where I started. I need a couple more days to do more interesting analysis and get some deeper insights. These were just surface level. I’ve used a multitude of references over the months to build up my skill in R. Some include Udacity’s Data Analyst Nanodegree, DataCamp’s R track, and R for Data Science by Hadley Wickham, plus a multitude of other online sources. My Twitter (@GoldbergData) is where I find a lot of useful blog posts on data science. I RT a lot of stuff. Ask any questions, and I’ll happily answer. My knowledge of R is pieced together from all these resources I’ve studied.

CamperBot
@camperbot
Nov 09 2017 13:21
goldbergdata sends brownie points to @evaristoc :sparkles: :thumbsup: :sparkles:
:cookie: 379 | @evaristoc |http://www.freecodecamp.com/evaristoc
Josh Goldberg
@GoldbergData
Nov 09 2017 13:23
And I would love to devote time to working on more projects with @QuincyLarson. I am a big fan of what he does, and the mission of FCC! @evaristoc. I thus volunteer my little services.
evaristoc
@evaristoc
Nov 09 2017 13:26
:+1: !
evaristoc
@evaristoc
Nov 09 2017 13:32

The format looked pretty much like d3.js, although in fact it could be the other way around: usual designs in d3.js might tend to look like R ggplots. Same with Python.

Which library did you use to produce the HTML file?

And are you using any R repl in particular? I use RStudio.

Matthew Barlowe
@mcbarlowe
Nov 09 2017 13:46
That's R Markdown that's been knitted into HTML, I believe
evaristoc
@evaristoc
Nov 09 2017 15:15

@mcbarlowe I am getting an error when trying to open the big dump (BitTorrent) file with R:

library('rjson')
json_data <- fromJSON(file='output.json')
#Error in paste(readLines(file, warn = FALSE), collapse = "") : 
#result would exceed 2^31-1 bytes

It is big, but I was able to open it with Python last time, after a long time examining the file and looking up some options online.

Can you please test?

freeCodeCamp/open-data#25

Josh Goldberg
@GoldbergData
Nov 09 2017 15:15
@evaristoc I’ve never seen d3.js (I probably have but didn’t know). This was made with ggplot. The look is pieced together from existing themes with some of my own artistic touch. That’s the result you see. @mcbarlowe is right. It’s produced with knitr. I use RStudio. knitr makes it pretty easy to produce nice-looking documents. Though I am considering learning LaTeX. I am not sure, but I believe I can incorporate it somehow in my journey through data science in the long run.
Have you tried pulling in pieces of the data if it is too large? I’ve heard of methods for dealing with this. I believe R will load the data set into memory. And if your memory caps out, R crashes.
@evaristoc
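
A minimal sketch of the "pull in pieces" idea, written in Python since the thread also mentions testing with Python. It assumes the dump is stored as JSON Lines (one JSON object per line), which the chat does not confirm; the function name `read_json_lines_in_chunks` and the default chunk size are hypothetical. If `output.json` is a single large JSON document, this approach would not apply as written.

```python
import json
from itertools import islice
from typing import Iterator, List


def read_json_lines_in_chunks(path: str, chunk_size: int = 10_000) -> Iterator[List[dict]]:
    """Yield lists of at most `chunk_size` parsed records.

    Assumption: the file holds one JSON value per line (JSON Lines).
    Only one chunk is held in memory at a time.
    """
    with open(path, "r", encoding="utf-8") as handle:
        # Lazily parse each non-blank line; nothing is read until requested.
        records = (json.loads(line) for line in handle if line.strip())
        while True:
            chunk = list(islice(records, chunk_size))
            if not chunk:
                break
            yield chunk


# Hypothetical usage, assuming "output.json" is in JSON Lines format:
# for chunk in read_json_lines_in_chunks("output.json", chunk_size=50_000):
#     print(len(chunk))
```

Reading in fixed-size chunks keeps memory use roughly proportional to the chunk size rather than to the size of the whole file.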
Matthew Barlowe
@mcbarlowe
Nov 09 2017 15:16
@evaristoc I can when I get home tonight
evaristoc
@evaristoc
Nov 09 2017 15:17
No hurry
@GoldbergData you are also using R. Would you have time to lend for the tests? They are simple - trying to download and open the files with your preferred tool. Better if Python AND R.
You should try to download the file from where it is now.
Matthew Barlowe
@mcbarlowe
Nov 09 2017 15:19
As far as LaTeX goes, unless you are doing a lot of mathematical writing I wouldn't make it a priority
evaristoc
@evaristoc
Nov 09 2017 15:19
Anyway not compulsory, @GoldbergData
Matthew Barlowe
@mcbarlowe
Nov 09 2017 15:21
although I would never tell someone not to learn something if they are interested in it
evaristoc
@evaristoc
Nov 09 2017 15:21
@GoldbergData I might have to agree with @mcbarlowe. I have used it in the past, especially when trying to give details about formulas. But for projects targeting a more general audience it could even be counterproductive.
Josh Goldberg
@GoldbergData
Nov 09 2017 15:23
@evaristoc @mcbarlowe fair points. I wasn’t going to devote much mindshare to it. Was just going to make my resume for starters lol. I can try to load the data set tonight if that works? My python tools are rusty. If it’s not too time consuming, I can try both systems. But definitely I can do R.
evaristoc
@evaristoc
Nov 09 2017 15:27
@GoldbergData actually leave it for later then. Take your time to contact @QuincyLarson about your results instead. Agree with him on what else you can do. That is more of a priority right now, IMO.
Josh Goldberg
@GoldbergData
Nov 09 2017 15:43
@evaristoc sounds good.
evaristoc
@evaristoc
Nov 09 2017 15:55

@mcbarlowe

I succeeded with this library:

library(jsonlite)
json_data <- fromJSON("output.json", flatten=TRUE)

Can you please try both too?

Let me know your results later. Again, no hurry.
Any additional info, like speed, reliability of the analysis, etc. would be great. It is big, so don't be surprised if everything crashes if you don't have enough memory to handle the file.
Try simple aggregates.
This is the issue:
freeCodeCamp/open-data#25
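
One way to approach the "simple aggregates" suggestion, sketched in Python under stated assumptions: the records are dictionaries, and the field name used in the example (`"room"`) is purely hypothetical, since the chat does not describe the structure of `output.json`. The helper `count_by_field` is likewise illustrative, not part of any library.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_by_field(records: Iterable[Dict[str, Any]], field: str) -> Counter:
    """Count how often each value of `field` occurs across the records.

    Works on any iterable, so it can consume a generator that streams
    records from disk without loading the whole file first.
    """
    counts: Counter = Counter()
    for record in records:
        value = record.get(field, "<missing>")
        # Lists and dicts are unhashable, so count them by their repr instead.
        if not isinstance(value, (str, int, float, bool, type(None))):
            value = repr(value)
        counts[value] += 1
    return counts


# Hypothetical usage with a made-up field name:
# sample = [{"room": "DataScience"}, {"room": "Help"}, {"room": "DataScience"}]
# print(count_by_field(sample, "room").most_common())
```

Because the counter only stores one entry per distinct value, its memory footprint depends on the number of distinct values, not on the number of records.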
evaristoc
@evaristoc
Nov 09 2017 16:06
Your call, @mcbarlowe. You decide what to do with the file.
Quincy Larson
@QuincyLarson
Nov 09 2017 21:09
@GoldbergData Awesome - thank you for making these. These are some nice visualizations.
CamperBot
@camperbot
Nov 09 2017 21:09
quincylarson sends brownie points to @goldbergdata :sparkles: :thumbsup: :sparkles:
:cookie: 121 | @goldbergdata |http://www.freecodecamp.com/goldbergdata
Quincy Larson
@QuincyLarson
Nov 09 2017 21:09
@GoldbergData I'm going to start pulling together the article and I'll let you know if there are any other big questions that come to mind.
James Riall
@jriall
Nov 09 2017 21:16
hi all, please could I get an API key for the open-api?
It's pointing me here.
Josh Goldberg
@GoldbergData
Nov 09 2017 21:30

@QuincyLarson you’re welcome. Sorry I work full time and do this in my free time (although my full time work overlaps with this field exactly). I’m going to make some more visualizations tonight. I can export any graphs if you like and email them in high resolution. If you have a preferred dimension for the image let me know.

When is your estimated publish date? I can throw any headers or color schemes on the visualizations to match your article. Or footnotes.

Quincy Larson
@QuincyLarson
Nov 09 2017 22:03
@GoldbergData Awesome - I would also recommend posting these directly to Kaggle: https://www.kaggle.com/kaggle/kaggle-survey-2017
@GoldbergData realistically I may not publish until Sunday night.
Josh Goldberg
@GoldbergData
Nov 09 2017 22:14
Yeah I probably will @QuincyLarson. It will provide you an easy link for reference in the article. I also have it on GitHub. Feel free to message me as you work through the article if some visualizations or analyses come to mind. You can offload that work to me so you can focus on writing.
In the meantime I’ll just try to produce visualizations that make sense, but don’t necessarily follow a narrative. It’ll give you creative writing freedom to pick and choose as you like.
@QuincyLarson