These are chat archives for FreeCodeCamp/DataScience

10th
May 2017
Alice Jiang
@becausealice2
May 10 2017 01:25
can someone tell me if this Python code makes sense to them? To my eyes it's adding the word regardless of meeting the conditions or not...
for review in reviews:
    for word in review.split(' '):
        if total_counts[word] > min_count:
            if word in pos_neg_ratios.keys():
                if pos_neg_ratios[word] >= polarity_cutoff or pos_neg_ratios[word] <= -polarity_cutoff:
                    review_vocab.add(word)
        else:
            review_vocab.add(word)
forgive the indenting whatsit, I gave up trying to get Gitter's formatting perfect :angry:
evaristoc
@evaristoc
May 10 2017 06:45
People:
Look at what Google is up to. This could invite riots in some parts of the world...
https://www.technologyreview.com/s/604303/apple-picking-robot-prepares-to-compete-for-farm-jobs/
evaristoc
@evaristoc
May 10 2017 08:15
Keeping this person around: Peter Cooper.
http://peterc.org/
evaristoc
@evaristoc
May 10 2017 09:13
@jayvora92 try to get some ideas from what was done last year? I shared some examples and ideas in the following post:
https://gitter.im/FreeCodeCamp/DataScience?at=590c803e9d90dc7a1c4ba530
@becausealice2 Nice posting!! Thanks!
CamperBot
@camperbot
May 10 2017 09:14
:warning: @becausealice2's account is not linked with freeCodeCamp. Please visit the settings and link your GitHub account.
evaristoc sends brownie points to @becausealice2 :sparkles: :thumbsup: :sparkles:
evaristoc
@evaristoc
May 10 2017 09:17
@becausealice2 about your Python code:
review_vocab is a set, so it will hold each word only once, with no duplicates.
Check your else condition. Not sure what you are trying to do with adding those words.
Also try to verify the condition for the third nested if.
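A minimal sketch of the snippet as it was posted, with the else paired with the min_count check. The toy reviews, ratios, and thresholds are made up for illustration; only the variable names come from the snippet. Under that indentation, every word that does not exceed min_count is added unconditionally, and because review_vocab is a set each word is stored once.

```python
from collections import Counter

# Hypothetical toy data; only the variable names come from the snippet above.
reviews = ["great great great movie", "awful awful awful movie plot"]
total_counts = Counter(word for review in reviews for word in review.split(" "))
pos_neg_ratios = {"great": 2.0, "awful": -2.0, "movie": 0.0}
min_count = 2
polarity_cutoff = 1.0

review_vocab = set()
for review in reviews:
    for word in review.split(" "):
        if total_counts[word] > min_count:
            if word in pos_neg_ratios:
                # abs(x) >= cutoff is the same test as
                # x >= cutoff or x <= -cutoff.
                if abs(pos_neg_ratios[word]) >= polarity_cutoff:
                    review_vocab.add(word)
        else:
            # As posted, this else pairs with the min_count check,
            # so words at or below min_count are added unconditionally.
            review_vocab.add(word)

print(sorted(review_vocab))  # ['awful', 'great', 'movie', 'plot']
```

Here the neutral word "movie" and the rare word "plot" both end up in the vocabulary, which would explain why it looks as if words are added regardless of the conditions.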
Honman Yau
@honmanyau
May 10 2017 09:37
@becausealice2 @evaristoc I think the code seems fine, and I don't think it's adding every word in review to review_vocab. I think a word would only be added to review_vocab directly if its number of occurrences in review is less than min_count. Guessing without context, I think a word would otherwise only be added to review_vocab through the conditions nested under the first if, that is, when its frequency of appearance (or perhaps distribution) in review is outside of the expected range (polarity_cutoff).
Honman Yau
@honmanyau
May 10 2017 11:01

I got really distracted when I was about to start the D3 force-directed graph project and ended up doing this: http://codepen.io/honmanyau/full/NjwQbq

Would this be along the lines of something useful? :) If so, and if the derived dataset seems useful, I was wondering if someone could babysit me through how to go about sharing it on GitHub once!

Alice Jiang
@becausealice2
May 10 2017 11:48
@honmanyau @evaristoc I suppose a little context might help. The code is supposed to only add a word if its total count is greater than the min_count and the absolute value of its pos_neg_ratio is greater than the polarity_cutoff
I'll have a look again at the solution code, the else condition may belong to the second nested if. I swear it was the first, though...
Rishabh Chakrabarti
@bassdeveloper
May 10 2017 12:10
@becausealice2 forget it, it's an old message
evaristoc
@evaristoc
May 10 2017 15:17
@honmanyau Great!! I suspect what it is but can you explain? I think we can use it for something if it is what I think it is.
@becausealice2 I think the else condition is not required then.
@honmanyau It is something else. What are you basing the links between nodes on? Is that a PageRank?
I like how it looks, but I would like to know what it represents...
evaristoc
@evaristoc
May 10 2017 15:23

OK!!

The data for this force-directed graph is derived from the 2017 New Coder Survey conducted by Free Code Camp (GitHub repository). This particular derived dataset shows in which other communities 13803 Campers are also involved in.

Sorry! I was opening the editor in a small window! It is GREAT! @honmanyau

Alice Jiang
@becausealice2
May 10 2017 16:24
Okay so an update on the code snippet from before, I might be misreading the instructions because the comment before the code snippet in the solution says:
## New for Project 6: only add words that occur at least min_count times
#                     and for words with pos/neg ratios, only add words
#                     that meet the polarity_cutoff
Alice Jiang
@becausealice2
May 10 2017 16:30
but the instructions say
Change so words are only added to the vocabulary if they occur in the vocabulary more than min_count times.
Change so words are only added to the vocabulary if the absolute value of their positive-to-negative ratio is at least polarity_cutoff
Before I raise an issue, would you get the same train of thought as the comment with those instructions?
evaristoc
@evaristoc
May 10 2017 17:32
@becausealice2
"at least" means equal to or more, so "more than" is incorrect
polarity_cutoff seems to be a concept; I think this is correct
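To make the "at least" versus "more than" distinction concrete, here is a small sketch. The counts and the min_count value are invented for illustration; the two comparisons only differ for words whose count equals min_count exactly.

```python
# Hypothetical counts, chosen so one word sits exactly on the threshold.
total_counts = {"good": 5, "fine": 3, "meh": 1}
min_count = 3

# "more than min_count": strict comparison
more_than = {word for word, count in total_counts.items() if count > min_count}

# "at least min_count": also keeps words with count == min_count
at_least = {word for word, count in total_counts.items() if count >= min_count}

print(sorted(more_than))  # ['good']
print(sorted(at_least))   # ['fine', 'good']
```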
Alice Jiang
@becausealice2
May 10 2017 17:36
polarity cutoff is just to ensure that we're not counting neutral words as either positive or negative.
My confusion is because the instructions feel like they're implying the code should only include words that meet both conditions but the solution code feels closer to one or the other
evaristoc
@evaristoc
May 10 2017 17:43
@becausealice2
It is probably like this: the first if keeps all words that pass min_count, BUT of those that pass min_count AND have a pos/neg ratio, only the ones that meet the cutoff are kept.
Your code above doesn't seem to reflect that.
One of the issues with Python: indentation. Check your else
@becausealice2 ^^^
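The reading described above can be sketched as follows, assuming the else is meant to pair with the pos_neg_ratios membership check rather than with the min_count check. build_vocab is a hypothetical helper and the data is made up; this is an illustration of that reading, not the course's solution code.

```python
from collections import Counter


def build_vocab(reviews, pos_neg_ratios, min_count, polarity_cutoff):
    """Hypothetical helper: keep words that pass the min_count filter and,
    when a pos/neg ratio exists for the word, the polarity_cutoff filter too."""
    total_counts = Counter(w for r in reviews for w in r.split(" "))
    review_vocab = set()
    for review in reviews:
        for word in review.split(" "):
            if total_counts[word] > min_count:
                if word in pos_neg_ratios:
                    if abs(pos_neg_ratios[word]) >= polarity_cutoff:
                        review_vocab.add(word)
                else:
                    # No ratio known for this word: min_count alone decides.
                    review_vocab.add(word)
    return review_vocab


# Made-up data for illustration.
reviews = [
    "great great great movie the the the",
    "awful awful awful movie movie plot",
]
pos_neg_ratios = {"great": 2.0, "awful": -2.0, "movie": 0.1}

vocab = build_vocab(reviews, pos_neg_ratios, min_count=2, polarity_cutoff=1.0)
print(sorted(vocab))  # ['awful', 'great', 'the']
```

With the else moved in one level, "plot" is dropped for being too rare and "movie" is dropped for being too neutral, while "the" is kept because it has no ratio to test.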
evaristoc
@evaristoc
May 10 2017 17:49
@nmbrgts interesting topic, your blog...
Alice Jiang
@becausealice2
May 10 2017 17:55
I checked the indentation, and it was off, but my point stands
the instructions read, to me, as the conditions BOTH have to be met, where the solution code is only looking for min_count as a required condition
and the polarity_cutoff as optional pending a pos_neg_ratio
WAIT
I just answered my own question
thanks guys xD
evaristoc
@evaristoc
May 10 2017 17:58
:+1: !
evaristoc
@evaristoc
May 10 2017 18:05

People

About getting a degree as a Data Scientist. This is a discussion. I think having a scientific way of thinking is required, and that gets trained at uni, not always in MOOCs. The theoretical background could also be required.
https://www.forbes.com/sites/quora/2017/05/08/do-i-need-an-advanced-degree-to-become-a-data-scientist/

My doubt about studying data science without a good uni background is that it could suggest that knowing only the methods is enough. What many of those courses don't tell you is when your results are misleading. Critical thinking is not learned in MOOCs, IMO. You need a bit more than that.

evaristoc
@evaristoc
May 10 2017 18:11

(Sorry, checking my Twitter, which I had abandoned...)

People

Another link:
https://whatsthebigdata.com/2016/04/22/top-10-data-science-influencers-on-twitter/

Ghost
@ghost~580ed9c0d73408ce4f309ef0
May 10 2017 18:57
ello
Jay Vora
@jayvora92
May 10 2017 19:41
@evaristoc and @erictleung thanks for the info. I will dive into the data and get some insights
CamperBot
@camperbot
May 10 2017 19:41
jayvora92 sends brownie points to @evaristoc and @erictleung :sparkles: :thumbsup: :sparkles:
:cookie: 349 | @evaristoc |http://www.freecodecamp.com/evaristoc
:cookie: 501 | @erictleung |http://www.freecodecamp.com/erictleung
Freddie O
@FreddieFO
May 10 2017 20:26
Hi guys, I have a question for you all, and I would appreciate complete honesty. I have no academic background in computer science/programming or statistics. However, I’m fascinated by the results of complex analysis by data scientists, so I want to enter this field.
Currently I’m learning to program with Python and learning the fundamentals of CS. I think learning statistics and other data science related topics at this stage will be overwhelming.
Is it better to first focus on becoming a software engineer and then gradually begin to learn core data science topics in order to become a data scientist?
Alice Jiang
@becausealice2
May 10 2017 20:53
@FreddieFO Personally, I feel that Data Science is quickly becoming a large enough industry that saying you want to become a data scientist could mean many things, and how best to approach your study to prepare yourself to enter the field is going to depend entirely on what you're hoping to do. If you want to approach it from a programmer angle, then absolutely go for it, just don't underestimate the importance of having a solid grasp of the other aspects of DS, even if it's not an especially advanced comprehension.
Tyler
@nmbrgts
May 10 2017 22:11
@evaristoc thanks, I wanted to do more posts dealing with misleading popular statistics, but haven't gotten around to it. any feedback is appreciated. i feel that my writing isn't as clear as it could be..
CamperBot
@camperbot
May 10 2017 22:11
nmbrgts sends brownie points to @evaristoc :sparkles: :thumbsup: :sparkles:
:cookie: 350 | @evaristoc |http://www.freecodecamp.com/evaristoc
Tyler
@nmbrgts
May 10 2017 22:21
whoops! i didn't realize how brownie points work.. i don't post much.