These are chat archives for FreeCodeCamp/DataScience

17th
May 2016
Sam Aiken
@SamAI-Software
May 17 2016 01:33

Advice for the last one? Easy: show the proportions and look for a way to report the n per category somewhere else? Perhaps in the sub-title of the figures...
Proportions instead of total counts for this one will allow cross-comparisons (a simple way to standardise the data...)

Not totally sure what do you mean...

stacked bars proportions 2.png
Did you mean anything like that?
@evaristoc
evaristoc
@evaristoc
May 17 2016 08:23
@SamAI-Software yep... but without the percentage labels inside the boxes: that information is unclear. It would be one column per HoursLearning per softwareYesNo. You have to provide the sub-sample n per segment somewhere to indicate that some of them didn't get big samples (e.g. softwareYesNo = Yes, HoursLearning < 11h). Be aware that this includes those who are not studying... It would be interesting to evaluate those breaks so they are more characteristic: what is the minimal time of HoursLearning so you can say the person is really having time to study? Or better said: is there a characteristic time that indicates the person is studying one topic per week, several topics per week, or even doing full-time training? However, your breaks are good enough!
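
(A minimal sketch of the proportions-plus-n idea above, in base R. The toy data frame and its `HoursRange` values are assumptions for illustration, not the survey data; only the `softwareYesNo` name comes from the chat.)

```r
# Hypothetical toy data; not the real survey.
survey <- data.frame(
  softwareYesNo = c("Yes", "Yes", "Yes", "No", "No", "No", "No", "No"),
  HoursRange    = c("0-10", "11+", "11+", "0-10", "0-10", "11+", "11+", "11+")
)

# Counts per segment: rows are HoursRange, columns are softwareYesNo.
counts <- table(survey$HoursRange, survey$softwareYesNo)

# Column-wise proportions, so each softwareYesNo group sums to 1
# and groups of different size become comparable.
props <- prop.table(counts, margin = 2)

# Sub-sample size per group, to report next to the chart
# (for example in a subtitle).
n_per_group <- colSums(counts)
subtitle <- paste0(names(n_per_group), ": n = ", n_per_group, collapse = "; ")

print(round(props, 2))
print(subtitle)
```

Reporting `subtitle` next to the proportions lets a reader see which segments rest on small samples.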
Sam Aiken
@SamAI-Software
May 17 2016 08:25

what is the minimal time of HoursLearning so you can say the person is really having time to study?

Yeah, I know this problem. We don't have any minimum value, because some people don't study at all. Excluding them would not be the best idea...

Anyway, working on graphics all day long, hope to show smth interesting in a few hours
@evaristoc btw, do you know how to reverse X axis? Because scale_x_reverse() gives me an error...
evaristoc
@evaristoc
May 17 2016 08:31
@SamAI-Software I am not sure that excluding those who are not studying wouldn't be a good idea...
And "reverse X axis"? Do you mean the order of the numbers? Try sorting in ascending order?
Sam Aiken
@SamAI-Software
May 17 2016 08:31
Here
1.2 IsSoftwareDev by MonthsRange (total programming experience) (720).jpeg
I'm trying to make all my next graphs horizontal with coord_flip(), because it looks better
But it always puts the left bar (usually the biggest one) at the bottom. It's more logical to put the biggest bar on top!
I've spent almost 2 hours trying to fix that lol, but didn't find any solution. Did you have that experience?
evaristoc
@evaristoc
May 17 2016 08:54
@SamAI-Software yes... The graph is drawn by position in the x vector. One easy trick is just reverting the values of the factors. There are other, better tricks but I don't remember them now... Eric should know...
That looks nicer, yes! Wait! Actually... it is not so intuitive... I would stick to vertical for this case...
Anyway: it is also you who would make the decision... I am just suggesting...
evaristoc
@evaristoc
May 17 2016 09:01
@SamAI-Software I think it is y and not x what you want to revert...
I think ggplot works the coord_flip first and then you have to revert y
Man... nice work in general, @SamAI-Software, really... we need to make this available somewhere as soon as possible...
Sam Aiken
@SamAI-Software
May 17 2016 09:05

I think it is y and not x what you want to revert...

Already tried :(

scale_y_reverse.jpeg

we need to make this available somewhere as soon as possible

Wow, wow, easy, easy :)
It's just the beginning! More to come soon.

One easy trick is just reverting the values of the factors.

@evaristoc hmm? Can you give a bit more details? Didn't get it

Google gives me many pages about "Revert a factor to its numeric values", but I guess you mean smth different
Sam Aiken
@SamAI-Software
May 17 2016 09:11

Wait! Actually... it is not so intuitive... I would stick to vertical for this case...

Horizontal graphics would be great for other questions like Job Preference, Role, Bootcamps, etc.
So I'm trying to get them right.

evaristoc
@evaristoc
May 17 2016 09:13
try scale_y_reverse()
Sam Aiken
@SamAI-Software
May 17 2016 09:14

try scale_y_reverse()

Yeah, already did. Graphic above :point_up: May 17, 2016 4:05 PM

But what do you mean by reverting the values of the factor? I might try that
evaristoc
@evaristoc
May 17 2016 09:18

It could be the way you are writing the ggplot expression... anyway...

If you get into the factor instance of the variable, you will see that it is a vector that goes like this:
c("0-11 months", "12-59 months", "60+ months")

The factor values are the indices of that vector. You would have to find a way to revert the order of those factors so ggplot will take the corresponding indices according to what you want...

You can alter that order, I don't remember how...
Sam Aiken
@SamAI-Software
May 17 2016 09:20
Oh! Yeah, I saw smth like that today
evaristoc
@evaristoc
May 17 2016 09:20

Just create a new column with the altered order.

Also you can use a column with altered integer numbers as factor and then use labelling according to the name found in the target column.

But I am sure you can make it work with ggplot too...
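
(A minimal sketch of the "new column with the altered order" idea, assuming the ggplot2 package is installed. The toy data frame and the `MonthsRangeRev` column name are hypothetical. `scale_x_reverse()` applies to continuous scales, which is probably why it errors on a categorical axis.)

```r
library(ggplot2)

# Hypothetical toy data using the category labels quoted above.
df <- data.frame(
  MonthsRange = factor(
    c("0-11 months", "12-59 months", "60+ months"),
    levels = c("0-11 months", "12-59 months", "60+ months")
  ),
  n = c(50, 30, 20)
)

# New column with the level order reversed; the original stays untouched.
df$MonthsRangeRev <- factor(df$MonthsRange,
                            levels = rev(levels(df$MonthsRange)))

# After coord_flip() the first level is drawn at the bottom, so with the
# reversed levels "0-11 months" ends up on top.
p <- ggplot(df, aes(x = MonthsRangeRev, y = n)) +
  geom_col() +
  coord_flip()
print(p)
```

An alternative that leaves the data alone is to keep `x = MonthsRange` and add `scale_x_discrete(limits = rev(levels(df$MonthsRange)))` to the plot.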

Sam Aiken
@SamAI-Software
May 17 2016 09:21
Yeah, will try to reverse manually
evaristoc
@evaristoc
May 17 2016 09:21
It is just about checking how you are nesting your expressions...
Sam Aiken
@SamAI-Software
May 17 2016 09:36
Btw guys, which of these two graphics do you find more intuitive and easier to understand, and which better suits this statement:

About how many hours do you spend learning each week?

  • Almost half (47%) of coders with 5+ years of experience who already work as software developers still spend at least 10 hours each week learning programming.
1.3.1 HoursRange by MonthsRange & IsSoftwareDev (1080).jpeg
1.3.0 IsSoftwareDev by MonthsRange & HoursRange (1080).jpeg
Basically they show the same thing, but in a different way. I already know the data, so I wonder which one you will find more understandable
evaristoc
@evaristoc
May 17 2016 10:14

(@SamAI-Software ... I should agree with you that the first one with joint proportions seems to be more informative... a bit more difficult to follow but more informative...)

The second one suggests that:

  • There are more students not-SoftDev that are studying, than those who started a job as SoftDev. But this is also clear in the first one...
  • The longer the time as not-SoftDev, the more the study dedication time (??? sample size???).
  • The more experienced as SoftDev, the more likely you will be dedicating 0 to 10 hours per week to learning some software. But the second chart is not really giving the right picture, for other comparisons...

(See how the 0 hours becomes relevant: there were people who answered the survey that were not registered in any training. I would try a break rather closer to 0 instead of "0-10"; something like "0-3" or "0-5" hours per week)
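
(A minimal sketch of re-binning the hours with a break closer to 0, using base R `cut()`. The `hours` values and the exact bins "0-3", "4-10", "11+" are assumptions for illustration.)

```r
# Hypothetical hours-per-week answers.
hours <- c(0, 2, 3, 4, 10, 11, 25, 40)

# Right-closed bins: [0, 3], (3, 10], (10, Inf).
# include.lowest = TRUE keeps the 0-hour answers in the first bin.
hours_range <- cut(hours,
                   breaks = c(0, 3, 10, Inf),
                   labels = c("0-3", "4-10", "11+"),
                   include.lowest = TRUE)

print(table(hours_range))
```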

The first chart is more informative indeed... I was trying to suggest ways to standardise the data of SoftDev and not-SoftDev so they would be easily comparable (i.e. transform the data so it assumes the same sample sizes). But it might not work...

By the way: who are those you are counting as not-SoftDev? If they are not working as SoftDev, what are they doing that they were answering this questionnaire??

Answers always open new questions...
evaristoc
@evaristoc
May 17 2016 10:25
@SamAI-Software I think that we need to recreate stories. I am in favour of suggesting 5 big stories to group the different analyses. Think about preparing papers. We need one topic to discuss per paper. The graphs should facilitate the story-telling.

In one section, we would tell the story about Resources and Events, for example. Another would be more about Working as..., another would be General Demographics... something like that...

@erictleung: what do you think about the above?

Sam Aiken
@SamAI-Software
May 17 2016 10:46
I think @erictleung is too busy with the code, he has lots of stuff to do :)

The graphs should facilitate the story-telling.

@evaristoc yeah, that's exactly what I'm trying to do - to tell a story :+1:

  • The longer the time as not-SoftDev, the more the study dedication time (??? sample size???).
That's a good point!

The more experienced as SoftDev, the more likely you will be dedicating 0 to 10 hours per week to learning some software.

I think it's because most experienced respondents (76%) already have a job, so they don't have much time to study and tend to be in the "0-10 hours per week" group
Is that obvious, or should I point it out in the text? I didn't want to point out what looks like a very logical and obvious fact. But if it's not so obvious, I can mention it.

Sam Aiken
@SamAI-Software
May 17 2016 12:26

@krisgesling about your visualizations: we finally have a whole combined dataset (draft) with new variable names, you can grab it here 2016-FCC-New-Coders-Survey-Data.csv

It's not clean yet, but at least it has the column names that will be used in the final release.

Kris Gesling
@krisgesling
May 17 2016 13:23
Cool thanks for the heads up. I made sure to have column names as variables as I figured these would change over time. Hopefully will have the multi-map version done in the next two days
Zac Cassini
@zcassini
May 17 2016 13:25
Packt Publishing Free Book of the day: Python Data Visualization Cookbook
Get it at https://www.packtpub.com/packt/offers/free-learning
Sam Aiken
@SamAI-Software
May 17 2016 13:40

Hopefully will have the multi-map version done in the next two days

@krisgesling :+1: :+1: :+1:

that book is like over 170 usd, i got ch9 of that
evaristoc
@evaristoc
May 17 2016 19:53
@zcassini actually: there will be offers for the whole month!!!
Zac Cassini
@zcassini
May 17 2016 19:55
@evaristoc I think it's every day, not just this month. 2 data sciencey books in a row is cool though.
just checked and I got 2 books last month. I need to remember to check every day.
evaristoc
@evaristoc
May 17 2016 20:30
Yes... this month is the Data Month, though...
Zac Cassini
@zcassini
May 17 2016 20:31
ohh. I wonder what tomorrow's will be.
crossing my fingers that we get something like Getting Started with Julia or Mastering Julia this month.
evaristoc
@evaristoc
May 17 2016 20:38
I think there are not many books written about Julia yet, are there?
Zac Cassini
@zcassini
May 17 2016 20:38
there's a few. Packt has those 2 titles
Julia's site says there are 3 books on Julia, all by Packt. And the Seven More Languages in Seven Weeks book.
evaristoc
@evaristoc
May 17 2016 20:51
Then let's follow... Julia is still in diapers, but better to start watching it now... (someone may see this post and suspect I am doing something perverted...)
Gayathry Dasika
@gayathry2612
May 17 2016 21:20
Hi. I'm new here.
Can someone share the book Practical Data Analysis by Hector Cuesta? I missed the deadline.
I've got the Data Visualization with Python ebook. It's there for another 1 hr for free on the website
Jacob Bogers
@jacobbogers
May 17 2016 21:23
@gayathry2612 you can download most of these books from "library genysis"
@gayathry2612 (removed bootleg link)
Gayathry Dasika
@gayathry2612
May 17 2016 21:26
Thanks Jacob. I'll remember that. :)
Jacob Bogers
@jacobbogers
May 17 2016 21:27
I want some brownie points) thank my alias))
join my group, it's about mathematics and Linear programming in particular
Gayathry Dasika
@gayathry2612
May 17 2016 21:27
@jacobbogers Thanks =)
CamperBot
@camperbot
May 17 2016 21:27
gayathry2612 sends brownie points to @jacobbogers :sparkles: :thumbsup: :sparkles:
:cookie: 331 | @jacobbogers |http://www.freecodecamp.com/jacobbogers
Jacob Bogers
@jacobbogers
May 17 2016 21:28
NUM NUM NUM NUM)
Gayathry Dasika
@gayathry2612
May 17 2016 21:31
I'm just starting off with ML. Currently learning Neural networks on Coursera
Jacob Bogers
@jacobbogers
May 17 2016 21:38
I think it's a bit hyped these days... ML, I don't see any courses offering fundamentals
Gayathry Dasika
@gayathry2612
May 17 2016 21:38
Does anyone here participate in data science competitions at kaggle?
Jacob Bogers
@jacobbogers
May 17 2016 21:39
I might join, I was just notified of kaggle
thanks for the kaggle note, thanks @gayathry2612
CamperBot
@camperbot
May 17 2016 21:39
jacobbogers sends brownie points to @gayathry2612 :sparkles: :thumbsup: :sparkles:
:cookie: 8 | @gayathry2612 |http://www.freecodecamp.com/gayathry2612
Gayathry Dasika
@gayathry2612
May 17 2016 21:40
Packt titles are good ~ Machine Learning with R, Python: Brett Lantz
For fundamentals
I'm learning from there.
Jacob Bogers
@jacobbogers
May 17 2016 21:41
yup, I like packt, got all my nodejs and expressjs books from there
Gayathry Dasika
@gayathry2612
May 17 2016 21:41
That's nice
Jacob Bogers
@jacobbogers
May 17 2016 21:42
for R I used "Advanced R"... I hate that language http://www.amazon.com/Advanced-Chapman-Hall-Hadley-Wickham/dp/1466586966
that book is pretty decent, I might port the R core to node (it's written in C)
Gayathry Dasika
@gayathry2612
May 17 2016 21:43
Oh
Jacob Bogers
@jacobbogers
May 17 2016 21:43
I think that language as a programming language is terrible, but I learned it anyway
it has a rich toolset, because R is open source and has existed for some time
Gayathry Dasika
@gayathry2612
May 17 2016 21:43
it's too early for me to comment, because I haven't practiced much :)
Jacob Bogers
@jacobbogers
May 17 2016 21:44
I signed up at kaggle,..,
Gayathry Dasika
@gayathry2612
May 17 2016 21:44
Great
I signed up long ago. I'm still doing my 101s
@jacobbogers You can also form teams and solve problems together
Jacob Bogers
@jacobbogers
May 17 2016 21:46
hey I was going to suggest that
I did applied physics at www.tudelft.nl speciality theoretical fluid dynamics
so i can unload some mathematics on that stuff
num num num
Gayathry Dasika
@gayathry2612
May 17 2016 21:47
That is cool .
Jacob Bogers
@jacobbogers
May 17 2016 21:47
we can team up @gayathry2612
have some fun
OK, I need to do design work for my "tomato clock project", 3 more projects to go before front-end cert
Gayathry Dasika
@gayathry2612
May 17 2016 21:48
Sure. I'll send details on your math group
Jacob Bogers
@jacobbogers
May 17 2016 21:48
Nice to have met you @gayathry2612 and thanks for the link))) again
CamperBot
@camperbot
May 17 2016 21:48
jacobbogers sends brownie points to @gayathry2612 :sparkles: :thumbsup: :sparkles:
:warning: jacobbogers already gave gayathry2612 points
Jacob Bogers
@jacobbogers
May 17 2016 21:49
please do @gayathry2612
Gayathry Dasika
@gayathry2612
May 17 2016 21:49
Thanks @jacobbogers nice to meet you
CamperBot
@camperbot
May 17 2016 21:49
gayathry2612 sends brownie points to @jacobbogers :sparkles: :thumbsup: :sparkles:
:warning: gayathry2612 already gave jacobbogers points