These are chat archives for FreeCodeCamp/DataScience

16th
Sep 2015
C. Choi
@coding-choi
Sep 16 2015 07:57
I think you ultimately want to measure the variables that contribute to what you define as a "successful" user of your website.
Is it people who successfully complete the course? who ultimately find jobs? or just "customer" satisfaction?
I minored in Statistical analysis while I was doing my PhD in Psychology..... so I'm slightly interested in any sort of data science, although I still cringe at anything more than Regression analysis and ANOVA.
C. Choi
@coding-choi
Sep 16 2015 08:04
Fun fact - job satisfaction is not correlated with job performance. That is to say, the unhappiest people make the best workers.
C. Choi
@coding-choi
Sep 16 2015 08:14
Why is there so much interest in passive users? I think that says more about inherent personality differences than any real measure of what you consider successful users of your website. I am usually a lurker, but today I drank wayyyy too much coffee. But I'd consider myself a passive user.
Yikes, I just realized this may be a seriously involved contributor room after scrolling through previous messages.... uhm. Sorry! Back to lurk status.
evaristoc
@evaristoc
Sep 16 2015 08:19
@coding-choi hehehe! unhappiest people fact! well: "working" is not always the most "nice thing to do" for everyone...
Welcome to the room!
C. Choi
@coding-choi
Sep 16 2015 08:22
Oh, also, my work was in I/O psychology - the study of people at work. I ultimately didn't finish so I'm no expert!
evaristoc
@evaristoc
Sep 16 2015 08:24
@coding-choi passive use could imply using Gitter as a source of info, which is believed to be relevant for approximating the impact of Gitter on student development.
Absolutely not! We are encouraging discussion as well as activities. You are more than welcome to participate. You can decide the level of involvement.
C. Choi
@coding-choi
Sep 16 2015 08:24
I just stumbled into this room because I was curious about what types of statistical methods were used in different fields. And how it factored into programming.
evaristoc
@evaristoc
Sep 16 2015 08:25
I think you are in the right place.
C. Choi
@coding-choi
Sep 16 2015 08:26
And yeah the one big mystery in IO psychology at least when I was in school was how there was no correlation between any sort of measures of happiness, satisfaction, and objective measures of performance
Kinda depressing.
evaristoc
@evaristoc
Sep 16 2015 08:27
We can discuss those things more formally along the way. I have some idea about statistics and machine learning, and I am taking the opportunity to go further with that knowledge, trying to combine that with what exists in JavaScript.
C. Choi
@coding-choi
Sep 16 2015 08:29
Hmm. I am a complete newbie when it comes to programming, but I am decent with statistics programs like SPSS. But the last time I actually did any sort of data analysis was five years ago, and it was definitely not as interesting as what I've been learning here.
I was wondering though - have you found any correlation between completion of your course and people who actually code "socially"? I.e., pair programming?
evaristoc
@evaristoc
Sep 16 2015 08:35

Too many variables when doing those analyses in psychology... difficult... it would be the same here: even if it is not behavioural analysis, part of the population is not visible, so we have to limit the scope to what we see.

So far I am working on the descriptives...

C. Choi
@coding-choi
Sep 16 2015 08:35
I also kinda understand why being "passive" would be negatively viewed though - I mean not being able to work with others is probably going to factor in negatively further down the line in your program.....
evaristoc
@evaristoc
Sep 16 2015 08:35
SPSS was what I was using the most... hate it...
C. Choi
@coding-choi
Sep 16 2015 08:35
YES. hate it
evaristoc
@evaristoc
Sep 16 2015 08:37
@coding-choi about the correlation: we are not there yet: we are waiting for the data about course progress to be available. But I expect to see a positive correlation for sure...
C. Choi
@coding-choi
Sep 16 2015 08:38
Can I ask - do you see a lot of people dropping out at certain points in your program? And if so, when?
evaristoc
@evaristoc
Sep 16 2015 08:38
We should think about how to compare that progress to affirm that pairing is better than no pairing... I don't think we can make "experiments" here...
C. Choi
@coding-choi
Sep 16 2015 08:39
Yeah, I am curious - what kind of data are you actually able to get your hands on?
I just read a few pages up that you have tons of data with gitter...
evaristoc
@evaristoc
Sep 16 2015 08:43
@coding-choi that is more my area: attrition analysis, etc. I am interested in that too, but by looking only at Gitter we can't draw any conclusions. The FCC challenges could require different approaches: some of them (like the ziplines) are more self-paced and "free". You can show progress, but they are more a matter of personal taste. Not like the bonfires, which are more specific and somewhat standard, so they can be discussed. People will visit Gitter for different motives as well... Therefore, Gitter activity does not fully correlate with progress, and it cannot be considered an indication of engagement. We need more data.
Yes, @coding-choi: we are managing to connect to the API and therefore getting data from the rooms.
C. Choi
@coding-choi
Sep 16 2015 08:45
Sometimes with the Bonfires - I can see how participating in the chat rooms or searching google might make it go faster - but I also wonder if that's not "cheating" - I'm curious if you have any data on time spent on each section.
evaristoc
@evaristoc
Sep 16 2015 08:48
So far the scope of the analysis is about Gitter, and in particular I would like to find out more about the solving process. At the moment I am looking at past data (before the bot) to see how people changed after the implementation... but the analysis will rely on what I can do about detecting some "dialogue acts".
C. Choi
@coding-choi
Sep 16 2015 08:49
......what is the "bot"?
evaristoc
@evaristoc
Sep 16 2015 08:51
About bonfires: personally, I see that as cheating if you just copy/paste without trying to understand. Having that source of info and using it is actually advisable in my view: there is no point in re-inventing the wheel, especially if you are new. Re-engineering the problem to suit your needs and reverse-engineering a solution is a way of learning too.
CamperBot
@camperbot
Sep 16 2015 08:51
you need to ask about @someone!
evaristoc
@evaristoc
Sep 16 2015 08:52
Hello, camperbot!
C. Choi
@coding-choi
Sep 16 2015 08:52
.....Are you human?
After I solve it, I do wonder if there isn't a way to "check" to see how it's more efficiently done, by someone with more proficiency than me.
evaristoc
@evaristoc
Sep 16 2015 08:53
It is a project carried out by dcsan (who can be considered a specialist in the area) to identify some key words in the messages and offer automated help.
C. Choi
@coding-choi
Sep 16 2015 08:53
Ah.
evaristoc
@evaristoc
Sep 16 2015 08:56
Efficiency is very important but not at this stage, especially if you are a hobbyist. Yes: there should be other ways that make that more efficient, for sure... One place where you can find some ideas on how to re-engineer your code is the CodeReview room. It hasn't been active for a while though...
C. Choi
@coding-choi
Sep 16 2015 08:59
Hmmm well, before I came on your website, I went through Codecademy's JavaScript lessons, and the one thing I liked was that there was a button linking directly to a forum for that particular lesson. I just liked to know how other people were solving the same challenges. I guess it's a bit difficult to do that in a live chat room. Do you have something like that - or plan something like that?
evaristoc
@evaristoc
Sep 16 2015 08:59

Anyway, @coding-choi: would you like to participate in one of our projects? We are looking for people who can help us with the design of a small online survey. Some analysis of the data would also be required.

We can work together on that. We are trying to use JavaScript for the analyses but for now it is not compulsory. I know SPSS so we can go with that too; eventually you could get a grasp of other tools.

I know some statistics so we can discuss the analytical methods eventually. It should be simple for now...
C. Choi
@coding-choi
Sep 16 2015 09:02
Hmmm, I don't know if I can get my hands on the SPSS program --- hold on for one sec
evaristoc
@evaristoc
Sep 16 2015 09:02
I am suggesting to work on projects based on achievable weekly targets, not compulsory as I assume that this is just "for the lol".
C. Choi
@coding-choi
Sep 16 2015 09:03
Ah! Minitab.... have you ever used it?
evaristoc
@evaristoc
Sep 16 2015 09:03
Never...
But if you can... I would use R or Python or even MATLAB instead, those I know...
C. Choi
@coding-choi
Sep 16 2015 09:04
Oh Minitab was the statistics program I used before they forced SPSS on me. I found it a lot more user friendly
evaristoc
@evaristoc
Sep 16 2015 09:04
But Minitab would be ok... we just have to design the questionnaire and some "research questions".
C. Choi
@coding-choi
Sep 16 2015 09:05
Oh. I have some experience with surveys and questionnaires - but I honestly don't think area-wise I'd be the best person to help with the content of that.
evaristoc
@evaristoc
Sep 16 2015 09:05
You will be in a team that at the moment is made up of 2 people (we are waiting for the response from someone else to see if we are actually three). So we would be 3-4 people working on that project.
C. Choi
@coding-choi
Sep 16 2015 09:06
I don't mind easy data entry and data analysis
...but I also don't want to make a commitment and disappoint y'all.
evaristoc
@evaristoc
Sep 16 2015 09:07
It should be easy, don't worry (again, it should be small). Anyway, I have learned that where there is no information, any information is welcome. Currently there are lots of gaps in the info, so...
C. Choi
@coding-choi
Sep 16 2015 09:07
I don't know python.....
evaristoc
@evaristoc
Sep 16 2015 09:08
No data entry: everything should be online.
Or computer-based. Data cleaning perhaps though...
From what I have seen, there shouldn't be more than 5000 records. That is actually an optimistic target...
C. Choi
@coding-choi
Sep 16 2015 09:09
Also - I'd like to get through more of your program...
evaristoc
@evaristoc
Sep 16 2015 09:10
My program? I am a student, just like you!
C. Choi
@coding-choi
Sep 16 2015 09:10
Really?
How far along are you?
I've been trying to schedule 10 minutes a day since....May, but I go through spurts
evaristoc
@evaristoc
Sep 16 2015 09:11
Yes! But you will find there are lots of people with other professional backgrounds taking this program. That is what makes it so rich.
C. Choi
@coding-choi
Sep 16 2015 09:12
I'm more interested in learning python, but everyone suggested javascript first
Anyways - can I take a look at the type of data you have first?
evaristoc
@evaristoc
Sep 16 2015 09:13

I have been working on one of the basejumps, but I am jumping in between... I am studying more about this outside this camp (books, other courses, etc). So I am not going so fast...

And working with andela-bfowotade on one of the projects here. Very nice!

C. Choi
@coding-choi
Sep 16 2015 09:13
I don't know what I can offer - I actually didn't finish my PhD, although I did finish my statistics coursework before I dropped out.
I can do regression analysis best, simple correlation better.
evaristoc
@evaristoc
Sep 16 2015 09:15
For that project, advising and eventually analysing some data (corr, possibly a logistic regression or a clustering to check for groups). For DataScience room, just come and check! I am preparing a weekly digest for everyone interested. I will put you in the main list from now on.
C. Choi
@coding-choi
Sep 16 2015 09:16
Okay, thanks!
CamperBot
@camperbot
Sep 16 2015 09:16
if you want to thank someone, put an @ before their name!
C. Choi
@coding-choi
Sep 16 2015 09:16
.....shoo camperbot
evaristoc
@evaristoc
Sep 16 2015 09:16
Well, that's it! Just join and check. I will send you a PM soon with the details about where we are discussing the project.
C. Choi
@coding-choi
Sep 16 2015 09:16
Uhm. So do a bunch of people just invisibly hang around these chat rooms?
I actually don't even know how to exit out of a room.
evaristoc
@evaristoc
Sep 16 2015 09:18
If you want to unlist a room from the menu on your left, just hover over it: an "x" will show on the right side of the button. Press the "x": it will ask whether to hide or leave the room.
C. Choi
@coding-choi
Sep 16 2015 09:18
Ah.
evaristoc
@evaristoc
Sep 16 2015 09:19
But don't hide/leave this one!
C. Choi
@coding-choi
Sep 16 2015 09:19
Oh no I won't.
evaristoc
@evaristoc
Sep 16 2015 09:19
:)
C. Choi
@coding-choi
Sep 16 2015 09:19
But I just went chatroom surfing like crazy
Which ones are the main ones?
.....I was actually looking for a room for people who have reallllly stupid and elementary questions
Like why do some sites suggest you write variables likeThis - while others don't care.
Or what the order is in terms of learning programming languages?
I hate hate CSS. But people seem to say it's the first one you should learn.
I really want to learn python.
C. Choi
@coding-choi
Sep 16 2015 09:26
Anyways, thank you for letting me rant a little! I will be back tomorrow. Let me know how things are progressing!
CamperBot
@camperbot
Sep 16 2015 09:26
if you want to thank someone, put an @ before their name!
evaristoc
@evaristoc
Sep 16 2015 09:30
@coding-choi: this is not the room for elementary questions, I am afraid. Here in FCC we say there are no stupid questions. For elementary questions like those, the best place I know is the main FCC room. And you don't need to visit this room unless you see some activity (the button gets a green signal and goes first in your menu list). I will "call" you (then it will be marked with an orange indicator).
evaristoc
@evaristoc
Sep 16 2015 10:09

Hi people! (@dcsan and @SaintPeter):
Have to correct:

  • 20 days of no activity to consider them inactive, not 15 days
  • more than 5 days to consider them frequent

Sorry for that...
For some code and for results, check:
https://github.com/evaristoc/fccgitterDataScience

evaristoc
@evaristoc
Sep 16 2015 10:14
Those numbers above are a bit arbitrary (obs: they always are a bit arbitrary, treated as conventions - you should find what works better for your case...)
Probably the more than 5 is a bit harsh... but the 20 days is probably ok...
If we loosen the specification for considering someone as frequent a bit, we would get more frequent ones. But again: it depends on what you want to achieve and what the objectives are.
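A minimal sketch of how the two thresholds above could be applied to one user, not the code in the linked repository. It assumes that "more than 5 days" means more than 5 distinct days with at least one message, and that a user's activity is available as a list of dates; the function name `classify_user` and both constants are hypothetical.

```python
from datetime import date

# Thresholds taken from the correction above; both are conventions, not fixed rules.
INACTIVE_AFTER_DAYS = 20  # 20 days without activity -> inactive
FREQUENT_MIN_DAYS = 5     # more than 5 distinct active days -> frequent (assumed reading)


def classify_user(active_dates, today):
    """Return (is_active, is_frequent) for one user.

    active_dates: iterable of datetime.date on which the user posted.
    today: the date the classification is made on.
    """
    distinct_days = set(active_dates)
    if not distinct_days:
        return (False, False)
    days_since_last = (today - max(distinct_days)).days
    is_active = days_since_last < INACTIVE_AFTER_DAYS
    is_frequent = len(distinct_days) > FREQUENT_MIN_DAYS
    return (is_active, is_frequent)


if __name__ == "__main__":
    today = date(2015, 9, 16)
    posted = [date(2015, 9, d) for d in range(1, 8)]  # 7 distinct days
    print(classify_user(posted, today))
```

Changing either constant shows directly how many users move between groups, which is one way to check how harsh a given threshold is.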
evaristoc
@evaristoc
Sep 16 2015 10:20
This was just an attempt to make comparisons. I will keep the concept for future comparisons while nothing else is available...
Rex Schrader
@SaintPeter
Sep 16 2015 15:50
@evaristoc What might be useful would be to examine the entire body of historical data and determine if a person takes a break of X days, how likely are they to come back? Essentially, you can empirically determine if 15 or 20 days (or some other number) is a good assumption. You can say "For people who were inactive for X days, Y% came back?" You don't have to guess when you have data.
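The check suggested above can be sketched roughly as follows. This is only an illustration under assumptions: activity is taken to be a mapping from user to the dates they posted, a break still open at the end of the observation window is counted as "did not come back", and the name `return_rate` is hypothetical.

```python
from datetime import date


def return_rate(activity_by_user, gap_days, observed_until):
    """Share of breaks lasting at least gap_days that ended with a return.

    activity_by_user: dict mapping a user name to a list of datetime.date.
    observed_until: last date covered by the data.
    Returns None when no break of that length was observed.
    """
    returned = 0
    not_returned = 0
    for days in activity_by_user.values():
        ordered = sorted(set(days))
        if not ordered:
            continue
        # A long gap between two active days means the user came back.
        for previous, following in zip(ordered, ordered[1:]):
            if (following - previous).days >= gap_days:
                returned += 1
        # A long gap after the last active day is still open: no return seen.
        if (observed_until - ordered[-1]).days >= gap_days:
            not_returned += 1
    total = returned + not_returned
    return returned / total if total else None
```

Running it for a range of `gap_days` values (15, 20, 30, ...) would give the "Y% came back" figure for each X, so the inactivity threshold can be chosen from the data. Note that open breaks near the end of the window may still end in a return later, so recent data will understate the rate.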
evaristoc
@evaristoc
Sep 16 2015 16:19
@SaintPeter thanks! Yes, everything is still very much a draft right now. I am using only one room as a sandbox, getting used to the data and finding some initial patterns. Eventually we can refine the information to suit particular interests.
CamperBot
@camperbot
Sep 16 2015 16:19
evaristoc sends brownie points to @saintpeter :sparkles: :thumbsup: :sparkles:
:star: 619 | @saintpeter | http://www.freecodecamp.com/saintpeter