These are chat archives for FreeCodeCamp/DataScience

17th
Jul 2016
Quincy Larson
@QuincyLarson
Jul 17 2016 00:36 UTC
@evaristoc @alicejiang1 thanks! I've been working really hard on Medium. It has been an extremely rewarding expenditure of effort.
CamperBot
@camperbot
Jul 17 2016 00:36 UTC
:cookie: 300 | @evaristoc |http://www.freecodecamp.com/evaristoc
quincylarson sends brownie points to @evaristoc and @alicejiang1 :sparkles: :thumbsup: :sparkles:
:warning: could not find receiver for alicejiang1
Quincy Larson
@QuincyLarson
Jul 17 2016 00:38 UTC
@evaristoc I can give you specific stats on Medium and Reddit - just let me know what you want. Unfortunately, the data on Medium isn't very granular. Also, data about Medium would be publication-level - not article level.
evaristoc
@evaristoc
Jul 17 2016 11:30 UTC
Thanks, @QuincyLarson ! I think it will be fine!
CamperBot
@camperbot
Jul 17 2016 11:30 UTC
:star2: 1181 | @quincylarson |http://www.freecodecamp.com/quincylarson
evaristoc sends brownie points to @quincylarson :sparkles: :thumbsup: :sparkles:
evaristoc
@evaristoc
Jul 17 2016 11:48 UTC

@samosale: ready for the billboard? I processed data on the main chatroom's posts from midnight Dec 31 2014 to midnight Jun 30 2016 (8 GB), European time zone. The data excludes camperbot.

The processing produced a 27 MB JSON file. It consisted of the following:

I divided the data into 6-hour time windows and counted the posts each camper sent during each window. We now have 2069 data points between the start and end of the collection period, with exceptional cases where NO conversations occurred (e.g. the interlude when FCC was using Slack). The idea I have in mind is:

  • A billboard of the 5-10 top participants every 6 hours, moving from start to final data collection date.
  • Add another chart under the billboard showing the number of participants and/or posts over the period, highlighting the specific window shown on the billboard at that moment.
  • Better if we add a brush to select specific periods of interest?

The file also includes some extra data, like small avatars and readBy counts per post. Let me know when we can discuss the file and the viz? I will send you the file when we think we are ready.
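
A minimal sketch of the 6-hour aggregation described above, in Python. The input shape (a list of `(username, sent)` pairs), the function names, and the sample posts are assumptions made for illustration; the actual processing script is not shown in this chat. In this sketch, windows without any posts are simply absent from the result.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(hours=6)
START = datetime(2014, 12, 31)  # assumed start of the collection period


def window_index(sent, start=START, window=WINDOW):
    """Return the 0-based index of the 6-hour window that contains `sent`."""
    return (sent - start) // window


def aggregate_posts(posts, excluded=("camperbot",)):
    """Count posts per camper in each 6-hour window.

    `posts` is an iterable of (username, sent) pairs (hypothetical shape).
    Windows with no posts do not appear in the result.
    """
    counts = defaultdict(Counter)
    for username, sent in posts:
        if username.lower() in excluded:
            continue  # bot messages are left out, as in the original data
        counts[window_index(sent)][username] += 1
    return dict(counts)


# Made-up sample posts, only to show the shape of the result
sample_posts = [
    ("evaristoc", datetime(2014, 12, 31, 1, 15)),
    ("evaristoc", datetime(2014, 12, 31, 5, 59)),
    ("samosale", datetime(2014, 12, 31, 6, 0)),
    ("camperbot", datetime(2014, 12, 31, 6, 1)),
    ("samosale", datetime(2015, 1, 1, 0, 30)),
]
windows = aggregate_posts(sample_posts)
```

Each key of `windows` is a window index and each value maps a camper to a post count, which is roughly the per-window structure the billboard would read.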

Aleksandar B.
@aleksandar-b
Jul 17 2016 12:14 UTC
@evaristoc hard challenge... :D Need some time to prepare... but sure
I think 6 hours is too much
Browser graphics can't handle that...but I am not sure
evaristoc
@evaristoc
Jul 17 2016 12:21 UTC
@samosale there are just 2069 intervals of data, each consisting of data aggregated over 6 hours. Aggregate = sum of all posts that occurred during that interval...
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:21 UTC
should we use d3 or pure svg+js? I will try to build a prototype with SVG
Send me the file when you are ready
32mb?!
evaristoc
@evaristoc
Jul 17 2016 12:23 UTC
You said you have some library? Anyway: between plain svg and d3.js, I would advise d3.js, which is a library for handling svg with data...
27 MB
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:23 UTC
How can we handle that?
evaristoc
@evaristoc
Jul 17 2016 12:24 UTC
It is a json file.
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:24 UTC
I don't have my own library, though I have tried to build one
evaristoc
@evaristoc
Jul 17 2016 12:24 UTC
ok...
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:24 UTC
Should we use stream?
evaristoc
@evaristoc
Jul 17 2016 12:24 UTC
stream is better...
or not?
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:24 UTC
idk
we need the data at the start
if we are rendering with d3/svg
evaristoc
@evaristoc
Jul 17 2016 12:25 UTC
Actually... it should be stream + brush...
let's use d3.js... the good thing is that the data is prepared...
I can't pass you the data right now but I will... I will have to go in a few minutes...
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:26 UTC
basically that is a long chart which is drawn right away and then translated over time, with a circle moving along the y data
evaristoc
@evaristoc
Jul 17 2016 12:26 UTC
:+1:
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:28 UTC
I prefer pure svg, because then we can use just the new d3 modules, not all of d3
but we will see
evaristoc
@evaristoc
Jul 17 2016 12:28 UTC
easy for 2069 points... but there is some data processing: each interval consists of between 0 and 300 observations (the campers active in that interval). We need to extract those with the highest participation in each interval: reverse-sort them by number of posts and pick the first 5-10...
Ah... about d3js 4: but then I am not sure if I can help much... I haven't tried the new version yet...
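
The top 5-10 selection described above could be sketched like this in Python. The function name and the sample counts are hypothetical, and the input is assumed to be a plain mapping of camper to post count for one interval.

```python
def top_participants(window_counts, n=10):
    """Return the `n` campers with the most posts in one interval.

    `window_counts` maps camper name -> number of posts (assumed shape).
    Campers are reverse-sorted by post count; ties keep their input order
    because Python's sort is stable.
    """
    ranked = sorted(window_counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up counts for a single 6-hour interval
interval = {"alice": 12, "bob": 40, "carol": 7, "dave": 40, "erin": 1}
top3 = top_participants(interval, n=3)
```

Doing this step before the file reaches the browser would leave at most 10 rows per interval for the billboard, instead of up to 300.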
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:31 UTC
Most of my work has been in pure SVG, but it is not a problem for me to code in D3, I will manage somehow
evaristoc
@evaristoc
Jul 17 2016 12:31 UTC
Ok... let's do it in svg + d3 v4!!
Sounds like a good idea...
I think just the billboard for the first part of the project and for the article will be fine...
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:32 UTC
ok
evaristoc
@evaristoc
Jul 17 2016 12:32 UTC
Then we can add more, and update the article eventually...
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:33 UTC
sure
evaristoc
@evaristoc
Jul 17 2016 12:35 UTC
@samosale
I will try to hand you the data tonight? There is not much data to show: 2069 points * 5-10 for the billboard, 2069 points for any other chart under the billboard...
OBS: if we face performance issues we could shrink the data collection period and start the billboard from, let's say, 1 Jun 2015 or even later! The idea is just to produce a demo project that works well enough...
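
A small sketch of the fallback mentioned above, trimming the dataset to a later start date. It assumes the windows are keyed by their start `datetime`, which is an illustrative choice and not necessarily how the JSON file is organised; the helper names and sample values are made up.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=6)


def trim_windows(windows, start):
    """Keep only the windows that begin on or after `start`."""
    return {begin: data for begin, data in windows.items() if begin >= start}


def max_windows(start, end, window=WINDOW):
    """Upper bound on the number of whole windows between `start` and `end`."""
    return (end - start) // window


# Made-up windows around the proposed cut-off date
sample = {
    datetime(2015, 5, 31, 18): {"alice": 3},
    datetime(2015, 6, 1, 0): {"bob": 5},
    datetime(2015, 6, 1, 6): {"carol": 1},
}
new_start = datetime(2015, 6, 1)
trimmed = trim_windows(sample, new_start)
remaining = max_windows(new_start, datetime(2016, 6, 30))
```

`remaining` is only an upper bound on what the billboard would have to step through, since intervals without any conversation may be missing from the data.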
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:37 UTC
send me the data when you are done :+1:
evaristoc
@evaristoc
Jul 17 2016 12:37 UTC
:+1:
@samosale Send me a PM with an email or similar?
See you!
Aleksandar B.
@aleksandar-b
Jul 17 2016 12:40 UTC
whatever you want
:stuck_out_tongue_winking_eye:
Quincy Larson
@QuincyLarson
Jul 17 2016 19:30 UTC
@evaristoc @samosale that sounds like a fun project. I think your approach and methodology would make a great Medium post when you're done.
@evaristoc I haven't forgotten about your pending Medium article. Let me know when you'd like to help me finalize and publish it.
evaristoc
@evaristoc
Jul 17 2016 19:32 UTC
@QuincyLarson :+1: I will find some time this week and send you a PM?