These are chat archives for FreeCodeCamp/DataScience

Mar 2016
Mar 19 2016 10:21

Hi People:
Started working on the code comparison project.
First I am preparing the data with Python. Here is just one record (source plus word counts):

['function palindrome  (  str  )  { str = str  .  replace  ( /[^\\ w ]|/ g  ,  "" )  str = str  . toLowerCase (  )  compare = str . split ( \'\' )  . reverse (  )  . join ( \'\' )  if  (  str == compare  )  { return true } else { return false } } palindrome ( "eye" ) ',
 defaultdict(<class 'int'>, {'': 22, 'replace': 1, 'w': 1, 'g': 1, '/[^\\': 1, 'palindrome': 2, '{': 3, 'if': 1, 'function': 1, 'reverse': 1, 'compare': 2, '}': 3, '==': 1, "''": 2, ']|/': 1, ')': 8, 'str': 7, '(': 8, 'false': 1, '.': 5, 'toLowerCase': 1, '""': 1, '=': 3, 'else': 1, '"eye"': 1, 'split': 1, ',': 1, 'return': 2, 'join': 1, 'true': 1})]
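
For anyone who wants to reproduce a record like the one above, here is a minimal sketch of the word counting. It is not my exact script: `count_tokens` and `sample` are names made up for illustration. The `'': 22` entry above suggests the tokens were split on a single space; this sketch assumes `split()` with no argument instead, which collapses repeated spaces so the empty token does not appear.

```python
from collections import defaultdict


def count_tokens(source):
    """Count whitespace-separated tokens in a pre-spaced code string."""
    counts = defaultdict(int)
    # split() with no argument collapses runs of whitespace,
    # so no empty-string token is produced.
    for token in source.split():
        counts[token] += 1
    return counts


# Hypothetical fragment in the same pre-spaced style as the record above.
sample = 'if  (  str == compare  )  { return true } else { return false }'
counts = count_tokens(sample)
```

With this fragment, `counts['return']` is 2 and there is no `''` key.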

There are 328 records about palindromes captured from the Torrent Dataset. That is not many. There could be several reasons for that, but I still think my code is not capturing all of them...

I will be converting the data into JSON... the method, as mentioned before, will be SVD. I am still deciding which part of the calculation should be done in Node.js, but some of it should be. I will be using the sylvester library mentioned last week.
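
A rough sketch of what the JSON step could look like, assuming the SVD input is a plain term-count matrix (one row per code record, one column per token). The names `build_term_matrix`, `vocabulary` and `matrix` are hypothetical, and the two sample strings are made up; how the Node.js side reads the payload is left open.

```python
import json
from collections import Counter


def build_term_matrix(sources):
    """Return (vocabulary, rows): one row of token counts per source string."""
    counts = [Counter(src.split()) for src in sources]
    # Sorted vocabulary gives every record the same column order.
    vocabulary = sorted(set().union(*counts))
    # Counter returns 0 for tokens a record does not contain.
    rows = [[c[token] for token in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical records, only to show the shape of the output.
sources = ['return true', 'return false return']
vocabulary, rows = build_term_matrix(sources)
payload = json.dumps({"vocabulary": vocabulary, "matrix": rows})
```

Keeping the vocabulary next to the matrix means whatever runs the SVD can map columns back to tokens.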

It should be online. I might need something like a markdown previewer so people can enter new code. Any help?

Mar 19 2016 10:47
@luishendrix92: I am not sure how much time you have... what do you think about helping me build the user interface? I liked your Solution Getter. I was thinking of something similar, although people should enter the full code and the server should do some calculations to find similarities and render some "recommendations"...
Mar 19 2016 13:59

For those who were maybe interested in my network graph project (above), I didn't upload the project right away because I was unsatisfied with the results.

I tried different things but I couldn't solve the issues I wanted to solve. However, my last solution is still worth sharing, as it partially delivers some features that I wanted to introduce, like highlighting nodes.


The actual solution should also reduce the opacity of the links that are not connected to a selected node after double-clicking on the corresponding point.
Mar 19 2016 14:06

I will try to make the project available for the group before Wednesday next week and see if I can still find a solution to my code.

Instead of commenting the code I will just mention where I found the most challenging issues.

So far, no-one has beaten camperbot's popularity in the main chat over the last 30 minutes to 1 hour...

Last time I checked, ArielLeslie, sludge256, Cerebral and <someone else> seemed to be centres of conversation.

I wonder if, in the absence of leaders, everyone talks to camperbot instead... Perhaps activity matters too (my construction of the network is based on explicit calls by username in the caller's post)
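
To make "explicit calls by username" concrete, here is a minimal sketch of how such a network could be built. It is an illustration, not the code behind my graph: the `MENTION` pattern, the function names and the users alice, bob and carol are all made up, and I am assuming a call looks like `@username`.

```python
import re
from collections import Counter

# Assumed mention format: "@" followed by letters, digits, "_" or "-".
MENTION = re.compile(r'@([A-Za-z0-9_-]+)')


def mention_edges(posts):
    """Count directed (caller, mentioned) edges from (author, text) posts."""
    edges = Counter()
    for author, text in posts:
        for target in MENTION.findall(text):
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges


def in_degree(edges):
    """Total number of calls received by each user."""
    totals = Counter()
    for (_, target), weight in edges.items():
        totals[target] += weight
    return totals


# Hypothetical posts, only to show the mechanics.
posts = [
    ("alice", "@camperbot help"),
    ("bob", "thanks @alice and @camperbot"),
    ("carol", "@camperbot hello"),
]
edges = mention_edges(posts)
popularity = in_degree(edges)
```

In-degree is only one possible measure of being a "centre of conversation"; in this toy data it puts camperbot on top with 3 calls.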

Mar 19 2016 14:16
Checking: NEW CAMPERS... a lot in a row! So it is camperbot's welcome message...
Mar 19 2016 14:21
By the way, @Mius00: you haven't answered me... I think I now have a REALLY interesting project in mind... let me know...
Mar 19 2016 14:52

Mar 19 2016 15:02


I am going to quickly disclose the idea that I will propose to @Mius00 to work on. It is based on a research question that occurred to me yesterday while studying networks. It goes as follows:

As you have maybe realised, FreeCodeCamp doesn't have mentors. Most of the "mentoring" is knowledge transfer between campers with different levels of expertise who participate in the different meeting places. From what I have seen, chatrooms are the first point of knowledge transfer, then other channels (FCC encourages the use of pairing, and currently streams).

The general research question is:

How influential are those channels on the final solutions given by campers?

More specifically:

How influential are the people central to those channels on the work of other campers?

With no mentoring, it is possible that the most popular campers in the chats, for example, will have a strong effect on the final work of the campers who follow them most...

Another question that follows is:

How would popularity in those channels make the popular person's portfolio more popular as well?


How long could that popularity last?

This is an important aspect: if your portfolio becomes popular for any reason, that popularity is likely to suggest that your portfolio is outstanding, and your chances of finding a job, for example, are higher. That holds even if your portfolio is not, technically speaking, much better than the project of a less popular camper.

Mar 19 2016 15:09
Just to see which sectors or areas we could likely talk about when doing this project:
  • marketing
  • topics in innovation and technology studies (for example, how the technical solutions are diffused and adopted by following a leader)
  • social network analysis
  • probably code comparison and therefore NLP
  • learning theory
  • and more
So: there is a lot to work on in this project, and the idea is also to work on nice presentations using D3.js or even Three.js; analysing data using R or Python or whatever; we can even make a video or similar about these results, I don't know...
Anyway: let me know if any of you are interested....
Mar 19 2016 15:16
And don't think this doesn't have relevance... not many people have so much data and information to do and find the things that we could be working on with this project...