These are chat archives for FreeCodeCamp/DataScience

6th
Jan 2018
Josh Goldberg
@GoldbergData
Jan 06 2018 00:18
@evaristoc keep me informed when you want to move forward.
Yingjie (Iris) Hu
@huyingjie
Jan 06 2018 00:57
@evaristoc Do you have data from 2017? The 2015 data is too outdated.
evaristoc
@evaristoc
Jan 06 2018 08:40

@huyingjie there are 3 files - all of them combined have data from 31-Dec-2014 to around 9-Dec-2017.

I made some notes in the description. Be careful with two things: some posts are duplicated between files (the collection dates overlapped), AND the time, particularly the hours, shown by the sent variable is local time. Mine is Western Europe. Additionally, I collected the files at different times of the year, and one file (the last one) is in Winter Time (-1 hour relative to summer time).

If your project considers times, that will be an added data manipulation challenge ;) .
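
A minimal sketch of that clean-up, assuming each message is a dict with an `id` and a naive local `sent` string, and that every file carries one fixed UTC offset set when it was collected (+2 for summer, +1 for winter). The field format, the offsets, and the helper names `to_utc` and `merge_files` are assumptions for illustration, not the actual layout of the files:

```python
from datetime import datetime, timedelta, timezone

def to_utc(sent_local: str, utc_offset_hours: int) -> datetime:
    """Interpret a naive local timestamp with a fixed offset and convert it to UTC."""
    local = datetime.strptime(sent_local, "%Y-%m-%d %H:%M:%S")
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return aware.astimezone(timezone.utc)

def merge_files(files):
    """Merge (messages, utc_offset_hours) pairs, keeping the first copy of each id."""
    merged = {}
    for file_messages, offset in files:
        for msg in file_messages:
            if msg["id"] not in merged:
                merged[msg["id"]] = {**msg, "sent_utc": to_utc(msg["sent"], offset)}
    # Sort on the normalised UTC time so files with different offsets line up.
    return sorted(merged.values(), key=lambda m: m["sent_utc"])

# Hypothetical sample: message "b" appears in both files with different local hours.
summer_file = ([{"id": "a", "sent": "2017-07-01 14:00:00"},
                {"id": "b", "sent": "2017-10-28 20:30:00"}], 2)
winter_file = ([{"id": "b", "sent": "2017-10-28 19:30:00"},
                {"id": "c", "sent": "2017-12-09 09:15:00"}], 1)

messages = merge_files([summer_file, winter_file])
print([(m["id"], m["sent_utc"].isoformat()) for m in messages])
```

Deduplicating on `id` rather than on the timestamp matters here, because the same post can show a different local hour in two files.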

@huyingjie This data is more real than much of what you will find on Kaggle. Be careful with conclusions too. Remember there are other fellow students involved. The data is open, but be professional.


@GoldbergData immediately! :)

I will contact you privately.

evaristoc
@evaristoc
Jan 06 2018 17:14

PEOPLE

Back to the emoji project: it appears that the results look VERY different if I analyse the keywords (those between : ) rather than the unicodes (probably copy/paste of the emoji from a source)...

The top10, in order of popularity?

  • ':smile:',
  • ':wave:',
  • ':blush:',
  • ':point_up:',
  • ':laughing:',
  • ':sparkles:',
  • ':worried:',
  • ':coffee:',
  • ':fire:',
  • ':clap:'
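
For anyone who wants to reproduce this kind of comparison, here is a rough sketch of counting the two forms separately. The alias pattern and the code-point ranges are assumptions (the ranges are deliberately coarse and incomplete), and `count_emojis` is a hypothetical helper, not the script used for the numbers above:

```python
import re
from collections import Counter

# Assumed alias shape: lowercase letters, digits, "_", "+" or "-" between colons.
ALIAS_RE = re.compile(r":([a-z0-9_+\-]+):")

# Coarse, incomplete code-point ranges that cover many emoji (an assumption).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def count_emojis(texts):
    """Count colon-delimited aliases and raw Unicode emoji in separate counters."""
    aliases, unicodes = Counter(), Counter()
    for text in texts:
        aliases.update(f":{name}:" for name in ALIAS_RE.findall(text))
        unicodes.update(EMOJI_RE.findall(text))
    return aliases, unicodes

# Hypothetical messages, only to show the shape of the output.
sample = ["hi :wave: :smile:", "thanks :smile: \U0001F914", "\U0001F525\U0001F525 nice :clap:"]
aliases, unicodes = count_emojis(sample)
print(aliases.most_common(10))
print(unicodes.most_common(10))
```

A pattern this loose also picks up things that are not emoji at all: a timestamp such as `10:30:45` yields `:30:`, so the alias counts need a filtering step afterwards.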

This is a real deviation from what I was expecting, to be honest... Now I am not sure about the reliability of this result...

My plan is to focus on either keywords or unicodes, not both. Otherwise the articles could get longer. BUT... I would be able to make charts for both cases as "bonus" information...
Hmmm.... 🤔
What is the shortcut of the "thinking" emoji???

There doesn't seem to be one in this apparently outdated reference: https://www.webpagefx.com/tools/emoji-cheat-sheet/

:scream:

evaristoc
@evaristoc
Jan 06 2018 17:19
:) :)
Josh Goldberg
@GoldbergData
Jan 06 2018 17:21
Hmmm.
What is the difference between : and the Unicode? I guess this is a Gitter-specific thing rather than a smartphone keyboard thing? @evaristoc
evaristoc
@evaristoc
Jan 06 2018 17:22
However, the keywords (actually they call them "aliases") are used more than the unicode form.

@GoldbergData

The unicode form is the one interpreted by your machine. The alias seems to call an existing image in Github / Gitter.

If I write the unicode form on my Linux machine it will look different than on my Windows one.
I think, but I am not sure, that the aliases all look the same (iOS format).
The list used by Github that I have is the 2015 version.
evaristoc
@evaristoc
Jan 06 2018 17:27
I haven't found any updated version of the list so far...
evaristoc
@evaristoc
Jan 06 2018 17:38
I think what happened is that the list of shortcuts I am using is incomplete. :joy: renders an emoji but it is NOT in the list... No good...
I haven't been able to find a complete list so far. The project would be hard to finish completely if I don't get that list of shortcuts updated.
evaristoc
@evaristoc
Jan 06 2018 17:43
There are 1118 shortcuts out of 1935 that are not listed in the reference of 2015. Some of those 1118 don't render emojis at all.
For example: :301:
This one? :waxing_gibbous_moon:
That one yes.
:PENSIVE:
:pensive:
Only accepts lowercase
evaristoc
@evaristoc
Jan 06 2018 17:58
I got rid of uppercase letters and digits and found that there could be around 550 aliases missing from the list I have. Probably fewer.
:F1:
:f1:
So in total, there would be a vocabulary of about 1,300 emojis (out of the 3,300 in the list released in mid-2017).
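
The filtering step can be sketched roughly like this. The `found` set and the stand-in `reference_2015` set are hypothetical inputs, and `missing_aliases` is an illustrative helper, not the code actually used:

```python
import re

# Keep only candidates made of lowercase letters, "_", "+" or "-" (no digits, no uppercase).
LOWERCASE_ONLY = re.compile(r"^:[a-z_+\-]+:$")

def missing_aliases(found, reference):
    """Return candidates that pass the filter but are absent from the reference list."""
    plausible = {alias for alias in found if LOWERCASE_ONLY.match(alias)}
    return sorted(plausible - set(reference))

# Hypothetical inputs: candidates seen in the chat and a stand-in for the 2015 list.
found = {":smile:", ":joy:", ":301:", ":PENSIVE:", ":pensive:",
         ":F1:", ":f1:", ":waxing_gibbous_moon:"}
reference_2015 = {":smile:", ":pensive:"}

print(missing_aliases(found, reference_2015))
```

Dropping every candidate with a digit is a blunt rule: it removes noise like `:301:`, but it also throws away real aliases such as `:+1:` and `:100:`, so the resulting count is only an estimate.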
:scream:
evaristoc
@evaristoc
Jan 06 2018 18:06
:clapping:
Some of them might have more than one alias:
:point_up: and :pointup:
Oh! Only one alias. Good.
The 550 missing is likely fewer; probably 60-70% are true aliases.

Well... getting close...

I'll do another analysis to show you and then I might be done for today with this one.

Timothy Javins
@timjavins
Jan 06 2018 18:21
If you are interested in neural networks, but you're also the kind who doesn't want to launch on-demand server instances but would rather have something of yours on-site, here are a couple options:
Intel Movidius USB stick
get access to Huawei Honor View 10's on-board NPU <- in its launching phase
For the Huawei Honor cell phone option, you'd use the HiAI API. Since they're still in launching phase, it might take a bit more work to gain access (beyond just clicking online). You might have to email somebody for now. But it should be widely available fairly soon. Microsoft is already working with it.
Timothy Javins
@timjavins
Jan 06 2018 18:30
I'd go for the Movidius option, but I'd also buy the phone just because of my own affinities. For some reason, if my phone can't double as a desktop, it doesn't feel right for me. I'm sure I'll need to add " and run a neural network for me" to that at some point. XD
evaristoc
@evaristoc
Jan 06 2018 18:31
thanks, @timjavins!
CamperBot
@camperbot
Jan 06 2018 18:31
evaristoc sends brownie points to @timjavins :sparkles: :thumbsup: :sparkles:
:cookie: 143 | @timjavins |http://www.freecodecamp.org/timjavins
evaristoc
@evaristoc
Jan 06 2018 18:32
@timjavins Are you using some NN at the moment?
Timothy Javins
@timjavins
Jan 06 2018 18:33
No. Honestly, I've had to do some self-reflection and reprioritize my time. It isn't time for the fun stuff yet. I'm working through the boring stuff right now.
Part of my program requires I get a CompTIA A+ certification. Part 2 of my test is in the next week or so. So that's what I'm working on.
I've spent some time on NN but haven't actually done anything yet.
evaristoc
@evaristoc
Jan 06 2018 19:08

@timjavins Ok. What is the boring stuff?

What makes CompTIA so boring? This is the first time I've seen the certification, by the way. I'm not in the USA myself. And what are you really after, if I may ask? And what are you expecting to get from learning NNs?

Anyway I take the opportunity to share a reflection with those asking questions about DS.

It is my humble opinion, based on the limited experience I have, that:

MOST OF THE DS TASKS are VERY but VERY VERY but VEEEEERYYYYYY BORING, to the disappointment of many in this channel.

Not that I am the best example, but look at me: I am just stuck on a relatively simple project right now. So.... What do you expect when working with BIG DATA? Confetti? Unless you are brilliant, I think it will be a lot of... you know...

This might not relate to you, @timjavins, but I think many people coming to this channel are really but really really really really really really really really really underestimating the kind of things Data Analysis and DS entail....

So, I really congratulate you for going through the boring stuff now. Really: Well Done!!!!!! I am sure you might really need that for the sake of having fun later :) :) .


People

Final update on the emoji project and then I go. The top10 of the aliases with lowercase letters only (which might include the majority, but not all, of the aliases):

  • 18928, :smile:
  • 13135, :wave:
  • 11016, :blush:
  • 4351, :point_up:
  • 4327, :laughing:
  • 3778, :sparkles:
  • 3127, :worried:
  • 2783, :coffee:
  • 2513, :fire:
  • 2374, :clap:

Definitely different from what I got in the other results, but this is a more likely trend, reflecting the activity in the Casual chatroom. Hmmmm... See you all !!

:smile: is one of the easiest to type: its emoticon, :), is a simple alias for it too.
Ok, :wave: !!!