These are chat archives for FreeCodeCamp/DataScience

9th
Mar 2016
evaristoc
@evaristoc
Mar 09 2016 11:05

Hi People,

Some links...

--- For those who are planning to follow Coursera's ML course, here is a Medium article that might be discouraging but is useful to read. The author suggests that the course is not the best way to learn ML because of:

  • the use of Octave/Matlab,
  • no guidance on how to do feature selection,
  • the use of small datasets,
  • many other techniques being ignored

First of all, the course is an introduction to ML techniques, not to DS. It is true, however, that it is only an introduction. Take the training to focus on the 101-level theory and the exercises on the techniques: they are very well explained!

Then, about Matlab (which is a very powerful language too...), I would say: learning Matlab/Octave can serve as a first introduction to Python's scipy-numpy, which are commonly used for ML with Python and whose array syntax resembles Matlab's (actually, Python's scipy-numpy fall behind Matlab in some aspects).

It is true that I don't hear of many DS positions using Matlab, but here is an example where it happens. There are niche sectors where you must know Matlab.

Additionally: Matlab/Octave forces you to think in terms of matrices, which is core to data analysis anyway.

I have to totally agree about feature engineering: there is nothing about it in the course, and it is VERY important.

Finally, he suggests that you should know about everything. In my opinion, it is possible that some organisations are looking for someone who covers just one aspect of the whole thing. For example, the author suggests that a DS should know JS, and particularly D3.js. Another thing to understand is the back end (the author's main focus is Django). Well, guess what: we have sections on those technologies here! So don't feel discouraged. But you do have to work, though...

--- I have no comment on the following link: some JS tools for Data Scientists
evaristoc
@evaristoc
Mar 09 2016 11:44
--- Here are some attributes of nodejs for data analytics --- REMEMBER: it is not a panacea, use it wherever it fits best... This example is from 2013:
http://www.datascienceassn.org/tags/nodejs
--- And... about trends, in the words of a guru (those guys may be biased...):
https://www.linkedin.com/pulse/ten-trends-data-science-2015-kurt-cagle

The KEYWORD seems to be STREAM
evaristoc
@evaristoc
Mar 09 2016 12:09
I have to make a correction and add some info:
the link to datascienceassn.org just above is not from 2013 but from 2014. The guy who gives the presentation worked for the Genomic Project. He made a prototype for a tool (Barracuda) with Kafka, Storm, Redis, nodejs and d3js. The example is in text analytics. Note: you will see some Java-like code, as many Big Data tools are Java-based...

^^^ The idea is to do it in real time: I have enough KPIs and scripts to work on something... I will need to solve the delay caused by analysing the data

The project is a bit in the tradition of the camperbot, but it will focus on analysing data instead, or at least on making a chart in real time.

I need to understand some technical issues though: my plan is to collect data from several rooms simultaneously, so I don't know if I will need workers for that (one worker per FCC chat room...)
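
A minimal sketch of that worker-per-room idea, assuming Python's asyncio and plain in-memory lists standing in for the real chat API. The names `room_worker`, `collect` and `count_messages`, and the sample rooms, are all hypothetical and only there for illustration:

```python
import asyncio
from collections import Counter


async def room_worker(room, messages, queue):
    """Hypothetical worker: forward every message of one room to a shared queue."""
    for text in messages:
        await asyncio.sleep(0)  # stand-in for waiting on the network
        await queue.put((room, text))


async def collect(rooms):
    """Run one worker per room concurrently, then count messages per room."""
    queue = asyncio.Queue()
    workers = [room_worker(room, msgs, queue) for room, msgs in rooms.items()]
    await asyncio.gather(*workers)

    counts = Counter()
    while not queue.empty():
        room, _text = queue.get_nowait()
        counts[room] += 1
    return dict(counts)


def count_messages(rooms):
    """Synchronous entry point around the async collection."""
    return asyncio.run(collect(rooms))


if __name__ == "__main__":
    # Made-up sample data, not real chat content.
    sample = {"DataScience": ["hi", "some links"], "Help": ["a question"]}
    print(count_messages(sample))
```

With coroutines, the workers share one thread, so separate processes would probably only be needed if the analysis itself turns out to be CPU-heavy.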

Miguel Ben
@Mius00
Mar 09 2016 16:21
o.o a datascience chat!!!!!
I mean
hello world1
CamperBot
@camperbot
Mar 09 2016 16:21

welcome to FreeCodeCamp @Mius00!

Darwin RC
@darwinrc
Mar 09 2016 16:21
thanks @evaristoc Great advice and content to learn DS.
CamperBot
@camperbot
Mar 09 2016 16:21
darwinrc sends brownie points to @evaristoc :sparkles: :thumbsup: :sparkles:
:star: 230 | @evaristoc | http://www.freecodecamp.com/evaristoc
Darwin RC
@darwinrc
Mar 09 2016 16:23
@evaristoc just a question if you have the time: what do you think about the premise that an advanced degree (preferably a PhD) is required to be proficient in DS?
evaristoc
@evaristoc
Mar 09 2016 16:23
@darwinrc thanks! Stay around: the idea is to keep going!
@Mius00 welcome, yes!
CamperBot
@camperbot
Mar 09 2016 16:23
evaristoc sends brownie points to @darwinrc and @mius00 :sparkles: :thumbsup: :sparkles:
:star: 304 | @mius00 | http://www.freecodecamp.com/mius00
:star: 397 | @darwinrc | http://www.freecodecamp.com/darwinrc
evaristoc
@evaristoc
Mar 09 2016 17:05

@darwinrc

The following is all my opinion:

From what I have seen, it is currently becoming more important, yes. This is because more and more companies are aware that they need professionals who can guarantee Return On Investment. Not all sectors that need DS have the same requirements for the role, but an advanced degree will matter wherever a small improvement in a given solution means a marginal increase in return. In some very competitive sectors that small difference could decide between being on top and being out of business. An example of such a sector? Stock market service providers.

Although the DS job market seems to offer a lot of potential, it also seems to be maturing in many respects, for example in the preferred technologies and architectures. On the other hand, many sectors (banking, for example) have delayed DS adoption until the offering is mature enough to meet their requirements.

Unfortunately, I can't give you any advice other than to take university training in a related area if your interest is DS (as I have mentioned before, I didn't take one).

I would say: it is possible that the job market will become more specialised in the future, for example a web developer with demonstrable interest and/or experience in DS, and for that, online DS training could be an excellent option. Similarly, roles like Data Analyst could be reshaped to demand more DS-like skills in the future (I have seen that...).

Hope this helps...

Darwin RC
@darwinrc
Mar 09 2016 17:23
@evaristoc thank you so much for taking the time to give such a thorough answer. I'm just a beginner (taking MOOCs, basically) based in South America, and I somewhat sense that what you said is what is happening with this "new" discipline in the market. Being a software engineer, I wonder if DS could be as meritocratic in the future.
CamperBot
@camperbot
Mar 09 2016 17:23
darwinrc sends brownie points to @evaristoc :sparkles: :thumbsup: :sparkles:
:star: 231 | @evaristoc | http://www.freecodecamp.com/evaristoc
evaristoc
@evaristoc
Mar 09 2016 19:00

@darwinrc if you are studying to become an SE you have an advantage, but in general you should combine that with the study of Statistics, Machine Learning (ML) and Data Mining (DM) techniques, preferably at an advanced level.

In my opinion, keep 3 things in mind when working with ML and DM:

  • The artistic part of this is Feature Engineering; it is not easy.
  • The form of your solution will usually be constrained to one of at least two statistical paradigms: Bayesian (NICE!) or Frequentist; the latter I would call Gaussian. Why statistics? You need an optimisation goal, and you need to be able to defend your results.
  • I would say that in practice DM techniques are similar to solutions for NP-hard optimisation problems. If you understand the problem you are facing, you might offer a good computational solution if you are good at algos. If you haven't taken any training in Discrete Optimisation yet, do it now.
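
To make the Bayesian/Frequentist contrast a bit more concrete, here is a small sketch that estimates a success probability both ways. It assumes a simple coin-flip style problem with a Beta prior, which is my own illustrative choice; the function names and the numbers are hypothetical:

```python
def frequentist_estimate(successes, trials):
    """Maximum-likelihood estimate: the observed success rate."""
    return successes / trials


def bayesian_estimate(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior mean under a Beta(prior_a, prior_b) prior (uniform by default)."""
    return (successes + prior_a) / (trials + prior_a + prior_b)


if __name__ == "__main__":
    # Made-up data: 3 successes in 4 trials.
    print(frequentist_estimate(3, 4))  # 0.75
    print(bayesian_estimate(3, 4))     # about 0.667, pulled towards the prior mean of 0.5
```

With little data the two answers differ noticeably; as the number of trials grows, the prior matters less and they converge.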

I would also suggest specialising: finance? marketing? healthcare?

Thanks, @darwinrc, for considering my advice... Be aware that there are people here in this room who may have a different opinion and may even give you better advice than mine. Anyway: hope this helps. If you have other questions, just send me a direct message (I'm also from Latin America).

CamperBot
@camperbot
Mar 09 2016 19:00
evaristoc sends brownie points to @darwinrc :sparkles: :thumbsup: :sparkles:
:star: 398 | @darwinrc | http://www.freecodecamp.com/darwinrc