These are chat archives for FreeCodeCamp/DataScience

19th
Sep 2017
Mohamed Zeid
@mzeidhassan
Sep 19 2017 04:45

Do you think these machines would be good for OpenNMT and any extensive machine learning work? I think the price is affordable. What do you think?

http://www8.hp.com/us/en/campaigns/workstations-z4-z6/index.html

evaristoc
@evaristoc
Sep 19 2017 07:14

@mzeidhassan Not an expert, but I would buy one for sure if I could, especially if you prefer to use your own workstation instead of the cloud.

I am not sure they will be enough for OpenNMT though; I guess it depends. They have just 2 Nvidia Quadro cards (although 48 cores in the Z6!), which is not bad but might not be enough for big models. I would suggest going to the OpenNMT forum and asking there.

mstellaluna
@mstellaluna
Sep 19 2017 11:52
@mzeidhassan the Z6 is suited for VFX and graphics rendering. Visual effects work is graphics- and processing-intensive ... these are graphics cards that have built-in memory .. the machine is a beauty otherwise.. it will easily run stuff like databases, graphics software.. as a sys admin I'd buy one to add to my server collection... my concern is they aren't listing what the processor is... if it's AMD or Intel.. I don't see the processor specifics.. the specs are UP TO.. they aren't out of the box..... :) I don't know what OpenNMT is .. I would do as @evaristoc suggested and ask at the source, as they will know best.
my largest server at home is up to 24 virtual cores if I'm not mistaken..
evaristoc
@evaristoc
Sep 19 2017 12:36

REALLY interesting discussions in the Deep Learning course of Andrew Ng.

Here is this question:

Deep Learning is Easy, Learn Something Harder

Found an interesting post where the author suggests learning the fundamentals and how to apply them rather than just learning the tools (TensorFlow, Hadoop, Theano). Kind of agree. You can find the article here.
What's your opinion? Will doing this course make us future proof for next 2-3 years?

Here is an answer by one of the students commenting on the article:
What the article says is that deep learning and its affiliated algorithms rely on a relatively simple technique--gradient descent--that's a classic method in statistics. What the article claims is that in the future other statistical methods will be rediscovered and applied to develop new machine learning techniques.
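
As a rough illustration of the gradient descent technique mentioned above, here is a minimal sketch in plain Python. The quadratic function, the learning rate, and the function names are assumptions chosen only for illustration; they do not come from the article or the course.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move a small step in the direction that decreases the function.
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    # Gradient of the illustrative function f(x) = (x - 3)^2 (an assumption).
    return 2.0 * (x - 3.0)


# Starting from 0, the iterates approach the minimizer of f, which is x = 3.
x_min = gradient_descent(grad_f, x0=0.0)
```

Training a neural network applies the same update rule, with the gradient taken over all of the network's weights instead of a single number.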

The article also lists those classic methods:

the EM algorithm
variational inference
unsupervised learning with linear Gaussian systems
PCA
factor analysis
Kalman filtering
slow feature analysis
Aapo Hyvarinen's work on ICA, pseudolikelihood.
this seminal deep belief network paper
So, yes, I agree, if you want to develop new Machine Learning techniques it makes sense to get more knowledgeable in statistics. But Machine Learning is not just statistical modeling and I guess one of the novelties of Machine Learning with respect to statistics is also the fact that it provides a new framework for using statistics (I leave this vague here because I'm not an expert).

You ask "Will doing this course make us future proof for next 2-3 years?"

This is an introductory course that should allow us to use and configure deep neural networks in an informed way, so definitely it should make us present-proof. Not sure about the next 2-3 years because who knows how fast ML technologies will be evolving in the future.

Shobhit Jain
@Shobhit1610
Sep 19 2017 13:00
Hello World
I am currently learning machine learning from Andrew Ng's course. Could anyone suggest a project in ML for me?
evaristoc
@evaristoc
Sep 19 2017 15:05
@Shobhit1610 I recommend you visit kaggle.com and see if there is something there for you. Check the Datasets section to start.
evaristoc
@evaristoc
Sep 19 2017 16:02

People,
I have already finished the videos for the first course of the Deep Learning Specialization (all 4 weeks, at 1.75x speed with some exceptions), finished some exams, and am now going through the Programming Assignments.

Guess... implementing NNs from scratch.

If you have done Coursera before you will find some innovations: before, you had to prepare the assignment on your side. Now you open a Notebook in the cloud, complete the assignment, and submit that Notebook. You get evaluated very much like when you open an account in Databricks. Smooth, and no need to move between platforms.

This is THE course, IMO.

@erictleung - you were the one who mentioned the course here in DSR, not sure if you are interested.

I would recommend this course combined with the one by Google on Udacity.

In that course you are absolutely on your own, very much like taking the fCC web development course. You have to come up with the solutions without any advice except from the community.

I didn't really take it all but I learnt a couple of things. Planning to finish this one by Andrew Ng and then revisit the other one soon after.

Shobhit Jain
@Shobhit1610
Sep 19 2017 16:21
Thanks @evaristoc
CamperBot
@camperbot
Sep 19 2017 16:21
shobhit1610 sends brownie points to @evaristoc :sparkles: :thumbsup: :sparkles:
:cookie: 366 | @evaristoc |http://www.freecodecamp.com/evaristoc
mstellaluna
@mstellaluna
Sep 19 2017 22:09
@evaristoc in case you weren't aware.. SAP has online free data science courses .. they have one coming up by Stuart Clarke for Data Science in Action - Building a predictive Churn Model
i can pass you the URL to the course
Matthew Barlowe
@mcbarlowe
Sep 19 2017 22:29
@mstellaluna what language are they taught in?
mstellaluna
@mstellaluna
Sep 19 2017 22:44
English for this specific course
Matthew Barlowe
@mcbarlowe
Sep 19 2017 22:48
Programming language
Haha
Mohamed Zeid
@mzeidhassan
Sep 19 2017 23:03
Can you please pass on the SAP course URL?
@mcbarlowe I don't know, I'm signed up at openSAP for SAP HANA courses..
And I got alerted via email that this course starts in November
Mohamed Zeid
@mzeidhassan
Sep 19 2017 23:10
Thanks @mstellaluna!
CamperBot
@camperbot
Sep 19 2017 23:10
mzeidhassan sends brownie points to @mstellaluna :sparkles: :thumbsup: :sparkles:
:cookie: 818 | @mstellaluna |http://www.freecodecamp.com/mstellaluna
Bharath
@bharath93m
Sep 19 2017 23:45
Hello, is it possible to do a count distinct over a window function in pyspark?