These are chat archives for FreeCodeCamp/DataScience

25th
Dec 2017
Eric Leung
@erictleung
Dec 25 2017 00:06 UTC
@deanhu2 ah I see. Yeah, I guess TensorFlow is more sophisticated and feature-filled than what might be necessary. However, despite not being lightweight, it probably does have better documentation and user tutorials to help you with your project. And merry Christmas to you! :smile: :christmas_tree:
Dean
@deanhu2
Dec 25 2017 00:24 UTC
Hmm, I looked at Orange... it seems the license is a little restrictive? It's GNU, so any applications you make with it also have to be open source, no?
Dean
@deanhu2
Dec 25 2017 00:38 UTC
@erictleung ah, I might take a look into it. Is it hard to set the values of each layer, such as stride, filters, etc.?
Dean
@deanhu2
Dec 25 2017 00:44 UTC
The API I'm using in tiny_dnn only accepts one parameter for a kernel... a single integer, basically... but the original code uses kernel: (2, 2). Would it be the same if I used a kernel as 2*2?
Youness03
@Youness03
Dec 25 2017 01:00 UTC
@erictleung I try to follow the documentation, but it doesn't work correctly when I launch my application.
evaristoc
@evaristoc
Dec 25 2017 13:34 UTC

@deanhu2
I guess you are trying a conv-NN? If so, the kernel (also known as the convolution matrix or mask) is usually a square matrix. It seems that tiny_dnn is set up to work with square matrices only, so you need just one parameter value. I tried to go through the code of the YOLO exercise, but I didn't find the variable "kernel" mentioned in the code of the example. From what I can gather from the code, it seems to be an implementation of what the author later described on the website. Could you help me by pointing me to the section where those 2 values are given to the program?
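
To illustrate the point about square kernels, here is a minimal Python sketch. It assumes, as described above, an API that takes a single integer for the kernel side length. The helper name `square_kernel_side` is hypothetical and not part of tiny_dnn. Under that assumption, a kernel of (2, 2) corresponds to passing 2. Passing 2*2 = 4 would instead describe a 4x4 kernel.

```python
def square_kernel_side(kernel):
    """Return the side length k for a square kernel given as (k, k).

    Hypothetical helper for illustration; not a tiny_dnn function.
    """
    height, width = kernel
    if height != width:
        raise ValueError(f"kernel {kernel} is not square")
    return height


side = square_kernel_side((2, 2))
# A k x k kernel has k * k weights per input channel.
weights_per_channel = side * side
print(side, weights_per_channel)
```

So the single integer is the side length, while 2*2 is only the number of weights in that kernel.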

Kernel size and window size? Not sure. In convolutional NN jargon, I think "window size" could refer to the size of the overlapping matrix when adding padding? Reading elsewhere, I found that it might also be related to the size of the pooling matrices, if you include one in your processing.

I agree with @erictleung and you that the darknet documentation is poor. And I would add that there is enough mix-up and overloading of terms in the whole DS / ML sector to drive you mad sometimes.

I saw https://orange.biolab.si/ a long time ago, when they were starting as a project. That is a really lightweight one, and I think they don't have an API for CNNs; you would have to implement one yourself using their NN capabilities.

evaristoc
@evaristoc
Dec 25 2017 13:40 UTC

@deanhu2

Have you had any training in CNNs? If not, I would suggest you get some. It might help make the whole reverse-engineering easier.

Success!

Dean
@deanhu2
Dec 25 2017 14:43 UTC
@evaristoc wow, thanks so much for answering all those questions! Merry Christmas also! I've only tried training a CNN previously using a simple toolset, I think ImageJ? But now that it's purely in code, I seem to hit a barrier in understanding parts.
CamperBot
@camperbot
Dec 25 2017 14:43 UTC
deanhu2 sends brownie points to @evaristoc :sparkles: :thumbsup: :sparkles:
:cookie: 391 | @evaristoc |http://www.freecodecamp.org/evaristoc
Dean
@deanhu2
Dec 25 2017 14:48 UTC
@evaristoc it seems darknet also sets a lot of values for the neural network it's using in a config file: https://github.com/pjreddie/darknet/blob/master/cfg/darknet.cfg
@evaristoc it doesn't mention a kernel size, only filter sizes...
Dean
@deanhu2
Dec 25 2017 15:39 UTC
There's also a breakdown here of a minimal version: http://machinethink.net/blog/object-detection-with-yolo/ where it covers each layer, but it fails to state how it has shrunk the layers... it goes from 416x416x3 -> 416x416x16 down to 208x208x16 with only a mention of using a leaky ReLU in between... so I can't figure out how it shrinks the input down to 208...
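
One possible explanation for the 416 -> 208 step, sketched in Python under stated assumptions: the convolution is 3x3 with stride 1 and padding 1 (which keeps the spatial size and only changes the channel count to 16), and it is followed by a 2x2 max pooling layer with stride 2 (which halves width and height). These layer settings are assumptions for illustration, not values confirmed in this chat. A leaky ReLU is applied element by element, so it would not change the shape. The function name `output_size` is made up for this example.

```python
def output_size(input_size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (standard formula)."""
    return (input_size + 2 * padding - kernel) // stride + 1


# Assumed 3x3 convolution, stride 1, padding 1: spatial size is preserved.
after_conv = output_size(416, kernel=3, stride=1, padding=1)
# Assumed 2x2 max pooling, stride 2: spatial size is halved.
after_pool = output_size(after_conv, kernel=2, stride=2)
print(after_conv, after_pool)
```

With these assumed settings, the sizes come out as 416 after the convolution and 208 after the pooling step.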
Rajath
@rajathrao
Dec 25 2017 16:59 UTC
Guys, oftentimes in Data Analyst interviews they ask about data methodology. To be precise, one of the questions was: 'What methodology do you follow for a given data set? Explain with an example or a project.' I am not sure I have done this properly. Thanks in advance.
Eric Leung
@erictleung
Dec 25 2017 19:58 UTC
@rajathrao probably what they mean is your thought process when you get a data set. Do you immediately jump in and throw a neural network onto it? Would a random forest be better, or is a simple linear regression okay? Do you inspect the data, i.e. what kind of exploratory data analysis do you do? Do you have a question in mind about what you want to do with the data? Those are some questions I can think of that you might ask yourself when thinking about data methodology. I've never heard of the question before, but the "Explain with an example or a project" part leads me to think they want a summary of your thought process on how you went from raw data to results and insights. I hope that helps and makes sense.
Rajath
@rajathrao
Dec 25 2017 19:59 UTC
@erictleung sure... what would you answer? I mean, in an enterprise there may be a large and erroneous data set. You may have to check data quality, etc.
Eric Leung
@erictleung
Dec 25 2017 20:06 UTC
@rajathrao so I haven't worked with enterprise data or in the industry yet, so take my thought process with a grain of salt. But I'd focus on the business requirements and the purpose of looking into this data. The first two questions I'd have are: can this data answer the question I have, and is the data fit or good enough to answer my question? So you'll first want to check the data and understand it well enough to assess its quality, and from there you can assess whether the data is capable of answering your question. Then you can go into a long discussion about methodologies on how to answer your question.
@rajathrao although this is geared towards the R programming language, the concepts in this book are independent of implementation: http://r4ds.had.co.nz/. Near the beginning, it has a general workflow of going from raw data to cleaning, iterating, and visualizing your results.
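
As a rough illustration of the "check the data first" step, here is a minimal Python sketch that measures how often each column is missing. It is only one of many possible quality checks. The function name `missing_rate`, the column names, and the records are all made up for this example.

```python
def missing_rate(rows, columns):
    """Fraction of rows where each column is missing (None or empty string)."""
    total = len(rows)
    rates = {}
    for column in columns:
        missing = sum(1 for row in rows if row.get(column) in (None, ""))
        rates[column] = missing / total if total else 0.0
    return rates


# Made-up records, for illustration only.
rows = [
    {"customer_id": "1", "age": "34", "country": "NL"},
    {"customer_id": "2", "age": "", "country": "US"},
    {"customer_id": "3", "age": "29", "country": None},
    {"customer_id": "4", "age": "", "country": "US"},
]
rates = missing_rate(rows, ["customer_id", "age", "country"])
print(rates)
```

A column with a high missing rate may not be fit to answer the question at hand, which is the kind of assessment described above.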
Rajath
@rajathrao
Dec 25 2017 21:09 UTC
@erictleung Thanks so much for the reply. I will take a look. Also, what would you do for data from different sources?
CamperBot
@camperbot
Dec 25 2017 21:09 UTC
rajathrao sends brownie points to @erictleung :sparkles: :thumbsup: :sparkles:
:cookie: 564 | @erictleung |http://www.freecodecamp.org/erictleung
Eric Leung
@erictleung
Dec 25 2017 23:42 UTC
@rajathrao I'd be skeptical about the data being of good quality and if it can even help. Essentially, same answer as before, without any more information about the different sources.