These are chat archives for beniz/deepdetect

17th
Sep 2016
Kumar Shubham
@kyrs
Sep 17 2016 13:28
Hi @beniz , How are you ?
Emmanuel Benazera
@beniz
Sep 17 2016 16:24
hi @kyrs good, what about you ?
Kumar Shubham
@kyrs
Sep 17 2016 16:26
I am also fine ...
Emmanuel Benazera
@beniz
Sep 17 2016 16:26
are LSTMs giving you nightmares ? :)
Kumar Shubham
@kyrs
Sep 17 2016 16:27
:P not now at least
but yes, it is really interesting stuff to work on
Got busy with some urgent work last week, but yeah, I always manage a few hours for dd and LSTM
Emmanuel Benazera
@beniz
Sep 17 2016 16:30
cool, no worries. As you have seen, there aren't a lot of docs on LSTM + Caffe, though it works well
let me know if you need help. The most difficult step might be preparing the input data format through Caffe Blob objects.
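(Editor's sketch, not from the chat.) To make the data-preparation step concrete: Caffe's recurrent layers take time-major input, i.e. one row per timestep and one column per sequence, together with a sequence-continuation indicator that is 0 on the first step of a sequence and 1 on the following steps. The helper below only builds those two arrays as plain Python lists. The name `to_time_major`, the padding id 0, and the choice to mark padded steps with 0 are assumptions for illustration; filling the actual Caffe Blob objects from DeepDetect is not shown.

```python
def to_time_major(sequences, max_len, pad_id=0):
    """Lay out token-id sequences as (timestep, sequence) arrays.

    Returns (data, cont), both of shape max_len x len(sequences):
      data[t][n] is the token id of sequence n at step t (pad_id if absent)
      cont[t][n] is 0 at the first step of a sequence and 1 on later steps;
                 padded steps are left at 0 (an assumption of this sketch).
    """
    n_seq = len(sequences)
    data = [[pad_id] * n_seq for _ in range(max_len)]
    cont = [[0] * n_seq for _ in range(max_len)]
    for n, seq in enumerate(sequences):
        # Sequences longer than max_len are truncated.
        for t, token in enumerate(seq[:max_len]):
            data[t][n] = token
            cont[t][n] = 0 if t == 0 else 1
    return data, cont


if __name__ == "__main__":
    data, cont = to_time_major([[5, 6, 7], [8]], max_len=3)
    print(data)
    print(cont)
```

Each column is one document, so a batch of N documents of at most T tokens becomes two T x N arrays that can then be copied into the input blobs.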
Kumar Shubham
@kyrs
Sep 17 2016 16:32
I am trying to run the IMDB example that you shared in the ticket, with Caffe
Emmanuel Benazera
@beniz
Sep 17 2016 16:32
OK, were you able to use the dataset ? I believe it's in Python pickle format...
Kumar Shubham
@kyrs
Sep 17 2016 16:33
yes it is, but I managed to get the IMDB data online: https://www.cs.cornell.edu/people/pabo/movie-review-data/
although it is plain text
Emmanuel Benazera
@beniz
Sep 17 2016 16:34
OK, were you able to find which one the keras dataset is ?
Kumar Shubham
@kyrs
Sep 17 2016 16:35
No, I am still looking through their documentation, but no success so far
Emmanuel Benazera
@beniz
Sep 17 2016 16:36
OK, you can use the news20 dataset with two classes only to begin with if you like, or one of the datasets from http://deepdetect.com/applications/text_model/
you can reduce the number of documents for your early tests
Kumar Shubham
@kyrs
Sep 17 2016 16:38
ok fine I can do that
The IMDB example was binary classification of positive and negative sentiment
so should I work on a multi-class classification problem or a binary one
on the news20 dataset ?
Emmanuel Benazera
@beniz
Sep 17 2016 16:45
you can pick any dataset and keep only two classes
I need to look at the imdb dataset for something else, so I'll see whether I can convert it back from Python
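(Editor's sketch, not from the chat.) Keeping only two classes of a labelled text dataset, and capping the number of documents for early tests, could look like the following. The function name `keep_two_classes`, the `(text, label)` tuple layout, and the example labels are assumptions for illustration and are not part of DeepDetect.

```python
def keep_two_classes(samples, class_a, class_b, max_docs=None):
    """Filter (text, label) pairs down to two classes.

    class_a is remapped to 0 and class_b to 1. If max_docs is given,
    at most max_docs documents are kept per class, in input order.
    """
    mapping = {class_a: 0, class_b: 1}
    kept_per_class = {0: 0, 1: 0}
    out = []
    for text, label in samples:
        if label not in mapping:
            continue  # drop every other class
        binary = mapping[label]
        if max_docs is not None and kept_per_class[binary] >= max_docs:
            continue  # cap reached for this class
        kept_per_class[binary] += 1
        out.append((text, binary))
    return out


if __name__ == "__main__":
    docs = [
        ("engine and wheels", "rec.autos"),
        ("orbit and launch", "sci.space"),
        ("church and faith", "soc.religion.christian"),
        ("brakes and tires", "rec.autos"),
    ]
    print(keep_two_classes(docs, "rec.autos", "sci.space", max_docs=1))
```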
Kumar Shubham
@kyrs
Sep 17 2016 16:50
I have looked at the Keras IMDB dataset. We can load the pkl file in Python and then dump it in any format we want. btw the IMDB dataset is in bag-of-words format, a numeric representation of the words.
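(Editor's sketch, not from the chat.) Loading the pickle and dumping it in another format could be done along these lines. It assumes the pickle holds a `(sequences, labels)` pair, where each sequence is a list of integer word ids; that layout is an assumption and should be checked against the actual file. The output is the common LIBSVM-style text format, `label index:count ...`, with indices in ascending order. The function names are hypothetical.

```python
import pickle


def to_libsvm_line(label, word_ids):
    """Turn one document (list of word ids) into 'label id:count ...'."""
    counts = {}
    for word_id in word_ids:
        counts[word_id] = counts.get(word_id, 0) + 1
    features = " ".join(
        "{}:{}".format(idx, cnt) for idx, cnt in sorted(counts.items())
    )
    return "{} {}".format(label, features)


def convert(pkl_path, out_path):
    """Read (sequences, labels) from a pickle and write one line per document."""
    # Assumed layout: a 2-tuple of parallel lists. Only load pickles you trust.
    with open(pkl_path, "rb") as f:
        sequences, labels = pickle.load(f)
    with open(out_path, "w") as out:
        for seq, label in zip(sequences, labels):
            out.write(to_libsvm_line(label, seq) + "\n")


if __name__ == "__main__":
    print(to_libsvm_line(1, [3, 1, 3]))
```

Some SVM readers expect feature indices to start at 1, so word id 0, if present, may need to be shifted or dropped first.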
Emmanuel Benazera
@beniz
Sep 17 2016 16:52
OK, thanks for the information. It might be easier for you to work with this then. For instance, you could use the CSV or SVM input connectors and connect them to an LSTM.
Characters can be connected later on through the text connector. What do you think ?
Kumar Shubham
@kyrs
Sep 17 2016 16:56
sure !!! I will do that. One more question: does the Caffe module built by deepdetect support LSTM ??
Emmanuel Benazera
@beniz
Sep 17 2016 16:58
yes, LSTM / RNN are in Caffe master, and our custom version includes all master changes with minimal delay
you should have nothing to do on Caffe side, and that's a lucky thing :)
Kumar Shubham
@kyrs
Sep 17 2016 17:00
great !! I guess writing a .prototxt file will be enough to run LSTM with DeepDetect
what do you say ?
ehab albadawy
@ebadawy
Sep 17 2016 17:00
BTW, I want to mention that Torch has great support for RNNs, not sure if it will be easy for you to switch to Torch though
Emmanuel Benazera
@beniz
Sep 17 2016 17:01
yes @kyrs you need to write the prototxt down first. But I believe it is possible that the input from DD to Caffe LSTM requires modifications. I will double check. Let me know when you have the prototxt and we can look at it together.
@ebadawy hi, yes Torch has good support. We don't have an official Torch back-end at the moment, and there are many reasons for it. But it is something that could be done, if you're interested ^^
@kyrs btw, your TF code will get in at some point, there's some slight cleanup I need to make to the build system + finding a solution to the annoying TF library bug with jpg and png.
ehab albadawy
@ebadawy
Sep 17 2016 17:04
@beniz, it would be cool if I could start something like that! :D
Kumar Shubham
@kyrs
Sep 17 2016 17:04
@beniz I will do that. btw I guess Torch has a better C++ API compared to TensorFlow.
Emmanuel Benazera
@beniz
Sep 17 2016 17:05
The TF C++ API is very limited, and it is another annoyance... Torch is really good, I would help on any work in that direction.
you guys can create a Torch issue if you like, and channel the discussion there, if one of you (or both ? :)) decides to take it on.
ehab albadawy
@ebadawy
Sep 17 2016 17:07
Cool! I'm in.
Kumar Shubham
@kyrs
Sep 17 2016 17:07
@ebadawy this project is really interesting. I tried to integrate TF but hit a dead end due to the poor C++ API. Let me know if you need any help.
Emmanuel Benazera
@beniz
Sep 17 2016 17:07
great, good spirit :)
Kumar Shubham
@kyrs
Sep 17 2016 17:08
=D =D but first I will complete LSTM one :P
Emmanuel Benazera
@beniz
Sep 17 2016 17:08
@kyrs not a dead end at all! you did pretty much everything that was possible, and it will get in!
@kyrs I like your perseverance :) I can help you on LSTM, just ask whenever you have questions / difficulties etc... I will look at the data format more closely and update the ticket anyways.
ehab albadawy
@ebadawy
Sep 17 2016 17:09
@kyrs, I still don't know very much about what you are working on, do you have something I can look at?
Emmanuel Benazera
@beniz
Sep 17 2016 17:09
@ebadawy same applies to your effort, I can help, don't hesitate to ask and describe any difficulties.
Kumar Shubham
@kyrs
Sep 17 2016 17:12
@ebadawy I have worked on integrating TensorFlow with deepdetect, you can check beniz/deepdetect#30 for more details about my work. I guess integrating Torch will be similar.
Kumar Shubham
@kyrs
Sep 17 2016 17:19
@beniz my PR can handle both .jpg and .png files. You can look at the comments on the TensorFlow ticket.
the only problem was batch processing, which didn't work out at that time
Emmanuel Benazera
@beniz
Sep 17 2016 17:20
so png and jpg were working even when linking with OpenCV ?
Kumar Shubham
@kyrs
Sep 17 2016 17:20
I was not using the OpenCV library
I used something from the TensorFlow C++ API
Emmanuel Benazera
@beniz
Sep 17 2016 17:22
OK now I remember, you had to use the internal TF graph to convert images, thus the batch was not working, because it was complicated and not yet supported in TF examples, right ?
Kumar Shubham
@kyrs
Sep 17 2016 17:23
yes
Emmanuel Benazera
@beniz
Sep 17 2016 17:23
OK. The ticket on TF side was also updated by users with ways to possibly link TF to avoid the initial issue. I still need to look into that seriously.
Kumar Shubham
@kyrs
Sep 17 2016 17:26
I guess there has been some recent development in TensorFlow to solve this issue. Let me search for it. Will update the ticket
Kumar Shubham
@kyrs
Sep 17 2016 17:29
we have to look into it.
this can solve a lot of the problems we faced last time
Emmanuel Benazera
@beniz
Sep 17 2016 17:30
agreed
there are a few other things, like the hash, remember ? ^^ but nothing too difficult
Kumar Shubham
@kyrs
Sep 17 2016 17:31
there was also an issue with the protobuf version
Emmanuel Benazera
@beniz
Sep 17 2016 17:33
ah yes, I had a dirty hack, and I need to sanitize that. Here again, it's not rocket science, just a bit of time. We can try to make a priority of this if you like
Kumar Shubham
@kyrs
Sep 17 2016 17:34
yeah sure, but first I have to complete the LSTM integration.
Emmanuel Benazera
@beniz
Sep 17 2016 17:34
OK
Kumar Shubham
@kyrs
Sep 17 2016 17:36
finding time is a bit difficult with my current job, but yes, I will try my best to complete the pending stuff in dd as soon as possible.
Emmanuel Benazera
@beniz
Sep 17 2016 17:36
don't worry, I'll update the tf branch when I can, just don't hesitate to ping me when you need help
Kumar Shubham
@kyrs
Sep 17 2016 17:39
sure :smile:
ehab albadawy
@ebadawy
Sep 17 2016 20:21
I'm trying to install dd but it keeps saying that it can't find eigen3, which is kinda strange because it is installed and it's located in /user/include