These are chat archives for beniz/deepdetect

19th
Sep 2016
dgtlmoon
@dgtlmoon
Sep 19 2016 06:31
morning :) re beniz/deepdetect#191, it says to reindex with pool5, but what's the exact name of that layer?
is the IRC channel active? I much prefer to use IRC
Emmanuel Benazera
@beniz
Sep 19 2016 06:39
hey, what are you trying to do? you need to index and search based on the same layer. maybe it is unclear what imgsearch does; in that case, start by looking at annoy. If you don't want to dig into this it's fine, but there are a number of theoretical rules to be aware of.
IRC is up, a bit slower
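A minimal sketch of the "index and search based on the same layer" rule above. It uses brute-force cosine similarity in plain Python as a stand-in for Annoy's approximate search, and the image names and 3-dimensional vectors are hypothetical (real layer outputs are much larger). The point is only that stored and query vectors must share a dimension, which holds when both come from the same layer.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        # A length mismatch usually means the vectors were extracted
        # from different layers (assumption for this sketch).
        raise ValueError("dimension mismatch: index and query must use the same layer")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index, query, k=2):
    """Return the k image names whose stored vectors are closest to the query."""
    scored = [(cosine_similarity(vec, query), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical features, all taken from one (unnamed) layer.
index = {
    "shirt_a.jpg": [1.0, 0.0, 0.0],
    "shirt_b.jpg": [0.9, 0.1, 0.0],
    "shirt_c.jpg": [0.0, 0.0, 1.0],
}
nearest = search(index, [1.0, 0.05, 0.0], k=2)
print(nearest)
```

Annoy replaces the exhaustive loop with approximate tree-based lookups, which is what makes the search practical for large collections.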
dgtlmoon
@dgtlmoon
Sep 19 2016 06:41
Hey Emmanuel! Thanks for the great work!
I'm trying to just train on a bunch of images, they are quite similar so a higher level layer is too abstract
Emmanuel Benazera
@beniz
Sep 19 2016 06:42
you are indexing, not training I guess
dgtlmoon
@dgtlmoon
Sep 19 2016 06:42
(I'm training on several thousand images of t-shirts with different designs and finding similar ones)
yes
Emmanuel Benazera
@beniz
Sep 19 2016 06:43
if images are very similar you may need to train a net specifically on these images then index...
dgtlmoon
@dgtlmoon
Sep 19 2016 06:43
yes i was thinking the same
i have maybe 120,000 images to train from , a good size data set :)
they are all labelled/grouped in the original data set, can you recommend some strategy for training a net?
Emmanuel Benazera
@beniz
Sep 19 2016 06:46
follow the tutorials, read around, there's a lot of information
dgtlmoon
@dgtlmoon
Sep 19 2016 06:47
do you think it's worth training a whole new net, or perhaps just the final layer, "transfer learning" style, from Inception?
I guess Inception is all about image/object recognition, but I'm more interested in finding similar images of the same type
I guess it needs to be in Caffe format, right?
Emmanuel Benazera
@beniz
Sep 19 2016 06:50
yes, the TF branch does not have support for in layer extraction at the moment.
dgtlmoon
@dgtlmoon
Sep 19 2016 06:53
"It turned out that in the researchers' dataset, photos of camouflaged tanks had been taken on cloudy days, while photos of plain forest had been taken on sunny days. The neural network had learned to distinguish cloudy days from sunny days, instead of distinguishing camouflaged tanks from empty forest." haha great
ok so the steps in that URL won't work? I'll try it and see what I can learn
it must be a little frustrating running a project like this with so many entry level questions that are not real issues :) thanks for your patience
Emmanuel Benazera
@beniz
Sep 19 2016 06:59
that's the game :) we can't answer everything, there's the doc. Most of our time is taken by dev and customers, but community requires attention as well
dgtlmoon
@dgtlmoon
Sep 19 2016 07:00
there's an ImageNet Caffe tutorial, looks much more appropriate
Emmanuel Benazera
@beniz
Sep 19 2016 07:00
look at the tutorials and examples, there's everything to train and fine-tune, though if you are not versed in NN, you may find it a bit difficult.
dgtlmoon
@dgtlmoon
Sep 19 2016 07:00
yes, huge learning curve but it's a lot of fun
Emmanuel Benazera
@beniz
Sep 19 2016 07:00
DD docs, or use any other framework if easier.
dgtlmoon
@dgtlmoon
Sep 19 2016 07:04
my background: I've been using ORB/SIFT descriptors for finding similar images with good results, but am interested in getting better results with NN
Emmanuel Benazera
@beniz
Sep 19 2016 07:18
yes, it's more flexible
we've applied it to an art expo recently, look recognition.tate.org.uk up
dgtlmoon
@dgtlmoon
Sep 19 2016 07:20
hey very cool, kind of abstract!
dgtlmoon
@dgtlmoon
Sep 19 2016 07:29
how long did it take to train a net for that?
Emmanuel Benazera
@beniz
Sep 19 2016 07:31
hours to days
dgtlmoon
@dgtlmoon
Sep 19 2016 07:35
is there really enough information in 224x224 or 256x256 (another common image size) for training? some of those images at recognition.tate.org.uk are quite complex
Emmanuel Benazera
@beniz
Sep 19 2016 10:31
yes, object detection is done on 720xY images
dgtlmoon
@dgtlmoon
Sep 19 2016 10:34
if I follow http://www.deepdetect.com/tutorials/train-imagenet/ with my dataset (100,000~ images) it should give me a starting point for my own image-similarity NN where the domain/image type is very specific?
do you work on this project fulltime?
Emmanuel Benazera
@beniz
Sep 19 2016 10:35
it's for you to try yes, many things can go right or wrong. Yes we're a company around the project and we stay laiiiidddd backkk :)
dgtlmoon
@dgtlmoon
Sep 19 2016 10:35
very cool, congratulations
Emmanuel Benazera
@beniz
Sep 19 2016 10:38
what about you ?
dgtlmoon
@dgtlmoon
Sep 19 2016 10:39
I'm working mostly full time doing Drupal/web work for a small company in Copenhagen, web is fun... but it's a lot more fun when it has some data behind it :D
kind of bored of web type work and wanting to learn something more interesting
Emmanuel Benazera
@beniz
Sep 19 2016 10:40
There's a lot to do on the viz side ^^
dgtlmoon
@dgtlmoon
Sep 19 2016 10:40
viz?
Emmanuel Benazera
@beniz
Sep 19 2016 10:40
data visualization
dgtlmoon
@dgtlmoon
Sep 19 2016 10:40
oh yeah for sure
Emmanuel Benazera
@beniz
Sep 19 2016 10:40
why not a Drupal connector to DD ? :)
dgtlmoon
@dgtlmoon
Sep 19 2016 10:40
your download_imagenet_dataset.py can be replaced by the parallel bash command
Emmanuel Benazera
@beniz
Sep 19 2016 10:41
feel free to PR or gist any improvement
dgtlmoon
@dgtlmoon
Sep 19 2016 10:41
Could be... Drupal is just a framework really, it's up to what people put in there....
Emmanuel Benazera
@beniz
Sep 19 2016 10:41
it's php ?
OK it is...
dgtlmoon
@dgtlmoon
Sep 19 2016 10:42
yeah it's PHP
it's not really a big data thing.. just more a content management system
Emmanuel Benazera
@beniz
Sep 19 2016 10:42
that's something we miss at the moment actually, a PHP client, just saying.
dgtlmoon
@dgtlmoon
Sep 19 2016 10:43
but for example, one of my websites has 100,000 images added by members of their collections
Emmanuel Benazera
@beniz
Sep 19 2016 10:43
yeah, but the predict calls can be used for many apps.
dgtlmoon
@dgtlmoon
Sep 19 2016 10:43
but i think most websites are domain specific, hence my case where i need to train a model for my own use
Emmanuel Benazera
@beniz
Sep 19 2016 10:43
yes, we will provide a similarity search engine sometime in the coming months
and the granularity of search for a variety of use cases is critical, and needs work
dgtlmoon
@dgtlmoon
Sep 19 2016 10:44
yes but the search engine still needs a domain specific NN? or not?
ie 100,000 tshirts with similar designs
Emmanuel Benazera
@beniz
Sep 19 2016 10:44
check out http://deepdetect.com/applications/model/ there's a fashion pre-trained model that might be a bit better for your needs
dgtlmoon
@dgtlmoon
Sep 19 2016 10:45
my model will be for heavy metal tshirts :D
Emmanuel Benazera
@beniz
Sep 19 2016 10:45
it depends on the patterns. if your concern is with the inner tshirt details, just crop them and use the similarity search on them directly.
and send me one of those tshirts :)
dgtlmoon
@dgtlmoon
Sep 19 2016 10:46
cropping is a hard subject... members of the drupal site add their own photos.. and the image is not always centered etc
i think cropping the outer 30% would cover most cases
i actually dont own the tshirts... its just photos from collectors :D
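A small sketch of the "crop the outer 30%" idea above. It only computes the crop box; the function name and the reading of "30%" (total share removed per axis, split evenly between opposite edges) are assumptions. The returned tuple is ordered (left, upper, right, lower), the same order Pillow's `Image.crop` takes.

```python
def center_crop_box(width, height, outer_percent=30):
    """Return (left, upper, right, lower) after trimming the outer border.

    Assumption: outer_percent is the total share removed along each axis,
    half from each of the two opposite edges. Integer arithmetic keeps the
    result free of floating-point rounding.
    """
    if not 0 <= outer_percent < 100:
        raise ValueError("outer_percent must be in [0, 100)")
    dx = width * outer_percent // 200   # pixels trimmed from left and right
    dy = height * outer_percent // 200  # pixels trimmed from top and bottom
    return (dx, dy, width - dx, height - dy)


box = center_crop_box(1000, 800)
print(box)
```

A fixed central crop will still miss off-center designs, so it is only a first approximation for user-submitted photos.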
Emmanuel Benazera
@beniz
Sep 19 2016 10:48
well, do your thing, and I'll play with Hawaiian shirts when I have a moment :)
dgtlmoon
@dgtlmoon
Sep 19 2016 11:29
ok, just indexed on the clothing network with 1995 images, using image search against loss3/classifier, results are not very great
trying again against inception_5b/output
@beniz is your company available to help out with training a new model? im not rich but i could throw some money your way
Emmanuel Benazera
@beniz
Sep 19 2016 11:34
read annoy and DD instructions carefully, there are many parameters
dgtlmoon
@dgtlmoon
Sep 19 2016 11:34
yes, i watched a few presentations on annoy, lots of dials
Emmanuel Benazera
@beniz
Sep 19 2016 11:35
ok, that's a personal project I guess ?
dgtlmoon
@dgtlmoon
Sep 19 2016 11:36
yes, though I'd like to make it commercial if it works. it's been kind of a hobby website/data-set for the last 9 years, and I'm seeing if I can do something with it
first is to allow people on the drupal website to browse by similar entries
after that.. dont know :D
Emmanuel Benazera
@beniz
Sep 19 2016 11:37
can you make your dataset public ?
dgtlmoon
@dgtlmoon
Sep 19 2016 11:37
it's already publicly available
Emmanuel Benazera
@beniz
Sep 19 2016 11:39
can you point to it ?
dgtlmoon
@dgtlmoon
Sep 19 2016 11:39
it's not producing a JSON feed of the images/categories just yet, I can do that soon
dgtlmoon
@dgtlmoon
Sep 19 2016 12:03
ok tried on clothing at inception_5b/output results are different but not much better, perhaps 256x256 is not good enough resolution
dgtlmoon
@dgtlmoon
Sep 19 2016 12:09
time to try my own trained network
dgtlmoon
@dgtlmoon
Sep 19 2016 12:16
is this what is used to create the clothing model but with a different image list? http://www.deepdetect.com/tutorials/train-imagenet/
Emmanuel Benazera
@beniz
Sep 19 2016 12:20
it's finetuned, look at the examples
dgtlmoon
@dgtlmoon
Sep 19 2016 16:03
http://www.deepdetect.com/tutorials/train-imagenet/ this is not so clear, "Create a model directory where you want, in the following we refer to the path and model directory as imgnet, and the machine learning service as imageserv" so how do I send it the images in the directory to train on?
Emmanuel Benazera
@beniz
Sep 19 2016 16:06
put images in a directory, and organize them per label, and pass that directory to the train call. In the tutorial it is called ilsvrc12.
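A minimal sketch of the "one directory per label" layout described above, assuming the dataset starts as a list of (file path, label) pairs. The helper name, label names and temporary paths are hypothetical; the resulting root directory is what would be passed to the train call.

```python
import shutil
import tempfile
from pathlib import Path


def organize_by_label(labelled_files, dest_root):
    """Copy each (file_path, label) pair into dest_root/<label>/."""
    dest_root = Path(dest_root)
    for file_path, label in labelled_files:
        label_dir = dest_root / label
        label_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(file_path, label_dir / Path(file_path).name)
    # Report the label directories that now exist under dest_root.
    return sorted(p.name for p in dest_root.iterdir() if p.is_dir())


# Demo on throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    pairs = []
    for name, label in [("a.jpg", "band_x"), ("b.jpg", "band_y"), ("c.jpg", "band_x")]:
        path = src / name
        path.write_bytes(b"placeholder")  # stands in for real image data
        pairs.append((path, label))
    labels = organize_by_label(pairs, Path(tmp) / "dataset")
    band_x_files = sorted(p.name for p in (Path(tmp) / "dataset" / "band_x").iterdir())

print(labels, band_x_files)
```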
dgtlmoon
@dgtlmoon
Sep 19 2016 16:07
got it x) thanks
do you accept donations of beer money? x)
Emmanuel Benazera
@beniz
Sep 19 2016 16:08
we accept IPA beers :)
dgtlmoon
@dgtlmoon
Sep 19 2016 16:09
I live in Prague.. we have the best beer in the world :D
back later, dinner
dgtlmoon
@dgtlmoon
Sep 19 2016 21:48
curl -X POST "http://localhost:8080/predict" -d '{
blob
in the docs, the -d should be on the same line as the rest of the curl command, or some bash interpreters will fail
dgtlmoon
@dgtlmoon
Sep 19 2016 22:04
{"status":{"code":201,"msg":"Created"}}{"status":{"code":201,"msg":"Created"},"head":{"method":"/train","job":1,"status":"running"}}
after issuing the 'Training the classifier' command
i get

```
$ curl -s -X GET "http://localhost:8080/train?service=imageserv&job=1&timeout=20"
{"status":{"code":200,"msg":"OK"},"head":{"method":"/train","job":1,"status":"error"},"body":{}}
ERROR - 22:03:48 - service imageserv training status call failed

ERROR - 22:03:48 - {"code":400,"msg":"BadRequest","dd_code":1005,"dd_msg":"Service Input Error"}
```

I'm using the 5 class image set
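A small sketch of reading the status reply quoted above. The reply carries two separate signals: the code for the status call itself and the state of the training job. The function name is hypothetical, and the field names are taken only from the JSON shown in this log.

```python
import json


def training_state(response_text):
    """Return (call status code, job status) from a GET /train reply body."""
    reply = json.loads(response_text)
    code = reply["status"]["code"]
    # "head" may be absent on some replies, so fall back to None.
    job_status = reply.get("head", {}).get("status")
    return code, job_status


# Reply copied from the log above.
reply = (
    '{"status":{"code":200,"msg":"OK"},'
    '"head":{"method":"/train","job":1,"status":"error"},"body":{}}'
)
state = training_state(reply)
print(state)
```

Here the call succeeds with code 200 while the job itself reports "error", so a script polling the job should check the job status and not only the call code.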