These are chat archives for beniz/deepdetect

6th
Jun 2018
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 16:20

:+1: i tried again with the latest changes:

    Start 1: ut_apidata
1/7 Test #1: ut_apidata .......................   Passed    1.92 sec
    Start 2: ut_conn
2/7 Test #2: ut_conn ..........................   Passed    1.03 sec
    Start 3: ut_jsonapi
3/7 Test #3: ut_jsonapi .......................   Passed    2.25 sec
    Start 4: ut_caffe_mlp
4/7 Test #4: ut_caffe_mlp .....................   Passed    0.09 sec
    Start 5: ut_caffeapi
5/7 Test #5: ut_caffeapi ......................***Failed  916.41 sec
    Start 6: ut_httpapi
6/7 Test #6: ut_httpapi .......................***Exception: Other 31.66 sec
    Start 7: ut_dlibapi
7/7 Test #7: ut_dlibapi .......................   Passed    2.47 sec

71% tests passed, 2 tests failed out of 7

Total Test time (real) = 955.86 sec

The following tests FAILED:
      5 - ut_caffeapi (Failed)
      6 - ut_httpapi (OTHER_FAULT)
Errors while running CTest

but running ut_httpapi again manually worked without a problem, and using ctest a second time fixed it as well. strange.

ut_caffeapi still errors however:

[  FAILED  ] 5 tests, listed below:
[  FAILED  ] caffeapi.service_train_txt
[  FAILED  ] caffeapi.service_train_txt_sparse
[  FAILED  ] caffeapi.service_train_txt_sparse_lr
[  FAILED  ] caffeapi.service_train_txt_char
[  FAILED  ] caffeapi.service_train_txt_char_resnet

doesn't break it for me, so no worries, just an FYI :)

if you want the specific output from each failed test just let me know
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 16:56
Unrelated (sorry for so many messages....) but it looks like merging #418 changed the caffe branch that is being used from master to float_img which no longer has the fix that resolved #417
Emmanuel Benazera
@beniz
Jun 06 2018 16:57
Hi, tests are passing for us, and yes the default GPU was reset back to 0 this morning.
you can actually change it by defining CUDA_VISIBLE_DEVICES on the terminal
so this should not happen.
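
For reference, a minimal sketch of selecting the GPU through CUDA_VISIBLE_DEVICES when launching the tests from a script rather than the terminal. The helper name `env_for_gpu` and the `ctest` invocation in the comment are illustrative assumptions, not part of DeepDetect.

```python
import os


def env_for_gpu(gpu_index, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set.

    `env_for_gpu` is a hypothetical helper for illustration only.
    The CUDA runtime only exposes the listed device(s) to the process,
    so the selected GPU appears as device 0 inside the program.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


# Hypothetical usage, assuming the tests are run with ctest from the build dir:
#   import subprocess
#   subprocess.run(["ctest"], env=env_for_gpu(1))
```

The same effect is obtained on the terminal by prefixing the command with `CUDA_VISIBLE_DEVICES=1`.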
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 16:58
:+1: sounds good. the issue with #418 is unrelated to that, however
Emmanuel Benazera
@beniz
Jun 06 2018 17:00
The fix for #417 was in caffe IIRC
it should be unaffected by DD merging stuff
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:00
it was - but #418 is now using a different branch of caffe
Emmanuel Benazera
@beniz
Jun 06 2018 17:00
ah..
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:00
URL https://github.com/beniz/caffe/archive/float_img.tar.gz now instead of master
Emmanuel Benazera
@beniz
Jun 06 2018 17:00
shit, my fault merging these things too quickly to unlock other changes
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:01
hah no worries :)
Emmanuel Benazera
@beniz
Jun 06 2018 17:40
thanks for catching this, it has been fixed and tested.
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:42
thanks for the quick turnaround!
Emmanuel Benazera
@beniz
Jun 06 2018 17:43
as more changes are coming, we need to get better at pre-checking everything.
we've avoided versioning so as not to suffer the maintenance of old versions
I guess that we'll need to slowly update our way of dealing with these changes
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:44
price of progress :)
fwiw this nsfw model is becoming truly annoying. before the mean values fix, i now know that it wasn't using the mean values in any way, but it was still (somewhat) reasonable and usable. now that it's using the mean values, it's basically always returning high confidence of 'safe for work' unless i leave the mean values out entirely.
i am so very tempted to train my own model or try the TF port of the model, but the tradeoff in time invested is still a factor.
Emmanuel Benazera
@beniz
Jun 06 2018 17:48
using the mean yields worse results ?
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:48
yep
Emmanuel Benazera
@beniz
Jun 06 2018 17:50
then look at what the tf port is doing as input image transforms..
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:51
yup i was looking at that
i can dig into this myself, but any idea offhand if the opencv mat that dede uses is [0, 255.0] or [0.0, 1.0]? i assume the first
Emmanuel Benazera
@beniz
Jun 06 2018 17:53
not sure what skimage does (we've banned it here, we have a sticker to remind everyone), but I don't get the *255
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:54
i believe they're converting the input from [0.0, 1.0] pixel values to [0.0, 255.0] pixel values
Emmanuel Benazera
@beniz
Jun 06 2018 17:55
sure, but what is bringing them into [0,1] ? skimage ?
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:56
looks like it
img_as_float
Convert an image to floating point format, with values in [0, 1].
Emmanuel Benazera
@beniz
Jun 06 2018 17:56
yep, ridiculous, so they do this then rescale...
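
The round trip discussed here can be sketched in plain Python. This assumes that the float conversion divides 8-bit pixel values by 255 (which is how `img_as_float` treats uint8 input) and that the later `*255` undoes it; the function names are illustrative and this is not the actual preprocessing code of the model.

```python
def to_unit_float(pixels):
    """Map 8-bit pixel values in [0, 255] to floats in [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]


def rescale_to_255(values):
    """Map floats in [0.0, 1.0] back to the [0.0, 255.0] range."""
    return [v * 255.0 for v in values]


pixels = [0, 1, 127, 128, 254, 255]
unit = to_unit_float(pixels)
round_trip = rescale_to_255(unit)

# Apart from floating point rounding, the values come back unchanged,
# so the two steps cancel out and only the dtype has changed.
max_error = max(abs(a - b) for a, b in zip(pixels, round_trip))
```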
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:56
yup! i agree, ridiculous.
if you look at the link you pasted further down, they do a tf 'version' of the preprocessing
Emmanuel Benazera
@beniz
Jun 06 2018 17:57
maybe it doesn't work without this :)
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 17:57
-_- that would be incredibly frustrating lol
Emmanuel Benazera
@beniz
Jun 06 2018 17:58
so if I read it correctly, using tf transforms, they do not remove the mean ?
Emmanuel Benazera
@beniz
Jun 06 2018 17:59
" # The whole jpeg encode/decode dance is neccessary to generate a result
# that matches the original model's (caffe) preprocessing"
...
this model is badly overfitted
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 18:00
yup.
but i don't know of another publicly available one for the same domain
Emmanuel Benazera
@beniz
Jun 06 2018 18:00
it should be easy to do
binary classifier, then calibrating it (see our tscaling that'll be part of DD soon), and then using the softmax as approx
or sigmoid
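
A minimal sketch of what calibrating a binary classifier could look like, assuming "tscaling" refers to temperature scaling: the logits are divided by a temperature T before the softmax. The logits and the value of T below are made up for illustration; in practice T would be fitted on held-out data, and this is not DeepDetect's implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (T > 1 softens the output)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative logits for the two classes (class 0, class 1).
logits = [4.0, 0.0]

raw = softmax(logits)                          # uncalibrated confidence
calibrated = softmax(logits, temperature=2.0)  # T = 2.0 is an assumed value
```

With T > 1 the predicted class stays the same while the reported confidence moves closer to 0.5, which is the point of calibrating an overconfident model.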
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 18:01
not sure if we have the resources to create a dataset for it, i'd have to see
Emmanuel Benazera
@beniz
Jun 06 2018 18:02
there's no difficulty and we have > 35M sfw images
I just don't want to have to look at the nsfw class data
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 18:02
yep, that's the thing.
you have >35M sfw or nsfw?
Emmanuel Benazera
@beniz
Jun 06 2018 18:03
sfw
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 18:03
yeah. it's the nsfw which is the tricky piece.
Emmanuel Benazera
@beniz
Jun 06 2018 18:03
we have 35k classes already labeled, we can sample from that. The nsfw is just scraping from any of these sites I think
that's what yahoo certainly did
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 20:47
hi @beniz - was looking through the code and i had a question for when you have a minute. It looks to me like on lines 210-211 (https://github.com/jolibrain/deepdetect/blob/master/src/backends/caffe/caffeinputconns.h#L210) the mean (from the scalar) is subtracted from the image in question.
but later after the image is converted to a datum, it looks like on lines 229-239 (https://github.com/jolibrain/deepdetect/blob/master/src/backends/caffe/caffeinputconns.h#L229) that the mean is subtracted (again) from each pixel value in the datum. am i interpreting this correctly?
thanks again!
Emmanuel Benazera
@beniz
Jun 06 2018 21:11
Yes this is bad indeed. The else should go away. We spotted this while putting caffe2 in, then misjudged it, it seems. We're still providing the ability to subtract either mean channels or full image mean, and here the mean channels appear to be applied twice.
You can PR a fix, it'll get merged tomorrow
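
To illustrate why applying the channel means twice matters, here is a small sketch in plain Python. The mean and pixel values are made-up numbers and `subtract_channel_means` is a hypothetical helper; this is not the code from caffeinputconns.h.

```python
def subtract_channel_means(pixel, means):
    """Subtract one mean per channel from a single pixel."""
    return [v - m for v, m in zip(pixel, means)]


means = [104.0, 117.0, 123.0]  # illustrative per-channel means
pixel = [120.0, 130.0, 140.0]  # illustrative pixel values

once = subtract_channel_means(pixel, means)   # what the model expects
twice = subtract_channel_means(once, means)   # the double subtraction

# Each channel ends up shifted by an extra -mean compared to the
# correctly preprocessed input.
extra_shift = [t - o for t, o in zip(twice, once)]
```

A model trained on inputs centered once sees every channel offset by a full extra mean, which could explain predictions collapsing toward one class.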
cchadowitz-pf
@cchadowitz-pf
Jun 06 2018 21:15
so is it better to subtract the mean values before the Mat is converted to a datum or after? I think you're suggesting it's better to do it before, and remove the loop where it subtracts it from each datum value?