These are chat archives for beniz/deepdetect

7th
Jul 2016
Isaacpm
@Isaacpm
Jul 07 2016 09:08
Hi @beniz how is it going?
Emmanuel Benazera
@beniz
Jul 07 2016 09:10
hi @Isaacpm good, thanks, how are things ? I haven't really understood your last email to be honest.
Isaacpm
@Isaacpm
Jul 07 2016 09:11
good, thanks. Trying to juggle with everything at the moment. Haven't had much time for the machine learning part
yeah, was going to ask if you had a chance to look at it
so the problem is that the confusion matrix comes out as a dictionary
what we do is copy and paste to an excel file so we can work with it better
(at least for now)
but as it's a Python dict, and so not ordered, the rows don't follow the same order as the columns
so, the order of the rows should be the same order as the columns in the first row, that will give the diagonal with the results for each category
as it's a dictionary, the keys don't come ordered, then we lose the rows order
does that explain it?
Emmanuel Benazera
@beniz
Jul 07 2016 09:17
do you have an example JSON output with cmfull handy ?
Isaacpm
@Isaacpm
Jul 07 2016 09:17
I was just thinking that, and that I should have sent it to you in the email
let me check
I actually do, will send it by email, if I paste it here it will be fun to say the least
Emmanuel Benazera
@beniz
Jul 07 2016 09:22
one line is enough I guess
Isaacpm
@Isaacpm
Jul 07 2016 09:23
mmmm, not sure, it has lots of categories, it looks messy
'': [inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf], 'department.home_and_garden.bath': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2857142857142857, 0.0, 0.0, 0.0, 0.42857142857142855, 0.07142857142857142, 0.0, 0.0, 0.0, 0.0, 0.07142857142857142, 0.0, 0.07142857142857142, 0.07142857142857142, 0.0, 0.0, 0.0, 0.0, 0.0],
Isaacpm
@Isaacpm
Jul 07 2016 09:28
those are the first two lines, but I've sent you the full thing
no idea about the first one, or if it should have the names of the categories instead of 'inf'
but the second line, which would be the first row of the matrix, doesn't have its own category as the first number
Emmanuel Benazera
@beniz
Jul 07 2016 09:31
the first line IMO indicates you have an issue with your classes, either you may have specified one more than there is, or something related to that.
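A minimal sketch of how such suspect rows could be spotted, assuming the output has already been loaded into a Python dict of lists (the pasted snippet looks like a Python repr rather than strict JSON). The helper name `rows_with_non_finite` and the shortened sample data are hypothetical:

```python
import math

def rows_with_non_finite(cmfull):
    """Return the keys of rows that contain inf or NaN values."""
    return [
        label
        for label, row in cmfull.items()
        if any(not math.isfinite(value) for value in row)
    ]

# Hypothetical, shortened stand-in for the output pasted above.
cmfull = {
    "": [float("inf"), float("inf"), float("inf")],
    "department.home_and_garden.bath": [0.0, 0.2857142857142857, 0.7142857142857143],
}

suspect = rows_with_non_finite(cmfull)
print(suspect)
```

An empty key with an all-inf row would show up in `suspect`, which may be a hint that the declared number of classes does not match the data.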
Isaacpm
@Isaacpm
Jul 07 2016 09:32
ok, so it should have the classes names?
I'll have to check that, but I didn't get any error, so I was assuming the number of classes was correct
Emmanuel Benazera
@beniz
Jul 07 2016 09:36
I don't think there's a header because the classes should be in order in the JSON array
Isaacpm
@Isaacpm
Jul 07 2016 09:37
ok, going to check that and run another test
Emmanuel Benazera
@beniz
Jul 07 2016 09:40
I will check on the order of keys, but the inf line is another issue I think, possibly with the number of classes
Isaacpm
@Isaacpm
Jul 07 2016 09:41
ok, will check that, as having that will allow me to order the rows correctly, even if they aren't
will get back to you as soon as I get the results, thanks :-)
Emmanuel Benazera
@beniz
Jul 07 2016 09:46
thanks, I will definitely take a look at it today, just so many micro tasks, and it's sunny outside... hard to resist :)
Isaacpm
@Isaacpm
Jul 07 2016 09:46
I feel the same, I'm missing the only two sunny days in London this summer
but I have a training course from 2pm to 10pm, so need to do all my own personal stuff in the morning
....
Emmanuel Benazera
@beniz
Jul 07 2016 09:47
ouch
Isaacpm
@Isaacpm
Jul 07 2016 09:47
can't complain much, though
work is paying for it
and thank god work is quiet and I can do this machine learning stuff and some more for my pet projects during the day
Emmanuel Benazera
@beniz
Jul 07 2016 09:48
:)
Isaacpm
@Isaacpm
Jul 07 2016 09:50
finishing something else for yet another side project and will get that training with the right number of classes, which is funny as I thought I was using it already :-(
Akshay Pai
@akshaypai
Jul 07 2016 10:16
Hello
Hi beniz, must say that DeepDetect is just amazing
one can build an entire framework for deeplearning classification using this. More like platform as a service. Hats off to you
Hope I can contribute in debugging CentOS / AWS AMI Linux issues, where docker containers are causing issues and defeating the purpose of using Docker itself
Emmanuel Benazera
@beniz
Jul 07 2016 10:19
@akshaypai thanks ML / DL is the next commodity on the stack, so better start now :) Regarding the CentOS + docker issue, then please report on your experiments here or in PM as we can iterate more quickly than inside the issue. We can update it later with main observations.
@Isaacpm I believe I see what you mean now, the cmfull rows may not be in particular order. Note that the inf are still wrong, and probably related to classes.
Akshay Pai
@akshaypai
Jul 07 2016 10:20
sure , will do
Emmanuel Benazera
@beniz
Jul 07 2016 10:21
@Isaacpm and I see where it comes from: originally, the rows were numbered according to the corresp.txt file; when we switched to string entries, the order could no longer be guaranteed, so this qualifies as another bug you've hunted down I guess!
Emmanuel Benazera
@beniz
Jul 07 2016 10:38
@Isaacpm though in practice you get the order via the cmdiag results
but the names are not given in the JSON...
Isaacpm
@Isaacpm
Jul 07 2016 10:40
not sure if I fully follow, but I guess I'm earning the bug hunter badge, lol
Emmanuel Benazera
@beniz
Jul 07 2016 10:40
we should make one :)
do you have any suggestion on a useful cmfull JSON format from your experience ?
Isaacpm
@Isaacpm
Jul 07 2016 10:42
let me think about that one, I thought a python list (json array?) could work, but let me give it a try and see if it's easy to use
Emmanuel Benazera
@beniz
Jul 07 2016 10:43
thanks, there are basically two ways of doing it I see: either having an index to re-rank the rows from the JSON as needed, or using an ordered array with an external list of row names.
Isaacpm
@Isaacpm
Jul 07 2016 10:44
that second one is the one I had in mind
Emmanuel Benazera
@beniz
Jul 07 2016 10:44
doesn't require external code yes
Isaacpm
@Isaacpm
Jul 07 2016 10:45
what I was thinking would be something in the form of a list of lists
ordered correctly, but not sure how easy that is to get in the code
something like:
[[category,category1,category2,...],[category,0.33,0.33,0.33],[category1,0,0.5,0.5],[category2,0.75,0.25,0]
and close the bracket
[[category,category1,category2,...],[category,0.33,0.33,0.33],[category1,0,0.5,0.5],[category2,0.75,0.25,0]]
that would work really well for what we are doing now with excel, and I think for the automation later
Emmanuel Benazera
@beniz
Jul 07 2016 10:48
I don't think you can do that as an array should be of a single type, no ?
Isaacpm
@Isaacpm
Jul 07 2016 10:48
you can in python
not sure in other languages
Emmanuel Benazera
@beniz
Jul 07 2016 10:48
what I'm saying is I don't think this is proper JSON
Isaacpm
@Isaacpm
Jul 07 2016 10:48
let me check that
Emmanuel Benazera
@beniz
Jul 07 2016 10:49
[[category1,...,categoryn],[0.33,0.33,0.33],...,[0.33,0.33,0.33]] would do though I think
but I'm not sure in fact
Isaacpm
@Isaacpm
Jul 07 2016 10:50
jsonlint.com accepts the json
Emmanuel Benazera
@beniz
Jul 07 2016 10:50
which one ?
Isaacpm
@Isaacpm
Jul 07 2016 10:50
[
["category", "category1", "category2"],
["category", 0.33, 0.33, 0.33],
["category1", 0, 0.5, 0.5],
["category2", 0.75, 0.25, 0]
]
Emmanuel Benazera
@beniz
Jul 07 2016 10:50
weird, I don't think dd can generate that from internal structures then...
Isaacpm
@Isaacpm
Jul 07 2016 10:51
that's the part I don't know, the internal dd workings, I know how the output would be useful for my use cases
in any case, anything that gives a list of the categories and a way to get the "rows" for each category would work for use
*us
even the current json if there is a key with the order
then you can pull each key in the right order following that 'order' key
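A minimal sketch of that idea in Python, assuming a hypothetical `order` key sits next to the per-category rows (this is the layout being proposed here, not something dd is known to emit):

```python
def ordered_rows(cm):
    """Return (labels, rows) with rows following the 'order' key."""
    labels = cm["order"]
    rows = [cm[label] for label in labels]
    return labels, rows

# Hypothetical data; the dict keys are deliberately out of order.
cm = {
    "category2": [0.75, 0.25, 0],
    "order": ["category", "category1", "category2"],
    "category": [0.33, 0.33, 0.33],
    "category1": [0, 0.5, 0.5],
}

labels, rows = ordered_rows(cm)
# With rows and columns in the same order, the diagonal holds
# the per-category results.
diagonal = [rows[i][i] for i in range(len(labels))]
print(labels)
print(diagonal)
```

One caveat with this layout: a category that happens to be named `order` would collide with the ordering key.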
Emmanuel Benazera
@beniz
Jul 07 2016 10:57
so, I've checked and mixed type arrays are supported in JSON, but I'm reluctant to use them: they are a pain to convert in some languages, and you need special code to acquire the data (e.g. if i = 0 then category else ...)
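To illustrate the special-casing mentioned here, a small sketch of reading the mixed-type layout in Python, using the sample values from the jsonlint example. Every data row has to be split into its leading label and its numeric tail:

```python
import json

raw = """
[
  ["category", "category1", "category2"],
  ["category", 0.33, 0.33, 0.33],
  ["category1", 0, 0.5, 0.5],
  ["category2", 0.75, 0.25, 0]
]
"""

table = json.loads(raw)
header = table[0]
# Position 0 of each data row is a string, the rest are numbers,
# so the two parts have to be pulled apart by index.
row_labels = [row[0] for row in table[1:]]
values = [[float(v) for v in row[1:]] for row in table[1:]]

print(row_labels == header)
print(values)
```

In Python this is short, but in statically typed languages the same split usually needs a custom parser or a union type, which is the cost being weighed here.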
Isaacpm
@Isaacpm
Jul 07 2016 10:57
would this work, then:
[
["category", "category1", "category2"],
["category":[0.33, 0.33, 0.33]],
["category1": [0, 0.5, 0.5]],
["category2":[0.75, 0.25, 0]]
]
or is it too complex?
sorry, wouldn't be lists
{
"order":["category", "category1", "category2"],
"category":[0.33, 0.33, 0.33],
"category1": [0, 0.5, 0.5],
"category2":[0.75, 0.25, 0]
}
not 100% about that one, though
Emmanuel Benazera
@beniz
Jul 07 2016 11:04
a less verbose one is "3_category":[],"1_category":[] but note that the results are not ordered
Isaacpm
@Isaacpm
Jul 07 2016 11:05
that's the thing, if they are not ordered, and you don't have a way to order them, the matrix is not very useful
Emmanuel Benazera
@beniz
Jul 07 2016 11:08
I will ask a friend who has good opinions on these matters. For programmatic use, being not ordered is OK, though annoying since it requires code.
Isaacpm
@Isaacpm
Jul 07 2016 11:10
ok, thanks, will wait for updates on this one, and also try to use the unordered one, we may still be able to use it for what we want
Emmanuel Benazera
@beniz
Jul 07 2016 11:11
do you want the patch to get the x_category form ?
I can even commit it for that matter, as it is better than the current one.
Isaacpm
@Isaacpm
Jul 07 2016 11:12
yes, that would do, I can order it with the x
how do I get it? or will you commit it?
ok, question answered :-)
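A rough sketch of ordering rows by the `x_category` prefix, assuming each key is a row index, an underscore, then the category name (as in the "3_category" example). The helper name and the sample data are hypothetical:

```python
def sort_prefixed_rows(cmfull):
    """Return (labels, rows) sorted by the numeric prefix of each key."""
    entries = []
    for key, row in cmfull.items():
        # Split on the first underscore only, since category names
        # may contain underscores themselves.
        index, _, label = key.partition("_")
        entries.append((int(index), label, row))
    entries.sort(key=lambda entry: entry[0])
    labels = [label for _, label, _ in entries]
    rows = [row for _, _, row in entries]
    return labels, rows

# Hypothetical data in arbitrary key order.
cmfull = {
    "2_category2": [0.75, 0.25, 0.0],
    "0_category": [0.33, 0.33, 0.33],
    "1_home_and_garden": [0.0, 0.5, 0.5],
}

labels, rows = sort_prefixed_rows(cmfull)
print(labels)
print(rows)
```

The prefix is converted to an integer before sorting because a plain string sort would put "10_..." before "2_...".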
Emmanuel Benazera
@beniz
Jul 07 2016 11:14
yes
Isaacpm
@Isaacpm
Jul 07 2016 11:14
thanks!
Emmanuel Benazera
@beniz
Jul 07 2016 11:14
not optimal, but can do
Isaacpm
@Isaacpm
Jul 07 2016 11:14
yep, will do until you find a better way to do it
Emmanuel Benazera
@beniz
Jul 07 2016 11:14
sure
Emmanuel Benazera
@beniz
Jul 07 2016 12:05
so one of the propositions by my friend versed in these matters is "cmfull": { "labels": ["talk_politics_mideas", "talk_politics_misc"], "data": [ [0, 1], [2, 3] ] }
Isaacpm
@Isaacpm
Jul 07 2016 12:06
that would work
you'd need to order the labels and the data to match, though
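For the Excel workflow mentioned earlier, a sketch of turning the proposed labels/data layout into CSV text. It assumes `labels` and `data` are already in matching order, as noted above; the function name `cmfull_to_csv` is made up for illustration:

```python
import csv
import io
import json

response = json.loads(
    '{"cmfull": {"labels": ["talk_politics_mideas", "talk_politics_misc"],'
    ' "data": [[0, 1], [2, 3]]}}'
)

def cmfull_to_csv(cmfull):
    """Render the matrix as CSV with a header row and a label column."""
    labels, data = cmfull["labels"], cmfull["data"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([""] + labels)  # empty corner cell, then column names
    for label, row in zip(labels, data):
        writer.writerow([label] + row)
    return buffer.getvalue()

csv_text = cmfull_to_csv(response["cmfull"])
print(csv_text)
```

The resulting text can be saved to a `.csv` file and opened in a spreadsheet, which would avoid the copy and paste step.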
Emmanuel Benazera
@beniz
Jul 07 2016 12:09
another one from the same conversation: {"ConfusionMatrix":{"sourceCategories":["Red","Green","Blue"],"targetCategories":["Red","Green","Blue"],"counts":[1,2,3,4,5,6,7,8,9]}}
the targetCategories is useless for a symmetric matrix
Isaacpm
@Isaacpm
Jul 07 2016 12:09
was going to say that
it would be the same as the previous one with labels and data
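A sketch of how the flat `counts` array could be turned back into rows. It assumes row-major order (one full row after another), which the proposal does not state, and the helper name is hypothetical:

```python
def reshape_counts(categories, counts):
    """Split a flat list of n*n counts into n rows of n values."""
    n = len(categories)
    if len(counts) != n * n:
        raise ValueError("expected %d counts, got %d" % (n * n, len(counts)))
    # Assumption: counts are stored row by row (row-major).
    return [counts[i * n:(i + 1) * n] for i in range(n)]

categories = ["Red", "Green", "Blue"]
counts = [1, 2, 3, 4, 5, 6, 7, 8, 9]

rows = reshape_counts(categories, counts)
print(rows)
```

After reshaping, the result has the same shape as the `data` list of lists in the labels/data proposal.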
danielgollas
@danielgollas
Jul 07 2016 18:41
@Isaacpm @beniz Aaah I'm so glad you are having this conversation, as I was wondering how the matrix was sorted
For a while I thought I could sort alphabetically using the category as the key
It worked for one of my classifiers but I got weird results for others.
I think the simplest and most compact way would be the last proposal, i.e. labels in a list, to determine the order, and then data as a list of lists to represent the rows, where each entry in the row corresponds to the label in the same index.
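A small sketch of looking up a single cell by name in that layout, using the sample values from the labels/data proposal. Which axis holds the true class and which the predicted class is not stated in this conversation, so the argument names below are neutral:

```python
labels = ["talk_politics_mideas", "talk_politics_misc"]
data = [[0, 1], [2, 3]]

# Map each label to its position; the same index is used for
# rows and columns.
index = {label: i for i, label in enumerate(labels)}

def cell(row_label, col_label):
    """Return the entry at the row and column named by the two labels."""
    return data[index[row_label]][index[col_label]]

value = cell("talk_politics_misc", "talk_politics_mideas")
print(value)
```

Because the order comes from `labels` rather than from sorting the names, this does not depend on the categories happening to be in alphabetical order.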
Emmanuel Benazera
@beniz
Jul 07 2016 18:51
yes, sorry about that, that was clearly a bug introduced when the indexes were modified
danielgollas
@danielgollas
Jul 07 2016 18:52
No problem, let me know if I can help test
Emmanuel Benazera
@beniz
Jul 07 2016 18:52
can one of you open an issue with the preferred solution ? I have a game to watch :)
danielgollas
@danielgollas
Jul 07 2016 18:53
hahaha
I will open one
Emmanuel Benazera
@beniz
Jul 07 2016 18:53
thanks :)
danielgollas
@danielgollas
Jul 07 2016 19:09
beniz/deepdetect#157
tmeroniown
@tmeroniown
Jul 07 2016 20:33
@beniz - how do I tell dede which image mean file to use
do I just place it in the model directory?
does it need to be called something specific?
Emmanuel Benazera
@beniz
Jul 07 2016 20:56
@tmeroniown if you place a mean.binaryproto into the model directory it is automatically used when working with images
if you train from images, it is automatically generated as well
tmeroniown
@tmeroniown
Jul 07 2016 21:03
I just need to call it mean.binaryproto?
Emmanuel Benazera
@beniz
Jul 07 2016 21:09
yes, at the moment it is hardcoded that way. If you want it otherwise, it should work by modifying the .prototxt directly
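A quick sketch for checking that a model directory contains the expected file before starting a service. Only the file name `mean.binaryproto` comes from the conversation; the helper name and the temporary directory standing in for a model directory are illustrative:

```python
import tempfile
from pathlib import Path

MEAN_FILE_NAME = "mean.binaryproto"

def has_mean_file(model_dir):
    """Return True if the model directory contains mean.binaryproto."""
    return (Path(model_dir) / MEAN_FILE_NAME).is_file()

# Demonstration with a temporary directory in place of a real model directory.
with tempfile.TemporaryDirectory() as model_dir:
    before = has_mean_file(model_dir)
    (Path(model_dir) / MEAN_FILE_NAME).write_bytes(b"")
    after = has_mean_file(model_dir)

print(before, after)
```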
tmeroniown
@tmeroniown
Jul 07 2016 21:11
thanks!
sangav
@sangav
Jul 07 2016 22:19
@beniz I am trying image classification using some of the pre-trained models you have. Most of them work except the age_model .. I am getting the following error - Error calling service [status:[code:500, dd_code:1007, dd_msg:src/caffe/data_transformer.cpp:169 / Check failed (custom): (height) <= (datum_height), msg:InternalError]]