These are chat archives for beniz/deepdetect

22nd
May 2017
roysG
@roysG
May 22 2017 05:00 UTC
@beniz, you wrote to me: "No idea about lambda nor have I heard about using DD+lambda. We do maintain official AMI on AWS, or you can build your own." But what is DD?
I am trying to find information about that.
My problem with Amazon EC2 is that it is very expensive if you need to run scripts for a few days or hours and need at least 20 servers or even more.
alkollo
@alkollo
May 22 2017 09:06 UTC
DD = DeepDetect I suppose :)
@roysG, on Google Cloud Platform, 1 Tesla K80 GPU + 2 CPUs, 13 GB RAM and a 40 GB disk on an Ubuntu image costs me about $0.78/hr. Plus Google gives $300 of credit to newly registered accounts. At Amazon I'm still waiting for GPU allocation approval... That's why I switched to Google Cloud. https://cloud.google.com/pricing/
roysG
@roysG
May 22 2017 10:20 UTC
Thanks
roysG
@roysG
May 22 2017 11:20 UTC
Why don't you use serverless?
Or in your case, "Google Functions"?
Emmanuel Benazera
@beniz
May 22 2017 11:28 UTC
From googling quickly, I don't see GPUs available for Lambda.
roysG
@roysG
May 22 2017 11:29 UTC
In case you just want to use the prediction function, you do not have to use a GPU; a CPU is fine too.
alkollo
@alkollo
May 22 2017 16:18 UTC

Hello,

I did some training with the DD image classification tutorial and got ugly numbers.
I used the GoogLeNet model with about 140 classes and around 1500-3000 images per class, for a total of 215,000 images.

batch size=64
iteration=19000
train_loss=6.23046
f1=0.0287021
accp=0.0272648
mcll=4.87055
acc=0.0272648
precision=0.0223203
recall=0.0401944
smoothed_loss=6.22597

I also noticed that I have an lr of 0.00135085 despite having set base_lr = 0.01:
INFO - 16:09:33 - Iteration 19520, lr = 0.00135085
I saw in the GoogLeNet training methodology that the learning rate should decrease by 4% every epoch; does that apply here?
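
If I understand Caffe's "step" policy correctly, that number is just my own schedule at work; with the stepsize=1000 and gamma=0.9 from my solver (posted below):

lr = base_lr * gamma^floor(iter / stepsize) = 0.01 * 0.9^19 ≈ 0.00135085

which matches the lr reported at iteration 19520 exactly.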

I'm worried that the precision and recall numbers are pretty low even after a lot of iterations. According to some docs, the GoogLeNet model should give a pretty good loss after a few thousand iterations, but at 19,000 iterations I still have a huge loss.

I plan to switch to ResNet-50 for a test, but as it has twice as many layers it should take much more time for not much accuracy gain (as long as I can't find the problem with GoogLeNet, I mean), and I don't think it would solve my current problem.

I could set the batch size to 128, as I do not actually use all of my Tesla K80's RAM, but I'm not sure it would change anything about the problem; it would just speed up the training.

Any pointers toward getting more precision would be welcome.

Thank you in advance.

Emmanuel Benazera
@beniz
May 22 2017 16:39 UTC
You probably did not specify the learning rate or the solver policy correctly. Look up the training parameters for Caffe in the API, and you can post your exact training call here if you like.
roysG
@roysG
May 22 2017 16:46 UTC
Hi, has anyone tried running the machine on serverless?
alkollo
@alkollo
May 22 2017 19:04 UTC

@beniz

I'm pretty sure my solver policy is correct; here it is:

curl -X POST "http://localhost:8080/train" -d '{
"service":"imageserv",
"async":true,
"parameters":{
"mllib":{
"gpu":true,
"net":{
"batch_size":80
},
"solver":{
"test_interval":500,
"iterations":30000,
"base_lr":0.01,
"stepsize":1000,
"gamma":0.9
}
},
"input":{
"connector":
"image",
"test_split":0.1,
"shuffle":true,
"width":224,
"height":224
},
"output":{
"measure":["acc","acc-1","acc-5","mcll","f1"]
}
},
"data":["/home/john/images"]
}'
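
If the schedule itself turns out to be part of the problem, I suppose I could also pin it down explicitly; if I read the DD Caffe API right, the solver block also accepts an lr_policy field (the values here are just an illustration, not something I have tested):

"solver":{
  "test_interval":500,
  "iterations":30000,
  "lr_policy":"step",
  "base_lr":0.01,
  "stepsize":10000,
  "gamma":0.1
}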

I found an interesting point here: soumith/imagenet-multiGPU.torch#2

It seems the model needs to have batch normalization, or weights initialized with xavier rather than constant like in the GoogLeNet model provided with DD.

I tried googlenet_bn and the loss decreased over the first 100 iterations, with a growing accuracy, though still low after 2500 iterations:

2500 iter

I0522 18:53:29.260016 3526 caffelib.cc:728] batch size=80
I0522 18:53:29.260059 3526 caffelib.cc:734] iteration=2500
I0522 18:53:29.260080 3526 caffelib.cc:734] train_loss=0.562861
I0522 18:53:29.260100 3526 caffelib.cc:734] mcll=5.32981
I0522 18:53:29.260110 3526 caffelib.cc:734] acc=0.0902724
I0522 18:53:29.260121 3526 caffelib.cc:734] acc-5=0.637354
I0522 18:53:29.260130 3526 caffelib.cc:734] f1=0.128766
I0522 18:53:29.260140 3526 caffelib.cc:734] accp=0.0902724
I0522 18:53:29.260150 3526 caffelib.cc:734] precision=0.0912658
I0522 18:53:29.260161 3526 caffelib.cc:734] recall=0.218579
I0522 18:53:31.154448 3526 caffelib.cc:794] smoothed_loss=0.561215

I will try replacing the constant weight init params with xavier in the googlenet model to see if it trains; roughly the change sketched below.
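
If I get the Caffe prototxt syntax right, the edit should look something like this for each convolution layer (taking the first GoogLeNet conv layer as an example):

layer {
  name: "conv1/7x7_s2"
  type: "Convolution"
  bottom: "data"
  top: "conv1/7x7_s2"
  convolution_param {
    num_output: 64
    pad: 3
    kernel_size: 7
    stride: 2
    weight_filler {
      type: "xavier"   # instead of type: "constant"
    }
  }
}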

Thanks,

roysG
@roysG
May 22 2017 22:24 UTC
I want to call two models in one call, for example getting the age and gender models in one request. Can I define it to work like that?
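Right now I assume I would have to make two separate predict calls, something like this (the service names are just examples):

curl -X POST "http://localhost:8080/predict" -d '{
  "service":"age",
  "data":["/path/to/image.jpg"]
}'

curl -X POST "http://localhost:8080/predict" -d '{
  "service":"gender",
  "data":["/path/to/image.jpg"]
}'

but I would prefer a single request that returns both.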