hey @beniz - happy new year! It's been a long time since I've looked through this stuff - tons of updates!! Was just wondering if you're still building docker images with TF included. I'm trying to update our build process and I'm running into quite a number of issues, both with changes in DD affecting the build and with changes in FloopCZ/tensorflow_cc (and tensorflow itself).
Anyways, was just wondering if an automated build (even CPU-only, for now) is still happening with tensorflow that I can compare my build process to. Thanks in advance!
@beniz hey, happy new year 😊 simsearch question: I have the object detector working nicely for the t-shirt artwork, and you said that training the ResNet classifier model with more categories (say 200 or so different bands' t-shirt artwork) can improve the search results. However, I've found that training the classifier on only 10 or so band names seems to give simsearch better results.. what are your thoughts? note - a single band might have many different designs, so maybe this is the problem..
so trained on 10 categories, the resnet simsearch model is kinda OK, but not brilliant
on 200 categories, it's sometimes workable but not great
Do you have any other tips for tuning simsearch? fortunately the 'domain' is all kinda similar, printed artwork on cloth.. maybe some image filter or tuning option?
[2022-02-04 16:06:58.042] [torchlib] [info] Initializing net from parameters:
[2022-02-04 16:06:58.042] [torchlib] [info] Creating layer / name=tdata / type=AnnotatedData
[2022-02-04 16:06:58.042] [torchlib] [info] Creating Layer tdata
[2022-02-04 16:06:58.042] [torchlib] [info] tdata -> data
[2022-02-04 16:06:58.042] [torchlib] [info] tdata -> label
terminate called after throwing an instance of 'CaffeErrorException'
what(): ./include/caffe/util/db_lmdb.hpp:15 / Check failed (custom): (mdb_status) == (0)
0# dd::OatppJsonAPI::abort(int) at /opt/deepdetect/src/oatppjsonapi.cc:255
1# 0x00007F24AC053210 in /lib/x86_64-linux-gnu/libc.so.6
2# raise in /lib/x86_64-linux-gnu/libc.so.6
3# abort in /lib/x86_64-linux-gnu/libc.so.6
4# 0x00007F24AC477911 in /lib/x86_64-linux-gnu/libstdc++.so.6
5# 0x00007F24AC48338C in /lib/x86_64-linux-gnu/libstdc++.so.6
6# 0x00007F24AC4833F7 in /lib/x86_64-linux-gnu/libstdc++.so.6
7# 0x00007F24AC4836A9 in /lib/x86_64-linux-gnu/libstdc++.so.6
8# caffe::db::MDB_CHECK(int) at ./include/caffe/util/db_lmdb.hpp:15
9# caffe::db::LMDB::Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, caffe::db::Mode) at src/caffe/util/db_lmdb.cpp:40
10# caffe::DataReader<caffe::AnnotatedDatum>::Body::InternalThreadEntry() at src/caffe/data_reader.cpp:95
11# 0x00007F24B0DC343B in /lib/x86_64-linux-gnu/libboost_thread.so.1.71.0
12# start_thread in /lib/x86_64-linux-gnu/libpthread.so.0
13# __clone in /lib/x86_64-linux-gnu/libc.so.6
Aborted (core dumped)
this is on the current jolibrain/deepdetect_cpu
with https://www.deepdetect.com/downloads/platform/pretrained/ssd_300/VGG_rotate_generic_detect_v2_SSD_rotate_300x300_iter_115000.caffemodel
I think it used to work before.. checking the GPU version
src/caffe/util/db_lmdb.cpp, so LMDB related.. maybe.. checking
x/detection$ ls -al model/train.lmdb/
total 29932
drwxr--r-- 2 dgtlmoon dgtlmoon 4096 Feb 4 17:10 .
drwxrwxrwx 3 dgtlmoon dgtlmoon 4096 Feb 4 17:10 ..
-rw-r--r-- 1 dgtlmoon dgtlmoon 30638080 Feb 4 17:10 data.mdb
-rw-r--r-- 1 dgtlmoon dgtlmoon 8192 Feb 4 17:10 lock.mdb
looks at least like it's writing; I'm nuking that dir when I create the service, so these are always fresh
train.lmdb/data.mdb is created without problem, brand new every time, and THEN the crash happens (an abort from the uncaught CaffeErrorException, going by the trace)
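Side note on the listing above: train.lmdb shows up as drwxr--r--, so only the owner has the search (x) bit on the directory. Whether that has anything to do with the mdb_status check failing is not established here; it would only matter if the process opening the database runs as a different user than the one that wrote it (an assumption, e.g. with a bind-mounted volume). A minimal stdlib-only sketch for looking at that, with made-up helper names (permission_report, non_owner_can_traverse) that are not part of DeepDetect or LMDB:

```python
import os
import stat


def permission_report(db_dir):
    """Return (path, mode string, owner uid) for an LMDB directory and its files."""
    entries = [db_dir] + sorted(
        os.path.join(db_dir, name) for name in os.listdir(db_dir)
    )
    report = []
    for path in entries:
        st = os.stat(path)
        # stat.filemode gives the same string that `ls -l` prints, e.g. drwxr--r--
        report.append((path, stat.filemode(st.st_mode), st.st_uid))
    return report


def non_owner_can_traverse(db_dir):
    """True if both group and others have the execute (search) bit on the directory."""
    mode = os.stat(db_dir).st_mode
    return bool(mode & stat.S_IXGRP) and bool(mode & stat.S_IXOTH)
```

Calling permission_report("model/train.lmdb") from the same working directory as the ls above would list the directory, data.mdb and lock.mdb with their owners; comparing the uid with the one the server process runs under is the part that has to be checked by hand.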
curl -X POST "http://localhost:8080/train" -d '
{
  "service": "location",
  "async": true,
  "parameters": {
    "input": {
      "db": true,
      "db_width": 512,
      "db_height": 512,
      "width": 300,
      "height": 300
    },
    "mllib": {
      "resume": false,
      "net": {
        "batch_size": 20,
        "test_batch_size": 12
      },
      "solver": {
        "iterations": 50000,
        "test_interval": 500,
        "snapshot": 1000,
        "base_lr": 0.0001
      },
      "bbox": true
    },
    "output": {
      "measure": [
        "map"
      ]
    }
  },
  "data": [ "/tags_dataset/bottom/bottom-images.txt" ]
}
'
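For anyone who wants to replay the same call from a script rather than curl, here is a rough Python equivalent using only the standard library. The payload values are copied from the curl call above; the function names build_train_request and post_train are made up for this sketch, and the shape of the server's reply is not assumed beyond it being JSON.

```python
import json
import urllib.request


def build_train_request():
    """Same body as the curl call above, as a Python dict."""
    return {
        "service": "location",
        "async": True,
        "parameters": {
            "input": {
                "db": True,
                "db_width": 512,
                "db_height": 512,
                "width": 300,
                "height": 300,
            },
            "mllib": {
                "resume": False,
                "net": {"batch_size": 20, "test_batch_size": 12},
                "solver": {
                    "iterations": 50000,
                    "test_interval": 500,
                    "snapshot": 1000,
                    "base_lr": 0.0001,
                },
                "bbox": True,
            },
            "output": {"measure": ["map"]},
        },
        "data": ["/tags_dataset/bottom/bottom-images.txt"],
    }


def post_train(payload, url="http://localhost:8080/train"):
    """POST the payload to the train endpoint and return the decoded JSON reply."""
    body = json.dumps(payload).encode("utf-8")
    # No explicit Content-Type is set, mirroring the plain `curl -d` call.
    req = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping the body as a dict makes it easy to flip a single setting (batch_size, db_width, and so on) between attempts while narrowing down the crash; post_train(build_train_request()) sends it once the server is up.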
caffe, for absolutely no other reason than that's what's in the examples; maybe I should try torch? (trying a squeezenet object detector here)