These are chat archives for beniz/deepdetect

31st May 2018
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:25

hi @beniz - trying to build the tests with CUDA enabled and i'm getting an issue linking libcaffe.so related to cublas:

/usr/bin/c++   -g -Wall -Wextra -fopenmp -fPIC -std=c++11  -DUSE_OPENCV -DUSE_LMDB   CMakeFiles/ut_conn.dir/ut-conn.cc.o  -o ut_conn  -L/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/lib  -L/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/src  -L/usr/lib/x86_64-linux-gnu/hdf5/serial  -L/home/cchadowitz/pfi-dev/deepdetect/build/dlib/build/lib -rdynamic ../src/libddetect.a /usr/local/cuda/lib64/libcudart_static.a -lpthread -ldl -lrt /usr/local/cuda/lib64/libcublas.so /usr/local/cuda/lib64/libcurand.so /usr/local/cuda/lib64/libcudart.so /usr/local/cuda/lib64/libcusolver.so -lcudnn -lglog -lgflags -lgtest -lgtest_main /usr/local/lib/libopencv_videostab.so.2.4.13 /usr/local/lib/libopencv_ts.a /usr/local/lib/libopencv_superres.so.2.4.13 /usr/local/lib/libopencv_stitching.so.2.4.13 /usr/local/lib/libopencv_contrib.so.2.4.13 -lcurlpp -lcurl -lhdf5_cpp -lboost_filesystem -lboost_thread -lboost_system -lboost_iostreams -lboost_chrono -lboost_date_time -lboost_atomic -lboost_regex -lpthread -lleveldb -lsnappy -llmdb -lhdf5_hl -lhdf5 -lopenblas -lcaffe -lprotobuf -ldlib /usr/local/lib/libopencv_nonfree.so.2.4.13 /usr/local/lib/libopencv_ocl.so.2.4.13 /usr/local/lib/libopencv_gpu.so.2.4.13 /usr/local/lib/libopencv_photo.so.2.4.13 /usr/local/lib/libopencv_objdetect.so.2.4.13 /usr/local/lib/libopencv_legacy.so.2.4.13 /usr/local/lib/libopencv_video.so.2.4.13 /usr/local/lib/libopencv_ml.so.2.4.13 /usr/local/lib/libopencv_calib3d.so.2.4.13 /usr/local/lib/libopencv_features2d.so.2.4.13 /usr/local/lib/libopencv_highgui.so.2.4.13 /usr/local/lib/libopencv_imgproc.so.2.4.13 /usr/local/lib/libopencv_flann.so.2.4.13 /usr/local/lib/libopencv_core.so.2.4.13 -ldl -lm -lpthread -lrt 
-Wl,-rpath,/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/lib:/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/src:/usr/lib/x86_64-linux-gnu/hdf5/serial:/home/cchadowitz/pfi-dev/deepdetect/build/dlib/build/lib:/usr/local/cuda/lib64:/usr/local/lib 
/usr/bin/ld: /home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/lib/libcaffe.so: undefined reference to symbol 'cublasCreate_v2'
/usr/local/cuda/lib64/libcublas.so: error adding symbols: DSO missing from command line

could this be related to #426? I'm using the following in my tests/CMakeLists.txt:

if (CUDA_FOUND)
  set(CUDA_LIB_DEPS ${CUDA_LIBRARIES} ${CUDA_CUBLAS_LIBRARIES} ${CUDA_curand_LIBRARY} ${CUDA_CUDART_LIBRARY} ${CUDA_cusolver_LIBRARY})
  if (USE_CUDNN)
    set(CUDA_LIB_DEPS ${CUDA_LIB_DEPS} ${CUDNN_LIBRARY})
  endif()
else()
  set(CUDA_LIB_DEPS "")
  add_definitions(-DCPU_ONLY)
endif()
i have /usr/local/cuda/lib64 in my LD_LIBRARY_PATH and /usr/local/cuda in my PATH
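Editor's note: GNU ld processes libraries in command-line order, so where libcublas.so appears relative to the libraries that use it can matter for errors like the one above. The following is a minimal shell sketch for listing those positions from a pasted link line. The helper name link_order, the shortened example command, and the choice of libraries in the pattern are illustrative assumptions, not part of deepdetect.

```shell
#!/bin/sh
# link_order: hypothetical helper, not part of deepdetect.
# Reads a link command on stdin and prints "position:argument" for every
# argument that names libcublas, libcaffe or libdlib, in command-line order.
link_order() {
  tr -s ' ' '\n' \
    | grep -v '^$' \
    | grep -n -E '^(-l|.*/lib)(cublas|caffe|dlib)(\.|$)'
}

# Example with a shortened, made-up link line (not the real one):
echo 'c++ ut.o -o ut ../src/libddetect.a /usr/local/cuda/lib64/libcublas.so -lcaffe -ldlib' \
  | link_order
```

Feeding it the full line printed by make VERBOSE=1 shows at a glance whether each library lands before or after the ones that depend on it.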
Emmanuel Benazera
@beniz
May 31 2018 15:29
hi @cchadowitz-pf this looks like the symptoms from #426, have you merged it ?
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:30
yup i have, and added the cudart and cusolver libs as well
Emmanuel Benazera
@beniz
May 31 2018 15:31
check that -lcublas is part of the linking line, use make VERBOSE=1 for this
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:31
i included the output from my make VERBOSE=1 above (same message that has the error) actually
Emmanuel Benazera
@beniz
May 31 2018 15:32
yep, add -lcublas to it maybe
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:35
it has /usr/local/cuda/lib64/libcublas.so in there already, would that not be enough? also i don't have this problem when building the main app (without tests)
i'll try -lcublas too though
Emmanuel Benazera
@beniz
May 31 2018 15:36
ah ok, apologies, was searching for -lcublas
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:36
no worries :)
Emmanuel Benazera
@beniz
May 31 2018 15:37
have you updated cuda recently ?
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:38
yup from that issue the other day. now on CUDA Version 8.0.61
Emmanuel Benazera
@beniz
May 31 2018 15:41
have you tried adding '-lcusolver' ?
try '-lcusparse' as well if missing
cusolver already there
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:43
yeah cusolver is there, not cusparse though
adding cusparse didn't help either. is there something going on in the tests that differs from the main app?
Emmanuel Benazera
@beniz
May 31 2018 15:46
you mean dede is linking fine ?
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:46
yup
Emmanuel Benazera
@beniz
May 31 2018 15:47
it shouldn't really, you can check by using make VERBOSE=1 on both dede and ut_conn
(like, rm dede && make VERBOSE=1)
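Editor's note: one way to carry out the comparison suggested here is to save each link command to a file and diff the argument lists. This is a minimal sketch under that assumption. The helper name diff_link_lines and the file names in the usage note are hypothetical, not part of deepdetect.

```shell
#!/bin/sh
# diff_link_lines: hypothetical helper, not part of deepdetect.
# $1 and $2 are files that each hold one link command copied from the
# output of `make VERBOSE=1`. Each command is split into one argument per
# line (order preserved) and the two lists are compared with diff.
# Exit status follows diff: 0 if identical, 1 if they differ.
diff_link_lines() {
  a=$(mktemp) && b=$(mktemp) || return 2
  tr -s ' ' '\n' < "$1" | grep -v '^$' > "$a"
  tr -s ' ' '\n' < "$2" | grep -v '^$' > "$b"
  diff "$a" "$b"
  rc_diff=$?
  rm -f "$a" "$b"
  return "$rc_diff"
}
```

For example, after pasting the two commands into files named dede.link and ut_conn.link, running diff_link_lines dede.link ut_conn.link lists only the arguments that are missing, extra, or in a different position.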
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:48
this is what i have for dede (builds+links without a problem):
/usr/bin/c++   -g -Wall -Wextra -fopenmp -fPIC -std=c++11  -DUSE_OPENCV -DUSE_LMDB   CMakeFiles/dede.dir/dede.cc.o  -o dede  -L/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/lib  -L/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/src  -L/usr/lib/x86_64-linux-gnu/hdf5/serial  -L/home/cchadowitz/pfi-dev/deepdetect/build/dlib/build/lib -rdynamic ../src/libddetect.a -ldlib /usr/local/cuda/lib64/libcudart_static.a -lpthread -ldl -lrt /usr/local/cuda/lib64/libcublas.so /usr/local/cuda/lib64/libcurand.so /usr/local/cuda/lib64/libcudart.so /usr/local/cuda/lib64/libcusolver.so -lcudnn -lglog -lgflags /usr/local/lib/libopencv_videostab.so.2.4.13 /usr/local/lib/libopencv_ts.a /usr/local/lib/libopencv_superres.so.2.4.13 /usr/local/lib/libopencv_stitching.so.2.4.13 /usr/local/lib/libopencv_contrib.so.2.4.13 -lcppnetlib-uri -lcurlpp -lcurl -lcrypto -lssl -lhdf5_cpp -lboost_filesystem -lboost_thread -lboost_system -lboost_iostreams -lboost_chrono -lboost_date_time -lboost_atomic -lboost_regex -lpthread -lleveldb -lsnappy -llmdb -lhdf5_hl -lhdf5 -lopenblas -lcaffe -lprotobuf /usr/local/lib/libopencv_nonfree.so.2.4.13 /usr/local/lib/libopencv_ocl.so.2.4.13 /usr/local/lib/libopencv_gpu.so.2.4.13 /usr/local/lib/libopencv_photo.so.2.4.13 /usr/local/lib/libopencv_objdetect.so.2.4.13 /usr/local/lib/libopencv_legacy.so.2.4.13 /usr/local/lib/libopencv_video.so.2.4.13 /usr/local/lib/libopencv_ml.so.2.4.13 /usr/local/lib/libopencv_calib3d.so.2.4.13 /usr/local/lib/libopencv_features2d.so.2.4.13 /usr/local/lib/libopencv_highgui.so.2.4.13 /usr/local/lib/libopencv_imgproc.so.2.4.13 /usr/local/lib/libopencv_flann.so.2.4.13 /usr/local/lib/libopencv_core.so.2.4.13 -ldl -lm -lpthread -lrt 
-Wl,-rpath,/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/lib:/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/src:/usr/lib/x86_64-linux-gnu/hdf5/serial:/home/cchadowitz/pfi-dev/deepdetect/build/dlib/build/lib:/usr/local/cuda/lib64:/usr/local/lib
and ut_conn (errors as above):
/usr/bin/c++   -g -Wall -Wextra -fopenmp -fPIC -std=c++11  -DUSE_OPENCV -DUSE_LMDB   CMakeFiles/ut_conn.dir/ut-conn.cc.o  -o ut_conn  -L/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/lib  -L/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/src  -L/usr/lib/x86_64-linux-gnu/hdf5/serial  -L/home/cchadowitz/pfi-dev/deepdetect/build/dlib/build/lib -rdynamic ../src/libddetect.a /usr/local/cuda/lib64/libcudart_static.a -lpthread -ldl -lrt /usr/local/cuda/lib64/libcublas.so /usr/local/cuda/lib64/libcurand.so /usr/local/cuda/lib64/libcudart.so /usr/local/cuda/lib64/libcusolver.so -lcudnn -lglog -lgflags -lgtest -lgtest_main /usr/local/lib/libopencv_videostab.so.2.4.13 /usr/local/lib/libopencv_ts.a /usr/local/lib/libopencv_superres.so.2.4.13 /usr/local/lib/libopencv_stitching.so.2.4.13 /usr/local/lib/libopencv_contrib.so.2.4.13 -lcurlpp -lcurl -lhdf5_cpp -lboost_filesystem -lboost_thread -lboost_system -lboost_iostreams -lboost_chrono -lboost_date_time -lboost_atomic -lboost_regex -lpthread -lleveldb -lsnappy -llmdb -lhdf5_hl -lhdf5 -lopenblas -lcaffe -lprotobuf -ldlib /usr/local/lib/libopencv_nonfree.so.2.4.13 /usr/local/lib/libopencv_ocl.so.2.4.13 /usr/local/lib/libopencv_gpu.so.2.4.13 /usr/local/lib/libopencv_photo.so.2.4.13 /usr/local/lib/libopencv_objdetect.so.2.4.13 /usr/local/lib/libopencv_legacy.so.2.4.13 /usr/local/lib/libopencv_video.so.2.4.13 /usr/local/lib/libopencv_ml.so.2.4.13 /usr/local/lib/libopencv_calib3d.so.2.4.13 /usr/local/lib/libopencv_features2d.so.2.4.13 /usr/local/lib/libopencv_highgui.so.2.4.13 /usr/local/lib/libopencv_imgproc.so.2.4.13 /usr/local/lib/libopencv_flann.so.2.4.13 /usr/local/lib/libopencv_core.so.2.4.13 -ldl -lm -lpthread -lrt 
-Wl,-rpath,/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/lib:/home/cchadowitz/pfi-dev/deepdetect/build/caffe_dd/src/caffe_dd/build/src:/usr/lib/x86_64-linux-gnu/hdf5/serial:/home/cchadowitz/pfi-dev/deepdetect/build/dlib/build/lib:/usr/local/cuda/lib64:/usr/local/lib
let me take dlib back out and see if that changes something
cchadowitz-pf
@cchadowitz-pf
May 31 2018 15:59
interesting - somehow dlib was messing that up...
Emmanuel Benazera
@beniz
May 31 2018 15:59
can't be worse than caffe2 or tf :)
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:00
haha
if you have a few minutes at some point to critique #425 i'm open to any and all suggestions
some tests will be coming at some point soon
Emmanuel Benazera
@beniz
May 31 2018 16:18
not today, I or someone else here will try to build when you give us a go
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:19
:+1: sounds good, thanks. you're welcome to grab it when you'd like and give it a shot. i'm currently focusing on tests so the main bulk of the dlib integration is likely good to go
Emmanuel Benazera
@beniz
May 31 2018 16:20
great :)
that was quick :)
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:22
bare minimum integration so far :) parameters at the moment are only confidence threshold, and specifying which of the two network architectures to use (along with repo path), and it returns confidence, bounding box, and label (if available)
i figure adding more functionality/features is simpler later on once the base groundwork is solid
Emmanuel Benazera
@beniz
May 31 2018 16:23
if you could find / share a model to test and use the code with, that'd be great
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:24
would you rather i commit it in the branch or attach it somewhere else?
Emmanuel Benazera
@beniz
May 31 2018 16:29
can the weights be downloaded from somewhere ?
if yes, updating the README.md would be the right way I think
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:32
yup, they're available bz2 compressed from the dlib website. i see that cmake already downloads .bz2 files from deepdetect.com and unpacks them for the tests, so i'll have it do the same for the dlib models too
i can update the README.md too
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:38
interesting, ran the tests and got 2 segfaults:
The following tests FAILED:
      2 - ut_conn (SEGFAULT)
      4 - ut_caffe_mlp (SEGFAULT)
full output (ignore the dlib tests for now):
$ ctest
Test project /home/cchadowitz/pfi-dev/deepdetect/build
    Start 1: ut_apidata
1/7 Test #1: ut_apidata .......................   Passed    0.12 sec
    Start 2: ut_conn
2/7 Test #2: ut_conn ..........................***Exception: SegFault  1.22 sec
    Start 3: ut_jsonapi
3/7 Test #3: ut_jsonapi .......................   Passed    2.10 sec
    Start 4: ut_caffe_mlp
4/7 Test #4: ut_caffe_mlp .....................***Exception: SegFault  0.24 sec
    Start 5: ut_caffeapi
5/7 Test #5: ut_caffeapi ......................   Passed  986.19 sec
    Start 6: ut_httpapi
6/7 Test #6: ut_httpapi .......................   Passed   31.40 sec
    Start 7: ut_dlibapi
7/7 Test #7: ut_dlibapi .......................***Failed    1.70 sec

57% tests passed, 3 tests failed out of 7

Total Test time (real) = 1022.98 sec

The following tests FAILED:
      2 - ut_conn (SEGFAULT)
      4 - ut_caffe_mlp (SEGFAULT)
      7 - ut_dlibapi (Failed)
Errors while running CTest
$ tests/ut_conn 
Running main() from gtest_main.cc
[==========] Running 16 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 8 tests from outputconn
[ RUN      ] outputconn.mlsoft
[       OK ] outputconn.mlsoft (0 ms)
[ RUN      ] outputconn.acc
[       OK ] outputconn.acc (1 ms)
[ RUN      ] outputconn.acc_v
[       OK ] outputconn.acc_v (0 ms)
[ RUN      ] outputconn.acck
[       OK ] outputconn.acck (0 ms)
[ RUN      ] outputconn.auc
[       OK ] outputconn.auc (0 ms)
[ RUN      ] outputconn.mcc
[       OK ] outputconn.mcc (0 ms)
[ RUN      ] outputconn.gini
gini=-0.333333
[       OK ] outputconn.gini (0 ms)
[ RUN      ] outputconn.cmfull
jstr={"measure":{"labels":["zero","one","two","three"],"f1":0.35294117352941187,"cmfull":[{"zero":[0.5,0.5,0.0,0.0]},{"one":[0.0,1.0,0.0,0.0]},{"two":[0.0,1.0,0.0,0.0]},{"three":[2.696539702293474e308,2.696539702293474e308,2.696539702293474e308,2.696539702293474e308]}],"cmdiag":[0.4999999975,0.9999999900000002,0.0,0.0],"recall":0.3333333305555556,"precision":0.3749999968750001,"accp":0.5}}
/home/cchadowitz/pfi-dev/deepdetect/tests/ut-conn.cc:283: Failure
Value of: jstr
  Actual: "{\"measure\":{\"labels\":[\"zero\",\"one\",\"two\",\"three\"],\"f1\":0.35294117352941187,\"cmfull\":[{\"zero\":[0.5,0.5,0.0,0.0]},{\"one\":[0.0,1.0,0.0,0.0]},{\"two\":[0.0,1.0,0.0,0.0]},{\"three\":[2.696539702293474e308,2.696539702293474e308,2.696539702293474e308,2.696539702293474e308]}],\"cmdiag\":[0.4999999975,0.9999999900000002,0.0,0.0],\"recall\":0.3333333305555556,\"precision\":0.3749999968750001,\"accp\":0.5}}"
Expected: "{\"measure\":{\"cmfull\":[{\"zero\":[0.5,0.5,0.0,0.0]},{\"one\":[0.0,1.0,0.0,0.0]},{\"two\":[0.0,1.0,0.0,0.0]},{\"three\":[2.696539702293474e308,2.696539702293474e308,2.696539702293474e308,2.696539702293474e308]}],\"precision\":0.3749999968750001,\"labels\":[\"zero\",\"one\",\"two\",\"three\"],\"f1\":0.35294117352941187,\"accp\":0.5,\"recall\":0.3333333305555556,\"cmdiag\":[0.4999999975,0.9999999900000002,0.0,0.0]}}"
[  FAILED  ] outputconn.cmfull (1 ms)
[----------] 8 tests from outputconn (2 ms total)

[----------] 8 tests from inputconn
[ RUN      ] inputconn.img
../examples/caffe/mnist//sample_digit.png
https://deepdetect.com/dd/examples/caffe/mnist/sample_digit.png
[       OK ] inputconn.img (947 ms)
[ RUN      ] inputconn.csv_mem1
[       OK ] inputconn.csv_mem1 (0 ms)
[ RUN      ] inputconn.csv_mem2
[       OK ] inputconn.csv_mem2 (0 ms)
[ RUN      ] inputconn.csv_copy
[       OK ] inputconn.csv_copy (0 ms)
[ RUN      ] inputconn.csv_categoricals1
trying to transform
Segmentation fault (core dumped)
Emmanuel Benazera
@beniz
May 31 2018 16:41
The ut_conn issue is known
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:41
:+1:
Emmanuel Benazera
@beniz
May 31 2018 16:42
We have a CI issue and we have to migrate to our own CI
cchadowitz-pf
@cchadowitz-pf
May 31 2018 16:42
gotcha, just wanted to make sure i hadn't broken existing tests somehow :)
Emmanuel Benazera
@beniz
May 31 2018 21:06
@cchadowitz-pf I'd suggest you add some information on how to use the dlib backend for object detection for instance, directly into the PR description. This would allow others to quickly test the code beyond the building part. My apologies if it's already somewhere and I've missed it. Thanks!