These are chat archives for beniz/deepdetect

22nd
Jun 2016
deter3
@deter3
Jun 22 2016 19:30
Hi Emmanuel, how are you doing? In your image search demo, I am wondering whether I can save the index into Elasticsearch instead of an Annoy index file. Say I have millions of images: I will need to extract features, and end users will search for similar images. I searched a lot online and did not find many showcases of this being done with Elasticsearch before.
Emmanuel Benazera
@beniz
Jun 22 2016 19:33
@deter3 there's a thread about it on the Elasticsearch discussion forum. Bottom line, it is not easy.
deter3
@deter3
Jun 22 2016 19:33
There is only one blog post, using a bag-of-words concept: http://sujitpal.blogspot.jp/2016/06/comparison-of-image-search-performance.html . It uses Elasticsearch payloads for the CosineDistance score.
Emmanuel Benazera
@beniz
Jun 22 2016 19:35
This is not the link. I'll have to look it up on my laptop.
deter3
@deter3
Jun 22 2016 19:35
Okay, thanks a lot!
Is the Elasticsearch discussion forum the one at https://discuss.elastic.co/c/elasticsearch ?
Emmanuel Benazera
@beniz
Jun 22 2016 19:35
Annoy should deal fine with millions of images, no? Maybe the indexing takes very long?
deter3
@deter3
Jun 22 2016 19:36
I have other filters that need to be applied to narrow down the number of images to search.
An Annoy index just cannot apply other filters, such as names or categories (I have around 1000 categories).
Are there any keywords I can search for on https://discuss.elastic.co/c/elasticsearch to locate the thread?
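A minimal sketch of the requirement described above: narrow the candidates with a metadata filter first, then rank only the survivors by feature similarity. It uses a brute-force scan in plain Python rather than Annoy or Elasticsearch, and the record layout (`name`, `category`, `features`) and the function `filtered_search` are hypothetical, chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def filtered_search(query, images, category, k=2):
    """Keep only images in `category`, then rank them by similarity to `query`."""
    candidates = [img for img in images if img["category"] == category]
    ranked = sorted(candidates,
                    key=lambda img: cosine(query, img["features"]),
                    reverse=True)
    return [img["name"] for img in ranked[:k]]

# Hypothetical toy data: three images with 3-dimensional features.
images = [
    {"name": "red_shoe.jpg", "category": "shoes", "features": [0.9, 0.1, 0.0]},
    {"name": "blue_shoe.jpg", "category": "shoes", "features": [0.1, 0.9, 0.0]},
    {"name": "red_bag.jpg", "category": "bags", "features": [1.0, 0.0, 0.0]},
]

# red_bag.jpg is the closest vector overall, but the category filter excludes it.
print(filtered_search([1.0, 0.0, 0.0], images, "shoes"))
```

The scan is linear in the number of filtered candidates, so it only makes sense when the filter is selective; it is meant to show the filter-then-rank shape of the problem, not a scalable index.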
FYI we have a contact inside ES, and this is why Zachary did respond; he's the expert on the matter at ES.
If you follow the links from the initial message, you'll see that it is not too difficult to do similarity search if you decide to write Java code as a plugin.
If you do something with image-match, we'd be interested if you can share the approach so that we can write a quick doc about how to do it.
deter3
@deter3
Jun 22 2016 19:48
Thanks for the link. Actually, we are doing image similarity search; it is not image matching with signatures.
I read the link from ES and it is hard for me to implement at the moment. I am just wondering: if I do not save binary codes into Elasticsearch, are there any alternative solutions that save the extracted features into Elasticsearch for searching, even if they are not the best solution?
deter3
@deter3
Jun 22 2016 19:59
https://www.elastic.co/blog/lucene-points-6.0 : multi-dimensional points are coming in Apache Lucene 6.0, and I expect them to come to ES very soon.
Emmanuel Benazera
@beniz
Jun 22 2016 20:02
k-d trees have higher access complexity than pHash in high dimensions, as long as the hashing function is not too heavy.
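To illustrate the hashing side of that comparison, here is a small sketch of binary codes compared by Hamming distance. Note the swap: it uses random-hyperplane hashing (a form of locality-sensitive hashing), not pHash itself, and the helper names `make_hyperplanes`, `binary_code`, and `hamming` are hypothetical. Comparing two codes costs one XOR and a bit count, regardless of the original feature dimension.

```python
import random

def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes (Gaussian normals) in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

def binary_code(vec, planes):
    """One bit per hyperplane: 1 if `vec` lies on its non-negative side."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")

planes = make_hyperplanes(dim=4, bits=16)
a = binary_code([1.0, 0.2, 0.0, 0.1], planes)
b = binary_code([2.0, 0.4, 0.0, 0.2], planes)     # same direction as a
c = binary_code([-1.0, -0.2, 0.0, -0.1], planes)  # opposite direction to a

print(hamming(a, b), hamming(a, c))
```

Vectors pointing the same way share every bit, and opposite vectors differ in every bit; vectors in between differ in a number of bits that grows with the angle between them, which is what makes the codes usable for approximate similarity search.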
deter3
@deter3
Jun 22 2016 20:05
I see, thanks a lot @beniz. I will see what I can do at the moment, which might be using the bag-of-words approach as a temporary solution: http://sujitpal.blogspot.jp/2016/06/comparison-of-image-search-performance.html .
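A rough sketch of what a bag-of-words encoding of image features could look like: the strongest feature dimensions become pseudo-words carrying their weight, so a text engine can index them alongside ordinary fields such as name or category. The `d<index>|<weight>` token layout, the `top_k` cut-off, and all function names here are assumptions for illustration, not necessarily what the linked blog post does. The encoding is lossy, since every dimension outside the top `top_k` is dropped.

```python
import math

def to_tokens(features, top_k=3):
    """Keep the top_k strongest dimensions and emit 'term|weight' tokens."""
    ranked = sorted(enumerate(features), key=lambda iv: iv[1], reverse=True)[:top_k]
    return " ".join(f"d{i}|{w:.3f}" for i, w in ranked if w > 0)

def parse_tokens(text):
    """Turn a 'term|weight' token string back into a sparse {term: weight} dict."""
    sparse = {}
    for token in text.split():
        term, weight = token.split("|")
        sparse[term] = float(weight)
    return sparse

def sparse_cosine(a, b):
    """Cosine similarity between two sparse {term: weight} dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical 5-dimensional feature vectors.
doc = to_tokens([0.0, 0.8, 0.1, 0.6, 0.05])
query = to_tokens([0.0, 0.7, 0.0, 0.7, 0.0])

print(doc)
print(sparse_cosine(parse_tokens(query), parse_tokens(doc)))
```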