These are chat archives for Automattic/mongoose

26th
Aug 2016
Jonathan
@MaddoxDevelopment
Aug 26 2016 00:09
yes look up ajax
LeonineKing1199
@LeonineKing1199
Aug 26 2016 16:54
So, if I'm doing a query where I'm using $in: [values], how large can my array be?
For example, is 13,000 too large?
David Baldwynn
@whitef0x0
Aug 26 2016 19:12
Is there any good way of generating api documents from a mongoose schema?
but probably you need to consider changing your schema or way how to handle it
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:33
Okay, well, it's good to know how big is too big and wow, too big is definitely waaaaaay too big in my case.
My only concern now is how slow the query might potentially be.
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:34
I remember we tested the $in operator for one API call and found it very, very slow
so we ended up splitting it into multiple $or conditions
which was much better
Michael Leanos
@mleanos
Aug 26 2016 19:38
Yes. Splitting into multiple $or conditions on a single query seems like the right approach.
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:40
Really?
How so in my case?
The naive approach is just a simple $in: [big_amount_of_values]
Should I just do $or: [ { field: { $in: [chunk_of_array] } }, { field: { $in: [other_chunk_of_array] } } ]?
Michael Leanos
@mleanos
Aug 26 2016 19:42
This seems like an interesting use case. If you have a backend process that accepts a list of _ids from the client and effectively uses them in an $in-type query, you'd probably want to implement this kind of multiple-$or-conditions solution. This would help protect you against erroneous input and malicious requests.
@LeonineKing1199 Yes to your last statement.. chunk the incoming list of _id's.
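A minimal sketch of the chunking idea discussed above, assuming plain JavaScript values. The helper name `buildChunkedInFilter` and its `field` and `chunkSize` parameters are hypothetical, not part of mongoose. Note that all chunks still travel in one query document, and the chat does not establish that this is faster than a single $in, so it is worth profiling both.

```javascript
// Hypothetical helper: split `values` into chunks and OR the chunks together.
// `field` and `chunkSize` are illustrative parameters, not mongoose API.
function buildChunkedInFilter(field, values, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const clauses = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    // Each clause has the shape { field: { $in: [chunk] } }.
    clauses.push({ [field]: { $in: values.slice(i, i + chunkSize) } });
  }
  // MongoDB rejects an empty $or array, so fall back to an empty $in,
  // which is valid and matches nothing.
  return clauses.length > 0 ? { $or: clauses } : { [field]: { $in: [] } };
}

// 'movieId' is an assumed field name used only for illustration.
const filter = buildChunkedInFilter('movieId', [1, 2, 3, 4, 5], 2);
console.log(JSON.stringify(filter));
```

The resulting object could then be passed as the filter to a query such as `Movie.find(filter)`, where `Movie` stands in for whatever model the application defines.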
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:42
Interesting.
I guess I'd have to profile it to see.
Michael Leanos
@mleanos
Aug 26 2016 19:43
the real issue here is the limitation on the bson size.
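A rough back-of-the-envelope check of the BSON limit for the 13,000-value case asked about earlier, assuming the values are ObjectIds (the chat never says what type the ids are). MongoDB's maximum BSON document size is 16 MB, and a BSON array is encoded as a document whose keys are the indexes "0", "1", "2", and so on. The function name is hypothetical, and the estimate covers only the array itself, not the rest of the query.

```javascript
// MongoDB's documented maximum BSON document size: 16 MB.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Estimate the encoded size of a BSON array holding `count` ObjectIds.
// Assumption: every element is a 12-byte ObjectId.
function estimateObjectIdArrayBytes(count) {
  let bytes = 4 + 1; // int32 length prefix + trailing 0x00 terminator
  for (let i = 0; i < count; i += 1) {
    const keyBytes = String(i).length + 1; // index as a NUL-terminated string
    bytes += 1 + keyBytes + 12; // type byte + key + 12-byte ObjectId
  }
  return bytes;
}

const estimate = estimateObjectIdArrayBytes(13000);
console.log(estimate, estimate < MAX_BSON_DOCUMENT_BYTES); // 235895 true
```

Under these assumptions the array is a few hundred kilobytes, so the size limit is far away at 13,000 ids; query speed is a separate question.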
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:43
Yeah, that Stack Overflow post seemed to allude to that.
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:44
@mleanos @LeonineKing1199 yep, but in our case we didn't have a huge array for $in, so we ended up with: $or: [ { "field": "value1" }, { "field": "value2" } ]
Michael Leanos
@mleanos
Aug 26 2016 19:44
As for the performance of all these $or's, I'm not sure. But it doesn't seem like it would be too much of an extraordinary requirement. If your application needs to be able to accept a list for this type of query, then you should have this safeguard in place
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:44
Okay, neat.
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:45
do you have an index on the field you're using with $in?
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:46
Nope.
XD
Should we?
Our DB size isn't that big right now but it's growing every day.
So it's really hard to tell if we're prematurely optimizing or not.
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:46
what is the size of that collection?
$in is a very slow operator
you have to
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:49
Oh man, the collection itself is actually getting pretty big. Hold on, lemme go check.
Okay, right now it's about 215k documents.
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:50
try to execute some simple $in stuff in the console
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:50
I'm also looking into moving a lot of this stuff into the application logic.
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:50
@LeonineKing1199 https://github.com/TylerBrock/mongo-hacker good stuff if you don't use it already
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:51
Hey, that seems pretty cool!
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:52
yep, run a few queries with $in and you will see the difference
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:53
Weird. Never thought $in would be such a slow operator!
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:54
are you using it with other fields, or is $in the only condition in that query?
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:55
I'm querying by a text field as well.
So it's documents with a certain title that have movie ids $in this long array of valid movie ids
Vlado Tesanovic
@vladotesanovic
Aug 26 2016 19:56
so you are asking if that movie is a valid movie?
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:56
I.e. multiple docs with the same title contain a pretty wide range of values.
Michael Leanos
@mleanos
Aug 26 2016 19:56
Using @vladotesanovic's technique seems best. Don't use the $in operator. Just build an array from your _ids whose entries look something like { 'fieldname': _id } and pass this array to your $or condition. Is that what you were saying, @vladotesanovic?
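A small sketch of the approach described above: one equality clause per id, passed as the $or array. The helper name `buildOrEqualityFilter`, the `extraConditions` parameter, and the field names `movieId` and `title` are assumptions for illustration, based on the movie-id and title query mentioned earlier. Whether this beats a plain $in is not established in the chat, so treat it as something to measure.

```javascript
// Hypothetical helper: turn a list of values into { $or: [ { field: v1 }, { field: v2 }, ... ] }.
// `extraConditions` holds any other conditions that must also match (e.g. a title).
function buildOrEqualityFilter(field, values, extraConditions = {}) {
  const clauses = values.map((value) => ({ [field]: value }));
  if (clauses.length === 0) {
    // MongoDB rejects an empty $or array; an empty $in matches nothing instead.
    return { ...extraConditions, [field]: { $in: [] } };
  }
  return { ...extraConditions, $or: clauses };
}

// 'movieId', 'title' and the sample values are made up for this example.
const filter = buildOrEqualityFilter('movieId', [101, 102], { title: 'Alien' });
console.log(JSON.stringify(filter));
// {"title":"Alien","$or":[{"movieId":101},{"movieId":102}]}
```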
LeonineKing1199
@LeonineKing1199
Aug 26 2016 19:56
No, right now I'm just gathering for some future algorithms.