These are chat archives for Automattic/mongoose

8th
Jun 2018
panigrah
@panigrah
Jun 08 2018 00:05
@NJM8 yes you will see errors in prod too.
panigrah
@panigrah
Jun 08 2018 00:19

hi All, I am looking for the most efficient way to insert a large number of documents and subdocuments. I am doing this currently.

for (const doc of documents) {
  /* does this document exist - if so skip */
  const d = await Doc.findOne({ name: doc.name });
  if (!d) {
    // new: true returns the upserted child instead of null on first insert
    const child = await Child.findOneAndUpdate({ name: doc.cName }, { /* doc.childAttributes */ }, { upsert: true, new: true });
    let parent = await Parent.findOneAndUpdate({ name: doc.pName }, { updated: Date.now(), $inc: { count: 1 } }, {});
    if (!parent) parent = await Parent.create({ name: doc.pName, updated: Date.now(), count: 1 /* , ...additional attributes from doc */ });
    /* now create the document */
    const newEntity = { /* fill in doc attributes, */ child: child._id, parent: parent._id };
    newDocs.push(newEntity);
    if (newDocs.length > 100) { await Doc.insertMany(newDocs); newDocs = []; }
  }
}

I need child and parent to be separate documents, because they are independently searchable. I need to process millions of these records and this loop is a bottleneck. What are some ways to make it more efficient?
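
One possible direction, sketched here under assumptions rather than measured: cut round trips by working in batches, so each parent is updated once per batch instead of once per document. The helpers `buildParentOps` and `chunk` are hypothetical names, and the sketch assumes the batch has already been filtered down to new documents (for example with a single `Doc.find({ name: { $in: names } })` per batch). The operation shape is the one `Model.bulkWrite` accepts.

```javascript
// Group incoming docs by parent name so each parent is touched once per batch.
// Returns an array of operations in the shape Model.bulkWrite expects.
function buildParentOps(docs, now = Date.now()) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.pName, (counts.get(doc.pName) || 0) + 1);
  }
  return Array.from(counts, ([name, count]) => ({
    updateOne: {
      filter: { name },
      update: { $set: { updated: now }, $inc: { count } },
      upsert: true // create the parent if it does not exist yet
    }
  }));
}

// Split an array into fixed-size batches for insertMany / bulkWrite.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical usage inside an async function, with the Parent model
// from the question:
//   for (const batch of chunk(documents, 1000)) {
//     await Parent.bulkWrite(buildParentOps(batch));
//   }
```

The batch size of 1000 is arbitrary. Whether this helps in practice depends on indexes on `name` and on how many distinct parents each batch contains.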

TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 05:59
I've hit a small but blocking issue trying to do a very simple deep populate in my web app, so that I can pull, populate, and hand information to my ejs templates and they can display comments inside of comments insi... all that!
I've opened a GitHub issue on the accompanying library here:
buunguyen/mongoose-deep-populate#58
Again, problems using this npm library to deeply populate (not vanilla mongoose population) some mongoose models.
Kev
@lineus
Jun 08 2018 10:18
@TheRedstoneTaco what version of mongoose are you using?
panigrah
@panigrah
Jun 08 2018 11:32
@NJM8 I am doing the bulk writes with insertMany but it’s the other checks I need to make to ensure that the related records exist.
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 13:15
@lineus 3.10.10
I just updated to 5.1.4
Kev
@lineus
Jun 08 2018 13:18
@TheRedstoneTaco you should be able to do that population from mongoose 5.1.4 without any plugins.
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 13:18
the problem still exists that the deepPopulate function (from the mongoose-deep-populate library) only populates one level as if I were using mongoose's built in .populate
deeply populate any level of nested models?
How?!?!
@lineus
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 13:26
That populates across a known limit of the nesting. If I know I'm only going to go so far as friends inside of friends, that will work, but what if I don't know how many friend models (their ids that need to be populated) are nested inside the friend models?
That is currently my problem
The mongoose-deep-populate library was supposed to do that, but it appears its real job is to populate one type of model inside another type of model, not the same type of model nested inside itself an arbitrary number of times like I have in my database.
@lineus
Francesco
@ffiorent_twitter
Jun 08 2018 13:28
Hi all, I'm looking for a way to extend SchemaTypes with extra options such as 'description' or 'unit' for numeric fields. Any ideas?
My goal is to use the Mongoose schema for client needs as well; otherwise I have to duplicate information.
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 13:42
how might I deeply populate a mongoose object of arbitrary nesting levels? Where each nest is of the same type of model "conversation" so you have conversation inside of conversation inside of conversation insi... until you reach the bottom.
I want to populate it entirely so the end result is one big parent conversation model with all its children, grandchildren, etc. having their ids replaced with the actual conversation models.
Kev
@lineus
Jun 08 2018 14:03
@TheRedstoneTaco I see what you mean. I'll play around with it at some point today and see if I can help.
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 14:07
I have just now tried my very best to manually implement the deep document population with my own custom recursion, and it should work exactly as I think it should but it actually has the EXACT same result as the mongoose-deep-populate library did so it may be something with my document!
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 15:33
I have a page model that has a conversation field, which is an array of conversation IDs.
Each conversation in turn has an array of other conversation IDs.
I want to deep populate the page model to replace the IDs in its conversation array with the actual conversation objects, and then the conversation IDs in those conversations with more actual conversation objects, and then the conversation IDs in tho... however deep it goes. I want the "foundPage" variable to be fully populated so I can render the template with it, like I'm trying to do at the bottom of the code.
However, since the populate calls are asynchronous, the render statement at the bottom runs before the recursive population function finishes, so the server just renders a page with conversation IDs instead of the actual conversations.
I think if I did something with promises in my recursive function, I could make it so that once the recursion has completely finished, a callback renders the template with the now fully populated page. But how do I implement the promises?
// GET - index - '/:pageTitle': Show page of some route
router.get("/:pageTitle", function(req, res) {
    Page.findOne({
        title: req.params.pageTitle
    }, function(err1, foundPage) {
        foundPage.populate("conversation", function(err2, populatedPage) {
            r(foundPage.conversation);
            function r(obj) {
                if (Array.isArray(obj)) {
                    for (var i = 0; i < obj.length; i++) {
                        r(obj[i]);
                    }
                } else {
                    obj.populate("conversation", function(err, populatedObj) {
                        populatedObj.conversation.forEach(function(conversation) {
                            r(conversation);
                        });
                    });
                }
            }
            res.render(req.params.pageTitle + "/index.ejs", {
                page: foundPage
            });
        });
    });
});
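
A minimal sketch of the promise-based recursion asked about above, not a definitive implementation. `populateDeep` and `fakePopulateOne` are hypothetical names, and the in-memory `store` only stands in for the database so the example runs on its own. With Mongoose 5.x the `populateOne` argument could be `doc => doc.populate("conversation").execPopulate()`, which returns a promise; in Mongoose 6 and later `doc.populate(...)` returns a promise by itself. The sketch assumes the conversations form a tree with no cycles.

```javascript
// Recursively resolve `conversation` references, waiting for every level.
// `populateOne(doc)` must return a promise that resolves once
// doc.conversation holds documents rather than ids.
async function populateDeep(doc, populateOne) {
  const populated = await populateOne(doc);
  // Promise.all waits for every child subtree before this call resolves.
  await Promise.all(
    populated.conversation.map((child) => populateDeep(child, populateOne))
  );
  return populated;
}

// In-memory stand-in for the database, only to make the sketch runnable.
const store = {
  a: { _id: 'a', conversation: ['b', 'c'] },
  b: { _id: 'b', conversation: ['d'] },
  c: { _id: 'c', conversation: [] },
  d: { _id: 'd', conversation: [] }
};

// Fake populate: swaps ids for stored objects on a later tick.
function fakePopulateOne(doc) {
  return new Promise((resolve) => {
    setTimeout(() => {
      doc.conversation = doc.conversation.map((id) => store[id]);
      resolve(doc);
    }, 1);
  });
}

// The .then callback only runs after the whole tree is populated,
// which is where res.render(...) would go in the route.
const done = populateDeep(store.a, fakePopulateOne).then((page) => {
  console.log(JSON.stringify(page, null, 4));
  return page;
});
```

The key point is that each recursive call returns a promise and the parent waits on all of its children, so rendering can be chained after the outermost promise instead of relying on a timer.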
Kev
@lineus
Jun 08 2018 16:31
@TheRedstoneTaco check out this gist. It's a hack for sure, but it works :smile: I increased the value of recursive pop Objects to 1100 before I ran out of heap space.
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 16:39
one sec!

@lineus I actually shortened my custom recursive population function down to this:

Page.findOne({
    title: "os"
}, function(err1, foundPage) {
    r(foundPage);
    function r(obj) {
        obj.populate("conversation", function(err, populatedObj) {
            populatedObj.conversation.forEach(function(conversation) {
                r(conversation);
            });
        });
    }
    console.log(JSON.stringify(foundPage, null, 4));
    //setTimeout(function() {
    //   console.log(JSON.stringify(foundPage, null, 4)); 
    //}, 500);
});

If I set a timer to wait about a second before console.log'ing the "foundPage" variable, you will see that this recursive approach actually works correctly, but the populate callbacks run asynchronously, after the console.log. So if I can just find a way to wait until the "r" function has completely finished, this problem should be solved!

The result needs to look like this!:
{
    "conversation": [
        {
            "conversation": [
                {
                    "conversation": [],
                    "_id": "5b1a029994ee700731996535",
                    "__v": 0
                }
            ],
            "_id": "5b19f6829facfd06e03714a9",
            "__v": 1
        }
    ],
    "_id": "5b0b0fe5f00f640be0eb1ea1",
    "title": "os",
    "__v": 7
}
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 16:44
(remember we're trying to get the console log of "foundPage" to be like this)
Chris Rutherford
@cjrutherford
Jun 08 2018 16:48
Now that most of the basics are complete, I need to query the database with three conditions: two of them can only match one value each, and the last one can match any of six different values. How would you guys recommend doing that? I'm trying to keep it down to one call.
Chris Rutherford
@cjrutherford
Jun 08 2018 16:50
as always, @lineus is ready with a link to the docs! I'll look at it
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 16:53
:)
Chris Rutherford
@cjrutherford
Jun 08 2018 16:54
so from looking at it I can do it this way:
function GetObjects(arrayValues, value2, value3){
    Model.find({
         where: {  
              prop1: {$in : [arrayValues.option1, arrayValues.option2, arrayValues.option3]},
              prop2: value2,
              prop3: value3
     }}).then(....);
};
Kev
@lineus
Jun 08 2018 16:59
I think the where should be $where but is superfluous anyway. Model.find({ prop1: { $in: [a, b, c, d, e, f] }, prop2: value2, prop3: value3 }) should be enough to get the job done.
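
A minimal sketch of the filter described above, written as a plain function so it can be inspected without a database. `buildFilter` and the sample values are made up for illustration; `Model` is whatever model is being queried.

```javascript
// Build a filter: prop1 may match any of the allowed values,
// prop2 and prop3 must match exactly.
function buildFilter(allowedProp1, value2, value3) {
  return {
    prop1: { $in: allowedProp1 },
    prop2: value2,
    prop3: value3
  };
}

const filter = buildFilter(['a', 'b', 'c', 'd', 'e', 'f'], 'x', 'y');
console.log(filter);

// Hypothetical usage: Model.find(filter).then((docs) => { /* ... */ });
```

All three conditions are ANDed together because they sit in the same filter object, so one call is enough.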
Kev
@lineus
Jun 08 2018 17:11
Daniel Netzer
@DanielNetzer
Jun 08 2018 17:15
do you guys know any good tutorials on how to construct complex data structures in MongoDB?
design/plan*
and do you know a good online chart/diagram editor to write those schemas in, so I can show references between docs?
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 17:37

@lineus and any others,
For some reason, I don't need any external deep population libraries, the vanilla mongoose population deep populates automatically now? ¯_(ツ)_/¯ I updated mongoose and integrated async/await and now the "page" variable gets populated almost correctly. The route is now:

// GET - index - '/:pageTitle': Show page of some route
router.get("/:pageTitle", function(req, res) {
    (async function(req, res) {
        var page = await Page.findOne({
            title: req.params.pageTitle
        }).populate([{ path: 'conversation'}, {path: 'conversation.author'}]);
        console.log(JSON.stringify(page, null, 4));
        res.render(req.params.pageTitle + "/index.ejs", {
            page: page
        });
    })(req, res);   
});

The page variable now comes out to (give or take a few attributes like title and version and all that):

[{
        "conversation": [{
                "conversation": [],
                "_id": "5b1a029994ee700731996535",
                "author": "5b193a3e6b21a608c68a4751"
        }],
        "_id": "5b19f6829facfd06e03714a9",
        "author": "5b193a3e6b21a608c68a4751"
}]

BUT I want to go a little more (to let the ejs template have all the functionality I desire) and populate those author fields as well so they will be replaced with user objects (that have an id, username, email, etc.)

Kev
@lineus
Jun 08 2018 17:44
that's what my example takes advantage of @TheRedstoneTaco, it will go as far as you tell it.
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 17:47
How can I get it to populate the author fields? @lineus
If you could just either modify your example or my example I'd use either!
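
One way this might be approached, as a sketch rather than a confirmed answer from the thread: Mongoose's populate options can be nested through the `populate` key, so a spec can be built to a chosen depth that fills in `author` at every level. `buildPopulate` is a hypothetical helper, and passing an array under `populate` is assumed to behave the same as at the top level.

```javascript
// Build a nested populate spec: `author` at every level, and
// `conversation` followed `depth` levels down.
function buildPopulate(depth) {
  const spec = { path: 'conversation', populate: [{ path: 'author' }] };
  if (depth > 1) {
    // Recurse into the next level of nested conversations.
    spec.populate.push(buildPopulate(depth - 1));
  }
  return spec;
}

console.log(JSON.stringify(buildPopulate(2), null, 2));

// Hypothetical usage inside an async route handler:
//   const page = await Page.findOne({ title: req.params.pageTitle })
//     .populate(buildPopulate(5));
```

The depth still has to be chosen up front, so this only goes as far as it is told to; truly unbounded nesting needs a recursive approach instead.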
Chris Rutherford
@cjrutherford
Jun 08 2018 17:56
@lineus awesome! now to implement
TheRedstoneTaco
@TheRedstoneTaco
Jun 08 2018 19:58
How can I make the recursive population portion of this route (the recursor function) synchronous so I will be able to render a template with the page after it is recursively populated?
var page = Page.findOne({
    title: "os"
}, function(err_1, foundPage) {
    function recursor(obj) {
        obj.populate("conversation", function(err_2, populatedObj) {
            obj.populate("author", function(err_3, finalObj) {
                finalObj.conversation.forEach(function(conversation) {
                    recursor(conversation);
                });
            });
        });
    }
    recursor(foundPage);
    // pass in "foundPage" to a template here, just console logging for now to see the result
    console.log(foundPage);
});