These are chat archives for rosshinkley/nightmare

13th
Jul 2016
Federico Marcos
@marcosfede
Jul 13 2016 07:35
guys, is it possible to use evaluate more than once per nightmare instance? I need to go to N links and retrieve data from each one to process, but I need the same instance since it is already authenticated in the site
Ross Hinkley
@rosshinkley
Jul 13 2016 14:35
@marcosfede yes, if you're trying to loop over something, you may want to read the primer on async ops and loops
@mingsterism where is run() being run? inside of an evaluate? what about getSelectors()?
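(for the looping question above, a minimal sketch of visiting N links one after another on the same instance. `visitAll` and `fakeVisit` are hypothetical names, not part of nightmare; with nightmare, `visit` would presumably be something like a function that returns `nightmare.goto(url).evaluate(...)`, which is assumed here only to be "a function that returns a promise".)

```javascript
// Run `visit` for each url strictly in sequence and collect the results.
// `visit` is a hypothetical stand-in: any function that takes a url and
// returns a promise (for example a nightmare goto/evaluate chain).
function visitAll(urls, visit) {
  return urls.reduce(function (chain, url) {
    return chain.then(function (results) {
      // The next visit only starts once the previous one has resolved.
      return visit(url).then(function (data) {
        results.push(data);
        return results;
      });
    });
  }, Promise.resolve([]));
}

// Usage with a fake visit function, so the sketch runs without a browser.
const fakeVisit = function (url) {
  return Promise.resolve('data from ' + url);
};

visitAll(['/page/1', '/page/2', '/page/3'], fakeVisit).then(function (results) {
  console.log(results);
});
```

the reduce keeps a single promise chain, so the visits never overlap, which is what you want when every step shares one authenticated instance.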
Ross Hinkley
@rosshinkley
Jul 13 2016 14:46
re separating concerns... in the past, especially for simple single-connection stuff, i think you can put all of the models on a single index or context object and include that
not sure that's best practices, though
i'd have to go back and do some reading
Mingsterism
@mingsterism
Jul 13 2016 15:12
hey @rosshinkley here's the code. http://hastebin.com/ejatapalaw.js
also, the funny thing is, even if i comment out line 9, lines 3-5 are still being invoked, as i still get the same error. No idea why.
Ross Hinkley
@rosshinkley
Jul 13 2016 15:14
probably because extractionSettings no longer exists? :)
Mingsterism
@mingsterism
Jul 13 2016 15:16
i still keep lines 3-5 though. shouldnt const extractionSettings just remain as it is.
i just dont call it by commenting out line 9.
Ross Hinkley
@rosshinkley
Jul 13 2016 15:16
i misread
i thought you were commenting out 3-5
okay
having extractionSettings like you have it - outside of an evaluate - is going to cause problems
notably because it's going to try to evaluate Array.from(document.getElementsByClassName(...
i'm not real clear on what you're trying to do... are you trying to define what gets selected outside of the evaluate to store in a config?
Mingsterism
@mingsterism
Jul 13 2016 15:18
actually i just want to make the code more modular/organised.
it's not meant to affect functionality whatsoever
Ross Hinkley
@rosshinkley
Jul 13 2016 15:21
what do you mean?
Mingsterism
@mingsterism
Jul 13 2016 15:21
i guess the question is: is there any way to keep Array.from(document.get... outside of the functions extraction() and run()?
Ross Hinkley
@rosshinkley
Jul 13 2016 15:21
well... technically yes
Mingsterism
@mingsterism
Jul 13 2016 15:21
i dont want to call Array.from from within the function because those functions are meant to be generic.
Ross Hinkley
@rosshinkley
Jul 13 2016 15:22
not sure i follow
Mingsterism
@mingsterism
Jul 13 2016 15:22
BaseBot.prototype.runNn = function(loop, count) {
    const self = this;
    const extractionSettings = {
        result: Array.from(document.getElementsByClassName('ListingItem-title title text-nn-primary--bright text-overflow')) 
    }
    function extraction() {
        console.log('running extract...')
        const final = new Object()
        // const result = extractionSettings.result;
        const result = Array.from(document.getElementsByClassName('ListingItem-title title text-nn-primary--bright text-overflow')) 
        const price = Array.from(document.querySelectorAll('.ListingItem-price span:nth-child(2)'))
        return result.map((x, y) => {
            return JSON.stringify([x.innerText, price[y].innerText])
        })
    }
in this code, is there a way to get const extractionSettings to work.
without throwing error ReferenceError: document is not defined
Ross Hinkley
@rosshinkley
Jul 13 2016 15:23
like i said, yes, but it's going to be a lot of work
and not worth it imho
Mingsterism
@mingsterism
Jul 13 2016 15:23
why alot of work?
Ross Hinkley
@rosshinkley
Jul 13 2016 15:23
because you're going to have to make a closure, stringify it, and then recreate the function inside of .evaluate()
Mingsterism
@mingsterism
Jul 13 2016 15:24
but if not, any way to improve usability / modularity of code?
Ross Hinkley
@rosshinkley
Jul 13 2016 15:24
what i would recommend
make the selector the thing you pass in with your extraction config
something like....
BaseBot.prototype.runNn = function(loop, count) {
    const self = this;
    const extractionSettings = {
        result: 'ListingItem-title title text-nn-primary--bright text-overflow',
        price: '.ListingItem-price span:nth-child(2)'
    }
    function extraction(settings) {
        console.log('running extract...')
        const final = new Object()
        const result = Array.from(document.getElementsByClassName(settings.result)) 
        const price = Array.from(document.querySelectorAll(settings.price))
        return result.map((x, y) => {
            return JSON.stringify([x.innerText, price[y].innerText])
        })
    }
... some time later...
nightmare.evaluate(extraction, extractionSettings)
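(a rough, runnable sketch of why passing the selectors as an argument works. `simulateEvaluate` and `fakeDocument` are hypothetical stand-ins, not nightmare's real internals: the assumption is only that the function travels as source text and its arguments as JSON, and that both are rebuilt somewhere a `document` exists.)

```javascript
// Hypothetical stand-in for the page: just enough of `document` for the demo.
const fakeDocument = {
  getElementsByClassName: function (names) {
    return names === 'title' ? [{ innerText: 'Flat A' }, { innerText: 'Flat B' }] : [];
  },
  querySelectorAll: function (selector) {
    return selector === '.price' ? [{ innerText: '100' }, { innerText: '200' }] : [];
  }
};

// Rough simulation of .evaluate(fn, ...args): serialize the function and its
// arguments, then rebuild and call them in a scope where `document` is defined.
function simulateEvaluate(fn, ...args) {
  const body = 'return (' + fn.toString() + ').apply(null, ' + JSON.stringify(args) + ');';
  return new Function('document', body)(fakeDocument);
}

// Generic extraction: it only knows about the settings it is handed.
function extraction(settings) {
  const result = Array.from(document.getElementsByClassName(settings.result));
  const price = Array.from(document.querySelectorAll(settings.price));
  return result.map(function (el, i) {
    return [el.innerText, price[i].innerText];
  });
}

// Plain strings survive the JSON round trip, so the config can live outside.
const rows = simulateEvaluate(extraction, { result: 'title', price: '.price' });
console.log(rows);
```

the same `extraction` can then be reused with a different settings object per site, since nothing site-specific is baked into the function.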
Mingsterism
@mingsterism
Jul 13 2016 15:29
ah. so you mean passing in an object literal into .evaluate?
Ross Hinkley
@rosshinkley
Jul 13 2016 15:29
i think you'll have a better time
yep
... is that kind of the direction you were heading when you said you wanted to make it generic?
Mingsterism
@mingsterism
Jul 13 2016 15:30
that is definitely in the right direction. its making the code more modular/reusable.
Ross Hinkley
@rosshinkley
Jul 13 2016 15:31
cool :)
Mingsterism
@mingsterism
Jul 13 2016 15:31
this will not work, right? nightmare.evaluate(extraction(extractionSettings))
Ross Hinkley
@rosshinkley
Jul 13 2016 15:31
not the way you have it, no
Mingsterism
@mingsterism
Jul 13 2016 15:32
why not that way. you mean it works for some cases?
rosshinkley @rosshinkley thinks
Ross Hinkley
@rosshinkley
Jul 13 2016 15:33
if extraction() returned a function based on extractionSettings, i think it would work
but again, you don't get variable lifting for free
Mingsterism
@mingsterism
Jul 13 2016 15:33
sry. but whats variable lifting.
Ross Hinkley
@rosshinkley
Jul 13 2016 15:33
sorry
i think a simple example might be helpful
function extraction(settings){
  return function(){
    //ordinarily, you'd be able to do this:
    var x = settings.whatever;
   }
};
but because of how nightmare transmits functions to Electron, settings won't exist when the returned function is executed
(that's why if you're passing arguments to the evaluated function, you have to pass them as parameters to .evaluate())
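(a small sketch of the closure problem described above, runnable in plain node. rebuilding the function from its source text with `new Function` is only an approximation of what happens when a function is shipped to another process; it is an assumption for illustration, not nightmare's actual code.)

```javascript
// A factory that returns a function closing over `settings`.
function extraction(settings) {
  return function () {
    return settings.whatever;
  };
}

const inner = extraction({ whatever: 'a selector' });

// In the same process the closure is intact.
const inProcess = inner();

// Rebuild the function from its source text only: the captured `settings`
// variable is not part of the text, so it is gone.
const rebuilt = new Function('return (' + inner.toString() + ');')();
let closureError = null;
try {
  rebuilt();
} catch (err) {
  closureError = err;
}

// Passing the value as an explicit (JSON-serializable) argument still works.
const takesSettings = function (settings) {
  return settings.whatever;
};
const rebuiltWithParam = new Function('return (' + takesSettings.toString() + ');')();
const viaArgument = rebuiltWithParam(JSON.parse(JSON.stringify({ whatever: 'a selector' })));

console.log(inProcess, closureError && closureError.name, viaArgument);
```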
Mingsterism
@mingsterism
Jul 13 2016 15:40
ross i still got the same error after trying your changes.
BaseBot.prototype.runNn = function(loop, count) {
    const self = this;
    const extractionSettings = {
        result: Array.from(document.getElementsByClassName('ListingItem-title title text-nn-primary--bright text-overflow')) 
    }
    function extraction(settings) {
        console.log('running extract...')
        const final = new Object()
        // const result = extractionSettings.result;
        const result = Array.from(document.getElementsByClassName(settings.result))  // <<<<<<<<, ERROR HERE
        const price = Array.from(document.querySelectorAll('.ListingItem-price span:nth-child(2)'))
        return result.map((x, y) => {
            return JSON.stringify([x.innerText, price[y].innerText])
        })
    }

    function run() {
        return self.nightmare 
        .goto(self.baseUrl)
        .then(function loops() {
            return self.nightmare
                .evaluate(self.extraction, extractionSettings)
i tested on const result
oh. wait. forgot to take out self.
Ross Hinkley
@rosshinkley
Jul 13 2016 15:41
... you're also selecting on Array.from(document.getElementsByClassName('ListingItem-title title text-nn-primary--bright text-overflow'))
Mingsterism
@mingsterism
Jul 13 2016 15:41
nope still doesnt work
why. whats wrong with that?
Ross Hinkley
@rosshinkley
Jul 13 2016 15:41
extractionSettings.result should be just the selector string
Mingsterism
@mingsterism
Jul 13 2016 15:41
ahhhh
Ross Hinkley
@rosshinkley
Jul 13 2016 15:43
you want to avoid having browser-specific globals evaluated on the nightmare (node) end of things
Mingsterism
@mingsterism
Jul 13 2016 15:43
works well :)
sry what did you mean by that?
Ross Hinkley
@rosshinkley
Jul 13 2016 15:43
so... in your last snippet
        result: Array.from(document.getElementsByClassName('ListingItem-title title text-nn-primary--bright text-overflow'))
    }
    function extraction(settings) {
document is (typically) a browser global
Mingsterism
@mingsterism
Jul 13 2016 15:44
i see. so you mean they should be called within a function?
Ross Hinkley
@rosshinkley
Jul 13 2016 15:45
yeah, so long as they're in a function, you should be fine
Mingsterism
@mingsterism
Jul 13 2016 15:45
thanks Ross :)
Ross Hinkley
@rosshinkley
Jul 13 2016 15:45
just make sure the function is run with .evaluate()
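(a tiny sketch of the point above, assuming it is run in plain node with no browser globals. `readTitles` is a hypothetical name. defining a function that mentions `document` is harmless; only calling it on the node side throws.)

```javascript
// In plain node there is no `document` global.
const hasDocument = typeof document !== 'undefined';

// Defining this is fine anywhere: `document` is only looked up when called.
function readTitles(className) {
  return Array.from(document.getElementsByClassName(className)).map(function (el) {
    return el.innerText;
  });
}

// Calling it on the node side reproduces the error from the chat;
// the same function is fine once it runs inside the page.
let nodeSideError = null;
if (!hasDocument) {
  try {
    readTitles('ListingItem-title');
  } catch (err) {
    nodeSideError = err;
  }
}

console.log(hasDocument, nodeSideError && nodeSideError.message);
```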
Mingsterism
@mingsterism
Jul 13 2016 15:45
just one more question
Ross Hinkley
@rosshinkley
Jul 13 2016 15:45
shoot
Mingsterism
@mingsterism
Jul 13 2016 15:46
any thoughts on the mongoose. you mentioned earlier to do some pattern?
right now im just doing this.
function CoreBot() {
    this.Nightmare = require('nightmare');
    this.nightmare = this.Nightmare({show: true});
    this.urls = [];
    const mongoose = require('mongoose');
    mongoose.connect('192.168.99.100:27017')
    this.db = mongoose.connection;
    this.schemas = mongoose.Schema({job: String})
    this.model = mongoose.model('newModel1', this.schemas)
}
Ross Hinkley
@rosshinkley
Jul 13 2016 15:46
that'll work for now
Mingsterism
@mingsterism
Jul 13 2016 15:47
but thing is what if i want to run multiple CoreBot instances.
they take different collection names
can i just put it in a different function and call it from CoreBot()
i tried something like that but didnt work.
function CoreBot() {
    this.Nightmare = require('nightmare');
    this.nightmare = this.Nightmare({show: true});
    this.urls = [];
    this.db = this.dbActivate;
}

CoreBot.prototype.dbActivate = function(modelName) {
    const mongoose = require('mongoose');
    mongoose.connect('192.168.99.100:27017')
    this.db = mongoose.connection;
    this.schemas = mongoose.Schema({place: String})
    this.model = mongoose.model(modelName, this.schemas)
}
Ross Hinkley
@rosshinkley
Jul 13 2016 15:48
uuhhh, off the cuff, yeah, i think that should work
i'd have to go experiment with it to be sure
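(one thing that may be worth checking in the snippet above: `this.db = this.dbActivate` stores the method without calling it, so nothing gets connected. a minimal sketch of calling it with a per-bot name instead. `createModel` is a hypothetical stand-in for the mongoose setup, injected so the sketch runs without a database; the nightmare setup is left out for the same reason.)

```javascript
// Each bot gets its own model name; `createModel` is a hypothetical function
// that would wrap the real database setup and return a model for that name.
function CoreBot(modelName, createModel) {
  this.urls = [];
  // Call the method; assigning `this.dbActivate` would only copy the function.
  this.dbActivate(modelName, createModel);
}

CoreBot.prototype.dbActivate = function (modelName, createModel) {
  this.modelName = modelName;
  this.model = createModel(modelName);
};

// Usage with a fake model factory standing in for the database layer.
const fakeCreateModel = function (name) {
  return { collection: name };
};

const botA = new CoreBot('listings', fakeCreateModel);
const botB = new CoreBot('jobs', fakeCreateModel);
console.log(botA.model, botB.model);
```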
Mingsterism
@mingsterism
Jul 13 2016 15:48
ok. ill give it a try as well. let u know the error.
btw just curious, would nightmare be suitable for a large scale project like this? http://www.michaelnielsen.org/ddi/how-to-crawl-a-quarter-billion-webpages-in-40-hours/
Ross Hinkley
@rosshinkley
Jul 13 2016 15:53
probably not
Mingsterism
@mingsterism
Jul 13 2016 15:53
why :(
Ross Hinkley
@rosshinkley
Jul 13 2016 15:53
resources, for starters
Mingsterism
@mingsterism
Jul 13 2016 15:54
i see. good point.
is there a way to figure out the max number of nightmare instances that can run on a cloud server?
eg: like digital ocean servers here https://www.digitalocean.com/pricing/
Ross Hinkley
@rosshinkley
Jul 13 2016 15:59
not directly, i don't think
because each nightmare instance requires an electron instance (at least for now), it's relatively resource-intensive
on the smallest DO droplet, i'd suspect you'd hit trouble after 8-12
i have no real evidence to back those numbers up
just guessing
Mingsterism
@mingsterism
Jul 13 2016 16:01
ok. no probs. its fine. just wanted to get some idea. thanks :)
Ross Hinkley
@rosshinkley
Jul 13 2016 16:04
no problem