These are chat archives for rosshinkley/nightmare

30th
May 2016
Mingsterism
@mingsterism
May 30 2016 01:51
Does anyone have an example of nightmare with generators?
I'm trying something like this, but I'm not too sure if the idea is correct.
function *starContent() {
    var selector1 = "slcontent3_0_wleft1_2_ulEditorsPickList";
    var selector2 = "three columns right ss";
    // yield the nightmare chain; whatever drives the generator has to resolve it
    var text = yield nightmare
        .evaluate((selector1, selector2) => {
            // this function runs in the page, so console.log writes to the page's console
            console.log('outer title')
            var titles = Array.prototype.slice.call(document.getElementById(selector1).getElementsByClassName(selector2))
            return titles.map((title) => {
                console.log('inner title')
                return title.innerText;
            })
        }, selector1, selector2)
    console.log('------------------')
    return text;
}
Mingsterism
@mingsterism
May 30 2016 02:07
I tried to invoke it in my code, but it did not work.
nightmare
    .goto('http://www.thestar.com.my/')
    .evaluate(function() {
        var ss = starContent();
        var titles = ss.next().value
        return titles.map(function (title) {
            return title.innerText;
        })
    })
    // .wait(5000)
    .end()
    .then(function(content) {
        console.log(content);
    })
    .catch(function(err) {
        console.log(err);
    })
mingk@DESKTOP-BMERQIM MINGW64 ~/coding/nightmarejs/electron-quick-start/nightmare1 (master)
$ node theStar.js
starContent is not defined
Ross Hinkley
@rosshinkley
May 30 2016 15:10
@mingsterism .evaluate() runs in the Electron context, not the Nightmare context
that's why it's complaining that starContent is not defined
are you trying to run generators on the browser client?
it looks like you're trying to mix and match promises and generators to control how nightmare executes... and... while doable, will have to look a bit different
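
A rough sketch of why the call fails, assuming the function passed to .evaluate() is serialized and executed inside the page, where nothing from the Node script's scope exists. The `runInPage` helper is hypothetical and uses Node's built-in `vm` module as a stand-in for the page; it is not Nightmare's API.

```javascript
const vm = require('vm');

// Hypothetical stand-in for .evaluate(): turn the function and its arguments
// into source text, then run that text in a fresh context that shares
// nothing with this file.
function runInPage(fn, ...args) {
  const source = `(${fn.toString()}).apply(null, ${JSON.stringify(args)})`;
  return vm.runInNewContext(source, {});
}

// Defined in the Node script only, like starContent() in the question.
function helper() {
  return ['a', 'b'];
}

// Fails: the serialized function refers to `helper`, which the "page" never saw.
let message = null;
try {
  runInPage(function () { return helper(); });
} catch (err) {
  message = err.message;
}

// Works: everything the function needs is passed in as a serializable argument.
const joined = runInPage(function (items) { return items.join(','); }, ['a', 'b']);

console.log(message); // helper is not defined
console.log(joined);  // a,b
```

This is also why the selectors in the first snippet are handed to .evaluate() as extra arguments instead of being read from the outer scope.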
Ross Hinkley
@rosshinkley
May 30 2016 15:16
i also feel compelled to ask if you're managing the generators yourself on purpose... using something like vo or co might be of some help
Mingsterism
@mingsterism
May 30 2016 15:31
@rosshinkley thanks Ross for the advice. I'm just wondering if there is a real benefit to using Promises together with Generators.
then again, I haven't used them enough to know how to make them work well together.
@rosshinkley yes, I saw those libraries before. Do you mind briefly sharing what those libraries actually do? I'm having a bit of a tough time understanding them.
Ross Hinkley
@rosshinkley
May 30 2016 15:35
they make the mechanics of using generators a little more intuitive while also providing support to make promises yieldable
ordinarily, with generators, you'd have to manage .next() and iteration yourself
i did a writeup on this topic...
might be worth your time :)
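
For reference, a minimal sketch of what managing .next() by hand looks like, next to a tiny runner that does the same stepping automatically. `fetchTitle`, `manual` and `run` are hypothetical names made up for this example, and `fetchTitle` is only a stand-in for an asynchronous step such as a nightmare chain. Libraries like co and vo handle more cases than this.

```javascript
// Stand-in for an asynchronous step (hypothetical).
function fetchTitle(name) {
  return new Promise((resolve) => setTimeout(() => resolve(`title of ${name}`), 10));
}

function* titles() {
  const first = yield fetchTitle('page one');
  const second = yield fetchTitle('page two');
  return [first, second];
}

// Managing the generator yourself: every yielded promise has to be waited on
// and its result pushed back in with .next(value).
function manual() {
  const gen = titles();
  return gen.next().value
    .then((first) => gen.next(first).value)
    .then((second) => gen.next(second).value);
}

// A tiny runner in the spirit of co/vo: it keeps calling .next() until the
// generator is done, resolving each yielded promise along the way.
function run(generatorFn) {
  return new Promise((resolve, reject) => {
    const gen = generatorFn();
    function step(advance) {
      let result;
      try {
        result = advance();
      } catch (err) {
        return reject(err);
      }
      if (result.done) return resolve(result.value);
      Promise.resolve(result.value).then(
        (value) => step(() => gen.next(value)),
        (err) => step(() => gen.throw(err))
      );
    }
    step(() => gen.next());
  });
}

const viaManual = manual();
const viaRunner = run(titles);
viaRunner.then((result) => console.log(result));
```

The manual version needs one more .then() for every yield added to the generator; the runner does not change.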
Mingsterism
@mingsterism
May 30 2016 15:40
thanks ross. looks very comprehensive. i'll read through it. :)
Ross Hinkley
@rosshinkley
May 30 2016 15:40
lmk if you have questions, i'll do my best to answer them
Mingsterism
@mingsterism
May 30 2016 15:42
this just came to mind: i've always wanted to know what a large scale nightmare script looks like.
Ross Hinkley
@rosshinkley
May 30 2016 15:42
that's a great question, i haven't seen many big ones publicly
Mingsterism
@mingsterism
May 30 2016 15:42
because i'm just experimenting on such a small scale. Wanted to hear about your experience running larger scale scripts. how are they architected?
Ross Hinkley
@rosshinkley
May 30 2016 15:42
(if anyone has one in the wild, i'd love to hear about it)
Mingsterism
@mingsterism
May 30 2016 15:43
i'm still quite new to coding, so i can't even imagine what it would be like.
Ross Hinkley
@rosshinkley
May 30 2016 15:43
the biggest one i've seen is probably nightmare-swiftly
which is really a bunch of handy wrapper methods to make poking around 99designs a bit easier
otherwise, i've seen it mostly used for testing
Mingsterism
@mingsterism
May 30 2016 15:45
is there any limitation to it being run on a larger scale?
Ross Hinkley
@rosshinkley
May 30 2016 15:45
great question
Mingsterism
@mingsterism
May 30 2016 15:46
do you see nightmare evolving into something like scrapy?
middlewares with a full architecture..
Ross Hinkley
@rosshinkley
May 30 2016 15:49
you could kind of do something like that now... nightmare is intentionally set up to be extensible
the biggest hurdle you'd probably run into doing large-scale scraping is memory problems
right now, you need to run an electron instance per nightmare instance
chromium doesn't require a ton of memory, but it's enough to seriously limit utility for scaling out
Mingsterism
@mingsterism
May 30 2016 15:53
do you see any solution for it ?
when you mean memory problem, did you mean running many instances of chromium/nightmare??
Ross Hinkley
@rosshinkley
May 30 2016 15:56

> do you see any solution for it ?

The lowest-hanging fruit there is to manage a single Electron instance (or a small number of them). This is partially accomplished with some of the changes in segmentio/nightmare#593 (namely, multiple window management)

> when you mean memory problem, did you mean running many instances of chromium/nightmare??

Yeah. Memory usage goes up pretty dramatically with multiple instances of Nightmare.
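
A minimal sketch of one way to keep memory in check in the meantime: cap how many jobs run at once, so only a small number of instances exist at any moment. This is not the multiple-window approach from the PR, just a limit on simultaneous work. `runLimited` and `fakeScrape` are hypothetical; in a real script each job would presumably create a Nightmare instance, do its work, and call .end().

```javascript
// Run the given jobs (functions returning promises) with at most `limit`
// in flight at once. Results come back in the original job order.
function runLimited(jobs, limit) {
  const results = new Array(jobs.length);
  let next = 0;

  // Each worker repeatedly takes the next unclaimed job until none are left.
  function worker() {
    if (next >= jobs.length) return Promise.resolve();
    const index = next++;
    return jobs[index]().then((value) => {
      results[index] = value;
      return worker();
    });
  }

  const workers = [];
  for (let i = 0; i < Math.min(limit, jobs.length); i++) workers.push(worker());
  return Promise.all(workers).then(() => results);
}

// Fake jobs that record how many are active at the same time.
let active = 0;
let peak = 0;
function fakeScrape(url) {
  return () => {
    active++;
    peak = Math.max(peak, active);
    return new Promise((resolve) => setTimeout(() => {
      active--;
      resolve(`scraped ${url}`);
    }, 10));
  };
}

const urls = ['a', 'b', 'c', 'd', 'e'];
const done = runLimited(urls.map(fakeScrape), 2);
done.then((results) => console.log(results, 'peak concurrency:', peak));
```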

Mingsterism
@mingsterism
May 30 2016 15:59
ok. got it. thanks.
btw, did you manage to take a look at readability api from google?
Ross Hinkley
@rosshinkley
May 30 2016 16:00
?
did we talk about this before and i forgot?
Mingsterism
@mingsterism
May 30 2016 16:01
yeah. i showed you the link before.
hmm no worries. it's called readability by google if you have time to check it out.
it was the auto parser algorithm for websites
Ross Hinkley
@rosshinkley
May 30 2016 16:04
oohhhh right right right
sorry
no, i haven't had a real chance to dig into that
Mingsterism
@mingsterism
May 30 2016 16:08
ah. ok no worries.
Ross Hinkley
@rosshinkley
May 30 2016 16:09
clearing the cobwebs
i think i still stand by what i said - making a general parser for content is very hard
readability is geared more for news articles, blogposts, etc i think
Mingsterism
@mingsterism
May 30 2016 16:12
yeah. will take a closer look at it again.
Rick Medina
@rickmed
May 30 2016 19:48
@rosshinkley have you tested the impact of nightmare being multiple electron instances vs one instance -> multiple windows?
I would imagine that chromium instances would require much more memory than electron processes?
Ross Hinkley
@rosshinkley
May 30 2016 19:49
@rickmed in probably oversimplified terms, electron is a chromium instance
Rick Medina
@rickmed
May 30 2016 19:50
isn't electron a node instance and browserwindows chromium instances?
Ross Hinkley
@rosshinkley
May 30 2016 19:50
and to answer your question, a while back i did some tinkering with spinning up new windows, and anecdotally i don't think it required as much memory
electron is a nice api abstraction in javascript calling the chromium methods directly, to the best of my understanding
Rick Medina
@rickmed
May 30 2016 19:52
on second thought, electron is run with the compiled binary so I don't think it's that easy to say "isn't electron a node instance and browserwindows chromium instances?"
rosshinkley @rosshinkley nods
Rick Medina
@rickmed
May 30 2016 19:55
my thought was that if electron was a ~nodejs instance and the browser windows were ~chromium instances, considering how much memory chrome takes (very different animal to chromium I know) one would predict that browser windows would require much more memory than the electron process, making the resource savings from nightmare's change to only one electron process negligible
buuut, only a test would determine I'd say
Ross Hinkley
@rosshinkley
May 30 2016 19:55
browserwindows !== chromium instances, though
(at least, not necessarily)
i'd expect resource use to be muuuuch lower
Rick Medina
@rickmed
May 30 2016 19:57
FWIW, i did a quick test nightmare vs phantom and nightmare was much more lightweight
Ross Hinkley
@rosshinkley
May 30 2016 19:57
in terms of resource use?
Rick Medina
@rickmed
May 30 2016 19:57
one window
yes
Ross Hinkley
@rosshinkley
May 30 2016 19:57
or speed?
(or both? :P)
that's interesting, i would not have expected that
Rick Medina
@rickmed
May 30 2016 19:58
like 2x-3x less memory
didn't test speed
both were super fast off the top of my head
yeah, me neither
a quick test opened the equivalent script (on windows), phantom was ~80mb, nightmare ~30mb something like that
Ross Hinkley
@rosshinkley
May 30 2016 20:00
hhhuh
makes me wonder what phantom is doing internally, then
... actually, doesn't phantom have its own DOM renderer?
Rick Medina
@rickmed
May 30 2016 20:01
yes
Ross Hinkley
@rosshinkley
May 30 2016 20:01
it's been a hot mess of forever since i've looked at it
Rick Medina
@rickmed
May 30 2016 20:01
especially since phantom 2.0 is much more heavyweight than its predecessor
no idea about speed and such
Ross Hinkley
@rosshinkley
May 30 2016 20:02
i can maybe-kind-of see it if it's doing pixel rendering internally
because Chromium outsources that job
Rick Medina
@rickmed
May 30 2016 20:02
how ?
Ross Hinkley
@rosshinkley
May 30 2016 20:02
goes to the framebuffer instead of taking care of it internally
Rick Medina
@rickmed
May 30 2016 20:02
mmm...
Ross Hinkley
@rosshinkley
May 30 2016 20:03
(will change with ... is it aura? aurora?)
Rick Medina
@rickmed
May 30 2016 20:03
so not a valid test the one i did lol
not even close
Ross Hinkley
@rosshinkley
May 30 2016 20:03
well, it might be
i'm telling you all of this from the hip :P
i'd have to go spend some time with phantom to be anywhere near certain
Rick Medina
@rickmed
May 30 2016 20:05
lol
Ross Hinkley
@rosshinkley
May 30 2016 20:05
i'm trying to justify why Phantom would use more memory
Rick Medina
@rickmed
May 30 2016 20:05
yeah, might be a fun test on a clean vm, a considerable size script with instances, nightmare go...phantom go and see the resources diff
I don't think it would honestly
Ross Hinkley
@rosshinkley
May 30 2016 20:07
i wouldn't be surprised if it's roughly a wash, resource wise
Rick Medina
@rickmed
May 30 2016 20:07
did a quick search
phantom is using webkit
Ross Hinkley
@rosshinkley
May 30 2016 20:07
yep
Rick Medina
@rickmed
May 30 2016 20:08
(was that the case always...?)
Ross Hinkley
@rosshinkley
May 30 2016 20:08
oh boy
uhhhhh
i feel like i knew that at some point, but i can't remember/don't know
Rick Medina
@rickmed
May 30 2016 20:08
I think they transitioned from 2.0 or 1.9 anyways...
Ross Hinkley
@rosshinkley
May 30 2016 20:08
to the release notes
Rick Medina
@rickmed
May 30 2016 20:08
but using webkit may be on par with nightmare on resources
being a one electron process ...
yeah looking at the release history it has definitely been more heavyweight as a new release comes out
started pure headless, then ghost driver (webdriver), then webkit
Ross Hinkley
@rosshinkley
May 30 2016 20:12
yep
i was reading through the 2.0 upgrade plan
got sidetracked :P
Rick Medina
@rickmed
May 30 2016 20:14
maybe i'll do a quick test phantom vs nightmare one of these days...
on a clean machine
Ross Hinkley
@rosshinkley
May 30 2016 20:14
speaking of headless
Rick Medina
@rickmed
May 30 2016 20:14
any recommendations for a resources profiler?
Ross Hinkley
@rosshinkley
May 30 2016 20:15
zombiejs has piqued my interest
Rick Medina
@rickmed
May 30 2016 20:15
I took it for a quick spin a while ago
but it lacked features I needed in the js execution department
Ross Hinkley
@rosshinkley
May 30 2016 20:15
yep
still neat
as for a resources profiler
Rick Medina
@rickmed
May 30 2016 20:17
the problem with all those pure headless ones is that debugging is soooo much harder
Ross Hinkley
@rosshinkley
May 30 2016 20:18
not an area i'm ultra-familiar with
especially with nodejs
and you're really looking for a more general-use OS-level resource profiler, i'd think
perf, maybe?
:P
Rick Medina
@rickmed
May 30 2016 20:23
perf?
Rick Medina
@rickmed
May 30 2016 20:29
will look into that, thanks
Ross Hinkley
@rosshinkley
May 30 2016 20:30
if anyone has suggestions i'd love to hear what you're using
oooh
this looks kinda neato
ctrl-d
for later
Rick Medina
@rickmed
May 30 2016 20:37
yes
back to work, later
Ross Hinkley
@rosshinkley
May 30 2016 20:37
see you