These are chat archives for rosshinkley/nightmare

7th
May 2016
Mingsterism
@mingsterism
May 07 2016 02:07 UTC
@rosshinkley thanks ross. Yeah, that makes sense. I want to abstract/refactor the code so I can create larger-scale scripts that go through more sites, run more commands on each site, etc., and keep the code more organised.
rosshinkley @rosshinkley nods
Ross Hinkley
@rosshinkley
May 07 2016 02:08 UTC
@mingsterism you might also want to have a look at .use()
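As an illustration of the `.use()` suggestion, here is a minimal sketch of packaging repeated steps into a reusable plugin. The `search` helper, the shape of its `site` argument, the selectors, and the URL are all hypothetical and made up for this example; only `.use()`, `.goto()`, `.type()`, `.click()`, `.wait()`, `.evaluate()`, `.end()`, and `.then()` are Nightmare calls.

```javascript
// Hypothetical plugin factory. It returns a function that .use() calls
// with the Nightmare instance, so per-site selectors live in one place.
// `site` is an assumed shape: { input, button, results } (CSS selectors).
function search(site, term) {
  return function (nightmare) {
    nightmare
      .type(site.input, term)  // type the search term into the input
      .click(site.button)      // click the submit button
      .wait(site.results);     // wait until the results selector is present
  };
}

// Usage sketch (assumes `nightmare` is a Nightmare instance; the URL and
// selectors are placeholders, not taken from a real site):
//
//   var exampleSite = { input: '#q', button: '#go', results: '.results' };
//   nightmare
//     .goto('http://example.com')
//     .use(search(exampleSite, 'nightmare'))
//     .evaluate(function () { return document.title; })
//     .end()
//     .then(function (title) { console.log(title); });
```

Keeping the selectors in a plain object per site means a script covering many sites only swaps the object, not the steps.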
Mingsterism
@mingsterism
May 07 2016 02:08 UTC
if I have 100 sites, must I create custom css/xpath selectors for each of the 100 sites?
Ross Hinkley
@rosshinkley
May 07 2016 02:09 UTC
uh, probably?
that sort of stuff isn't standardized
(in general, anyway)
Mingsterism
@mingsterism
May 07 2016 02:10 UTC
no such thing as smart bots?
Ross Hinkley
@rosshinkley
May 07 2016 02:10 UTC
oh, almost for sure they exist
but that is way beyond the scope of nightmare :P
Mingsterism
@mingsterism
May 07 2016 02:11 UTC
why is that so?
Ross Hinkley
@rosshinkley
May 07 2016 02:11 UTC
because the algorithms/source involved is... nontrivial
Mingsterism
@mingsterism
May 07 2016 02:12 UTC
nontrivial? sorry, I'm still quite a beginner
Ross Hinkley
@rosshinkley
May 07 2016 02:12 UTC
i was half-kidding. It's big.
and complicated.
take something like lucene for example
not exactly what you're talking about, but some of the principles are the same
Mingsterism
@mingsterism
May 07 2016 02:13 UTC
you mean there are bots that can automatically generate the css selectors for a site?
@rosshinkley btw just saw the 99designs.com example. thanks for sharing. very good examples of .use(). didn't know about it before
Mingsterism
@mingsterism
May 07 2016 02:25 UTC
@rosshinkley some sites I see, for pagination, their next button seems to be dynamically generated. The html for the next button is <a href="#" data-value="3">Next ›</a>
but the actual url for the next page is http://www.propwall.my/malaysia?page=3&tab=Most%20Searched&view=list
Ross Hinkley
@rosshinkley
May 07 2016 02:26 UTC
readability is an api, not a built-in product
Mingsterism
@mingsterism
May 07 2016 02:26 UTC
in this case, will nightmare's .click() work?
Ross Hinkley
@rosshinkley
May 07 2016 02:26 UTC
(sorry, had to go afk for a minute)
it should
Mingsterism
@mingsterism
May 07 2016 02:26 UTC
no worries. it's fine. thanks for your help
yes. can readability be built into nightmare?
i.e. incorporated into nightmare, for smart parsing to auto-detect articles?
Ross Hinkley
@rosshinkley
May 07 2016 02:27 UTC
probably not.
Mingsterism
@mingsterism
May 07 2016 02:27 UTC
oh. it's not open source, is it?
Ross Hinkley
@rosshinkley
May 07 2016 02:27 UTC
what, readability?
i don't think so
Mingsterism
@mingsterism
May 07 2016 02:27 UTC
yeah
ok
@rosshinkley for .click(), if this is the whole html for the next button, how will nightmare know where to go?
<a href="#" data-value="3">Next ›</a>
because normally the html will have the link to the next page as well
Ross Hinkley
@rosshinkley
May 07 2016 02:29 UTC
not necessarily
as a trivial example, you can bind to an anchor's onClick in client-side javascript and set window.location there
nightmare doesn't have to know the underlying behavior
you just tell it where to click, and the electron instance (which is a browser window) will figure out the rest
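To illustrate the point above, here is a minimal sketch of how a page might bind a click handler to such an anchor and set `window.location` itself. The function names `nextPageUrl` and `bindNextLink` are made up for this example, and the assumption that the page number goes into a `page` query parameter is inferred from the URL quoted earlier; this is not the code of any particular site.

```javascript
// Hypothetical helper: returns `currentUrl` with its `page` query
// parameter set to `page`.
function nextPageUrl(currentUrl, page) {
  const url = new URL(currentUrl);
  url.searchParams.set('page', String(page));
  return url.toString();
}

// Hypothetical binding: on click, read the target page from the anchor's
// data-value attribute and navigate by assigning to location.href.
// `win` is passed in (normally the global `window`) to keep this testable.
function bindNextLink(link, win) {
  link.addEventListener('click', function (event) {
    event.preventDefault(); // stop the default jump to href="#"
    const page = link.getAttribute('data-value');
    win.location.href = nextPageUrl(win.location.href, page);
  });
}

// In a browser this would be wired up roughly as:
//   bindNextLink(document.querySelector('a[data-value]'), window);
```

From Nightmare's side none of this is visible: a call such as `.click('a[data-value="3"]')` dispatches the click, and the handler running in the page does the navigation.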
Mingsterism
@mingsterism
May 07 2016 02:31 UTC
regarding binding to onClick and setting window.location, any code examples?
Ross Hinkley
@rosshinkley
May 07 2016 02:31 UTC
ah, hm. Hang on.
Mingsterism
@mingsterism
May 07 2016 02:32 UTC
thanks. ok
Ross Hinkley
@rosshinkley
May 07 2016 02:38 UTC
MDN looks reasonably helpful
Mingsterism
@mingsterism
May 07 2016 02:38 UTC
@rosshinkley they do have the api in archives. don't know if this is suitable though
https://code.google.com/archive/p/arc90labs-readability/source/default/source
@rosshinkley thanks will check it out.
Ross Hinkley
@rosshinkley
May 07 2016 02:39 UTC
and now you have me curious
Mingsterism
@mingsterism
May 07 2016 02:40 UTC
ya. was hoping it could be put into nightmare :)
Ross Hinkley
@rosshinkley
May 07 2016 02:40 UTC
well, again, we probably won't
if i'm understanding what you're asking for, anyway
ok. is it too complex?
or is a plugin more suitable?
Ross Hinkley
@rosshinkley
May 07 2016 02:41 UTC
incredibly complicated
if you want to craft a plugin for nightmare to make smart selectors, i'm for it
but building something that can reliably answer questions like "how many pages does this search have?" is pretty tough
Mingsterism
@mingsterism
May 07 2016 02:45 UTC
i see. ok. makes sense.
hopefully can try one day.
Ross Hinkley
@rosshinkley
May 07 2016 02:46 UTC
also... i have to ask, you said you were scraping many sites, and your example used reddit
you may want to have a look at the reddit api
Mingsterism
@mingsterism
May 07 2016 02:47 UTC
ah. thanks very much. yeah. was just practicing on reddit
Ross Hinkley
@rosshinkley
May 07 2016 02:48 UTC
i guess my point is, you might want to check whether the sites you want to hit already have APIs
Mingsterism
@mingsterism
May 07 2016 02:48 UTC
got it
Ross Hinkley
@rosshinkley
May 07 2016 02:48 UTC
you can build something much more robust that way
Mingsterism
@mingsterism
May 07 2016 02:48 UTC
btw, what's in the pipeline for nightmarejs development in the future?
roughly where is the library headed?
Ross Hinkley
@rosshinkley
May 07 2016 02:48 UTC
great question
Mingsterism
@mingsterism
May 07 2016 02:49 UTC
it already has more stars than casper. so looks very promising.
Ross Hinkley
@rosshinkley
May 07 2016 02:49 UTC
segmentio/nightmare#593
that issue has become a rough dumping ground for big, (mostly) breaking changes
Mingsterism
@mingsterism
May 07 2016 02:50 UTC
interesting. thanks.
Ross Hinkley
@rosshinkley
May 07 2016 02:51 UTC
some of it is already in, like ipc safety
another reasonably big non-breaking change is the fix for #502
er, segmentio/nightmare#502
(for which i'm hopefully going to get a proposed fix PR up soon)
been picking away at it for far too long
Mingsterism
@mingsterism
May 07 2016 02:55 UTC
still quite a beginner. don't fully understand it but hopefully one day i will. haha.
thanks for your help though. really appreciate it.
Ross Hinkley
@rosshinkley
May 07 2016 02:56 UTC
ha, that's quite okay
if you've got questions, i can try to explain things :)
Manuel Koell
@glumanda99
May 07 2016 14:29 UTC
How do I use nightmare with x-ray to scrape sites?
Michael Groncki
@dergroncki
May 07 2016 15:30 UTC

I think I missed something because the following code opens electron but will not load the site:

var Nightmare = require('nightmare');
var nightmare = Nightmare({ show: true });

nightmare
  .goto('http://cnn.com')

But the following code does load the site. Why?

var Nightmare = require('nightmare'),
  nightmare = Nightmare({ show: true });

nightmare
  .goto('http://cnn.com')
  .evaluate(function(){
    return document.title;
  })
  .end()
  .then(function(title){
    console.log(title);
  })
Ross Hinkley
@rosshinkley
May 07 2016 17:42 UTC
@manu09 i personally haven't tried it, but there is a nightmare x-ray driver
@DerGroncki true. Your first example will start Electron, queue the goto but not run it
the second will start electron, queue several actions, then execute the queue when .then() is called
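The difference between the two examples can be modelled with a toy queue. This is a sketch of the general idea only and not Nightmare's source: `LazyQueue`, `queue()`, and `log` are made-up names, and the queued actions are plain functions standing in for browser steps.

```javascript
// Toy model of a lazily executed action queue (NOT Nightmare's real code).
class LazyQueue {
  constructor() {
    this.actions = []; // queued actions that have not run yet
    this.log = [];     // names of actions that have actually executed
  }

  // Queue an action and return `this` so calls can be chained.
  queue(name, fn) {
    this.actions.push({ name, fn });
    return this;
  }

  // Nothing executes until then() is called: it drains the queue in order
  // and resolves with the value returned by the last action.
  then(onFulfilled, onRejected) {
    const pending = this.actions;
    this.actions = [];
    let chain = Promise.resolve();
    for (const action of pending) {
      chain = chain.then(() => {
        this.log.push(action.name);
        return action.fn();
      });
    }
    return chain.then(onFulfilled, onRejected);
  }
}

// Like the first example: the action is queued, but then() is never called.
const unrun = new LazyQueue().queue('goto', () => 'loaded');
console.log(unrun.log.length); // 0, because nothing has executed

// Like the second example: then() triggers execution of the whole queue.
new LazyQueue()
  .queue('goto', () => 'loaded')
  .queue('evaluate', () => 'Example title')
  .then((title) => console.log(title)); // logs the last action's result
```

In this model, as in the explanation above, chaining only builds up the queue; the call to `.then()` is what starts the work.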
Manuel Koell
@glumanda99
May 07 2016 21:39 UTC
@rosshinkley thank you sir!