    Mikk Kiilaspää
    @Mikk36
    Hey
    Seeing a small issue with a copied page, where the characters don't look the same (charset seems off?)
    Is there anything that could be done to mitigate this?
    Mikk Kiilaspää
    @Mikk36
    Hmm, it seems to happen in combination with the PhantomJS scraper
    Sophia Antipenko
    @s0ph1e
    Hi @Mikk36. Unfortunately I don't know a quick fix for that. I'll take a closer look at it.
    Anjan Das
    @bugToaster
    hi
    If I don't need to save the file, what should I use?
    Sophia Antipenko
    @s0ph1e
    Hi @bugToaster. You can use the 'resourceSaver' option https://github.com/website-scraper/node-website-scraper#resourcesaver and just leave the saveResource function empty
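    A minimal sketch of that suggestion, assuming the class-based resourceSaver API described in the README linked above; the class name InMemorySaver is made up, and exact signatures may differ between versions:

```js
const scrape = require('website-scraper');

scrape({
  urls: ['http://example.com'],
  directory: '/tmp/example', // may still be required by the API even if nothing is written
  resourceSaver: class InMemorySaver {
    // Called for every downloaded resource; left empty so nothing is saved to disk.
    saveResource (resource) {}
    // Called when saving a resource fails; also a no-op here.
    errorOccurred (err) {}
  }
});
```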
    Tim Birkett
    @pysysops

    Hello 👋🏼 may be a long shot as it's awfully quiet in here... I'm looking to either dynamically add URLs to be scraped at runtime based on other URLs, or to manipulate the URL / filename of a resource in flight. To give some context: the sites I'm scraping have RSS at, for example, https://site.com/author/jim/rss, but they are linked to via feedly, so the href looks something like https://feedly.com/something/https://site.com/author/jim/rss in the page's HTML. Currently the urlFilter doesn't scrape feedly (which is desired), but I would like to extract that URL and add it to the queue of resources to be scraped.

    Is there a sane way to achieve this? Plugins for beforeRequest and afterResponse, or generateFileName? 🤔
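    One possible approach, sketched below under assumptions: the v4 plugin API with an afterResponse action, a made-up plugin name, and an illustrative regex. Rather than adding URLs to the queue directly (which the public API doesn't obviously expose), this rewrites the feedly wrapper links in the fetched HTML down to the embedded site URL, so the scraper queues them as ordinary references:

```js
const scrape = require('website-scraper');

// Hypothetical plugin: unwraps hrefs like
// https://feedly.com/something/https://site.com/author/jim/rss
// so the embedded URL is what the scraper sees and follows.
class UnwrapFeedlyLinksPlugin {
  apply (registerAction) {
    registerAction('afterResponse', async ({ response }) => {
      // Replace each wrapped link with the URL embedded at its tail.
      return response.body.replace(
        /https:\/\/feedly\.com\/[^"']*?(https?:\/\/[^"']+)/g,
        '$1'
      );
    });
  }
}

scrape({
  urls: ['https://site.com'],
  directory: '/tmp/site',
  plugins: [new UnwrapFeedlyLinksPlugin()]
});
```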

    Sophia Antipenko
    @s0ph1e
    Hi @pysysops, I'm not sure I understand the problem. As far as I can see, you have some website with links to https://feedly.com, and those links to feedly are not downloaded. Is that correct? Could you please share the config you are using?
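    For reference, a config matching what @pysysops describes, where the urlFilter keeps the crawl on the site itself and so skips feedly links, might look roughly like this (the site URL is a placeholder):

```js
const scrape = require('website-scraper');

scrape({
  urls: ['https://site.com'],
  directory: '/tmp/site',
  // Return true only for on-site URLs, so feedly.com
  // (and everything else off-site) is not downloaded.
  urlFilter: (url) => url.startsWith('https://site.com')
});
```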
    Striar Yunis
    @Aords
    Any chance anyone here is online?
    Sophia Antipenko
    @s0ph1e

    Hey @Aords

    Usually I'm not online, but I try to respond to each message when I get an email notification about it.
    Do you have any questions? Feel free to write them here or create an issue on GitHub