    David Hirmes
    @hirmes
    @toufic-m Thanks for your response. Are you saying that the custom parsers listed here are automatically run when the domains they are associated with are encountered in the url given to Mercury.parse()? If so, that doesn't seem to be the case for me. In the example I posted above, the wikipedia.org parser doesn't seem to be executed, just the default Mercury parser. Is there some way to ensure that a custom parser is run that I'm missing?
    Toufic Mouallem
    @toufic-m
    @hirmes Yes, a custom extractor is automatically used in order to extract content if its domain name matches that of the current request. Could you please submit an issue describing what you're experiencing, while including the current result and what is incorrect about it?
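The domain matching described above can be pictured with a small lookup. This is a minimal sketch, not Mercury's actual source: the `extractors` registry, `genericExtractor`, and `pickExtractor` are hypothetical names, and the fallback from full hostname to base domain is an assumption about how such a lookup might work.

```javascript
// Hypothetical registry: maps a domain to its custom extractor.
const extractors = {
  'wikipedia.org': { domain: 'wikipedia.org', name: 'wikipedia' },
  'medium.com': { domain: 'medium.com', name: 'medium' },
};

// Hypothetical fallback used when no custom extractor matches.
const genericExtractor = { domain: '*', name: 'generic' };

function pickExtractor(url) {
  const { hostname } = new URL(url);
  // Try the full hostname first, then the base domain (last two labels).
  const baseDomain = hostname.split('.').slice(-2).join('.');
  return extractors[hostname] || extractors[baseDomain] || genericExtractor;
}

console.log(pickExtractor('https://en.wikipedia.org/wiki/Mercury').name);
console.log(pickExtractor('https://example.com/article').name);
```

Under this sketch, a custom extractor that is selected correctly can still return wrong dates or authors if its selectors are stale, which looks the same from the outside as the extractor not running at all.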
    David Hirmes
    @hirmes
    @toufic-m I see now that the 4 or 5 tests I ran on various extractors just happened to be failing (mostly with attempts to capture dates and authors). I think I can update the selectors in those extractors and submit PRs. Thanks again.
    Babak Fakhamzadeh
    @MastaBaba
    I'm trying to install the mercury parser, but am not yet succeeding. I've got npm, nvm and node installed and "npm install @postlight/mercury-parser" completed successfully. But, then what? Where do I place "import Mercury from '@postlight/mercury-parser';" and how do I execute it? Note, I'm not familiar at all with npm and nvm.
    Richard Fairbanks
    @RichardFairbanks
    Greetings, folks!
    I have been using the Mercury parser on a near-continual basis, going back seven years to the days of Readability. I have set up multiple AppleScripts on my Mac to extensively process the resulting HTML to my personal specifications, such that it has been a joy and a blessing to be able to read webpages off-line on my iPhone.
    I have been watching this page since the original announcement, to see if anyone has been willing or able to provide a means of running the Mercury parser on a local Mac, by simply processing the HTML taken from a browser window.
    Now that there are only two weeks remaining before Postlight shuts down the hosted Mercury Web Parser API, I am starting to believe that it will not be possible for end-users to be able to take advantage of the open source code.
    Any words of advice would be greatly appreciated!
    Blessings, and thank you!
    Sean Brodie
    @seanbrodie

    Hello, I can't seem to pass the headers option through to the request successfully. I have it implemented as in the examples:

    Mercury.parse(url, {
        headers: {
            dnt: '1',
            cookie: '__cfduid=vqwo24522env9832hgo23gh3g23gkewe',
            'accept-language': 'en-US,en;q=0.9,tr;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'upgrade-insecure-requests': '1',
            'cache-control': 'max-age=0,no-cache',
            authority: 'medium.com'
        },
        contentType: 'markdown',
    }).then(result => console.log(result));

    Am I missing something? I have made an edit to the source that fixes it for me for now.

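    function parse(_x) {
      // Workaround: copy per-call headers into the module-level REQUEST_HEADERS.
      var options = arguments[1];
      if (options && options.headers) {
        REQUEST_HEADERS = options.headers;
      }
      return _parse.apply(this, arguments);
    }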

    – mercury.js:6537

    If this is an actual issue I will submit a PR

    Sean Brodie
    @seanbrodie
    Never mind, this feature seems to have been added in 58ff9df which is not yet in the 2.0.0 release.
    Babak Fakhamzadeh
    @MastaBaba
    I managed to get Mercury running on my own server. Here's what I did:
    tgjma
    @tgjma
    Hi, is it possible to run the CLI using PhantomJS (without Node)?
    Justin Cone
    @justincone
    @MastaBaba Thanks for that write-up. Very helpful!
    Toufic Mouallem
    @toufic-m
    @RichardFairbanks the mercury-parser-api provides a drop-in replacement for the existing hosted API; deploying your own instance of this would make your transition seamless, and the only change you need to make to your API requests in your AppleScripts is the base URL.
    Additionally, Mercury supports parsing pre-fetched HTML as detailed here.
    Does this address your concerns?
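A minimal sketch of parsing pre-fetched HTML, assuming the `html` option on `Mercury.parse` behaves as the README describes. The helper name `parsePrefetched` is hypothetical, and the parser is passed in as an argument so the snippet runs even without the package installed.

```javascript
// Parse HTML you already have instead of letting the parser fetch the page.
// `parser` is any object exposing parse(url, options), as Mercury does.
async function parsePrefetched(parser, url, html) {
  // The URL is still passed along; the assumption here is that it is used
  // to choose a domain-specific extractor for the supplied markup.
  return parser.parse(url, { html });
}

// Real use (uncomment once @postlight/mercury-parser is installed):
// const Mercury = require('@postlight/mercury-parser');
// parsePrefetched(Mercury, 'https://example.com/post', savedHtml)
//   .then(result => console.log(result.content));
```

With this shape, a script can save a page's HTML from the browser and hand both the HTML and the original URL to the parser locally.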
    zootooz
    @zootooz
    Many thanks to Postlight for letting the Mercury parser go open source and to everyone who put in the extra work to make that happen. I think a lot of companies would have just let it die (I'm looking at you Google). Much appreciated!
    Frank Boucher
    @FBoucher
    In the Chrome extension there is a Send to Kindle option... Is THAT functionality available from code?
    Rockwell
    @Rockwellwindsor_twitter
    I am working to get the api in as a replacement to the old service. Am I understanding this correctly that once we have this set up as a lambda function on AWS the only thing we need to change is the endpoint getting hit, which would be from the old URL of https://mercury.postlight.com/parser to the new URL provided by AWS, is that the idea?
    Toufic Mouallem
    @toufic-m
    @Rockwellwindsor_twitter yes, exactly!
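The endpoint swap confirmed above might look like the following. This is a sketch only: `NEW_BASE` is a made-up placeholder for whatever URL your own deployment reports, `parserEndpoint` is a hypothetical helper, and the `/parser?url=` path is assumed to be the same on both services.

```javascript
// Old hosted endpoint, taken from the message above.
const OLD_BASE = 'https://mercury.postlight.com';
// Hypothetical placeholder: replace with the URL of your own deployment.
const NEW_BASE = 'https://abc123.execute-api.us-east-1.amazonaws.com/dev';

// Build the request URL for a given base, trimming any trailing slashes.
function parserEndpoint(base, articleUrl) {
  const trimmed = base.replace(/\/+$/, '');
  return `${trimmed}/parser?url=${encodeURIComponent(articleUrl)}`;
}

console.log(parserEndpoint(OLD_BASE, 'https://example.com/post'));
console.log(parserEndpoint(NEW_BASE, 'https://example.com/post'));
```

Keeping the base URL in one constant means the rest of the calling code does not change when the service moves.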
    Rockwell
    @Rockwellwindsor_twitter
    Thanks, I was struggling to get it working but it fell into place yesterday. Thanks to the team who put this together, it works like a charm!
    Conrad Jackson
    @conradj
    I needed to get the (fantastic) Mercury Parser working yesterday. Being of the firm belief that there has got to be an easier alternative to AWS for anything not running at scale, I came up with deploying it to Zeit Now, which is a really easy to use serverless platform. All you have to do is create a Now account, and then fork my repo and you'll have your own version of Mercury running on the web. Links and instructions at https://github.com/conradj/now-mercury-parser.
    Сказочные Новости
    @skz_news_twitter
    Hi, how can I get text only, not HTML?
    Nirawit Jittipairoj
    @SixFingeredAmish
    Hi everyone
    How do I use Mercury Parser? I installed it using npm. What's next?
    Adam Pash
    @adampash
    specifically:
    Mercury.parse(url, { contentType: 'text' }).then(result =>
      console.log(result)
    );
    @SixFingeredAmish Have you written/deployed code in Node.js before? The basic usage should get you started, but let me know if you have any more specific questions
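For anyone stuck at the "installed it, what next?" step, here is a minimal sketch of a script to run with Node. The file name `parse-article.js` and the helper `printArticleText` are hypothetical, and the parser is passed in as an argument so the snippet itself runs without the package; the commented lines show the assumed wiring to the real library.

```javascript
// parse-article.js (hypothetical file name)
// After `npm install @postlight/mercury-parser`, uncomment the next line:
// const Mercury = require('@postlight/mercury-parser');

// Fetch and parse one URL, then print its title and plain-text content.
// `parser` is any object exposing parse(url, options), as Mercury does.
async function printArticleText(parser, url) {
  const result = await parser.parse(url, { contentType: 'text' });
  console.log(result.title);
  console.log(result.content);
  return result;
}

// Then call it with the URL given on the command line:
// printArticleText(Mercury, process.argv[2]);
```

With the two commented lines enabled, the script would be run from the project folder as `node parse-article.js <url>`.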
    Richard Fairbanks
    @RichardFairbanks
    Greetings, folks!
    (Please excuse me for missing a couple of months . . . )
    I can run mercury-parser just fine in Terminal on my Mac (macOS 10.14.5), but it is not recognized when I call it via AppleScript:
    do shell script "mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source"
    I get the result:
    sh: mercury-parser: command not found
    I have searched for the executable in all the usual places (starting with my paths list) and then a full disk search, including hidden files, but to no avail.
    Assuming that I need the full path for the AppleScript call, where is the mercury-parser executable to be found?
    Blessings, and thank you!!!
    Howard Camp
    @howiecamp_twitter
    I was reading this guide to installing the Mercury API on AWS Lambda - https://www.evernote.com/shard/s3/client/snv?noteGuid=e8251e3d-3938-47bb-9941-64bb7c6f57f2&noteKey=2a89a5cbc811cfa4&sn=https%3A%2F%2Fwww.evernote.com%2Fshard%2Fs3%2Fsh%2Fe8251e3d-3938-47bb-9941-64bb7c6f57f2%2F2a89a5cbc811cfa4&title=Installing%2Bthe%2BMercury%2BReader%2BAPI%2Bon%2BAWS%2BLambda - Can anyone explain the AWS resources (aside from Lambda) that this uses? I notice for example a massive number of S3 requests associated with each call to the Mercury Parse API via AWS Lambda.
    Richard Fairbanks
    @RichardFairbanks
    Greetings, folks!
    I have posted the solution to my Mac shell-script challenge with the Mercury Web Parser at:
    https://macscripter.net/viewtopic.php?pid=196840#p196840
    It has the AppleScript I’ve been using and the HTML template called in the script, as well as a screen shot of how it looks on my iPhone 7 Plus.
    It’s working great!
    Blessings, and thank you!!
    tgjma
    @tgjma
    Hi, I am seeing that the mercury-parser CLI is very slow at processing web pages. The CLI takes around 15s to process a URL, whereas the Postlight-hosted web service used to take around 5s. Any thoughts on where the bottleneck could be? I am running the mercury-parser CLI on a Mac.
    Richard Fairbanks
    @RichardFairbanks
    I can confirm tgjma’s reported fifteen-second delay when running the mercury-parser CLI on a Mac.
    Howard Camp
    @howiecamp_twitter
    @adampash Are there instructions for hosting this within a simple Node app as opposed to on AWS Lambda?
    Steve Upstill
    @upstill
    @conradj, I forked your now-mercury-parser and got it deployed to Now (as now-mercury-parser.upstill.now.sh), but when I hit it with a URL, it redirects to a URL that is identical except that it changes '://' to ':/'. I haven't touched any of the code. Does the parser work properly for you?
    Steve Upstill
    @upstill
    @MastaBaba My thanks as well for providing the solution that got me rolling. Server now running on my Linode!
    Babak Fakhamzadeh
    @MastaBaba
    You're welcome @upstill :)
    joosepP
    @joosepP
    Hey, I'm trying to install and run Mercury Web Parser but I seem to be running into some problems.
    Is there someone here who could help me?
    zootooz
    @zootooz
    What's the best way to keep AWS updated with the latest Github changes?
    zootooz
    @zootooz
    Do I have to repeat the yarn deploy steps @adampash laid out all over again or is there some automated way to handle this? I'm finding the documentation available across the internet to be not well-focused to say the least.
    Bryan Hackett
    @BryanHackett_twitter
    Node 8.10 is losing support on AWS after 12/31. Are there any plans to update to support a newer version?
    zootooz
    @zootooz
    @BryanHackett_twitter This Gitter space seems to have gone dark, which is a concern as I don't know where we are supposed to get information. Perhaps there is a space attached to the GitHub repository?
    singularita-zz
    @singularita-zz

    Hi, could you please help me get past these errors when running the ./preview script? (PowerShell, Win10 64-bit)

    Martin

    node ./preview https://archiweb.cz/n/domaci/v-opave-se-bude-stavet-novy-bazen-za-350-milionu-korun
    Rebuilding Mercury
    'MERCURY_TEST_BUILD' is not recognized as an internal or external command,
    operable program or batch file.
    child_process.js:649
        throw err;
        ^
    
    Error: Command failed: MERCURY_TEST_BUILD=true npm run build
    'MERCURY_TEST_BUILD' is not recognized as an internal or external command,
    operable program or batch file.
    
        at checkExecSyncError (child_process.js:610:11)
        at execSync (child_process.js:646:15)
        at Object.<anonymous> (C:\app.martin\mercury-parser\mercury-parser\preview:20:3)
        at Module._compile (internal/modules/cjs/loader.js:1139:30)
        at Object.Module._extensions..js (internal/modules/cjs/loader.js:1159:10)
        at Module.load (internal/modules/cjs/loader.js:988:32)
        at Function.Module._load (internal/modules/cjs/loader.js:896:14)
        at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
        at internal/main/run_main_module.js:17:47 {
      status: 1,
      signal: null,
      output: [
        null,
        <Buffer >,
        <Buffer 27 4d 45 52 43 55 52 59 5f 54 45 53 54 5f 42 55 49 4c 44 27 20 69 73 20 6e 6f 74 20 72 65 63 6f 67 6e 69 7a 65 64 20 61 73 20 61 6e 20 69 6e 74 65 72 ... 59 more bytes>
      ],
      pid: 26728,
      stdout: <Buffer >,
      stderr: <Buffer 27 4d 45 52 43 55 52 59 5f 54 45 53 54 5f 42 55 49 4c 44 27 20 69 73 20 6e 6f 74 20 72 65 63 6f 67 6e 69 7a 65 64 20 61 73 20 61 6e 20 69 6e 74 65 72 ... 59 more bytes>
    }
    Adam Pash
    @adampash
    @singularita-zz I don't have a Windows machine to test on, but I'm guessing that declaring the environment variable inline in the command MERCURY_TEST_BUILD=true npm run build isn't supported in PowerShell. You may have to edit the preview script to play nicely with PowerShell; it assumes a *nix shell.
    @zootooz sorry for missing this: like you suggested, you would have to re-deploy
    @BryanHackett_twitter Apologies for the slow response. A couple of weeks ago, we updated the parser api to a newer node :thumbsup:
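On the PowerShell failure above: one shell-independent way to set a variable for a child process is Node's `env` option on the `child_process` functions, rather than the `NAME=value command` prefix. This is a sketch under that assumption, not a tested patch to the preview script; `withEnv` is a hypothetical helper.

```javascript
const { execFileSync } = require('child_process');

// Merge extra variables into a copy of the current environment.
function withEnv(extra) {
  return Object.assign({}, process.env, extra);
}

// Demonstration: a child Node process sees the variable without any
// shell-specific syntax, so this works the same on Windows and *nix.
const seen = execFileSync(
  process.execPath,
  ['-e', 'process.stdout.write(String(process.env.MERCURY_TEST_BUILD))'],
  { env: withEnv({ MERCURY_TEST_BUILD: 'true' }), encoding: 'utf8' }
);

console.log(seen); // prints "true"
```

In the preview script the same idea would presumably be `execSync('npm run build', { env: withEnv({ MERCURY_TEST_BUILD: 'true' }), stdio: 'inherit' })` in place of the inline assignment.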
    waplay
    @waplay
    Hi, how do I make a custom extractor with the Mercury API on AWS Lambda?
    waplay
    @waplay
    @zootooz Ok, how can I then transfer this to my lambda? I use: https://github.com/postlight/mercury-parser-api
    zootooz
    @zootooz
    I'm no expert here, but assuming you've already set up your lambda/mercury aws server, I believe you just have to re-deploy the files up to AWS.
    So for me that would be yarn deploy:prod
    waplay
    @waplay
    @zootooz thank you
    Beyza
    @beyzacevik___twitter
    hey