    Adam Pash
    @kinngh I think your best bet, then, would be to look into deploying the Mercury Parser API https://github.com/postlight/mercury-parser-api/
    Babak Fakhamzadeh
    Thanks again, @adampash. I'm still not completely clear on the import; I run my javascript in scripts included in PHP pages. I don't use yarn or npm.
    On the serverless framework; In a nutshell, if I understand this correctly, I first have to use npm to install serverless framework, then use git to clone the repo, yarn to install, serve and deploy it. While at some point I have to put my AWS credentials in the code. Yeah?
    Adam Pash
    @MastaBaba Mercury isn't hosted on a CDN, and more likely than not, you want to run Mercury in a Node environment, not a browser environment. you don't have to put your AWS credentials in the code. all you need to do is set up your aws credentials using the aws cli https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html#cli-quick-configuration
    Bryan Hackett
    Do we need to run nodejs8.10 locally for the API? I encountered this error attempting to deploy using v11.9.0.
        throw err;
    Error: Cannot find module '..'
    Bryan Hackett
    @adampash Could you post an example IAM json policy to use with this? Thanks!
    Bryan Hackett

    I got this to finally work:

        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "VisualEditor0",
                    "Effect": "Allow",
                    "Action": [
                        ...
                    ],
                    "Resource": "*"
                },
                {
                    "Sid": "VisualEditor1",
                    "Effect": "Allow",
                    "Action": "apigateway:*",
                    "Resource": "arn:aws:apigateway:*::*"
                }
            ]
        }

    But it feels like there are probably more permissions than needed.

    Adam Pash
    :point_up: February 13, 2019 8:47 PM I'm not completely sure about this, but could you please try with 8.10 and if you still experience this error, let me know?
    @BryanHackett_twitter it's true that one drawback to the serverless framework is that it requires a lot of access (a lot of ppl end up just using admin privileges, which isn't exactly ideal). I think this is a decent post on the ins and outs of IAM requirements for serverless
    Bryan Hackett
    @adampash Thanks. Everything worked when I manually set the mercury parser dependency to 2.0.0.
    Adam Pash
    Hi all, I have a problem: I am a Mac/iOS developer who relies on the Mercury Parser API for one of my apps. I am now looking for a way to install the code on my own server, but it's not a root server or AWS, just a PHP hosting server, and I have no clue how to do this.
    Is there any tutorial or way to install the API without Node.js or yarn? (Just by uploading all the files and calling a script with parameters, like http://www.myserver.com?parse=whatever url)
    Adam Pash
    @atoxx Unfortunately if you want to use our out-of-the-box solution (mercury-parser-api), you'll need to use AWS for that. if that's not an option, you'll need to either use yarn/npm to install the cli or have some sort of javascript code to run the parser.
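To make "some sort of javascript code to run the parser" a bit more concrete, here is a minimal sketch of a tiny Node HTTP wrapper that answers requests like `/?parse=<url>`. It still needs Node and the `@postlight/mercury-parser` package on the server, so it does not help on PHP-only hosting. The `parse` query parameter, the `handleParse` and `createParserServer` names, and port 3000 are assumptions made for illustration; none of them come from mercury-parser-api.

```javascript
const http = require('http');

// Turn a request URL such as "/?parse=https%3A%2F%2Fexample.com%2Fpost" into a
// { status, body } pair. `parser` is anything with a parse(url) method that
// returns a promise; by default the real package is loaded lazily.
async function handleParse(requestUrl, parser = require('@postlight/mercury-parser')) {
  const target = new URL(requestUrl, 'http://localhost').searchParams.get('parse');
  if (!target) {
    return { status: 400, body: { error: 'Missing "parse" query parameter' } };
  }
  try {
    return { status: 200, body: await parser.parse(target) };
  } catch (err) {
    return { status: 502, body: { error: String(err.message || err) } };
  }
}

// Build (but do not start) an HTTP server around handleParse.
function createParserServer(parser) {
  return http.createServer(async (req, res) => {
    const { status, body } = await handleParse(req.url, parser);
    res.writeHead(status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(body));
  });
}

// To run it: createParserServer().listen(3000);
```

The article URL should be percent-encoded by the caller, otherwise any `&` in it would be read as the start of another query parameter.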
    Thanks for the clarification. Sorry, it seems I am out of luck on this; maybe there will be a mirror API endpoint or an alternative install tutorial one day!
    Harshdeep Singh Hura
    So figured out how to deploy the mercury-parser-api on Lambda, and got it working, but here's the thing: I have function invocations, about ~300 of them while I have made only 2-3 requests? Is it because of the pinging every 5 minutes to avoid cold start?
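As a rough sanity check on that number, a warm-up ping every 5 minutes adds up quickly. The sketch below assumes the function had been deployed for about a day, which is a guess rather than something stated in the message, and `scheduledInvocations` is a made-up helper name.

```javascript
// Invocations generated by a scheduled ping alone, per scheduled function.
function scheduledInvocations(intervalMinutes, hoursDeployed) {
  return Math.floor((hoursDeployed * 60) / intervalMinutes);
}

const perDay = scheduledInvocations(5, 24);
console.log(perDay); // 288, in the same ballpark as the ~300 reported
```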
    Adam Pash

    @kinngh That's correct. If you don't want to keep your function running, you can remove the scheduled events by commenting out these lines:


    And these:


    Tom Lai
    Hi. Thank you for this open source project. It seems that the import statement in code sample doesn't work for typescript. It worked after I changed the import statement to the following. Thanks!
    import * as Mercury from '@postlight/mercury-parser';
    Hi, I'm a bit of a newbie developer and I have a question. I was previously using the Mercury API inside a Chrome extension. Now that it's gone, would my only option be to include Mercury in the extension? Meaning I would have a folder, mercuryFiles, with all the files from GitHub and just import it inside the main file with import Mercury from "./mercuryFiles/src/mercury.js", and then I can do something like Mercury.parse(url).then(result => console.log(result))?
    Adam Pash
    @underTheFunSun_twitter You have two options:
    1. You could include Mercury directly in your extension. I don't know what your extension's build steps look like, but if you can use npm modules, you can install it and import/require it into your project the same way you would any other dependency.
    2. Alternately, you could deploy your own Mercury Parser API drop-in replacement. This process is detailed here: https://github.com/postlight/mercury-parser-api
    Stephen Bradley

    Hi there!

    I found the information on the GitHub page to be a little incomplete. Although I'm a full time IT professional and part-time developer, it was not obvious to me how to proceed with the Mercury installation on Amazon Lambda.

    Making the installation work requires some software that is not installed on any OS by default.

    So, I made a note with the instructions that I used to create a working API on Amazon. Feel free to share and to get in touch if you have questions or need assistance.

    There are two links, so you can choose your poison.



    Just a small update: I managed to get it working by doing npm install @postlight/mercury-parser. Then inside one of the folders in @postlight/mercury-parser is a Mercury.web.js file. Add a line at the end: export default Mercury. Afterwards, I could just import it regularly into my extension and use it without any issues.
    Though, one question: What's the best way to get around Cross-Origin Read Blocking with Mercury Web Parser on the browser?
    David Hirmes

    I'm using the mercury-parser in node, and it's working beautifully:

    const Mercury = require('@postlight/mercury-parser');
    const url = 'https://en.wikipedia.org/wiki/John_von_Neumann';
    Mercury.parse(url).then(result => { console.log(result); } );

    but it is unclear to me how to utilize the custom extractors. There is one for wikipedia here but I don't see instructions for how to incorporate that into the above code?

    Toufic Mouallem
    Hi @hirmes , while Mercury Parser can extract content from almost any website, some websites might require special handling in order to find the content more quickly and more accurately than it might otherwise do, and that's where custom extractors come into play. This Custom Parsers README file explains how custom extractors work and details the process of creating one.
    Essentially, they're internal to the Mercury Parser package, and can't be directly referenced from apps using the package.
    David Hirmes
    @toufic-m Thanks for your response. Are you saying that the custom parsers listed here are automatically run when the domains they are associated with are encountered in the url given to Mercury.parse()? If so, that doesn't seem to be the case for me. In the example I posted above, the wikipedia.org parser doesn't seem to be executed, just the default Mercury parser. Is there some way to ensure that a custom parser is run that I'm missing?
    Toufic Mouallem
    @hirmes Yes, a custom extractor is automatically used in order to extract content if its domain name matches that of the current request. Could you please submit an issue describing what you're experiencing, while including the current result and what is incorrect about it?
    David Hirmes
    @toufic-m I see now that the 4 or 5 tests I ran on various extractors just happened to be failing (mostly with attempts to capture dates and authors). I think I can update the selectors in those extractors and submit PRs. Thanks again.
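As a rough mental model of the domain matching described above (an illustrative sketch only, not Mercury's actual source), picking an extractor by domain could look like the lookup below. The `pickExtractor` function, the `registry` object, and the fallback to the last two host labels are assumptions made for the example.

```javascript
// Hypothetical registry keyed by domain; the values stand in for extractor objects.
const registry = {
  'wikipedia.org': { domain: 'wikipedia.org' },
  'www.nytimes.com': { domain: 'www.nytimes.com' },
};

// Stand-in for the default parser used when no custom extractor matches.
const genericExtractor = { domain: '*' };

function pickExtractor(url, extractors = registry) {
  const { hostname } = new URL(url);
  // Fall back to the last two labels so en.wikipedia.org matches wikipedia.org.
  const baseDomain = hostname.split('.').slice(-2).join('.');
  return extractors[hostname] || extractors[baseDomain] || genericExtractor;
}

console.log(pickExtractor('https://en.wikipedia.org/wiki/John_von_Neumann').domain);
console.log(pickExtractor('https://example.com/post').domain);
```

In this model a matching extractor is always chosen, so wrong dates or authors point at stale selectors inside that extractor rather than at the lookup, which fits what the failing tests showed.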
    Babak Fakhamzadeh
    I'm trying to install the mercury parser, but am not yet succeeding. I've got npm, nvm and node installed and "npm install @postlight/mercury-parser" completed successfully. But, then what? Where do I place "import Mercury from '@postlight/mercury-parser';" and how do I execute it? Note, I'm not familiar at all with npm and nvm.
    Richard Fairbanks
    Greetings, folks!
    I have been using the Mercury parser on a near-continual basis, going back seven years to the days of Readability. I have set up multiple AppleScripts on my Mac to extensively process the resulting HTML to my personal specifications, such that it has been a joy and a blessing to be able to read webpages off-line on my iPhone.
    I have been watching this page since the original announcement, to see if anyone has been willing or able to provide a means of running the Mercury parser on a local Mac, by simply processing the HTML taken from a browser window.
    Now that there are only two weeks remaining before Postlight shuts down the hosted Mercury Web Parser API, I am starting to believe that it will not be possible for end-users to be able to take advantage of the open source code.
    Any words of advice would be greatly appreciated!
    Blessings, and thank you!
    Sean Brodie

    Hello, I can't seem to pass the headers option through to the request successfully. I have it implemented as in the examples:

    Mercury.parse(url, {
        headers: {
            dnt: '1',
            cookie: '__cfduid=vqwo24522env9832hgo23gh3g23gkewe',
            'accept-language': 'en-US,en;q=0.9,tr;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'upgrade-insecure-requests': '1',
            'cache-control': 'max-age=0,no-cache',
            authority: 'medium.com'
        },
        contentType: 'markdown',
    }).then(result => console.log(result));

    Am I missing something? In the meantime I have edited the source to fix it for myself:

    function parse(_x) {
          // Guard: the options argument may be omitted entirely.
          if (arguments[1] && arguments[1].headers)
            REQUEST_HEADERS = arguments[1].headers;
          return _parse.apply(this, arguments);
    }

    – mercury.js:6537

    If this is an actual issue I will submit a PR

    Sean Brodie
    Never mind, this feature seems to have been added in 58ff9df which is not yet in the 2.0.0 release.
    Babak Fakhamzadeh
    I managed to get Mercury running on my own server. Here's what I did:
    Hi, is it possible to run the CLI using PhantomJS (without Node)?
    Justin Cone
    @MastaBaba Thanks for that write-up. Very helpful!
    Toufic Mouallem
    @RichardFairbanks the mercury-parser-api provides a drop-in replacement for the existing hosted API; deploying your own instance of this would make your transition seamless, and the only change you need to make to your API requests in your AppleScripts is the base URL.
    Additionally, Mercury supports parsing pre-fetched HTML as detailed here.
    Does this address your concerns?
    Many thanks to Postlight for letting the Mercury parser go open source and to everyone who put in the extra work to make that happen. I think a lot of companies would have just let it die (I'm looking at you Google). Much appreciated!
    Frank Boucher
    In the Chrome extension there is a Send to Kindle... Is THAT functionality available through code?
    I am working to get the api in as a replacement to the old service. Am I understanding this correctly that once we have this set up as a lambda function on AWS the only thing we need to change is the endpoint getting hit, which would be from the old URL of https://mercury.postlight.com/parser to the new URL provided by AWS, is that the idea?
    Toufic Mouallem
    @Rockwellwindsor_twitter yes, exactly!
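In code, that switch can be as small as one constant. A minimal sketch, assuming your deployment exposes the same `/parser?url=...` route as the hosted API; the `NEW_BASE` value is a placeholder for whatever URL AWS gives you, and `buildParserUrl` is a hypothetical helper.

```javascript
const OLD_BASE = 'https://mercury.postlight.com';
// Placeholder: replace with the endpoint printed when your deployment finishes.
const NEW_BASE = 'https://your-api-id.execute-api.us-east-1.amazonaws.com/dev';

function buildParserUrl(base, articleUrl) {
  // Strip trailing slashes so the path is joined exactly once.
  const trimmed = base.replace(/\/+$/, '');
  return `${trimmed}/parser?url=${encodeURIComponent(articleUrl)}`;
}

console.log(buildParserUrl(NEW_BASE, 'https://postlight.com/trackchanges/mercury-goes-open-source'));
```

Keeping the base URL in one place means existing request code only changes where `OLD_BASE` used to be passed.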
    Thanks, I was struggling to get it working but it fell into place yesterday. Thanks to the team who put this together, it works like a charm!
    Conrad Jackson
    I needed to get the (fantastic) Mercury Parser working yesterday. Being of the firm belief that there has got to be an easier alternative to AWS for anything not running at scale, I came up with deploying it to Zeit Now, which is a really easy to use serverless platform. All you have to do is create a Now account, and then fork my repo and you'll have your own version of Mercury running on the web. Links and instructions at https://github.com/conradj/now-mercury-parser.
    Сказочные Новости
    Hi, how can I get text only, not HTML?
    Nirawit Jittipairoj
    Hi everyone
    How do I use Mercury Parser? I installed it using npm. What's next?
    Adam Pash
    Mercury.parse(url, { contentType: 'text' }).then(result => console.log(result));
    @SixFingeredAmish Have you written/deployed code in Node.js before? The basic usage should get you started, but let me know if you have any more specific questions
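For anyone stuck at "I installed it with npm, what's next?", here is a minimal sketch. Put something like the following in a file in the folder where you ran `npm install @postlight/mercury-parser`, then run it with `node parse.js <url>`. The file name `parse.js` and the `extractText` helper are made up for this example.

```javascript
// parse.js (hypothetical file name)
// `parser` defaults to the installed package; it is a parameter only so the
// function can be exercised with a stand-in object.
async function extractText(url, parser = require('@postlight/mercury-parser')) {
  // contentType: 'text' asks for plain text instead of HTML.
  const result = await parser.parse(url, { contentType: 'text' });
  return result.content;
}

// Only fetch when a URL is given on the command line: node parse.js <url>
const url = process.argv[2];
if (url) {
  extractText(url).then(console.log).catch(console.error);
}
```

This uses `require`, so it runs in plain Node without any build step; the `import Mercury from ...` form needs a setup that understands ES modules.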
    Richard Fairbanks
    Greetings, folks!
    (Please excuse me for missing a couple of months . . . )
    I can run mercury-parser just fine in Terminal on my Mac (macOS 10.14.5), but it is not recognized when I call it via AppleScript:
    do shell script "mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source"
    I get the result:
    sh: mercury-parser: command not found
    I have searched for the executable in all the usual places (starting with my paths list) and then a full disk search, including hidden files, but to no avail.
    Assuming that I need the full path for the AppleScript call, where is the mercury-parser executable to be found?
    Blessings, and thank you!!!
    Howard Camp
    I was reading this guide to installing the Mercury API on AWS Lambda - https://www.evernote.com/shard/s3/client/snv?noteGuid=e8251e3d-3938-47bb-9941-64bb7c6f57f2&noteKey=2a89a5cbc811cfa4&sn=https%3A%2F%2Fwww.evernote.com%2Fshard%2Fs3%2Fsh%2Fe8251e3d-3938-47bb-9941-64bb7c6f57f2%2F2a89a5cbc811cfa4&title=Installing%2Bthe%2BMercury%2BReader%2BAPI%2Bon%2BAWS%2BLambda - Can anyone explain the AWS resources (aside from Lambda) that this uses? I notice for example a massive number of S3 requests associated with each call to the Mercury Parse API via AWS Lambda.
    Richard Fairbanks
    Greetings, folks!
    I have posted the solution to my Mac shell-script challenge with the Mercury Web Parser at:
    It has the AppleScript I’ve been using and the HTML template called in the script, as well as a screen shot of how it looks on my iPhone 7 Plus.
    It’s working great!
    Blessings, and thank you!!