    atoxx
    @atoxx
    Thx for the clarifications. Sorry, it seems I am out of luck on this; maybe there will be a mirror API endpoint or an alternative install tutorial one day.
    Harshdeep Singh Hura
    @kinngh
    So I figured out how to deploy the mercury-parser-api on Lambda and got it working, but here's the thing: I see about 300 function invocations, while I have made only 2-3 requests. Is it because of the pinging every 5 minutes to avoid cold starts?
    Adam Pash
    @adampash

    @kinngh That's correct. If you don't want to keep your function running, you can remove the scheduled events by commenting out these lines:

    https://github.com/postlight/mercury-parser-api/blob/master/serverless.yml#L43-L45

    And these:

    https://github.com/postlight/mercury-parser-api/blob/master/serverless.yml#L57-L59
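
As a rough check on the invocation count discussed above, the number of keep-warm pings can be estimated from the schedule interval. This is a back-of-the-envelope sketch: the 5-minute interval comes from the question, while the function name and the per-function count are assumptions made for illustration.

```javascript
// Estimate how many scheduled (keep-warm) invocations occur per day.
// intervalMinutes: minutes between pings (5 in the question above).
// functionCount: number of functions that each have their own schedule (assumed).
const scheduledInvocationsPerDay = (intervalMinutes, functionCount = 1) => {
  const minutesPerDay = 24 * 60;
  return Math.floor(minutesPerDay / intervalMinutes) * functionCount;
};

console.log(scheduledInvocationsPerDay(5));    // 288
console.log(scheduledInvocationsPerDay(5, 2)); // 576
```

A single function pinged every 5 minutes already accounts for 288 invocations a day, which is in the range of the roughly 300 reported.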

    Tom Lai
    @tomlai19852004
    Hi. Thank you for this open source project. It seems that the import statement in the code sample doesn't work for TypeScript. It worked after I changed the import statement to the following. Thanks!
    import * as Mercury from '@postlight/mercury-parser';
    wwmc
    @underTheFunSun_twitter
    Hi, I'm a bit of a newbie developer and I have a question. I was previously using the Mercury API inside a Chrome Extension. Now that it's gone, would my only option be to include Mercury in the extension? Meaning I would have a folder, mercuryFiles, with all the files from GitHub and just import it inside the main file with import Mercury from "./mercuryFiles/src/mercury.js" and then I can do something like Mercury.parse(url).then(result => console.log(result))?
    Adam Pash
    @adampash
    @underTheFunSun_twitter You have two options:
    1. You could include Mercury directly in your extension. I don't know what your extension's build steps look like, but if you can use npm modules, you can install it and import/require it into your project the same way you would any other dependency.
    2. Alternatively, you could deploy your own Mercury Parser API drop-in replacement. This process is detailed here: https://github.com/postlight/mercury-parser-api
    Stephen Bradley
    @std_ptr_null_twitter

    Hi there!

    I found the information on the GitHub page to be a little incomplete. Although I'm a full-time IT professional and part-time developer, it was not obvious to me how to proceed with the Mercury installation on AWS Lambda.

    Making the installation work requires some software that is not installed on any OS by default.

    So, I made a note with the instructions that I used to create a working API on Amazon. Feel free to share and to get in touch if you have questions or need assistance.

    There are two links, so you can choose your poison.
    Evernote

    https://www.evernote.com/l/AAPoJR49OThHu5lBZLt8b1fyKomly8gRz6Q
    Notion

    https://www.notion.so/Installing-the-Mercury-Reader-API-on-AWS-Lambda-9b994989be1d49959a894b65def1095d

    wwmc
    @underTheFunSun_twitter
    Just a small update: I managed to get it working by running npm install @postlight/mercury-parser. Inside one of the folders in @postlight/mercury-parser is a Mercury.web.js file; add a line at the end: export default Mercury. Afterwards, I could just import it regularly into my extension and use it without any issues.
    wwmc
    @underTheFunSun_twitter
    Though, one question: What's the best way to get around Cross-Origin Read Blocking with Mercury Web Parser on the browser?
    David Hirmes
    @hirmes

    I'm using the mercury-parser in node, and it's working beautifully:

    const Mercury = require('@postlight/mercury-parser');
    const url = 'https://en.wikipedia.org/wiki/John_von_Neumann';
    Mercury.parse(url).then(result => { console.log(result); } );

    but it is unclear to me how to use the custom extractors. There is one for Wikipedia here, but I don't see instructions for how to incorporate it into the above code.

    Toufic Mouallem
    @toufic-m
    Hi @hirmes, while Mercury Parser can extract content from almost any website, some websites need special handling so the content is found more quickly and more accurately than it otherwise would be, and that's where custom extractors come into play. This Custom Parsers README file explains how custom extractors work and details the process of creating one.
    Essentially, they're internal to the Mercury Parser package, and can't be directly referenced from apps using the package.
    David Hirmes
    @hirmes
    @toufic-m Thanks for your response. Are you saying that the custom parsers listed here are automatically run when the domains they are associated with are encountered in the url given to Mercury.parse()? If so, that doesn't seem to be the case for me. In the example I posted above, the wikipedia.org parser doesn't seem to be executed, just the default Mercury parser. Is there some way to ensure that a custom parser is run that I'm missing?
    Toufic Mouallem
    @toufic-m
    @hirmes Yes, a custom extractor is automatically used in order to extract content if its domain name matches that of the current request. Could you please submit an issue describing what you're experiencing, while including the current result and what is incorrect about it?
    David Hirmes
    @hirmes
    @toufic-m I see now that the 4 or 5 tests I ran on various extractors just happened to be failing (mostly with attempts to capture dates and authors). I think I can update the selectors in those extractors and submit PRs. Thanks again.
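
The automatic domain matching described in this exchange can be pictured with a small sketch. It is a toy illustration only, not Mercury's internal code: the registry contents, the `pickExtractor` name, and the fallback order (exact hostname, then base domain, then a generic extractor) are assumptions chosen to show the idea.

```javascript
// Toy registry keyed by domain. The entries are placeholders, not real extractors.
const customExtractors = new Map([
  ['wikipedia.org', { domain: 'wikipedia.org' }],
  ['www.nytimes.com', { domain: 'www.nytimes.com' }],
]);

// Stand-in for the default parser used when no custom extractor matches.
const genericExtractor = { domain: '*' };

const pickExtractor = (url, registry = customExtractors) => {
  const { hostname } = new URL(url);
  // Base domain = last two labels, e.g. "en.wikipedia.org" -> "wikipedia.org".
  const baseDomain = hostname.split('.').slice(-2).join('.');
  return registry.get(hostname) || registry.get(baseDomain) || genericExtractor;
};

console.log(pickExtractor('https://en.wikipedia.org/wiki/John_von_Neumann').domain); // wikipedia.org
console.log(pickExtractor('https://example.com/post').domain); // *
```

Under this picture, a custom extractor being selected does not guarantee that each of its selectors still matches the live page, which is consistent with the failing date and author fields mentioned above.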
    Babak Fakhamzadeh
    @MastaBaba
    I'm trying to install the mercury parser, but am not yet succeeding. I've got npm, nvm and node installed and "npm install @postlight/mercury-parser" completed successfully. But, then what? Where do I place "import Mercury from '@postlight/mercury-parser';" and how do I execute it? Note, I'm not familiar at all with npm and nvm.
    Richard Fairbanks
    @RichardFairbanks
    Greetings, folks!
    I have been using the Mercury parser on a near-continual basis, going back seven years to the days of Readability. I have set up multiple AppleScripts on my Mac to extensively process the resulting HTML to my personal specifications, such that it has been a joy and a blessing to be able to read webpages off-line on my iPhone.
    I have been watching this page since the original announcement, to see if anyone has been willing or able to provide a means of running the Mercury parser on a local Mac, by simply processing the HTML taken from a browser window.
    Now that there are only two weeks remaining before Postlight shuts down the hosted Mercury Web Parser API, I am starting to believe that it will not be possible for end-users to be able to take advantage of the open source code.
    Any words of advice would be greatly appreciated!
    Blessings, and thank you!
    Sean Brodie
    @seanbrodie

    Hello, I can't seem to pass the headers option through to the request successfully. I have it implemented as in the examples:

    Mercury.parse(url, {
        headers: {
            dnt: '1',
            cookie: '__cfduid=vqwo24522env9832hgo23gh3g23gkewe',
            'accept-language': 'en-US,en;q=0.9,tr;q=0.8',
            'accept-encoding': 'gzip, deflate, br',
            accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'upgrade-insecure-requests': '1',
            'cache-control': 'max-age=0,no-cache',
            authority: 'medium.com'
        },
        contentType: 'markdown',
    }).then(result => console.log(result));

    Am I missing something? For now, I have made an edit to the source to fix it for myself.

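    function parse(_x) {
      // Guard against calls made without an options argument before reading headers.
      if (arguments[1] && arguments[1].headers) {
        REQUEST_HEADERS = arguments[1].headers;
      }
      return _parse.apply(this, arguments);
    }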

    – mercury.js:6537

    If this is an actual issue I will submit a PR

    Sean Brodie
    @seanbrodie
    Never mind, this feature seems to have been added in 58ff9df which is not yet in the 2.0.0 release.
    Babak Fakhamzadeh
    @MastaBaba
    I managed to get Mercury running on my own server. Here's what I did:
    tgjma
    @tgjma
    Hi, is it possible to run the CLI using PhantomJS (without Node)?
    Justin Cone
    @justincone
    @MastaBaba Thanks for that write-up. Very helpful!
    Toufic Mouallem
    @toufic-m
    @RichardFairbanks the mercury-parser-api provides a drop-in replacement for the existing hosted API; deploying your own instance of this would make your transition seamless, and the only change you need to make to your API requests in your AppleScripts is the base URL.
    Additionally, Mercury supports parsing pre-fetched HTML as detailed here.
    Does this address your concerns?
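
For readers following the pre-fetched HTML suggestion above, a minimal sketch might look like the following. The `html` and `contentType` options are the ones Mercury documents for `Mercury.parse`; the `parsePrefetchedHtml` wrapper and the `fakeParser` stand-in are hypothetical names introduced here so the snippet can be tried without the package installed.

```javascript
// `parser` is expected to be the Mercury module, for example:
//   const Mercury = require('@postlight/mercury-parser');
// It is passed in as an argument so the sketch can be tried with a stand-in.
const parsePrefetchedHtml = (parser, url, html, contentType = 'html') =>
  // Mercury.parse still takes the URL as its first argument;
  // `html` is the markup that was already downloaded elsewhere.
  parser.parse(url, { html, contentType });

// Hypothetical stand-in parser that only echoes what it was given.
const fakeParser = {
  parse: (url, options) => Promise.resolve({ url, ...options }),
};

parsePrefetchedHtml(
  fakeParser,
  'https://example.com/article',
  '<html><body><article><p>Hello</p></article></body></html>',
  'text'
).then(result => console.log(result.contentType));
```

With the real module, passing 'text' or 'markdown' as the content type would request plain-text or Markdown content instead of the default HTML.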
    zootooz
    @zootooz
    Many thanks to Postlight for letting the Mercury parser go open source and to everyone who put in the extra work to make that happen. I think a lot of companies would have just let it die (I'm looking at you Google). Much appreciated!
    Frank Boucher
    @FBoucher
    In the Chrome Extension there is a Send to Kindle feature. Is THAT functionality available from code?
    Rockwell
    @Rockwellwindsor_twitter
    I am working to get the API in as a replacement for the old service. Am I understanding correctly that once we have this set up as a Lambda function on AWS, the only thing we need to change is the endpoint being hit, from the old URL of https://mercury.postlight.com/parser to the new URL provided by AWS? Is that the idea?
    Toufic Mouallem
    @toufic-m
    @Rockwellwindsor_twitter yes, exactly!
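
The endpoint swap confirmed above can be sketched as a small helper that keeps the request shape and changes only the base URL. The old base URL is the one quoted in the question; the new base URL below is a made-up placeholder, and the `url` query parameter and the `buildParserUrl` name are assumptions for illustration.

```javascript
// Build the request URL for a Mercury-style parser endpoint.
const buildParserUrl = (baseUrl, articleUrl) => {
  const endpoint = new URL(baseUrl);
  // searchParams.set percent-encodes the article URL for us.
  endpoint.searchParams.set('url', articleUrl);
  return endpoint.toString();
};

const OLD_BASE = 'https://mercury.postlight.com/parser';
// Hypothetical placeholder; use the URL printed by your own deployment.
const NEW_BASE = 'https://abc123.execute-api.us-east-1.amazonaws.com/dev/parser';

const article = 'https://postlight.com/trackchanges/mercury-goes-open-source';
console.log(buildParserUrl(OLD_BASE, article));
console.log(buildParserUrl(NEW_BASE, article));
```

Keeping the base URL in one constant means the rest of the calling code does not need to change when the deployment moves.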
    Rockwell
    @Rockwellwindsor_twitter
    Thanks, I was struggling to get it working but it fell into place yesterday. Thanks to the team who put this together, it works like a charm!
    Conrad Jackson
    @conradj
    I needed to get the (fantastic) Mercury Parser working yesterday. Being of the firm belief that there has got to be an easier alternative to AWS for anything not running at scale, I came up with deploying it to Zeit Now, which is a really easy to use serverless platform. All you have to do is create a Now account, and then fork my repo and you'll have your own version of Mercury running on the web. Links and instructions at https://github.com/conradj/now-mercury-parser.
    Сказочные Новости
    @skz_news_twitter
    Hi, how can I get text only, not HTML?
    Nirawit Jittipairoj
    @SixFingeredAmish
    Hi everyone
    How do I use Mercury Parser? I installed it using npm. What's next?
    Adam Pash
    @adampash
    Specifically, to get text only:
    Mercury.parse(url, { contentType: 'text' }).then(result =>
      console.log(result)
    );
    @SixFingeredAmish Have you written/deployed code in Node.js before? The basic usage should get you started, but let me know if you have any more specific questions.
    Richard Fairbanks
    @RichardFairbanks
    Greetings, folks!
    (Please excuse me for missing a couple of months . . . )
    I can run mercury-parser just fine in Terminal on my Mac (macOS 10.14.5), but it is not recognized when I call it via AppleScript:
    do shell script "mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source"
    I get the result:
    sh: mercury-parser: command not found
    I have searched for the executable in all the usual places (starting with my paths list) and then a full disk search, including hidden files, but to no avail.
    Assuming that I need the full path for the AppleScript call, where is the mercury-parser executable to be found?
    Blessings, and thank you!!!
    Howard Camp
    @howiecamp_twitter
    I was reading this guide to installing the Mercury API on AWS Lambda - https://www.evernote.com/shard/s3/client/snv?noteGuid=e8251e3d-3938-47bb-9941-64bb7c6f57f2&noteKey=2a89a5cbc811cfa4&sn=https%3A%2F%2Fwww.evernote.com%2Fshard%2Fs3%2Fsh%2Fe8251e3d-3938-47bb-9941-64bb7c6f57f2%2F2a89a5cbc811cfa4&title=Installing%2Bthe%2BMercury%2BReader%2BAPI%2Bon%2BAWS%2BLambda - Can anyone explain the AWS resources (aside from Lambda) that this uses? I notice for example a massive number of S3 requests associated with each call to the Mercury Parse API via AWS Lambda.
    Richard Fairbanks
    @RichardFairbanks
    Greetings, folks!
    I have posted the solution to my Mac shell-script challenge with the Mercury Web Parser at:
    https://macscripter.net/viewtopic.php?pid=196840#p196840
    It has the AppleScript I’ve been using and the HTML template called in the script, as well as a screen shot of how it looks on my iPhone 7 Plus.
    It’s working great!
    Blessings, and thank you!!
    tgjma
    @tgjma
    Hi, I am finding that the mercury-parser CLI is very slow at processing web pages. The CLI takes around 15s to process a URL, whereas the Postlight hosted web service used to take around 5s. Any thoughts on where the bottleneck could be? I am running the mercury-parser CLI on a Mac.
    Richard Fairbanks
    @RichardFairbanks
    I can confirm tgjma’s reported fifteen-second delay, running the mercury-parser CLI on a Mac.
    Howard Camp
    @howiecamp_twitter
    @adampash Are there instructions for hosting this within a simple Node app as opposed to on AWS Lambda?
    Steve Upstill
    @upstill
    @conradj, I forked your now-mercury-parser and got it deployed to Now (as now-mercury-parser.upstill.now.sh), but when I hit it with a URL, it redirects to a URL that is identical except that it changes '://' to ':/'. I haven't touched any of the code. Does the parser work properly for you?
    Steve Upstill
    @upstill
    @MastaBaba My thanks as well for providing the solution that got me rolling. Server now running on my Linode!
    Babak Fakhamzadeh
    @MastaBaba
    You're welcome @upstill :)
    joosepP
    @joosepP
    Hey, I'm trying to install and run Mercury Web Parser, but I seem to be running into some problems.
    Is there someone here who could help me?
    zootooz
    @zootooz
    What's the best way to keep AWS updated with the latest GitHub changes?
    zootooz
    @zootooz
    Do I have to repeat the yarn deploy steps @adampash laid out all over again or is there some automated way to handle this? I'm finding the documentation available across the internet to be not well-focused to say the least.
    Bryan Hackett
    @BryanHackett_twitter
    Node 8.10 is losing support on AWS after 12/31. Are there any plans to update to support a newer version?
    zootooz
    @zootooz
    @BryanHackett_twitter This Gitter space seems to have gone dark, which is a concern as I don't know where we are supposed to get information. Perhaps there is a space attached to the GitHub repository?