A place to discuss using Mercury Web Parser, an open source utility that extracts content from the chaos of the web. https://github.com/postlight/mercury-parser https://github.com/postlight/mercury-parser-api
Mercury.parse()
? If so, that doesn't seem to be the case for me. In the example I posted above, the wikipedia.org parser doesn't seem to be executed, just the default Mercury parser. Is there some way to ensure that a custom parser is run that I'm missing?
Hello, I can't seem to pass the headers option through to the request successfully. I have it implemented as in the examples:
Mercury.parse(url, {
  headers: {
    dnt: '1',
    cookie: '__cfduid=vqwo24522env9832hgo23gh3g23gkewe',
    'accept-language': 'en-US,en;q=0.9,tr;q=0.8',
    'accept-encoding': 'gzip, deflate, br',
    accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
    'upgrade-insecure-requests': '1',
    'cache-control': 'max-age=0,no-cache',
    authority: 'medium.com'
  },
  contentType: 'markdown',
}).then(result => console.log(result));
Am I missing something? For now I have edited the source to work around it:
function parse(_x) {
  // Guard against calls without an options argument, e.g. parse(url)
  if (arguments[1] && arguments[1].headers)
    REQUEST_HEADERS = arguments[1].headers;
  return _parse.apply(this, arguments);
}
– mercury.js:6537
If this is an actual issue, I will submit a PR.
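One caveat with the workaround above: reassigning a shared REQUEST_HEADERS means headers from one call can leak into later calls. Here is a minimal sketch of merging per-call headers over defaults instead. The names DEFAULT_REQUEST_HEADERS, normalizeHeaders and buildRequestHeaders are hypothetical and not part of Mercury's API, and the default value is a placeholder.

```javascript
// Hypothetical stand-in for the library's built-in default request headers.
const DEFAULT_REQUEST_HEADERS = {
  'user-agent': 'example-agent/1.0', // placeholder, not Mercury's real default
};

// Lowercase header names so 'User-Agent' and 'user-agent' map to the same key.
function normalizeHeaders(headers) {
  const out = {};
  for (const [name, value] of Object.entries(headers || {})) {
    out[name.toLowerCase()] = value;
  }
  return out;
}

// Merge per-call headers over the defaults without mutating shared state.
function buildRequestHeaders(options = {}) {
  return {
    ...normalizeHeaders(DEFAULT_REQUEST_HEADERS),
    ...normalizeHeaders(options.headers),
  };
}
```

Each call gets its own header object, so a parse with custom headers followed by a parse without them falls back to the defaults.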
https://mercury.postlight.com/parser
to the new URL provided by AWS, is that the idea?
Mercury.parse(url, { contentType: 'text' }).then(result =>
  console.log(result)
);
do shell script "mercury-parser https://postlight.com/trackchanges/mercury-goes-open-source"
sh: mercury-parser: command not found
Hi, could you please help me get past these errors when running the ./preview script? (PowerShell, Win10 64-bit)
Martin
node ./preview https://archiweb.cz/n/domaci/v-opave-se-bude-stavet-novy-bazen-za-350-milionu-korun
Rebuilding Mercury
'MERCURY_TEST_BUILD' is not recognized as an internal or external command,
operable program or batch file.
child_process.js:649
throw err;
^
Error: Command failed: MERCURY_TEST_BUILD=true npm run build
'MERCURY_TEST_BUILD' is not recognized as an internal or external command,
operable program or batch file.
at checkExecSyncError (child_process.js:610:11)
at execSync (child_process.js:646:15)
at Object.<anonymous> (C:\app.martin\mercury-parser\mercury-parser\preview:20:3)
at Module._compile (internal/modules/cjs/loader.js:1139:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1159:10)
at Module.load (internal/modules/cjs/loader.js:988:32)
at Function.Module._load (internal/modules/cjs/loader.js:896:14)
at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
at internal/main/run_main_module.js:17:47 {
status: 1,
signal: null,
output: [
null,
<Buffer >,
<Buffer 27 4d 45 52 43 55 52 59 5f 54 45 53 54 5f 42 55 49 4c 44 27 20 69 73 20 6e 6f 74 20 72 65 63 6f 67 6e 69 7a 65 64 20 61 73 20 61 6e 20 69 6e 74 65 72 ... 59 more bytes>
],
pid: 26728,
stdout: <Buffer >,
stderr: <Buffer 27 4d 45 52 43 55 52 59 5f 54 45 53 54 5f 42 55 49 4c 44 27 20 69 73 20 6e 6f 74 20 72 65 63 6f 67 6e 69 7a 65 64 20 61 73 20 61 6e 20 69 6e 74 65 72 ... 59 more bytes>
}
MERCURY_TEST_BUILD=true npm run build
isn't supported on PowerShell? You may have to edit the preview script to play nicely with PowerShell; it assumes a *nix shell.
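A possible way to make that edit, as a rough sketch: the stack trace shows the preview script calling execSync with the `MERCURY_TEST_BUILD=true npm run build` string, and Node's execSync accepts an `env` option, so the variable can be passed there instead of as a shell prefix. The helper name runWithEnv is hypothetical, and this has not been tested against the actual preview script.

```javascript
const { execSync } = require('child_process');

// Run a shell command with extra environment variables supplied through the
// `env` option rather than a `NAME=value command` prefix, which only
// POSIX-style shells understand.
function runWithEnv(command, extraEnv = {}) {
  return execSync(command, {
    env: { ...process.env, ...extraEnv }, // keep PATH etc., add the overrides
    encoding: 'utf8', // return a string instead of a Buffer
  });
}

// Hypothetical replacement for the failing call in the preview script:
// runWithEnv('npm run build', { MERCURY_TEST_BUILD: 'true' });
```

Another common route is the cross-env package in the npm script, but that adds a dependency; the `env` option keeps the change inside the script itself.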