    mahmoudnabil
    @mnabil
    on another machine.
    I'm testing on a VM and a remote machine.
    It opens on the VM but not on the remote machine.
    Charles Green
    @charlesgreen
    Sorry, I’m not really following what you are trying to do. Are you saying the JavaScript is not rendering in Splash, or that you’re not getting back the response you expect?
    mahmoudnabil
    @mnabil
    It's OK, I'm just not getting the response that I expect.
    Charles Green
    @charlesgreen
    If you print(response.text), are you able to find the elements you are looking for? If you inspect the page, can you tell me the element ID so I can check as well?
    mahmoudnabil
    @mnabil
    @charlesgreen if you open the URL I sent you from the US or through a US proxy, you'll find a 'shop now' panel on the right side.
    Charles Green
    @charlesgreen
    I’m in Japan… I don’t have a proxy set up.
    mahmoudnabil
    @mnabil
    This panel is only shown when JavaScript is allowed.
    It's OK, the problem is that I'm using Splash and it still doesn't show the panel.
    Charles Green
    @charlesgreen
    Is it dependent on headers at all?
    mahmoudnabil
    @mnabil
    How do I know that?
    Charles Green
    @charlesgreen
    If you are able to view the website in Chrome, or through a tool like Charles proxy or Postman, then you can see the headers. If both the browser and Splash are running on the same machine, then I would guess that is where the difference is, but it’s just a guess.
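A minimal sketch of the comparison suggested above, assuming the request headers sent by the browser and by Splash have already been captured into two dicts by whatever tool is at hand. The function name `diff_headers` and the sample values are hypothetical.

```python
def diff_headers(browser_headers, splash_headers):
    """Return {name: (browser_value, splash_value)} for headers that differ."""
    # Header names are case-insensitive, so compare them lowercased.
    browser = {k.lower(): v for k, v in browser_headers.items()}
    splash = {k.lower(): v for k, v in splash_headers.items()}
    return {
        name: (browser.get(name), splash.get(name))
        for name in sorted(set(browser) | set(splash))
        if browser.get(name) != splash.get(name)
    }


if __name__ == "__main__":
    # Illustrative values only; real captures will differ.
    print(diff_headers(
        {"User-Agent": "Chrome", "Accept-Language": "en-US"},
        {"user-agent": "Splash", "accept-language": "en-US"},
    ))
```

A header that appears on only one side shows up with `None` for the other side, which makes a missing cookie or language header easy to spot.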
    mahmoudnabil
    @mnabil
    I'll see, thanks Charles.
    Charles Green
    @charlesgreen
    cheers.
    Umair Ashraf
    @umrashrf
    Hi, is there a way to send logs from Lua to Scrapy using Scrapy-Splash plugin?
    Charles Green
    @charlesgreen
    @umrashrf I’m not sure how to get logs from Lua out of the box, but looking at the code I don’t think it would take much effort to add logging in lua_runner. I don’t know about getting that kind of support merged upstream, but it should at least work locally in a fork. What type of output are you looking for? Errors, etc.? https://github.com/scrapinghub/splash/blob/master/splash/lua_runner.py
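One way this could be sketched without patching Splash: have the Lua script collect its own messages and return them alongside the HTML, then re-emit them with Python's logging on the Scrapy side. The names `LUA_SOURCE`, `log`, `logs` and `forward_lua_logs` are made up for illustration, and the dict passed in stands for the decoded JSON result of the `execute` endpoint (with scrapy-splash that would typically be `response.data`). The messages are joined into one string so the sketch does not depend on how Splash serializes Lua array tables.

```python
import logging

# Lua script for Splash's `execute` endpoint. It keeps messages in a local
# table and returns them as one newline-joined string next to the HTML.
LUA_SOURCE = """
function main(splash)
    local logs = {}
    local function log(msg)
        logs[#logs + 1] = msg
    end

    log("loading " .. splash.args.url)
    local ok, reason = splash:go(splash.args.url)
    if not ok then
        log("go failed: " .. tostring(reason))
    end
    return {html = splash:html(), logs = table.concat(logs, "\\n")}
end
"""

logger = logging.getLogger("splash.lua")


def forward_lua_logs(result):
    """Re-emit messages collected by the Lua script; return them as a list."""
    raw = result.get("logs") or ""
    messages = [line for line in raw.split("\n") if line]
    for message in messages:
        logger.info("lua: %s", message)
    return messages
```

In a spider callback this would be called as something like `forward_lua_logs(response.data)` before parsing the returned HTML.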
    Abd ar-Rahman Hamidi
    @hbakhtiyor
    @nramirezuy just interested, I don't fully understand.
    Nicolás Ramírez
    @nramirezuy
    @hbakhtiyor In order to solve the captcha I need to send it back to the spider, but to do this I can't just return the value, because this closes the tab. So I had to build a service where I was able to send the image via HTTP from Lua.
    After that I got to a part of the page where I needed to load a different frame to query on it, but this isn't supported. You can return the value of the frame back to Python, but for some reason I wasn't able to do so. So I ended up switching to Selenium. BTW, Firefox Selenium wasn't able to get to this frame either, but luckily Chrome was. (:
    Charles Green
    @charlesgreen
    Hi All, what is the best way to remove or redefine a JavaScript function from the DOM before the page is rendered? splash:runjs ?
    I’m currently doing the following in my Lua script.
        if string.find(splash.args.url, "thepage.html") ~= nil then
            assert(splash:runjs("timeOutCheck = function(){return;}"))
        end
    Charles Green
    @charlesgreen
    The function I am trying to replace does a few checks and if they fail it does a location.href redirect.
    Umair Ashraf
    @umrashrf
    @charlesgreen if it's in a .js file then maybe you can stop this resource from being downloaded?
    Charles Green
    @charlesgreen
    Hi Umair, thanks for your reply. It’s embedded in the page.
    Actually, the function checks for the existence of a form. If the form is not there, it uses location.href to redirect the browser.
    might work
    Charles Green
    @charlesgreen
    I believe set_content would clear the page that gets rendered. However, perhaps I can stop the page from running JavaScript.
    Umair Ashraf
    @umrashrf
    You can replace the page contents with the same contents, with the JS replaced,
    using regexes or selectors.
    Not sure if there is selector support in Splash.
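A rough sketch of the regex route, assuming the page declares the function inline as `function timeOutCheck(...)`. Rather than trying to match the whole function body (which breaks on nested braces), it renames the original declaration and places a no-op with the original name in front of it. The helper name `neutralize_timeout_check` and the `__disabledTimeOutCheck` identifier are hypothetical.

```python
import re

# Matches the start of a declaration such as `function timeOutCheck (`.
_DECLARATION = re.compile(r"function\s+timeOutCheck\s*\(")


def neutralize_timeout_check(html):
    """Rename the page's timeOutCheck and put a no-op in front of it."""
    replacement = (
        "function timeOutCheck(){return;} "
        "function __disabledTimeOutCheck("
    )
    # Only the first declaration is rewritten; other markup is untouched.
    return _DECLARATION.sub(replacement, html, count=1)


if __name__ == "__main__":
    page = (
        "<script>function timeOutCheck() "
        "{ if (!document.forms[0]) { location.href = '/x'; } }</script>"
    )
    print(neutralize_timeout_check(page))
```

The patched HTML could then be handed to Splash, for example through `splash:set_content` with the original URL as the base URL, so the redirecting code is never reachable under its original name. If the page assigns the function some other way (for instance `timeOutCheck = function ...`), the pattern would need adjusting.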
    Charles Green
    @charlesgreen
    thank you. will keep working on it.
    Umair Ashraf
    @umrashrf
    no problem, good luck
    Charles Green
    @charlesgreen
    thank you. it’s much appreciated. I’ll post an update with the solution.
    Charlie Smith
    @chuckus
    Hi all, I’ve been working on a drop-in replacement for Splash that uses Google Chrome, in particular the DevTools API, to implement the Splash HTTP API. It can be easily deployed as a Docker container, and the repo is hosted at https://github.com/chuckus/chromewhip. It’s still very much in early alpha, but the container has working functionality. My motivation was simply to get some practice with asyncio. If you have any suggestions, comments or improvements, please file an issue or pull request :)
    Charles Green
    @charlesgreen
    @chuckus sounds very cool. Will take a look.
    cadabrum
    @cadabrum
    Hello! What are the system requirements for running the Splash container in production? I'm running into a memory leak with the 3.0 Docker container: the OOM killer kills it at 6581516 kB after 120k+ processed requests.
    Any advice for reducing memory consumption?
    I've requested splash:html only, without processing any images.
    Moataz Hisham
    @mtzhisham
    Hi, I was wondering how to use a proxy with scrapy-splash while also using render.html as the endpoint. Splash is running in the Docker container.
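A possible starting point, assuming the Splash version in use accepts a `proxy` argument (a proxy URL or profile name) on its render endpoints. The sketch only builds the render.html request URL with the standard library; `render_html_url`, the Splash address and the proxy address are placeholders, not values from this conversation.

```python
from urllib.parse import urlencode


def render_html_url(splash_base, page_url, proxy, wait=0.5):
    """Build a GET URL for Splash's render.html endpoint with a proxy set."""
    # Same arguments one would pass as `args` to a scrapy-splash request.
    args = {"url": page_url, "proxy": proxy, "wait": wait}
    return splash_base.rstrip("/") + "/render.html?" + urlencode(args)


if __name__ == "__main__":
    # Placeholder addresses for illustration.
    print(render_html_url(
        "http://localhost:8050/",
        "http://example.com",
        "http://10.0.0.1:3128",
    ))
```

With scrapy-splash the same dict would presumably be passed as `args` to `SplashRequest(url, callback, endpoint='render.html', args=...)`. Note that a proxy address has to be reachable from inside the Docker container, not just from the host.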
    Wenxing Zheng
    @wenxzhen
    Dear all, how can I capture the requests that Splash sends to the internet, especially when they go out through a proxy?
    habout632
    @habout632
    Hi
    Is anybody there?
    Splash and Chrome:
    what is the difference between them?
    Charles Green
    @charlesgreen
    @habout632 I'm a few days late to reply. Can you give a bit more context? What differences would you like to know? Have you read the docs?
    mahmoudnabil
    @mnabil
    Hi guys, is Scrapinghub hiring remote software engineers? :D