Pinglei Guo
@at15
can I use scrala in Java programs? I am new to Scala
arrowrowe
@arrowrowe
Hi everyone!
Ce Gao
@gaocegege
@at15 maybe yes, I have used Scala libraries in Java.
Pinglei Guo
@at15
ok, I need a scraper for my boring course project, I may give scrala a try, thx
Ce Gao
@gaocegege
Good news: the project isn't abandoned. I'm so proud.
Pinglei Guo
@at15
Bad news: the day before my thesis defense and I'm still goofing off. I'm doomed.
Pinglei Guo
@at15
Feels like it got the official stamp of approval.
Ce Gao
@gaocegege
Pinglei Guo
@at15
0.0
Ce Gao
@gaocegege
That was just me testing the GIF feature = =
Pinglei Guo
@at15
/w\
Ce Gao
@gaocegege
My conscience got the better of me: the project will go back into active development soon
Now have a plan for redevelopment
:tada:
Pinglei Guo
@at15
Touch your conscience, doesn't it hurt?
Joshua Goldberg
@JoshSGman
Hi! Is there a built in way to crawl recursively?
Ce Gao
@gaocegege

Hi, @JoshSGman

I think yeah. You could parse the URL, enqueue the new URLs into the scheduling queue, and define another function to deal with these new URLs.

I am not sure if I understand your meaning. Do you mean that you want to crawl the new URLs in one page, then the new URLs in the new pages?
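The enqueue-and-parse idea above can be sketched in plain Scala. Note this is not scrala's actual API, just an illustration of the loop: dequeue a URL, "parse" it, and push any newly discovered URLs back onto the scheduling queue. The `site` map stands in for real HTTP fetching and link extraction.

```scala
import scala.collection.mutable

object CrawlSketch {
  // Stand-in for the web: each URL maps to the links found on that page.
  val site: Map[String, List[String]] = Map(
    "/"  -> List("/a", "/b"),
    "/a" -> List("/b", "/c"),
    "/b" -> Nil,
    "/c" -> Nil
  )

  // Visit every reachable URL once: parse a page, enqueue the new URLs,
  // and keep going until the scheduling queue is empty.
  def crawl(start: String): Set[String] = {
    val queue   = mutable.Queue(start)
    val visited = mutable.Set.empty[String]
    while (queue.nonEmpty) {
      val url = queue.dequeue()
      if (!visited(url)) {
        visited += url
        site.getOrElse(url, Nil).foreach(u => queue.enqueue(u))
      }
    }
    visited.toSet
  }
}
```

For example, `CrawlSketch.crawl("/")` visits all four pages, following `/a -> /c` even though `/c` is not linked from the start page.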

Joshua Goldberg
@JoshSGman

@gaocegege Hey! Thanks for the response - stepping through the methods, I wasn't quite sure what will enqueue the following links. Essentially, I would like to keep crawling a site until it's 10 levels deep. Would I use the request method to enqueue the new URLs? Does request ultimately call Parse again?

Thanks in advance!

Joshua Goldberg
@JoshSGman
Another question: is there also a way to make the URL available to the parse function? It would be helpful when normalizing subsequent links on the page.
Ce Gao
@gaocegege
  1. Yeah, request will enqueue the new URLs
  2. I am afraid not. But it could. Welcome to contribute a patch to scrala :)
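The "10 levels deep" limit can be handled by carrying a depth alongside each enqueued URL and refusing to follow links past the cap. The sketch below is again plain Scala rather than scrala's API; `links` is a hypothetical stand-in for link extraction, here a chain where page `/0` links to `/1`, `/1` to `/2`, and so on.

```scala
import scala.collection.mutable

object DepthLimitedCrawl {
  // Hypothetical link extractor: page "/n" links only to "/n+1".
  def links(url: String): List[String] = {
    val n = url.drop(1).toInt
    List(s"/${n + 1}")
  }

  // Enqueue (url, depth) pairs; stop following links beyond maxDepth.
  def crawl(start: String, maxDepth: Int): List[String] = {
    val queue   = mutable.Queue((start, 0))
    val visited = mutable.ListBuffer.empty[String]
    while (queue.nonEmpty) {
      val (url, depth) = queue.dequeue()
      visited += url
      if (depth < maxDepth)
        links(url).foreach(next => queue.enqueue((next, depth + 1)))
    }
    visited.toList
  }
}
```

With `maxDepth = 10` the crawler visits the start page plus ten levels of links (11 pages on this chain) and never fetches `/11`.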
Joshua Goldberg
@JoshSGman
@gaocegege Thanks! Will look into it :)