    Kedar
    @kedarv
    Hey @utkarshkukreti, I'm trying to use http://alexcrichton.com/curl-rust/curl/index.html with this library, but having some issues. From what I understand, I need to curl the HTML page, and then pass it to Document::from(), right?
    Utkarsh Kukreti
    @utkarshkukreti
    Hey @kedarv. I'm sorry I thought I'd get a notification if someone tags me here but I just logged in after months and saw this message now :(
    curl-rust's API looks a bit weird. Let me know if you still have any issues and I'll try to help. I've only used Hyper for HTTP with select.rs in my projects.
    I've set gitter to send me email notifications for any messages here now. I hope it works.
    blakehawkins
    @blakehawkins
    @kedarv hi, in 0.5.0 I can no longer use .find().iter(). I previously did something like x.find.iter().zip(y.find.iter()).take(10) to work with pairs of objects. How can I duplicate this in 0.5.0?
    @utkarshkukreti please see above. @kedarv terribly sorry for pinging you!
    blakehawkins
    @blakehawkins
    looks like x.find().zip(y.find()).take(10) works just fine :)
    Utkarsh Kukreti
    @utkarshkukreti
    @blakehawkins Sorry I missed this notification as well :( Yes, find() earlier returned an eagerly evaluated collection of Nodes which was very inefficient if you only needed a few matching items. Now it returns an Iterator so the solution you found is the one I'd have recommended if I were here then!
    Lakelezz
    @Lakelezz
    Hello @utkarshkukreti, I was trying to pick one attribute from some meta-tags, but literally only one, I managed to get it to work via a for-loop, but that feels odd to do for just one. This is what I have currently: https://play.rust-lang.org/?gist=5a66cfa49fe6e361566df64d1be63e0b&version=stable
    Is it possible to turn this into a one-liner?
    Utkarsh Kukreti
    @utkarshkukreti

    Yep:

    document.find(Attr("name", "Author")).next().and_then(|node| node.attr("content"))

    This will return an Option<&str>. It'll be None if the element doesn't exist OR the element doesn't have a "content" value. You can unwrap it if you want.

    (Forgot to tag you @Lakelezz.)
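For readers without select.rs at hand, the shape of that one-liner can be sketched with the standard library alone. `Node`, `Node::new`, and `first_content` below are hypothetical stand-ins and not part of select.rs; the sketch only illustrates the two ways the chain can end in `None`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed element: just a bag of attributes.
struct Node {
    attrs: HashMap<String, String>,
}

impl Node {
    /// Builds a node from (attribute name, attribute value) pairs.
    fn new(pairs: &[(&str, &str)]) -> Node {
        Node {
            attrs: pairs
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect(),
        }
    }

    /// Returns the attribute value, or None when the attribute is absent.
    fn attr(&self, name: &str) -> Option<&str> {
        self.attrs.get(name).map(|s| s.as_str())
    }
}

/// Takes the first node whose `name` attribute equals `wanted`,
/// then looks up its `content` attribute.
fn first_content<'a>(nodes: &'a [Node], wanted: &str) -> Option<&'a str> {
    nodes
        .iter()
        .filter(|node| node.attr("name") == Some(wanted))
        .next()
        .and_then(|node| node.attr("content"))
}

fn main() {
    let nodes = vec![
        Node::new(&[("name", "Author"), ("content", "Jane Doe")]),
        Node::new(&[("name", "Keywords")]),
    ];
    // Element exists and has a "content" attribute.
    assert_eq!(first_content(&nodes, "Author"), Some("Jane Doe"));
    // Element exists but has no "content" attribute.
    assert_eq!(first_content(&nodes, "Keywords"), None);
    // No matching element at all.
    assert_eq!(first_content(&nodes, "Description"), None);
}
```

Because `next()` already yields an `Option`, `and_then` folds both failure cases into a single `None` without a loop.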
    Lakelezz
    @Lakelezz
    @utkarshkukreti works splendid! thanks a lot for your time : )
    Lakelezz
    @Lakelezz
    @utkarshkukreti hello again! sometimes, when I try to do a POST via reqwest and then use it in select.rs, I get this: Error { repr: Custom(Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }) }, when exactly does this error occur? Not all responses are affected by this, so I assume there is something about the character encoding going wrong? Also, is it possible to try fixing the document?
    Utkarsh Kukreti
    @utkarshkukreti
    @Lakelezz That error is created in Document::from_read if the std::io::Read contained an invalid UTF-8 sequence somewhere. I'd suggest reading it yourself into a Vec<u8> and doing a lossy conversion to String (which will replace all invalid sequences). I'm typing this from memory, let me know if you get this working or encounter any problems:
    let mut bytes = Vec::new();
    read.read_to_end(&mut bytes)?;
    let string = String::from_utf8_lossy(&bytes);
    let document = Document::from(&*string);
    Where read is the io::Read instance.
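The lossy-read step can be tried on its own with only the standard library. This is a minimal sketch: `read_lossy` is a hypothetical helper name, and a `Cursor` over a byte vector stands in for the HTTP response body.

```rust
use std::io::{self, Cursor, Read};

/// Reads everything from `reader` and decodes it as UTF-8, replacing each
/// invalid sequence with U+FFFD instead of returning an error.
fn read_lossy<R: Read>(mut reader: R) -> io::Result<String> {
    let mut bytes = Vec::new();
    reader.read_to_end(&mut bytes)?;
    Ok(String::from_utf8_lossy(&bytes).into_owned())
}

fn main() -> io::Result<()> {
    // A lone 0xE7 byte is valid Latin-1 but not valid UTF-8.
    let body = Cursor::new(b"<p>caf\xE7</p>".to_vec());
    let html = read_lossy(body)?;
    // Only the offending byte is replaced; the rest survives.
    assert_eq!(html, "<p>caf\u{FFFD}</p>");
    // With select.rs, `html.as_str()` would then be handed to Document::from.
    Ok(())
}
```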
    Lakelezz
    @Lakelezz
    Thanks a lot, works : )
    Jordan Petridis
    @alatiera
    Are there any blog-posts/guides/tours of the select crate? Would be nice to have an overview of how it works!
    Utkarsh Kukreti
    @utkarshkukreti
    I'm not aware of any :(
    Lakelezz
    @Lakelezz
    hello, @utkarshkukreti : ) so one page had the letter 'ç' which probably caused another invalid utf-8-error. Sadly, after doing String::from_utf8_lossy() I get an empty string. Is there any other route I can take as in still process the page? : / A bit frustrating if a single character can totally blow away everything.
    Utkarsh Kukreti
    @utkarshkukreti
    Hey @Lakelezz, is it possible to share the URL of that page? I'll check it out. from_utf8_lossy should only replace the parts of the string that are not valid UTF-8.
    Lakelezz
    @Lakelezz
    @utkarshkukreti, I realised my mistake. I did not know that reading actually affects the content of the stream. So, failed lossless reading emptied the stream... hence trying strategy 2 via lossy conversion on the same and now empty stream read nothing at all, haha. I decided to go lossy on every page, which solves the problem.
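The pitfall described above (a second read on an already drained stream returns nothing) can be avoided by reading the bytes once and attempting both decodings on the same buffer. A minimal sketch using only the standard library follows; `decode_once` is a hypothetical helper, and the `Cursor` stands in for the response stream.

```rust
use std::io::{self, Cursor, Read};

/// Reads the stream exactly once, tries strict UTF-8 decoding, and falls
/// back to lossy decoding on the same buffered bytes.
/// The returned flag is true when the lossy fallback was needed.
fn decode_once<R: Read>(mut reader: R) -> io::Result<(String, bool)> {
    let mut bytes = Vec::new();
    reader.read_to_end(&mut bytes)?;
    match String::from_utf8(bytes) {
        Ok(text) => Ok((text, false)),
        // The error still holds the original bytes, so nothing is re-read.
        Err(err) => Ok((String::from_utf8_lossy(err.as_bytes()).into_owned(), true)),
    }
}

fn main() -> io::Result<()> {
    // Reading to the end drains the stream: the second read gets nothing.
    let mut stream = Cursor::new(b"fa\xE7ade".to_vec());
    let mut first = Vec::new();
    let mut second = Vec::new();
    stream.read_to_end(&mut first)?;
    let second_len = stream.read_to_end(&mut second)?;
    assert!(!first.is_empty());
    assert_eq!(second_len, 0);

    // Decoding from one buffered read keeps the content.
    let (text, was_lossy) = decode_once(Cursor::new(b"fa\xE7ade".to_vec()))?;
    assert_eq!(text, "fa\u{FFFD}ade");
    assert!(was_lossy);
    Ok(())
}
```

Going lossy on every page, as settled on above, is the simpler route; the fallback variant only matters if you want to know which pages contained invalid bytes.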
    Lakelezz
    @Lakelezz
    hello again @utkarshkukreti! whenever my code runs Document::from(), my logger says: WARN:html5ever::tree_builder::actions: stop_parsing not implemented, full speed ahead!, what is the cause and how can I stop this?
    Utkarsh Kukreti
    @utkarshkukreti
    Looks like you're running into servo/html5ever#219. I think env_logger has a way to disable logs per module which you can use here to disable everything from html5ever crate.