Mike Dalessio
@flavorjones
@fulldecent ACK. Just commented on the PR and will review ASAP
Qqwy / Wiebe-Marten
@Qqwy
Hi!

When doing the following:

Nokogiri::HTML.fragment("<a href='https://foo.com?a=1&b=2'></a>").to_s
# => "<a href=\"https://foo.com?a=1&amp;b=2\"></a>"

in the output, the ampersand is escaped

Am I doing something wrong here?
(My real use-case is iterating over all a[href]s in the document and altering the URLS)
Nokogiri::HTML("<a href='https://foo.com'>foo</a>").search("//a").each do |n| n.attributes["href"].value = "https://foo.com?q=a&x=y" end.to_s
Qqwy / Wiebe-Marten
@Qqwy
Hmm, I learned something new today!
Turns out that ampersands in URLs should always be escaped when the URL appears inside an HTML attribute value.
I only hope that no double escaping will happen, where &amp; is expanded into &amp;amp; in this example
Shlomi Fish
@shlomif
hi all! The "tutorials" link here is broken - https://nokogiri.org/
Mike Dalessio
@flavorjones
expect a few minutes of CI downtime, updating to https://github.com/concourse/concourse/releases/tag/v6.0.0
guillermo haas-thompson
@memoht
I am strugglebussing to parse an XML feed with Nokogiri to create records in a Rails database. I've tried multiple times over the years to get an XML feed to parse and managed to avoid it by going other routes (CSV import, JSON files, hitting an API). I have a new task for a side project that is forcing me to revisit XML parsing. I think my example is straightforward enough, and I'm wondering if anyone has a few minutes to help me through this. [Simplified example: https://gist.github.com/memoht/1dc78f0f005abbb8d01267519ce386f1]
Mike Dalessio
@flavorjones
@memoht Hi there! Sorry for the slow response, have been moving onto a new laptop and missed the notifications. I'm happy to try to help! FWIW I can parse the XML in your gist just fine using Nokogiri::XML(xml) ... can you be more specific about what you're trying to do?
guillermo haas-thompson
@memoht
My end goal is to parse the feed via a rake task a couple of times a day, and either create or update records in the Rails app, matching on the referencenumber field in the XML. That is in the 2nd file of the gist.
guillermo haas-thompson
@memoht
Current state > Now able to create new Job records and skip if record exists (searching by referencenumber in XML field) but still unable to update existing records. So, progress-ish.
guillermo haas-thompson
@memoht
@flavorjones Gist updated with current state of affairs (the horror, look away) https://gist.github.com/memoht/e693d8bffc433e8d63d8cbc8d2ceebe0
guillermo haas-thompson
@memoht
Current state > Got it working via some assist, but not the most efficient. Taking advantage of first_or_initialize >> would still love to see a cleaner way of achieving this.
Mike Dalessio
@flavorjones
@memoht I'm still not sure what you need help with? The code you're using to parse each job record seems fine and similar to how I would approach the problem. Can you be more specific about what you're looking for help with?
guillermo haas-thompson
@memoht
@flavorjones TBH, I didn't know if my code was the cleanest. The section where it tries to create a new record, or update if record already exists (searching by ref_no field) felt like I didn't do it so well. It works, but I was wondering if there was a more efficient approach.
I spent a lot of effort trying to convert the data to a Hash first because I was more familiar with that. The experience has helped me understand Nokogiri a bit better.
Mike Dalessio
@flavorjones
OK, since you asked - having scraped many feeds and sites in my day, I do have one strong opinion about how to structure that code. Specifically, my preference is to have a clean separation between parsing and extracting the data and storing the data.
Mike Dalessio
@flavorjones
For example, with your current code, there's no way to test that the parsing is correct without also storing records in a database -- it's all done in one method, making it hard to figure out whether something is wrong with the parsing or if something is wrong with the database or ORM code.
Maybe imagine how you could take this same code, but restructure it to have one method that accepts XML and returns, e.g., an array of attribute hashes. Then a second method could accept the array of attribute hashes and update or insert records into the database.
Anyway. I'm not criticizing at all! Take this advice with a grain of salt, it's just what I've done in the past.
guillermo haas-thompson
@memoht
I agree and appreciate the input. When something goes sideways, it just does. I plan to iterate back over. I was surprised I got this to work. I didn't figure this out by reading the docs unfortunately, but more through searching and trial and error. I wish the docs covered a bit more items in detail (well more like detail helpful for newcomers. I did read through the actual RDocs as well, I just need to get better at that). Thanks, stay safe and have a great day. @flavorjones LLAP
Mike Dalessio
@flavorjones
Getting ready to ship v1.10.10 which will have precompiled Ruby 2.7 support for Windows (#2029)
Mike Dalessio
@flavorjones
CI is down for a bit, tearing down the infrastructure and rebuilding.