Jen Hammock
@jhammock
Cool!
Jeremy Rice
@JRice
...I... could write some kind of script to replace this in every file, but ... that would take me a while to write and it would take a long time to run. :S
Jen Hammock
@jhammock
or Eli could fix it on his end?
Jeremy Rice
@JRice
I think we should either a) rewrite the JS to use the current format or b) ask Eli to produce the right output. :S
Jen Hammock
@jhammock
your call. I’m happy to wait while something runs
Jeremy Rice
@JRice
...Yeah, it'll take him just as long as it'll take me ... AND I'm going to have to load all of the new-new data back into place when it's fixed ... but... yeah.
I think "fixing" the JS to accept the current format is the least effort .... but also the dumbest: it would either be turning the JSON into an eval() call (which makes my teeth itch: you shouldn't be EVAL-ing code willy-nilly) or means replacing the first few words in the file before treating it like data. Ugh.
Maybe I'm being too delicate here, but I don't like either of those. They each violate coding principles. I'd rather fix the data.
...But this is going to take some time. Hmmmn. Let me experiment.
Jen Hammock
@jhammock
just as you like, but I’m happy to wait for Eli to fix the input...
Jeremy Rice
@JRice
In the meantime, I've restored old maps.
Jeremy Rice
@JRice
Okay, I've ... actually written a script to do this. ...at least, one directory at a time (and there are 100 directories). It's actually pretty fast (considering how many files there are in each dir), so I think this is the most viable solution.
Jen Hammock
@jhammock
Cool. And… that’s fixing the data, so for next time we update GBIF, Eli should also make that fix?
Jeremy Rice
@JRice
Yesplease.
NO "var data = " at the top of the file, please.
Jen Hammock
@jhammock
thanks; I’ll put that in the ticket
Jeremy Rice
@JRice
For posterity:
for D in ./map_data_dwca/*; do
    if [ -d "${D}" ]; then
        echo "${D}"
        for file in "${D}"/*; do
            echo "$file"
            if [ -f "$file" ]; then
                echo "fixing $file"
                # Drop the first 11 bytes ("var data = ") by starting output at byte 12.
                tail -c +12 "$file" > "$file.truncated" && mv "$file.truncated" "$file"
            fi
        done
    fi
done
(Ideally, I would have preferred some check in there that said "if the file starts with 'var data = ' then do this..." but I didn't want to figure that out right now. I have a backup if we need to restore a few odd non-data files hanging out in that tree.)
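A minimal sketch of what that check might look like, assuming every affected file begins with exactly the 11 bytes "var data = " and that `head -c` is available (it is on GNU and BSD systems). The names PREFIX, strip_prefix, and fix_tree are illustrative and not part of the script that was actually run:

```shell
#!/usr/bin/env bash
# Hypothetical variant of the loop above: only touch files that really
# start with the unwanted prefix, and leave everything else alone.

PREFIX='var data = '

# Strip PREFIX from the start of one file if (and only if) it is present.
strip_prefix() {
    local file="$1"
    # Compare the first ${#PREFIX} bytes of the file against the prefix.
    if [ "$(head -c "${#PREFIX}" "$file")" = "$PREFIX" ]; then
        # tail -c +N starts output at byte N, so this skips exactly the prefix.
        tail -c "+$(( ${#PREFIX} + 1 ))" "$file" > "$file.truncated" \
            && mv "$file.truncated" "$file"
        echo "fixed $file"
    else
        echo "skipped $file"
    fi
}

# Apply strip_prefix to every regular file one level below the root directory.
fix_tree() {
    local root="$1" dir file
    for dir in "$root"/*; do
        [ -d "$dir" ] || continue
        for file in "$dir"/*; do
            [ -f "$file" ] || continue
            strip_prefix "$file"
        done
    done
}

# Example invocation (not run automatically):
# fix_tree ./map_data_dwca
```

Because files without the prefix are skipped, a run like this would be safe to repeat, and stray non-data files in the tree would not lose their first 11 bytes.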
Jeremy Rice
@JRice
Okay, I am now technically babysitting three jobs: the sitemap rebuild, the beta database population of map ids, and the fixing of the files (on the production side). :\ I'll have to reverse those two on Monday: update the prod DB and fix the beta files. ...and the sitemap should finish tomorrow...
Jen Hammock
@jhammock
riiiiiight, yes, and then we’ll have working maps on both sides
Jeremy Rice
@JRice
In theory. :) That's the plan, anyway...
Jen Hammock
@jhammock
of course :)
Jeremy Rice
@JRice
Beta maps have the potential to be wonky, because page_ids (esp in the upper range) can be different there, but I don't think that's a big deal and it will self-resolve when we start syncing the data between the two.
Jen Hammock
@jhammock
yeah, meanwhile, no biggie on beta
JRice @JRice nods
Jeremy Rice
@JRice
A more productive week than average of late, anyway...
Jen Hammock
@jhammock
Yup, lining ‘em up, knocking ‘em down :)
and almost to regression tests
Jeremy Rice
@JRice
Indeed!
Jeremy Rice
@JRice
Aaaaand the fixing-files script just finished, but I'd rather wait until Monday to flip the switch in case things go wrong (it's quitting time now).

Flip the switch on the new maps on Monday

^^ Note to self.
Jen Hammock
@jhammock
:+1:
Jeremy Rice
@JRice
...The DB-import script is still running, though. :\
In fact, it looks like it's probably only about 1/3rd done. So.... should finish tomorrow-ish.
Jen Hammock
@jhammock
cool. The switcheroo can wait for Monday too
Jorrit Poelen
@jhpoelen
@jhammock just looking at dynamic hierarchy and noticed that https://eol.org/pages/281 (Plantae) is not part of it . Can you confirm? Got DH from https://editors.eol.org/other_files/DWH/TRAM-809/DH_v1_1.tar.gz .
Jen Hammock
@jhammock
You have the right file. 281 didn’t make it through the migration, I’m afraid. You may want https://eol.org/pages/42430800
Jorrit Poelen
@jhpoelen
@jhammock thanks for confirming, much appreciated.
Jeremy Rice
@JRice
...checking my calendar this morning I see that I ALSO need to take Thursday morning off: I have a dentist's appointment. So: Tuesday and Thursday mornings.
Jen Hammock
@jhammock
Tuesday and Thursday, cool
Michael Vitale
@mvitale
@jrice just seeing your sitemap thing now. I have no idea what that’s all about. I haven’t changed anything in there recently, I don’t think. Let me know if you still need me to investigate
Jeremy Rice
@JRice
Nope, I fixed it. Thanks.
@mvitale ^^
Michael Vitale
@mvitale
:+1:
Jeremy Rice
@JRice
Well, sitemap is no longer running, but the last file it created was sitemap297.xml.gz
Huh. Well, it DOES look like it finished.
Sitemap stats: 14,816,104 links / 297 sitemaps / 5637m05s
Jeremy Rice
@JRice
This was lovely and I recommend you watch it: https://www.youtube.com/watch?v=txJTUVpVCT4
Jen Hammock
@jhammock
yes, that’s a beautiful series