Vladimír Čunát
@vcunat
Thanks for the patch.
Andreas Rammhold
@andir
libfaketime is next.. doesn't seem like anyone with a recent gcc (>=6) has used it before.. :/
Vladimír Čunát
@vcunat
Yes, I bumped into that just today for the first time.
Andreas Rammhold
@andir
I just ifdef'ed those NULL checks out for now but that doesn't feel right.. It never feels right to remove sanity checks..
Vladimír Čunát
@vcunat
-Wno-error=... probably
Andreas Rammhold
@andir
I've a feeling we (as in all developers on the planet) just keep collecting more and more warning exceptions... but that's probably okay here
Ondřej Surý
@oerdnj
@andir: If you are ready for another round of testing, the updated packages have a fix for the out-of-the-bailiwick glitch
Andreas Rammhold
@andir
oerdnj: upgrading :-)
Andreas Rammhold
@andir
well it SIG7's (BUS error) after a few seconds
Ondřej Surý
@oerdnj
can you get us a coredump?
Andreas Rammhold
@andir
just installed coredumpctl..
Ondřej Surý
@oerdnj
@andir Can you remove the cache (rm /run/knot-resolver/cache/*.mdb) and try again? The crash in lmdb suggests something with the cache
Andreas Rammhold
@andir
I'll do that just now.. was trying to find some liblmdb debug symbols without luck :/
Vladimír Čunát
@vcunat
You can build with embedded lmdb and thus generate them just by -g
Andreas Rammhold
@andir
oerdnj: it died again
but no coredump yet :/
ahh well... it did log 50GB of stuff... I'll have to remove the verbose flag or get larger disks for those tests.. I'll run it again with empty caches
Ondřej Surý
@oerdnj
so it ate all the disk space? and it's crashing because lmdb fails to write to cache?
Andreas Rammhold
@andir
well it worked before... I'm trying to reproduce now :-)
Vladimír Čunát
@vcunat
Out-of-space on cache write should be handled on our side, but it's certainly an almost untested condition.
Actually, LMDB creates a holey (sparse) file and mmaps it into memory. I'm not sure if it's possible to catch the situation when it accesses a part that has no disk space assigned yet while the disk is full ATM.
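To illustrate the sparse-file point above, here is a minimal sketch in Python (not LMDB code, and not what knot-resolver does). It assumes a Unix-like system and a filesystem that supports sparse files; the helper name `sparse_file_sizes` is made up for this example. It shows that a file's apparent size can be far larger than the disk space actually assigned to it, which is why a write through a memory mapping can fail later, when a block first needs to be allocated.

```python
import os
import tempfile


def sparse_file_sizes(size_bytes):
    """Create a temporary sparse file and return (apparent_size, allocated_bytes).

    Hypothetical helper for illustration only. st_blocks is Unix-specific
    and is counted in 512-byte units.
    """
    with tempfile.NamedTemporaryFile() as f:
        # Extend the file without writing data: on most Unix filesystems
        # this creates a hole, so no disk blocks are assigned yet.
        os.ftruncate(f.fileno(), size_bytes)
        st = os.fstat(f.fileno())
        return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    apparent, allocated = sparse_file_sizes(64 * 1024 * 1024)
    print("apparent size:", apparent, "bytes; allocated:", allocated, "bytes")
```

On a filesystem with sparse-file support the allocated figure is typically zero or close to it, even though the apparent size is 64 MiB.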
Andreas Rammhold
@andir
I did some tuning of the rsyslog settings.. they are rather VERBOSE on Debian by default.. logging stuff >=2x + journald ...
hopefully that false result won't happen again
Ondřej Surý
@oerdnj
@andir So, everything running smoothly? :)
Andreas Rammhold
@andir
@oerdnj so far so good, it auto upgraded to the version from this morning
Andreas Rammhold
@andir
For a while the predict.queue metric also declined.. now with a 12h period it only grows (even after >12h) :/ Not sure if 12h is a good idea or practical at all..
Vladimír Čunát
@vcunat
I think the predict module isn't very advanced ATM. For high-traffic resolvers it might only be usable with a short window (and period).
Vladimír Čunát
@vcunat
Note that the predictions are done from a table of estimated most-frequent queries - and that table has ~5k lines.
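As a rough illustration of why a fixed-size table can only give estimated frequencies, here is a sketch of the Space-Saving algorithm in Python. This is a swapped-in textbook technique, not necessarily what knot-resolver uses for its table; the function name `space_saving` and the sample query names are assumptions made for the example.

```python
def space_saving(stream, capacity):
    """Estimate the most frequent items using at most `capacity` counters.

    Space-Saving: when the table is full, the item with the smallest
    count is evicted and the newcomer inherits that count plus one.
    Counts are therefore upper bounds, never underestimates.
    """
    counts = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < capacity:
            counts[item] = 1
        else:
            # Linear scan for the minimum; fine for a sketch, a real
            # implementation would use a smarter structure.
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            counts[item] = floor + 1
    return counts


if __name__ == "__main__":
    queries = ["a.example."] * 5 + ["b.example."] * 3 + ["c.example.", "d.example."]
    print(space_saving(queries, capacity=2))
```

With only two counters, the rare `d.example.` ends up with an inflated count because it inherits the count of the entry it evicted. The same effect, at a larger scale, is why a bounded table on a high-traffic resolver is only an approximation.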
Andreas Rammhold
@andir
I'll investigate prediction tuning when I'm back from congress/holiday travels.
image.png
that's my current graph of the predict queue :-) Looked fine a few versions ago.. can't remember when I switched to the rather large window/period
Andreas Rammhold
@andir
I just ran into a SERVFAIL condition where some records (e.g. google.com) did respond with SERVFAIL and others worked right. Cache flush did fix it. Version is latest from yesterday. I'll have time to check logs tomorrow if required :-)
Vladimír Čunát
@vcunat
An A record for google.com failed sometimes?
Andreas Rammhold
@andir
always
from some point in time on
matrixbot
@matrixbot
ondrej Do you have any idea how full your cache was at that time?
Andreas Rammhold
@andir
I can pull up the graphs later. flokli might be able to give us that information. I'm out with my phone only :-)
matrixbot
@matrixbot
ondrej I am watching Little Mole with my kids and I am on my phone ;)
Vladimír Čunát
@vcunat
We merged major changes to master today, and these fixed almost all of the bad responses we knew of. With google.com there's the "problem" that they serve different DNS in different locations...
matrixbot
@matrixbot
ondrej The Debian/Ubuntu packages were already on the vld-refactoring branch, but there were more fixes today. I'll package the updated version today or tomorrow morning....
Florian Klink
@flokli
vcunat, ondrej: these are the graphs: http://i.cubeupload.com/AR5OSj.png
cache.delete shows pretty well when the cache was flushed
the ~10-15 minutes before that look strange
(scroll down first)
Andreas Rammhold
@andir
thanks flokli :-)
Florian Klink
@flokli
np
Andreas Rammhold
@andir
are we missing a cache.size graph?
Florian Klink
@flokli
I think those are all the graphs I had... Let me check
Florian Klink
@flokli
andir: nope, no cache.size