smichnowicz
@smichnowicz

I am posting here as nhc@lbl.gov bounced. I am using nhc 1.4.2 on a CentOS 7 system, with bash 4.2.46(1)-release (x86_64-redhat-linux-gnu).

Our logs are filling up with stack-smashing errors. I traced the problem to /opt/nhc-1.4.2/sbin/nhc, around line 143, which produces a command like

kill -s USR2 -- -17525 17525

and that is where the problem occurs.

Is anyone able to give us any guidance as to how to resolve this issue?

Michael Jennings
@mej
@smichnowicz In order to send e-mail to nhc@lbl.gov, you need to be subscribed to the mailing list or use the online web forum.
As to your question, are you sure you're using the 1.4.2 release version? That line no longer exists in the nhc script in 1.4.2, partly because it was triggering a weird bug in Bash.
Michael Jennings
@mej
@smichnowicz I changed the Group settings so that you can now send e-mail to nhc@lbl.gov even if you're not subscribed. Feel free to resend your e-mail if you'd like.
smichnowicz
@smichnowicz
Thanks for the response. We have modified our nhc to reflect the latest changes, and the bash error went away. Regards, Simon
ytghazal
@ytghazal
Hello, I was wondering if there is a way to check the contents of a file, but only the recently written contents.
Basically, we are using a negative match string in check_file_contents to watch some logs. Unfortunately, after we resolve an issue, the program does not create a new log file, so NHC keeps hitting the negative match string.
Michael Jennings
@mej
@ytghazal At present there isn't. check_file_contents() wasn't really written with log files in mind, and unfortunately bash doesn't have an ability to seek() within an existing file to a particular spot, so I'd have to skip a user-specified number of lines. And even then, that wouldn't allow me to track "recent" changes. Nothing in Linux/UNIX tracks when different parts of a file were written (at least not generically and in a way userland programs could query it), so the only way to track that would require (1) saving state, and (2) a way in bash to seek to a byte value in a file.
It would, however, be possible to write an external Perl/Python script or a C/Go program that NHC could invoke that would be capable of doing that sort of thing.
@ytghazal The external script/program could track the size of the file on last run somewhere (e.g., /var/state/<something>), then seek() to that position on startup and output the rest of the file. Then NHC's check_cmd_output() could be used to assert that your search string wasn't in the new portion of the file. Should be pretty trivial to write. In fact, apart from the tracking-where-to-seek-to portion, tail -c can do exactly that (dump the remainder of a file to stdout).
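Since NHC checks are ordinary bash functions, one way to sketch that external helper is as a site-local shell function. Everything below (the function name, the state-file location, the use of /tmp) is an assumption for illustration, not part of NHC itself:

```shell
# print_new_log: emit only the bytes of the given log file that were
# appended since the previous invocation. Hypothetical helper; the
# state-file location is an arbitrary choice.
print_new_log() {
    local log="$1"
    local state="/tmp/$(basename "$log").offset"
    local offset=0 size

    # Recall where we stopped last time, if we have run before.
    [ -r "$state" ] && read -r offset < "$state"
    size=$(stat -c %s "$log")

    # If the file shrank, it was rotated or truncated; start over.
    [ "$size" -lt "$offset" ] && offset=0

    # tail -c +N starts at byte N (1-based), so skip $offset bytes.
    tail -c +"$((offset + 1))" "$log"
    echo "$size" > "$state"
}
```

Assuming the function is visible where the check runs, a line like `* || check_cmd_output -m '!/your error string/' print_new_log /var/log/app.log` would then only match against newly appended text; if check_cmd_output cannot invoke a shell function in your setup, the same logic works as a standalone script.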
Michael Jennings
@mej
@ytghazal Actually, now that I'm thinking about it, you could do something like this:
* || read LOG_FILE_LINES _ < /tmp/log_file_lines.tmp && wc -l /path/to/file > /tmp/log_file_lines.tmp && check_cmd_output -m '!/error you want to look for/' tail -n +$((LOG_FILE_LINES + 1)) /path/to/file
@ytghazal That would read in the old line count, store the new line count for the next run, and then use the old line count to tell where to start reading from.
@ytghazal Note that I haven't tested this at all, but hopefully it's not too far off. :-)
novosirj
@novosirj
Hi folks. I'm trying to work with OpenHPC to get this software readmitted to the repository. It came in via Warewulf the first time, and then when the projects split, it was removed. Their concern is support for PBSPro, as it seems primarily geared to TORQUE. My answer to them was that I don't think anything NHC does is so complicated that it's different between the two, and I see the config file mentions PBSPro. Can I get confirmation that it does work with PBSPro? I use SLURM instead, so I've not personally tried it.
novosirj
@novosirj
I should say, they're really talking about PBSPro open source. Which to me seems like some kind of poor naming. :)
downloadico
@downloadico
Hello everyone! I'm rewriting my node health checker script for the bajillionth time. I came across NHC and was intrigued. Is it easy to get a fairly basic configuration up and running? Is there an "I'm too busy with other work to even breathe, so I need something quick" guide to NHC? I really just want to check that: (1) my filesystems are mounted, (2) this machine can resolve users from LDAP/NIS, and (3) the resource manager daemon is running.
Michael Jennings
@mej
@downloadico If you run the nhc-genconf utility that comes with NHC, it'll generate a config file for you that has most of that stuff already covered. :-)
Just make sure all the filesystems you want to check are mounted when you run it.
Michael Jennings
@mej
There will also be some sample check_ps_service tests in there which you can easily modify to look for your RM daemon, whatever that happens to be. Then add a check like this to verify LDAP/NIS resolution: check_cmd_status getent passwd <someuser>
You can nuke all the other sample checks if that's all you care about. :-)
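Pulled together, a stripped-down nhc.conf covering those three checks might look roughly like this; the mount points, the user name, and the slurmd service are placeholder examples for one site, not recommendations (a sketch, not a verified config):

```shell
# Minimal /etc/nhc/nhc.conf sketch -- all targets and names are examples.

# 1. Filesystems are mounted read-write:
* || check_fs_mount_rw -f /home
* || check_fs_mount_rw -f /scratch

# 2. LDAP/NIS can resolve a known user:
* || check_cmd_status getent passwd someuser

# 3. Resource manager daemon is running (slurmd as an example):
* || check_ps_service -u root -S slurmd
```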
downloadico
@downloadico
thanks!