    Elenui
    @Elenui
    hum ok it's back after a reboot
    :(
    Tanguy PELADO
    @tpelado

    Good day everyone,

    I'm seeing some weird behaviour on nodes after upgrading them to our new Rudder server.
    Some of them won't run the agent automatically, and as such they show up as not running on the server.

    rudder agent run is okay and reports no errors.
    rudder agent health also reports OK.

    This could be some weird cron related stuff, but I wanted to know if there's a known issue

    Nicolas Charles
    @ncharles
    they are in /var/log/rudder/webapp
    @tpelado you upgraded to which version ?
    if you run rudder agent history on the node, do you see that the agent is run every 5 minutes?
    Tanguy PELADO
    @tpelado
    Running "rudder agent history" shows that the agent doesn't run on its own after upgrading to 6.2 (we were on 4.X)
    Elenui
    @Elenui

    they are in /var/log/rudder/webapp

    indeed, but it was empty when I checked. I thought logs were somewhere else ^^"

    Tanguy PELADO
    @tpelado
    on a side note: it would be a nice feature to be able to combine criteria with boolean logic. As an example, you could have a group that matches ("Node OS = Debian" AND "node Location = DC1") AND NOT "Node property foobar is defined"
    a bit like the Factorio train stop condition logic, if that speaks to anyone (doubt it does :D )
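    To make the requested condition concrete, here is a minimal shell sketch of the same boolean expression. This is not Rudder group syntax: the function name matches_group and its three arguments are hypothetical, and an empty third argument is assumed to mean "property foobar is not defined".

```shell
#!/bin/sh
# Hypothetical illustration only: not Rudder's group criteria syntax.
# Usage: matches_group <os> <location> <foobar-value-or-empty>
matches_group() {
    os="$1"
    location="$2"
    foobar="$3"   # assumption: empty string means the property is undefined

    # ("Node OS = Debian" AND "node Location = DC1") AND NOT "foobar is defined"
    { [ "$os" = "Debian" ] && [ "$location" = "DC1" ]; } && ! [ -n "$foobar" ]
}
```

    For example, matches_group Debian DC1 "" succeeds, while matches_group Debian DC1 somevalue fails because the property is defined.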
    Nicolas Charles
    @ncharles

    Running "rudder agent history" shows that the agent doesn't run on its own after upgrading to 6.2 (we were on 4.X)

    i'm unsure the direct upgrade from 4.x to 6.2 is supported

    can you run rudder agent info
    it might be that the agent was either disabled, or its uuid modified
    Nicolas Charles
    @ncharles
    if it is disabled, you may run rudder agent enable
    Tanguy PELADO
    @tpelado
    I have removed the agent and factory reset the node
    so that should not be an issue per se
    Nicolas Charles
    @ncharles
    when you factory reset it, you give it a new rudder id
    so it's not the same node
    Tanguy PELADO
    @tpelado
    yes, that is correct
    but I suspect there might be some old files still there
    the thing is : i've done this on multiple nodes, and only some of them are behaving like this
    image.png
    update was at 1700
    you can see it running twice and never running again
    Nicolas Charles
    @ncharles
    what does rudder agent info say ?
    Tanguy PELADO
    @tpelado
    root@NPERF:~# rudder agent info
    Hostname: NPERF
    UUID: 3
    Key hash: MD5=2
    Certificate creation: Dec  7 17:06:26 2021 GMT
    Certificate expiration: Dec  5 17:06:26 2031 GMT
    Certificate fingerprint: 54
    Policy server: rudder-v2
    Roles: rudder-agent
    Report mode: full-compliance
    Run interval: 5 min
    Agent is enabled
    Agent is not forced in audit mode
    Configuration id: 20211207-170708-
    Policy updated: 2021-12-07 18:07:33
    Inventory sent: 2021-12-07 18:06:35
    Version: Rudder agent 6.2.11-debian11
    root@NPERF:~#
    i've removed some of the values, reckon they shouldn't be needed for troubleshooting
    Nicolas Charles
    @ncharles
    this is interesting - it ought to be running
    can you check if cf-execd is running ?
    Tanguy PELADO
    @tpelado
    root@NPERF:~# ps aux | grep cf-exec[d]
    root@NPERF:~#
    there's no process running with that name
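    The same check can be written without the grep bracket trick. A small sketch, assuming pgrep (from procps or busybox) is installed on the node; the helper name is_running is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a process with exactly this name exists.
# Assumes pgrep is available; -x requires an exact match on the process name.
is_running() {
    pgrep -x "$1" >/dev/null
}
```

    On the affected node, is_running cf-execd || echo "cf-execd is not running" would print the message, matching the empty ps output above.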
    Nicolas Charles
    @ncharles
    that's the cause of the issue
    Tanguy PELADO
    @tpelado
    I've tried the same on a "good" node and the process is running as expected.
    now, why isn't it running
    Nicolas Charles
    @ncharles
    you can check with journalctl -u rudder-cf-execd
    Tanguy PELADO
    @tpelado
    image.png
    welp. Died and never really started again.
    I've restarted the service manually
    it now runs fine
    will check if the agent sends an inventory
    is it normal that the service isn't enabled systemd wise?
    Nicolas Charles
    @ncharles
    it should be enabled
    Tanguy PELADO
    @tpelado
    I enabled it manually
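    For other affected nodes, the manual fix (enable, then start) could be scripted roughly as below. This is only a sketch: the unit name rudder-cf-execd comes from the journalctl command earlier in the thread, while the function ensure_service and the SYSTEMCTL override variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Hypothetical helper: make sure a systemd unit is both enabled and running.
# SYSTEMCTL can be overridden (e.g. for a dry run); it defaults to systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

ensure_service() {
    unit="$1"

    # Enable the unit at boot if it is not already enabled.
    if ! "$SYSTEMCTL" is-enabled --quiet "$unit"; then
        echo "enabling $unit"
        "$SYSTEMCTL" enable "$unit" || return 1
    fi

    # Start the unit if it is not currently active.
    if ! "$SYSTEMCTL" is-active --quiet "$unit"; then
        echo "starting $unit"
        "$SYSTEMCTL" start "$unit" || return 1
    fi

    echo "$unit is enabled and active"
}
```

    Run as root, ensure_service rudder-cf-execd would cover both symptoms seen here: the service not being enabled and the daemon not running.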
    as to why it wasn't, can't say I know the cause
    the irony is that I can't make a rudder directive to enable it because it won't run automatically on the affected nodes lol
    Anyhow, that's now fixed. Thanks for your help!
    saying that, the node didn't send anything yet...
    Nicolas Charles
    @ncharles
    if it doesn't start in 5 minutes, we'll investigate
    Tanguy PELADO
    @tpelado
    "2021-12-08 13:48:18+00:00 ( 94s) Enforce: 65 compliant, 0 repaired, 18 N/A, 0 errors Audit: 0 compliant, 0 non-compliant, 0 N/A, 0 errors" heyyyyyyyy
    just needed time (and apparently the cronjob isn't on *5)