    Prometheus assumes the vmware API can answer requests reasonably quickly, but some of them are ungodly slow. So it assumes something has gone wrong and closes the connection. That error is because vmware_exporter carried on working in the background and tried to write a valid response, but Prometheus had already got bored and left.
    Unfortunately the vmware API is just slow, so all you can do is increase your scrape timeouts in your prometheus config
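To pick a timeout from measurements rather than guesswork, something like the following rough sketch could time a few scrapes by hand. The helper names `time_scrape` and `suggest_scrape_timeout` and the 1.5x headroom factor are made up for illustration; they are not part of vmware_exporter or Prometheus.

```python
import math
import time
import urllib.request


def time_scrape(url, timeout=120.0):
    """Fetch the metrics URL once and return (seconds_taken, bytes_received)."""
    started = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()
    return time.monotonic() - started, len(body)


def suggest_scrape_timeout(durations, headroom=1.5):
    """Take the slowest observed scrape, add headroom, round up to whole seconds."""
    return math.ceil(max(durations) * headroom)
```

For example, call `time_scrape("http://localhost:9272/metrics")` a handful of times (the host, port and path are assumptions about your deployment), collect the durations, and pass them to `suggest_scrape_timeout`. Keep in mind that Prometheus requires `scrape_timeout` to be no larger than `scrape_interval`.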
    Ryan Woods
    @Jc2k I'm running into the same issue that @tkalpakidis mentioned. My scrape interval is 5 minutes and my timeout is 2 minutes, however I'm getting this error long before the 2 minutes is even reached.
    2021-03-17 10:28:39,853 INFO:Start collecting metrics from REDACTED
    2021-03-17 10:28:39,853 INFO:Starting vm metrics collection
    2021-03-17 10:28:39,854 INFO:Fetching vim.VirtualMachine inventory
    2021-03-17 10:28:39,854 INFO:Retrieving service instance content
    2021-03-17 10:28:39,854 INFO:START: _vmware_get_vm_perf_manager_metrics
    2021-03-17 10:28:39,855 INFO:Fetching vim.Datastore inventory
    2021-03-17 10:28:39,855 INFO:Starting host metrics collection
    2021-03-17 10:28:39,855 INFO:Fetching vim.HostSystem inventory
    2021-03-17 10:28:39,855 INFO:START: _vmware_get_host_perf_manager_metrics
    2021-03-17 10:28:39,939 INFO:Retrieved service instance content
    Unhandled error in Deferred:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/twisted/internet/defer.py", line 517, in errback
      File "/usr/local/lib/python3.7/site-packages/twisted/internet/defer.py", line 580, in _startRunCallbacks
      File "/usr/local/lib/python3.7/site-packages/twisted/internet/defer.py", line 662, in _runCallbacks
        current.result = callback(current.result, *args, **kw)
      File "/usr/local/lib/python3.7/site-packages/twisted/internet/defer.py", line 1514, in gotResult
        current_context.run(_inlineCallbacks, r, g, status)
    --- <exception caught here> ---
      File "/usr/local/lib/python3.7/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
        result = current_context.run(result.throwExceptionIntoGenerator, g)
      File "/usr/local/lib/python3.7/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
        return g.throw(self.type, self.value, self.tb)
      File "/usr/local/lib/python3.7/site-packages/vmware_exporter/vmware_exporter.py", line 1845, in _async_render_GET
      File "/usr/local/lib/python3.7/site-packages/twisted/web/server.py", line 279, in finish
        return http.Request.finish(self)
      File "/usr/local/lib/python3.7/site-packages/twisted/web/http.py", line 1134, in finish
        "Request.finish called on a request after its connection was lost; "
    builtins.RuntimeError: Request.finish called on a request after its connection was lost; use Request.notifyFinish to keep track of this.
    2021-03-17 10:28:51,619 INFO:Fetched vim.Datastore inventory (0:00:11.764348)
    2021-03-17 10:28:53,853 INFO:Fetched vim.HostSystem inventory (0:00:13.997306)
    2021-03-17 10:28:55,108 INFO:Fetched vim.VirtualMachine inventory (0:00:15.254623)
    2021-03-17 10:28:56,251 INFO:Finished vm metrics collection
    2021-03-17 10:28:56,253 INFO:Finished host metrics collection
    2021-03-17 10:28:56,259 INFO:FIN: _vmware_get_host_perf_manager_metrics
    2021-03-17 10:28:59,732 INFO:FIN: _vmware_get_vm_perf_manager_metrics
    2021-03-17 10:28:59,746 INFO:Finished collecting metrics from REDACTED
    So i don't use this any more but all i can say is looking at the exception, it comes from here https://github.com/pryorda/vmware_exporter/blob/main/vmware_exporter/vmware_exporter.py#L1845
    So that is the last bit of code inside a scrape, and there's no way back from there to the other log messages you are seeing
    I guess they could be threads still running in the background
    the weird thing is that the last log message comes from .collect(), which is called here
    Ryan Woods
    Yeah it isn't consistent which is weird. I can't reproduce it reliably either
    What i mean is, it feels like we are suffering from the temporal effects of it being async. The error that gets logged happens to fire in the middle of a successful scrape, but it's unrelated to that successful scrape.
    because after it finishes collecting metrics it will call request.finish shortly after, so i'd expect there to be another exception if that request was borked
    i think i'd probably expect to see another set of log messages earlier that were missing a "Finished collecting metrics" message
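A quick way to look for such an orphaned scrape is to scan the exporter log for "Start collecting metrics" lines that never get a matching "Finished collecting metrics" line. This is only a sketch: it assumes the log format shown in the paste above, and `unfinished_scrapes` is a hypothetical helper, not something the exporter ships.

```python
import re
from collections import Counter

# Matches only the per-scrape start/finish lines, not the
# "Starting vm metrics collection" style sub-step lines.
SCRAPE_LINE = re.compile(r"INFO:(Start|Finished) collecting metrics from (\S+)")


def unfinished_scrapes(log_lines):
    """Return {target: count of scrapes that started but never logged a finish}."""
    open_scrapes = Counter()
    for line in log_lines:
        match = SCRAPE_LINE.search(line)
        if not match:
            continue
        event, target = match.groups()
        if event == "Start":
            open_scrapes[target] += 1
        elif open_scrapes[target] > 0:
            open_scrapes[target] -= 1
    return {target: count for target, count in open_scrapes.items() if count > 0}
```

Feed it the lines of the exporter log, e.g. `unfinished_scrapes(open("exporter.log"))` (the filename is an assumption). A scrape that is still running when the log ends is counted too, so a non-empty result is a hint rather than proof.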
    ultimately there are only 2 ways this error could come about - one is that the connection was closed under it, the other is that the connection was closed twice.
    the one we are looking at here is in an exception handler, so that restricts the conditions further - something has to call request.finish() and then raise an exception
    There are 3 calls to request.finish(): the one we are looking at; the one for a configuration error (so it shouldn't intermittently work....), which returns immediately after calling request.finish(); and the success case. In the success case it calls request.finish() and immediately returns, so there is nowhere to raise.
    ideally flip prometheus verbosity up and see if it can tell you what happened on its side
    if something /is/ double erroring we'd at least see the first error in prometheus, i think
    and if it's a timeout on the prometheus side we'd see it too
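For what it's worth, both failure modes (connection lost underneath the handler, or finish() called twice) can be guarded against with the `Request.notifyFinish` hook that the error message itself points to. Below is a minimal sketch of that idea. `FinishGuard` is a hypothetical wrapper, not code from vmware_exporter or Twisted; it only relies on the request object offering `notifyFinish()` (returning a Deferred) and `finish()`.

```python
class FinishGuard:
    """Track whether a request can still be finished safely."""

    def __init__(self, request):
        self.request = request
        self.done = False
        # notifyFinish() returns a Deferred that fires when the response has
        # been finished, or errbacks if the client disconnects first.
        request.notifyFinish().addBoth(self._mark_done)

    def _mark_done(self, _result):
        # Returning None also consumes a connection-lost Failure, so it is
        # not reported later as a separate unhandled error.
        self.done = True

    def finish(self):
        """Call request.finish() at most once, and never after a disconnect."""
        if self.done:
            return False
        self.done = True
        self.request.finish()
        return True
```

A handler would create the guard when the request arrives and call `guard.finish()` wherever it currently calls `request.finish()`. This hides the RuntimeError rather than explaining it, so it is worth pairing with the Prometheus-side logging suggested above.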
    thanks i will try your recommendations...
    Gabrie van Zanten

    Hi, first time here, please let me know if this isn't the place to ask this, but here it goes:
    I'm trying to set up vmware_exporter in just a docker container without passing environment variables, since I'll have a lot of hosts to add. With other projects I managed to find where the config files are stored, create a docker volume with the config file in it, and map that into the container at a certain path. With vmware_exporter, though, I'm unable to find a configuration file inside the container that I could map. I tried creating a config.yml file and mapping it to /opt/vmware_exporter/kubernetes, but it seems this is more the config.yml used to create the container with docker compose.

    So, I'd like to find out, is there a config / settings file inside the container that I could use to write my config into?

    Or is this a road I should not go down, and should I just build one big compose file with all the settings in it?
    I'm just wondering how you can show the latest vmware_vm_guest_disk_capacity in grafana in table mode?
    another thing
    Is it possible to get disk IO for each vm?