For complex issues please use https://discuss.hashicorp.com/c/consul/, https://github.com/hashicorp/consul/issues or https://groups.google.com/forum/#!forum/consul-tool.
upstream connect error or disconnect/reset before headers. reset reason: connection termination: 0
when using consul connect. Is there a workaround for this? I see that Envoy has an option to retry on reset connections. Is that configurable in Consul?
HTTP Maximum Connections per Client - http_max_conns_per_client. I'm looking to hotfix some servers and I'd like to confirm the configuration has taken effect, but I can't figure out how to list the config for that entry (if that is possible at all). Thank you in advance.
I tried running consul reload -token=<token> and got back "Configuration reload triggered", but I don't see any logs that suggest a reload actually took place, and the value we are attempting to reload, http_max_conns_per_client, doesn't appear to have been updated.
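One way to check what the running agent actually holds is to read the agent's own view of its configuration from the /v1/agent/self HTTP endpoint and search it for the setting. The following is a minimal sketch, not a confirmed procedure: it assumes the runtime configuration is exposed under a DebugConfig field in that response, the helper names (normalize, find_config_values, fetch_agent_self) are hypothetical, and the key is matched loosely (ignoring case and underscores) because the exact key name in the response is not confirmed here.

```python
import json
import urllib.request


def normalize(name):
    """Compare keys loosely: ignore case and underscores."""
    return name.replace("_", "").lower()


def find_config_values(config, wanted, path=""):
    """Recursively collect (path, value) pairs whose key matches `wanted`."""
    matches = []
    if isinstance(config, dict):
        for key, value in config.items():
            child = f"{path}.{key}" if path else key
            if normalize(key) == normalize(wanted):
                matches.append((child, value))
            matches.extend(find_config_values(value, wanted, child))
    elif isinstance(config, list):
        for i, item in enumerate(config):
            matches.extend(find_config_values(item, wanted, f"{path}[{i}]"))
    return matches


def fetch_agent_self(base_url="http://127.0.0.1:8500", token=None):
    """Fetch the agent's self-description; base_url and token are assumptions."""
    request = urllib.request.Request(f"{base_url}/v1/agent/self")
    if token:
        # ACL token passed in the standard Consul HTTP header.
        request.add_header("X-Consul-Token", token)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)
```

Calling find_config_values(fetch_agent_self(token="<token>"), "http_max_conns_per_client") before and after consul reload would show whether the value changed; an empty result would suggest the assumption about where the setting is exposed does not hold for that Consul version.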
Questions:
consul connect
way of starting Envoy. What should be the correct way to accomplish this? We are currently using systemctl to manage Envoy. Can we configure something there for hot restart?
Hi all,
I am running Consul servers on ECS EC2, which all connect fine via retry-join on an NLB.
For the clients, I am using ECS Fargate and retry-join with the AWS tags.
The clients seem to find the server instances and their IPv4 addresses and attempt to join them, but there's no error logged about that failing. What happens is that the client starts logging lines like:
2022-06-20T09:00:54.961Z [WARN] agent.router.manager: No servers available
and
2022-06-20T09:00:54.961Z [ERROR] agent: failed to sync changes: error="No known Consul servers"
I've seen a couple issues about the logs above in GitHub but no solutions and I can't determine if this is even related.
Has anyone seen this before/know off the top of their head what this could be?
Hmm, strange one related to connect.
All clients/servers have Connect enabled. However, ALL clients are reporting this error roughly every 10 minutes:
Jun 29 00:24:27 ip-11-0-3-20 consul[1572]: {"@level":"error","@message":"RPC failed to server","@module":"agent.client","@timestamp":"2022-06-29T00:24:27.108110Z","error":"rpc error making call: i/o deadline reached","method":"ConnectCA.Roots","server":{"IP":"11.0.4.125","Port":8300,"Zone":""}}
Jun 29 00:24:27 ip-11-0-3-20 consul[1572]: {"@level":"warn","@message":"handling error in Cache.Notify","@module":"agent.cache","@timestamp":"2022-06-29T00:24:27.108796Z","cache-type":"connect-ca-root","error":"rpc error making call: i/o deadline reached","index":12}
Connect sidecar proxies fail to deploy (with Nomad), and Traefik fails with a similar error when set up to use Consul Connect.
KV sync and health check sync are working. The network is open between the cluster and clients (confirmed with telnet {server-ip} 8300 from a client). curl https://{server-ip}:8501/v1/connect/ca/roots returns a valid 200 response with a CA cert.
I've successfully deployed this before, which makes it doubly strange. THE ONLY difference between past consul deployments and this one, is TLS auto_encrypt for the clients. In the past I've distributed client certs. TLS settings are set to their strictest, including tls { internal_rpc { verify_server_hostname = true } }
ACLs are also enabled.
The servers themselves don't have any logs of interest (at least at INFO level).
Any ideas on how I can debug further?
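As a first debugging step, it can help to tally which RPC methods and cache types are failing and how often, so the pattern (for example, only ConnectCA.Roots timing out) is visible across many clients. The sketch below is only an illustration under stated assumptions: it assumes the agent logs are JSON lines behind a syslog-style prefix, as in the two lines quoted above, and the function names parse_consul_json_line and summarize_errors are hypothetical helpers, not part of Consul.

```python
import json
from collections import Counter


def parse_consul_json_line(line):
    """Return the JSON object embedded in a log line, or None if absent."""
    start = line.find("{")  # skip the "Jun 29 ... consul[1572]: " prefix
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None


def summarize_errors(lines):
    """Count warn/error entries grouped by (module, method or cache type, error)."""
    counts = Counter()
    for line in lines:
        entry = parse_consul_json_line(line)
        if entry is None or entry.get("@level") not in ("warn", "error"):
            continue
        # RPC failures carry "method"; cache failures carry "cache-type".
        source = entry.get("method") or entry.get("cache-type") or "unknown"
        counts[(entry.get("@module"), source, entry.get("error"))] += 1
    return counts
```

Feeding it the output of journalctl for the consul unit (one line per list element) gives a count per failing call, which can then be compared against server-side logs at DEBUG level for the same timestamps.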
Hi there, I'm using Traefik, which builds its configuration using Consul Catalog. Upon Traefik startup, it takes more than 5 minutes for Traefik to retrieve its configuration from Consul Catalog. Looking at the Traefik logs, it seems to be having issues fetching the Connect certificate from Consul:
level=info msg="Waiting for Connect certificate before building first configuration" providerName=consulcatalog
while Consul appears to be canceling the requests:
consul[458]: agent.http: Request cancelled: method=GET url=/v1/agent/connect/ca/roots?index=9 from=10.128.0.34:38606 error="context canceled"
consul[458]: agent.http: Request cancelled: method=GET url=/v1/agent/connect/ca/leaf/traefik?index=111619 from=10.128.0.34:38604 error="context canceled"
I am on Consul v1.12.0. How can I debug what's causing Consul to cancel the requests like this?