For complex issues please use https://discuss.hashicorp.com/c/consul/, https://github.com/hashicorp/consul/issues or https://groups.google.com/forum/#!forum/consul-tool.
Hi, I ran into a problem when trying to run a Nomad job with Consul Connect enabled, as in https://developer.hashicorp.com/nomad/docs/integrations/consul-connect
The connect-proxy-count-dashboard task logs:
[2022-09-28 07:49:45.908][1][warning][config] [./source/common/config/grpc_stream.h:196] DeltaAggregatedResources gRPC config stream closed since 312s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
I already tried following https://developer.hashicorp.com/nomad/tutorials/integrate-consul/consul-service-mesh?in=nomad%2Fintegrate-consul#tls-enabled-consul-environment but still no luck.
Can anyone help me fix this?
Roles can only be linked to policies that are defined in the same namespace
Hi,
I'm doing some testing with Consul and I am running a client inside a container. When the container starts up, it initially cannot find any Consul servers (I assume the network is still starting, or something similar), and the watches that I've set up throw an error:
2022-10-21T14:58:32.569Z [ERROR] agent.watch.watch: Watch errored: type=nodes error="Unexpected response code: 500 (No known Consul servers)" retry=5s
After this, the client joins the cluster successfully, but the watches never retry and stay dead.
Consul v1.13.3
Any thoughts?
Hi, I'm trying to troubleshoot the reachability of Consul from one of the jobs I'm running in Nomad.
So I'm trying to start a grafana/agent container (read: Prometheus, they both work the same way) as a Nomad job and use it to collect Consul cluster telemetry.
Our Consul cluster has an ingress gateway with public DNS, and if I point grafana/agent to that address, say https://consul.example.com:8500, everything works. Traffic to this public address goes via an AWS ALB and all the other AWS plumbing, so we want Grafana agent to talk to the Consul cluster locally, via the private network they both reside in. And I can't figure out how to point the Grafana agent task to the Consul HTTP API correctly. Grafana agent has a Consul service sidecar, and I can see that it successfully registered in the Consul mesh via the Nomad connect {sidecar_service...} stanza.
What I've tried so far:
Point the agent to http://127.0.0.1:8500, which from my understanding corresponds to the local Consul agent that we run in client mode on each node for the service mesh. I also tried to define an upstream in this sidecar pointing to the service "consul" registered in the catalog via
connect {
  sidecar_service {
    proxy {
      upstreams {
        destination_name   = "consul"
        local_bind_address = "127.0.0.1"
        local_bind_port    = 10123
      }
    }
  }
}
and point grafana agent to 127.0.0.1:10123
I tried to use one of the env variables injected by Nomad to get a specific Consul service IP (it actually gives me the local node's private network IP) and use it to configure Consul cluster scraping at http://{IP}:8500.
Given this is our research cluster, I also tried to update the Consul cluster to allow all comms between all services and to hardcode one of the Consul server nodes' private IP addresses as the destination, e.g. Grafana agent tries to reach http://{consul-node-ip-from-AWS-console}:8500
Everything was to no avail, with various errors in the Grafana agent logs.
Can anyone please advise on the correct way to configure Grafana agent to collect the Consul cluster's own telemetry via the Prometheus endpoint (https://developer.hashicorp.com/consul/docs/agent/telemetry), and on what I might be doing wrong here? I have spent almost 3 days trying to figure this out.
@0xalex88 Maybe you are just experimenting, but a 4 node cluster is not a good idea.
Consul does leader election based on majority (quorum). In a 4 node cluster the quorum is 3, so you still only tolerate 1 node failing, no better than with 3 nodes, and an even 2/2 network split leaves neither side able to elect a leader.
I believe they are adding a warning for when people incorrectly set an even number in bootstrap_expect.
You want bootstrap_expect set to an odd number:
1 - No HA.
3 - Tolerance for 1 node failing
5 - Tolerance for 2 nodes failing
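To make the arithmetic behind those numbers concrete, here is a rough sketch, assuming the usual majority rule (quorum is more than half of the voting servers). The function names are made up for illustration and are not part of any Consul tooling.

```python
def raft_quorum(servers: int) -> int:
    """Smallest majority of `servers` voting members (illustrative helper)."""
    if servers < 1:
        raise ValueError("need at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return servers - raft_quorum(servers)


if __name__ == "__main__":
    # Print a small table for cluster sizes 1 through 7.
    for n in range(1, 8):
        print(f"{n} servers: quorum={raft_quorum(n)}, "
              f"tolerates {failure_tolerance(n)} failure(s)")
```

Going from 3 to 4 servers raises the quorum from 2 to 3 without raising the number of failures you can survive, which is why the odd sizes are the ones worth running.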
I added the consul.hashicorp.com/connect-service-upstreams: foo:1234 annotation to the client, but curl localhost:1234 fails with Connection refused, and indeed there's nothing listening on that port. curl foo is working. The envoy-sidecar is injected. What could be missing?
Hi everyone, after following https://developer.hashicorp.com/consul/tutorials/get-started-vms/virtual-machine-gs-deploy#create-server-tokens I'm still getting:
agent: Node info update blocked by ACLs: node=7f08f176-a3f3-effe-7443-bd60865e09d1 accessorID=e340e34c-4ef6-5adb-ad48-5a3d923355f9
agent: Coordinate update blocked by ACLs: accessorID=e340e34c-4ef6-5adb-ad48-5a3d923355f9
what could be the reason?
Dec 05 14:46:10 kubetmplp consul[3325]: 2022-12-05T14:46:10.876Z [INFO] agent: Starting server: address=127.0.0.1:8500 network=tcp protocol=http
Dec 05 14:46:10 kubetmplp consul[3325]: agent: Starting server: address=127.0.0.1:8500 network=tcp protocol=http
Dec 05 14:46:10 kubetmplp consul[3325]: 2022-12-05T14:46:10.876Z [INFO] agent: started state syncer
Dec 05 14:46:10 kubetmplp consul[3325]: 2022-12-05T14:46:10.876Z [INFO] agent: Consul agent running!
Dec 05 14:46:10 kubetmplp consul[3325]: 2022-12-05T14:46:10.876Z [WARN] agent.router.manager: No servers available
Dec 05 14:46:10 kubetmplp consul[3325]: 2022-12-05T14:46:10.876Z [ERROR] agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
Dec 05 14:46:10 kubetmplp consul[3325]: agent: started state syncer
Dec 05 14:46:10 kubetmplp consul[3325]: agent: Consul agent running!
Dec 05 14:46:10 kubetmplp consul[3325]: agent.router.manager: No servers available
Dec 05 14:46:10 kubetmplp consul[3325]: agent.anti_entropy: failed to sync remote state: error="No known Consul servers"
{
  "datacenter": "iplan",
  "data_dir": "/var/lib/consul",
  "encrypt": "3ZYt2575ONn/EYcnQTGKBg==",
  "retry_interval": "10s",
  "enable_script_checks": false,
  "disable_update_check": true,
  "dns_config": {
    "enable_truncate": true,
    "only_passing": true
  },
  "enable_syslog": true,
  "leave_on_terminate": true,
  "log_level": "trace",
  "rejoin_after_leave": true,
  "tls": {
    "defaults": {
      "verify_incoming": false,
      "verify_outgoing": false
    }
  }
}
failed to switch to Consul server \"xx.xx.xx.xx:8502\": target sub-connection is not ready (state=TRANSIENT_FAILURE)"}
when it tries to connect to the server during an upgrade to Consul chart 1.0.2 with Consul 1.14.2. I think this issue is due to TLS encryption.