    Michael Aldridge
    I'm actually in the middle of a vault rollout, images were built 4 minutes after the binaries were up on docker hub.
    Has anyone used Consul Connect with horizontally scaling databases like Cassandra? If each Cassandra node is registered as an instance in Consul, and a client calls localhost:connect_port, would this single connection work?
    Yoan Blanc
    Hey, playing with gRPC checks, we are scratching our heads: Consul seems to require grpc.health.v1.Health, ignoring the value we are giving it.
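For context on the gRPC check question above, here is a minimal sketch of a check definition as it might be sent to the agent's check registration endpoint. As far as I know, Consul's gRPC checks always speak the standard grpc.health.v1.Health protocol, and the part after the slash in the GRPC field is only passed along as the service name inside the health request, not used to pick a different health service. The address, port, and service name below are hypothetical.

```python
import json


def build_grpc_check(name, address, service="", interval="10s", use_tls=False):
    """Build a check definition for PUT /v1/agent/check/register.

    The target application is assumed to implement grpc.health.v1.Health;
    `service` (hypothetical here) is appended after a slash and sent as the
    service name in the health check request.
    """
    target = f"{address}/{service}" if service else address
    return {
        "Name": name,
        "GRPC": target,
        "GRPCUseTLS": use_tls,
        "Interval": interval,
    }


check = build_grpc_check(
    "my-grpc-health",                 # hypothetical check name
    "127.0.0.1:50051",                # hypothetical address of the gRPC server
    service="my.package.MyService",   # hypothetical service name
)
print(json.dumps(check, indent=2))
```

If the application only exposes its own custom health RPC, a check like this would presumably keep failing until the standard health service is registered on the server.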
    I am getting a lot of metrics read errors in my Consul on K8s with Connect and ACLs enabled. What is the best way to handle this? Should I set up Prometheus to run inside the mesh?
    How can I validate whether IPv6 is configured correctly for a Consul cluster? I do see a tagged address with the IPv6 address in the nodes call, but does that mean RPC is also ready to communicate over the same address?
    Alex Henning Johannessen

    I noticed that I sometimes get this in consul logs on consul servers and clients:

    "2021-06-30T08:10:26.559Z [WARN]  agent: grpc: addrConn.createTransport failed to connect to { 0 hp-03.als <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp operation was canceled\". Reconnecting..."

    Any suggestion what this could be? Otherwise, the cluster seems healthy and all nodes are healthy. I use consul 1.10 on all nodes.

    My config for a server node looks like this:
        "acl": {
            "default_policy": "deny",
            "down_policy": "extend-cache",
            "enable_token_persistence": true,
            "enabled": true,
            "token_ttl": "30s",
            "tokens": {
                "agent": "<redacted>",
                "master": "<redacted>",
                "replication": "<redacted>"
        "addresses": {
            "dns": "",
            "grpc": "",
            "http": "",
            "https": ""
        "advertise_addr": "",
        "advertise_addr_wan": "",
        "auto_encrypt": {
            "allow_tls": true
        "bind_addr": "",
        "bootstrap": false,
        "bootstrap_expect": 3,
        "ca_file": "/etc/consul/certs/consul-agent-ca.pem",
        "cert_file": "/etc/consul/certs/server.pem",
        "client_addr": "",
        "connect": {
            "enabled": true
        "data_dir": "/data/consul",
        "datacenter": "als",
        "disable_update_check": false,
        "domain": "consul",
        "enable_local_script_checks": false,
        "enable_script_checks": false,
        "encrypt": "<redacted>",
        "encrypt_verify_incoming": true,
        "encrypt_verify_outgoing": true,
        "key_file": "/etc/consul/certs/server-key.pem",
        "log_file": "/var/log/consul/consul.log",
        "log_level": "INFO",
        "log_rotate_bytes": 0,
        "log_rotate_duration": "24h",
        "log_rotate_max_files": 14,
        "performance": {
            "leave_drain_time": "5s",
            "raft_multiplier": 1,
            "rpc_hold_timeout": "7s"
        "ports": {
            "dns": 53,
            "grpc": 8502,
            "http": 8500,
            "https": -1,
            "serf_lan": 8301,
            "serf_wan": 8302,
            "server": 8300
        "primary_datacenter": "als",
        "raft_protocol": 3,
        "recursors": [
        "retry_interval": "30s",
        "retry_interval_wan": "30s",
        "retry_join": [
        "retry_max": 0,
        "retry_max_wan": 0,
        "server": true,
        "tls_min_version": "tls12",
        "tls_prefer_server_cipher_suites": false,
        "translate_wan_addrs": false,
        "ui": false,
        "verify_incoming": true,
        "verify_incoming_https": false,
        "verify_incoming_rpc": false,
        "verify_outgoing": true,
        "verify_server_hostname": true
    Shai Ben-Naphtali
    If I have the CheckID, is there a way I can "read" it and see what it's configured to do?
    I'm asking because I have a checkid error with "CheckID XYZ... does not have associated TTL" in the log
    but AFAIK, I've got a timeout set on all my checks
    Robert Goldsmith
    hi all :) I'm seeing some very weird behaviour from my consul cluster where http-based health checks are failing with timeouts. It starts happening after a few minutes of a service being registered and won't stop. Seems to be happening with consul 1.9.6 and 1.10.0. Other kinds of checks seem ok. Anyone got any ideas?
    Shantanu Gadgil
    @blake @angrycub @anyone_else could you check this and help with suggestions, if any:
    Hi there, I am trying to set up a simple sidecar proxy with envoy. I got node-exporter and an http service (name=web) on separate machines. When I start both sidecars, I am checking with curl if I can reach node-exporter over the proxy, but envoy doesn't follow the upstream in connect.sidecar_service. Instead it forwards all http requests to the actual web service.
    When I set up the same configuration on the same machine -- it works fine.
    What am I doing wrong?
    Blake Covarrubias
    @deni64k Which version of Consul are you using?
    Willi Schönborn
    With consul connect, is there a way to know which service made a request to my service? TLS is terminated, so I don't get the certificate but I also don't see any headers being injected which would tell me. Is there any configuration that would allow me to enable e.g. envoy's XFCC header?
    Shai Ben-Naphtali
    Why are there no logs about ACLs, not even when using -log-level=trace? I'm using v1.8.10.
    Does anyone here use HCP Consul as the remote backend for their local (laptop/desktop/raspi) Terraform runs?
    Blake Covarrubias
    @gc-ss I’m using my local Consul cluster for TF state storage. HCP Consul should work the same. Do you have a particular question or issue about it?

    Hey Blake, appreciate your input.

    HCP Consul should work the same. Do you have a particular question or issue about it?

    I do. My naive understanding so far is that the HCP Consul cluster is configured to reject any traffic that does not originate from within the VPC that's peered with the HVN.

    1. Is that correct?

    2. Would that mean I have to set up a VPN to use HCP Consul, as otherwise my local Terraform traffic would be rejected?

    BTW - I am interested in getting feedback or links to repos for those who have used https://registry.terraform.io/providers/hashicorp/hcp
    I can't really find many Consul best practices, and I'm wondering what the ideal way is to add a "version" to a node's service. We deploy code about 20 times a day, and I want to dynamically update the service to show the current version. Any ideas?
    Blake Covarrubias
    @kornface13 Service tags or meta. Both are documented here. https://www.consul.io/docs/discovery/services
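A minimal sketch of the tags-or-meta suggestion above, assuming the service is re-registered through the agent's /v1/agent/service/register endpoint on every deploy. The service name, port, and version string are hypothetical. Because tags can show up in DNS lookups, this sketch replaces dots in the tag and keeps the exact version string in Meta.

```python
import json


def build_service_registration(name, port, version):
    """Build a registration payload that carries the deployed version.

    The version is stored verbatim in Meta and as a DNS-friendly tag.
    """
    return {
        "Name": name,
        "Port": port,
        "Tags": ["v" + version.replace(".", "-")],
        "Meta": {"version": version},
    }


# Hypothetical service and version, rebuilt by the deploy pipeline each release.
payload = build_service_registration("web", 8080, "2021.07.21-3")
print(json.dumps(payload, indent=2))
```

Sending the same payload again with a new version should update the existing registration, so the deploy job only needs this one call per release.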
    Alex Oskotsky
    I'm having an issue creating a service-resolver in a non-primary data center. All of my PUT requests to the /config API write to the primary datacenter in my cluster instead of the one I'm sending the request to. Read requests go to the right data center and I end up not seeing the config that I created unless I read from the primary data center. Is this expected?
    I'm using consul 1.10.0
    Would anyone be able to help me troubleshoot an "ACL not found" error? Everything appears to be correct, but some hosts get this error and I've been unable to figure it out.
    Matt Darcy
    Could someone check my Consul config for me? I’ve been running this config for months without issue on deployed agent hosts and in a container. The container auto-updated to 1.10 and is now failing to start, complaining about the config.
    the config is pretty basic for the container
    the container complains ==> failed to parse /consul/config/config.json: invalid character ',' after top-level value
    I don’t see an invalid placement of a comma, more so when the same config is working on non-container deployments running 1.10 (and now 1.10.1)
    what am I missing, it looks like a json error rather than an actual consul error, but I’ve not touched the configs between versions
    (sorry should put that config in a gist to make it easier to read)
    Matt Darcy
    got to be a user error I’m just not seeing
    Shantanu Gadgil
    @ikonia could you try cat file.json | jq . to see if that passes.
    I do see that the ending } is missing.
    Matt Darcy
    I can, thank you
    odd, @shantanugadgil parse error: Expected value before ',' at line 1, column 224
    I don’t see where a value is missing, and I don’t get how this has ‘broken’ after no change
    (also never used jq before - so thanks for that)
    Matt Darcy
    sorted, thank you, it was the ordering, jq was the saviour, great tip, thank you
    need to research if/when/how that file got changed
    Shantanu Gadgil
    @ikonia glad to help! 👍
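For anyone without jq at hand, the same kind of check can be sketched in Python. The "after top-level value" message generally means the parser already saw a complete JSON value and then hit more input, so the example below uses a hypothetical broken document with a stray closing brace (not Matt's actual config) to show how the error position can be reported.

```python
import json


def find_json_error(text):
    """Return None if text parses as JSON, else (message, line, column)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.msg, err.lineno, err.colno)
    return None


# Hypothetical examples: the second one closes the object too early,
# leaving a comma and more keys after the top-level value.
good = '{"datacenter": "als", "server": true}'
bad = '{"datacenter": "als"}, "server": true}'

print(find_json_error(good))
print(find_json_error(bad))
```

The reported column points at the first character after the finished value, which is usually close to where a brace was misplaced or a block was reordered.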

    Interested in hearing your thoughts


    Apart from accessing a prepared query over DNS, is there a more direct way to put an expression in retry_join so that it uses a prepared query, in the same style as the "provider" expressions that retry_join already supports?

    I want to try using a prepared query to self-discover Consul peers in retry_join (assuming the Consul server resolving the prepared query knows about said peers).

    Hi all, we had Consul Connect working with Vault as an external CA for a few weeks, until this weekend: on Saturday our proxies stopped working, returning 'Connection Refused' after the initial request. We can't get the minimal Nomad Consul Connect "countdash" example working, which was fine previously. We haven't had any configuration changes, the Vault CA periodic token is still valid, and it seems like certs are still being generated and assigned to new proxies. We also can't find anything immediately obvious in metrics or logs. Has anyone experienced something similar? We are on Consul 1.9.6, Nomad 1.0.1 and Envoy 1.16.4.
    Matt Darcy
    I’ve just rebuilt my Consul test cluster: a basic 3-node quorum set with ~20 members. One of the members is a Synology NAS box running a Consul 1.10.1 container as a cluster member (it’s an interesting member to have). The 3 Consul Raft servers are all showing an error from the IP address of the NAS box running the Consul member container:
    Jul 21 10:05:35 jake consul[7781]: 2021-07-21T10:05:35.282Z [ERROR] agent.server.rpc: unrecognized RPC byte: byte=71 conn=from=
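One way to read the byte value in that log line is to treat it as the first ASCII character the servers received on their RPC port. The sketch below is only a guess at the cause: it assumes the servers expect a small protocol-type byte first, so a printable letter suggests some other protocol (for example an HTTP request) is arriving on that port. The list of HTTP methods is just a lookup table for the illustration.

```python
HTTP_METHODS = ("GET", "POST", "PUT", "HEAD", "DELETE", "OPTIONS", "PATCH")


def guess_sender(first_byte):
    """Map an unrecognized first byte to its ASCII character and to any
    HTTP methods that start with that character."""
    ch = chr(first_byte)
    candidates = [m for m in HTTP_METHODS if m.startswith(ch)]
    return ch, candidates


ch, candidates = guess_sender(71)
print(ch, candidates)
```

If the guess holds, it may be worth checking whether something on the NAS (a probe, a health check, or a client pointed at the wrong port) is sending HTTP to the server RPC port rather than the HTTP API port.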