    Matt Darcy
    love some gut feels, not looked at duplicates as an idea, thanks
    Shantanu Gadgil
    the way I debugged this was:
    • shut down the node ...
    • restart Consul servers (so as to cleanup "left" nodes)
    • then wait and watch if the node with the same id comes back
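
The duplicate-ID idea discussed above can also be checked from catalog data before restarting anything. This is a minimal sketch, assuming the input is the parsed JSON list returned by `GET /v1/catalog/nodes`, where each entry has `ID` and `Node` fields. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def find_duplicate_node_ids(nodes):
    """Return {node_id: [node names]} for every ID claimed by more than one node name."""
    names_by_id = defaultdict(set)
    for node in nodes:
        node_id = node.get("ID", "")
        if node_id:  # entries without an ID cannot collide, so skip them
            names_by_id[node_id].add(node["Node"])
    return {nid: sorted(names) for nid, names in names_by_id.items() if len(names) > 1}


# Hypothetical sample, shaped like the catalog response (assumption).
sample = [
    {"ID": "aaaa-1111", "Node": "web-01"},
    {"ID": "aaaa-1111", "Node": "web-02"},
    {"ID": "bbbb-2222", "Node": "db-01"},
]
print(find_duplicate_node_ids(sample))
```

An empty result suggests the problem is not a node ID collision, and the force-leave or data-dir cleanup routes are the next things to try.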
    Matt Darcy
    I’ve sort of done that (for different reasons)
    I’m wondering if I should do a force leave on the node and then see if it rejoins
    Shantanu Gadgil
    force leave would be worth a try ...
    else "rm -rf" the data dir for consul and reboot the node
    do you have any config magic to set machine node ids? or do you let them be auto generated?
    Matt Darcy
    Shantanu Gadgil

    Hi Team,

    I am trying to create Terraform state in GCP with version 0.15.4, but the terraform validate command throws the error below.

    Terraform has been successfully initialized!
    $ terraform validate -var-file=environment/${CI_ENVIRONMENT_NAME}/variables.tfvars

    │ Error: Failed to parse command-line flags

    │ flag provided but not defined: -var-file

    For more help on using this command, run:
    terraform validate -help

    Cleaning up file based variables

    Can someone please help? Am I missing something, or did something change in version 0.15.4?

    Shantanu Gadgil
    what is the output of terraform validate -help ?
    btw: this is the Consul lobby :smiley:
    Michael Aldridge
    @blake as you gaze into your crystal ball, should I wait to do a base image upgrade for another week in the hopes of consul 1.10 becoming GA by then?
    Blake Covarrubias
    @the-maldridge We're planning to release Consul 1.10 tomorrow.
    Michael Aldridge
    fantastic, I'll hold off until tomorrow
    Blake Covarrubias
    @the-maldridge Consul 1.10 is out. :-)
    Michael Aldridge
    I'm actually in the middle of a vault rollout, images were built 4 minutes after the binaries were up on docker hub.
    Has anyone used consul connect with horizontally scaling databases like Cassandra? If each Cassandra node is registered as an instance in consul, and a client calls localhost:connect_port, would this single connection work?
    Yoan Blanc
    Hey, playing with GRPC checks, we are scratching our heads as it seems to require grpc.health.v1.Health, ignoring the value we are giving it.
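
For context on the question above: as far as I understand the Consul docs, gRPC checks use the standard gRPC health checking protocol, so the target always has to serve `grpc.health.v1.Health`. The part after the slash in the check's `GRPC` field is sent as the service name inside the health request. It does not select a different RPC service. The sketch below builds a check definition under that assumption. The helper name, address, and service name are hypothetical.

```python
def build_grpc_check(name, address, service="", interval="10s", use_tls=False):
    """Build a check definition whose GRPC target is 'host:port' or 'host:port/service'."""
    # The optional suffix names the service to ask about in the health request (assumption).
    target = f"{address}/{service}" if service else address
    return {
        "Name": name,
        "GRPC": target,
        "GRPCUseTLS": use_tls,
        "Interval": interval,
    }


# Hypothetical values for illustration only.
check = build_grpc_check("billing-grpc", "127.0.0.1:50051", service="billing.Billing")
print(check["GRPC"])
```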
    I am getting a lot of metrics read errors in my Consul on K8s deployment with Connect and ACLs enabled. What is the best way to handle this? Should I set up Prometheus to run inside the mesh?
    How can I validate whether IPv6 is configured correctly for a Consul cluster? I do see a tagged address with the IPv6 address in the nodes call, but does that mean RPC is also ready to communicate over the same address?
    Alex Henning Johannessen

    I noticed that I sometimes get this in consul logs on consul servers and clients:

    "2021-06-30T08:10:26.559Z [WARN]  agent: grpc: addrConn.createTransport failed to connect to { 0 hp-03.als <nil>}. Err :connection error: desc = \"transport: Error while dialing dial tcp operation was canceled\". Reconnecting..."

    Any suggestion what this could be? Otherwise, the cluster seems healthy and all nodes are healthy. I use consul 1.10 on all nodes.

    My config for a server node looks like this:
    {
        "acl": {
            "default_policy": "deny",
            "down_policy": "extend-cache",
            "enable_token_persistence": true,
            "enabled": true,
            "token_ttl": "30s",
            "tokens": {
                "agent": "<redacted>",
                "master": "<redacted>",
                "replication": "<redacted>"
            }
        },
        "addresses": {
            "dns": "",
            "grpc": "",
            "http": "",
            "https": ""
        },
        "advertise_addr": "",
        "advertise_addr_wan": "",
        "auto_encrypt": {
            "allow_tls": true
        },
        "bind_addr": "",
        "bootstrap": false,
        "bootstrap_expect": 3,
        "ca_file": "/etc/consul/certs/consul-agent-ca.pem",
        "cert_file": "/etc/consul/certs/server.pem",
        "client_addr": "",
        "connect": {
            "enabled": true
        },
        "data_dir": "/data/consul",
        "datacenter": "als",
        "disable_update_check": false,
        "domain": "consul",
        "enable_local_script_checks": false,
        "enable_script_checks": false,
        "encrypt": "<redacted>",
        "encrypt_verify_incoming": true,
        "encrypt_verify_outgoing": true,
        "key_file": "/etc/consul/certs/server-key.pem",
        "log_file": "/var/log/consul/consul.log",
        "log_level": "INFO",
        "log_rotate_bytes": 0,
        "log_rotate_duration": "24h",
        "log_rotate_max_files": 14,
        "performance": {
            "leave_drain_time": "5s",
            "raft_multiplier": 1,
            "rpc_hold_timeout": "7s"
        },
        "ports": {
            "dns": 53,
            "grpc": 8502,
            "http": 8500,
            "https": -1,
            "serf_lan": 8301,
            "serf_wan": 8302,
            "server": 8300
        },
        "primary_datacenter": "als",
        "raft_protocol": 3,
        "recursors": [
        ],
        "retry_interval": "30s",
        "retry_interval_wan": "30s",
        "retry_join": [
        ],
        "retry_max": 0,
        "retry_max_wan": 0,
        "server": true,
        "tls_min_version": "tls12",
        "tls_prefer_server_cipher_suites": false,
        "translate_wan_addrs": false,
        "ui": false,
        "verify_incoming": true,
        "verify_incoming_https": false,
        "verify_incoming_rpc": false,
        "verify_outgoing": true,
        "verify_server_hostname": true
    }
    Shai Ben-Naphtali
    If I have the CheckID, is there a way I can "read" it and see how it is configured?
    I'm asking because I have a checkid error with "CheckID XYZ... does not have associated TTL" in the log
    but AFAIK, I've got a timeout set on all my checks
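
One way to inspect a check by its ID is to look it up in the agent's own check list. The sketch below assumes the input is the parsed JSON map returned by `GET /v1/agent/checks`, keyed by CheckID, and that each entry carries `Type` and `Status` fields. Whether `Type` is populated on a given Consul version is an assumption. My understanding is that the "does not have associated TTL" message appears when a TTL update is sent to a check that is not a TTL check, which is separate from having a timeout set. The function name and sample data are hypothetical.

```python
def describe_check(checks, check_id):
    """Summarize one check from an agent check map keyed by CheckID."""
    check = checks.get(check_id)
    if check is None:
        return f"{check_id}: not registered on this agent"
    check_type = check.get("Type") or "unknown"
    status = check.get("Status", "unknown")
    # Only TTL checks accept pass/warn/fail updates from outside (assumption).
    note = "" if check_type == "ttl" else " (not a TTL check, so TTL updates will be rejected)"
    return f"{check_id}: type={check_type}, status={status}{note}"


# Hypothetical sample, shaped like the agent checks response (assumption).
checks = {
    "web-ttl": {"CheckID": "web-ttl", "Type": "ttl", "Status": "passing"},
    "web-http": {"CheckID": "web-http", "Type": "http", "Status": "critical"},
}
print(describe_check(checks, "web-http"))
```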
    Robert Goldsmith
    hi all :) I'm seeing some very weird behaviour from my consul cluster where http-based health checks are failing with timeouts. It starts happening after a few minutes of a service being registered and won't stop. Seems to be happening with consul 1.9.6 and 1.10.0. Other kinds of checks seem ok. Anyone got any ideas?
    Shantanu Gadgil
    @blake @angrycub @anyone_else could you check this and help with suggestions, if any:
    Hi there, I am trying to set up a simple sidecar proxy with envoy. I got node-exporter and an http service (name=web) on separate machines. When I start both sidecars, I am checking with curl if I can reach node-exporter over the proxy, but envoy doesn't follow the upstream in connect.sidecar_service. Instead it forwards all http requests to the actual web service.
    When I set up the same configuration on the same machine -- it works fine.
    What am I doing wrong?
    Blake Covarrubias
    @deni64k Which version of Consul are you using?
    Willi Schönborn
    With consul connect, is there a way to know which service made a request to my service? TLS is terminated, so I don't get the certificate but I also don't see any headers being injected which would tell me. Is there any configuration that would allow me to enable e.g. envoy's XFCC header?
    Shai Ben-Naphtali
    Why are there no logs about ACLs, not even when using -log-level=trace? I'm using v1.8.10.
    Does anyone here use HCP Consul as the remote backend for their local (laptop/desktop/raspi) terraform runs?
    Blake Covarrubias
    @gc-ss I’m using my local Consul cluster for TF state storage. HCP Consul should work the same. Do you have a particular question or issue about it?

    Hey Blake, appreciate your input.

    HCP Consul should work the same. Do you have a particular question or issue about it?

    I do. My naive understanding so far is that the HCP Consul cluster is configured to reject any traffic that does not originate from within the VPC that's peered with the HVN.

    1. Is that correct?

    2. Would that mean I would have to set up a VPN to use HCP Consul, as otherwise my local terraform traffic would be rejected?

    BTW - I am interested in getting feedback or links to repos from those who have used https://registry.terraform.io/providers/hashicorp/hcp
    i can't really find many consul best practices, and i'm wondering what is the ideal way to add a "version" to a node's service. we deploy code like 20 times a day, and I want to dynamically update the service to show the current version. any ideas?
    Blake Covarrubias
    @kornface13 Service tags or meta. Both are documented here. https://www.consul.io/docs/discovery/services
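
The meta approach suggested above could be sketched as follows: on each deploy, take the existing service definition, set a `version` key in its `Meta` map, and re-register it with the local agent (for example via `PUT /v1/agent/service/register`). The helper name, the `version` meta key, and the sample service are hypothetical choices, not something the thread prescribes.

```python
def with_version(service, version):
    """Return a copy of a service definition with Meta['version'] set to the given version."""
    updated = dict(service)
    # Keep any existing meta keys and overwrite only the version entry.
    updated["Meta"] = {**(service.get("Meta") or {}), "version": version}
    return updated


# Hypothetical service definition for illustration.
svc = {"Name": "web", "Port": 8080, "Meta": {"team": "core"}}
print(with_version(svc, "1.4.2"))
```

Because the function returns a copy, the deploy script can keep one base definition and stamp it with whatever version was just shipped.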
    Alex Oskotsky
    I'm having an issue creating a service-resolver in a non-primary data center. All of my PUT requests to the /config API write to the primary datacenter in my cluster instead of the one I'm sending the request to. Read requests go to the right data center and I end up not seeing the config that I created unless I read from the primary data center. Is this expected?
    I'm using consul 1.10.0
    would anyone be able to help me troubleshoot an ACL not found error? Everything appears to be correct, but some hosts get this error and i've been unable to figure it out
    Matt Darcy
    could someone check my consul config for me - I’ve been running this config for months without issue on deployed agent hosts and in a container, the container auto updated to 1.10 and is now failing to start complaining about the config
    the config is pretty basic for the container
    the container complains ==> failed to parse /consul/config/config.json: invalid character ',' after top-level value
    I don’t see an invalid placement of a comma, more so when the same config is working on non-container deployments running 1.10 (and now 1.10.1)
    what am I missing, it looks like a json error rather than an actual consul error, but I’ve not touched the configs between versions
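
The message `invalid character ',' after top-level value` looks like a JSON decoder complaining that something follows the end of the top-level object, such as a stray comma after the final closing brace. A quick way to find the spot is to run the same file through another strict JSON parser. This is a minimal sketch using Python's standard `json` module. The function name and the sample strings are hypothetical, and the real file contents are not known here.

```python
import json


def locate_json_error(text):
    """Return (line, column, message) for the first JSON syntax error, or None if the text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Hypothetical inputs: a valid object, and the same object with a trailing comma after it.
print(locate_json_error('{"server": true}'))
print(locate_json_error('{"server": true},'))
```

If the file parses cleanly on the host but fails in the container, comparing the bytes the container actually mounts (for example, a templated or concatenated config) against the host copy would be the next step.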