    Math. You can try it out at: https://www.browserling.com/tools/ip-to-hex
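The start of the message above is cut off, so this is only a guess at what it describes: assuming it refers to the IP-to-hex conversion the linked tool performs, a minimal sketch of that arithmetic is shown here. The function name `ipv4_to_hex` is hypothetical.

```python
import ipaddress


def ipv4_to_hex(ip: str) -> str:
    """Return the 32-bit value of a dotted-quad IPv4 address as 8 hex digits."""
    # IPv4Address validates the input and converts cleanly to an integer.
    return format(int(ipaddress.IPv4Address(ip)), "08x")


print(ipv4_to_hex("127.0.0.1"))
```

Each octet maps to two hex digits, so the output is always eight characters long.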
    John Spencer
    I love Math.
    Thank you sir
    Willi Schönborn
    I didn't find anything in the docs, so I'm asking here. Does the transparent proxy support Consul's own DNS as well, instead of Kubernetes DNS? We're running multiple clusters, so Kubernetes DNS won't do any good for us. But we do have routeable pod IPs, which means two pods from different clusters can talk to one another.
    Spencer Owen
    Is there an automated way to upgrade consul_intentions < 1.9 to the new consul_config_entry syntax with terraform? I have hundreds of consul intentions and changing all of them by hand is going to take forever.
    # This was correct in version 2.10.0
    resource "consul_intention" "database" {
      source_name      = "api"
      destination_name = "db"
      action           = "allow"
    }
    # This is now the correct configuration starting version 2.11.0
    resource "consul_config_entry" "database" {
      name = "db"
      kind = "service-intentions"
      config_json = jsonencode({
        Sources = [{
          Action     = "allow"
          Name       = "api"
          Precedence = 9
          Type       = "consul"
        }]
      })
    }
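One possible way to avoid hand-editing is to generate the new entries from a list of the old intentions. The sketch below is not an official migration tool: the input format (a list of dicts with `source_name`, `destination_name`, and `action`) and the function name are assumptions, and it only covers exact-name sources, reusing the `Precedence` and `Type` values from the example above. It groups intentions by destination, since a service-intentions entry holds all sources for one destination.

```python
import json
from collections import defaultdict


def intentions_to_config_entries(intentions):
    """Group legacy intentions by destination into service-intentions entries.

    `intentions` is a hypothetical list of dicts with the keys
    source_name, destination_name, and action.
    """
    by_destination = defaultdict(list)
    for item in intentions:
        by_destination[item["destination_name"]].append({
            "Action": item["action"],
            "Name": item["source_name"],
            "Precedence": 9,  # value taken from the example; exact names only
            "Type": "consul",
        })
    # One entry per destination, sorted for stable output.
    return [
        {"kind": "service-intentions", "name": dest,
         "config_json": {"Sources": sources}}
        for dest, sources in sorted(by_destination.items())
    ]


legacy = [
    {"source_name": "api", "destination_name": "db", "action": "allow"},
    {"source_name": "web", "destination_name": "db", "action": "deny"},
]
entries = intentions_to_config_entries(legacy)
print(json.dumps(entries, indent=2))
```

The printed JSON could then be templated into `consul_config_entry` resources; how the legacy list is extracted from existing Terraform state is left open here.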
    9 replies
    When running consul connect in Nomad with an envoy sidecar, consul agent and envoy sidecar container stderr logs show the following grpc permission related errors below. Anyone familiar with this or how to debug it?
    # From consul agent on the host (log level is trace):
    agent.envoy.xds: Incremental xDS v3: xdsVersion=v3 direction=request protobuf="{ "typeUrl": "type.googleapis.com/envoy.config.cluster.v3.Cluster"
    agent.envoy.xds: subscribing to type: xdsVersion=v3 typeUrl=type.googleapis.com/envoy.config.cluster.v3.Cluster
    agent.envoy.xds: watching proxy, pending initial proxycfg snapshot for xDS: service_id=_nomad-task-6227f408-bee9-77fa-529f-924164f42b80-group-api-count-api-9001-sidecar-proxy xdsVersion=v3
    agent.envoy.xds: Got initial config snapshot: service_id=_nomad-task-6227f408-bee9-77fa-529f-924164f42b80-group-api-count-api-9001-sidecar-proxy xdsVersion=v3
    agent.envoy: Error handling ADS delta stream: xdsVersion=v3 error="rpc error: code = PermissionDenied desc = permission denied"
    # From envoy stderr in the envoy sidecar container (log level is trace):
    DeltaAggregatedResources gRPC config stream closed: 7, permission denied
    gRPC update for type.googleapis.com/envoy.config.cluster.v3.Cluster failed
    gRPC update for type.googleapis.com/envoy.config.listener.v3.Listener failed
    1 reply
    Daniel Hix
    getting IO timeouts on consul snapshot restore, any ideas? the port 8300 is open and I can hit it from the leader container
    3 replies
    Has anyone setup a mongodb atlas connection via terminating gateway?
    12 replies
    Michael Aldridge
    @blake is there a recommendation anywhere for how to distribute certificates to consul servers when running immutably?
    6 replies
    Gaurav Shankar
    having an issue: "agent.server.memberlist.wan: memberlist: Failed to resolve consul-consul-server-1.dc1/2605::::::8302: lookup 2605:::::::8302: no such host". The issue is that there are no brackets around the IPv6 address, like [2605:::]8302. How do I introduce this in the WAN lookup? The environment is an OpenShift IPv6 cluster.
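For reference, the bracketed form the message asks for can be illustrated as follows. This is only a sketch of the address formatting, not a Consul setting or fix; the helper name `join_host_port` and the sample address `2001:db8::1` (from the documentation prefix) are assumptions.

```python
import ipaddress


def join_host_port(host: str, port: int) -> str:
    """Join host and port, wrapping IPv6 literals in square brackets."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, for example a DNS name
    return f"[{host}]:{port}" if is_v6 else f"{host}:{port}"


print(join_host_port("2001:db8::1", 8302))
print(join_host_port("10.0.0.5", 8302))
```

Without the brackets, the port's colon cannot be told apart from the colons inside the IPv6 address, which matches the failed lookup in the log line.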
    1 reply

    hello. I have a working mesh gateway with wan federation. from both datacenters I can curl /v1/catalog/services?dc=<other-dc> and see the services running there and "consul members -wan" shows servers in both dcs
    however, services themselves (e.g. the socat example) cannot connect between the DCs
    The only errors I see in the consul logs are on the secondary DC where there are lots of warnings:
    Err :connection error: desc = "transport: Error while dialing dial tcp <internal ip of server in primary dc>:8300: i/o timeout"

    I outlined the issue here https://discuss.hashicorp.com/t/unable-to-connect-services-between-datacenters-despite-working-mesh-gateways/28721
    I would really appreciate any help as I'm completely stuck

    1 reply
    Hi. What Java library are developers using now for accessing consul? consul-client has not had a release in a while.
    Matt Darcy
    my home test lab (all running consul 1.10.1) is having an odd problem with one node - it seems to never truly join the cluster properly. I’ve just done a force-remove and then a join, which didn’t error; however, the node is filled with errors / problems, and I cannot understand the reasoning for the behaviour of this node. The status of the test cluster is as follows.
    Node Address Status Type Build Protocol DC Segment
    jake.no-dns.co.uk alive server 1.10.1 2 bathstable <all>
    nog.no-dns.co.uk alive server 1.10.1 2 bathstable <all>
    wesley.no-dns.co.uk alive server 1.10.1 2 bathstable <all>
    anton.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    archer.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    c8test2.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    dukat.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    garak.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    janeway.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    jarvis.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    lcars.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    lemon.no-dns.co.uk alive client 1.9.6 2 bathstable <default>
    paris.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    riker.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    ro.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    router.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    tpol.no-dns.co.uk alive client 1.10.1 2 bathstable <default>
    my first concern, which I can find no reference to, is why archer.no-dns.co.uk is being referenced on rather than its true IP address like all the other nodes; config.json on all the nodes, including archer, displays the IP linked to the FQDN
    Aug 27 17:22:07 archer consul[1726]: 2021-08-27T17:22:07.419Z [WARN] agent.client.memberlist.lan: memberlist: Was able to connect to ro.no-dns.co.uk but other probes failed, network may be misconfigured
    Aug 27 17:22:08 archer consul[1726]: 2021-08-27T17:22:08.419Z [WARN] agent.client.memberlist.lan: memberlist: Was able to connect to riker.no-dns.co.uk but other probes failed, network may be misconfigured
    Aug 27 17:22:08 archer consul[1726]: 2021-08-27T17:22:08.969Z [WARN] agent.client.memberlist.lan: memberlist: Refuting a suspect message (from: archer.no-dns.co.uk)
    Aug 27 17:22:09 archer consul[1726]: 2021-08-27T17:22:09.420Z [WARN] agent.client.memberlist.lan: memberlist: Was able to connect to router.no-dns.co.uk but other probes failed, network may be misconfigured
    Aug 27 17:22:10 archer consul[1726]: 2021-08-27T17:22:10.421Z [WARN] agent.client.memberlist.lan: memberlist: Was able to connect to dukat.no-dns.co.uk but other probes failed, network may be misconfigured
    that’s my second concern: the node archer cannot talk to any other node (it can at a network level; it can ping, nc connect on the appropriate ports, and telnet to the right ports), and the same from other nodes, they can all talk to it
    my only assumption is that there is some sort of consul network transport problem, as at a network level the connectivity is there
    Matt Darcy
    that error message normally appears when there is no network/firewall connectivity, but I’ve tested this and it is 100% reachable between other nodes
    I’ve no idea why this cluster is being so odd with one node
    one of the cluster leaders has these errors in the consul log - which again makes no sense to me, as it suggests the node archer.no-dns.co.uk is not a member of the cluster
    2021-08-27T17:26:16.642Z [WARN] agent.server.memberlist.lan: memberlist: Got ping for unexpected node 'archer.no-dns.co.uk' from=
    2021-08-27T17:26:17.144Z [WARN] agent.server.memberlist.lan: memberlist: Got ping for unexpected node archer.no-dns.co.uk from=
    2021-08-27T17:26:17.144Z [ERROR] agent.server.memberlist.lan: memberlist: Failed fallback ping: EOF
    Matt Darcy
    is there a way to rule out any possible network comms problem? the node that’s failing is a raspberry pi, with no software firewall, running on a switch port connected to all the other nodes in that list with one or two minor exceptions, so for most of the nodes there is nothing in between them
    the only thing I can think of is that the dead message is from an earlier time / failure of some sort, and because the message is being refuted it’s stopping the node from fully joining the cluster, but that doesn’t explain why it’s the only node being referenced on
    pablo platt
    Is it possible to connect a server on remote datacenter to the service mesh without federation and gateway?
    federation will force me to add another Consul cluster and the gateway will be another point of failure
    the remote data center has only a single server and adding a Consul and gateway just for one server is too much overhead
    6 replies
    hello after a yum update on my environment ( push by automation ) I'm unable to recover my consul cluster , the only error I'm getting is the following : {"@level":"info","@message":"Request cancelled","@module":"agent.http","@timestamp":"2021-08-30T20:11:12.216779Z","error":"No cluster leader","from":"","method":"GET","url":"/v1/operator/raft/configuration"} and {"@level":"error","@message":"failed to make requestVote RPC","@module":"agent.server.raft","@timestamp":"2021-08-30T20:11:10.433318Z","error":"EOF","target":{"Suffrage":0,"ID":"bas25486-e913-f023-8493-91a46cab6f0a","Address":""}}
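When the agent logs in JSON like this, pulling out just the records that carry an `error` field can make the sequence easier to read. The sketch below uses the two log lines quoted above as input; the function name `summarize` and the choice of fields are assumptions, not part of any Consul tooling.

```python
import json

# The two JSON log records quoted in the message above.
LOG_LINES = [
    '{"@level":"info","@message":"Request cancelled","@module":"agent.http","@timestamp":"2021-08-30T20:11:12.216779Z","error":"No cluster leader","from":"","method":"GET","url":"/v1/operator/raft/configuration"}',
    '{"@level":"error","@message":"failed to make requestVote RPC","@module":"agent.server.raft","@timestamp":"2021-08-30T20:11:10.433318Z","error":"EOF","target":{"Suffrage":0,"ID":"bas25486-e913-f023-8493-91a46cab6f0a","Address":""}}',
]


def summarize(lines):
    """Return (timestamp, module, message, error) tuples in time order."""
    rows = []
    for line in lines:
        rec = json.loads(line)
        if "error" in rec:
            rows.append((rec["@timestamp"], rec["@module"],
                         rec["@message"], rec["error"]))
    return sorted(rows)  # ISO timestamps sort chronologically as strings


for row in summarize(LOG_LINES):
    print(" | ".join(row))
```

Ordered by time, the failed requestVote RPC comes before the "No cluster leader" response, and its target shows an empty `Address`.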
    3 replies
    Shai Ben-Naphtali
    What is the variable type of discovery_max_stale? Is it a binary? A string or an int?
    The docs don't mention this AFAIK
    1 reply
    Shai Ben-Naphtali
    Found it in agent/config/config.go
        DiscoveryMaxStale                *string                  `json:"discovery_max_stale" hcl:"discovery_max_stale" mapstructure:"discovery_max_stale"`
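The `*string` type suggests the value is written as text such as "5s". Assuming it is interpreted as a Go-style duration string (an assumption worth checking against the Consul source), the sketch below shows how such strings map to seconds. It covers only a subset of Go's duration syntax (no signs, no bare "0"), and `parse_duration` is a hypothetical helper.

```python
import re

# Seconds per unit for the Go duration suffixes handled here.
_UNITS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
          "s": 1.0, "m": 60.0, "h": 3600.0}
# Two-letter units come first so "ms" is not read as "m" followed by "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_duration(text: str) -> float:
    """Parse strings like '5s' or '1h30m' into seconds."""
    if not text:
        raise ValueError("empty duration")
    pos, total = 0, 0.0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNITS[match.group(2)]
        pos = match.end()
    return total


print(parse_duration("5s"))
print(parse_duration("1h30m"))
```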
    Matt Darcy
    I’ve got one consul cluster member that’s misbehaving in so many ways that I’m a bit overcome with faults, so I’m trying to zero in on one fault/symptom at a time to understand the root problem.
    at the moment, if I do a consul members -detailed I see all the cluster members, all correctly reporting their IP address and port 8301 except for the faulty one, which is reporting as
    the config.json for that node is not using it’s using its correct IP
    I cannot understand what is causing the reference to for this one node
    Matt Darcy
    interesting, I’ve restarted consul 10+ times (while testing stuff) and it’s not made a difference; the last restart (which was a reboot of the box due to a non-consul package update) a few minutes ago has fixed this and it’s now reporting its correct IP address, even though no config file change was made
    Alex Oskotsky
    Hi. Is it possible to add extra envoy filters while using L7 traffic management features? I see this page https://www.consul.io/docs/connect/proxies/envoy#escape-hatch-overrides says that envoy_listener_json can't be used when using a service-resolver. Is there any workaround to that? I'm trying to use a filter to return a custom response when there are no upstreams available
    Frederik Bosch
    How can I know I successfully enabled consul connect in my cluster?
    7 replies

    Hello All,

    I need your expert comments on the error below:

    2021-09-08T09:18:07.191Z [ERROR] agent.server.memberlist.lan: memberlist: Failed fallback ping: EOF
    2021-09-08T09:18:54.191Z [ERROR] agent.server.memberlist.lan: memberlist: Failed fallback ping: EOF
    2021-09-08T09:21:29.190Z [ERROR] agent.server.memberlist.lan: memberlist: Failed fallback ping: EOF
    2021-09-08T09:22:25.191Z [ERROR] agent.server.memberlist.lan: memberlist: Failed fallback ping: EOF

    My network connectivity is fine with all the servers in cluster.
    I am able to get server members and am not facing any other issues, but it still keeps logging this error message, which I am unable to understand.

    Can you please suggest what's happening here?
    Thanks in advance

    6 replies
    Horizon Zero James
    Hello, I'm in the process of migrating our web apps to consul/nomad from static configs managed by ansible. We're using HAProxy and I'm planning on sticking with that for now. If I run HAProxy in Nomad, how do people manage routing traffic to it given its IP can change? Do people usually force a static IP on the container? Thanks
    Michael Aldridge
    @foozmeat_twitter you have a few options. The easiest is to create a datacenter in nomad specifically for running haproxy and then run it as a system job. This is a pattern that many of the people over in the nomad room use already. At the more complex end of the scale you could do bgp anycast from any of your nodes running haproxy and then just move the IPs themselves around
    8 replies
    is there a way to dial a service instance directly through a sidecar proxy, without transparent proxy? can't use the iptables traffic redirection as I have multiple services in the same netns
    4 replies
    Does anyone have any hints for requirements/troubleshooting WRT TLS for WAN gossip?
    I have one existing Consul cluster (intended as new primary dc) and adding a new one.
    Primary one set up with built-in CA. New one set up using a separate CA (Vault, FWIW)
    I have attempted a join -wan and the primary cluster sees the new cluster server as a gossip node
    But both ends are giving TLS errors.
    Servers for both DCs have the CA for the server certificates added as trusted CA certs on the system (on debian, using update-ca-certificates)
    I don't seem to be able to find any settings in Consul to add additional CAs, or separate CAs for WAN.
    Is there any way to get this working or is there a requirement that server TLS certs for all servers in a WAN gossip pool come from the same CA?
    (consul members on server in new DC is yielding 403 ACL not found but I guess that is expected until replication is successful and that the cert issues are preventing replication)
    This verifies fine on both ends: openssl s_client -showcerts -verify 5 -connect server.$OTHER_DC.consul:8300 < /dev/null (using /etc/hosts to map hostname to server in the other DC)
    Still, I get this on the server in the old, primary DC:
    [ERROR] agent.server.rpc: failed to read byte: conn=from=$PROXY_IP :32139 error="remote error: tls: bad certificate"
    and on new :
    [WARN]  agent.server.replication.acl.role: ACL replication error (will retry if still leader): error="failed to retrieve remote ACL roles: rpc error getting client: failed to get conn: x509: certificate signed by unknown authority"
    [ERROR] agent.server.connect: error performing intention migration in secondary datacenter, will retry: routine="intention config entry migration" error="rpc error getting client: failed to get conn: x509: certificate signed by unknown authority"
    [ERROR] agent.server.rpc: RPC failed to server in DC: server=$IP:8300 datacenter=$PRIMARY_DC method=ConfigEntry.ListAll error="rpc error getting client: failed to get conn: x509: certificate signed by unknown authority"
    Probably unrelated but running consul acl bootstrap on new server return Failed ACL bootstrapping: Unexpected response code: 500 (ACL support disabled) (ACL enabled in config , only hardcoded token is replication as per docs)