    Iury Fukuda
    @iuryfukuda:matrix.org
    [m]
    I had a problem
    in a VM environment
    May 30 17:20:06 r1 consul-mesh-start[42577]: [2022-05-30 17:20:06.836][42577][debug][pool] [source/common/conn_pool/conn_pool_base.cc:443] [C23] client disconnected, failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
    May 30 17:20:06 r1 consul-mesh-start[42577]: [2022-05-30 17:20:06.836][42577][debug][router] [source/common/router/router.cc:1154] [C0][S12085588059115559816] upstream reset: reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
    May 30 17:20:06 r1 consul-mesh-start[42577]: [2022-05-30 17:20:06.836][42577][debug][http] [source/common/http/async_client_impl.cc:100] async http request response headers (end_stream=true):
    May 30 17:20:06 r1 consul-mesh-start[42577]: ':status', '200'
    May 30 17:20:06 r1 consul-mesh-start[42577]: 'content-type', 'application/grpc'
    May 30 17:20:06 r1 consul-mesh-start[42577]: 'grpc-status', '14'
    May 30 17:20:06 r1 consul-mesh-start[42577]: 'grpc-message', 'upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED'
    May 30 17:20:06 r1 consul-mesh-start[42577]: [2022-05-30 17:20:06.836][42577][warning][config] [./source/common/config/grpc_stream.h:195] DeltaAggregatedResources gRPC config stream closed since 278s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
    1 reply
    gRPC is configured on the server
    and TLS seems to be good (I can use it in a browser)
    Iury Fukuda
    @iuryfukuda:matrix.org
    [m]
    thanks, it apparently passed
    now it's failing with: Error registering service "gateway-primary": Put "https://127.0.0.1:8501/v1/agent/service/register": dial tcp 127.0.0.1:8501: connect: connection refused
    1 reply
    Nicolasrs23
    @Nicolasrs23
    Hi everyone,
    I am working with Consul v1.12.0 and Kubernetes. I installed some deployments and configured the service mesh; so far so good.
    The problem is that when I want to communicate with an RDS instance (external service), TCP health checks don't work. I tried two approaches with no results:
    • Registering it together with the node, the service and checks via the catalog. (output: timeout)
    • Registering a proxy and linking it with the service. (output: connection refused)
    Intentions are all set to allow and the security groups are OK. Here is the repo: https://github.com/Nicolasrs23/Consul_proyect.git.
    Marina Shustova
    @MarinaShustowa
    Hello Everyone,
    I’m looking into how Consul can use the “Host” HTTP header instead of the destination IP for outgoing HTTP requests.
    In my scenario, some requests reach Consul through a proxy, so only the "Host" header has information about the actual destination.
    Is it possible to configure Consul this way?
    Thanks in advance!
    jaiganeshvazhkudai
    @jaiganeshvazhkudai
    Hi everyone, I'm getting a lot of error messages like "[ERR] memberlist: Push/Pull with <host> failed: Node <host> protocol version (2) is incompatible: [1, 0]". Incidentally, if I try to add a new node (client) to the cluster, it fails repeatedly, and all I can see in the failure logs is a variation of these messages:
    Failed to join IP of server : Node 'different host name' protocol version (2) is incompatible: [1, 0]
    5 replies
    Riain Condon
    @riain0

    Hi all,

    I am running Consul servers on ECS EC2 which all connect up fine via retry-join on an NLB.

    For the clients, I am using ECS Fargate and retry-join with the aws tags.

    The clients seem to find the server instances and their IPv4 addresses and attempt to join them, but no error is logged about the join failing. What happens is that the client starts logging messages like: 2022-06-20T09:00:54.961Z [WARN] agent.router.manager: No servers available and 2022-06-20T09:00:54.961Z [ERROR] agent: failed to sync changes: error="No known Consul servers".

    I've seen a couple issues about the logs above in GitHub but no solutions and I can't determine if this is even related.

    Has anyone seen this before/know off the top of their head what this could be?

    Patrick Flick
    @patflick
    I'd like to put a templated app config into Consul KV. Is it possible to have Consul update a KV value based on a Consul Template template? Is there an easy way to achieve this that doesn't require manually triggered scripts?
    Sean
    @seanamos

    Hmm, strange one related to connect.

    All clients/servers have Connect enabled. However, ALL clients are reporting this error roughly every 10 minutes:

    Jun 29 00:24:27 ip-11-0-3-20 consul[1572]: {"@level":"error","@message":"RPC failed to server","@module":"agent.client","@timestamp":"2022-06-29T00:24:27.108110Z","error":"rpc error making call: i/o deadline reached","method":"ConnectCA.Roots","server":{"IP":"11.0.4.125","Port":8300,"Zone":""}}
    Jun 29 00:24:27 ip-11-0-3-20 consul[1572]: {"@level":"warn","@message":"handling error in Cache.Notify","@module":"agent.cache","@timestamp":"2022-06-29T00:24:27.108796Z","cache-type":"connect-ca-root","error":"rpc error making call: i/o deadline reached","index":12}

    Connect sidecar proxies fail to deploy (with Nomad), and Traefik fails with a similar error when set up to use Consul Connect.
    KV sync and health check sync are working. The network is open between the cluster and clients (confirmed with telnet {server-ip} 8300 from a client). curl https://{server-ip}:8501/v1/connect/ca/roots returns a valid 200 response with a CA cert.

    I've successfully deployed this before, which makes it doubly strange. THE ONLY difference between past consul deployments and this one, is TLS auto_encrypt for the clients. In the past I've distributed client certs. TLS settings are set to their strictest, including tls { internal_rpc { verify_server_hostname = true } }

    ACLs are also enabled.

    The servers themselves don't have any logs of interest (at least at INFO level).

    Any ideas on how I can debug further?

    1 reply
    axsuul
    @axsuul:matrix.org
    [m]

    Hi there, I'm using Traefik, which builds its configuration from the Consul Catalog. Upon startup, Traefik takes >5 minutes to retrieve its configuration from the Consul Catalog. Looking at the Traefik logs, it seems to be having issues fetching the Connect certificate from Consul:

    level=info msg="Waiting for Connect certificate before building first configuration" providerName=consulcatalog

    while Consul appears to be canceling the request:

    consul[458]: agent.http: Request cancelled: method=GET url=/v1/agent/connect/ca/roots?index=9 from=10.128.0.34:38606 error="context canceled"
    consul[458]: agent.http: Request cancelled: method=GET url=/v1/agent/connect/ca/leaf/traefik?index=111619 from=10.128.0.34:38604 error="context canceled"

    I am on Consul v1.12.0. How can I debug what's causing Consul to cancel the request like this?

    axsuul
    @axsuul:matrix.org
    [m]
    Upgraded to Consul v1.12.2, seems to have fixed the issue
    Sean
    @seanamos
    @axsuul:matrix.org See my question above, it was exactly the same problem. Upgrading to v1.12.2 fixed it. I lost 2 days on this... sigh
    axsuul
    @axsuul:matrix.org
    [m]
    @seanamos: Thanks! Yep same, lost days but glad there's a fix 😊
    Narendra Patel
    @narendrapatel
    Hi, is enabling Connect not mandatory? We missed setting it to true for two of our lower-environment DCs, and the service mesh was still working: Envoy was receiving certificates, and they were being rotated after the default 72h interval.
    Marina Shustova
    @MarinaShustowa
    Hello!
    Could you please tell me if HashiCorp has any community meetings? If yes, where can I find the schedule?
    axsuul
    @axsuul:matrix.org
    [m]
    I seem to have some type of phantom Vault service in Consul, is there any way for me to force remove this?
    Shantanu Gadgil
    @shantanugadgil
    @axsuul:matrix.org Consul catalog deregister?
    axsuul
    @axsuul:matrix.org
    [m]
    Do you mean consul services deregister? How would I specify just that one IP, though, since I don't want to deregister the entire service?
    nahsi (Anatoly Laskaris)
    @nahsi:nahsi.dev
    [m]
    consul services deregister -id vault:100.99.252.246:8200 on the host with that ip
    Shantanu Gadgil
    @shantanugadgil
    @axsuul:matrix.org actually there are two deregister commands...one from the node and another from the catalog
    Jason Witkowski
    @jwitko

    Hey All, I am having TONS of errors about RPC connections failing between my consul server and mesh gateway pods inside my kubernetes cluster.

    2022-07-21T17:03:24.666Z [ERROR] agent.server.rpc: failed to ingest RPC: sni=consul-server-1.server.lhr-poc1-dataplane.dev.consul protocol=consul/wan-gossip/packet conn=from=10.245.3.51:57878 error="read tcp 10.245.2.195:8300->10.245.3.51:57878: i/o timeout"

    I have googled to infinity, I have modified gossip_wan settings, I have opened firewall/security group settings to be wide open, but nothing seems to work

    Has anyone seen these issues before or could maybe provide me any insight into why this is failing?
    Jason Witkowski
    @jwitko
    Putting my mesh gateway into trace level logging I see the following:
    [2022-07-21 18:02:52.789][50][debug][connection] [source/common/network/connection_impl.cc:890] [C613] connecting to 10.245.1.44:8300
    [2022-07-21 18:02:52.789][50][debug][connection] [source/common/network/connection_impl.cc:909] [C613] connection in progress
    [2022-07-21 18:02:52.789][50][trace][pool] [source/common/conn_pool/conn_pool_base.cc:130] not creating a new connection, shouldCreateNewConnection returned false.
    [2022-07-21 18:02:52.789][50][debug][conn_handler] [source/server/active_tcp_listener.cc:140] [C612] new connection from 10.245.2.0:31924
    [2022-07-21 18:02:52.789][50][trace][connection] [source/common/network/connection_impl.cc:554] [C612] socket event: 2
    [2022-07-21 18:02:52.789][50][trace][connection] [source/common/network/connection_impl.cc:663] [C612] write ready
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:554] [C613] socket event: 2
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:663] [C613] write ready
    [2022-07-21 18:02:52.790][50][debug][connection] [source/common/network/connection_impl.cc:672] [C613] connected
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:417] [C613] raising connection event 2
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:356] [C613] readDisable: disable=true disable_count=0 state=0 buffer_length=0
    [2022-07-21 18:02:52.790][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:294] [C613] attaching to next stream
    [2022-07-21 18:02:52.790][50][debug][pool] [source/common/conn_pool/conn_pool_base.cc:177] [C613] creating stream
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:356] [C613] readDisable: disable=false disable_count=1 state=0 buffer_length=0
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:356] [C612] readDisable: disable=false disable_count=1 state=0 buffer_length=0
    [2022-07-21 18:02:52.790][50][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:609] [C612] TCP:onUpstreamEvent(), requestedServerName: cpeconsul-consul-server-4.server.lhr-poc1-dataplane.dev.consul
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:554] [C613] socket event: 2
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:663] [C613] write ready
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:554] [C612] socket event: 3
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:663] [C612] write ready
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/connection_impl.cc:592] [C612] read ready. dispatch_buffered_data=false
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/raw_buffer_socket.cc:24] [C612] read returns: 341
    [2022-07-21 18:02:52.790][50][trace][connection] [source/common/network/raw_buffer_socket.cc:38] [C612] read error: Resource temporarily unavailable
    Yann Huissoud
    @aiqency

    Two similar clusters, two similar Consul configs. I'm trying to spawn a second Consul cluster; one joins, the other does not:

    [WARN]  agent.server: Raft has a leader but other tracking of the node would indicate that the node is unhealthy or does not exist. The network may be misconfigured.: leader=172.98.120.15:8300
    [WARN]  agent: Syncing node info failed.: error="Raft leader not found in server lookup mapping"
    [ERROR] agent.anti_entropy: failed to sync remote state: error="Raft leader not found in server lookup mapping"
    [ERROR] agent.server.memberlist.lan: memberlist: Conflicting address for pxe-boot. Mine: 172.98.120.101:8301 Theirs: 172.98.120.15:8301 Old state: 0
    [ERROR] agent.server.serf.lan: serf: Node name conflicts with another node at 172.98.120.15:8301. Names must be unique! (Resolution enabled: false)

    Any idea what might cause the error?
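
The serf error above states that node names must be unique within the LAN gossip pool, and the memberlist line shows two addresses (172.98.120.101 and 172.98.120.15) both claiming the name pxe-boot. One possible cause, assuming both machines share the same hostname (Consul uses the hostname as the node name unless told otherwise), is a simple name clash. A minimal sketch of giving the second server an explicit, distinct name; the name itself is hypothetical:

```hcl
# Agent config on the second server (sketch, not taken from the thread).
# "pxe-boot-2" is a hypothetical name; any value unique in the datacenter works.
node_name = "pxe-boot-2"
```

This only addresses the name conflict; it does not rule out other causes of the "Raft leader not found in server lookup mapping" warnings.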

    Alvin Lin
    @alvinlin123
    @Amier3 any luck finding someone new to take a look at hashicorp/memberlist#262?
    Marc Richter
    @The-Judge
    WOW - last message I see is from Jul 26 - is this thing still alive?
    techdrgn
    @techdrgn:matrix.org
    [m]
    I have no idea, but it is extremely quiet.
    Marc Richter
    @The-Judge
    Hmm. What is "the main" Community Channel/Platform then if it isn't Gitter?

    In the meantime, I will put my question in here anyways. Maybe someone who might help reads it ...
    As I already described on Discuss (which seems to have about as much activity as Gitter), the official Deployment Guide is inconsistent when it comes to TLS configuration.

    In “Create the certificates” section, it says: “First, for your Consul servers, use the following command to create a certificate for each server.”. So: not for the clients, since “servers” is explicitly written.
    Next it says: "The Consul client agents will only need the CA certificate, consul-agent-ca.pem, to enable mTLS." So again: it confirms that the clients only need the CA certificate, not the DC certificates.

    But then, with the very next section “Distribute the certificates to agents”, it says: “You must distribute the CA certificate, consul-agent-ca.pem, to each of the Consul agents as well as the agent specific certificate and private key.”. So, from here, it says that one must copy all node specific certs in addition to the CA certificate, which is the opposite of what was explained before.

    This is confirmed once more in the TLS configuration section. Even though the “Auto encryption” guide is selected, the consul.hcl snippet lists not only ca_file, but the cert_file and key_file parameters as well “for Consul clients”. The only difference between “Auto” and “Manual” seems to be the nested auto_encrypt section. This again seems to be the opposite of the “CA cert only” statement and of the entire auto-encryption idea.

    Marc Richter
    @The-Judge
    Regarding that nested auto_encrypt section, the Consul Security guide brings another unclear element to the table: in the Configure the clients section, it says to configure the clients by indeed setting only the ca_file option, but to set auto_encrypt { tls = true } instead of auto_encrypt { allow_tls = true }.
    Which is correct?
    Marc Richter
    @The-Judge
    As far as I understand from the general Consul Configuration Reference, auto_encrypt { allow_tls = true } must be set on servers and auto_encrypt { tls = true } on clients; but that's just my interpretation and I'm unsure if it's correct.
    1 reply
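
For reference, the server/client split described above can be sketched as two separate agent configs. This is a minimal sketch rather than the Deployment Guide's own text: the file path is a placeholder, and the `tls { defaults { ... } }` stanza assumes Consul 1.12 or later (earlier versions use a top-level `ca_file`).

```hcl
# Server agent config (sketch). Servers still need their own certificate and
# key (not shown here); this setting lets them issue certificates to clients.
auto_encrypt {
  allow_tls = true
}
```

```hcl
# Client agent config (sketch). Only the CA certificate is distributed;
# the client requests its own certificate from the servers at startup.
tls {
  defaults {
    ca_file = "/etc/consul.d/consul-agent-ca.pem" # placeholder path
  }
}

auto_encrypt {
  tls = true
}
```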
    oratlv
    @oratlv
    Hi, Does anyone know how to handle this message:
    [WARN] agent.server.serf.lan: serf: Intent queue depth (11437) exceeds limit (10690), dropping messages!
    Consul’s version is 1.13.1
    higuita
    @higuita:matrix.org
    [m]
    Not sure, but either you have way too many hosts/services and Consul is already struggling with all of them, or some node is slow and is getting more health checks than it can manage...
    In the first case, segment the Consul cluster; in the second, scale up the node or resolve the load issue.
    That is also only a warning, so if it's just a random event, it was probably just load, and in the worst case you failed to run a health check for some hosts/services in time.
    @oratlv: ↑
    OliverSmart
    @AdamCzepiel78
    Hello, I'm trying to use the Consul KV store inside Kubernetes. Consul is deployed, but inside a pod, the code calling http://127.0.0.1:8500 fails with:
    Unhandled exception. System.Net.Http.HttpRequestException: Connection refused (127.0.0.1:8500)
    1 reply
    oratlv
    @oratlv
    I actually get it on all servers, but we don't have extra load, so it's weird. I'm trying to track down the cause somehow.
    Sean
    @seanamos

    These docs demonstrate how to register a service proxy: https://www.consul.io/docs/connect/registration/service-registration
    They give plenty of sample configurations, but I can't figure out where to use them!

    consul services register proxy.hcl
    Error: failed to parse proxy.hcl: 4 errors occurred:
        * invalid config key kind
        * invalid config key name
        * invalid config key port
        * invalid config key proxy
    consul config write proxy.hcl
    Failed to decode config entry input: invalid config entry kind: connect-proxy

    What am I missing?

    Sean
    @seanamos
    Right, figured it out:
    service { # <-- must be in a service block, examples don't show this
      name = "<name of the service>"
      kind = "connect-proxy"
      proxy = {
        destination_service_name = "<name of the service that the proxy represents>"
        <additional proxy parameters> = "<additional parameter values>"
      }
      port = <port where services can discover and connect to proxied services>
    }
    1 reply
    Ayaan Zaidi
    @obviyus
    For some reason, DNS resolution across all my Consul nodes seems to be failing. The only thing I see in the logs is:
    Aug 20 11:32:28 ip-172-31-33-223 consul[569145]: agent.rpcclient.health: subscribe call failed: err="rpc error: code = InvalidArgument desc = Key is required" failure_count=14 key=<service_name> topic=ServiceHealth
    Brett Larson
    @brettplarson
    What's the best practice for dev workstations? Should I install the consul agent on my WSL2 to get service resolution?
    bsharma-tavisca
    @bsharma-tavisca

    @bsharma-tavisca
    Hello everyone
    I am occasionally getting this error
    "Raft leader not found in server lookup mapping"

    "bootstrap_expect": 3,
    "retry_join": ["provider=aws tag_key=DataCenterName tag_value=ek-consul-nv-aws region=us-east-1 addr_type=private_v4"],
    "performance": {
    "raft_multiplier": 1
    }
    Total Consul servers running: 5
    All Consul servers are running on m5.4xlarge

    bsharma-tavisca
    @bsharma-tavisca

    Two similar clusters, two similar Consul configs. I'm trying to spawn a second Consul cluster; one joins, the other does not:

    [WARN]  agent.server: Raft has a leader but other tracking of the node would indicate that the node is unhealthy or does not exist. The network may be misconfigured.: leader=172.98.120.15:8300
    [WARN]  agent: Syncing node info failed.: error="Raft leader not found in server lookup mapping"
    [ERROR] agent.anti_entropy: failed to sync remote state: error="Raft leader not found in server lookup mapping"
    [ERROR] agent.server.memberlist.lan: memberlist: Conflicting address for pxe-boot. Mine: 172.98.120.101:8301 Theirs: 172.98.120.15:8301 Old state: 0
    [ERROR] agent.server.serf.lan: serf: Node name conflicts with another node at 172.98.120.15:8301. Names must be unique! (Resolution enabled: false)

    Any idea what might cause the error?

    Hey @aiqency,
    did you have any luck getting an answer to the question you posted?

    lcividin
    @lcividin:matrix.org
    [m]
    Is it possible to create a Consul key/value entry from a registered Consul service?