    iluciv
    @iluciv:matrix.org
    [m]
    Hey @blake, thanks. I just want to understand more about how it's working. We recently spun up a dev ent version of the hashistack (consul, nomad, vault) plus fabio for nomad container LB/SSL. I had made a full-chain pem file for the wildcard cert to be used by fabio for the nomad jobs. Fabio is configured as a nomad job (docker containers) on the linux worker nodes and as a windows service on the windows worker nodes. As the linux fabio instances are docker containers deployed by nomad, they check into consul with a 443 health check and the fabio url webprefix. The windows services are installed by a script; the fabio windows service config (fabio.properties file and certs) is on a cifs share, and it only checks into consul with an 8080 health check. We had an issue where windows containers were not getting issued ssl certs for their webprefix url until I added a key.pem and cert.pem file alongside the full-chain pem file I already had in the certs folder. So that issue got resolved, but I don't fully understand what's happening here. As the linux jobs are the only ones with the web prefix and the 443 health check in consul, are they the only nodes issuing the certs? Is there something I should have configured to improve this? I really had a hard time finding logs of any value on the windows nodes. Should I be setting the cert path to consul or vault instead? At the moment the nomad jobs mount a volume (nfs) and ingest the certs from there, and the windows nodes, as I said, get their config and certs off a cifs share. (fwiw we currently have 4 windows worker nodes and 4 linux worker nodes in this cluster)
    Saurabh Rawat
    @eklavya
    I have:
    • 3 instances with public ips, labelled as separate datacenters.
    • firewall rules to allow tcp/udp on 8300-8302 ports.
    • bind_addr on all 3 nodes as 0.0.0.0
    • network rules are fine since I can ssh to all these nodes from each other over their public ips.
      yet I keep getting "no route to host" while trying to WAN join these together. Any ideas why?
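For a setup like the one described above, a quick per-port TCP probe between the servers can help separate a firewall reject from a silent drop or a closed port. The following is a minimal sketch using only the Python standard library. The port tuple is an assumption taken from the message (8300 for server RPC, 8302 for Serf WAN), and the function name is hypothetical. It only covers TCP, so UDP reachability on 8302 still needs a separate check.

```python
import errno
import socket

# Assumed ports for a WAN join: 8300 (server RPC) and 8302 (Serf WAN).
WAN_JOIN_TCP_PORTS = (8300, 8302)


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connect and return a short, human-readable verdict."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are probably being dropped somewhere.
        return "timeout (packets probably dropped)"
    except ConnectionRefusedError:
        # The host answered with a reset: reachable, but nothing is listening.
        return "refused (host reachable, nothing listening)"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Typically an ICMP "host unreachable/prohibited" reply.
            return "no route to host (often an ICMP reject from a firewall)"
        return f"error: {exc}"


# Example use (peer address is a placeholder):
#   for port in WAN_JOIN_TCP_PORTS:
#       print(port, probe_tcp("203.0.113.10", port))
```

If the probe reports "open" from every server to every other server but the join still fails, it may also be worth checking which address each server advertises to its WAN peers, since `bind_addr` of 0.0.0.0 does not by itself choose the public IP.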
    Kholis Respati Agum Gumelar
    @kholisrag
    Hi all, using consul-template in a nomad job template stanza, can we get the metadata of the nodes that a consul service resides on? If so, how can we do it?
    2 replies
    Theerapong Kulawong
    @Theerapong

    Hi all,

    I got this error,

    Error creating: Internal error occurred: failed calling webhook "consul-consul-connect-injector.consul.hashicorp.com": failed to call webhook: Post "https://consul-consul-connect-injector-svc.myconsulns.svc:443/mutate?timeout=10s": dial tcp 10.105.230.122:443: i/o timeout

    while I was trying to deploy the Deployment.

    My 3 Consul servers are outside Kubernetes.
    I install Consul clients using Helm.
    (All Consul v1.11.2)

    when I use this command to see logs
    "kubectl -n myconsulns logs -f pod/consul-consul-connect-injector-webhook-deployment-59d4d85bg79sk"

    2022-01-18T05:58:32.603Z        ERROR        controller.endpoints        failed to get service instances        {"name": "consul-consul-connect-injector-svc", "error": "Get \"http://192.168.90.187:8500/v1/agent/services?filter=Meta%5B%22k8s-service-name%22%5D+%3D%3D+%22consul-consul-connect-injector-svc%22+and+Meta%5B%22k8s-namespace%22%5D+%3D%3D+%22myconsulns%22+and+Meta%5B%22managed-by%22%5D+%3D%3D+%22consul-k8s-endpoints-controller%22\": dial tcp 192.168.90.187:8500: i/o timeout"}
    sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile

    In Helm, if I set "connectInject.failurePolicy" to "Fail", I get the above error.
    (If I set "connectInject.failurePolicy" to "Ignore", I do not get the above error, but there are no sidecars when I check with the "consul catalog services" command.)

    Could anyone suggest to me how to solve this issue, please.

    jdaniel-at-yottly-dot-com
    @jdaniel-at-yottly-dot-com
    Hello, I noticed that consul tracks its server instance (server port) among services, nomad for example registers all instances (rpc, serf, http, ...), can I convince consul to register other ports as well for more complete discovery?
    6 replies
    Michael Aldridge
    @the-maldridge
    @blake I just finished reading the roblox postmortem. My hat is off to the consul team for really stepping up and debugging that close to the metal. Going inside the storage primitives really demonstrates how dedicated hashicorp is to building a quality product. Figured you might be able to pass along the positive feedback!
    1 reply
    bbuddha
    @bbuddha
    @blake Can you please look into this when you get a minute? I'm feeling stuck, as this is a critical use case for us to recommend building a service mesh using consul enterprise.
    https://discuss.hashicorp.com/t/setup-envoy-as-ingress-gateway/34547/2
    1 reply
    tretinha
    @tretinha
    Hey, I'm trying to connect two services using Consul Connect, I'm doing a sidecar via Nomad and I'm constantly getting "gRPC config stream closed since connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED". I have TLS enabled on Consul and Nomad, using self-signed certificates. Can someone help me with that? I'm not sure if I'm missing some step from the docs, but I thought Consul would handle the
    Sorry, the message got cut in half, but to complete it: I thought Consul would handle the connection between everyone with its built-in CA. That's it :)
    Alex Oskotsky
    @aoskotsky-amplify
    Does consul DNS support looking up a subset defined in a service-resolver for a service?
    3 replies
    Roi Ezra
    @ezraroi
    I am using the consul-k8s project to sync k8s to consul. It looks like services are synced to consul only on what is called fullSync and on changes. From the comments in the consul-k8s code, it seems that it should sync changes when they happen and only on full sync (which happens every 30 seconds by default). Has anyone else encountered this? Is this on purpose?
    Jesse Adelman
    @boldandbusted
    Howdy. Can I use 'consul snapshot restore' on a non-Leader member of a cluster? I just tried it, and I didn't get an error, but it seems like the snapshot wasn't applied?
    2 replies
    (Better jargon: a 'follower' server)
    Jesse Adelman
    @boldandbusted
    (My belief about it not being applied is only from the Index number not rolling back.)
    smuthali
    @smuthali

    Hello all, I am running a 3-node consul cluster (version v1.11.2). When I perform a DNS lookup from consul cluster node 01, I get:

    dig +short @localhost -p 8600 consul.service.us-west.consul SRV +tcp
    1 1 8300 <REDACTED>-consul-server-01.node.us-west.consul.
    1 1 8300 <REDACTED>-consul-server-02.node.us-west.consul.
    1 1 8300 <REDACTED>-consul-server-03.node.us-west.consul.

    However from consul cluster node 02 and 03, I don't get an authoritative answer for

    dig +short @localhost -p 8600 consul.service.us-west.consul SRV +tcp

    Any pointers to debug this are super appreciated

    3 replies
    Jesse Adelman
    @boldandbusted
    blake: Thank you - I do see the data - but it is on a test cluster, not very active.
    blake: But, I think you answered my question. :)
    Jesse Adelman
    @boldandbusted
    Howdy. How can I tell which ACL token policy is associated with an ACL token I'm currently using?
    Or match a token with an AccessorID?
    Jesse Adelman
    @boldandbusted
    This seems like something basic that I'm missing. :/
    Blake Covarrubias
    @blake
    @boldandbusted You can use consul acl token read -self to see a list of policies assigned to the token you are using.
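The same lookup is available over the HTTP API at `/v1/acl/token/self`, which returns the token's `AccessorID` together with its `Policies`, so it also covers matching a token to its accessor. Below is a minimal sketch using only the Python standard library. The function names and the agent address are assumptions for illustration, and only policies attached directly to the token are listed (policies inherited through roles are not expanded).

```python
import json
import urllib.request


def read_self_token(consul_addr: str, secret_id: str) -> dict:
    """Fetch the token that `secret_id` belongs to via GET /v1/acl/token/self."""
    req = urllib.request.Request(
        f"{consul_addr}/v1/acl/token/self",
        headers={"X-Consul-Token": secret_id},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def summarize_token(token: dict) -> dict:
    """Pick out the accessor ID and the names of directly attached policies."""
    policies = token.get("Policies") or []
    return {
        "accessor_id": token.get("AccessorID"),
        "policies": [p.get("Name") for p in policies],
    }


# Example use (address and token value are placeholders):
#   token = read_self_token("http://127.0.0.1:8500", "<secret id>")
#   print(summarize_token(token))
```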
    Jesse Adelman
    @boldandbusted
    blake: Ah, thank you!
    Chris Johnson
    @chrisjohnson

    We have a setup with an ingress gateway in one datacenter and services that it routes to in other datacenters (using a failover block of a service-resolver) but it doesn't reflect in the topology view (running 1.9.7)

    Is this something that is reflected in the topology in a newer version of consul? I wanted to ask before I submitted a feature request that has already been implemented

    Chris Johnson
    @chrisjohnson
    I submitted a feature request. I actually think this might be a bug, because they are listed in the upstreams tab, just not in the topology view: hashicorp/consul#12353
    Ayaan Zaidi
    @obviyus
    I was just browsing the CNCF chart over at: https://landscape.cncf.io/
    Has Consul been removed from the service discovery section?
    Shantanu Gadgil
    @shantanugadgil
    @obviyus Consul seems to have been categorized under the "Service Mesh" category.
    Ayaan Zaidi
    @obviyus
    @shantanugadgil ah I see
    My bad
    John Spencer
    @johnnyplaydrums
    Hey folks! I have a 1 node consul cluster that I'd like to upgrade to a 3 node cluster. I will add 2 new nodes, and then once they join the cluster successfully I will then replace the initial node with a new 3rd node. I can't find any documentation or conversation around upgrading from 1 node to 3 node cluster, so just wondering if y'all had any callouts or pitfalls I might hit.
    1 reply
    Epifeny
    @epifeny

    Hello, I've got Consul 1.9.7 installed with TLS. I used the built-in CA of Consul to generate the server and client certificates. I followed the Secure Consul Agent Communication with TLS Encryption tutorial. I now want to use the API and am getting some errors from curl:

    # curl -k https://localhost:8501/v1/agent/self
    curl: (58) NSS: client certificate not found (nickname not specified)

    What might be the issue here?

    I do have "verify_incoming": true, in the configuration file. So from what I understand, I need to supply the relevant certificates with curl
    Epifeny
    @epifeny
    So I tried to provide those that I generated earlier, but curl still fails
    # curl -vk https://localhost:8501/v1/agent/self --cacert /etc/consul/tls/<hidden>-agent-ca.pem --key /etc/consul/tls/<hidden>.pem --cert /etc/consul/tls/<hidden>-key.pem
    * About to connect() to localhost port 8501 (#0)
    *   Trying ::1...
    * Connected to localhost (::1) port 8501 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    * unable to load client cert: -8018 (SEC_ERROR_UNKNOWN_PKCS11_ERROR)
    * NSS error -8018 (SEC_ERROR_UNKNOWN_PKCS11_ERROR)
    * Unknown PKCS #11 error.
    * Closing connection 0
    curl: (58) unable to load client cert: -8018 (SEC_ERROR_UNKNOWN_PKCS11_ERROR)
    Epifeny
    @epifeny
    Sorry, the above last output was using a wrong set of certificates. Nevertheless, I've still got an issue:
    # curl -vk https://localhost:8501/v1/agent/self --cacert ./<hidden>-agent-ca.pem --key ./<hidden>-key.pem --cert ./<hidden>.pem
    * About to connect() to localhost port 8501 (#0)
    *   Trying ::1...
    * Connected to localhost (::1) port 8501 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    * unable to load client key: -8178 (SEC_ERROR_BAD_KEY)
    * NSS error -8178 (SEC_ERROR_BAD_KEY)
    * Peer's public key is invalid.
    * Closing connection 0
    curl: (58) unable to load client key: -8178 (SEC_ERROR_BAD_KEY)
    Epifeny
    @epifeny
    I did try to compile curl from source with OpenSSL instead of the CentOS default of NSS and that solved my issue. The question remains, how can I overcome this with a default curl in my CentOS distro that is compiled with NSS and while using the built-in Consul TLS creation tool?
    # ./curl/curl-7.67.0/src/curl -sk https://localhost:8501/v1/status/leader --key ./<hidden>-key.pem --cert ./<hidden>.pem | jq
    "<hidden>.<hidden>.<hidden>.<hidden>:8300"
    • I had to <hide> some details for obvious privacy/security reasons.
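Since the OpenSSL-backed curl build worked, another way to sidestep the NSS-linked curl may be any OpenSSL-backed client, for example Python's `ssl` module. The following is a minimal sketch using only the standard library. The function names and file paths are hypothetical, and it assumes the client certificate, key and CA file produced by the built-in Consul TLS tool are available in PEM form. Whether the NSS failure is caused by the key type is not confirmed here.

```python
import json
import ssl
import urllib.request


def consul_url(host: str, path: str, port: int = 8501) -> str:
    """Build an HTTPS URL for the Consul API (8501 is the HTTPS port used above)."""
    return f"https://{host}:{port}/{path.lstrip('/')}"


def make_tls_context(ca_file: str, cert_file: str, key_file: str) -> ssl.SSLContext:
    """Verify the server against the Consul CA and present a client certificate,
    which is what "verify_incoming": true requires."""
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def get_json(url: str, ctx: ssl.SSLContext):
    """GET a URL using the given TLS context and decode the JSON body."""
    with urllib.request.urlopen(url, context=ctx, timeout=5) as resp:
        return json.load(resp)


# Example use (file names are placeholders):
#   ctx = make_tls_context("agent-ca.pem", "client.pem", "client-key.pem")
#   print(get_json(consul_url("localhost", "/v1/status/leader"), ctx))
```

Unlike `curl -k`, this sketch keeps server verification on, so the name used in the URL has to appear in the server certificate.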
    Epifeny
    @epifeny
    When generating the server certificates, the tutorial says "servers are provided with a special certificate - one that contains server.dc1.consul in the Common Name". Does server.dc1.consul need to have the server prefix? For ex. server.eu.mydomain.com or can it be hostname.eu.mydomain.com? The tutorial isn't very informative on this matter (IMHO). What's the Consul internal usage for the word server in this case?
    JamieGruener
    @JamieGruener
    @epifeny I don't know what the Consul folks say about this, but when we generate server certificates, we include a laundry list of SANs (subject alternative names) to make sure that the cert includes all of the possible hostnames that we might use to reach the server, including the IP address and the Consul-based FQDN like consul-1.node.consul.
    How do folks handle encrypting intra-cluster communications for database clusters? For example, we're looking to stand up a mongodb replica set and a Cassandra cluster, both of which rely on lots of intra-cluster communications. Do folks ever use sidecar proxies to handle that? Or is it self-signed certs (or certs from a managed CA like Vault)?
    Epifeny
    @epifeny

    @epifeny I don't know what the Consul folks say about this, but when we generate server certificates, we include a laundry list of SANs (subject alternative names) to make sure that the cert includes all of the possible hostnames that we might use to reach the server, including the IP address and the Consul-based FQDN like consul-1.node.consul.

    Yea, I don't think I'm gonna be able to get an answer for this. The docs are lacking detail.

    Jasmine W
    @jnwright
    Hello, all! I work as a product designer on Consul and am looking for research participants for a quick, async study. We want to test a new feature that we're currently designing for a future Consul release. If you're interested in participating, please fill out this short google form: https://forms.gle/QA8zLkRenEqYnyaa9. Thank you!
    chrisvanmeer
    @chrisvanmeer:matrix.org
    [m]
    All filled in
    1 reply
    Psi-Jack
    @psi-jack:matrix.org
    [m]
    Why does consul tls cert create -server -dc dc1 create a cert that has only server.dc1.consul, localhost, and 127.0.0.1? Why specifically server.dc1.consul that it itself does not even seem to resolve?
    Psi-Jack
    @psi-jack:matrix.org
    [m]
    I mean, consul.service.dc1.consul resolves as expected. But where in the heck does server.dc1.consul ever resolve in consul land?
    1 reply
    chrisvanmeer
    @chrisvanmeer:matrix.org
    [m]
    Same goes for the client certificate which has DNS:client.dc1.consul, DNS:localhost, IP Address:127.0.0.1
    Karthick Ramachandran
    @rkarthick
    Question on catalog check (https://www.consul.io/api-docs/catalog#check). Is it expected for the consul server to take the service to "critical" after the timeout "1s" and deregister it after "30s" automatically (thereby honoring timeout and DeregisterCriticalServiceAfter) if we don't update the health? In my local runs, I am finding the service to be "passing" even after 30 minutes.
    1 reply
    Blake Covarrubias
    @blake
    I recently wrote a dissector for Wireshark that decodes unencrypted RPC and Gossip traffic from Consul. I figured folks here might have an interest in it. It is still very much a work in progress / proof-of-concept project, but does provide a nice way to view network traffic from Consul. The code can be found here: https://github.com/blake/wireshark-consul-dissector. At the moment, it can only read data from PCAP files. The dissector cannot decode traffic that is actively being captured by Wireshark.
    tim
    @timhungdao
    Hello there
    Has anyone had an Out of Memory issue with Consul 1.10.4?
    Any time I start the consul agent (client mode), it eats all the server memory and gets killed by the kernel
    Yann Huissoud
    @aiqency
    Similar to this issue: hashicorp/consul#1545 I want to add tags to services programmatically in go. /v1/catalog/register is expecting the full service definition (api.AgentServiceRegistration), but I couldn't find any API to get it in the first place (in order to modify it).
    /v1/catalog/service/ also doesn't return all required values.
    2 replies
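One possible approach to the tagging question above, assuming the service is registered with a local agent, is to read it from `/v1/agent/service/<service id>`, rebuild a registration payload from that response, and send it to `/v1/agent/service/register`. The sketch below is in Python rather than Go and only shows the payload step. The function name and the list of copied fields are assumptions. Check definitions are not part of that response, so how existing checks behave on re-registration should be verified against your Consul version.

```python
import copy

# Assumed subset of fields that the agent returns for a service and that a
# registration payload also accepts; check definitions are not included.
_COPIED_FIELDS = ("ID", "Tags", "Address", "Port", "Meta", "Weights", "EnableTagOverride")


def registration_with_tags(service: dict, extra_tags: list) -> dict:
    """Build a registration payload from an agent service entry, adding tags.

    The agent reports the service name under "Service", while registration
    expects it under "Name", so that field is renamed.
    """
    payload = {"Name": service["Service"]}
    for field in _COPIED_FIELDS:
        if field in service:
            payload[field] = copy.deepcopy(service[field])
    # Append new tags while keeping existing ones and avoiding duplicates.
    tags = list(payload.get("Tags") or [])
    for tag in extra_tags:
        if tag not in tags:
            tags.append(tag)
    payload["Tags"] = tags
    return payload
```

The input dictionary is left unmodified, so the original entry can still be compared against the payload before anything is sent.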