    Sukant Garg
    I cannot connect to the MongoDB Atlas instance no matter which URL I try. I have an Atlas replica set cluster with one primary and two secondaries. I tried using the primary URL, project-shard-00-02.whatever.mongodb.net:27017; the replica set name with the cluster URL, <replica0>/project.whatever.mongodb.net:27017; and other combinations. Interestingly, I can connect to Mongo just fine via the CLI on the same machine.
    2 replies
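    For reference, Atlas replica set clusters are usually addressed with an SRV connection string rather than individual member hostnames. A sketch using the hostnames from the question; credentials and options are placeholders:

```
mongodb+srv://<user>:<password>@project.whatever.mongodb.net/

mongodb://<user>:<password>@project-shard-00-00.whatever.mongodb.net:27017,project-shard-00-01.whatever.mongodb.net:27017,project-shard-00-02.whatever.mongodb.net:27017/?ssl=true&replicaSet=<replica0>&authSource=admin
```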
    Philip Moy

    Hello, I am using the Debezium Postgres connector 1.3.0.Final. How do I create a snapshot for tables that are newly whitelisted?

    For example, my database has 5 tables. I whitelisted 3 of them when I first started the connector. Three weeks later I whitelisted the final 2, but Debezium doesn't snapshot them.

    2 replies
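    The change described can be sketched with the 1.3-era whitelist property (table names hypothetical); the reported behavior is that, with the connector's existing offsets, the newly whitelisted tables are not snapshotted automatically:

```properties
# initial configuration: 3 of 5 tables whitelisted
table.whitelist=public.t1,public.t2,public.t3
# three weeks later: the final 2 added
table.whitelist=public.t1,public.t2,public.t3,public.t4,public.t5
```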
    Ashish Kadam
    hello team, has anyone used Debezium to sync data from MongoDB to BigQuery?
    I am trying to find an example or tutorial online, but not finding many resources.
    Petr Koňárek

    hello, we have a problem with restarting Debezium: while it is recovering the history topic, we get this error

    [2021-04-08 09:02:09,738] ERROR WorkerSourceTask{id=db-debezium-kafka-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:187)
    org.apache.kafka.connect.errors.ConnectException: java.lang.IllegalStateException: The database history couldn't be recovered. Consider to increase the value for database.history.kafka.recovery.poll.interval.ms
            at io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:291)
            at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:106)
            at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:219)
            at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:185)
            at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:235)
            at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
            at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
            at java.base/java.lang.Thread.run(Thread.java:834)
    Caused by: java.lang.IllegalStateException: The database history couldn't be recovered. Consider to increase the value for database.history.kafka.recovery.poll.interval.ms
            at io.debezium.relational.history.KafkaDatabaseHistory.recoverRecords(KafkaDatabaseHistory.java:281)
            at io.debezium.relational.history.AbstractDatabaseHistory.recover(AbstractDatabaseHistory.java:82)
            at io.debezium.connector.mysql.MySqlSchema.loadHistory(MySqlSchema.java:268)
            at io.debezium.connector.mysql.MySqlTaskContext.loadHistory(MySqlTaskContext.java:164)
            at io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:110)
            ... 9 more
    [2021-04-08 09:02:09,740] ERROR WorkerSourceTask{id=db-debezium-kafka-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:188)

    Our setup:
    "database.history.kafka.recovery.attempts": "20",
    "database.history.kafka.recovery.poll.interval.ms": "300000",
    but it looks like these settings are ignored: the connector neither makes numerous attempts nor waits that long.
    We run it in a pod on Kubernetes, where the liveness probe kills and restarts it if startup fails; some pods start on the first try, some on the fifth or later.

    1 reply
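    One way to keep the kubelet from killing the pod while history recovery is still in progress is to size the liveness probe above the configured recovery window. A sketch only; all values are hypothetical:

```yaml
livenessProbe:
  httpGet:
    path: /connectors   # Kafka Connect REST endpoint
    port: 8083
  # give startup more headroom than recovery.attempts x recovery.poll.interval.ms
  initialDelaySeconds: 600
  periodSeconds: 30
  failureThreshold: 10
```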
    Does anyone know how to control bidirectionality with Debezium, that is, avoid an infinite loop between two databases?
    1 reply


    I've configured the connector and added a heartbeat. But the heartbeat table keeps growing; is it never cleaned up? I can of course set up a cron job to truncate the table, but maybe this should be implemented in Debezium?

    1 reply
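    If cleanup stays on the user's side, the cron job mentioned can be a single scheduled statement; the table and column names below are hypothetical:

```sql
-- run periodically, e.g. from cron: keep only recent heartbeat rows
DELETE FROM debezium_heartbeat WHERE ts < now() - interval '1 day';
-- or, if no rows need to be retained:
TRUNCATE TABLE debezium_heartbeat;
```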
    Hello team, we tried to run multiple Debezium MySQL connectors against different databases within the same Azure MySQL server, one per client, with two brokers on a single node. The first connector worked fine; the second source connector worked for a few minutes and then returned the following errors.
    The above are the two MySQL source connectors.
    Chris Cranford
    @sandhyaranibala:matrix.org What errors were returned? You didn't say.
    immediately after that message
    Mike Lively
    When using the SQL Server Debezium connector, we thought we might try distributing the load on Kafka Connect by running multiple connector configurations against the same SQL Server CDC with different table include lists. We are concerned about scaling with the data volume coming through, and I am hoping someone can confirm whether or not this is a sensible approach and what issues we might face. There is a high volume of change, frequent but spiky. The database in question is essentially an analytics instance serving batched ETL processes. I can't really find any use cases around this for SQL Server specifically. I recall seeing warnings about this methodology for MySQL, but I am wondering whether the differences in how changes are detected between MySQL and SQL Server might lead to a different answer for SQL Server. Thanks in advance for any thoughts/help.
    1 reply
    Hi, we caught a weird error when running debezium mysql connector yesterday using 1.5.0.CR1. I was able to reproduce it using a local mysql server.
    10 replies
    Dmytro Stepanchuk
    Hello Debezium Team!
    We need to stream events from a single table, which gets new records at a rate of about 500 per second. It is in SQL Server, and we were going to use the SqlServerConnector Kafka connector. Everything seems pretty straightforward, but one thing is still unclear to us: is it possible to run multiple instances of this connector (i.e. multiple Docker containers) all operating on that single table, for high availability and increased performance? Is there any synchronization mechanism in Debezium that ensures these connectors don't end up generating duplicate events? There are some hints in the docs pointing to the contrary, but it's not completely clear :)
    2 replies
    Jun Huh
    Hi! Is debezium embedded possible for a multi node environment sharing same db? Basically I want to ensure that a record in a table is only processed once.
    1 reply
    Wade Sherman
    A question about history topics:
    If I have a MySQL connector running, MySQLConnectorA, using MySQLHistoryTopic, and the connector fails in such a way that I just want to delete it and create a new one...
    so I delete MySQLConnectorA and create MySQLConnectorB. Can I just reuse the same history topic? Should I always make a new one? What is the best way to handle this scenario?
    2 replies
    Is it possible to run Debezium Server in a distributed fashion on Kubernetes? I'm wondering how the replication and sync of the transaction offset would be managed.
    Lam Tri Hieu

    Hi Team,
    We currently have 2 connectors without decimal.handling.mode set, so it defaults to "precise". This is fine when we write a streaming application to join/rekey a KStream/KTable. However, I now want to consume the topics with ksqlDB, as it's more convenient and requires less code, but I cannot use the aggregate function latest_by_offset because it doesn't support the integer struct (the numeric columns are mapped into ksqlDB as an integer struct).
    So I need to change decimal.handling.mode to "string" for easier consumption with ksqlDB.
    A couple of questions:

    1. If we change decimal.handling.mode, does that mean we need to delete the current topic and run a new connector so that it streams data in the new decimal format?
    2. On my local machine, with snapshot.mode=always, whenever I delete the connector and create it again, it snapshots again and creates duplicate records in the topic. What should we do when we delete a connector? Should we also delete the topics?
    3. Is there any way to use ksqlDB with decimal.handling.mode left at the default? As far as I know, ksqlDB doesn't support bytes yet, so I would think the answer is no, but I'd be glad to hear your opinion.


    8 replies
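    For context on question 3: in "precise" mode Debezium serializes a NUMERIC column as the unscaled big-endian two's-complement integer bytes (base64-encoded in JSON), with the scale carried in the field schema. A minimal sketch of decoding such a value on the consumer side; the function name and sample value are hypothetical:

```python
import base64
from decimal import Decimal

def decode_precise_decimal(b64_value: str, scale: int) -> Decimal:
    """Decode a Debezium 'precise'-mode decimal: the payload is the
    base64-encoded big-endian two's-complement unscaled integer;
    the scale comes from the field's schema parameters."""
    unscaled = int.from_bytes(base64.b64decode(b64_value), byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)

# 1234.56 with scale 2 has unscaled value 123456
encoded = base64.b64encode((123456).to_bytes(3, "big", signed=True)).decode()
print(decode_precise_decimal(encoded, 2))  # 1234.56
```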
    Hi team, Just one quick question.
    Can the MySQL connector handle a binlog written in mixed-level format? I'm using Avro serialization.
    2 replies
    @Naros What are the parameters for automatic reconnection of the MySQL connection?
    1 reply
    hello team, I found that the code on GitHub is not complete. How can I get the complete open source code? Thanks!
    3 replies
    Beomyong Lee
    We are going to build and use debezium-ui from source.
    The build procedure is described in the README of the debezium-ui GitHub repository, but the following error occurs. Please let us know if there is a solution.
    $ mvn clean install
    [INFO] Debezium UI Build Aggregator ...................... SUCCESS [  0.181 s]
    [INFO] Debezium UI Frontend .............................. SUCCESS [ 47.648 s]
    [INFO] Debezium UI Backend ............................... FAILURE [  1.257 s]
    [INFO] ------------------------------------------------------------------------
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 49.275 s
    [INFO] Finished at: 2021-04-09T09:00:20+00:00
    [INFO] Final Memory: 49M/367M
    [INFO] ------------------------------------------------------------------------
    [ERROR] Failed to execute goal io.quarkus:quarkus-maven-plugin:1.12.1.Final:prepare (default) on project debezium-ui-backend: Execution default of goal io.quarkus:quarkus-maven-plugin:1.12.1.Final:prepare failed: Unable to load the mojo 'prepare' (or one of its required components) from the plugin 'io.quarkus:quarkus-maven-plugin:1.12.1.Final': com.google.inject.ProvisionException: Guice provision errors:
    [ERROR] 1) Error injecting: private org.eclipse.aether.spi.log.Logger org.eclipse.aether.internal.impl.DefaultLocalRepositoryProvider.logger
    [ERROR] while locating org.eclipse.aether.internal.impl.DefaultLocalRepositoryProvider
    [ERROR] while locating java.lang.Object annotated with *
    [ERROR] at org.eclipse.sisu.wire.LocatorWiring
    [ERROR] while locating org.eclipse.aether.impl.LocalRepositoryProvider
    [ERROR] for parameter 8 at org.eclipse.aether.internal.impl.DefaultRepositorySystem.<init>(Unknown Source)
    [ERROR] while locating org.eclipse.aether.internal.impl.DefaultRepositorySystem
    [ERROR] while locating java.lang.Object annotated with *
    [ERROR] at ClassRealm[plugin>io.quarkus:quarkus-maven-plugin:1.12.1.Final, parent: sun.misc.Launcher$AppClassLoader@70dea4e]
    [ERROR] while locating io.quarkus.maven.QuarkusBootstrapProvider
    [ERROR] while locating io.quarkus.maven.PrepareMojo
    [ERROR] at ClassRealm[plugin>io.quarkus:quarkus-maven-plugin:1.12.1.Final, parent: sun.misc.Launcher$AppClassLoader@70dea4e]
    [ERROR] while locating org.apache.maven.plugin.Mojo annotated with @com.google.inject.name.Named(value=io.quarkus:quarkus-maven-plugin:1.12.1.Final:prepare)
    [ERROR] Caused by: java.lang.IllegalArgumentException: Can not set org.eclipse.aether.spi.log.Logger field org.eclipse.aether.internal.impl.DefaultLocalRepositoryProvider.logger to org.eclipse.aether.internal.impl.PlexusLoggerFactory
    [ERROR] at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:167)
    [ERROR] at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:171)
    [ERROR] at sun.reflect.UnsafeObjectFieldAccessorImpl.set(UnsafeObjectFieldAccessorImpl.java:81)
    [ERROR] at java.lang.reflect.Field.set(Field.java:764)
    Gilles Vaquez
    Hello, I am using Debezium 1.4 with Strimzi on K8s and found that Debezium gets "stuck" upon KafkaConnect deployment auto-restart. It seems that when another pod is started while the previous one is running, it somehow "corrupts" the Debezium state, and we are unable to make it process data anymore. The only way to get it working again is to delete the 3 KafkaConnect topics (config, offsets, status). Do you have any idea what is going on? I can provide some logs if needed.
    3 replies
    Christian Dorner
    Hello, I upgraded from version 1.4 to 1.5.0.Final, and it seems that Debezium is sending messages to Kafka in very large bulks. My Kafka stops receiving messages and then receives a very large bulk. In 1.4 the rate of sending messages was much more constant.
    2021-04-09 13:22:01,878 INFO   ||  3699 records sent during previous 00:00:39.5, last recorded offset: {transaction_id=null, lsn_proc=374194546795464, lsn_commit=374194546795464, lsn=374194546513840, txId=2675225950, ts_usec=1617974521360503}   [io.debezium.connector.common.BaseSourceTask]
    2021-04-09 13:22:16,507 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:22:16,507 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} flushing 0 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:22:16,512 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Finished commitOffsets successfully in 5 ms   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:22:46,513 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:22:46,513 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} flushing 0 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:23:16,548 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:23:16,548 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} flushing 0 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:23:21,456 INFO   ||  785807 records sent during previous 00:01:19.578, last recorded offset: {transaction_id=null, lsn_proc=374194549319488, lsn_commit=374194549319488, lsn=374193616936160, txId=2675220979, ts_usec=1617974521525094}   [io.debezium.connector.common.BaseSourceTask]
    2021-04-09 13:23:46,548 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:23:46,548 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} flushing 0 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:24:16,549 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:24:16,549 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} flushing 0 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:24:46,549 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:24:46,549 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} flushing 0 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:25:16,550 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} Committing offsets   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:25:16,550 INFO   ||  WorkerSourceTask{id=shipment-order-connector-v1-0} flushing 0 outstanding messages for offset commit   [org.apache.kafka.connect.runtime.WorkerSourceTask]
    2021-04-09 13:26:01,454 INFO   ||  1607505 records sent during previous 00:02:39.998, last recorded offset: {transaction_id=null, lsn_proc=374194549319488, lsn_commit=374194549319488, lsn=374194350949688, txId=2675220979, ts_usec=1617974521525094}   [io.debezium.connector.common.BaseSourceTask]
    2021-04-09 13:26:16,551 INFO
    1 reply
    Petr Koňárek
    hello, what happens when only the history topic is deleted and snapshot.mode is set to schema_only? Will it take a new schema snapshot and continue from where the offsets end in the offset topic?
    Alok Kumar Singh
    Debezium missed some events for a topic; the data has been missing for a few days, and the table has only 4 rows. How do I prevent such problems, and what monitoring would tell me when this happens? A user of the topic told me about the missed events. How can I find out about such errors myself?
    1 reply

    Hello, everyone!
    Anyone knows how to consume messages from Kafka through REST API? I have deployed Debezium Postgres tutorial to AWS ECS and I can connect to Postgres and Kafka Connect and use REST API to operate them. However, when I connect to Kafka endpoint I get this:

    curl.exe -H "Accept:application/json" <load balancer end point>:9092/
    curl: (52) Empty reply from server

    Any pointers or reference will be greatly appreciated.

    Gunnar Morling
    @tnevolin Kafka has its own protocol, which isn't HTTP; if you want to access Kafka in a RESTful way, you'll need to set up an HTTP bridge, such as Strimzi's: https://strimzi.io/docs/bridge/latest/
    1 reply
    Hi, can I use Debezium to do a complete one-time migration from one PostgreSQL instance to another, with some enrichment while sending the data through Kafka or Pulsar? If yes, what are the steps to do that?
    1 reply
    In the MySQL connector logs I see more rows scanned than the total, which is very confusing (version 1.3.1):
    2021-04-05 23:58:56
    [2021-04-05 20:58:54,558] INFO Step 8: - 193090000 of 165839479 rows scanned from table 'cdc_load_test.users' after 13:11:00.325 (io.debezium.connector.mysql.SnapshotReader:664)
    2 replies
    Hosein Ey
    hi all, we want to encrypt some sensitive data in Debezium. How can we do that?
    A custom transformation?
    2 replies

    Dear Debezium community. I would like to (re)open a discussion that you probably already talked about for many times. In my company I use a PostgreSQL cluster containing two instances.

    Currently there are some Debezium connectors attached to this cluster, and in a perfect world they are doing great. However, when the cluster experiences a switchover to the standby instance, the whole challenge starts: we need to recreate the replication slots and make sure that all data was transferred to Kafka.

    This situation has bothered me a lot since we took Debezium to production, and I have not yet found a good solution.
    The best thing I have come up with so far is a job that restarts failed connectors every 30 seconds, but this is still not quite satisfying.

    I also stumbled across the blog post about "pglogical" you referred to in the documentation. At this point I'm not yet skilled enough regarding the possibilities with Postgres. I also thought about creating the replication slots at the time of the failover, so the connectors already have the slots available when they get restarted.

    I would very much appreciate any kind of hint or suggestion, since this topic drains my confidence in using Debezium with Postgres going forward.

    Thanks a lot in advance

    8 replies
    We are using the Debezium Db2 connector to pull CDC logs from Db2 (z/OS) and write them to Kafka topics. While doing that, we are facing an issue with capturing the snapshot: the connector looks for SYSCAT schema tables, which do not exist on Db2 (z/OS); the same tables are present in the SYSIBM schema there. Is there a configuration setting in the Debezium Db2 connector to change the schema name from SYSCAT to SYSIBM? Also, is the Db2 connector tested with Db2 (z/OS)?
    1 reply
    Appreciate any insights or pointers?
    hi guys, I have a question: does Debezium read data from the CDC table, e.g. 'cdc.dbo_xxx_CT'? I turned on SQL Server CDC for a table and then ran some DML.
    3 replies
    Some records were inserted into cdc.dbo_xxx_CT. After that I started the connector; it read a snapshot of all the data and sent it to Kafka. I don't want the snapshot, I just want the change records from my earlier DML. What should I do?
    I set snapshot.mode to schema_only. It captures changed records after my connector starts, but the DML changes from before weren't sent to Kafka.
    Nikita Babich
    Hello, I installed and set up the Debezium connector for my PostgreSQL database with a limited list of tables and the publication.autocreate.mode: filtered option. Now I want to extend the list of tables. I added a new one and, as I can see in the connector config via the API, table.include.list was successfully updated. But when I check the tables included in the publication, I see only the two old tables, without the new one. Should I re-create the connector to apply these settings, or is it possible to configure something so that the publication is altered/re-created when the table list changes?
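    For what it's worth, a filtered publication can also be extended by hand until the connector recreates it; dbz_publication is the connector's default publication name, and the table name below is hypothetical:

```sql
ALTER PUBLICATION dbz_publication ADD TABLE public.new_table;
-- verify which tables the publication now covers
SELECT * FROM pg_publication_tables WHERE pubname = 'dbz_publication';
```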
    @Naros, hello: in my Kafka Connect distributed mode, when one of the connectors is stopped, other connectors go into UNASSIGNED status, and then some connectors report the following problem
    1 reply
    Having an issue with the Oracle 1.5.0 connector: Connect is configured with SASL_PLAIN security to the Kafka broker, but it looks like the connector is not picking it up when it uses the "database.history.kafka.bootstrap.servers" property. I keep seeing this over and over in my logs:
    [Producer clientId=xxx] Bootstrap broker xxx:9093 (id: -1 rack: null) disconnected [org.apache.kafka.clients.NetworkClient]
    [Consumer clientId=xxx] Bootstrap broker xxx:9093 (id: -1 rack: null) disconnected [org.apache.kafka.clients.NetworkClient]
    3 replies
    Hello. Can someone please explain why kafka-connect standalone forces me to enable supplemental logging on the whole database? I'm using the Oracle connector and enabled supplemental logging on two tables in one schema. After starting LogMiner, I try to start connect-standalone.sh and I get this error: "connect-standalone.sh[15378]: ERROR WorkerSourceTask{id=debezium-connector-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)
    connect-standalone.sh[15378]: Caused by: io.debezium.DebeziumException: Supplemental logging not properly configured. Use: ALTER DATABASE ADD SUPPLEMENTAL LOG DATA"
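    For reference, the statement named in the error enables minimal database-wide supplemental logging; per-table column logging is configured separately (schema and table names hypothetical):

```sql
-- database-wide minimal supplemental logging, as the error message requests
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
-- per captured table
ALTER TABLE myschema.mytable ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
```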

    Hi All.
    Any specifics for connecting to Kafka behind a load balancer?
    I have deployed the Debezium Postgres tutorial docker-compose to AWS Fargate. It put Kafka behind a load balancer with a single DNS name and port. The host:port itself is accessible, but when I try to list topics I get this:

    I have no name!@320aed83a282:/opt/bitnami/kafka$ bin/kafka-topics.sh --bootstrap-server audit-LoadB-PH107PYVPE77-869abac555f00a22.elb.us-east-1.amazonaws.com:9092 --list
    [2021-04-12 16:35:06,512] WARN [AdminClient clientId=adminclient-1] Connection to node 1 (/ could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

    And that is not the IP the broker DNS name resolves to.

    nslookup audit-LoadB-PH107PYVPE77-869abac555f00a22.elb.us-east-1.amazonaws.com
    Server:  Fios_Quantum_Gateway.fios-router.home
    Non-authoritative answer:
    Name:    audit-LoadB-PH107PYVPE77-869abac555f00a22.elb.us-east-1.amazonaws.com

    What should I do?
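    The pattern above, where the bootstrap address works but the follow-up connection to "node 1" fails, usually points at the address the broker advertises back to clients. A sketch of the relevant broker setting, reusing the load balancer name from the question (listener name and port are assumptions):

```properties
# clients reconnect to this advertised address after bootstrap,
# so it must be reachable from outside
advertised.listeners=PLAINTEXT://audit-LoadB-PH107PYVPE77-869abac555f00a22.elb.us-east-1.amazonaws.com:9092
```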

    Avi Mualem
    hey, I have an interesting question:
    during a rebalance in Kafka Connect, will all Debezium connectors flush the binlog position they have reached, or will they just restart, in which case I should expect duplicate messages with high probability?
    L Suarez
    Hi all. I hope you'll forgive a new arrival with minimal exposure, but I'm working to build CDC and sync of a data domain between MongoDB Atlas and SQL Server. I've got a pretty good sense of the uni-directional feed from SQL to Mongo, but I've no shortage of questions about the opposite direction. First: the MongoDB Debezium connector seems to be based on tailing the oplog instead of utilizing a change stream for Mongo 3.6+. Is that correct? Also, I'm fuzzy on a Kafka sink connector for SQL: whether a generic implementation library exists, how I would go about implementing one myself if needed, etc.

    What are the best practices for running Debezium on AWS? It seems to work poorly with docker-compose-to-AWS-ECS translation. Any articles or hints will be greatly appreciated.

    Specific questions:

    1. Is it better to deploy it as a single docker-compose package, or use a stand-alone Kafka installation and then configure the Debezium Kafka connector to use it?
    2. What is the best AWS service type to host the whole thing on: EC2, ECS, Fargate, Kubernetes?