    Ben Cohen
    @cohenbenjamin05

    @Naros - Before I open an issue, I'd like to confirm something with you. It doesn't appear that the connector is explicitly acquiring locks; rather, an AccessShareLock is being acquired as a result of the connector running a SELECT * FROM table query during the snapshot. I would have expected that this query would not be run, given that the documentation says:

    The fifth snapshot mode, exported, will perform a database snapshot based on the point in time when the replication slot was created. This mode is an excellent way to perform a snapshot in a lock-free way.
    Is this expected?
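    For anyone cross-checking, the mode under discussion is selected through the connector configuration; a minimal sketch of the relevant properties (snapshot.mode is the setting being discussed, the rest are placeholders):

    ```
    # Hypothetical excerpt from a Postgres connector configuration
    connector.class=io.debezium.connector.postgresql.PostgresConnector
    snapshot.mode=exported
    ```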

    Ashika Umanga Umagilya
    @ashikaumanga
    @Naros @jpechane We are using the MySQL connector to ingest ~20M records. We write the messages to HDFS using the HDFS Sink. The initial snapshot works fine, but after a while CDC seems to have stopped ingesting new records.
    Our source record count keeps increasing (20515736, 20515744, 20515986, etc.), but the HDFS count remains at 20515283. We confirmed that the HDFS Sink works fine and that no new data is ingested (no new HDFS partitions created). Any idea what might have gone wrong?
    Jiri Pechanec
    @jpechane
    @ashikaumanga Hi, are the records stored in the Kafka broker or not?
    Ashika Umanga Umagilya
    @ashikaumanga
    @jpechane It seems that after the initial load no new messages are received on the topic.
    Jiri Pechanec
    @jpechane
    @ashikaumanga Could you please try a schema-only snapshot to verify whether streaming is working? Is there any exception in the log?
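    For reference, a schema-only snapshot is selected via the connector's snapshot.mode property; a minimal sketch, assuming the MySQL connector (the connector name is a placeholder):

    ```
    # Register a fresh connector that skips the data snapshot and only
    # captures table schemas before streaming from the binlog
    name=inventory-connector-schema-only
    connector.class=io.debezium.connector.mysql.MySqlConnector
    snapshot.mode=schema_only
    ```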
    Ashika Umanga Umagilya
    @ashikaumanga
    @jpechane I flushed the topic, started listening to it using kafka-console-consumer, and finally created a new MySQL connector with 'schema_only' (under a new connector name).
    I noticed 15 new records in the source DB, but no messages shown in the console-consumer.
    Do I need to flush the DB history topic as well?
    Ashika Umanga Umagilya
    @ashikaumanga
    Then I created a new connector with 'when_needed' and messages started flooding into the console-consumer.
    Ashika Umanga Umagilya
    @ashikaumanga
    We are using 1.0.0.Beta2; the MySQL version is 5.7.18.
    Martin
    @mhaagens
    Hi! I was wondering if there's a way to force Debezium to re-run the indexing of Postgres?
    Dawid Mazur
    @dwdmzr_twitter

    Hello everyone! I'm trying to deploy Debezium to do CDC on AWS RDS MySQL (5.7) server and today it stopped working with an exception: org.apache.kafka.connect.errors.ConnectException: Received DML 'DELETE FROM mysql.rds_sysinfo where name = 'innodb_txn_key'' for processing, binlog probably contains events generated with statement or mixed based replication format

    This is weird, because our DB is configured with the correct binlog format:

    +--------------------------------------------+----------------------+
    | Variable_name                              | Value                |
    +--------------------------------------------+----------------------+
    | binlog_format                              | ROW                  |
    | binlog_row_image                           | FULL                 |
    +--------------------------------------------+----------------------+

    I've been searching for a solution for some time now and haven't found anything useful yet. Did I do something wrong when configuring the connector, or should I create a ticket for the problem? Could the fact that we have replication in place cause this DML to appear in the binlog?
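    For anyone verifying the same settings, the values above can be read with a standard MySQL query; a minimal sketch (on RDS these variables are controlled through the parameter group):

    ```
    -- Check the replication format the connector will see
    SHOW GLOBAL VARIABLES
    WHERE Variable_name IN ('binlog_format', 'binlog_row_image');
    ```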

    Jiri Pechanec
    @jpechane
    @dwdmzr_twitter Hi, please see https://issues.jboss.org/browse/DBZ-1492
    Dawid Mazur
    @dwdmzr_twitter
    Oh, so upgrading to 1.0 should fix the problem, thank you!
    Chris Cranford
    @Naros
    Hi @cohenbenjamin05, after reviewing the exported snapshot functionality, the AccessShareLock is expected, and here is why.
    The goal behind the exported snapshot feature was to avoid LOCK TABLE <table> IN ACCESS SHARE UPDATE EXCLUSIVE MODE (at the time, this was the lock we used).
    When the replication slot is first created, we obtain the current LSN and cache this value.
    We can then read the rows from each table as of that LSN without needing to physically lock the tables, and afterwards use that same LSN as the starting point for streaming.
    Without being able to obtain that LSN marker, the table locks are necessary.
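    A rough illustration of that flow in plain SQL, ignoring the replication-protocol details the connector actually uses (the slot and snapshot names are illustrative):

    ```
    -- On a replication connection: creating the slot returns the consistent
    -- LSN and an exported snapshot name
    CREATE_REPLICATION_SLOT debezium LOGICAL wal2json EXPORT_SNAPSHOT;

    -- On a regular connection: read each table as of that point, which takes
    -- only an AccessShareLock rather than an explicit LOCK TABLE
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '00000003-0000001B-1'; -- name from the step above
    SELECT * FROM my_table;
    COMMIT;
    -- Streaming then begins from the cached LSN
    ```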
    Ben Cohen
    @cohenbenjamin05
    @Naros Thanks for the clarification. I'd suggest updating the documentation, as others may come away with the same expectation I had after reading it.
    Weijing Jay Lin
    @dotku
    Hello :)
    I'm working with the Postgres connector; does anyone have an image with a Postgres 10 sample?
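    In case it helps, the Debezium project publishes Postgres images with the logical decoding plugins preinstalled; a minimal sketch, assuming the debezium/postgres:10 tag and placeholder credentials:

    ```
    docker run -d --name postgres -p 5432:5432 \
      -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres \
      debezium/postgres:10
    ```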
    Kai
    @akai1024

    Hello,
    I am using EmbeddedEngine to capture data from MySQL.
    After a few weeks of running, I changed the schemas of some tables while Debezium was running, and I found that some values are mismatched with their keys.
    For example: the expected key/value is {"col1":"value1", "col2":"value2", "col3":"value3"}
    But Debezium exports something like {"col1":"value2", "col2": null, "col3":"value1"}
    I tried snapshot.mode=schema_only_recovery and restarted to fix this problem, but it doesn't work.
    Here are some logs when starting:

    2019-11-16 03:56:28.201 [WARN] [taskExecuteManager-2] i.d.c.mysql.MySqlValueConverters - Using UTF-8 charset by default for column without charset: player_country VARCHAR(40) NOT NULL DEFAULT VALUE 
    2019-11-16 03:56:28.201 [WARN] [taskExecuteManager-2] i.d.c.mysql.MySqlValueConverters - Column is missing a character set: player_nickname VARCHAR(30) NOT NULL DEFAULT VALUE 
    2019-11-16 03:56:28.201 [WARN] [taskExecuteManager-2] i.d.c.mysql.MySqlValueConverters - Using UTF-8 charset by default for column without charset: player_nickname VARCHAR(30) NOT NULL DEFAULT VALUE 
    2019-11-16 03:56:28.205 [WARN] [taskExecuteManager-2] i.d.c.mysql.MySqlValueConverters - Column uses MySQL character set ''utf8mb4'', which has no mapping to a Java character set
    2019-11-16 03:56:28.205 [WARN] [taskExecuteManager-2] i.d.c.mysql.MySqlValueConverters - Using UTF-8 charset by default for column without charset: last_login_ip VARCHAR(40) CHARSET 'utf8mb4' NOT NULL DEFAULT VALUE 
    2019-11-16 03:56:28.228 [WARN] [taskExecuteManager-4] i.d.c.mysql.MySqlValueConverters - Column is missing a character set: banker_id VARCHAR(16) NOT NULL DEFAULT VALUE

    And when a row is updated, it comes with logs like this, which I think are related to the warnings above:

    2019-11-16 04:13:05.241 [ERROR] [blc-DB_MEMBER:3306] i.d.relational.TableSchemaBuilder - Failed to properly convert data value for 'otg_member.player.last_logout_time' of type BIGINT for row [[97, 97, 97, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 49, 52, 54], [97, 111, 116, 103, 48, 49, 48, 48, 48, 48, 48, 48, 48, 48, 48, 49], [97, 97, 97], 98, [107, 97, 105, 54, 54, 54], [75, 97, 105], [], [50, 101, 53, 53, 54, 100, 101, 55, 100, 97, 97, 55, 53, 56, 51, 99, 55, 97, 52, 52, 97, 51, 49, 51, 56, 55, 55, 49, 97, 101, 57, 100], [52, 102, 51, 51, 101, 54, 57, 50, 56, 57, 55, 57, 53, 57, 53, 56, 48, 55, 99, 97, 99, 54, 102, 54, 102, 101, 52, 102, 50, 55, 98, 55], [], [67, 78, 89], [97, 111, 116, 103, 48, 49, 48, 48, 48, 48, 48, 48, 48, 48, 48, 51], [68, 95, 97, 97, 97, 100, 97, 67, 78, 89], [107, 97, 105, 64, 111, 50, 111, 103, 116, 46, 99, 111, 109], [49, 50, 51, 52], [], [49, 50, 51, 52], [50, 48, 49, 57, 45, 49, 48, 45, 50, 52], [49, 50, 51, 52], [], [], [], 1573725030137, 1573802504922, 1573877541737, [49, 57, 50, 46, 49, 54, 56, 46, 54, 51, 46, 49, 49], [49, 57, 50, 46, 49, 54, 56, 46, 54, 51, 46, 49, 49], [79, 49], 6, 3, 0, 0, 0, 1573722463033, [50, 48, 49, 57, 45, 49, 49, 45, 49, 52, 32, 48, 57, 58, 48, 55, 58, 52, 51], 1573802504922, [50, 48, 49, 57, 45, 49, 49, 45, 49, 53, 32, 48, 55, 58, 50, 49, 58, 52, 52], 1]:
    java.lang.IllegalArgumentException: Unexpected value for JDBC type -5 and column last_logout_time BIGINT(20) NOT NULL DEFAULT VALUE 0: class=class [B
        at io.debezium.jdbc.JdbcValueConverters.handleUnknownData(JdbcValueConverters.java:1171)
        at io.debezium.jdbc.JdbcValueConverters.convertValue(JdbcValueConverters.java:1214)
        at io.debezium.jdbc.JdbcValueConverters.convertBigInt(JdbcValueConverters.java:816)
        at io.debezium.jdbc.JdbcValueConverters.lambda$converter$6(JdbcValueConverters.java:287)
        at io.debezium.relational.TableSchemaBuilder.lambda$createValueGenerator$4(TableSchemaBuilder.java:257)
        at io.debezium.relational.TableSchema.valueFromColumnData(TableSchema.java:143)
        at io.debezium.connector.mysql.RecordMakers$1.update(RecordMakers.java:272)
        at io.debezium.connector.mysql.RecordMakers$RecordsForTable.update(RecordMakers.java:499)
        at io.debezium.connector.mysql.Binlo
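    For readers trying the same recovery path, the mode is set together with a persistent database history; a minimal sketch of the properties involved, assuming an embedded engine with file-based history (paths are placeholders):

    ```
    # Rebuild the in-memory schema history from the binlog; requires the
    # stored history to still be consistent with the current offsets
    snapshot.mode=schema_only_recovery
    database.history=io.debezium.relational.history.FileDatabaseHistory
    database.history.file.filename=/data/dbhistory.dat
    ```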
    xi gao
    @Kevin_gx_twitter
    Hey guys, I really need help... I'm using the MongoDB connector as the sink; how can I use the PK of the original record as the PK in MongoDB or Elasticsearch, instead of always creating a new record?
    Ashika Umanga Umagilya
    @ashikaumanga
    @jpechane About the MySQL issue I mentioned above: sorry for the mistake, it's an issue on our end. Even though our DBAs claimed that they had enabled the binlog on the slaves, we could not see the binlog files being updated. I think it's a misconfiguration on the DBAs' end. Sorry for the confusion.
    Eero Koplimets
    @pimpelsang
    Hi, I've got an interesting question. Is there a way for downstream consumers to receive "drop database" events inside table topics? Right now, if a table is truncated or a database is dropped, there are no individual per-item delete events, and stale data will remain in the consumer's database. I believe there is a way to receive those DDL events from the schema changes topic, but there are no ordering guarantees between it and the data topic. Any thoughts on proper consumer logic in this case?
    Divya Bhayana
    @divya_bhayana_twitter
    Hi, I'm using the DDL parser and am not able to parse statements starting with "EXEC SQL". Has anyone faced this before?
    richard chng
    @richardchng_gitlab
    Hi,
    I'm new to Debezium.
    Is Kafka a must?
    Jiri Pechanec
    @jpechane
    @richardchng_gitlab Hi, no it is not; there is also a so-called embedded mode - see https://debezium.io/documentation/reference/1.0/operations/embedded.html
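    A minimal sketch of that embedded mode, assuming Debezium 1.0's EmbeddedEngine API and placeholder connection settings:

    ```
    import io.debezium.config.Configuration;
    import io.debezium.embedded.EmbeddedEngine;
    import java.util.concurrent.Executors;

    public class EmbeddedExample {
        public static void main(String[] args) {
            // Run the MySQL connector in-process, without a Kafka cluster
            Configuration config = Configuration.create()
                    .with("name", "my-engine")
                    .with("connector.class", "io.debezium.connector.mysql.MySqlConnector")
                    .with("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore")
                    .with("offset.storage.file.filename", "/tmp/offsets.dat")
                    .with("database.hostname", "localhost")
                    .with("database.port", "3306")
                    .with("database.user", "debezium")
                    .with("database.password", "dbz")
                    .with("database.server.id", "85744")
                    .with("database.server.name", "my-app-connector")
                    .with("database.history", "io.debezium.relational.history.FileDatabaseHistory")
                    .with("database.history.file.filename", "/tmp/dbhistory.dat")
                    .build();

            EmbeddedEngine engine = EmbeddedEngine.create()
                    .using(config)
                    .notifying(record -> System.out.println(record)) // handle change events here
                    .build();

            Executors.newSingleThreadExecutor().execute(engine);
        }
    }
    ```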
    Loganathan Velsamy
    @loganathanav
    Hello, Debezium looks interesting.
    May I know how I can use Debezium with Microsoft SQL Server databases?
    Loganathan Velsamy
    @loganathanav
    @jpechane Thank you!
    Is there any possibility of connecting Debezium with an Azure Service Bus topic?
    We've already started using Azure Service Bus topics in production.
    Jiri Pechanec
    @jpechane
    @loganathanav Hi, please take a look at what we have for AWS Kinesis - https://github.com/debezium/debezium-examples/tree/master/kinesis
    The principle will be the same - just different code.
    yando
    @undoroid
    Hi, I can connect the database to Kinesis,
    but when I run it on ECS, where is the best place to store offset.storage and database.history? (The debezium-examples Kinesis code keeps them in memory.)
    Is there a reference implementation, such as DynamoDB or S3, that can be persisted and shared across multiple instances?
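    For reference, the stores are pluggable via the engine configuration; a minimal sketch of the file-based variants that ship with Kafka Connect and Debezium (paths are placeholders, and sharing them across ECS tasks would need a shared volume or a custom OffsetBackingStore):

    ```
    # Persist offsets and schema history to files instead of memory
    offset.storage=org.apache.kafka.connect.storage.FileOffsetBackingStore
    offset.storage.file.filename=/data/offsets.dat
    database.history=io.debezium.relational.history.FileDatabaseHistory
    database.history.file.filename=/data/dbhistory.dat
    ```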
    Jiri Pechanec
    @jpechane
    @undoroid Hi, there is no reference implementation, but I remember somebody wrote an S3 implementation for their project; I cannot recall who it was, sorry :-(
    Fatih Güçlü Akkaya
    @fgakk

    @channel Hi everyone. We are using the Debezium Postgres connector for capturing database changes in AWS RDS. Currently we have a problem: the RDS used storage size keeps increasing due to lag in the active replication slot.
    ```
    select slot_name, pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) as replicationSlotLag,
    active from pg_replication_slots;

     slot_name | replicationslotlag | active
    -----------+--------------------+--------
     debezium  | 27 GB              | t

    select * from pg_replication_slots;

     slot_name |  plugin  | slot_type | datoid |   database    | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
    -----------+----------+-----------+--------+---------------+-----------+--------+------------+------+--------------+-------------+---------------------
     debezium  | wal2json | logical   | 16395  | test_database | f         | t      | 24264      |      | 23097        | 56/B6002D60 | 56/B6002D98
    ```
    This is a test environment, so RDS does not have much data or frequent changes. I have already searched Google and found some Jira issues regarding this problem, but they were already resolved. We are using the Docker image debezium/connect:1.0, deployed as a Fargate container. I think either we are using a wrong version or our configuration has a problem. Any help is appreciated.

    Fatih Güçlü Akkaya
    @fgakk
    @jpechane Aha. No, I had not seen it. OK, according to the documentation heartbeat.interval.ms is disabled by default, and we did not have such a config. I will try it.
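    For anyone hitting the same WAL growth, the setting mentioned above is a plain connector property; a minimal sketch (the interval value is arbitrary):

    ```
    # Emit periodic heartbeat events so the connector keeps confirming LSNs
    # even when no tracked table changes, letting Postgres reclaim WAL
    heartbeat.interval.ms=10000
    ```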
    Loganathan Velsamy
    @loganathanav
    @jpechane Great, thanks!
    Pierre Bernardo
    @pierre.bernardo_gitlab
    Hi all, I have a problem with Debezium on K8s. When Kafka reboots or is not available, the worker doesn't reconnect and I get the errors Unable to find available brokers to try and BrokerNotAvailableError. Could you help me? Thanks
    yando
    @undoroid
    @jpechane
    Thank you for your reply! I'll look into it.
    If I can't use the embedded version, is it best to connect Debezium to Kinesis via Kafka using the Kafka-Kinesis-Connector?
    If so, it will be a long journey...
    https://github.com/awslabs/kinesis-kafka-connector
    Thiyagu06
    @thiyagu06
    @lucas-piske You can help with @undoroid's question, I believe.
    xi gao
    @Kevin_gx_twitter
    Hi guys, I'm trying to introduce Debezium into the project but got some concerns from our architect. First: how much data can Debezium handle in the snapshot? How long does the initial sync take for a big table (10M records)?
    Jiri Pechanec
    @jpechane
    @undoroid Hi, in case Kafka is necessary and the AWS connector is too immature, you can use Apache Camel https://camel.apache.org/components/latest/index.html and combine the Kafka and Kinesis endpoints.
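    A rough sketch of such a Camel route in the Java DSL, assuming the camel-kafka and camel-aws-kinesis components with placeholder endpoint options:

    ```
    import org.apache.camel.builder.RouteBuilder;
    import org.apache.camel.main.Main;

    public class KafkaToKinesisRoute extends RouteBuilder {
        @Override
        public void configure() {
            // Bridge a Kafka topic carrying Debezium change events into a Kinesis stream
            from("kafka:dbserver1.inventory.customers?brokers=localhost:9092")
                .to("aws-kinesis://my-stream?accessKey=xxx&secretKey=yyy&region=eu-west-1");
        }

        public static void main(String[] args) throws Exception {
            Main main = new Main();
            main.addRouteBuilder(new KafkaToKinesisRoute());
            main.run(args);
        }
    }
    ```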