    Chris Cranford
    @Naros
    That should guarantee that the schema gets re-snapshotted but not the data, and that the binlog begins to be read from the expected position you set in (3).
    So in MySQL, when a table change happens, two records are written to the binlog:
    1. Describes the state of the table, think of it like a table descriptor, has all the columns, data types, etc.
    2. The actual DML operation, the insert/update entry.
    It stands to reason that perhaps what's happened is that the offset used in (3) points to a position in the binlog where we get the DML operation but never read the actual table-state capture entry, so we get a binlog entry for a table number that we haven't yet been told exists and thus have no idea about its columns, etc.
    Therefore, forcing a schema resnapshot should correct the problem as I understand it.
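    (For context on the settings involved: the schema re-snapshot being discussed is normally driven by the MySQL connector's snapshot.mode together with the database history topic. A sketch of the relevant properties, with the topic name as a placeholder; note that, as the following messages point out, this rebuilds the history from the current schema rather than replaying old DDL:)

        snapshot.mode=schema_only_recovery
        database.history.kafka.topic=dbhistory.<new-topic-name>
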
    Mike Kamornikov
    @mikekamornikov
    @Naros It won't work for me because I start not from "now" but from some defined offset. For the same reason schema_only_recovery won't work either. As I understand it, these modes recreate the current schema, and if there was a db/table drop between the binlog position and "now", Debezium won't find a schema for those db/table events.
    @Naros Given that I tried the same thing several times (and got the same error), I don't believe my offset from step 1 is wrong (just state and not DML).
    Chris Cranford
    @Naros
    @mikekamornikov To be clear, is the offset position you're using from before the restoration or from immediately after it?
    Mike Kamornikov
    @mikekamornikov

    @Naros it's the position of the AWS Aurora snapshot. I get it this way:

        aws_result=$(aws rds describe-events --source-identifier "$target_db_instance_identifier" --source-type db-instance --output json | jq -r '.Events[] | select(.Message | contains("Binlog position from crash")) | .Message | sub("Binlog position from crash recovery is (?<filename>.*) (?<position>.*)"; "\(.filename) \(.position)")')
        read binlog_name binlog_position <<< $aws_result

    So snapshot is atomic. And restoring from snapshot is atomic.

    Chris Cranford
    @Naros
    What I'm trying to grasp (and I apologize if this is a tad outside my wheelhouse) is if the binlog position you're using is at the point right before you did the restore or immediately after the restore concluded.
    Mike Kamornikov
    @mikekamornikov
    @Naros What we do is similar to https://thedataguy.in/debezium-mysql-snapshot-for-aws-rds-aurora-from-backup-snaphot/ . The only difference is instead of "schema_only_recovery" we try "never". Btw, could you explain why the error described in the "Update the Endpoint" section is possible?
    Mike Kamornikov
    @mikekamornikov
    @Naros Regarding binlog position it's not before or after. It's the exact position of cluster snapshot.
    Chris Cranford
    @Naros
    So I'm taking a stab in the dark here trying to digest the code; perhaps @jpechane can correct me if I am mistaken.
    So when you use the same schema history topic, we read this history back in and, IIUC, each table read gets assigned an incremental table number.
    When the connector begins to stream, we come across a binlog entry that says table X with table-number 123 had an insert done to it.
    The connector takes table-number 123 and attempts to resolve this to the table's schema in memory, but we don't find a table with that table-number.
    This leads to the error.
    In MySQL, the binlog for an insert as I said earlier has 2 entries:
    1. TABLE_MAP entry that describes the table structure, which maps this table-number to a physical table name.
    2. A ROWS_EVENT (more specifically a WRITE_ROWS_EVENT, an UPDATE_ROWS_EVENT, or a DELETE_ROWS_EVENT). This event carries this table-number and all pertinent details about the insert/update/delete operation.
    This error stems from the fact we get this ROWS_EVENT that says we're doing something with table-number 123, we look it up, but we don't have anything for that number. This is why I questioned the binlog position earlier, b/c I would have expected the binlog event stream to have also included a TABLE_MAP entry earlier, before the event that is causing this error.
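    (One way to check what is actually in the binlog around that offset is to decode it directly with mysqlbinlog; a sketch, with host, credentials, file name and position as placeholders, and assuming the value given to --start-position is a real event boundary:)

        mysqlbinlog --read-from-remote-server \
          --host=<aurora-endpoint> --user=<user> -p \
          --start-position=<binlog_position> \
          --base64-output=decode-rows --verbose \
          <binlog_name> | head -n 40

    (Table_map events show up as "Table_map: `db`.`table` mapped to number N", so it's easy to see whether a TABLE_MAP precedes the ROWS event at the position in question.)
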
    Mike Kamornikov
    @mikekamornikov
    Regarding that table-number to physical table name map, does it mean that it can be different for two clusters where one is a clone of the other? If that's the case, does it mean we must recreate the dbhistory topic?
    Mike Kamornikov
    @mikekamornikov
    In short, we're trying to collect snapshot data on a clone cluster and then switch back to the source cluster and continue reading the binlog from the position used to create the clone. It guarantees zero downtime. Am I missing anything obvious?
    Ryan Truran
    @RyanTruran
    what types of files should be in my plugin path?
    is that where the jar files should be located?
    Sergei Morozov
    @morozov

    The full error message that @mikekamornikov mentioned earlier is this:

    Encountered change event for table somedatabase.sometable whose schema isn't known to this connector.

    So I assume that Debezium manages to map the table id from the binlog to a name but didn't manage to locate the table schema.

    The connector takes table-number 123 and attempts to resolve this to the table's schema in memory, but we don't find a table with that table-number.

    With the above said, that explanation doesn't look valid.

    As I see from the documentation, a binlog message refers to a table by ID:

    # at 218
    #080828 15:03:08 server id 1  end_log_pos 258   Write_rows: table id 17 flags: STMT_END_F

    But when I look at the dbhistory topic messages, they are basically JSON messages with DDL like CREATE TABLE.... How does Debezium know the mapping between table ids and names, and why would it be different if we switch from a clone to the original instance?
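    (For reference, the history topic can be dumped directly; each record is a JSON document containing the DDL plus the binlog coordinates at which it was read, which is what the recovery comparison discussed further below relies on. A sketch, assuming a local broker and with the topic name as a placeholder:)

        kafka-console-consumer.sh \
          --bootstrap-server localhost:9092 \
          --topic <dbhistory-topic> \
          --from-beginning --max-messages 20 | jq .
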

    Ryan Truran
    @RyanTruran
    What is "topic not present in metadata after 10000ms" telling me?
    Ryan Truran
    @RyanTruran
    more specifically, this is the error that I am getting:
    ERROR Producer failure (io.debezium.pipeline.ErrorHandler) io.debezium.DebeziumException: io.debezium.relational.history.DatabaseHistoryException: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Topic dbhistory.tags not present in metadata after 10000 ms.

    followed by

    ERROR WorkerSourceTask{id=dbz-test-connector-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
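    (A frequent cause of "not present in metadata" is simply that the database history topic doesn't exist and the broker isn't allowed to auto-create it; it can also be a connectivity or ACL problem. A sketch of creating the topic by hand, with the broker address as a placeholder and the single-partition, non-expiring settings the Debezium docs recommend for history topics:)

        kafka-topics.sh --create \
          --bootstrap-server localhost:9092 \
          --topic dbhistory.tags \
          --partitions 1 --replication-factor 1 \
          --config retention.ms=-1
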

    Jiri Pechanec
    @jpechane
    @mikekamornikov @morozov Hi, I'd start with the io.debezium.relational.history log level set to TRACE. This way you should be able to see what is read from the database history and how.
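    (On newer Kafka Connect versions (2.4+) the log level can be changed at runtime via the REST API, without editing the log4j config; a sketch, with the Connect host as a placeholder:)

        curl -s -X PUT -H "Content-Type: application/json" \
          -d '{"level": "TRACE"}' \
          http://localhost:8083/admin/loggers/io.debezium.relational.history
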
    shauryayellam
    @shauryayellam
    Hi @jpechane, I want to add a new property to an existing, running MySQL Debezium connector. Can you please provide me the command?
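    (There is no single "add one property" command; the usual approach is to PUT the complete, updated configuration back through the Kafka Connect REST API. A sketch, with the Connect host and connector name as placeholders:)

        # fetch the current config and edit it to add the new property...
        curl -s http://localhost:8083/connectors/<connector-name>/config > config.json
        # ...then push the complete config back (a partial body would drop the other properties)
        curl -s -X PUT -H "Content-Type: application/json" \
          --data @config.json \
          http://localhost:8083/connectors/<connector-name>/config
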
    Jiri Pechanec
    @jpechane
    @mikekamornikov @morozov ok, I probably know what the problem is. IIUC, after this operation the binlog coordinates will change. If this is the case, then the comparator used in io.debezium.relational.history.AbstractDatabaseHistory.recover(Map<String, ?>, Map<String, ?>, Tables, DdlParser) will not work correctly and it is possible that some or all history records will not be applied. Hence the need for a new database history topic and schema re-snapshotting, to align the binlog coordinates stored in the database history with the new binlog.
    张峰
    @zhangfengsd
    I ran Debezium with the Docker version, but I didn't see an environment variable among the documented Environment Variables for specifying the Kafka account password.
    There is only the BOOTSTRAP_SERVERS parameter, which lets you specify just the address and port.
    Is it true that the Debezium Docker image doesn't support setting the Kafka account password via an environment variable? Which version do I need?
    The Docker image address: https://hub.docker.com/r/debezium/connect
    Francesco la Torre
    @fflatorre

    Hi everyone, I'm using Debezium on PostgreSQL 11 and I'm not able to get UPDATEs; every time I update a row I get the following:

    RelationalChangeRecordEmitter - no new values found for table '{ key : null, value : {"name" : "local.postgresql_11.6.public.table_name.Value", "type" : "STRUCT", "optional" : "true", "fields" : [{"name" : "column_1", "index" : "0", "schema" : {"type" : "INT32", "optional" : "true"}}]} }' from update message at 'Struct{version=1.2.4.Final,connector=postgresql,name=local.postgresql_11.6,ts_ms=1601031501583,db=alfa,schema=public,table=table_name,txId=596,lsn=24658768}'; skipping record

    I can easily replicate it this way:

    • create table "table_name" with column "column_1"
    • no DDL message expected (OK)
    • add a row and CREATE message correctly received (OK)
    • update the row just added and receive the message above :(

    Any clue on what the error could be?
    If it helps, I'm using

    • the plugin decoderbufs
    • debezium version 1.2.4.Final
    Francesco la Torre
    @fflatorre

    @jpechane I've run the following:

    SELECT CASE relreplident
              WHEN 'd' THEN 'default'
              WHEN 'n' THEN 'nothing'
              WHEN 'f' THEN 'full'
              WHEN 'i' THEN 'index'
           END AS replica_identity
    FROM pg_class
    WHERE oid = 'table_name'::regclass;

    and it seems to be default... is that correct?

    Jiri Pechanec
    @jpechane
    If a table does not have a primary key, the connector does not emit UPDATE or DELETE events for that table. For a table without a primary key, the connector emits only create events. Typically, a table without a primary key is used for appending messages to the end of the table, which means that UPDATE and DELETE events are not useful.
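    (A sketch of the fix implied here, using the table and column names from the repro steps above and the database name from the error message; psql connection flags are omitted:)

        psql -d alfa -c 'ALTER TABLE public.table_name ADD PRIMARY KEY (column_1);'
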
    Francesco la Torre
    @fflatorre
    oh, the table has no primary key defined ...
    yeah, thanks, got it, about to test with the proper schema
    Francesco la Torre
    @fflatorre
    thanks @jpechane all good now :)
    Aalok Kamble
    @aalokkamble
    Has anyone recently used Debezium with Google BigQuery? I'm looking to migrate a MySQL db.
    MrGui123
    @MrGui123
    Hi team, how do I specify my own log configuration file?
    akhila
    @userakhila
    Hi, some of our Debezium Postgres connector tasks are going into the UNASSIGNED state for no apparent reason; the status shows as below. Can you please help me understand what could be the reason for this issue?
    {"state":"UNASSIGNED","trace":null,"worker_id":"ip:8083","generation":52}
    vicky
    @vkvicky3_twitter
    Hi team, is there a way to enable full document discovery in the case of update patches?
    The current change log only captures partial records from Mongo.
    RadioGuy
    @RadioGuy
    Hi Team,
    I have a healthcheck script that deletes and recreates the connector with the same config if it goes into a failed state.
    But I am getting the following error when starting it.
    Should I add some delay between deleting and starting the connector?
        Caused by: io.debezium.DebeziumException: Failed to start replication stream at LSN{31/E6F6298}; when setting up multiple connectors for the same database host, please make sure to use a distinct replication slot name for each.
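    (That error usually means the replication slot named in slot.name is still marked active from the previous run when the recreated connector starts, so a short delay can help; whether the slot has been released can be checked directly in Postgres. A sketch, with the database name as a placeholder and connection flags omitted:)

        psql -d <dbname> -c 'SELECT slot_name, active, active_pid FROM pg_replication_slots;'
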
    Sam--Shan
    @Sam--Shan
    Hi guys, I am new to Kafka and Debezium. I want to know if I can produce messages to a Debezium topic with kafka-console-producer.sh or kafka-avro-console-producer. I tried it; I can read the message with kafka-console-consumer.sh, but I can't consume it in my JDBC sink connector.
    `ERROR WorkerSinkTask{id=test-sink-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. Error: Sink connector 'test-sink' is configured with 'delete.enabled=false' and 'pk.mode=record_key' and therefore requires records with a non-null key and non-null Struct or primitive key schema, but found record at (topic='sjw5.public.testjdbc1',partition=0,offset=0,timestamp=1601285491128) with a null key and null key schema. (org.apache.kafka.connect.runtime.WorkerSinkTask:586)
    org.apache.kafka.connect.errors.ConnectException: Sink connector 'test-sink' is configured with 'delete.enabled=false' and 'pk.mode=record_key' and therefore requires records with a non-null key and non-null Struct or primitive key schema, but found record at (topic='sjw5.public.testjdbc1',partition=0,offset=0,timestamp=1601285491128) with a null key and null key schema. `
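    (The sink error itself is about the record having no key. kafka-console-producer.sh can attach keys if key parsing is enabled, though whether the sink then accepts them still depends on the key/value converters configured for it. A sketch, with the broker address as a placeholder; each input line is then written as key<separator>value:)

        kafka-console-producer.sh \
          --broker-list localhost:9092 \
          --topic sjw5.public.testjdbc1 \
          --property parse.key=true \
          --property key.separator=:
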
    Joir-dan Gumbs
    @SakuraSound
    This is probably a silly question, but I am wondering if there is a way to use Debezium to just capture database schemas and their changes without capturing DML events to Kafka? I'm looking for an approach that doesn't involve just setting retention for those tables to 10 ms, unless that is my only option.
    Chris Cranford
    @Naros
    @SakuraSound You could set up the content-filtering SMT on your connector; you could then capture the schema changes but prevent the emission of change events. Take a look at https://debezium.io/documentation/reference/configuration/filtering.html
    Joir-dan Gumbs
    @SakuraSound
    Thanks! I’ll take a look
    Jason Seriff
    @jseriff_gitlab
    Hello - We've been evaluating Debezium Server to send change data to PubSub - but we want all messages to be sent to a single topic. I've taken a look at the example here: https://github.com/debezium/debezium-examples/tree/master/debezium-server-name-mapper - but am curious about the deployment model.
    Is the intent that we'd effectively build our own Docker container running our own full build of debezium server, or is the intent that we somehow build and inject just our change into an existing server container?
    CB Yuvaraj
    @YuvarajCB
    Hello, I am trying out the embedded Debezium engine using the MySQL connector. With snapshot mode set to initial and database history set to FileDatabaseHistory, DDL statements are being written to the local file upon every connector start-up. My understanding was that on start-up, if the db history records are all present, they would be used. But what I am observing is that a snapshot is being done on every start-up of the connector. Is this the expected behavior?
    icecold21
    @icecold21
    Has anybody encountered
    Caused by: com.github.shyiko.mysql.binlog.network.ServerException: Could not find first log file name in binary log index file
    at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:914)
    on Amazon RDS?
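    (A common cause of "Could not find first log file name in binary log index file" is that the binlog file recorded in the connector offsets has already been purged on the RDS side. On RDS/Aurora the binlog retention can be raised with the rds_set_configuration procedure; a sketch, with endpoint, user and the retention value as placeholders/examples:)

        mysql -h <rds-endpoint> -u <user> -p \
          -e "CALL mysql.rds_set_configuration('binlog retention hours', 24);"
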
    James Gormley
    @jgormley6
    For the SqlServer connector, how should we specify to use SSL when connecting to SQL Server? I see that for MySQL there is a `database.ssl.mode` configuration property, but I don't see a corresponding one for SqlServer. Is that not currently supported? Or am I missing a way to do it?
    Jiri Pechanec
    @jpechane
    @jgormley6 Hi, just bear in mind that database.* props are passed through to the JDBC driver, so you can use them to configure the SQL Server JDBC driver connection
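    (Following that pass-through approach, SSL for SQL Server would be configured with the Microsoft JDBC driver's own connection properties prefixed with database.; the driver property names are from the MS JDBC driver documentation, and the values/paths below are placeholders:)

        database.encrypt=true
        database.trustServerCertificate=false
        database.trustStore=/path/to/truststore.jks
        database.trustStorePassword=<password>
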