These are chat archives for arnaud-lb/php-rdkafka
log.retention.hours configuration option, https://kafka.apache.org/documentation/#configuration
@chabior It will remove all the messages in a topic, right?
eg : First message time in a topic : 1pm
Last message time in a topic : 4pm
when i consume the topic at 4pm, i need data only from 3pm to 4pm
1.Need to delete the data from 1pm to 3pm
Is it possible?
@chabior honestly it doesn't work like that. Kafka doesn't delete individual messages, it deletes
log segment files, which contain messages. You should consider a pair of options.
Let's see an example. You set
log.retention.hours=1 (1 hour) and
log.retention.bytes=100000000 (100 MB). Let's consider a producer that puts a new message every minute, and each message weighs 1 MB. So in an hour there will be 60 messages in the log. It seems the 1st message should be deleted, but in that case Kafka would have to delete the log file, and it can't do that because the rest of the messages in the log have existed for less than 1 hour, so the 1st message lives on. Let's see 2 cases:
1) The producer stops putting new messages to Kafka. In that case Kafka waits 1 hour. When the freshest message becomes older than 1 hour, Kafka deletes the log file. What we have... the 1st message has lived 2 hours.
2) The producer continues putting messages every minute. After the 100th message we reach
log.retention.bytes and Kafka starts writing the next messages to a new log file. Kafka can't delete the old log file immediately, since it guarantees that every message lives at least 1 hour, so the previous log lives 1 hour more. What we have... 100 messages * 1 min = 1 hour 40 min, + 1 hour (TTL of the last message). As a result the 1st message will live 2h 40m.
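A quick back-of-the-envelope sketch of those two cases (my own Python illustration, just restating the arithmetic above with the numbers from the example):

```python
# Assumptions from the chat example: log.retention.hours=1,
# log.retention.bytes=100 MB, one 1 MB message produced every minute,
# so a segment fills up after 100 messages (~100 minutes).

RETENTION_MIN = 60        # log.retention.hours=1, expressed in minutes
SEGMENT_LIMIT_MSGS = 100  # 100 MB segment limit / 1 MB per message
MSG_EVERY_MIN = 1         # one message per minute

# Case 1: the producer stops after an hour (60 messages).
# The segment is only deletable once its *newest* message is older
# than the retention window, so the 1st message lives:
case1_lifetime_min = 60 + RETENTION_MIN          # 120 min = 2 h

# Case 2: the producer keeps going; the segment rolls after 100
# messages (100 minutes), then the old segment still has to wait out
# the retention window, so the 1st message lives:
case2_lifetime_min = SEGMENT_LIMIT_MSGS * MSG_EVERY_MIN + RETENTION_MIN
# 100 + 60 = 160 min = 2 h 40 m

print(case1_lifetime_min, case2_lifetime_min)    # 120 160
```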
You can store each message offset in Redis for an hour. Then you get the smallest offset which is still alive in Redis and use it to consume the proper messages from Kafka.
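A rough sketch of that offset-window idea (all names here are my own; a real setup would store the offsets in Redis and let its EXPIRE/TTL mechanism do the expiry, instead of this in-memory dict):

```python
import time

TTL_SECONDS = 3600  # remember each offset for one hour

class OffsetWindow:
    """Tracks recently seen offsets and answers: where should I start consuming?"""

    def __init__(self, ttl=TTL_SECONDS, clock=time.time):
        self.ttl = ttl
        self.clock = clock   # injectable clock, so the example is testable
        self.seen = {}       # offset -> timestamp when it was stored

    def remember(self, offset):
        self.seen[offset] = self.clock()

    def smallest_alive(self):
        """Smallest offset stored less than `ttl` seconds ago, or None."""
        now = self.clock()
        alive = [o for o, t in self.seen.items() if now - t < self.ttl]
        return min(alive) if alive else None

# Usage with a fake clock: offsets 0..5 stored one every 10 minutes.
t = [0.0]
w = OffsetWindow(clock=lambda: t[0])
for offset in range(6):
    w.remember(offset)
    t[0] += 600  # 10 minutes pass

# 60 minutes have now passed since offset 0 was stored, so it has
# expired; offsets 1..5 are still alive.
print(w.smallest_alive())  # 1
```

You would then seek the consumer to that smallest alive offset to read only the last hour's messages.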