    Leo Romanovsky
    Hello, world!
    Mikhail Chernetsov

    @leoromanovsky Hey, Leo.

    First of all, thanks a lot for updating the spark2cassandra library in your fork!

    I've been using it recently for batch ingest workloads for our analytics pipeline at Grammarly.

    I have a question about your experience with the library. Ingestion via streaming is much faster than regular ingestion via the Spark Cassandra Connector, but as a side effect Cassandra cannot keep up with compacting the many small SSTables it receives. I've tried tuning it by increasing compaction concurrency and removing the compaction throughput (MB/s) limit, but it seems I still need to dramatically decrease the streaming rate to address this.

    Curious if you faced similar challenges in your case?

    Hi there, I just created a Pull Request that makes the TTL configurable through the Spark configuration. Could you have a look?