@leoromanovsky Hey, Leo.
First of all, thanks a lot for updating the spark2cassandra library in your fork!
I've been using it recently for batch ingest workloads for our analytics pipeline at Grammarly.
I have a question about your experience with the library: streaming ingestion is much faster than regular ingestion via the Spark Cassandra Connector, but as a side effect Cassandra can't keep up with compacting the many small SSTables it receives. I've tried tuning it by increasing compaction concurrency and removing the compaction MB/s throughput threshold, but it seems I still need to dramatically decrease the streaming rate to address this.
I'm curious whether you faced similar challenges in your case.