These are chat archives for kite-sdk/kite

25th
Sep 2014
Eric Sammer
@esammer
Sep 25 2014 22:28
hey all!
Joey Echeverria
@joey
Sep 25 2014 22:28
look who's back!
Eric Sammer
@esammer
Sep 25 2014 22:29
;P
Back to complain.
Joey Echeverria
@joey
Sep 25 2014 22:29
shocker
Ryan Blue
@rdblue
Sep 25 2014 22:29
Hey!
How's it going?
Eric Sammer
@esammer
Sep 25 2014 22:29
It looks like the changes to the filesystem writers mean they now write two copies of every record. Is that true?
Ryan Blue
@rdblue
Sep 25 2014 22:29
Where is this?
Joey Echeverria
@joey
Sep 25 2014 22:29
if you use the default parquet appender, yes
Eric Sammer
@esammer
Sep 25 2014 22:30
Good until a few minutes ago when we found we were doing 2x all IO to HDFS.
Joey Echeverria
@joey
Sep 25 2014 22:30
there's an option you can set
for the old behavior
this was added as a workaround for doing durable appends to parquet
Eric Sammer
@esammer
Sep 25 2014 22:30
Why does this exist?
Gah.
Ok.
Joey Echeverria
@joey
Sep 25 2014 22:30
parquet doesn't support sync
Eric Sammer
@esammer
Sep 25 2014 22:30
No, nor can it.
Ryan Blue
@rdblue
Sep 25 2014 22:31
Yes, durability has to be done through Avro
Eric Sammer
@esammer
Sep 25 2014 22:31
Well, at least not today.
Ryan Blue
@rdblue
Sep 25 2014 22:31
we're working on a long-term fix, but people want to write directly to parquet NOW!
And we want to know more about proper row group sizes before adding durability primitives (or not)
Eric Sammer
@esammer
Sep 25 2014 22:31
That's what we do, but we can manage offsets in Kafka until the file is closed (at which point we roll the commit offset forward).
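[Editor's sketch] The offset handling described in the message above can be illustrated roughly as follows. This is a minimal, self-contained model and not Kafka or Kite code: `BufferedFile` and `CommitOnCloseSink` are hypothetical names, and the "file" is just an in-memory list. The idea is that the commit offset only moves forward when a file is closed, so a crash with an open file replays those records instead of losing them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a non-durable file writer (for example, an open Parquet file). */
final class BufferedFile {
    private final List<String> buffered = new ArrayList<>();
    private boolean closed = false;

    void write(String record) {
        if (closed) {
            throw new IllegalStateException("file already closed");
        }
        buffered.add(record);
    }

    /** In this model, closing is the only point at which the contents become durable. */
    List<String> close() {
        closed = true;
        return new ArrayList<>(buffered);
    }
}

/** Advances the committed offset only when the current file is closed. */
final class CommitOnCloseSink {
    private long committedOffset;   // where a restarted consumer would resume
    private long pendingOffset;     // where it would resume if the open file closes cleanly
    private BufferedFile current = new BufferedFile();
    private final List<String> durable = new ArrayList<>();

    CommitOnCloseSink(long startOffset) {
        this.committedOffset = startOffset;
        this.pendingOffset = startOffset;
    }

    void append(long offset, String record) {
        current.write(record);
        pendingOffset = offset + 1;
    }

    /** Close the file, then roll the commit offset forward. */
    void roll() {
        durable.addAll(current.close());
        committedOffset = pendingOffset;
        current = new BufferedFile();
    }

    /** Simulate a writer failure: the open file is discarded, the commit offset is untouched. */
    void crash() {
        current = new BufferedFile();
        pendingOffset = committedOffset;
    }

    long committedOffset() {
        return committedOffset;
    }

    List<String> durableRecords() {
        return new ArrayList<>(durable);
    }
}
```

After a crash, a consumer would resume from `committedOffset()` and re-read anything that was in the discarded file, which is why this approach does not need a second durable copy of each record.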
Joey Echeverria
@joey
Sep 25 2014 22:31
for just 2x the IO!
Eric Sammer
@esammer
Sep 25 2014 22:32
Yea, this is terrible behavior.
No one wants this.
Joey Echeverria
@joey
Sep 25 2014 22:32
there's an option for disabling it
Flume does
Eric Sammer
@esammer
Sep 25 2014 22:32
Really, really shouldn't do this by default.
Ryan Blue
@rdblue
Sep 25 2014 22:32
Flume doesn't, MR does
Joey Echeverria
@joey
Sep 25 2014 22:32
no, i mean Flume wants this
Ryan Blue
@rdblue
Sep 25 2014 22:32
Yes, you're right
Joey Echeverria
@joey
Sep 25 2014 22:33
i'd be open to changing the default
Eric Sammer
@esammer
Sep 25 2014 22:33
I mean, we do ~10TB / hour in some cases. Doing 20TB when you expect 10 is... a surprise.
Ryan Blue
@rdblue
Sep 25 2014 22:33
I think this is the right default. Otherwise we silently break the contract of flush and sync
Requiring callers to turn off durability is the right thing so users aren't surprised and lose data
Eric Sammer
@esammer
Sep 25 2014 22:34
I don't know. This is terrible.
Joey Echeverria
@joey
Sep 25 2014 22:34
you can set the property FileSystemProperties.NON_DURABLE_PARQUET_PROP
to true
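[Editor's sketch] A rough illustration of the opt-out semantics being discussed: durable (double-write) behavior unless a property is explicitly set to true. The key string below is a made-up placeholder, since the chat does not show the actual value behind `FileSystemProperties.NON_DURABLE_PARQUET_PROP`, and `WriterMode` is a hypothetical helper, not part of Kite.

```java
import java.util.Map;

final class WriterMode {
    // Placeholder key for illustration only; not the real Kite property name.
    static final String NON_DURABLE_KEY = "example.parquet.non-durable";

    private WriterMode() {
    }

    /** Durable unless the caller explicitly sets the opt-out property to "true". */
    static boolean isDurable(Map<String, String> props) {
        // Boolean.parseBoolean(null) is false, so a missing key keeps the durable default.
        return !Boolean.parseBoolean(props.get(NON_DURABLE_KEY));
    }
}
```

With this shape, a caller who never sets the property pays for the second copy, which is the default being debated here.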
Eric Sammer
@esammer
Sep 25 2014 22:34
I don't think this is the right way to solve this.
Joey Echeverria
@joey
Sep 25 2014 22:34
for your use case Eric
Eric Sammer
@esammer
Sep 25 2014 22:35
Kite shouldn't be trying to provide guarantees the underlying storage doesn't support.
I'd just document the behavior difference and be done with it.
Ryan Blue
@rdblue
Sep 25 2014 22:35
The problem with that is that users will lose data
Eric Sammer
@esammer
Sep 25 2014 22:36
Or systems fail because they're doing 2x the IO
Joey Echeverria
@joey
Sep 25 2014 22:36
for Flume users
Ryan Blue
@rdblue
Sep 25 2014 22:36
Opt-out is definitely the safer choice
Joey Echeverria
@joey
Sep 25 2014 22:36
the only work around
Eric Sammer
@esammer
Sep 25 2014 22:36
No one expects Parquet to be durable.
Joey Echeverria
@joey
Sep 25 2014 22:36
is write to Avro then convert to parquet after
which is still 2x the IO
Ryan Blue
@rdblue
Sep 25 2014 22:36
But you understand the risks and can opt-out
Joey Echeverria
@joey
Sep 25 2014 22:37
so, I think this is a valid mode
Eric Sammer
@esammer
Sep 25 2014 22:37
That's a really weird thing to slip into a release.
Ryan Blue
@rdblue
Sep 25 2014 22:37
No one expects Parquet to be durable? How do you know?
Eric Sammer
@esammer
Sep 25 2014 22:37
Heh. Did you just ask me if I have experience working with users of these systems?
Joey Echeverria
@joey
Sep 25 2014 22:38
I agree we should have documented this change better
Eric Sammer
@esammer
Sep 25 2014 22:39
I think we're going to fork Kite. It just does too much. It's too unpredictable. :( I don't really want to do that.
Ryan Blue
@rdblue
Sep 25 2014 22:39
That wasn't what I wanted to ask. I want to understand why you think users won't or don't expect durability
Eric Sammer
@esammer
Sep 25 2014 22:40
Because it's pretty well documented that columnar formats can't be sync'd because of the way they buffer row groups.
Ryan Blue
@rdblue
Sep 25 2014 22:40
In our original discussion on the flume sink, we decided to disallow parquet writes precisely because users would not understand the durability guarantees. I thought you helped make that decision.
Eric Sammer
@esammer
Sep 25 2014 22:41
I argued for making it a configuration option. You and Brock decided to hard code it so people couldn't configure it.
I want a much thinner library for what we do. I want to understand what it's doing. This change just disrupted a production workload for a feature I never wanted.
Ryan Blue
@rdblue
Sep 25 2014 22:42
but difficulty of getting it to write to parquet aside (config vs. require coding), the decision was that Kite users don't understand the durability trade-offs
I think that is still the case.
Joey Echeverria
@joey
Sep 25 2014 22:43
@esammer that's a risk with our pre-1.0 status
Eric Sammer
@esammer
Sep 25 2014 22:43
I think that's probably true. You're right.
@joey but there are bugs and then there are "features"
Ryan Blue
@rdblue
Sep 25 2014 22:43
Anyway, the problem is that we didn't handle this well to make you aware of it
Joey Echeverria
@joey
Sep 25 2014 22:43
i agree, and we screwed up in documenting this change
Ryan Blue
@rdblue
Sep 25 2014 22:43
and I apologize for that
We will make these changes much more visible in the future
Eric Sammer
@esammer
Sep 25 2014 22:45
I mean, the code doesn't do anything useful. If the writer fails during a write, there's no ability (short of writing custom code) to get those edits in the avro file back.
Joey Echeverria
@joey
Sep 25 2014 22:46
your data isn't lost
Eric Sammer
@esammer
Sep 25 2014 22:46
Or is the goal to just capture it and leave recovery to the user?
Joey Echeverria
@joey
Sep 25 2014 22:46
the intention was to have a recovery tool
but that hasn't been built yet
Ryan Blue
@rdblue
Sep 25 2014 22:46
Recovery is similar to Avro, although you can't just copy the file over and have most of it work
Joey Echeverria
@joey
Sep 25 2014 22:46
but at least the data isn't lost
Eric Sammer
@esammer
Sep 25 2014 22:47
Hm.
Ryan Blue
@rdblue
Sep 25 2014 22:47
And yes, we do plan to add a data recovery tool to the CLI suite
Eric Sammer
@esammer
Sep 25 2014 22:49
I just don't think this should be part of the core library. I mean, you could wrap a dataset in a durable version or something...
Joey Echeverria
@joey
Sep 25 2014 22:49
that's how it's implemented
Eric Sammer
@esammer
Sep 25 2014 22:50
Well, no, it's under the existing dataset, not a wrapper around it.
Joey Echeverria
@joey
Sep 25 2014 22:50
but you're still voting that the default should be regular version
Eric Sammer
@esammer
Sep 25 2014 22:50
Yep.
Ryan Blue
@rdblue
Sep 25 2014 22:50
This is something that I think is implied by the dataset contract. We have flush and sync methods and they shouldn't be empty unless you opt-in for that behavior
Eric Sammer
@esammer
Sep 25 2014 22:51
Yea, I totally get what you mean, but there's also a pretty obvious implication that writing a record doesn't write 2 records...
Ryan Blue
@rdblue
Sep 25 2014 22:51
It doesn't in the long-term because the avro files are cleaned up
And they are never read by the Data
Joey Echeverria
@joey
Sep 25 2014 22:51
our API docs don't make flush a contract
Ryan Blue
@rdblue
Sep 25 2014 22:51
sorry, DatasetReaders
Joey Echeverria
@joey
Sep 25 2014 22:51
"Implementations of this interface must declare their durability guarantees."
Eric Sammer
@esammer
Sep 25 2014 22:51
I don't care about storage. I care about IO.
IO is far more precious than storage in this case.
Ryan Blue
@rdblue
Sep 25 2014 22:52
That's why we support opting out of the durability guarantee
Eric Sammer
@esammer
Sep 25 2014 22:53
Well, consider it feedback. If you guys think it makes sense, that's ok. I don't, and I worry about what else is under there now.
I think I just want something different. We seem to have a pretty different view of what the library should do.
Ryan Blue
@rdblue
Sep 25 2014 22:53
Thanks for letting us know
What can we do in the future to make this easier for you?
I don't want to have any more surprises messing up your production systems.
Eric Sammer
@esammer
Sep 25 2014 22:54
I think we're just going to fork it.
It's not mine. It's our customers'. We ship Kite to folks today.
Ryan Blue
@rdblue
Sep 25 2014 22:55
I understand, and I'd like to see how we can avoid the problem so you don't have to fork.
But that's up to you.
If there's a way we can communicate more with you that works for you better, I'm happy to do it since we want to keep working together on this
Eric Sammer
@esammer
Sep 25 2014 22:56
Like I said, I feel like we see the role of the library very differently. I'd like to have very little change / implicit functionality in the writers / datasets. I'd like to see higher level libraries on top of those APIs that do this stuff. I don't think we should change the behavior of the underlying functionality.
In other words, we should layer on top, not slide layers underneath.
I don't think it's a communication issue. If I had known, I would only be marginally less bummed.
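[Editor's sketch] The "layer on top" idea from the messages above could look something like a decorator: the base writer stays unchanged and writes each record once, and durability is an explicit wrapper that the caller chooses. All names here (`RecordWriter`, `PlainWriter`, `DurableWriter`) are hypothetical and do not come from Kite; the byte counters only stand in for real IO.

```java
/** Minimal writer abstraction for the sketch. */
interface RecordWriter {
    void write(byte[] record);

    long bytesWritten();
}

/** Base writer: one copy of every record, nothing implicit. */
final class PlainWriter implements RecordWriter {
    private long bytes;

    @Override
    public void write(byte[] record) {
        bytes += record.length;
    }

    @Override
    public long bytesWritten() {
        return bytes;
    }
}

/** Opt-in wrapper: also writes every record to a sidecar log, doubling the IO. */
final class DurableWriter implements RecordWriter {
    private final RecordWriter primary;
    private final RecordWriter sidecar;

    DurableWriter(RecordWriter primary, RecordWriter sidecar) {
        this.primary = primary;
        this.sidecar = sidecar;
    }

    @Override
    public void write(byte[] record) {
        sidecar.write(record);  // durable copy first
        primary.write(record);  // then the non-durable columnar file
    }

    @Override
    public long bytesWritten() {
        return primary.bytesWritten() + sidecar.bytesWritten();
    }
}
```

In this arrangement the 2x IO cost is visible at the call site (`new DurableWriter(...)`), rather than appearing underneath an existing writer.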
Ryan Blue
@rdblue
Sep 25 2014 22:58
I don't think we've done that very much -- this is the only case I can think of where we restructured a writer or reader
We've added optional features, but they are largely unchanged
Eric Sammer
@esammer
Sep 25 2014 22:58
Views in the write path were similar.
But that's a very subjective position ("a lot" versus "a little")
And this is big.
Ryan Blue
@rdblue
Sep 25 2014 22:59
This is big, but we thought it was very important to customers not to lose data unexpectedly
Eric Sammer
@esammer
Sep 25 2014 23:00
I don't get the sense that I'm properly conveying the impact of 2 x the IO.
And you still could have done it. I'm just saying above, not under.
Ryan Blue
@rdblue
Sep 25 2014 23:02
No, I definitely get that 2x IO sucks
Eric Sammer
@esammer
Sep 25 2014 23:02
Kite now pulls in MR, which has also caused us heartache. Things like that...
Ryan Blue
@rdblue
Sep 25 2014 23:02
The MR InputFormat caused you heartache?
What can we do to fix it for you?