These are chat archives for cloudera/kudu

Jan 2018
Aaron Hiniker
Jan 05 2018 20:23

I have a Spark job that appears to be hung (Kudu 1.4.0-cdh5.12.0):
In the driver stacks, I see threads stuck here:

java.lang.Object.wait(Native Method)

and in the driver logs, I'm seeing these messages get logged:

20:19:07 WARN  ConnectToCluster - Unable to find the leader master (,,, will retry
20:19:07 ERROR TabletClient - [Peer master-] Unexpected exception from downstream on [id: 0xf129dffb, / => /]
java.lang.RuntimeException: Could not deserialize the response, incompatible RPC? Error is: step
        at org.apache.kudu.client.KuduRpc.readProtobuf(
        at org.apache.kudu.client.Negotiator.parseSaslMsgResponse(
        at org.apache.kudu.client.Negotiator.handleResponse(
        at org.apache.kudu.client.Negotiator.messageReceived(
Running kudu cluster ksck <master> reports all tables as healthy
I should probably mention that we have multiple threads operating on independent KuduContexts in this case, but this same code had been running fine for months until recently.
Aaron Hiniker
Jan 05 2018 20:29
Oh, and also this in the Kudu logs: W0105 19:58:16.417731 18382] Unauthorized connection attempt: Server connection negotiation failed: server connection from authentication token expired
Dan Burkert
Jan 05 2018 23:05
@hindog You'll probably have more luck on the Kudu Slack channel; discussion is much more active there. KUDU-2013 is probably your issue, especially if that job took more than 6 or 7 days.
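
A rough way to check whether a job falls into that window is to compare its runtime against the token lifetime. This is only a sketch: the 7-day lifetime is an assumption taken from the "6 or 7 days" figure above, not a value confirmed in this thread, and the job start time and the helper name `token_likely_expired` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed token lifetime. The thread only says "6 or 7 days", so treat this
# as a placeholder and check the actual setting on your cluster.
ASSUMED_TOKEN_LIFETIME = timedelta(days=7)


def token_likely_expired(job_start, now, lifetime=ASSUMED_TOKEN_LIFETIME):
    """Return True if the job has run at least as long as the assumed lifetime."""
    return now - job_start >= lifetime


# Hypothetical start time, chosen for illustration only. The second timestamp
# is the one on the "authentication token expired" warning quoted above.
job_start = datetime(2017, 12, 29, 12, 0, 0)
warning_time = datetime(2018, 1, 5, 19, 58, 16)

print(token_likely_expired(job_start, warning_time))
```

If this prints `True` for the real start time of the job, the expired-token warning in the Kudu logs is consistent with the long-running-job explanation.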