conf$sparklyr.livy.sources <- TRUE
and continue to use conf$livy.jars
and that seems to push the correct Livy settings to the Livy server. But then the connection fails with:
Failed to initialize livy connection: Failed to execute Livy statement with error: <console>:24: error: not found: value LivyUtils
... not sure why?
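For reference, a minimal sketch of how those two settings can be passed at connect time via spark_config() (the Livy endpoint below is a placeholder, not from this thread):
# build a config carrying the Livy settings
config <- spark_config()
config$sparklyr.livy.sources <- TRUE
config$livy.jars <- "..."  # path(s) to the sparklyr jars, as above

# connect through the Livy method rather than a local shell
sc <- spark_connect(
  master = "http://livy-server:8998",  # placeholder Livy endpoint
  method = "livy",
  config = config
)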
sc <- spark_connect(master = 'local')
Error in start_shell(master = master, spark_home = spark_home, spark_version = version, :
Failed to find 'spark-submit2.cmd' under 'C:\Users\Owner\AppData\Local\spark\spark-3.0.0-bin-hadoop2.7', please verify SPARK_HOME.
I ran into an issue and raised it as sparklyr/sparklyr#2769
Solved!
Steps:
1) Download Spark from https://spark.apache.org/downloads.html
2) Extract the archive to 'C:/Users/scibr/AppData/Local/spark/spark-3.0.1-bin-hadoop3.2'.
3) Manually point sparklyr at the new version: spark_home_set('C:/Users/scibr/AppData/Local/spark/spark-3.0.1-bin-hadoop3.2')
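Putting the fix together in one snippet (Option B is an alternative I'm adding, using sparklyr's built-in installer rather than a manual download):
library(sparklyr)

# Option A: point sparklyr at the manually extracted distribution
spark_home_set("C:/Users/scibr/AppData/Local/spark/spark-3.0.1-bin-hadoop3.2")
sc <- spark_connect(master = "local")

# Option B: let sparklyr download and manage Spark itself
# spark_install(version = "3.0.1")
# sc <- spark_connect(master = "local", version = "3.0.1")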
Hi everyone. Spark newbie here. Got the following error in class and haven't been able to solve it:
Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
Try running options(sparklyr.log.console = TRUE)
followed by sc <- spark_connect(...)
for more debugging info.
I saw @javierluraschi's answer in sparklyr/sparklyr#801, but that fix hasn't worked for me. Any kind of help would be deeply appreciated.
Thanks a lot!
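For what it's worth, the debugging steps from the error message as a runnable snippet (master = "local" is just an example):
# print gateway/driver logs to the R console while connecting
options(sparklyr.log.console = TRUE)
sc <- spark_connect(master = "local")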
Hi everyone! I'm trying to migrate a script to sparklyr, and I can't find the equivalent of spread and gather. My code looks something like:
ltv_curves %>%
  spread(
    key = !!as.name(column_to_fill),
    value = grosstotal
  ) %>%
  gather(
    key = !!as.name(column_to_fill),
    value = grosstotal,
    -ignore_columns
  )
Anyone around that can help with this?
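A sketch of one possible translation, assuming sparklyr 1.4+ (which translates tidyr's pivot_wider()/pivot_longer(), the successors of spread()/gather(), to Spark SQL), and assuming column_to_fill is a string and ignore_columns a character vector, as the snippet suggests:
library(dplyr)
library(tidyr)

ltv_curves %>%
  # spread() equivalent: one column per value of column_to_fill
  pivot_wider(
    names_from = !!as.name(column_to_fill),
    values_from = grosstotal
  ) %>%
  # gather() equivalent: back to long form, keeping ignore_columns fixed
  pivot_longer(
    cols = -all_of(ignore_columns),
    names_to = column_to_fill,
    values_to = "grosstotal"
  )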
I'm getting an error using spark_apply when connecting to a Kubernetes cluster. I can run sdf_len(sc, 10) just fine, but running sdf_len(sc, 10) %>% spark_apply(function(df) I(df)) returns the following error:
Error: java.io.FileNotFoundException: File file:/var/folders/jf/lqnngxkj0x75cdmv_xjygfq40000gq/T/RtmpuaCX4s/packages/packages.8599.tar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:428)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1534)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1498)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sparklyr.Invoke.invoke(invoke.scala:147)
at sparklyr.StreamHandler.handleMethodCall(stream.scala:136)
at sparklyr.StreamHandler.read(stream.scala:61)
at sparklyr.BackendHandler.$anonfun$channelRead0$1(handler.scala:58)
at scala.util.control.Breaks.breakable(Breaks.scala:42)
at sparklyr.BackendHandler.channelRead0(handler.scala:39)
at sparklyr.BackendHandler.channelRead0(handler.scala:14)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:321)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:295)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Spark 3.0.1, Scala 2.12, sparklyr 1.5.2
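Not a fix, but one workaround that may be worth trying (an assumption on my part, since the failure happens while shipping the local package bundle, not in the closure itself): spark_apply() has a packages argument, and setting it to FALSE skips that file transfer entirely:
# rely on R packages already installed on the Kubernetes worker images
# instead of copying the local .libPaths() bundle to the cluster
sdf_len(sc, 10) %>%
  spark_apply(function(df) I(df), packages = FALSE)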
sc <- spark_connect(master = "local", version = "2.3")  # connect to this local cluster
Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
Try running options(sparklyr.log.console = TRUE)
followed by sc <- spark_connect(...)
for more debugging info.
Did you run spark_install() to download some version of Spark to ~/spark? Feel free to file an issue at https://github.com/sparklyr/sparklyr/issues with more details.
Hi, I'm trying to use spark_read_avro but I'm always getting the same error:
Error in validate_spark_avro_pkg_version(sc) :
Avro support must be enabled with `spark_connect(..., version = <version>, packages = c("avro", <other package(s)>), ...)` or by explicitly including 'org.apache.spark:spark-avro_2.12:3.1.1-SNAPSHOT' for Spark version 3.1.1-SNAPSHOT in list of packages
I specified my Spark version and added avro to packages in the spark_connect function. I tried both variants suggested in the error message, but neither works.
Does anybody know how to fix this?
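For comparison, the connection pattern the error message is asking for, as a minimal sketch (the version, table name, and path below are assumptions, not from the original post):
library(sparklyr)

# enable Avro support at connection time so sparklyr pulls the
# spark-avro package matching the Spark version in use
sc <- spark_connect(
  master = "local",
  version = "3.0.1",
  packages = "avro"
)

# read an Avro file into a Spark DataFrame
df <- spark_read_avro(sc, name = "my_table", path = "path/to/file.avro")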