@JakeRuss I'm trying to connect to a remote Cassandra instance using a host, port, username, and password:
library(sparklyr)

conf <- spark_config()
conf[["spark.cassandra.connection.ssl.enabled"]] <- TRUE
conf[["spark.cassandra.connection.host"]] <- cassandra_host
conf[["spark.cassandra.connection.port"]] <- cassandra_port
conf[["spark.cassandra.auth.username"]] <- cassandra_username
conf[["spark.cassandra.auth.password"]] <- cassandra_password
conf[["sparklyr.defaultPackages"]] <- c("org.apache.hadoop:hadoop-aws:2.7.3", "datastax:spark-cassandra-connector:2.0.0-RC1-s_2.11")
sc <- spark_connect(master = "local", version = "2.2.0", spark_home = spark_path, config = conf)
df <- spark_read_source(
  sc,
  name = "emp",
  source = "org.apache.spark.sql.cassandra",
  options = list(keyspace = "temp", table = "category_distribution"),
  memory = FALSE
)
But this is not working; please suggest a solution. The error is:
Error in force(code) :
Failed while connecting to sparklyr to port (8880) for sessionid (52016): Gateway in localhost:8880 did not respond.
Path: C:\Users\Tarun_Gupta2\AppData\Local\spark\spark-2.4.5-bin-hadoop2.7\bin\spark-submit2.cmd
Parameters: --class, sparklyr.Shell, "C:\Users\TarunGupta2\Documents\R\win-library\3.6\sparklyr\java\sparklyr-2.4-2.11.jar", 8880, 52016
Log: C:\Users\TARUN~1\AppData\Local\Temp\Rtmpw9ZV82\filea70487da97_spark.log
---- Output Log ----
/Java/jdk1.8.0_251\bin\java was unexpected at this time.
---- Error Log ----
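A hedged note on the output log above: the Windows message "... was unexpected at this time" usually means cmd.exe choked on the JAVA_HOME value (the forward-slash /Java/jdk1.8.0_251 prefix suggests it is set without a drive letter or with Unix-style slashes). If so, pointing JAVA_HOME at a plain backslashed path before connecting may get spark-submit2.cmd past it; the install directory below is an assumption:

# Re-point JAVA_HOME at the JDK using a backslashed Windows path before
# spark_connect() launches spark-submit2.cmd. The exact directory is a guess.
Sys.setenv(JAVA_HOME = "C:\\Java\\jdk1.8.0_251")

library(sparklyr)
sc <- spark_connect(master = "local", version = "2.4.5")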
Hi, could someone please help me fix the error below? I have another working setup on Hadoop 2 (EMR 5.x), and I am now testing EMR 6 with a new Spark home, /usr/lib/spark6/. I compared the settings of both and everything looks fine to me. Is there a specific setting I need to check?
sc <- spark_connect(master = "yarn", spark_home = "/usr/lib/spark6", deploymode = "cluster", enableHiveSupport = TRUE)
Error in force(code) :
Failed while connecting to sparklyr to port (8880) for sessionid (32486): Gateway in localhost:8880 did not respond.
Path: /usr/lib/spark6/bin/spark-submit
Parameters: --class, sparklyr.Shell, '/opt/R/3.6.0/lib64/R/library/sparklyr/java/sparklyr-2.4-2.11.jar', 8880, 32486
Log: /tmp/RtmpijZOtA/filee69e18f188dc_spark.log
---- Output Log ----
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at sparklyr.Shell$.main(shell.scala:9)
at sparklyr.Shell.main(shell.scala)
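A hedged reading of that stack trace: a NoSuchMethodError on scala.Predef$.refArrayOps is the classic Scala 2.11-versus-2.12 binary mismatch. The parameters line shows the Scala 2.11 shell jar (sparklyr-2.4-2.11.jar) being submitted, while EMR 6 ships Spark 3 built against Scala 2.12. If that is the cause, upgrading sparklyr to a release that bundles a Scala 2.12 jar and declaring the cluster's actual Spark version should make it pick the matching jar; the version string below is an assumption:

library(sparklyr)

# With a sparklyr release that ships a Scala 2.12 shell jar, declaring the
# cluster's Spark version lets sparklyr select the matching jar instead of
# defaulting to the 2.11 build. "3.0.0" is a placeholder for EMR 6's Spark.
sc <- spark_connect(
  master = "yarn",
  spark_home = "/usr/lib/spark6",
  version = "3.0.0"
)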
(quoted error fragment: "... 'GOOGLE_CITY_DESC' given input columns: ...")

compute() forces the SQL query you have accumulated so far to be evaluated, so that might help.
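A small sketch of that suggestion, with a hypothetical table name: compute() materializes the lazily accumulated SQL into a Spark table, so an analysis error such as the unresolved GOOGLE_CITY_DESC column surfaces at that point rather than only at collect():

library(dplyr)

# Force evaluation of the lazy pipeline; if GOOGLE_CITY_DESC cannot be
# resolved, the error is raised here. `my_tbl` is a placeholder for the
# Spark table the pipeline started from.
checked <- my_tbl %>%
  select(GOOGLE_CITY_DESC) %>%
  compute(name = "checked_tmp")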
@njesp You can try the following to print spark-submit log to console to see what's failing:
library(sparklyr)
options(sparklyr.log.console = TRUE)
sc <- spark_connect(master = "local")
The spark-submit log usually ends up in a text file, but the path to that file is highly system-dependent and can also be influenced by your local config. Rather than spending time figuring out where it might be, it's easier to set options(sparklyr.log.console = TRUE) while troubleshooting.
@yl790 Error in spark_connect_gateway(gatewayAddress, gatewayPort, sessionId, :
Gateway in localhost:8880 did not respond.
:: resolution report :: resolve 84419ms :: artifacts dl 0ms
:: modules in use:
---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
module not found: saurfang#spark-sas7bdat;1.1.5-s_2.11
==== local-m2-cache: tried
file:/C:/Users/njn/.m2/repository/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.pom
-- artifact saurfang#spark-sas7bdat;1.1.5-s_2.11!spark-sas7bdat.jar:
file:/C:/Users/njn/.m2/repository/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.jar
==== local-ivy-cache: tried
C:\Users\njn\.ivy2\local\saurfang\spark-sas7bdat\1.1.5-s_2.11\ivys\ivy.xml
-- artifact saurfang#spark-sas7bdat;1.1.5-s_2.11!spark-sas7bdat.jar:
C:\Users\njn\.ivy2\local\saurfang\spark-sas7bdat\1.1.5-s_2.11\jars\spark-sas7bdat.jar
==== central: tried
https://repo1.maven.org/maven2/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.pom
-- artifact saurfang#spark-sas7bdat;1.1.5-s_2.11!spark-sas7bdat.jar:
https://repo1.maven.org/maven2/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.jar
==== spark-packages: tried
https://dl.bintray.com/spark-packages/maven/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.pom
-- artifact saurfang#spark-sas7bdat;1.1.5-s_2.11!spark-sas7bdat.jar:
https://dl.bintray.com/spark-packages/maven/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.jar
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: saurfang#spark-sas7bdat;1.1.5-s_2.11: not found
::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
Server access error at url https://repo1.maven.org/maven2/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.pom (java.net.ConnectException: Connection timed out: connect)
Server access error at url https://repo1.maven.org/maven2/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.jar (java.net.ConnectException: Connection timed out: connect)
Server access error at url https://dl.bintray.com/spark-packages/maven/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.pom (java.net.ConnectException: Connection timed out: connect)
Server access error at url https://dl.bintray.com/spark-packages/maven/saurfang/spark-sas7bdat/1.1.5-s_2.11/spark-sas7bdat-1.1.5-s_2.11.jar (java.net.ConnectException: Connection timed out: connect)
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: saurfang#spark-sas7bdat;1.1.5-s_2.11: not found]
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1302)
at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:304)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:774)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSub
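All four download attempts time out, so this looks like the machine cannot reach Maven Central or the spark-packages repository at all (a proxy/firewall problem rather than a bad coordinate). One hedged workaround, assuming the jar can be fetched some other way, is to skip Ivy resolution and hand sparklyr the local file directly; the path below is an example:

library(sparklyr)

# Supply an already-downloaded jar instead of a package coordinate.
# spark-sas7bdat also depends on the parso library, whose jar would need
# to be downloaded and listed here as well.
conf <- spark_config()
conf$sparklyr.jars.default <- "C:/spark-jars/spark-sas7bdat-1.1.5-s_2.11.jar"
sc <- spark_connect(master = "local", config = conf)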
Hi, I tried to connect via sparklyr to our Spark cluster in yarn-cluster mode, but the connection fails after 30 seconds. Looking at the logs, I see the following behaviour. Everything looks quite normal until the application starts:
20/07/24 13:21:01 INFO Client: Submitting application application_1595494790876_0069 to ResourceManager
20/07/24 13:21:01 INFO YarnClientImpl: Submitted application application_1595494790876_0069
20/07/24 13:21:02 INFO Client: Application report for application_1595494790876_0069 (state: ACCEPTED)
20/07/24 13:21:02 INFO Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1595596856971
final status: UNDEFINED
tracking URL: http://lm:8088/proxy/application_1595494790876_0069/
user: hadoop
20/07/24 13:21:03 INFO Client: Application report for application_1595494790876_0069 (state: ACCEPTED)
20/07/24 13:21:04 INFO Client: Application report for application_1595494790876_0069 (state: ACCEPTED)
The last message then repeats continuously. After a while, the following warning shows up in the logs:
20/07/24 13:22:03 WARN sparklyr: Gateway (35459) Failed to get network interface of gateway server socketnull
Any idea what could go wrong? I guess a lot of things, especially since we are in a quite restricted network; it was already quite a pain to reach this point. The client only sees the workers' port 9868. I have now also opened port 8880, since I thought maybe the sparklyr gateway on the node tries to communicate with the client and fails, but this didn't change anything.
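In case it helps, a speculative sketch of the knobs involved: sparklyr exposes gateway settings through spark_config(), and in a locked-down network it is sometimes necessary to pin the address and port the client should call back on (the hostname below is a placeholder):

library(sparklyr)

# Pin the gateway endpoint the client connects to, useful when firewalls
# or NAT hide the default route. The hostname stands in for whichever node
# actually runs the sparklyr gateway.
conf <- spark_config()
conf$sparklyr.gateway.address <- "driver-node.example.com"
conf$sparklyr.gateway.port <- 8880
sc <- spark_connect(master = "yarn-cluster", config = conf)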