Boaz Mohar
@mellertd As far as I understand, YARN mode differs only in how you deploy your Spark cluster; at Janelia it runs in standalone mode. As far as Thunder is concerned, both are the same. You would need a Jupyter notebook and a way to get a Spark context in it, which might be different than how it is done at Janelia. Is this the part you need help with?
I guess the question had more to do with the virtual environment issue. I recall at Janelia, Thunder and its dependencies were installed on all the nodes, so you could launch standalone clusters and everything worked. For various reasons, we don't want to maintain installs on all the nodes and would rather let users manage their own environments. This is possible with YARN mode, but it is rather clunky and is not officially supported in interactive mode (i.e. in Jupyter).
I am just wondering if anyone had any experience with this, because there seem to be many degrees of freedom to get things working well
Boaz Mohar
I have used a virtual environment with Thunder. You should change the environment variable PYSPARK_PYTHON, and both the driver and the workers will see the same Python virtual environment.
It is actually much more complicated than that.
This is what we are currently trying:
Hm, this chat is unusable on iOS Safari. I'll paste a link when I can get to my laptop.
Our tests seem to work with virtualenv, but it is very slow. Haven't gotten it to work with Conda yet, but it should not work any differently. I am currently trying to figure out how we might speed things up
Boaz Mohar
I am definitely not an expert here, and have not used any of these with spark-submit. But for interactive mode I have used this code to make sure I am in the right environment for the driver and workers:
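import numpy

def test1(x):
    # Runs on a worker: report which numpy installation that worker imports.
    import numpy
    return numpy.__file__

# sc is the notebook's existing SparkContext; compare these paths with numpy.__file__ on the driver.
data = sc.range(10).map(test1).collect()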
Setting PYSPARK_PYTHON worked by pointing it to the virtual environment from conda:
export PYSPARK_PYTHON=/groups/svoboda/home/moharb/anaconda2/envs/py35/bin/python
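A minimal sketch of how the check above could be turned into a pass/fail comparison. The helper names `interpreter_report` and `all_match` are hypothetical, not part of Thunder or Spark. The block runs without a cluster; the commented line shows how it might be wired to a SparkContext `sc`, assuming one already exists in the notebook as in the snippet above.

```python
import sys
from collections import Counter

def interpreter_report(_):
    # Intended to run inside a Spark task: return the Python executable in use there.
    import sys
    return sys.executable

def all_match(driver_value, worker_values):
    """Return (ok, counts); ok is True when every worker value equals the driver's."""
    counts = Counter(worker_values)
    return set(counts) <= {driver_value}, dict(counts)

# Local stand-in for the cluster call. With a real context (assumption) this would be:
#   worker_values = sc.range(10).map(interpreter_report).collect()
worker_values = [interpreter_report(i) for i in range(3)]

ok, counts = all_match(sys.executable, worker_values)
print(ok)      # True here, because driver and "workers" are the same process
print(counts)  # how many tasks reported each interpreter path
```

If `ok` comes back False on a real cluster, the keys of `counts` show which interpreters the workers actually picked up, which is usually enough to tell whether PYSPARK_PYTHON reached them.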
As we all know, Spark supports two kinds of operations: transformations and actions. So how can I tell whether an API in Thunder is a transformation or an action? @boazmohar
Vimmal Sivakumar
After installing Thunder, what do I have to do in order to start interacting with the software?


We are trying to connect to Hive tables from Python on Windows, but we are running into issues while connecting. Our full details are below; please help.
We installed Python version 2.7.15 and Anaconda version 2.7.14.
We installed the packages below:

pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive

We wrote the code below to connect to Hive tables from a Python script:

from pyhive import hive
conn = hive.Connection(host="", port=10000, username="mapr", database="default")
cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
for result in cursor.fetchall():
    print(result)

We are getting the error below:


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mapr\Anaconda2\lib\site-packages\pyhive\hive.py", line 64, in connect
    return Connection(*args, **kwargs)
  File "C:\Users\mapr\Anaconda2\lib\site-packages\pyhive\hive.py", line 162, in __init__
  File "C:\Users\mapr\Anaconda2\lib\site-packages\thrift_sasl\__init__.py", line 79, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2

Please help with this issue if anybody knows a fix.


Hi all, when trying to run img_mean = data.seriesMean().pack() from one of the examples, I got the following error: AttributeError: 'Series' object has no attribute 'seriesMean'. Is there any updated documentation for Thunder?


Amir Bahmanyari
Hi all, I am new to Thunder. I tried to run the tutorial in a Jupyter Notebook. When I execute series = td.series.fromexample('fish') in a cell, I get this error: dtype not specified either in conf.json or as argument. I skipped the intermediate errors. I checked readers.py under both series and the parent folder and found nothing suspicious; I expected this step to run smoothly. I would appreciate any hint. Thanks and regards.
mede quip

Thanks for this discussion. Whenever I have a doubt, I search this thread for a solution, and I have found some helpful tips. Regards.

