Davis Bennett
the tricks apply if you want to do registration outside of the registration stuff built in to thunder
so that's the second time you've mentioned using the functions outside of what is built in to thunder. Is thunder not quite being supported the same as it was 6 months-1 year ago? I've noticed the commits on the repo have basically stopped since a year or so ago.
Davis Bennett
i think development has slowed down, but even if it was under active development I find it's much easier and more flexible to use thunder for applying functions I wrote instead of using functions that are built in
in my case, I want to use a registration method that isn't built in to thunder so I don't have a choice
@d-v-b So I went back to running it on Spark again and it still seems to be slightly faster to load the multi-page tiffs than the individual tiffs. I'm still loading the files from a standard file system since Thunder can't load tiffs from HDFS. I imagine reading from HDFS might be quicker, but even so the multi-page tiffs come out ahead.
Davis Bennett
i don't understand how reading from a single file can be faster than reading from multiple files in a parallel context, but whatever works!
It's a difference of 3-7 seconds or so from my testing, so not much, but it is faster in both cases on Spark than it is in local mode.
Gilles Vanwalleghem
Hello, I was wondering if anyone has any idea how to produce this kind of plot: http://research.janelia.org/zebrafish/trajectories.html
As far as I see from the paper, that's supposed to be the online methods, but it's a bit sparse in details
Hello all, I am new to thunder and have a quick question. Is thunder an appropriate tool for large-scale multivariable clustering on a mixture of spatial and temporal data? Here is what my data looks like: ['lable', 'Origin [lat, long]', 'Destination [lat, long]', 'distance', 'departure_time', 'total_travel_time', 'arrival_time',
Hello everyone. My organization is trying to get Thunder going on a Yarn cluster using virtual environments. Has anyone tried this? It seems to be a little outside of the norm from what I have found, and it is certainly different from how things were done at Janelia when I was there. I am wondering whether this is an okay idea that we should continue to pursue
Boaz Mohar
@mellertd As far as I understand, yarn mode differs in how you deploy your Spark cluster; at Janelia it is in standalone mode. As far as thunder is concerned, they are both the same. You would need a Jupyter notebook and a way to get a Spark context in it, which might be different than how it is done at Janelia. Is this the part you need help with?
I guess the question had more to do with the virtual environment issue. I recall at Janelia, Thunder and dependencies were installed on all the nodes, so you could launch standalone clusters and everything worked. For various reasons, we don’t want to maintain installs on all the nodes and would rather let users manage their own environments. This is possible with yarn mode, but it is rather clunky and is not officially supported in interactive mode (i.e. in Jupyter)
I am just wondering if anyone had any experience with this, because there seem to be many degrees of freedom to get things working well
Boaz Mohar
I have used a virtual environment with thunder. You should change the environment variable PYSPARK_PYTHON so that both the driver and the workers see the same Python virtual environment.
It is actually much more complicated than that
This is what we are currently trying:
Hm, this chat is unusable on iOS Safari, I’ll paste a link when I can get to my laptop
Our tests seem to work with virtualenv, but it is very slow. Haven't gotten it to work with Conda yet, but it should not work any differently. I am currently trying to figure out how we might speed things up
Boaz Mohar
I am definitely not an expert here, and have not used any of these with spark-submit. But for interactive mode I have used this code to make sure I am in the right environment for the driver and workers:
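import numpy
def test1(x):
    # runs on a worker: report where that worker's numpy was imported from
    import numpy
    return numpy.__file__
# sc is the existing SparkContext; compare the results with numpy.__file__ on the driver
data = sc.range(10).map(test1).collect()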
and PYSPARK_PYTHON worked by pointing it to the virtual environment from conda: export PYSPARK_PYTHON=/groups/svoboda/home/moharb/anaconda2/envs/py35/bin/python
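The check described above can be sketched a little more generally. This is only an illustration, not thunder or Spark functionality: `interpreter_info` and `same_interpreter` are hypothetical helper names, and `sc` is assumed to be an already running SparkContext. Without a cluster, the driver's own report stands in for the workers so the sketch still runs.

```python
import os
import sys


def interpreter_info(_=None):
    """Report which Python interpreter this process is running.

    Takes one ignored argument so it can be passed straight to RDD.map.
    """
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # May be None: the variable is not guaranteed to be set in every process.
        "pyspark_python": os.environ.get("PYSPARK_PYTHON"),
    }


def same_interpreter(driver, workers):
    """Return True if every worker report names the driver's executable."""
    return all(w["executable"] == driver["executable"] for w in workers)


driver_info = interpreter_info()

# With a live SparkContext `sc` (assumed, not created here) the worker
# reports would be gathered like this:
#     worker_infos = sc.range(10).map(interpreter_info).collect()
# Without a cluster, call the helper locally as a stand-in.
worker_infos = [interpreter_info(i) for i in range(3)]

print(same_interpreter(driver_info, worker_infos))
```

Comparing executable paths assumes the driver and workers reach the environment through the same path (for example a shared file system); if they do not, comparing `prefix` or a package's `__file__`, as in the snippet above, may be more informative.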
As we all know, Spark supports two kinds of operations, transformations and actions. So how can I know whether an API in thunder is a transformation or an action? @boazmohar
Vimmal Sivakumar
After installing thunder, what do I have to do in order to start interacting with the software?


We are trying to connect to Hive tables from Python on Windows, and we are facing issues while connecting. We are sending our full details below; please help with this.
We installed Python version 2.7.15 and Anaconda version 2.7.14.
We installed the packages below:

pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive

We wrote the code below to connect to Hive tables from a Python script:

from pyhive import hive
conn = hive.Connection(host="", port=10000, username="mapr", database="default")
cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
for result in cursor.fetchall():

We are getting the error below:


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mapr\Anaconda2\lib\site-packages\pyhive\hive.py", line 64, in connect
    return Connection(*args, **kwargs)
  File "C:\Users\mapr\Anaconda2\lib\site-packages\pyhive\hive.py", line 162, in __init__
  File "C:\Users\mapr\Anaconda2\lib\site-packages\thrift_sasl\__init__.py", line 79, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2

Please help with this issue if anybody knows.


Hi all. When trying to run img_mean = data.seriesMean().pack() from one of the examples, I got the following error: AttributeError: 'Series' object has no attribute 'seriesMean'. Is there any updated documentation for thunder?