Davis Bennett
@d-v-b
so where is the problem?
Kyle
@kr-hansen
And I'm trying to do this prior to motion correction, so I'm interested in doing the frame by frame removal
So I can create a gaussian filtered array (4000, 1024, 1024) and I have my original array (4000, 1024, 1024)
But if I want to do original-gaussian, in a frame by frame manner, it doesn't work
I can do as you suggest, and take a single composite frame, to subtract from my original array, and that works ok.
Davis Bennett
@d-v-b
is your gaussian filter 3D or 2D?
Kyle
@kr-hansen
2D
2D spatial gaussian filter
Davis Bennett
@d-v-b
ok so you don't ever need a big gaussian filtered array
you need to gaussian filter each image independently
by using something like images.map(lambda v: v - gaussian_filter(v))
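A minimal sketch of the per-frame idea above, written against plain NumPy and SciPy rather than Thunder. The helper name `subtract_background`, the `sigma` values, and the small random stand-in movie are assumptions for illustration, not something taken from this conversation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def subtract_background(frame, sigma=20.0):
    """Subtract a 2D Gaussian-blurred copy of one frame from that frame."""
    # Work in floating point so the subtraction cannot wrap around
    # for unsigned integer image data.
    frame = frame.astype(np.float64)
    return frame - gaussian_filter(frame, sigma=sigma)


# Small stand-in for a (4000, 1024, 1024) movie (assumed shape: time, y, x).
rng = np.random.default_rng(0)
movie = rng.random((5, 64, 64))

# Each frame is filtered on its own, so no full-size blurred array is kept.
corrected = np.stack([subtract_background(f, sigma=5.0) for f in movie])
```

With a Thunder images object, the same function could presumably be passed along the lines of `images.map(subtract_background)`, as suggested above.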
Kyle
@kr-hansen
Ok, I see what you're suggesting
Maybe I'll give that a try again.
I'm taking the processing pipeline we've been using in matlab in our lab and trying to convert it over to a Python mode, hence why I've been trying to use Thunder. The frame-by-frame portion was getting me stuck in Spark, so I shifted to try working only in local mode again, but this might still work in Spark mode.
Thanks. I'll try it out and report back for the benefit of others.
Davis Bennett
@d-v-b
no problem, and let me know if you have issues with image registration, that requires a few tricks
Kyle
@kr-hansen
@d-v-b I'll let you know if I have issues. When you talk about a few tricks, what do you mean? I was doing the registration about 6 months ago by taking a mean image as the reference to send to all the executors which seemed to work ok. Are you able to save the motion correction models as .jsons now? I was previously doing it with pickled files.
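For readers unfamiliar with registering frames against a reference image, here is a rough sketch under strong assumptions: rigid, whole-pixel translation with circular boundaries, estimated by FFT cross-correlation. The names `estimate_shift` and `align` are hypothetical, and this is not Thunder's built-in registration.

```python
import numpy as np


def estimate_shift(reference, frame):
    """Estimate the integer (row, col) roll that aligns `frame` to `reference`."""
    # Circular cross-correlation computed in the Fourier domain.
    cross = np.fft.ifft2(np.fft.fft2(reference) * np.conj(np.fft.fft2(frame)))
    peak = np.unravel_index(np.argmax(np.abs(cross)), cross.shape)
    # Peak positions past the midpoint correspond to negative shifts.
    return tuple(
        int(p) if p <= s // 2 else int(p) - s
        for p, s in zip(peak, cross.shape)
    )


def align(reference, frame):
    """Roll `frame` by the estimated shift so it lines up with `reference`."""
    return np.roll(frame, estimate_shift(reference, frame), axis=(0, 1))


# Synthetic check: displace a random image by a known amount and recover it.
rng = np.random.default_rng(0)
reference = rng.random((32, 32))
moved = np.roll(reference, (3, -2), axis=(0, 1))

shift = estimate_shift(reference, moved)
aligned = align(reference, moved)
```

In the workflow described above, `reference` would be something like the mean image of the movie (for example `movie.mean(axis=0)`), computed once and sent to every executor.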
Davis Bennett
@d-v-b
the tricks apply if you want to do registration outside of the registration stuff built in to thunder
Kyle
@kr-hansen
so that's the second time you've mentioned using the functions outside of what is built in to thunder. Is thunder not quite being supported the same as it was 6 months-1 year ago? I've noticed the commits on the repo have basically stopped since a year or so ago.
Davis Bennett
@d-v-b
i think development has slowed down, but even if it was under active development I find it's much easier and flexible to use thunder for applying functions I wrote instead of using functions that are built in
in my case, I want to use a registration method that isn't built in to thunder so I don't have a choice
Kyle
@kr-hansen
@d-v-b So I went back to running it on Spark again and it seems to still be slightly faster to load the multi-page tiffs than the individual tiffs. I'm still loading the files from a standard file system since Thunder can't load tiffs from an hdfs. I imagine reading from the hdfs it might be quicker, but it seems like the multipage tiffs are still quicker.
Davis Bennett
@d-v-b
i don't understand how reading from a single file can be faster than reading from multiple files in a parallel context, but whatever works!
Kyle
@kr-hansen
It's a difference of 3-7 seconds or so from my testing, so not much, but it is faster in both cases on Spark than it is in local mode.
Gilles Vanwalleghem
@Yassum
Hello, so I was wondering if anyone has any idea how to produce these kind of plots : http://research.janelia.org/zebrafish/trajectories.html
As far as I see from the paper, that's supposed to be the online methods, but it's a bit sparse in details
mshahabi
@mshahabi
Hello all, I am new to Thunder and have a quick question: is Thunder an appropriate tool for large-scale multivariable clustering on a mixture of spatial and temporal data? Here is what my data looks like: ['lable', 'Origin [lat, long]', 'Destination [lat, long]', 'distance', 'departure_time', 'total_travel_time', 'arrival_time']
mellertd
@mellertd
Hello everyone. My organization is trying to get Thunder going on a Yarn cluster using virtual environments. Has anyone tried this? It seems to be a little outside of the norm from what I have found, and it is certainly different from how things were done at Janelia when I was there. I am wondering whether this is an okay idea that we should continue to pursue
Boaz Mohar
@boazmohar
@mellertd As far as I understand, YARN mode only changes how you deploy your Spark cluster; at Janelia it runs in standalone mode. As far as Thunder is concerned, they are both the same. You would need a Jupyter notebook and a way to get a Spark context in it, which might be different than how it is done at Janelia. Is this the part you need help with?
mellertd
@mellertd
I guess the question had more to do with the virtual environment issue. I recall at Janelia, Thunder and dependencies were installed on all the nodes, so you could launch standalone clusters and everything worked. For various reasons, we don’t want to maintain installs on all the nodes and would rather let users manage their own environments. This is possible with yarn mode, but it is rather clunky and is not officially supported in interactive mode (i.e. in Jupyter)
I am just wondering if anyone had any experience with this, because there seem to be many degrees of freedom to get things working well
Boaz Mohar
@boazmohar
I ah
I have used a virtual environment with Thunder. You should set the environment variable PYSPARK_PYTHON so that both the driver and the workers see the same Python virtual environment.
mellertd
@mellertd
It is actually much more
Complicated than that
This is what we are currently trying:
Hm, this chat is unusable on iOS Safari; I’ll paste a link when I can get to my laptop
Our tests seem to work with virtualenv, but it is very slow. Haven't gotten it to work with Conda yet, but it should not work any differently. I am currently trying to figure out how we might speed things up
Boaz Mohar
@boazmohar
I am definitely not an expert here, and have not used any of these with spark-submit. But for interactive mode I have used this code to make sure I am in the right environment for the driver and workers:
import numpy
print(numpy.__file__)
def test1(x):
    import numpy
    return numpy.__file__
data = sc.range(10).map(test1).collect()
print(data[0])
and PYSPARK_PYTHON worked by pointing it to the virtual environment from conda: export PYSPARK_PYTHON=/groups/svoboda/home/moharb/anaconda2/envs/py35/bin/python
Sid-Sloth
@Sid-Sloth
As we all know, Spark supports two kinds of operations: transformations and actions. How can I tell whether a given API in Thunder is a transformation or an action? @boazmohar
Vimmal Sivakumar
@bluefoo19_twitter
after installing thunder what do i have to do in order to start interacting with the software
srinu989
@srinu989

Hi

We are trying to connect to Hive tables from Python on Windows, but we are running into issues while connecting. Our full details are below; please help.
We installed Python version 2.7.15 and Anaconda version 2.7.14.
We installed the packages below:

pip install sasl
pip install thrift
pip install thrift-sasl
pip install PyHive

We wrote the code below to connect to Hive tables from a Python script:

from pyhive import hive
conn = hive.Connection(host="172.16.17.196", port=10000, username="mapr", database="default")
cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
for result in cursor.fetchall():
    use_result(result)

We are getting the error below:

==========================

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mapr\Anaconda2\lib\site-packages\pyhive\hive.py", line 64, in connect
    return Connection(*args, **kwargs)
  File "C:\Users\mapr\Anaconda2\lib\site-packages\pyhive\hive.py", line 162, in __init__
    self._transport.open()
  File "C:\Users\mapr\Anaconda2\lib\site-packages\thrift_sasl\__init__.py", line 79, in open
    message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2

srinu989
@srinu989
Please help with this issue, if anybody knows.
Eco_Econ_Heartbeat
@Heart_Beacon_twitter

China, #Russia, #Germany setting up their own #SWIFT #financial #transaction networks.. what could go wrong? Stochastic Harmonization over the proposed UTZ Universal Time Zone? Common #OPSCODE syntax lexicon #Rosetta Stone? https://www.activistpost.com/2017/04/russia-china-preparing-alternative-banking-architecture.html

anciubo
@anciubo
Hi all, when trying to run img_mean = data.seriesMean().pack() from one of the examples, I got the following error: AttributeError: 'Series' object has no attribute 'seriesMean'. Is there any updated documentation for Thunder?
Thanks