These are chat archives for thunder-project/thunder

9th
May 2016
Joseph Winston
@josephwinston
May 09 2016 13:24
Attempting the tutorial but I am having problems loading data. Installed via pip install thunder-python.
import thunder as td
series = td.series.fromexample('fish')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-bdfc2b8d9957> in <module>()
----> 1 series = td.series.fromexample('fish')

C:\Users\hb55683\AppData\Local\Continuum\Anaconda2\lib\site-packages\thunder\series\readers.pyc in fromexample(name, engine)
    423             if not key.name.endswith('/'):
    424                 key.get_contents_to_filename(os.path.join(d, key.name))
--> 425         data = frombinary(os.path.join(d, 'series', name), engine=engine)
    426 
    427         if spark and isinstance(engine, spark):

C:\Users\hb55683\AppData\Local\Continuum\Anaconda2\lib\site-packages\thunder\series\readers.pyc in frombinary(path, ext, conf, dtype, shape, skip, index, labels, engine, credentials)
    277         Credentials for remote storage (e.g. S3) in the form {access: ***, secret: ***}
    278     """
--> 279     shape, dtype = _binaryconfig(path, conf, dtype, shape, credentials)
    280 
    281     from thunder.readers import normalize_scheme, get_parallel_reader

C:\Users\hb55683\AppData\Local\Continuum\Anaconda2\lib\site-packages\thunder\series\readers.pyc in _binaryconfig(path, conf, dtype, shape, credentials)
    350 
    351     if 'dtype' not in params.keys():
--> 352         raise ValueError('dtype not specified either in conf.json or as argument')
    353 
    354     if 'shape' not in params.keys():

ValueError: dtype not specified either in conf.json or as argument
Kyle
@kr-hansen
May 09 2016 15:29
@josephwinston What OS/specs are you using? Are you on Windows? I also get a similar issue when I run this command from my command prompt on my local Windows machine. However, I don't have the issue running on my linux-based cluster.
However, I can still load my own files using thunder on my own local machine
I'm guessing it is some difference with how Windows would access the example files hosted on the server, that isn't an issue with Linux
Jeremy Freeman
@freeman-lab
May 09 2016 15:30
@josephwinston @kkcthans thanks both, fairly certain this is a windows-specific issue involving local file reading
the way we do some of the example data is to download it locally and then load it from the local filesystem
i'm not very familiar with windows and it's probably a fairly minor glitch
when i next get access to a windows machine i can try to debug, and if anyone can figure it out in the meantime PRs definitely welcome!
Kyle
@kr-hansen
May 09 2016 16:39

@freeman-lab @josephwinston I played around with it a little and figured out a few things. I'm not familiar enough with building from source on Windows to try implementing fixes myself through a pull request, but I found some workarounds that should work for now for you @josephwinston, or for anyone else with this issue on Windows:

  1. For loading the series examples that you tried, it is trying to load binary files. For loading binary files on Windows, you need to explicitly provide the full path as well as the data type and shape. For your reference, the load commands for these for 'fish' would be:
    fishseries = td.series.frombinary('s3n://thunder-sample-data/series/fish/*.bin', dtype='uint8', shape=(76,87,2,20))
    and for 'mouse':
    mouseseries = td.series.frombinary('s3n://thunder-sample-data/series/mouse/*.bin', dtype='int16', shape=(64,64,20))
    From what I can tell, in the _binaryconfig function of thunder/series/readers.py, Windows can't properly find or open the conf.json file. However, when you provide the dtype and shape explicitly, so that it doesn't need to read them from conf.json, you can still load the files correctly (based on my limited testing, which was viewing several traces).

  2. For loading the images examples, I was better able to trace the issue. For your reference, to load 'fish' use:
    fishimgs = td.images.fromtif('s3n://thunder-sample-data/images/fish/*.tif')
    and for 'mouse':
    I couldn't figure this one out.
    The fish example would be a pretty easy fix. In thunder/readers.py the function addextension uses os.path.sep to append the system separator and extension to find .tif files. The issue occurs because the Windows path separator \ differs from the / separator used in the path provided for the examples. Python on Windows is smart enough to take either \ or /, but it doesn't do well when the two are mixed in one path. My suggestion is to force the added separator to be /, which I think Python should be able to handle on Windows or any other system. However, I'm not certain of that.
    The mouse example loads the images from binary, and when I input the dtype='int16' and shape=(20, 64, 64) as in loading the binary series above, Windows still wasn't able to load it. The error was something about buffering. I'm not very familiar with working with binary files, so I had a harder time debugging what was going on when trying to load them.
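
To make the separator point in item 2 concrete, here is a minimal sketch. It is not thunder's actual addextension: the helper add_extension, its sep parameter, and the path C:/data/images/fish are all made up for illustration. It uses ntpath.sep to stand in for what os.path.sep would be on Windows, so it behaves the same on any OS.

```python
import ntpath

def add_extension(path, ext, sep):
    # Hypothetical stand-in for an "append wildcard + extension" helper.
    # `sep` is passed in explicitly so both behaviours can be compared.
    if not path.endswith(sep):
        path += sep
    return path + "*." + ext

# A forward-slash path, like the ones used for the example data (made-up location).
base = "C:/data/images/fish"

# Appending the Windows separator produces a path that mixes / and \.
mixed = add_extension(base, "tif", ntpath.sep)

# Forcing "/" keeps the separators consistent regardless of platform.
forced = add_extension(base, "tif", "/")

print(mixed)
print(forced)
```

Whether forcing / is enough for the real readers on Windows is untested here; the sketch only shows where the mixed separators come from.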

@josephwinston I hope that is enough to get you through all the examples except the 'mouse' ones. Note, however, that this is an issue with how thunder loads the example data on Windows. I have had fewer issues working with my own data directly this way, because it is easier to control how the paths are handled, which is where the problem lies in this case.
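
For the series workaround in item 1, the sketch below shows roughly why passing dtype and shape explicitly avoids the error in the traceback. It is an assumption-laden illustration, not thunder's code: resolve_binary_params is a hypothetical helper, the rule that explicit arguments override the file is assumed, and the wording of the shape error is invented. Only the dtype error message is taken from the traceback.

```python
import json
import os
import tempfile

def resolve_binary_params(directory, dtype=None, shape=None, conf="conf.json"):
    # Hypothetical helper: combine an optional config file with explicit arguments.
    params = {}
    conf_path = os.path.join(directory, conf)
    if os.path.isfile(conf_path):
        with open(conf_path) as f:
            params.update(json.load(f))
    # Assumption: explicit arguments take precedence over the config file.
    if dtype is not None:
        params["dtype"] = dtype
    if shape is not None:
        params["shape"] = shape
    if "dtype" not in params:
        # Same message as in the traceback above.
        raise ValueError("dtype not specified either in conf.json or as argument")
    if "shape" not in params:
        # Message wording assumed; the traceback does not show it.
        raise ValueError("shape not specified either in conf.json or as argument")
    return tuple(params["shape"]), params["dtype"]

# A directory with no conf.json, mimicking the case where the file cannot be found.
empty_dir = tempfile.mkdtemp()

# With explicit values, the missing conf.json no longer matters.
shape, dtype = resolve_binary_params(empty_dir, dtype="uint8", shape=(76, 87, 2, 20))
print(shape, dtype)
```

Calling resolve_binary_params(empty_dir) with no arguments raises the same ValueError as in the original report, which matches the idea that the config file is not being found.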