These are chat archives for dereneaton/ipyrad

20th
Feb 2018
Glib Mazepa
@mazepago_twitter
Feb 20 2018 09:28

Thanks for the suggestions, @isaacovercast! Yet another question: I am trying to run >390 samples and keep running out of quota on the cluster at step #6 (the *_across folder is > 900 GB). Is there any way to estimate the approximate size of the output (or of the .tmp files)? I would like to avoid sacrificing samples... Also, as far as I understand, I cannot run step #6 on subsets of the samples and merge them afterwards... From the report below it seems this step was quite close to finishing, wasn't it?

Step 6: Clustering at 0.9 similarity across 396 samples
[####################] 100% concat/shuffle input | 0:02:02
[####################] 100% clustering across | 2:01:53
[####################] 100% building clusters | 0:04:15
[####################] 100% aligning clusters | 0:07:25
[####################] 100% database indels | 0:09:02
[################### ] 97% indexing clusters | 0:31:54
Traceback (most recent call last):
  File "/home/gmazepa/miniconda2/lib/python2.7/logging/__init__.py", line 885, in emit
    self.flush()
  File "/home/gmazepa/miniconda2/lib/python2.7/logging/__init__.py", line 845, in flush
    self.stream.flush()
IOError: [Errno 122] Disk quota exceeded

Isaac Overcast
@isaacovercast
Feb 20 2018 15:34
@mazepago_twitter Well, that's a pretty big dataset. You could try cleaning up some of the files from earlier steps to free up space. What's weird is that it's dying during indexing, which shouldn't be consuming that much disk. It's also weird that it's dying on flushing the log file, which again shouldn't be a super disk-heavy activity. I assume this is on a cluster? Sometimes clusters have different home and scratch drives. I've seen setups where the home folders have highly restricted quotas but the scratch drives are unlimited. It could be that you're just filling up your home directory?
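One way to check which directory is actually filling up is to sum the file sizes under it and compare that with the free space on the filesystem it lives on. The following is a minimal Python sketch, not part of ipyrad; the helper names `dir_size_bytes` and `report` are made up for illustration. Note that `shutil.disk_usage` reports free space for the whole filesystem, not a per-user quota, so a quota can still be exceeded while it shows plenty of free space.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # file disappeared or is unreadable; ignore it
    return total


def report(path):
    """Return a one-line summary of directory size and filesystem free space."""
    gib = 1024 ** 3
    used = dir_size_bytes(path)
    free = shutil.disk_usage(path).free  # filesystem-level, not quota-aware
    return "{}: {:.1f} GiB in files, {:.1f} GiB free on filesystem".format(
        path, used / gib, free / gib
    )


if __name__ == "__main__":
    # Hypothetical usage: point this at $HOME and at the *_across directory.
    print(report(os.path.expanduser("~")))
```

Running it once on the home directory and once on the assembly's output directory should show which of the two is close to its limit.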
Glib Mazepa
@mazepago_twitter
Feb 20 2018 19:02

@isaacovercast, I tried to rerun step #6 after cleaning the $HOME directory, and here is the current quota report for $HOME:

Disk quotas for user gmazepa (uid 25067):
     Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
nfsserv4.infiniband.vital-it.ch:/exports/home
                  4885M   5120M   5632M           64991       0       0

But at this stage: [################### ] 95% indexing clusters | 0:29:01

it crashes and an error is reported:
ipyrad [v.0.7.22]

Interactive assembly and analysis of RAD-seq data

Begin run: 2018-02-20 16:32
Using args {'preview': False, 'force': False, 'threads': 2, 'results': False, 'quiet': False, 'merge': None, 'ipcluster': None, 'cores': 40, 'params': 'params-18_Feb_bash_90_steps12345.txt', 'branch': None, 'steps': '6', 'debug': False, 'new': None, 'download': None, 'MPI': True}
Platform info: ('Linux', 'dee-serv06.vital-it.ch', '2.6.32-696.18.7.el6.x86_64', '#1 SMP Thu Jan 4 17:31:22 UTC 2018', 'x86_64')
2018-02-20 19:18:25,095 pid=7912 [assembly.py] ERROR IOError(Driver write request failed (File write failed: time = tue feb 20 19:18:24 2018
, filename = '/stn4/ul/monthly/mazepago/ipyrad_18_feb_on_bash/18_feb_bash_90_steps12345_across/p_bedriagae_jordan_gmj_01_56_unassembledr2.tmp.h5', file descriptor = 141, errno = 122, error message = 'disk quota exceeded', buf = 0x7ffe09dbd520, total write size = 96, bytes this sub-write = 96, bytes actually written = 18446744073709551615, offset = 0))
2018-02-20 19:18:25,509 pid=7912 [assembly.py] ERROR shutdown warning: [Errno 122] Disk quota exceeded
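The $HOME quota report above can be turned into headroom figures with a little arithmetic. This is only a sketch using the three block values from that report (4885M used, 5120M soft quota, 5632M hard limit); the function name `parse_size_mib` is made up for illustration and handles only the `M` suffix. Keep in mind that the failed write in the log was to a path under /stn4/, which may be governed by a separate quota from /exports/home.

```python
def parse_size_mib(text):
    """Parse a value such as '4885M' into an integer number of MiB."""
    if not text.endswith("M"):
        raise ValueError("expected a value ending in 'M': %r" % text)
    return int(text[:-1])


# Values copied from the quota report for /exports/home.
used = parse_size_mib("4885M")
soft = parse_size_mib("5120M")
hard = parse_size_mib("5632M")

pct_of_soft = 100.0 * used / soft   # share of the soft quota already used
headroom_soft = soft - used         # MiB left before the soft quota
headroom_hard = hard - used         # MiB left before the hard limit

print("used {:.1f}% of soft quota".format(pct_of_soft))
print("{} MiB to soft quota, {} MiB to hard limit".format(headroom_soft, headroom_hard))
```

With only a few hundred MiB of headroom, anything ipyrad writes under the home directory (log files, for example) could hit the limit quickly.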

Isaac Overcast
@isaacovercast
Feb 20 2018 19:32
Well this is conspicuous: bytes actually written = 18446744073709551615. That's a LOT of bytes. Something must be weird.
Ooooooh, followed a hunch and it turns out 18446744073709551615 is exactly the max value of an unsigned 64-bit integer, i.e. one less than 2**64:
18446744073709551616L
Can't be a coincidence
Isaac Overcast
@isaacovercast
Feb 20 2018 19:40
Haha, nevermind: "bytes actually written = 18446744073709551615 probably just refers to a return value of -1 for the POSIX routine write (man7.org/linux/man-pages/man2/write.2.html) which is cast to double and back which would produce this value. It just means what it says: Your HD is full."
This is just the standard error message when you're out of disk. The quota report looks like you're already using the vast majority of your disk. Is there any way you can get access to a bigger disk or a temporary quota easement?
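The odd "bytes actually written" value can be checked directly: 18446744073709551615 is 2**64 - 1, which is the bit pattern of -1 when a signed 64-bit integer is read as unsigned. A short Python sketch of that reinterpretation (the variable names are only illustrative):

```python
import ctypes
import struct

reported = 18446744073709551615  # value from the error message

# It is exactly the largest unsigned 64-bit integer.
is_uint64_max = reported == 2 ** 64 - 1

# -1 stored in 64 bits and read back as unsigned gives the same number.
minus_one_as_unsigned = ctypes.c_uint64(-1).value

# Going the other way: the same 8 bytes read as a signed integer.
as_signed = struct.unpack("<q", struct.pack("<Q", reported))[0]

# For comparison, the largest signed 64-bit integer is much smaller.
int64_max = 2 ** 63 - 1

print(is_uint64_max, minus_one_as_unsigned, as_signed, int64_max)
```

So the number is not a real byte count; it is consistent with a failed write returning -1, which matches the "disk quota exceeded" message.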
dionyes
@dionyes
Feb 20 2018 20:03
@eaton-lab thanks so much for all the help! It’s working now :).