These are chat archives for nextflow-io/nextflow

15th
Mar 2019
David Mas-Ponte
@davidmasp
Mar 15 10:15
Hi all, I may have missed this in the docs but ... When I use the collect operator to get all elements in the channel merged into one, and then I use file("*.txt") (like this), the files are renamed as 1.fq, 2.fq, ... . Is there a way to maintain the original file names? Am I doing something wrong?
Paolo Di Tommaso
@pditommaso
Mar 15 10:18
if you want the original file names use just *
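A minimal sketch of the suggestion above, assuming the DSL1 syntax in use at the time of this chat. The process name, channel names, and the data/*.fq path are hypothetical and not taken from the conversation.

```groovy
// Hypothetical input channel; the glob path is an assumption for illustration.
reads_ch = Channel.fromPath('data/*.fq')

process list_staged_reads {
    input:
    // A bare '*' stages every collected file under its original name.
    // A pattern such as '*.fq' would stage them as 1.fq, 2.fq, ...
    file '*' from reads_ch.collect()

    output:
    file 'staged_names.txt' into names_ch

    script:
    """
    ls *.fq > staged_names.txt
    """
}
```

Looking at staged_names.txt in the task work directory is one way to confirm that the original names were kept.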
David Mas-Ponte
@davidmasp
Mar 15 10:24
Perfect, thanks!
Paolo Di Tommaso
@pditommaso
Mar 15 10:24
:v:
Daniel E Cook
@danielecook
Mar 15 10:46
I find that the cache seems to get dropped for inexplicable reasons.... does anyone have any obvious ideas why that might be? For instance, I was running a pipeline last night and caching was working fine. This morning? No longer cached.
Is there a way to 'diff' or look at what's changed among inputs so I can figure out why this is happening?
I have experimented with the different cache types and it seems to happen there as well. The problem is, it's difficult to reproduce b/c I don't know what's causing it
Daniel E Cook
@danielecook
Mar 15 10:53
If I could see which inputs in a channel have changed and what changed about them, it might be easier to figure this out.
Rad Suchecki
@rsuchecki
Mar 15 11:35
I'm keen to hear any ideas on that too @danielecook, as it can be a pain to debug. From past experience, a few things come to mind:
  • using cache 'lenient'
  • comparing in detail the content of two task.workDirs for supposedly the same input, where one is the unexpected re-run of the other - you may be able to find e.g. some stray log file captured by in/out globs
  • is that on a cluster? consider possible effects of file system / policies? I've had some issues with flush drives on our HPC
  • have a close look at what operations are applied to input channels - is the input really guaranteed to be consistent?
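For reference, the first suggestion in the list above can be set in nextflow.config. This is a sketch only: the process name 'align' is hypothetical, and the second form assumes a Nextflow version that supports the withName selector.

```groovy
// nextflow.config
// 'lenient' builds cache keys from input file path and size only,
// ignoring timestamps, which can be inconsistent on shared file systems.
process.cache = 'lenient'

// Alternatively, restrict the setting to a single process
// ('align' is a placeholder name, not from this conversation).
process {
    withName: 'align' {
        cache = 'lenient'
    }
}
```

As the follow-up messages note, this does not necessarily remove every unexpected re-run; it only rules out timestamp differences as the cause.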
Daniel E Cook
@danielecook
Mar 15 11:37
Thanks @rsuchecki, I've tried cache 'lenient' but still seem to have problems sometimes
Daniel E Cook
@danielecook
Mar 15 13:05
I'm playing around with -dump-hashes; it would be nice if there was a way to dump hashes for the name of a process. Not sure if that is possible.
Evan Floden
@evanfloden
Mar 15 13:11

Bug report of the day @rsuchecki

config.env.SECRET = 'If you want to keep a secret, you must also hide it from yourself - George Orwell, 1984'

Paolo Di Tommaso
@pditommaso
Mar 15 13:13
:satisfied:
@danielecook nope, however the interesting one is usually the first process for which the cache is failing
KochTobi
@KochTobi
Mar 15 15:16
Hi, are default values for process.cpus, process.time and process.mem automatically set? If so, what are the defaults?
KochTobi
@KochTobi
Mar 15 15:56
(for me it's cpus:1, mem:null, time:null) but I was wondering if those are always the defaults
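Rather than relying on whatever the defaults turn out to be, the resources can be declared explicitly in nextflow.config. A minimal sketch follows; the values are illustrative assumptions, not recommended settings. Note that the directive is named memory, not mem.

```groovy
// nextflow.config
// Explicit resource requests applied to every process.
// All three values are placeholders chosen for illustration.
process {
    cpus   = 1
    memory = '2 GB'
    time   = '1h'
}
```

Individual processes can still override these with their own cpus, memory, and time directives.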