These are chat archives for nextflow-io/nextflow

15th
Mar 2017
amacbride
@amacbride
Mar 15 2017 00:15
Hmmm. I've increased maxConnections to 200, and maxErrorRetry to 100, but it's still conking out at ~ 54-59 simultaneous downloads.
(which is of course just slightly too small for my purposes)
Tim Diels
@timdiels
Mar 15 2017 13:19
https://www.nextflow.io/docs/latest/process.html#input-repeaters
The example says it enumerates 3 different sizes. Seems it should be 2. Or is this some sort of Groovy black magic?
What is the difference between each and val? Would the example be any different if each was replaced with val? What would it do?
Paolo Di Tommaso
@pditommaso
Mar 15 2017 13:30
oops, you have found a typo in the docs .. !
it should be
process combine {
  input:
  val shape from shapes
  each color from 'red','blue'
  each size from 1,2,3

  "echo draw $shape $color with size: $size"

}
Tim Diels
@timdiels
Mar 15 2017 13:35
Goodie, and what would be the answer to the second question?
Paolo Di Tommaso
@pditommaso
Mar 15 2017 13:38
yes sorry, I was interrupted
the difference between val and each is that each allows you to apply the same inputs multiple times over a range of values
so if you look at the first example
Tim Diels
@timdiels
Mar 15 2017 13:40

So each does a cartesian product whereas multiple val would make combinations as they arrive? So

val color from 'red', 'blue'
val size from 1,2

would output

draw $shape red with size: 1
draw $shape blue with size: 2
Paolo Di Tommaso
@pditommaso
Mar 15 2017 13:40
for every input sequence you get a task executed for each different method
yes, val just picks the values as they arrive
Tim Diels
@timdiels
Mar 15 2017 13:44
Ok, thanks
Paolo Di Tommaso
@pditommaso
Mar 15 2017 13:47
welcome
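The val/each distinction discussed above can be sketched outside Nextflow. This is only a simplified model in plain Java, not Nextflow code: the class and method names are made up for illustration, and it assumes that paired val inputs stop when the shorter list runs out. `pairwise` mimics two val inputs consumed as they arrive, while `cartesian` mimics two each inputs.

```java
import java.util.ArrayList;
import java.util.List;

public class EachVsVal {
    // Models two `val` inputs: values are paired in arrival order.
    // Assumption of this sketch: pairing stops at the shorter list.
    static List<String> pairwise(List<String> colors, List<Integer> sizes) {
        List<String> tasks = new ArrayList<>();
        int n = Math.min(colors.size(), sizes.size());
        for (int i = 0; i < n; i++) {
            tasks.add("draw " + colors.get(i) + " with size: " + sizes.get(i));
        }
        return tasks;
    }

    // Models two `each` inputs: every color is combined with every size,
    // so the task count is colors.size() * sizes.size().
    static List<String> cartesian(List<String> colors, List<Integer> sizes) {
        List<String> tasks = new ArrayList<>();
        for (String color : colors) {
            for (Integer size : sizes) {
                tasks.add("draw " + color + " with size: " + size);
            }
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<String> colors = List.of("red", "blue");
        System.out.println("val-style pairing:");
        pairwise(colors, List.of(1, 2)).forEach(System.out::println);
        System.out.println("each-style product:");
        cartesian(colors, List.of(1, 2, 3)).forEach(System.out::println);
    }
}
```

With two colors and three sizes the each-style product yields six tasks, whereas pairing two colors with two sizes yields two. The order in which Nextflow actually schedules the combined tasks is not something this sketch tries to reproduce.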
amacbride
@amacbride
Mar 15 2017 19:08
@pditommaso Are the config variables accessible from within the script? I'd like to verify the values of aws.maxConnections, etc.
Paolo Di Tommaso
@pditommaso
Mar 15 2017 19:09
only params are accessible from within the script
amacbride
@amacbride
Mar 15 2017 19:10
Is there a debug option that might let me look at them?
Paolo Di Tommaso
@pditommaso
Mar 15 2017 19:10
yes, let me check
you should find them in the log file
amacbride
@amacbride
Mar 15 2017 19:25
Interesting. Nothing from the aws config section is getting shown. (as if def config = Global.getAwsClientConfig() is returning null)
Paolo Di Tommaso
@pditommaso
Mar 15 2017 19:25
!
amacbride
@amacbride
Mar 15 2017 19:26
Mar-15 12:20:24.864 [main] DEBUG nextflow.file.FileHelper - AWS S3 config details: {region=us-west-2, access_key=XXXXXX.., secret_key=XXXXXX..}
My config file contains:
aws {
    maxConnections = "512"
    maxErrorRetry = "1000"
    socketTimeout = "600000"
}
Paolo Di Tommaso
@pditommaso
Mar 15 2017 19:27
nope
amacbride
@amacbride
Mar 15 2017 19:27
I also tried unquoted.
What did I miss? :)
Paolo Di Tommaso
@pditommaso
Mar 15 2017 19:28
it should be like this
aws {
    client {
        maxConnections = 20
        connectionTimeout = 10000
    }
}
amacbride
@amacbride
Mar 15 2017 19:28
D'oh!
Paolo Di Tommaso
@pditommaso
Mar 15 2017 19:28
I know ..
amacbride
@amacbride
Mar 15 2017 19:53
It's really baffling -- changing the connections doesn't seem to have helped. This was working a couple of months ago, so I'm going to have to do some more digging. (When I do one sample, 8 files, everything works fine, but a full run with 120 files barfs. Out of the box, the connection limit is 50, so changing it should help, but it seems to be ignored.)
Paolo Di Tommaso
@pditommaso
Mar 15 2017 20:00
I'm wondering whether decreasing maxConnections, instead of increasing it, could work better
amacbride
@amacbride
Mar 15 2017 20:00
I see that the config variables are changed from camelCase to under_score, but I don't see anyplace that they are passed on to the underlying s3fs provider
(but I'm still learning the source tree)
Paolo Di Tommaso
@pditommaso
Mar 15 2017 20:02
a bit tricky that part, I needed to adapt to that existing S3 client
you should check S3FileSystemProvider#newFileSystem but it isn't in the NF source tree
amacbride
@amacbride
Mar 15 2017 20:04
Where is that code located? I'd like to try to understand it more
Ah! that's why I couldn't find it
however, I would try reducing maxConnections
amacbride
@amacbride
Mar 15 2017 20:12
No effect on behavior. When I look at newFileSystem, I don't see that any configuration options are being passed to the AmazonS3Client.
In the SDK, there's a constructor that is passed a ClientConfiguration object, but it's not being used here.
Paolo Di Tommaso
@pditommaso
Mar 15 2017 20:14
config options are passed as the env map
then they are added to a Properties object
then createClientConfig is called with those props
et voila
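The flow described above (config map, then camelCase keys turned into under_score properties, then createClientConfig) might look roughly like the following. This is a self-contained sketch, not the s3fs or Nextflow source: `ClientSettings` is a hypothetical stand-in for the AWS SDK's ClientConfiguration, and `toUnderscore` and `toProperties` are illustrative helper names. Only the `max_connections` key and the `createClientConfig` name come from the conversation.

```java
import java.util.Map;
import java.util.Properties;

public class ClientConfigSketch {
    // Hypothetical stand-in for the AWS SDK ClientConfiguration class.
    static class ClientSettings {
        int maxConnections = 50; // out-of-the-box limit mentioned in the chat
    }

    // Converts a camelCase key such as "maxConnections" to "max_connections".
    static String toUnderscore(String key) {
        return key.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase();
    }

    // Copies the config map into a Properties object with under_score keys.
    static Properties toProperties(Map<String, Object> env) {
        Properties props = new Properties();
        env.forEach((k, v) -> props.setProperty(toUnderscore(k), String.valueOf(v)));
        return props;
    }

    // Applies only the properties that are present; anything missing
    // keeps its default value.
    static ClientSettings createClientConfig(Properties props) {
        ClientSettings config = new ClientSettings();
        if (props.containsKey("max_connections")) {
            config.maxConnections = Integer.parseInt(props.getProperty("max_connections"));
        }
        return config;
    }

    public static void main(String[] args) {
        Properties props = toProperties(Map.<String, Object>of("maxConnections", 20));
        System.out.println("max_connections = " + createClientConfig(props).maxConnections);
    }
}
```

One consequence of this shape is that a key which never reaches the map (for example, because it sits in the wrong config scope) is silently ignored and the default stays in effect, which would match the symptom reported earlier in the conversation.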
amacbride
@amacbride
Mar 15 2017 20:22

Got it. Is it possible for me to turn on detailed tracing so that I can see the trace messages in createClientConfig?

For example, I'd love to see this debug message:

if( props.containsKey("max_connections")) {
    log.trace("AWS client config - max_connections: {}", props.getProperty("max_connections"));
    config.setMaxConnections(Integer.parseInt(props.getProperty("max_connections")));
}
Paolo Di Tommaso
@pditommaso
Mar 15 2017 20:23
yes, it should be reported if you run
nextflow -trace com.upplication.s3fs.S3FileSystemProvider run .. etc
amacbride
@amacbride
Mar 15 2017 20:27
(Ah, my earlier confusion was because I was looking at the main branch of s3fs, not your fork.)