pvanheus
@pvanheus
pickValue: all_non_null picks all the lists because each list is not null?
Kaushik Ghose
@kaushik-work
@pvanheus yes - it does not recurse, as agreed upon. If we used merge_flattened, pickValue would remove all these nulls.
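Editor's sketch of the behaviour described here, written as plain JavaScript rather than engine code. The helper names (`mergeNested`, `mergeFlattened`, `allNonNull`) and the sample data are hypothetical; the only things taken from the chat are that `all_non_null` does not recurse and that `merge_flattened` would let it drop the inner nulls.

```javascript
// Hypothetical helpers modelling the discussion; this is not cwltool code.

// merge_nested: each source stays its own element of the result list.
function mergeNested(sources) {
  return sources.slice();
}

// merge_flattened: list-valued sources are spliced into one flat list
// (exactly one level of flattening).
function mergeFlattened(sources) {
  return sources.reduce(
    (acc, s) => acc.concat(Array.isArray(s) ? s : [s]),
    []
  );
}

// all_non_null: drop top-level nulls only, without recursing into lists.
function allNonNull(values) {
  return values.filter((v) => v !== null);
}

const sources = [[1, null], [null, 2]];

// Each inner list is itself non-null, so both lists survive untouched.
const nested = allNonNull(mergeNested(sources));

// After flattening, the nulls sit at the top level and are removed.
const flattened = allNonNull(mergeFlattened(sources));
```

With this sample input, `nested` still holds both inner lists including their nulls, while `flattened` holds only the two non-null values.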
pvanheus
@pvanheus
ok.
Kaushik Ghose
@kaushik-work
In one of those cases where my personal life intersects with my work life: https://www.apotelyt.com/photo-lens/mft-compatibility
What has this got to do with anything, you ask? It reminds me that adhering to a standard constrains some things but not others, and does not guarantee a uniform user experience.
Michael R. Crusoe
@mr-c
:-)
Michael R. Crusoe
@mr-c
BeeCWL https://sc19.supercomputing.org/proceedings/tech_poster/tech_poster_pages/rpost235.html (includes links to the poster PDF and extended abstract)
Andrey Kartashov
@portah
Have no idea why. However, I'm surprised there is no CWL mentioned
Michael R. Crusoe
@mr-c
@portah feel free to contribute to alan-turing-institute/the-turing-way#652 :-)
Andrey Kartashov
@portah
I'm probably confused about which comes first (chicken and egg): is CWL following Turing, or do we need CWL to achieve Turing?
Peter Amstutz
@tetron
@all weekly CWL video chat is happening in five minutes https://meet.jit.si/cwl
Kaushik Ghose
@kaushik-work
@mr-c should I move the conditionals design document to the cwl-v1.2 repo? It might fit better there, since I think going forward that's going to be our main repo, and for 1.3 we will just fork that and move on etc.
Michael R. Crusoe
@mr-c
@kaushik-work makes sense to me, sure!
Kaushik Ghose
@kaushik-work
I've copied the design document over to the CWL v1.2 repo. I think the discussion in the pull request is very valuable. So I'm going to copy the discussion over. I apologize for some spam that discussion participants will get as a result of this - I will tag them for their comments
Michael R. Crusoe
@mr-c
@kaushik-work I wonder if the "move issue" feature works for PRs...
Kaushik Ghose
@kaushik-work
I checked, and did not see a button for that.
John Chilton
@jmchilton
Gitter just deleted my comment. Maybe I should simplify it anyway - should the doc generator handle doc: attributes that are defined as lists (e.g. on field definitions or record type definitions) - or is this only valid for type: documentation? If it is a bug, I can fix it - I just wanted to verify it is a bug
Michael R. Crusoe
@mr-c
Feels like something that was added for CWL v1.1
What does the metaschema say?
Peter Amstutz
@tetron
Re doc strings with lists: the values should be concatenated. I don't recall exactly what happened, but it may have been done to unify the behavior of the doc field across the different types where it can show up.
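Editor's sketch of what "concatenated" could mean for a doc generator. The function name `normalizeDoc` is hypothetical, and the empty-string joiner is an assumption: the chat does not say whether list items are joined with nothing, a space, or a newline.

```javascript
// Hypothetical helper; not taken from any doc generator's source.
// `joiner` defaults to "" as an assumption, since the separator is not
// stated in the discussion.
function normalizeDoc(doc, joiner = "") {
  // A missing doc field becomes an empty string.
  if (doc === undefined || doc === null) return "";
  // A list of strings is concatenated; a plain string passes through.
  return Array.isArray(doc) ? doc.join(joiner) : String(doc);
}

const fromList = normalizeDoc(["First part. ", "Second part."]);
const fromString = normalizeDoc("First part. Second part.");
```

Treating both shapes through one function means the generator would render a doc field the same way wherever it shows up, whether it was written as a string or as a list.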
bogdang989
@bogdang989
Hi all, can someone please help me understand the usage of inputBinding in CommandInputRecordSchema and interaction with inputBinding in the fields? E.g. what should a simple record defined like this produce?
  id: input
  type:
    inputBinding:
      position: 3
      valueFrom: c
    type: record
    fields:
    - name: input_field
      type: string?
      inputBinding:
        position: 2
        valueFrom: b
    name: input
I feel this kind of scenario could be better addressed in both the spec and the conformance tests. I'll put up an issue on the GitHub repo anyway, but thought I'd ask here first just to make sure I am not missing anything.
Michael Franklin
@illusional

Hi everyone, just wondering if there's a way to slightly rename an output file after the tool is run. Essentially, GATK MergeSamFiles is generating the index as ^.bai, but I need it as .bai on input, and then on the glob as well. I could manually brute force some mv statements in there with && and shell=True, but I'm less keen on that if there's a better way.

The more tools I explore, the more confused I am that there seems to be no consistency or standard in this respect.

Michael Franklin
@illusional
I can do the rename on input with InitialWorkDirRequirement
Michael Franklin
@illusional

Hmm, from the Misc guide I can do the following:

outputEval: ${self[0].basename=inputs.newname; return self;}

Tbh, I'm not strictly sure how it works, but it does
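Editor's sketch of why that expression works, as a standalone JavaScript function. The name `renameFirst` and the sample file values are hypothetical; only `self`, `inputs.newname` and `basename` come from the expression above. It assumes the engine uses `basename` to name the file when it stages outputs, which is worth verifying against the engine in use.

```javascript
// Model of the outputEval body; not engine code.
function renameFirst(self, inputs) {
  // `self` is the array of File objects matched by `glob`.
  // Overwriting `basename` changes the name recorded for the output,
  // while `path` still points at the file the tool actually wrote.
  self[0].basename = inputs.newname;
  // The returned array becomes the value of the output parameter.
  return self;
}

// Hypothetical sample data for illustration only.
const matched = [
  { class: "File", path: "/out/merged.bai", basename: "merged.bai" },
];
const result = renameFirst(matched, { newname: "merged.bam.bai" });
```

The expression mutates the File object in place and then returns it, so forgetting the `return self;` would leave the output without a value.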

Michael Franklin
@illusional
But this doesn't work for secondary files
Michael R. Crusoe
@mr-c
@illusional You can do a similar trick for secondaryFiles:
outputEval: ${self[0].secondaryFiles[0].basename=inputs.newname; return self;}
Michael Franklin
@illusional

Hi @mr-c , I've been playing with rewriting the secondaryFiles, but inside the output glob it looks like there isn't a secondaryFiles property.

cwlVersion: v1.0
class: CommandLineTool
baseCommand: "ls"
inputs:
  bam:
    type: File
    secondaryFiles: ["^.bai"]
outputs:
  std: stdout
  out:
    type: File
    secondaryFiles: ["^.bai"]
    outputBinding:
      glob: $(inputs.bam.basename)
      outputEval: |
        ${
            console.log(self)
            self[0].secondaryFiles[0].basename=".bai"
        }
requirements:
  InitialWorkDirRequirement:
    listing:
      - $(inputs.bam)
  InlineJavascriptRequirement: {}

(Formatted error)

('Error collecting output for parameter \'out\':Expression evaluation error:Expecting value: line 1 column 1 (char 0)script was:
01 "use strict";
02 var inputs = {
03     "bam": {
04         "class": "File",
05         "location": "file:///Users/franklinmichael/Desktop/tmp/bamsplit/BRCA1.bam",
06         "size": 2997846,
07         "basename": "BRCA1.bam",
08         "nameroot": "BRCA1",
09         "nameext": ".bam",
10         "secondaryFiles": [
11             {
12                 "location": "file:///Users/franklinmichael/Desktop/tmp/bamsplit/BRCA1.bai",
13                 "basename": "BRCA1.bai",
14                 "class": "File",
15                 "nameroot": "BRCA1",
16                 "nameext": ".bai",
17                 "path": "/private/tmp/docker_tmplo7vt97z/BRCA1.bai",
18                 "dirname": "/private/tmp/docker_tmplo7vt97z"
19             }
20         ],
21         "path": "/private/tmp/docker_tmplo7vt97z/BRCA1.bam",
22         "dirname": "/private/tmp/docker_tmplo7vt97z"
23     }
24 };
25 var self = [
26     {
27         "location": "file:///private/tmp/docker_tmplo7vt97z/BRCA1.bam",
28         "path": "/private/tmp/docker_tmplo7vt97z/BRCA1.bam",
29         "basename": "BRCA1.bam",
30         "nameroot": "BRCA1",
31         "nameext": ".bam",
32         "class": "File",
33         "checksum": "sha1$8a38a4e8d58c91d9aaca8cdc739e05eddf7dec1d",
34         "size": 2997846
35     }
36 ];
37 var runtime = {
38     "cores": 1,
39     "ram": 1024,
40     "tmpdirSize": 1024,
41     "outdirSize": 1024,
42     "tmpdir": "/private/var/folders/jz/y9gqxt_s7jxcjkc26gr71ywr7zs5yz/T/tmpoer80xbz",
43     "outdir": "/private/tmp/docker_tmplo7vt97z"
44 };
45 (function(){
46     console.log(self)
47     self[0].secondaryFiles[0].basename=".bai"
48 })()
stdout was: \'\'
stderr was: \'evalmachine.<anonymous>:47
    self[0].secondaryFiles[0].basename=".bai"
                          ^
TypeError: Cannot read property \'0\' of undefined
    at evalmachine.<anonymous>:47:27
    at evalmachine.<anonymous>:48:3
    at Script.runInContext (vm.js:107:20)
    at Script.runInNewContext (vm.js:113:17)
    at Object.runInNewContext (vm.js:296:38)
    at Socket.<anonymous> ([eval]:11:57)
    at Socket.emit (events.js:182:13)
    at addChunk (_stream_readable.js:283:12)
    at readableAddChunk (_stream_readable.js:260:13)
    at Socket.Readable.push (_stream_readable.js:219:10)\'', {})
Michael Franklin
@illusional

My guess is this is because the "outputEval" block is before the secondaryFiles block

I checked out the source code and moved the outputEval block below the secondary files, and this works.

Michael R. Crusoe
@mr-c
@illusional Alas, the order of operations you want to change is set in the standard itself: https://www.commonwl.org/v1.0/CommandLineTool.html#CommandOutputBinding
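Editor's sketch of that ordering as a toy model. The linked CommandOutputBinding section applies glob, loadContents, outputEval and secondaryFiles in that order, so `self` has no `secondaryFiles` property yet when outputEval runs. Every function name below is hypothetical and this is not how any engine is actually structured.

```javascript
// Toy model of the CWL v1.0 output collection order; not cwltool code.
function collectOutput(globbed, outputEval, findSecondaryFiles) {
  // Steps 1-2 (glob, loadContents) are assumed to have produced `globbed`.
  const evaluated = outputEval(globbed); // step 3: outputEval
  // Step 4: secondaryFiles are attached only after outputEval has run.
  return evaluated.map((file) =>
    Object.assign({}, file, { secondaryFiles: findSecondaryFiles(file) })
  );
}

const globbed = [{ class: "File", basename: "BRCA1.bam" }];

// Records what outputEval can see on the primary file.
let seenByOutputEval = "not run";

const out = collectOutput(
  globbed,
  (self) => {
    seenByOutputEval = self[0].secondaryFiles; // still undefined here
    return self;
  },
  // Stand-in for the "^.bai" pattern: swap the extension for .bai.
  (file) => [
    { class: "File", basename: file.basename.replace(/\.bam$/, ".bai") },
  ]
);
```

In this model `seenByOutputEval` is `undefined`, which is the same condition behind the `Cannot read property '0' of undefined` error quoted earlier, while the final `out[0].secondaryFiles` is populated.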
Michael R. Crusoe
@mr-c
Perhaps you can use a generous glob that catches all the files you need, assemble & rename them in the outputEval, including attaching the index to the primary file in a synthesized secondaryFiles property. Then hopefully this full object will be passed without problems through the secondaryFiles: .bai processing stage.
That may or may not work with cwltool and I'd have to think on it more to see if it is compliant with the standard. Maybe @tetron has an idea.
It may just be easier, and less surprising to CWL engines, to put an expression in the secondaryFiles field of your output and not use outputEval, as you are allowed to return a File object and thus can do the regular renaming trick
Michael R. Crusoe
@mr-c
The expression must return a filename string relative to the path to the primary File, a File or Directory object with either path or location and basename fields set, or an array consisting of strings or File or Directory objects.
Michael Franklin
@illusional

Damn...

My trick was supposed to hide the fact that some tools only accept ^.bai (looking at GATK) while most others accept .bai, the renaming trick was a backdoor to solve this problem in CWL.

Hopefully @tetron has an idea.

I'm happy to use a CWL expression to do this, but I don't quite understand what I should return if:

  • The primary file (correctly globbed) is called myfile.bam
  • A file called myfile.bai is present in the output directory
  • The returned secondary file should be named myfile.bam.bai

This would ensure it's correctly localised in future steps. But maybe I don't quite understand how secondary files are passed around in CWL, especially in cwltool.
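Editor's sketch of one possible shape for such an expression, following the rule quoted above that a secondaryFiles expression may return a File object with `location` (or `path`) and `basename` set. The function name `renamedIndex` is hypothetical, and it assumes `self` is the primary File with `location`, `nameroot` and `basename` populated. Whether a given engine accepts the result and renames the file on staging has not been verified here.

```javascript
// Hypothetical body for a secondaryFiles expression, written as a
// standalone function so it can be run outside a CWL engine.
function renamedIndex(primary) {
  // Directory part of the primary file's location, with trailing slash.
  const dir = primary.location.slice(
    0,
    primary.location.lastIndexOf("/") + 1
  );
  return {
    class: "File",
    // Where the index actually is on disk: <nameroot>.bai
    location: dir + primary.nameroot + ".bai",
    // The name it should carry from here on: <basename>.bai
    basename: primary.basename + ".bai",
  };
}

// Sample primary file modelled on the three bullet points above;
// the directory is made up for illustration.
const primary = {
  class: "File",
  location: "file:///work/out/myfile.bam",
  basename: "myfile.bam",
  nameroot: "myfile",
};
const index = renamedIndex(primary);
```

Inside a tool description the same logic would sit in a `${ ... }` block with `self` in place of `primary`.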

Keiran Raine
@keiranmraine
@illusional FYI, bai will eventually be retired in favour of csi (among other things it's compressed and supports longer references)
Michael Franklin
@illusional
Thanks @keiranmraine, I'll ask around at work tomorrow to see if it's something we can convince people to use. Looks like it's only been in htsjdk since earlier this year and idk how quickly tools tend to update. Have you got any good reading stuff about how well it's adopted?
Keiran Raine
@keiranmraine

Handling bam + bai|csi and cram + crai for a single tool gets complicated. We ended up having to define multiple CWLs:

https://github.com/cancerit/dockstore-cgpmap/tree/develop/cwls

More elegant solutions would be preferable, but we have to ensure they are also compatible (or will be compatible) with the Dockstore registry.

Michael Franklin
@illusional
My background isn't bioinformatics, so these index files, and the differing format of these index files are crazy! I have the luxury of generating my CWL tool wrappers from a mixture of these .bai annotations, so as long as I can find a generalised solution I'm set
Keiran Raine
@keiranmraine
We are likely to migrate fully to CRAM for new data quite soon, so csi is not going to be that much of an issue for us. CRAM with modified split size (1k, instead of 10k) is as fast and sometimes faster than BAM to parse now (with samtools/htslib). If I was tied to BAM (legacy tools can be a pain), and setting up something new I'd certainly go with csi.
Michael R. Crusoe
@mr-c
@illusional can you make a forum post describing the situation and your desired outcome? There may be a solution yet
Michael R. Crusoe
@mr-c
CWL Video chat is now, if anyone is around