Marcel Loose
@gmloose
But the more fundamental, and somewhat worrying, question is: why does this problem only occur on some machines? I get the feeling I'm running into some kind of race condition.
Michael R. Crusoe
@mr-c:matrix.org
[m]
agreed
Marcel Loose
@gmloose
I'll try to collect the output JSON, but I have to rerun the workflow.
mr-c
@mr-c:matrix.org
[m]
Thanks
Marcel Loose
@gmloose
OK, I recreated the JSON output file for the workflow with the renamed file (file extension f4_2). Here's the link to gist.github.com: https://gist.github.com/gmloose/8f1c469ff5084cb38d09ec8fc0c7ff30.
Michael R. Crusoe
@mr-c:matrix.org
[m]
huh. Lines 972 & 1969 are the relevant sections
Marcel Loose
@gmloose
Sure, but what should I write in there? I'm not sure what the bug description should be, other than the name conflict.
Michael R. Crusoe
@mr-c:matrix.org
[m]
That there is a false-positive naming conflict: yes, there are two files with the same name, but they are in different directories
Marcel Loose
@gmloose
What's the best way to redirect to a file the JSON string containing the outputs generated by a workflow, which is printed to stdout?
Michael R. Crusoe
@mr-c:matrix.org
[m]
toil-cwl-runner ... > outputs.json or similar
Marcel Loose
@gmloose
And what if I invoke toil-cwl-runner from a Python script, using the toil.cwl.cwltoil.main() function? Instead of redirecting stdout in the Python call, could I just add a line to the CWL Workflow that redirects stdout to a file? Or would that produce more output than just the JSON output?
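For the Python-side option, a minimal sketch of capturing the output object in a file. It assumes the entry point accepts a `stdout` keyword argument, which `toil.cwl.cwltoil.main` appears to do in recent Toil releases (worth checking against the installed version). `run_and_save_outputs` and `fake_main` are hypothetical names used only for illustration; `fake_main` stands in for the real runner so the sketch runs without Toil installed.

```python
import json
import os
import tempfile
from typing import Callable, List


def run_and_save_outputs(main_func: Callable[..., int], args: List[str], out_path: str) -> int:
    """Call a CLI-style entry point and write what it sends to `stdout` into out_path."""
    with open(out_path, "w") as handle:
        # Assumption: main_func takes a `stdout` keyword argument.
        return main_func(args, stdout=handle)


def fake_main(args, stdout):
    """Hypothetical stand-in for the real runner: emits a CWL-like output object as JSON."""
    json.dump({"output": {"class": "File", "basename": "result.txt"}}, stdout)
    return 0


out_path = os.path.join(tempfile.mkdtemp(), "outputs.json")
exit_code = run_and_save_outputs(fake_main, ["workflow.cwl", "inputs.yml"], out_path)

# Read the file back to confirm it holds only the JSON output object.
with open(out_path) as handle:
    saved_outputs = json.load(handle)
```

Passing the file handle explicitly is deliberate: if an entry point binds `sys.stdout` as a default argument at import time, `contextlib.redirect_stdout` would not affect it.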
Marcel Loose
@gmloose
Any idea if DataBiosphere/toil#4101 can be picked up soon? It is blocking us, because Toil currently produces bogus results for our workflows. To me this seems to be a serious bug!
Michael R. Crusoe
@mr-c:matrix.org
[m]
A repeatable example, but I know that is difficult
What if I gave you a script to fix the renames post execution?
Michael R. Crusoe
@mr-c:matrix.org
[m]
Huh
Marcel Loose
@gmloose

Not sure if it's in any way related to #4101, but I have the following expression to convert an array of an array of directories into an array of directories:

id: merge_array_directories
label: merge_array_directories
class: ExpressionTool

cwlVersion: v1.2
inputs:
    - id: input
      type:
        - type: array
          items:
            - type: array
              items: Directory
outputs:
    - id: output
      type: Directory[]

expression: |
  ${
    var out_directory = []
    for(var i=0; i<inputs.input.length; i++){
        var item = inputs.input[i]
        if(item != null){
            out_directory = out_directory.concat(item)
        }
    }
    return {'output': out_directory}
  }

requirements:
  - class: InlineJavascriptRequirement

This expression, when run, produces a number of warnings like:

Warning: invalid field `nameroot`, expected one of: 'class', 'location', 'path', 'basename', 'listing'
[2022-05-20T18:04:18+0200] [MainThread] [W] [salad] Warning: invalid field `nameroot`, expected one of: 'class', 'location', 'path', 'basename', 'listing'
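To check what the expression is expected to return, independent of any runner, here is a rough Python mirror of the JavaScript above. It is only a sketch: the function name reuses the tool id, the sample directories are made up, and it does not reproduce or explain the `nameroot` warnings.

```python
from typing import Any, Dict, List, Optional

Directory = Dict[str, Any]


def merge_array_directories(nested: List[Optional[List[Directory]]]) -> Dict[str, List[Directory]]:
    """Flatten one level of nesting, skipping null entries, like the expression does."""
    merged: List[Directory] = []
    for item in nested:
        if item is not None:
            merged.extend(item)
    return {"output": merged}


# Made-up input: two scatter results plus one null entry.
sample = [
    [{"class": "Directory", "basename": "a"}, {"class": "Directory", "basename": "b"}],
    None,
    [{"class": "Directory", "basename": "c"}],
]
result = merge_array_directories(sample)
basenames = [d["basename"] for d in result["output"]]
```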
Michael R. Crusoe
@mr-c:matrix.org
[m]
Do you get the same warnings when running just the ExpressionTool? Is this in batch or single-machine mode?
Michael R. Crusoe
@mr-c:matrix.org
[m]
There is a scatter prior to this step?
Okay. If you can run with debug worker and the regular debug, that should show the input objects prior to this step, and will provide a big hint as to what is going on. I'll need the entire log
Marcel Loose
@gmloose
Please see https://gist.github.com/gmloose/74ca0916148041be6a1ce0643cd2563f for a (hopefully) useful log file.
Rohith B S
@rohith-bs
[2022-05-11T06:09:53+0000] [MainThread] [I] [toil.worker] ---TOIL WORKER OUTPUT LOG---
[2022-05-11T06:09:53+0000] [MainThread] [I] [toil] Running Toil version 5.6.0-c34146a6437e4407a61e946e968bcce67a0ebbca on host bsr.
[2022-05-11T06:09:53+0000] [MainThread] [I] [toil.worker] Working on job 'normalize_and_index' kind-submit_run/instance-lt36baix v6
[2022-05-11T06:09:53+0000] [MainThread] [I] [numexpr.utils] Note: NumExpr detected 16 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/toil/worker.py", line 379, in workerScript
    job = Job.loadJob(jobStore, jobDesc)
  File "/usr/local/lib/python3.7/site-packages/toil/job.py", line 2285, in loadJob
    job = cls._unpickle(userModule, fileHandle, requireInstanceOf=Job)
  File "/usr/local/lib/python3.7/site-packages/toil/job.py", line 1910, in _unpickle
    runnable = unpickler.load()
  File "/usr/local/lib/python3.7/site-packages/toil/job.py", line 2904, in __new__
    return cls._resolve(*args)
  File "/usr/local/lib/python3.7/site-packages/toil/job.py", line 2913, in _resolve
    with cls._jobstore.read_file_stream(jobStoreFileID) as fileHandle:
  File "/usr/lib64/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.7/site-packages/toil/jobStores/fileJobStore.py", line 596, in read_file_stream
    self._check_job_store_file_id(file_id)
  File "/usr/local/lib/python3.7/site-packages/toil/jobStores/fileJobStore.py", line 779, in _check_job_store_file_id
    raise NoSuchFileException(jobStoreFileID)
toil.jobStores.abstractJobStore.NoSuchFileException: File 'files/no-job/k/file-4294b849ee3148f087a8efee7183a598/stream' does not exist.
[2022-05-11T06:09:56+0000] [MainThread] [E] [toil.worker] Exiting the worker because of a failed job on host bsr

Kindly suggest the cause and a possible fix for the above issue.

Please help; we are facing this very often.

Michael R. Crusoe
@mr-c:matrix.org
[m]
Can you tell us more about the shared filesystem you use? It seems to be having consistency problems
Michael R. Crusoe
@mr-c:matrix.org
[m]
Oh, just a single system? No batch scheduler?
Michael R. Crusoe
@mr-c:matrix.org
[m]
If just a local system, or even a batch system using the file store, I recommend the --bypass-file-store option
In fact, I will open an issue about making that the default for toil-cwl-runner with a filestore
Douglas Lowe
@douglowe
I am getting the following generated script and error from an ExpressionTool step in my workflow, when I run it on a specific HPC system (using slurm). I do not get this error on any other HPC systems I've tried (slurm, or SGE). I'd like to see what a generated script looks like for a successful workflow instance, but can't work out how to extract this. Could anyone advise on how I do this (or on what might be causing a problem on this one system)?
        cwltool.sandboxjs.JavascriptException: Expecting value: line 1 column 1 (char 0)
        script was:
        "use strict";
        var inputs = {
            "external_files": [
                {
                    "location": "toilfile:615963:0:files/for-job/kind-CWLJob/instance-sqj9ora8/file-a02c8eb20af64c1bad1f33a368ff8cdb/topology_ions_top.zip",
                    "basename": "topology_ions_top.zip",
                    "nameroot": "topology_ions_top",
                    "nameext": ".zip",
                    "class": "File",
                    "checksum": "sha1$febce5fbfe825fc780719f74a53eda12ff7d6dd1",
                    "size": 615963,
                    "format": "http://edamontology.org/format_2333",
                    "http://commonwl.org/cwltool#generation": 0
                }
            ],
            "external_project_file": {
                "class": "File",
                "format": "http://edamontology.org/format_1476",
                "location": "toilfile:116316:0:files/no-job/file-e8101cd7fbbc4ef192a69bae99192cba/lysozyme.pdb",
                "size": 116316,
                "basename": "lysozyme.pdb",
                "nameroot": "lysozyme",
                "nameext": ".pdb",
                "streamable": false
            },
            "external_string": ""
        };
        var self = null;
        var runtime = {
            "cores": 1,
            "ram": 256,
            "tmpdirSize": 1024,
            "outdirSize": 1024,
            "tmpdir": null,
            "outdir": null
        };
        (function(){
        return {"project_work_dir":
            {"class": "Directory",
             "basename": inputs.external_project_file.basename + inputs.external_string,
             "listing": inputs.external_files}
        };
        })()
        stdout was: ''
        stderr was: ''
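A note on reading that error: "Expecting value: line 1 column 1 (char 0)" is the message Python's `json` module produces when asked to decode an empty string, which matches the `stdout was: ''` line. The sketch below only demonstrates that behaviour; the assumption that the runner decodes the JavaScript process's stdout as JSON is an inference from the traceback, and `parse_expression_stdout` is a hypothetical helper, not cwltool code.

```python
import json
from typing import Any, Optional, Tuple


def parse_expression_stdout(stdout_text: str) -> Tuple[Optional[Any], Optional[str]]:
    """Return (value, None) on success or (None, error message) if the text is not JSON."""
    try:
        return json.loads(stdout_text), None
    except json.JSONDecodeError as err:
        return None, str(err)


# Empty output, as in the log above, cannot be decoded.
empty_value, empty_error = parse_expression_stdout("")

# A well-formed result decodes normally.
ok_value, ok_error = parse_expression_stdout('{"project_work_dir": {"class": "Directory"}}')
```

So the JavaScript itself may be fine; the thing to investigate is why nothing arrived on stdout.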
Michael R. Crusoe
@mr-c:matrix.org
[m]
Could be the nodejs version varies?
Douglas Lowe
@douglowe
Is there a way to check this? Both workflows used the docker://docker.io/node:slim image, and both ran in the same week, so I'd hope they are both using the same docker image?
Michael R. Crusoe
@mr-c:matrix.org
[m]
Hmm.. maybe one system has nodejs installed locally, so it doesn't use the container

I'd like to see what a generated script looks like for a successful workflow instance, but can't work out how to extract this.

There isn't really a generated script for the whole workflow. Is there something more specific you'd like to see?

Douglas Lowe
@douglowe
Unfortunately there's no installed nodejs for either system.
Michael R. Crusoe
@mr-c:matrix.org
[m]
toil-cwl-runner --clean never will keep the Toil jobstore around. Also see --cleanWorkDir never
Douglas Lowe
@douglowe
Thanks - I'll try these --clean never options
I'd be interested in just the javascript for that step. Will that be stored as a file in the jobstore?
Michael R. Crusoe
@mr-c:matrix.org
[m]
you can also try running just that step with a scheduled job running cwltool --debug --leave-tmpdir --leave-outputs --js-console
I don't know if we keep the javascript files around; you can look using the above. If you don't find it, you'll need to hack it in
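The two suggestions above can be sketched as command lines built from Python. Only the flags quoted in this thread are used; the file names `workflow.cwl`, `step.cwl` and `inputs.yml` are placeholders, and the helper names are hypothetical. A real `toil-cwl-runner` call will likely need further options (job store location, batch system) that are not shown here.

```python
import shlex
from typing import List


def toil_keep_everything_command(workflow: str, job_order: str) -> List[str]:
    """toil-cwl-runner invocation that keeps the job store and work directories."""
    return ["toil-cwl-runner", "--clean", "never", "--cleanWorkDir", "never", workflow, job_order]


def cwltool_debug_command(tool: str, job_order: str) -> List[str]:
    """cwltool invocation for a single step, keeping temporary files and JS console output."""
    return ["cwltool", "--debug", "--leave-tmpdir", "--leave-outputs", "--js-console", tool, job_order]


toil_cmd = toil_keep_everything_command("workflow.cwl", "inputs.yml")
cwltool_cmd = cwltool_debug_command("step.cwl", "inputs.yml")

# shlex.join gives a copy-pasteable shell line, e.g. for a batch submission script.
print(shlex.join(toil_cmd))
print(shlex.join(cwltool_cmd))
```

Either list can be handed to `subprocess.run(...)` inside a scheduled job on the affected system.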
Douglas Lowe
@douglowe
okay - I'll give this a go, see if there's anything there. Thanks for the suggestions :)
Michael R. Crusoe
@mr-c:matrix.org
[m]
Good luck; you are welcome!
Douglas Lowe
@douglowe
This is the generated javascript which fails: failed_tool.js
I've tried running it independently using node --eval "$(cat failed_tool.js)" and get no output. But I'm not familiar with javascript, so I don't know if that's what I should expect or not. Is there anything I should look out for to tell me if it's working as intended?
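On the empty output: `node --eval` runs the script but does not print the value of its last expression (`node --print` does), so silence there is not by itself a sign of failure. As a cross-check of what the expression should return, here is a Python mirror of the function body from the generated script. It is a sketch with the inputs trimmed to the fields the expression actually reads; `project_work_dir` is a hypothetical helper name.

```python
import json
from typing import Any, Dict

# Trimmed copy of the inputs from the generated script (only the fields used below).
inputs: Dict[str, Any] = {
    "external_files": [{"class": "File", "basename": "topology_ions_top.zip"}],
    "external_project_file": {"class": "File", "basename": "lysozyme.pdb"},
    "external_string": "",
}


def project_work_dir(inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Mirror of the JavaScript expression: wrap the files in a Directory object."""
    return {
        "project_work_dir": {
            "class": "Directory",
            "basename": inputs["external_project_file"]["basename"] + inputs["external_string"],
            "listing": inputs["external_files"],
        }
    }


result = project_work_dir(inputs)
print(json.dumps(result, indent=2))
```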
Michael R. Crusoe
@mr-c:matrix.org
[m]
{"project_work_dir":{"class":"Directory","basename":"lysozyme.pdb","listing":[{"location":"toilfile:615963:0:files/for-job/kind-CWLJob/instance-sqj9ora8/file-a02c8eb20af64c1bad1f33a368ff8cdb/topology_ions_top.zip","basename":"topology_ions_top.zip","nameroot":"topology_ions_top","nameext":".zip","class":"File","checksum":"sha1$febce5fbfe825fc780719f74a53eda12ff7d6dd1","size":615963,"format":"http://edamontology.org/format_2333","http://commonwl.org/cwltool#generation":0}]}}
Douglas Lowe
@douglowe
Thanks for adding this code. That output looks sensible to me.
I'm not getting any js files saved in the jobstore - I'll try running that step by itself using the cwltool and --js-console
Michael R. Crusoe
@mr-c:matrix.org
[m]
they might not be saved there; check the $TMPDIR? The debugging around nodejs execution needs to be improved
Marcel Loose
@gmloose

The help that toil-cwl-runner provides on --linkImports and --noLinkImports is identical.

  --linkImports         When using a filesystem based job store, CWL input
                        files are by default symlinked in. Specifying this
                        option instead copies the files into the job store,
                        which may protect them from being modified externally.
                        When not specified and as long as caching is enabled,
                        Toil will protect the file automatically by changing
                        the permissions to read-only.
  --noLinkImports       When using a filesystem based job store, CWL input
                        files are by default symlinked in. Specifying this
                        option instead copies the files into the job store,
                        which may protect them from being modified externally.
                        When not specified and as long as caching is enabled,
                        Toil will protect the file automatically by changing
                        the permissions to read-only.

I guess the help text applies to --noLinkImports
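For illustration, a paired on/off flag with distinct help strings could be declared along these lines with `argparse`. This is a generic sketch, not Toil's actual option code: the program name, help wording and `dest` are assumptions, and the default (symlinking) follows the help text quoted above.

```python
import argparse

parser = argparse.ArgumentParser(prog="example-runner")

# Both flags write to the same destination; only the help text and action differ.
parser.add_argument(
    "--linkImports",
    dest="linkImports",
    action="store_true",
    default=True,
    help="Symlink CWL input files into a filesystem based job store (the default).",
)
parser.add_argument(
    "--noLinkImports",
    dest="linkImports",
    action="store_false",
    help="Copy CWL input files into the job store instead of symlinking them.",
)

default_options = parser.parse_args([])
copy_options = parser.parse_args(["--noLinkImports"])
help_text = parser.format_help()
```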

Michael R. Crusoe
@mr-c:matrix.org
[m]
Whoops, that needs fixing. Can you open an issue, gmloose (Marcel Loose)? Thanks!
Douglas Lowe
@douglowe
@mr-c:matrix.org - I've discovered the problem with our ExpressionTool: our local HPC is tardy, and cwltool is impatient. I've raised an issue for this, with some suggestions for changes I think are needed: common-workflow-language/cwltool#1680