Marcel Loose
@gmloose
It's a quite complicated workflow, consisting of a number of sub-workflows and several tens of steps (commands and expressions). The actual input files that need to be processed (not the JSON-file) are huge. I don't have debug logs of this particular run, and it is the first time I ever saw this error message. I can try to restart it (with --logDebug) and see if it fails again.
crusoe
@mr-c:matrix.org
[m]
@gmloose: Thanks, do let us know what happens!
Marcel Loose
@gmloose
As was (almost) to be expected, the restarted workflow completed without error. So this type of error is probably very hard to track down.
anthfm
@anthfm
Hi, I have a CWL workflow (subworkflow with scatter) that takes in input files, processes them with step1, and subsequently uses the step1 output to run step2, resulting in final output files. It executes with no errors using Toil. However, I noticed that after the step1 and step2 jobs are executed and completed, Toil re-issues empty step1 jobs that terminate successfully immediately. Is this normal behaviour for CWL scatter/subworkflows using Toil? Thank you
Adam Novak
@adamnovak
@anthfm That's normal behavior for Toil; jobs are issued once on the way down through the workflow graph, and then again on the way back up to do some cleanup. It shouldn't actually redo any CWL work, although the resource requirements might be excessive for the cleanup because I think it still asks for the same as the original job.
Martín Beracochea
@mberacochea
Hey, I have a CWL pipeline which I run in LSF. I'm using TOIL_LSF_ARGS to set the queue, all good. But, I want to run one step in a different queue... is this possible?
crusoe
@mr-c:matrix.org
[m]
@mberacochea just a single step and skip the rest of the workflow?
There is a cwltool option to extract a single step from a workflow and either print it out or run it
I forget if we exposed that in toil-cwl-runner. If we didn't, you can use cwltool --single-step name --print-subgraph and then take the result to toil-cwl-runner
Martín Beracochea
@mberacochea
hi @mr-c:matrix.org , I need to run the whole workflow (but that step has to run in a different queue).
crusoe
@mr-c:matrix.org
[m]
Okay. What is special about that queue? Longer walltime is allowed? Special hardware? Bigmem?
Martín Beracochea
@mberacochea
bigmem
Adam Novak
@adamnovak
Yeah, Toil doesn't have this feature. If you have an idea for how to improve the LSF batch system code in Toil so that it can know what queues jobs need to go in based on their memory requirements, and you can figure out how to code it so that it will still work on everybody else's LSF clusters, we could take a PR.
Or if CWL grew a way to add an LSF queue annotation to a job, we could maybe punch a hole through to the batch system so that it could know about it.
crusoe
@mr-c:matrix.org
[m]
@mberacochea do you have to specify a queue? Toil does put the memory requirements in the batch job. You could ask your LSF admin if that'd be enough
There is an unimplemented proposal for overriding the queue for a certain job: common-workflow-language/common-workflow-language#581
Martín Beracochea
@mberacochea
Thank you both. I need to specify the queue, otherwise the LSF rejects the job (I'll see if the admins have something to sort that out).
yeah, the overrides seem to be the way to go in my case
crusoe
@mr-c:matrix.org
[m]
Okay, if they can't cope then we could try implementing a vendor extension version of BatchQueue for toil-cwl-runner
Martín Beracochea
@mberacochea
All right. Sounds like a plan. Thanks
crusoe
@mr-c:matrix.org
[m]
@adamnovak that would require a way to pass per Toil.job options to the batch systems, yes
Adam Novak
@adamnovak
In general we need to revise requirements to be a bit more free-form to support things like GPUs. I think we'll end up with some kind of dict of keys that the batch system can consult.
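To make the idea concrete, here is a minimal sketch of how a free-form requirements dict could drive queue selection by memory. This is not Toil's API: the `QUEUE_RULES` table, the queue names, the `"memory"` and `"queue"` keys, and both functions are hypothetical. The only real detail is that LSF's `bsub` accepts `-q` to name a queue.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical site configuration: (minimum memory in bytes, queue name).
# The queue names and thresholds are invented for illustration.
QUEUE_RULES: List[Tuple[int, str]] = [
    (256 * 1024**3, "bigmem"),
    (0, "standard"),
]


def choose_queue(requirements: Dict[str, Any],
                 rules: List[Tuple[int, str]] = QUEUE_RULES) -> str:
    """Pick a queue from a free-form requirements dict (hypothetical keys)."""
    # An explicit per-job queue annotation wins over any memory-based rule.
    explicit = requirements.get("queue")
    if isinstance(explicit, str):
        return explicit
    memory = int(requirements.get("memory", 0))
    # Check the largest threshold first so big jobs land in the big queue.
    for min_memory, queue in sorted(rules, reverse=True):
        if memory >= min_memory:
            return queue
    raise ValueError(f"no queue accepts a job needing {memory} bytes")


def lsf_args(requirements: Dict[str, Any]) -> List[str]:
    """Build the extra bsub arguments for one job."""
    return ["-q", choose_queue(requirements)]
```

Keeping the rules in a per-site table rather than in code is one way to avoid breaking other LSF clusters, since each site would supply its own thresholds and queue names.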
Michael Milton
@multimeric
If I run a Python workflow, and it completes, then I edit the workflow and re-run it with --restart, Toil still thinks it's finished and I get: [2021-09-24T14:33:48+1000] [MainThread] [W] [toil.common] Requested restart but the workflow has already been completed; allowing exports to rerun. What I actually want it to do here is cache the jobs that are unchanged and re-run those that have changed. Is this possible at all in Toil?
Adam Novak
@adamnovak
@multimeric Unfortunately that's not a feature that we have. We don't keep any record of what particular Python values or class or function definitions each workflow task depends on, or what version of those definitions it ran with. In fact, we don't really keep records of the jobs at all after they and their descendants complete, and we sever the connection between a job description and its Python code after that code runs successfully, unless it's a checkpoint job.
If we wanted to do this, we'd have to basically make all jobs checkpoint jobs, except even more so because we'd have to keep them around after they and their descendants all finished. Then we'd have to come up with a new way to enumerate the jobs that still need to happen (which right now is basically 1:1 with the jobs that still exist). We'd also need to come up with a way to traverse the possible call and constant access graph of a Python function, determine an identifier for the version of each function or constant that is used, and store that along with the finished job.
And anything that accessed code via dynamic lookup would need to either always or never rerun, because we wouldn't be able to find the code.
Adam Novak
@adamnovak
That all being said, WDL runners are able to do this with WDL code, so it might not be impossible.
Michael Milton
@multimeric
Thanks for the answer. One angle I've seen used in another system is to annotate each job with a hash which we allow the user to calculate. Then the user can try simple solutions like just hashing the file that the job resides in, or alternatively just keeping a manual version number for each task.
If WDL already does this, then I guess Toil has some concept of "has this job changed", which I would just need to plug this logic into
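A rough sketch of that user-supplied hash idea, assuming a plain dict stands in for whatever persistent store would hold the fingerprints. None of these names exist in Toil; `file_fingerprint`, `version_fingerprint`, `is_unchanged`, and `record_success` are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict


def file_fingerprint(path: str) -> str:
    """Simple option: hash the whole source file that a job lives in."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def version_fingerprint(task_name: str, version: int) -> str:
    """Alternative: a version number the user bumps by hand for each task."""
    return hashlib.sha256(f"{task_name}:{version}".encode()).hexdigest()


def is_unchanged(job_id: str, fingerprint: str, store: Dict[str, str]) -> bool:
    """True if this job last succeeded with the same fingerprint."""
    return store.get(job_id) == fingerprint


def record_success(job_id: str, fingerprint: str, store: Dict[str, str]) -> None:
    """Remember the fingerprint only after the job has finished successfully."""
    store[job_id] = fingerprint
```

A job would be skipped when `is_unchanged` returns True and re-run otherwise. Hashing the whole file is coarse: any edit to the file invalidates every job defined in it, which is the trade-off against maintaining version numbers manually.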
Marcel Loose
@gmloose
Can anyone explain to me how to interpret the output of toil --stats? The documentation is quite limited.
Marcel Loose
@gmloose
Today, I've been bitten by the fact that CWLTool URL-encodes a + character in a filename to %2B. This results in an error: Cannot make job: Invalid filename: 'P233%2B35_structure.txt' contains illegal characters
I saw there are several issues that refer to this:
common-workflow-language/cwltool#1260,
common-workflow-language/cwltool#1098, and
common-workflow-language/cwltool#1445.
Where the last one even contains an almost finished pull request.
So I was wondering, what's the status of this issue? Is it indeed a bug in CWLTool, or is this a (too) strict limitation by CWLTool on allowed characters in a filename?
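For reference, the same encoding can be reproduced with Python's standard library. This only illustrates how percent-encoding treats the + character; it is not a trace of cwltool's actual code path, which is an assumption left unexamined here.

```python
from urllib.parse import quote, unquote

name = "P233+35_structure.txt"

# By default quote() treats "+" as unsafe and percent-encodes it.
encoded = quote(name)

# Decoding restores the original filename.
decoded = unquote(encoded)

# Marking "+" as safe leaves the filename untouched.
kept = quote(name, safe="+")

print(encoded)  # P233%2B35_structure.txt
print(decoded)  # P233+35_structure.txt
print(kept)     # P233+35_structure.txt
```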
crusoe
@mr-c:matrix.org
[m]
Sorry to hear that @gmloose ; https://github.com/common-workflow-language/cwltool/pull/1446#issuecomment-850896086 shows that the PR needs some assistance. Would you like to finish it up?
It is indeed a bug in cwltool; and it should be fixed
Marcel Loose
@gmloose
I could have a look, though I have limited time.
crusoe
@mr-c:matrix.org
[m]
Thanks. Seems like it just needs a docstring plus tweaking Alex's tests to cover the rest of the newly added code paths
Marcel Loose
@gmloose
I was curious about the doc-string part. None of the functions in that file have a doc-string. Why is it enforced on this single test function?
crusoe
@mr-c:matrix.org
[m]
It was a requirement added later, and we use diff-cover to only enforce it for new code
Marcel Loose
@gmloose
I'm not really familiar with tox. How can I run only the modified test test_path_checks.py and get coverage stats?
crusoe
@mr-c:matrix.org
[m]

The two lines with no test coverage are annotated at https://github.com/common-workflow-language/cwltool/pull/1446/files#annotation_2008443310

For local checking you'll need to run all the tests with make diff-cover

Marcel Loose
@gmloose

Hm, that fails.

```
$ make diff-cover
python --version 2>&1 | grep "Python 3"
Python 3.6.9
python -m pytest -rs --cov --cov-config=.coveragerc --cov-report=
ERROR: usage: main.py [options] [file_or_dir] [file_or_dir] [...]
main.py: error: unrecognized arguments: -n --cov --cov-config=.coveragerc --cov-report=
inifile: /home/marcel/code/cwltool/tox.ini
rootdir: /home/marcel/code/cwltool

Makefile:155: recipe for target 'testcov' failed
make: *** [testcov] Error 4
```

OK, make install-dep seemed to do the trick.
Marcel Loose
@gmloose
But it still runs all tests :(. I had expected make diff-cover would only run tests in test_path_checks.py. Am I doing something wrong?
crusoe
@mr-c:matrix.org
[m]
you have to run all the tests to see how the other changes may have impacted the code coverage. You aren't doing anything wrong, no.
crusoe
@mr-c:matrix.org
[m]

So when I run make diff-cover on that PR locally I get:

cwltool/command_line_tool.py (62.5%): Missing lines 207,256-257

Which matches what codecov.io reports, so that is good 🙂
Marcel Loose
@gmloose
I've tried to get my head around what exactly is going on in revmap_file and the new test. I get the impression that the test (in its current setup) can only check that what you put in as filename, also gets out (i.e. the external filename representation). I guess that's why only the if clause is covered by the test. I guess the else clause will only be executed if you supply an internal filename representation (at least, that's what I'm guessing right now). I'm not sure how I would have to supply an internal filename representation in that current test, because it uses a CommandLineTool, which is an external thingy.
crusoe
@mr-c:matrix.org
[m]
internal in this case refers to a path within a software (docker) container
(if that works then we can collapse the code duplication later, don't worry about that for now)