pvanheus
@pvanheus
Thanks for the tips... I've added the --linkImports and installed from source (so my toil-cwl-runner is now version 5.3.0a1) and it is still copying everything into the work directory...
pvanheus
@pvanheus
and then after doing so it does some kind of reading of each file (perhaps populating "contents"?). I'm running it on a subset of 200 files to examine behaviour more closely
pvanheus
@pvanheus
oh... it copies each file into the jobStore dir. sigh unfortunately still a lot of copying
after all this is completed though, this version is much better at keeping my cluster busy :)
Lon Blauvelt
@DailyDreaming
@pvanheus That's odd, and I wouldn't expect it to still be copying.
I could try to run it from my end if you have a reproducible workflow that you wanted to make an issue for: https://github.com/DataBiosphere/toil/issues
Vijay Lakhujani
@vlakhujani
the clusterStats option does not produce a JSON output, am I missing something?
Lon Blauvelt
@DailyDreaming
@vlakhujani That option could be worded better, as it only works with mesos (and therefore AWS). If using a different cluster, try the --stats option. If using mesos/aws, then let me know because then that's a bug we need to fix.
I'll go ahead and change the wording on the option to explain that it only works on mesos/aws.
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] Should we throw if we're trying to use it not on Mesos/AWS?
Lon Blauvelt
@DailyDreaming
That's probably a good idea Adam. I'll add that in too.
Douglas Lowe
@douglowe
should the cwltool:overrides: notation work when using toil-cwl-runner? I have an input file containing this, which does what I expect when I try running it with cwltool, but does not when I try using toil-cwl-runner
Peter Amstutz
@tetron
I don't think so, I think that's still a cwltool specific feature. it might show up in a future CWL spec revision
Douglas Lowe
@douglowe
ahh, okay
I'll stop trying to fix my syntax then :-/
Lon Blauvelt
@DailyDreaming
[Lon Blauvelt, UCSC GI] @douglowe We have an issue for this, so it's on the roadmap, but not currently being worked on.
karma29
@karma29:matrix.org
[m]
hi! i'm new to CWL & Toil. can i discuss this issue DataBiosphere/toil#3469 in this channel?
Peter Amstutz
@tetron
@karma29:matrix.org yes that would be appropriate. also https://gitter.im/common-workflow-language/common-workflow-language
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] Uh-oh, does the Matrix bridge not talk to the Slack bridge?
Lon Blauvelt
@DailyDreaming
@karma29:matrix.org What's the issue? @adamnovak D:
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] Oh, looks like it works.
karma29
@karma29:matrix.org
[m]

hello! just curious about the code here : https://github.com/DataBiosphere/toil/blob/master/src/toil/cwl/cwltoil.py#L736

mutable is by default set to false, which means that there is no downloaded copy of the file (and a link to it is created instead). in the function call examples, there wasn't any explicit declaration of mutable to true, so do they exhibit "streamable" properties? what more changes should we make here?

karma29
@karma29:matrix.org
[m]
actually i'm a gsoc'21 applicant so i wasn't sure if it would be more suitable to reply to the comment threads on github or join the irc. what's recommended though? i'm fine with either! 😃
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] We watch Github issues, and the chat here, but not really Github code comments. Either there or here is fine; here is maybe better for questions that are not themselves bugs.
[Adam Novak, UCSC GI] I think the idea behind CWL streamable is that you will get a pipe (a FIFO) presented to the tool instead of a normal file? I'm not really sure.
Peter Amstutz
@tetron
yes
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] Anyway, readGlobalFile produces an ordinary file, with a filename and all the data on disk. It might be given via a symlink, and the lack of mutable means the user code isn't allowed to modify it, as other jobs may be using the same copy.
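As a rough illustration of the mutable distinction described above: a read-only request can be satisfied with a link to a shared copy, while a mutable request needs a private copy the job is free to change. The helper read_shared_file below is hypothetical and is not Toil's implementation, which also involves caching that this sketch does not model.

```python
import os
import shutil
import tempfile


def read_shared_file(stored_path, dest_path, mutable=False):
    """Hypothetical helper, not Toil's API: link when read-only, copy when mutable."""
    if mutable:
        # A private copy: the caller may modify it without affecting other jobs.
        shutil.copyfile(stored_path, dest_path)
    else:
        # A symlink to the shared copy: cheap, but the caller must not modify it.
        os.symlink(stored_path, dest_path)
    return dest_path


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        stored = os.path.join(workdir, "stored.txt")
        with open(stored, "w") as handle:
            handle.write("shared data")

        linked = read_shared_file(stored, os.path.join(workdir, "linked.txt"))
        copied = read_shared_file(
            stored, os.path.join(workdir, "copied.txt"), mutable=True
        )

        # Editing the mutable copy leaves the shared original untouched.
        with open(copied, "a") as handle:
            handle.write(" plus local edits")
        with open(stored) as handle:
            stored_after = handle.read()

        return os.path.islink(linked), os.path.islink(copied), stored_after
```

Calling demo() shows that only the read-only path is a link, and that appending to the mutable copy does not change the stored file.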
Peter Amstutz
@tetron
and if the data is coming from a remote location, it can be streamed incrementally instead of waiting for a full download
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] Toil already has a file_store.readGlobalFileStream function, but that returns a Python file object, and doesn't currently produce a FIFO on disk. So I think getting CWL streamable support would involve handling streamable requests from CWL by using file_store.readGlobalFileStream and producing a FIFO and a thread to fill it with data.
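The FIFO-plus-thread idea above could look roughly like the following sketch. It assumes a POSIX system (os.mkfifo), and it uses an in-memory io.BytesIO as a stand-in for the stream that file_store.readGlobalFileStream would provide. The function stream_to_fifo and its parameters are hypothetical names for illustration, not part of Toil.

```python
import io
import os
import tempfile
import threading


def stream_to_fifo(open_stream, fifo_path, chunk_size=64 * 1024):
    """Create a FIFO at fifo_path and fill it from a stream in a background thread.

    open_stream is a hypothetical zero-argument callable that returns a binary
    file-like object usable as a context manager.
    """
    os.mkfifo(fifo_path)

    def pump():
        # Opening a FIFO for writing blocks until a reader opens the other end.
        with open_stream() as source, open(fifo_path, "wb") as sink:
            while True:
                chunk = source.read(chunk_size)
                if not chunk:
                    break
                sink.write(chunk)

    writer = threading.Thread(target=pump, daemon=True)
    writer.start()
    return writer


def demo():
    payload = b"streamed bytes\n" * 1000
    with tempfile.TemporaryDirectory() as workdir:
        fifo_path = os.path.join(workdir, "input.fifo")
        writer = stream_to_fifo(lambda: io.BytesIO(payload), fifo_path)
        # The tool opens the FIFO path as if it were an ordinary file.
        with open(fifo_path, "rb") as tool_input:
            received = tool_input.read()
        writer.join()
    return received == payload
```

A real implementation would also have to handle a tool that closes the FIFO early (the writer would see a broken pipe) and a tool that never opens it at all (the writer thread would block indefinitely); neither case is covered here.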
Michael R. Crusoe
@mr-c
Or by using a third party utility to achieve the same thing, given the s3 url
karma29
@karma29:matrix.org
[m]

okay, i see! thank you 👍️ so we need to implement a FIFO process for

  1. a sequence of files (irrespective of whether they're downloaded or not, because i think right now it waits for all the input files to be processed to release the outputs?)
  2. a sequence of data within the file, in case the file comes from a remote location or s3 url (so we don't need to download it)
  3. combination of the above two, if there are more files from a remote location or s3 url in the pipe

is that correct?

Lon Blauvelt
@DailyDreaming
@karma29:matrix.org I'm sorry, it seems the matrix.org connection doesn't sync to our slack channel, so your message was missed. In general, this will need to be one FIFO per file. I would focus on the AWS s3 functionality alone first.
I'll try to check the gitter manually more often.
Michael R. Crusoe
@mr-c
@DailyDreaming does your slack connection sync via gitter? Maybe switch to directly syncing your slack channel with the matrix version?
Lon Blauvelt
@DailyDreaming
I'll try that.
Thanks!
karma29
@karma29:matrix.org
[m]
Okay! Is there a link to the slack group i can join?
I'd be grateful if you could give feedback on it! I'm a bit confused regarding how we would actually implement streamable properties, except for the part where we change file_store.readGlobalFileStream so any inputs regarding that would be much appreciated
karma29
@karma29:matrix.org
[m]
Also, is a "job class" a Python class? I have heard of text files getting pickled. So does a job class getting pickled mean that the job class is instantiated as an object and then its contents are pickled to a file?
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] The jobs in Toil are instances of various "job classes", like CWLJob and JobFunctionWrappingJob. We do indeed instantiate these classes and then pickle the resulting objects into files.
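To make the instantiate-then-pickle step concrete, here is a minimal sketch using Python's standard pickle module. EchoJob is a hypothetical stand-in class; it does not reproduce CWLJob, JobFunctionWrappingJob, or how Toil lays out files in the job store.

```python
import os
import pickle
import tempfile


class EchoJob:
    """Hypothetical stand-in for a job class; not Toil's real Job API."""

    def __init__(self, message, memory="100M"):
        self.message = message
        self.memory = memory

    def run(self):
        return self.message.upper()


def round_trip(job, directory):
    """Pickle a job object to a file, then load it back as a new object."""
    path = os.path.join(directory, "job.pickle")
    with open(path, "wb") as handle:
        pickle.dump(job, handle)
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

The file holds the object's state (here message and memory) plus a reference to the class by module and name, so whatever process unpickles it must be able to import that same class.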
karma29
@karma29:matrix.org
[m]

Okay, thank you for the feedback!

I've made some changes - if it looks good, may I go ahead and submit the proposal?

karma29
@karma29:matrix.org
[m]

Hello!

Just a follow-up to my previous message, since there's ~ 2 hours for the submission to end 😃 crusoe @DailyDreaming

karma29
@karma29:matrix.org
[m]
Thank you, just updated with another query. Can you please check the doc? Thank you crusoe @DailyDreaming
Dennis R Kennetz
@drkennetz
Is it appropriate to post job openings in here? We have a more senior position open that is CWL/toil related for bioinformatics.
Michael Milton
@multimeric
Is there any interest in my suggestion here: https://github.com/DataBiosphere/toil/issues/1768#issuecomment-818475173 ? I'd be happy to give it a go if it's okayed by a maintainer
Lon Blauvelt
@DailyDreaming
[Adam Novak, UCSC GI] It sounds like a good idea to me!
Lon Blauvelt
@DailyDreaming
@drkennetz That's fine by me.
crusoe
@mr-c:matrix.org
[m]
Google just "alerted" me to this 11 month old toil-cwl-runner question on biostars https://www.biostars.org/p/448085