Ward Vandewege
@cure
is everything working OK otherwise @tschoonj ?
Tom Schoonjans
@tschoonj
yes, we did get our minimal crunch-dispatch-slurm working
Ward Vandewege
@cure
excellent
Andrey Kartashov
@portah
@tschoonj Do you have a recommendation for a newbie? or just follow the docs?
Tom Schoonjans
@tschoonj
just follow the docs
one thing we are still discussing is federation
is it possible with 2.2 to submit a CWL wf to cluster A, which has keep but no crunch, and have that wf picked up by cluster B, which has no keep but does have crunch support?
Andrey Kartashov
@portah
Arvados federation? Or some integration with your current infrastructure?
Tom Schoonjans
@tschoonj
Arvados federation
Peter Amstutz
@tetron
@tschoonj cluster B still needs to have Keep for storing intermediate results
there's some federation token handling that we need to work on to make that case of "cluster B uses federated data from cluster A" work smoothly
Tom Schoonjans
@tschoonj
@tetron so this won't be possible even with 2.3, even with cluster B having its own keep?
Peter Amstutz
@tetron
there might be a workaround
you can override the token it uses
actually
it depends on how they are federated
Tom Schoonjans
@tschoonj
in our case cluster A would be the LoginCluster
so I assume that its tokens should work also on cluster B?
Peter Amstutz
@tetron
so here's how it works: when a process (container) runs, it gets an ephemeral token for the lifetime of the process
that's created by the API server that owns the container
the problem is that if it is a satellite cluster, not the main login cluster, a token created on the satellite is only good for accessing data on the satellite
so what it needs to do is get a new token from the login cluster
but that feature doesn't exist yet
Tom Schoonjans
@tschoonj
and there's no workaround for this currently?
Peter Amstutz
@tetron
the API allows you to provide an explicit token to use when submitting a container
Tom Schoonjans
@tschoonj
aha
that sounds promising
Peter Amstutz
@tetron
I'm checking to see under what circumstances the workflow runner passes that parameter
the parameter is "runtime_token" on container_request
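A minimal sketch of what a container request carrying an explicit token might look like. Only the `runtime_token` field name comes from the discussion above; the helper name, the image, the mount and resource values, and the token string are hypothetical placeholders. The commented-out submission call assumes the Arvados Python SDK is installed and configured, and has not been verified here.

```python
import json


def build_container_request(command, image, runtime_token=None):
    """Assemble a container_request body as a plain dict.

    Hypothetical helper for illustration; all values other than the
    'runtime_token' key are placeholder assumptions.
    """
    body = {
        "command": command,
        "container_image": image,
        "cwd": "/out",
        "output_path": "/out",
        "mounts": {"/out": {"kind": "tmp", "capacity": 1000000}},
        "runtime_constraints": {"vcpus": 1, "ram": 256 * 1024 * 1024},
        "priority": 1,
        "state": "Committed",
    }
    # Only attach the token when one is supplied; otherwise the API
    # server issues its own ephemeral token for the container.
    if runtime_token is not None:
        body["runtime_token"] = runtime_token
    return body


# Placeholder token string, not a real credential.
example_token = "v2/aaaaa-gj3su-000000000000000/placeholdersecret"
request_body = build_container_request(
    ["echo", "hello"], "arvados/jobs", runtime_token=example_token
)
print(json.dumps(request_body, indent=2))

# Assumed submission path with the Arvados Python SDK (not run here):
#   import arvados
#   api = arvados.api("v1")
#   api.container_requests().create(body=request_body).execute()
```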
Tom Schoonjans
@tschoonj
I assume that this is the runtime_token in the container_requests API?
:)
does arvados-cwl-runner have a similar option?
Peter Amstutz
@tetron
it does not. this is actually isolated from the workflow runner. the workflow runner has the ability to request that a container be submitted to a different cluster than the main one, and runtime_token is how controller provides credentials
so actually something like this might work
run arvados-cwl-runner --local configured to communicate with the login cluster
use arv:ClusterTarget to send all the jobs to cluster B
what would happen is the job is submitted to cluster A, cluster A redirects the request to cluster B along with proper credentials, and cluster B receives the job along with credentials recognized by cluster A to access its data
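The `arv:ClusterTarget` step above could be sketched as follows: a small helper that attaches the hint to a CWL document held as a Python dict and writes it out as JSON. The helper name, the sample tool, and the cluster ID `bbbbb` are hypothetical, and the sketch assumes the document uses the map form of `hints` rather than the list form.

```python
import json

# Namespace URI used for Arvados CWL extensions.
ARV_NAMESPACE = "http://arvados.org/cwl#"


def add_cluster_target(cwl_doc, cluster_id, project_uuid=None):
    """Return a copy of cwl_doc with an arv:ClusterTarget hint attached.

    Hypothetical helper; assumes 'hints' (if present) is a mapping.
    """
    doc = dict(cwl_doc)

    namespaces = dict(doc.get("$namespaces", {}))
    namespaces["arv"] = ARV_NAMESPACE
    doc["$namespaces"] = namespaces

    target = {"cluster_id": cluster_id}
    if project_uuid is not None:
        target["project_uuid"] = project_uuid

    hints = dict(doc.get("hints", {}))
    hints["arv:ClusterTarget"] = target
    doc["hints"] = hints
    return doc


# Placeholder tool; 'bbbbb' stands in for cluster B's ID.
tool = {
    "cwlVersion": "v1.1",
    "class": "CommandLineTool",
    "baseCommand": ["echo", "hello"],
    "inputs": {},
    "outputs": {},
}
targeted = add_cluster_target(tool, "bbbbb")
print(json.dumps(targeted, indent=2))
```

The original `tool` dict is left unmodified, so the same definition can be retargeted at a different cluster by calling the helper again.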
Tom Schoonjans
@tschoonj
that sounds like something we could test
would this work with 2.2.2 on cluster A?
Peter Amstutz
@tetron
yes
Tom Schoonjans
@tschoonj
many thanks Peter! I will discuss this with our team
Peter Amstutz
@tetron
the workaround is a bit speculative but it is worth a try. the main drawback is that you probably can't run the workflow runner itself as a container, so you'll end up with a bunch of disassociated container records instead of them being grouped under a single leader
but improving this situation is on our roadmap
and it will probably get a boost in priority
Ward Vandewege
@cure
yeah sounds like we need to prioritize #16888
Peter Amstutz
@tetron
the arvados user group meeting is happening now: https://forum.arvados.org/t/arvados-user-group-video-chat/47/9
Peter Amstutz
@tetron:matrix.org
@room Arvados 2.3 has been released! https://forum.arvados.org/t/arvados-2-3-0-released/95
Brad Chapman
@chapmanb
When using the Python API, what is the right function call to update a collection with new files replacing the old ones? I'm happily creating initial collections using save_new, but when trying to create a new version of a collection with new files using a similar approach and calling save, I get the dreaded [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate, which I suspect is just telling me I'm not doing it the right way. Are there any cookbook examples of this? I'm having lots of fun learning my way around the API and enjoying using the fabulous new workbench, cool to see all the awesome work you've been doing.
Peter Amstutz
@tetron
are you using playground or arvbox?