Stefan Kroboth
@stefan-k
Sorry about the noise in the PR, I thought I was in our local GitLab... embarrassing...
Manuel Giffels
@giffels

Can I somehow trigger codecov for the Slurm PR? It seems like the current codecov result is not based on the most recent commit...

I have re-started CI at Travis, which should actually update codecov afterwards. Fingers crossed.

Stefan Kroboth
@stefan-k
@giffels: Thanks! Unfortunately it looks like it didn't work. At least the diff seems to be missing the last commit :/
Max Fischer
@maxfischer2781
It seems there was an issue mapping the report to the commit. codecov complains that it was "Unable to find commit in GitHub"
Max Fischer
@maxfischer2781
Looks like the metadata was garbled, and I'm not sure we can reset that. The easiest fix is probably for you to push a new commit, so that an entirely new CI + Codecov run is triggered.
Stefan Kroboth
@stefan-k
Thanks a lot! I'll try to push something today
WestByNoreaster
@mtwest2718
Good evening all.
Is this community forum still in use?
Max Fischer
@maxfischer2781
@mtwest2718 Yes it is. Latency is a bit longer when there isn't much going on, but we channel owners get notified of messages.
WestByNoreaster
@mtwest2718
I know you have a very pretty new github.io page
But there wasn't a clear way to contact y'all.
WestByNoreaster
@mtwest2718
So to introduce myself, I am a sys-admin at the University of Exeter.
  • We are setting up a 2,500-core OpenStack system, which will mostly be used to serve researchers bespoke interactive VMs. I will admit I am a very new sys-admin, so a lot of this stuff is confusing to me.
  • I would love to use COBalD/TARDIS to utilize idle CPU resources for batch compute, but the docs feel a bit light on setting up a new pool.
WestByNoreaster
@mtwest2718
I understand this is asking a bit of your (collective) time, but a walk-through on setting up a new pool of OpenStack nodes would be greatly beneficial.
Max Fischer
@maxfischer2781
Right, we've wanted to brush up the github.io page for a while but there are still some parts missing.
In case you want other ways of contacting us as well:
You can reach the core team at matterminers@lists.kit.edu at any time for practically any questions.
We've got a Mattermost at https://chat.eudat.eu/ where you can reach other users of C/T as well – though it's mostly in German; as usual for the community, people can switch to English at any time and will gladly do so.
Finally, a lot of us are also on the CERN Mattermost.
Now that we've got that covered... :sweat_smile:
@giffels has more experience using C/T with OpenStack, but I should be able to walk you through the rough steps and we can work our way up from there. ;)
Before we dive into C/T itself, do you have an existing batch system through which users will access the resources? I see you've been active on the HTCondor mailing list, so I assume you're familiar with running an HTCondor cluster?
WestByNoreaster
@mtwest2718
The batch system on our general purpose cluster is Slurm and I'd like to avoid touching that production system as much as possible. The OpenStack cluster is new and we have more flexibility to play with it.
I am an experienced HTCondor user, not admin.
So I would have to spin up a central manager and submit hosts. How my boss wants storage managed, I am not sure.
We just had the machines installed last week, so a bunch is still in flux. I just wanted to reach out beforehand. This also helps me make a clearer case to my supervisor if I might need an extra node for running the submit host and/or CM.
Max Fischer
@maxfischer2781
So, the rough outline of what you should be planning with (I'll add the items as separate messages, feel free to reply in-thread):
  • You'll need some HTCondor submit hosts to get jobs into the system. C/T doesn't really care about these, but be prepared that they must be able to communicate with your opportunistic resources. A public IP address is advisable, since then the resources themselves don't need one.
  • You'll need a central manager for the Collector and Negotiator. This one must be reachable from all opportunistic resources. I recommend configuring the Collector as a Condor Connection Broker as well, since that means only one side of the submit/worker pair needs to be publicly reachable. (A quick reachability check is sketched below.)
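Once the central manager is up, something like the following (just an untested sketch using the htcondor Python bindings; "cm.example.org" is a placeholder for your actual central manager host) should tell you whether the Collector is reachable from a given machine and which daemons it already knows about:
```python
# Untested sketch: check whether the central manager's Collector is reachable
# from this machine, using the htcondor Python bindings (pip install htcondor).
# "cm.example.org" is a placeholder for your actual central manager host.
import htcondor

collector = htcondor.Collector("cm.example.org:9618")

# List the submit hosts (schedds) and worker nodes (startds) the pool knows about.
for ad_type, label in [(htcondor.AdTypes.Schedd, "schedd"),
                       (htcondor.AdTypes.Startd, "startd")]:
    for ad in collector.query(ad_type, projection=["Name", "MyAddress"]):
        print(label, ad.get("Name"), ad.get("MyAddress"))
```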
Max Fischer
@maxfischer2781
  • You'll need a machine that runs COBalD/TARDIS. This one must be able to reach your resource provider (i.e. OpenStack) and the Collector. It's fine to just use the central manager for this as well; C/T is pretty flexible about being redeployed later on, so you don't lock yourself in. (A quick check of the OpenStack side is sketched below.)
  • You can have static worker nodes (STARTD) but don't need them; C/T will opportunistically add worker nodes to your system. There is no problem having multiple C/T instances add resources, nor in having resources from other sources as well.
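For the OpenStack side, a minimal connectivity check with the openstacksdk could look roughly like this – just a sketch, not C/T itself, and "my-openstack" is a placeholder clouds.yaml entry:
```python
# Rough sketch: verify the COBalD/TARDIS host can authenticate against your
# OpenStack API, using openstacksdk and a clouds.yaml entry.
# "my-openstack" is a placeholder cloud name.
import openstack

conn = openstack.connect(cloud="my-openstack")

# If this succeeds, credentials and network routing to the OpenStack API are fine.
for flavor in conn.compute.flavors():
    print("flavor:", flavor.name)
for server in conn.compute.servers():
    print("server:", server.name, server.status)
```
If that works from the machine that will run C/T, the TARDIS OpenStack site adapter should be able to use the same project and credentials; the exact configuration is covered once we get to the OpenStack part.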
Max Fischer
@maxfischer2781
If you need recommendations on any of these, feel free to ask. Generally, C/T doesn't care much about how HTCondor is set up, so we try to avoid giving people wrong ideas. But we do have expertise across the entire HTCondor stack, so we can give you more information as needed.
WestByNoreaster
@mtwest2718
Is it better to run HTCondor (and the relevant container engine) on bare metal on these systems, or to have a VM instance that auto-starts as a worker node and connects to the pool, so as to use the OpenStack provisioning tools?
Max Fischer
@maxfischer2781
As for getting started with C/T: I recommend going through our tutorial first, which will walk you through setting up COBalD and TARDIS with a dummy resource pool. We can then go through using the OpenStack resource pool afterwards.