Alex Baden
@alexbaden
FYI, I played with some resource limits on Braincloud.
This may result in jobs being killed -- unclear. But better than cluster being killed! :-)
Greg Kiar
@gkiar
:+1:
Eric Bridgeford
@ebridge2
Word; psyched for the ami!
Greg Kiar
@gkiar
@ebridge2, send me your pipe file please? the one you just ran
Eric Bridgeford
@ebridge2
Is on my desktop; the one I ran is exactly identical to what I sent you last night though
With the virtual free change
Greg Kiar
@gkiar
ok cool
Eric Bridgeford
@ebridge2
Is stats not in 317?
I'm the only one here
joshua vogelstein
@jovo
or 301, we haven’t decided yet where it will be
Greg Kiar
@gkiar
@alexbaden the cluster crashed again - it's completely dumping to one node, and doing so ridiculously (i.e. attempts to give the load of 740 cores to 48). We can't run m2g on bc1 until we figure this out - which I'm more than willing to help support in any way I can. @/all In the meantime I'm focusing on aws.
Alex Baden
@alexbaden
Worse than yesterday AM or the same?
Greg Kiar
@gkiar
same
Alex Baden
@alexbaden
the queue on compute1 was in some sort of error state
i need to go over and reboot compute0, will get that done this afternoon
Alex Baden
@alexbaden
I think we made some progress w/ braincloud. @gkiar is running some test jobs now, and we're evenly split across both compute nodes. so that's better. if this goes well we can slowly open back up to everyone!
however, it appears loni doesnt pass any helpful info to the scheduler about memory usage, etc. so i think i have sge setup to kill jobs that are going to eat all the available memory -- but im not sure that works. so you should have a high index of suspicion for (1) your jobs being killed and (2) the cluster crashing as you run stuff going forward
and i can add sge_admin to my list of "Useless Technologies" on my resume :-)
William Gray
@willgray
hmmm bc1 error for matlab (licensing):

[will@compute0 bin]$ /usr/local/matlab/bin/glnxa64/need_softwareopengl: error while loading shared libraries: libGL.so.1: cannot open shared object file: No such file or directory

MATLAB is selecting SOFTWARE OPENGL rendering.

Error: Activation cannot proceed. You may either:

  1. Set an X11 display, and restart the activation process
  2. Use the silent activation feature
  3. Activate using the license center

@alexbaden
?
Alex Baden
@alexbaden
i guess use octave isn't an okay answer
i added it to my todo list. will try and look tonight, if not tomorrow AM
William Gray
@willgray
thanks, bud
Alex Baden
@alexbaden
my own matlab expired, too. im wondering if all ours expired
whats funny is i could actually connect bc1 to our aws license server.... since its just inbound ports
William Gray
@willgray
HA
i wonder if it expired 9/1 and we didn’t know bc of the other thing
i didn’t run yesterday
Greg Kiar
@gkiar
@/all @alexbaden is a hero. The cluster seems to be working again. I ran through the workflow that @ebridge2 last made (i.e. that subset of the nki-enh dataset) last night and it seems to be running to completion, which is awesome. @ebridge2 this means that you should now set up a workflow with the remainder of the subjects from that dataset, using the workflow I'm about to post here as the base.
@ebridge2 I did all of the first 66 subjects you originally put into the workflow (the workflow I pasted here has fewer subjects because it is just a subset I used for testing)
joshua vogelstein
@jovo
yay @alexbaden !!!!!
Eric Bridgeford
@ebridge2
Awesome sounds good
Alex Baden
@alexbaden
@/all matlab has been reactivated on braincloud1.
Greg Kiar
@gkiar
uh oh...
Screen Shot 2015-09-08 at 3.26.31 PM.png
Greg Kiar
@gkiar
@/all Design decision: We will not run more than 40 subjects in a single m2g workflow. Beyond that point it seems to break down very quickly, for reasons I'm unsure of. Moving forward, @ebridge2, partition datasets into m2g workflows of at most 40 subjects each and save them all in a 'workflows' folder within the dataset directory on bc1.
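A minimal sketch of that partitioning rule, for illustration only. This is not m2g code: the function name `partition_subjects`, the constant name, and the subject ID format are assumptions; only the 40-subject cap comes from the message above.

```python
from typing import List, Sequence

MAX_SUBJECTS_PER_WORKFLOW = 40  # cap taken from the design decision above


def partition_subjects(
    subjects: Sequence[str],
    max_per_workflow: int = MAX_SUBJECTS_PER_WORKFLOW,
) -> List[List[str]]:
    """Split subject IDs into consecutive chunks of at most max_per_workflow."""
    if max_per_workflow < 1:
        raise ValueError("max_per_workflow must be at least 1")
    return [
        list(subjects[i:i + max_per_workflow])
        for i in range(0, len(subjects), max_per_workflow)
    ]


# Hypothetical subject IDs, made up for this example.
subjects = [f"sub-{n:03d}" for n in range(1, 101)]
chunks = partition_subjects(subjects)
print([len(c) for c in chunks])  # [40, 40, 20]
```

Each chunk would then become one workflow file in the dataset's 'workflows' folder; how those files are written depends on the pipeline format and is not shown here.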
William Gray
@willgray
Can you guys figure out how many submissions to scheduler happen for each subject? Is it the same for grouped workflows or those submitted through cli? I have some ideas on how to scale up
For em processing things we parcellate at about 10k jobs
Greg Kiar
@gkiar
It's way less than that. Per subject there are 13 jobs, therefore 13 submissions, so we shouldn't even be coming close to the limit you use for i2g. I am pretty sure that the scheduler and LONI aren't communicating properly, and when we get to tensor gen things go wonky because that's the first module that really uses a lot of RAM.
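As a rough check on the numbers in this exchange, the arithmetic can be sketched as below. It assumes submissions scale linearly with subject count and uses the 40-subject workflow cap from the earlier design decision; the helper name `submissions_for` is hypothetical.

```python
JOBS_PER_SUBJECT = 13            # from the message above
MAX_SUBJECTS = 40                # workflow cap from the design decision
EM_PARCELLATION_JOBS = 10_000    # approximate scale quoted for EM processing


def submissions_for(n_subjects: int, jobs_per_subject: int = JOBS_PER_SUBJECT) -> int:
    """Scheduler submissions for one workflow, assuming linear scaling."""
    return n_subjects * jobs_per_subject


total = submissions_for(MAX_SUBJECTS)
print(total)                         # 520
print(total < EM_PARCELLATION_JOBS)  # True
```

So a full 40-subject workflow is on the order of 500 submissions, which supports the point that job count alone is unlikely to be the cause of the breakdown.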
William Gray
@willgray
happily, @gkiar, the work you and I are doing before the end of the month will also result in a way to submit directly to sge. :)
if sge itself is messed up, we need an alex
Greg Kiar
@gkiar
:) wonderful, @willgray
arana91
@arana91
Hello Matlab programmers, Needed some guidance.
Hit me back when online!
Greg Kiar
@gkiar
Hi there. Gitter is deprecated for our communication, please email support@neurodata.io with a specific question if there is something we can help you with. Also, if you can please let us know where you found this link that would be great so that we can make that more clear for others in the future. Thanks @arana91