    Christophe Prud'homme
    @prudhomm
    @victorsndvg could it be the info in the process name ?
    where could I find the info on the node where it ran ?
    I don’t think I have it in the log files
    Christophe Prud'homme
    @prudhomm
    @victorsndvg I managed to reproduce the error with the advanced app. So what I have is: the minimal app uses thin-shared successfully, the advanced app works with thinnodes but not thin-shared. Weird.
    @victorsndvg yet another instance in limbo: MSO4SC/MSOPortal#156
    victorsndvg
    @victorsndvg
    Always the same error?
    Christophe Prud'homme
    @prudhomm
    yes
    victorsndvg
    @victorsndvg
    are you using the production portal?
    Christophe Prud'homme
    @prudhomm
    and minimal is working on thin-shared; it is hard-coded in the blueprint
    yes
    victorsndvg
    @victorsndvg
    Ok, let me reproduce it from the portal
    Christophe Prud'homme
    @prudhomm
    the difference between the two is that in advanced you have access to the HPC input parameters
    victorsndvg
    @victorsndvg
    can you send a link to both blueprints?
    Christophe Prud'homme
    @prudhomm
    advanced0 provides the minimal interface, which works fine (I tried twice), whereas advanced1 provides the advanced interface, which fails on thin-shared (I tried twice) but works on thinnodes (I tried once and I am waiting for the second test to run; it is currently pending)
    @victorsndvg it seems that it is very hard to work on thinnodes currently. We are trying to make a video for eye2brain and it takes a long time to go from PENDING to RUNNING even though we don’t use a lot of resources
    victorsndvg
    @victorsndvg
    @prudhomm , I've launched eye2brain-level-0-advanced from the portal and it worked
    Christophe Prud'homme
    @prudhomm
    without changing anything ?
    thin-shared ?
    victorsndvg
    @victorsndvg
    Without changing anything
    Christophe Prud'homme
    @prudhomm
    then why did I have these failures with squashfs ?
    I got that twice, let me try a third time on thin-shared with the advanced app. (it is still pending on thinnodes)
    victorsndvg
    @victorsndvg
    Don't know, as I cannot reproduce it. I need more details. Maybe the sbatch script that the portal generates for you
    Christophe Prud'homme
    @prudhomm

    Now I have this with the advanced app:

    [2018-11-21T11:47:11.072Z] -------INSTALL-------
    [2018-11-21T11:46:50.549Z] Starting 'install' workflow execution
    [2018-11-21T11:46:51.475Z] Creating node main_hpc_ydp64b (main_hpc)
    [2018-11-21T11:46:53.304Z] Configuring node main_hpc_ydp64b (main_hpc)
    [2018-11-21T11:46:54.305Z] Starting node main_hpc_ydp64b (main_hpc)
    [2018-11-21T11:46:54.602Z] Sending task 'hpc_plugin.tasks.prepare_hpc' main_hpc_ydp64b (main_hpc)
    [2018-11-21T11:46:54.797Z] Task started 'hpc_plugin.tasks.prepare_hpc' main_hpc_ydp64b (main_hpc)
    [2018-11-21T11:46:56.413Z] Task failed 'hpc_plugin.tasks.prepare_hpc' -> 'workload_manager' main_hpc_ydp64b (main_hpc)
    KeyError: 'workload_manager'
        Traceback (most recent call last):
      File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 641, in main
        payload = handler.handle()
      File "/opt/mgmtworker/env/lib/python2.7/site-packages/cloudify/dispatch.py", line 397, in handle
        result = self.func(*self.args, **kwargs)
      File "/opt/mgmtworker/env/plugins/default_tenant/eye-l0-advanced-3-hpc/lib/python2.7/site-packages/hpc_plugin/tasks.py", line 68, in prepare_hpc
        wm_type = config['workload_manager']
    KeyError: 'workload_manager'

    on thin-shared, I am at a loss here (I still don’t have the result on thinnodes)
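The traceback above stops at a plain dictionary lookup, `config['workload_manager']`, so the only information in the error is the name of the missing key. A minimal sketch of a more informative check follows, assuming the configuration is an ordinary dict. The helper `require_keys`, the exception `MissingConfigKey`, and the example values are hypothetical and are not part of `hpc_plugin`.

```python
class MissingConfigKey(KeyError):
    """Raised when a required HPC configuration entry is absent."""


# Hypothetical list: 'workload_manager' is the only key the traceback shows.
REQUIRED_KEYS = ("workload_manager",)


def require_keys(config, required=REQUIRED_KEYS):
    """Return config unchanged if every required key is present, else raise."""
    missing = [key for key in required if key not in config]
    if missing:
        present = ", ".join(sorted(config)) or "(none)"
        raise MissingConfigKey(
            "HPC config is missing %s; keys present: %s"
            % (", ".join(missing), present)
        )
    return config


if __name__ == "__main__":
    # Example values are assumptions made for illustration only.
    try:
        require_keys({"partition": "thin-shared"})
    except MissingConfigKey as exc:
        print(exc)
```

Listing the keys that did arrive makes it easier to tell an empty configuration from one where a single input was dropped, which is the kind of detail needed to compare a failing run with a working one.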

    Christophe Prud'homme
    @prudhomm
    @victorsndvg the level 0 advanced app just ran on thinnodes successfully, whereas it failed all three times for me on thin-shared.
    And the level 0 minimal app worked properly three times in a row.
    Could there be some instabilities on thin-shared?
    I have launched the level 0 advanced app on thin-shared a fourth time and it finally worked.
    Christophe Prud'homme
    @prudhomm
    level1 app worked on thinnodes. It seems that thinnodes are somewhat more stable for our apps than thin-shared
    I think I will revert to thinnodes for the time being for the level 0 apps, it is safer for the tests
    victorsndvg
    @victorsndvg
    I cannot explain what is happening (taking into account that the 3rd error is different)
    Please, if you think that using thinnodes is better for your app, feel free to use it by default
    I will try again to run the app to see if something wrong happens ...
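Since the errors differ between runs, one way to compare them is to pull the `Task failed` lines out of each deployment log. The sketch below is modelled only on the single `Task failed` line quoted in the log earlier in this conversation; the function name `failed_tasks` is hypothetical and other log layouts are not covered.

```python
import re

# Pattern modelled on the one "Task failed" line quoted in the log above.
FAILED = re.compile(
    r"^\s*\[(?P<time>[^\]]+)\]\s+Task failed '(?P<task>[^']+)'"
    r"\s+->\s+(?P<error>.*?)\s+(?P<node>\S+)\s+\((?P<kind>[^)]+)\)\s*$"
)


def failed_tasks(log_text):
    """Return (time, task, error, node) tuples for each 'Task failed' line."""
    rows = []
    for line in log_text.splitlines():
        match = FAILED.match(line)
        if match:
            rows.append(
                (
                    match.group("time"),
                    match.group("task"),
                    match.group("error"),
                    match.group("node"),
                )
            )
    return rows


if __name__ == "__main__":
    sample = (
        "[2018-11-21T11:46:54.797Z] Task started 'hpc_plugin.tasks.prepare_hpc' "
        "main_hpc_ydp64b (main_hpc)\n"
        "[2018-11-21T11:46:56.413Z] Task failed 'hpc_plugin.tasks.prepare_hpc' "
        "-> 'workload_manager' main_hpc_ydp64b (main_hpc)\n"
    )
    for row in failed_tasks(sample):
        print(row)
```

Running this over the logs of several attempts gives one short row per failure, so a different error on the third run stands out immediately.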
    Christophe Prud'homme
    @prudhomm
    @victorsndvg yes this is weird, some kind of instability, I will revert that change and move back to thinnodes, it can take quite some time to run there though.
    victorsndvg
    @victorsndvg
    @prudhomm, I've run the previous version (yesterday) and the new one (today) on thin-shared and it worked perfectly for me both times
    Christophe Prud'homme
    @prudhomm
    @victorsndvg thank you for the tests. Unfortunately I tested too and got successes as well as failures. Thin-shared can still be used in the advanced interface for now; I preferred to be on the safe side and use thinnodes until we understand what is going on with the instabilities
    victorsndvg
    @victorsndvg
    @prudhomm , I agree with you
    Trophime
    @Trophime
    @victor I cannot get a remote desktop on canary: Couldn't read the desktops: Couldn't create a new desktop
    same behaviour on the portal
    victorsndvg
    @victorsndvg
    This should have been fixed some time ago
    Trophime
    @Trophime
    ??
    victorsndvg
    @victorsndvg
    Around the time of ECMI. Let me check the email. I'm going to give you a quick fix
    Is it happening with your account?
    Trophime
    @Trophime
    yep
    victorsndvg
    @victorsndvg
    Can you create a desktop from https://portalusuarios.cesga.es/tools/remote_vis ?
    Trophime
    @Trophime
    it seems not... a popup window appears just for a second...
    victorsndvg
    @victorsndvg
    It seems that there is a problem with the visualization node. Technicians are taking a look at it. I will tell you as soon as it's ready
    Trophime
    @Trophime
    @emepetres seems like I cannot register a new instance on the portal: Could'nt create new instance error 500...
    Trophime
    @Trophime
    Did someone manage to create a new instance in the newest portal?