    Louis Jenkins
    @LouisJenkinsCS
    Cool, it worked!
    Thanks
    Louis Jenkins
    @LouisJenkinsCS
    It is unfortunate that the client does not have a way of backing out when the server becomes unavailable in between tests. It seems to happen at random times, and the client just hangs.
    glitch
    @glitch
    If you are running the unit tests on a multi-locale system, the hanging client is generally indicative of the server dying for some reason. How many locales are you running on? (There are a few known issues with running unit tests with small amounts of data on systems with num_locales > num_data_elements, and I'm wondering if that's what you're running into.)
    I'd also advocate for running a specific test file / class with the -v flag on first to get an idea of what test is failing, then running it with the -s flag to capture the output.
    python3 -m pytest tests/categorical_test.py::CategoricalTest -v
    Louis Jenkins
    @LouisJenkinsCS
    The default number of locales which is 2
    I'll try the -s flag
    Louis Jenkins
    @LouisJenkinsCS
    Unfortunately I can't seem to drag-and-drop here... so here is a pastebin: https://pastebin.com/xvPa8ZHB
    It works fine down to testEquality
    Louis Jenkins
    @LouisJenkinsCS
    This is weird, I am having so many problems running on the supercomputers, but it works fine on my laptop. Now I can't even do a pip install to build the arkouda package...
    Building wheels for collected packages: arkouda
      Building wheel for arkouda (setup.py) ... error
      ERROR: Command errored out with exit status 1:
       command: /home/users/p02405/anaconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/lus/scratch/tmp/pip-req-build-mra5diqu/setup.py'"'"'; __file__='"'"'/lus/scratch/tmp/pip-req-build-mra5diqu/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /lus/scratch/tmp/pip-wheel-2cq41ur_
           cwd: /lus/scratch/tmp/pip-req-build-mra5diqu/
      Complete output (30 lines):
      /home/users/p02405/anaconda3/lib/python3.8/site-packages/setuptools/dist.py:461: UserWarning: Normalizing 'v2021.08.20+35.g6f5616b.dirty' to '2021.8.20+35.g6f5616b.dirty'
        warnings.warn(tmpl.format(**locals()))
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib
      creating build/lib/arkouda
      copying arkouda/groupbyclass.py -> build/lib/arkouda
      copying arkouda/io_util.py -> build/lib/arkouda
      copying arkouda/pdarrayclass.py -> build/lib/arkouda
      copying arkouda/timeclass.py -> build/lib/arkouda
      copying arkouda/message.py -> build/lib/arkouda
      copying arkouda/sorting.py -> build/lib/arkouda
      copying arkouda/logger.py -> build/lib/arkouda
      copying arkouda/numeric.py -> build/lib/arkouda
      copying arkouda/infoclass.py -> build/lib/arkouda
      copying arkouda/strings.py -> build/lib/arkouda
      copying arkouda/client.py -> build/lib/arkouda
      copying arkouda/security.py -> build/lib/arkouda
      copying arkouda/pdarrayIO.py -> build/lib/arkouda
      copying arkouda/pdarraysetops.py -> build/lib/arkouda
      copying arkouda/_version.py -> build/lib/arkouda
      copying arkouda/pdarraycreation.py -> build/lib/arkouda
      copying arkouda/__init__.py -> build/lib/arkouda
      copying arkouda/dtypes.py -> build/lib/arkouda
      copying arkouda/categorical.py -> build/lib/arkouda
      copying arkouda/join.py -> build/lib/arkouda
      UPDATING build/lib/dist/arkouda/_version.py
      error: [Errno 2] No such file or directory: 'build/lib/dist/arkouda/_version.py'
      ----------------------------------------
      ERROR: Failed building wheel for arkouda
    This one is on a Cray-XC
    glitch
    @glitch

    Are you pip installing from PyPI, or are you building your own client package to install (https://github.com/Bears-R-Us/arkouda#install-ak)?

    You shouldn't install from PyPI, as we have not updated that in quite some time and it will not work with the current server.

    Louis Jenkins
    @LouisJenkinsCS
    :o I see, I got it now, thanks
    Brett Eiffert
    @brettceiffert

    hi all - new to this project. i have just been playing with arkouda for the past month or so.

    i am having an issue with using a system where 256 threads are available (2 x 64 core processors with hyperthreading) and am only getting 128ish arkouda threads. how is the number of arkouda threads spun up decided for a given system? it seems there is a default value that applies to smaller machines but not this machine i am trying. Is there a max number of threads? Is there a flag to specify how many threads to spin up when starting the server? Thanks!

    Elliot Ronaghan
    @ronawho
    Arkouda is backed by Chapel, and by default Chapel uses physical cores only and not hyperthreads (which generally do not help performance of HPC Applications). You can control this value manually with CHPL_RT_NUM_THREADS_PER_LOCALE, but generally speaking using logical threads will not help performance and may in fact hurt. See https://chapel-lang.org/docs/usingchapel/tasks.html#controlling-the-number-of-threads for more info.
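A minimal sketch of setting that variable from a launcher script, in case it helps. It only builds the command and environment and launches nothing. The helper name `server_launch_plan`, the `./arkouda_server` path, and the `-nl` locale flag are assumptions for illustration; only `CHPL_RT_NUM_THREADS_PER_LOCALE` comes from the message above.

```python
import os


def server_launch_plan(num_locales, threads_per_locale=None,
                       server_path="./arkouda_server"):
    """Return (command, environment) for starting a server; nothing is launched.

    server_path and the -nl flag are assumptions about how the server is started.
    """
    env = dict(os.environ)
    if threads_per_locale is not None:
        # Override the default thread count with the variable named above.
        env["CHPL_RT_NUM_THREADS_PER_LOCALE"] = str(threads_per_locale)
    command = [server_path, "-nl", str(num_locales)]
    return command, env


command, env = server_launch_plan(num_locales=1, threads_per_locale=256)
print(" ".join(command))
print("CHPL_RT_NUM_THREADS_PER_LOCALE =", env["CHPL_RT_NUM_THREADS_PER_LOCALE"])
```

Passing the pair to `subprocess.Popen(command, env=env)` would start the server with the override. Per the advice above, comparing timings with and without it is the only way to know whether hyperthreads help on a given machine.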
    Brett Eiffert
    @brettceiffert
    Thanks Elliot! that’s helpful
    Oliver Alvarado Rodriguez
    @alvaradoo
    @mhmerrill Hey Mike! Hope all is well. Where do we find the link with the weekly Arkouda meeting presentations on GitHub? I want to refer a student to one of the first ones you gave, but I am having trouble finding it on GitHub. Thanks!
    Nevermind!
    I found it
    glitch
    @glitch
    Hi All, Mike asked me to pass on the following message:
    ATTENTION: We will NOT be having the Arkouda Weekly Call today.
    He expects things to resume as normal next week. Thanks!
    Louis Jenkins
    @LouisJenkinsCS
    I think I found the reason it was hanging during those tests from before: the walltime is set to a default of 300 seconds. Is there a way to specify a walltime for each test?
    Louis Jenkins
    @LouisJenkinsCS
    Also, interestingly enough, the memory limit does not seem to increase per-locale. Never mind.
    glitch
    @glitch

    I'm fairly certain the reason for hanging/failure was the server process dying and the client ZMQ socket waiting for reconnection. If that's the case the test will never actually complete and should be considered failed as soon as the server process dies.

    However, if you want to try increasing the timeout, you can install the pytest-timeout plugin, which enables the command-line argument --timeout=300, where the value is in seconds. An alternative is to use pytest marks to set a timeout on a specific test: @pytest.mark.timeout(10, "slow", method="thread") (from the pytest docs).
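If installing a plugin is not an option, a standard-library-only wrapper can enforce a walltime on the whole test run so a hung client is reported as a failure instead of blocking forever. This is a rough sketch, not part of Arkouda or pytest: `run_with_walltime` is a hypothetical helper, and the sleeping child process below stands in for a hung test run.

```python
import subprocess
import sys


def run_with_walltime(command, walltime_seconds):
    """Run command; return its exit code, or None if it exceeded the walltime."""
    try:
        completed = subprocess.run(command, timeout=walltime_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return None
    return completed.returncode


# Stand-in for a hung run: a child process that sleeps longer than the walltime.
hung = [sys.executable, "-c", "import time; time.sleep(30)"]
result = run_with_walltime(hung, walltime_seconds=1)
print("timed out" if result is None else f"exit code {result}")
```

For a real run, the command would be something like `[sys.executable, "-m", "pytest", "tests/categorical_test.py", "-v"]`. Note that this only stops the client side; a server that is still running would have to be shut down separately.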

    Michael Merrill
    @mhmerrill
    Does anyone have a subject they would like to discuss at today's Arkouda weekly call?
    Michael Merrill
    @mhmerrill
    Sorry for the late notice but since we don't have a topic... ATTENTION: We will NOT be having the Arkouda Weekly Call today.
    kay doh
    @kaydoh
    @glitch can you let me know if you are online?
    Michael Merrill
    @mhmerrill
    Good Morning! We will NOT be having an Arkouda Weekly Call today.
    We will resume next Tuesday at 1pm!
    dgrichardson
    @dgrichardson
    I just got a chapel 1.25.0 built and (I think) running on my cluster with gasnet (ibv conduit) to try with arkouda. For getting the arkouda part, should I clone master and follow the instructions? Or is there a release someplace that I missed?
    Brad Chamberlain
    @bradcray
    @dgrichardson : I think in practice most Arkouda devs and power users clone master for Arkouda. And I believe it’s currently in a clean state based on recent nightly testing runs.
    dgrichardson
    @dgrichardson
    Thanks. I'll clone master.
    Elliot Ronaghan
    @ronawho
    master should be pretty stable, though I believe the Arkouda devs have recently started tagging releases again -- https://github.com/Bears-R-Us/arkouda/releases
    glitch
    @glitch
    That's right, the master branch is generally stable. The README instructions for getting up and running should be a good start, I know our online docs have lagged a bit behind and need to be fixed/updated in some areas.
    Michael Merrill
    @mhmerrill
    we will tag a new release this week after we add the Parquet support PR
    dgrichardson
    @dgrichardson
    I got an arkouda server running (4 nodes with gasnet and an ibv conduit), and connected a notebook. The online instructions were reasonably straightforward to follow.
    Brad Chamberlain
    @bradcray
    Great, congrats! (and thanks for the update)
    dgrichardson
    @dgrichardson
    :)
    Michael Merrill
    @mhmerrill
    awesome!
    we are about to get on our weekly arkouda call if you want to join for a min
    dgrichardson
    @dgrichardson
    Sure. How do I join?
    Michael Merrill
    @mhmerrill

    Zoom Invite

    Michael Merrill is inviting you to a scheduled Zoom meeting.

    Topic: Arkouda Weekly Zoom Meeting
    Time: recurring meeting Tuesdays @ 1pm ET

    Join Zoom Meeting https://us04web.zoom.us/j/77717000423?pwd=TGlmaUN3L2hScFovTy9NRXNnUTE5dz09

    Meeting ID: 777 1700 0423
    Passcode: kjM3WS

    it’s just a half hour for people to touch base
    dgrichardson
    @dgrichardson
    I didn't see how to get the server to use all (or most of) the available memory. Each node has 750GB. gasnet is saying it will only pin 335 GB per node (it looks like my ulimit is unlimited), and the server is saying it will only use 486 GB of RAM.
    Probably I missed some config options?
    Michael Merrill
    @mhmerrill
    there is a command line flag but it needs to be able to see the RAM from an OS call
    OS is called from a Chapel module to get physical memory limit
    black magic incantation ;-)
    Elliot Ronaghan
    @ronawho

    During your first run you should have gotten a message from gasnet recommending a value for GASNET_PHYSMEM_MAX. This is usually ~2/3 of physical memory or a limit set by the HCA. This limits how much memory can be pinned, but not how much physical memory you can allocate (just how much can be pinned / communicated at any given time).

    The amount the server reports comes from https://chapel-lang.org/docs/modules/standard/Memory/Diagnostics.html#Diagnostics.locale.physicalMemory, which should just be the physical memory of a system. Can you run free -g on the nodes to verify the OS reports ~750G and not ~512G?
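A rough way to sanity-check the numbers in this thread, assuming the ~2/3 rule of thumb applies and no HCA limit is in play: compute two thirds of each candidate memory size and see which lands nearer the 335 GB that gasnet reported. The helper names are hypothetical, and the `os.sysconf` names used to read physical memory are the Linux/POSIX ones, so that part is guarded.

```python
import os

GIB = 1024 ** 3


def two_thirds(physical_gb):
    """Rule-of-thumb pinnable memory: about 2/3 of physical memory."""
    return physical_gb * 2 / 3


def physical_memory_bytes():
    """Physical memory as the OS reports it (Linux/POSIX sysconf names)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


reported_pinned_gb = 335  # value quoted in the thread
for candidate_gb in (750, 512):
    estimate = two_thirds(candidate_gb)
    gap = abs(estimate - reported_pinned_gb)
    print(f"{candidate_gb} GB physical: ~{estimate:.0f} GB at 2/3, off by {gap:.0f} GB")

try:
    print(f"this machine: {physical_memory_bytes() / GIB:.0f} GiB physical")
except (AttributeError, ValueError, OSError):
    print("physical memory is not available through os.sysconf on this platform")
```

Two thirds of 512 is much closer to 335 than two thirds of 750 is, which is why checking `free -g` on the nodes is the natural next step.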

    dgrichardson
    @dgrichardson
    You are right about 512GB. I should have run free myself instead of just reporting what the sysadmin said. :) But it sounds like that is fine, because it does not affect max usable memory.