steven-johnson
@steven-johnson:matrix.org
[m]
Adding a specific reviewer is fine (if you guess wrong then we will reassign.)
steven-johnson
@steven-johnson:matrix.org
[m]
ATTENTION: buildbots will be going down for a bit; I'm going to be downgrading the buildbot version to see if it affects some flaky zombie-failures that have started recently. This will require wiping out the build history, so you'll have to push a change to your PR to trigger rebuilds. Sorry for the inconvenience.
Dan-Yeh
@Dan-Yeh
Hello everyone. I'm getting this error: "'Halide::RuntimeError' what(): Error: halide_copy_to_device does not support switching interfaces" when trying to run a GPU schedule (a simple .gpu_tile()) on the CUDA backend (Windows) and the Metal backend (OSX). The program runs smoothly on the OpenCL backend (Ubuntu and Windows), though. Does anyone know what is going wrong with the CUDA and Metal backends?
Zalman Stern
@zvookin
Is there code scheduled via OpenCL elsewhere in all cases?
Zalman Stern
@zvookin
@Dan-Yeh Where is the halide_copy_to_device being called from? Is it generated by Halide itself?
Generally, inside a Halide pipeline, Halide will arrange to move GPU buffers between different devices (or device APIs on the same hardware, it's the same path).
I would not say this is a well exercised path
It is also possible some buffer is not initialized properly and the device interface pointer is garbage
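The interface check behind that error message can be pictured with a toy model. The sketch below is not Halide's runtime code: `ToyBuffer`, `DeviceInterface`, `toy_copy_to_device`, and `toy_device_free` are hypothetical stand-ins, and it assumes the runtime rejects a copy when the buffer already holds a device allocation owned by a different interface, as the error text suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT Halide's real types.
struct DeviceInterface {
    std::string name;
};

struct ToyBuffer {
    std::uint64_t device = 0;                           // nonzero once a device allocation exists
    const DeviceInterface *device_interface = nullptr;  // which API owns that allocation
};

enum class CopyResult { Ok, SwitchingInterfaces };

// Toy version of the check: refuse to reuse a live device allocation with a different API.
CopyResult toy_copy_to_device(ToyBuffer &buf, const DeviceInterface *target) {
    assert(target != nullptr);
    if (buf.device != 0 && buf.device_interface != target) {
        return CopyResult::SwitchingInterfaces;
    }
    if (buf.device == 0) {
        buf.device = 1;  // pretend to allocate device memory
        buf.device_interface = target;
    }
    return CopyResult::Ok;
}

// Releasing the allocation clears both fields, after which another API may claim the buffer.
void toy_device_free(ToyBuffer &buf) {
    buf.device = 0;
    buf.device_interface = nullptr;
}
```

In this model, handing one buffer first to an OpenCL-like interface and then to a CUDA-like one fails until the first allocation is released. A garbage interface pointer would trip the same comparison.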
Mohsin Mehmood
@mohsinmahmood12
Hi everyone! My name is Mohsin Mahmood. I'm a computer science student comfortable with Python, C, and C++. Could you tell me where I should start, and whether I am too late? Any help would be appreciated.
rohan123
@rohan123:matrix.org
[m]
I'm having many issues installing Halide on Windows 10. Can anyone please help me out?
Andrew Adams
@abadams
@mohsinmahmood12 if you're talking about GSoC, then you're not too late and you should start by emailing a resume and what project you're interested in and what relevant background you have to halide-gsoc2021-mentors@mit.edu
FE
@FyisFe
image.png
Excuse me, sirs. When I used CMake to build Halide, this error occurred.
image.png
After I set it to YES here, the same error also occurred
Can anyone kindly help me out? Thanks so much
aavbsouza
@aavbsouza
Hello @FyisFe, have you tried passing the options -DTARGET_WEBASSEMBLY=OFF or -DHalide_SHARED_LLVM=YES? It is usually simpler to compile LLVM as per the instructions in the README of the GitHub repository.
FE
@FyisFe
It works!
Thanks so much!
Alex Reinking
@alexreinking
@FyisFe -- you should never set option()s by editing the CMakeLists.txt source files. They are meant to be set at the command line as @aavbsouza described. That applies to all CMake projects.
The reason it didn't work by editing the sources is that the NO value had already been written to the cache (CMakeCache.txt in the build folder), so the cached value was preferred.
The intended way to edit the cache is through the cmake command line, the ccmake ncurses interface, or the cmake-gui GUI.
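The cache behaviour described above can be mimicked with a small toy model. This is not CMake code: `ToyCache`, `toy_option`, and `toy_define` are hypothetical names, with a `std::map` standing in for CMakeCache.txt. It only assumes what the message says, namely that a default is ignored once a value is cached, while a command-line definition overwrites the cached value.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: a std::map standing in for CMakeCache.txt.
using ToyCache = std::map<std::string, std::string>;

// Models option(): the default is stored only when the cache has no entry yet.
void toy_option(ToyCache &cache, const std::string &name, const std::string &default_value) {
    assert(!name.empty());
    cache.emplace(name, default_value);  // emplace leaves an existing entry untouched
}

// Models -DNAME=VALUE on the command line: the cache entry is overwritten.
void toy_define(ToyCache &cache, const std::string &name, const std::string &value) {
    assert(!name.empty());
    cache[name] = value;
}
```

Calling `toy_option` a second time with a different default changes nothing, which mirrors why editing the source file had no effect on an existing build folder.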
FE
@FyisFe
Got it. Thanks so much. It's the first time I've worked with a CMake project, so please forgive me if I asked a very stupid question hahaha
Alex Reinking
@alexreinking:matrix.org
[m]
No worries! Just thought telling you that would be generally useful in the future 🙂
Lev Yudalevich
@lyudalev
Dear All,
I ran into a strange problem. Below is a minimal example, not real code; it basically just copies a buffer. What I'm observing is that when the buffer is big enough (1280x1024), it runs ~10 (yes, ten) times slower than it does with a smaller buffer (e.g., 1280x512). The specific schedule doesn't seem to matter: it behaves the same way with different schedules as well as with the auto scheduler. Am I missing something, or is this a case of cache misses? How can I work around it, please?
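#include "Halide.h"

using namespace Halide;

// Minimal reproduction: the pipeline only copies input to result.
class TestCase : public Generator <TestCase>
{
public:
    Input<Buffer<float>>  input  {"input", 2};
    Output<Buffer<float>> result {"result", 2};

    void generate ()
    {
        Var x, y;

        result (x, y) = input (x, y);

        if (auto_schedule)
        {
            // Size estimates for the auto scheduler.
            input.set_estimates ({{0, 1280}, {0, 1024}});
            result.set_estimates ({{0, 1280}, {0, 1024}});
        }
        else
        {
            // Outer tiling: 1280x512 tiles, fused into one loop and run in parallel.
            Var xo, yo, xi, yi, tidx;
            result
                .tile (x, y, xo, yo, xi, yi, 1280, 512)
                .fuse (xo, yo, tidx)
                .parallel (tidx);

            // Inner tiling: 128x4 sub-tiles, vectorized along x and unrolled along y.
            Var ixo, iyo, ixi, iyi;
            result
                .tile (xi, yi, ixo, iyo, ixi, iyi, 128, 4)
                .vectorize (ixi, 32)
                .unroll (iyi);
        }
    }
};

// The registration name "test_case" is an assumption; the original message does not show it.
HALIDE_REGISTER_GENERATOR (TestCase, test_case)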
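One way to sanity-check the cache-miss hypothesis is to work out how much memory each case touches. The sketch below assumes a 4-byte `float` and counts one read of the input plus one write of the output; `copy_working_set_bytes` and `kMiB` are names made up for this illustration. Whether these sizes explain the slowdown depends on the machine's cache sizes, which the message does not state.

```cpp
#include <cstddef>

// Bytes touched by a straight copy of a width x height float image:
// one read of the input plus one write of the output.
constexpr std::size_t copy_working_set_bytes(std::size_t width, std::size_t height) {
    return 2 * width * height * sizeof(float);
}

constexpr std::size_t kMiB = 1024 * 1024;

// Assumes sizeof(float) == 4, which holds on common platforms.
static_assert(sizeof(float) == 4, "this arithmetic assumes a 4-byte float");

// 1280x512:  2 * 1280 * 512  * 4 bytes =  5 MiB
static_assert(copy_working_set_bytes(1280, 512) == 5 * kMiB, "small case touches 5 MiB");

// 1280x1024: 2 * 1280 * 1024 * 4 bytes = 10 MiB
static_assert(copy_working_set_bytes(1280, 1024) == 10 * kMiB, "large case touches 10 MiB");
```

If the last-level cache sits somewhere between 5 MiB and 10 MiB, the larger case would spill to main memory while the smaller one might not, which would be consistent with a sharp drop rather than a gradual one.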
aavbsouza
@aavbsouza
Hello @lyudalev, does this performance issue persist with the default schedule?
Lev Yudalevich
@lyudalev
@aavbsouza Yes, it does.
ROHAN ZANJAL
@rohan123:matrix.org
[m]
Can anyone help me solve this error?
Dan-Yeh
@Dan-Yeh
@zvookin Thanks for the help! I found the bug, which was caused by passing the same Halide buffer pointer to different device interfaces.
shoaibkamil
@shoaibkamil:matrix.org
[m]
I believe you add -DWITH_WEBASSEMBLY=OFF but someone can confirm <— edited to add: should actually be -DTARGET_WEBASSEMBLY=OFF
Heh, Alex did already :)
Alex Reinking
@alexreinking:matrix.org
[m]
-DTARGET_WEBASSEMBLY=OFF
WITH is the definition in the code, TARGET is the CMake option. I didn't make that decision ;)
Lev Yudalevich
@lyudalev
Sorry to bother you, but I'm desperately looking for help. Anyone, please?
☝ April 4, 2021 4:55 PM
kani4
@kani4
Hi, when I try to use Halide::Runtime::Buffer on the device, it gives an error. I have a host, an RPC IDL, and a DSP. When I declare a runtime buffer in the DSP file, it fails on the device. Please let me know how I can proceed with this.
ROHAN ZANJAL
@rohan123:matrix.org
[m]
But what do I have to change in the codebase?
Dan-Yeh
@Dan-Yeh
Hi guys! I'm trying to write a generator that can produce GPU schedules for different targets (e.g. OpenCL, CUDA). Can we specify the target inside the generator class, as with compile_jit, instead of explicitly specifying the target on the command line?
steven-johnson
@steven-johnson:matrix.org
[m]
Has something about CUDA codegen (or tests) changed recently? Several of the buildbots seem to be timing out on CUDA tests, and it doesn't seem platform-specific, so I wonder if we've injected a bug.
correctness_gpu_reuse_shared_memory seems to be the last test started
steven-johnson
@steven-johnson:matrix.org
[m]
yeah, it hangs on my local linux box too (or at least takes > 30 secs, which is close enough)
I'll try to bisect
steven-johnson
@steven-johnson:matrix.org
[m]
...ack, it seems specific to LLVM13 (doesn't repeat under LLVM12). I'll get out the bisect-in-LLVM script.
steven-johnson
@steven-johnson:matrix.org
[m]
bisect claims injection point is a6d2a8d6f59a974752666305c4baaabebee41b95
steven-johnson
@steven-johnson:matrix.org
[m]
...aaand it looks like it's fixed in 9ef6aa020b6fb9c7672919985b0ed2a6953a3596 (which is a followup to the injection). I'll trigger LLVM13 rebuilds.
Patrick M Young
@youngpm
Is it possible to use the autoscheduler when one uses define_extern? I got a "Condition failed: loop: Could not compute plausible site for unscheduled Func" error when I gave it a try.
Andrew Adams
@abadams
Sadly no. Nobody has figured out how to support that
Zalman Stern
@zvookin
Probably a good use for programmer-provided autoscheduler guidance
steven-johnson
@steven-johnson:matrix.org
[m]
Note: some of the buildbots are down while network connectivity issues are being investigated.