Alex Reinking
@alexreinking:matrix.org
Totally! Hopefully we'll be able to get the release latency (after LLVM) even tighter for the next releases :)
Derek Gerstmann
@derek-gerstmann
@alexreinking:matrix.org Hiya! I'm looking at backporting PR #6405 from master to create a v13.0.1 release. Should I create a "backports/13.x" branch and merge things in there, and then push the results into "releases/13.x"? Just trying to match what Andrew and I are seeing in the repo to follow conventions. Any suggestions?
Alex Reinking
@alexreinking:matrix.org
I use backports/N.x for staging changes to release/N.x. When it's ready, I open a PR with release/N.x as the target branch. There is CI set up for this scenario. Be sure to include a commit that bumps the version to 13.0.1.
release/N.x is (or ought to be) protected (like master), so you can't push to it
Derek Gerstmann
@derek-gerstmann
Ahh ... okay cool! Does the "release/N.x" branch itself get created automatically?
Alex Reinking
@alexreinking:matrix.org
No. When creating a new major release, we fork it off of master.
Derek Gerstmann
@derek-gerstmann
Makes sense! Okay, I'll work on getting things merged! Thanks!
Alex Reinking
@alexreinking:matrix.org
Also, I prefer to not squash commits from backports/N.x into release/N.x. I think the cherry-picking history is valuable (as are any separate/additional patches necessary to correctly backport) as is keeping the version number bump separate.
No problem! Happy to share the release responsibility :)
Derek Gerstmann
@derek-gerstmann
Cool. Yeah, I'm the same. I rarely squash commits unless there's a really good reason to.
Sweet! Happy to help! :)
Alex Reinking
@alexreinking:matrix.org
I bring it up because the repository default PR-merge mode is to squash
Derek Gerstmann
@derek-gerstmann
Oooh, good to know! I'll turn it off for the release PR. Thanks for the heads up!
Alex Reinking
@alexreinking:matrix.org
Of course!
Ashish Uthama
@ashishUthama
@alexreinking:matrix.org - is it reasonable to ask for the LICENSE file to be included in the downloads?
Alex Reinking
@alexreinking:matrix.org
I think it is... I would add it to the share/doc/Halide folder (at least on Linux), along with the other READMEs.
Ashish Uthama
@ashishUthama
I'll create an issue and try to make the change.
Alex Reinking
@alexreinking:matrix.org
Sure... I can review, but I'm traveling through 12/5, and pushing for the 11/19 PLDI deadline, so my bandwidth is limited
Ashish Uthama
@ashishUthama
no hurry!
Nikola Smiljanić
@popizdeh
Can someone please explain this message: "Loop over output.s0.x has extent output.extent.0. Can only vectorize loops over a constant extent > 1." Let's say we're dealing with floats and SSE. I don't get why the loop over x can't simply loop over 1/4 of the extent and process 4 float values at a time (let's ignore the case where the extent is not divisible by 4). Do I need to split x into constant-size chunks in order to get vectorization working?
Zalman Stern
@zvookin
How are you calling vectorize? Typically one does f.vectorize(x, 4) which provides the split as part of a single directive. If one writes f.vectorize(x) it means the extent must be constant and known and the vectorization amount is the complete extent.
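To make the "constant extent" requirement concrete, here is a minimal plain C++ sketch (no Halide involved) of the loop shape that a factor-of-4 split leads to. The names `kLanes` and `scale_in_blocks` are hypothetical, and the scalar tail loop is just one assumed way of handling extents that are not a multiple of 4; it is not a description of what Halide itself emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block width: known at compile time, unlike the data extent.
constexpr std::size_t kLanes = 4;

// Scales `data` in place, kLanes elements per outer step.
void scale_in_blocks(std::vector<float>& data, float gain) {
    const std::size_t extent = data.size();           // only known at run time
    const std::size_t full_blocks = extent / kLanes;  // outer loop extent
    assert(full_blocks * kLanes <= extent);

    for (std::size_t xo = 0; xo < full_blocks; ++xo) {
        // Inner loop with a constant extent: a candidate for one 4-wide op.
        for (std::size_t xi = 0; xi < kLanes; ++xi) {
            data[xo * kLanes + xi] *= gain;
        }
    }
    // Scalar tail for the leftover elements (assumed tail handling).
    for (std::size_t x = full_blocks * kLanes; x < extent; ++x) {
        data[x] *= gain;
    }
}
```

The outer loop count depends on the runtime extent, but the inner loop always runs exactly `kLanes` times, which is the part that can be turned into vector operations.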
steven-johnson
@steven-johnson:matrix.org
mac-buildbot-1 is going DOWN for a long-overdue OS upgrade (since I am actually physically in front of it). Back up soon-ish I hope.
Derek Gerstmann
@derek-gerstmann
FYI -- Halide v13.0.1 has been released: https://github.com/halide/Halide/releases/tag/v13.0.1
Vlad Levenfeld
@vladl-innopeaktech_gitlab
To call some AOT generated code from a Hexagon binary (running on the simulation), what do I need to link my Hexagon binary to? I am getting some error messages about undefined symbols (halide_string_to_string, halide_msan_annotate_memory_is_initialized, a couple of others) but libHalide.so and libHalide.a are both x86_64 libs
(I am getting the error messages when I run the simulation)
Or is there perhaps a way to statically link those missing functions when I run the AOT generator?
shoaibkamil
@shoaibkamil:matrix.org
It sounds like you're missing a runtime perhaps? What was the target for the AOT generated code?
Vlad Levenfeld
@vladl-innopeaktech_gitlab
hexagon-32-qurt-hvx_128-hvx_v66-no_asserts-no_bounds_query-enable_llvm_loop_opt
I can see some DSP libs in Hexagon SDK's copy of Halide, but I don't seem to be generating these libs when I am building Halide from source
Vlad Levenfeld
@vladl-innopeaktech_gitlab
this might be relevant
$ objdump -t build/host/ResizeNearestNeighbor.a | grep halide_
00000000 l    df *ABS*  00000000 halide_buffer_t.cpp
00000000         *UND*  00000000 halide_error
00000000         *UND*  00000000 halide_msan_annotate_memory_is_initialized
00000000  w    F .text.halide_qurt_hvx_lock     000000b0 halide_qurt_hvx_lock
00000000  w    F .text.halide_qurt_hvx_unlock   000000ac halide_qurt_hvx_unlock
00000000  w    F .text.halide_qurt_hvx_unlock_as_destructor     00000008 halide_qurt_hvx_unlock_as_destructor
00000000         *UND*  00000000 halide_string_to_string
00000000  w    F .text.halide_vtcm_free 00000008 halide_vtcm_free
00000000  w    F .text.halide_vtcm_malloc       0000000c halide_vtcm_malloc
That ResizeNearestNeighbor.a is the AOT generation result. It already has references to these functions at this stage.
So either I need to link them statically at AOT-generation-time or I need to dynamically link some runtime lib... and hope that it will work on the hexagon simulator
Vlad Levenfeld
@vladl-innopeaktech_gitlab
Either that, or maybe I am making a mistake during generation and these functions shouldn't be getting referenced at all in the output?
Vlad Levenfeld
@vladl-innopeaktech_gitlab
Ok, looks like I was missing something during AOT generation. I just needed to add the dependency ResizeNearestNeighbor.runtime to the same target that's pulling in the ResizeNearestNeighbors target in my CMakeLists.
Now I am just missing definitions for halide_error and halide_print
Vlad Levenfeld
@vladl-innopeaktech_gitlab
Found them in libhalide_hexagon_remote_skel.so in Qualcomm's SDK fwiw
Problem solved
Ashish Uthama
@ashishUthama
@alexreinking:matrix.org - also wondering if the packages should include the tools/ folder (esp GenGen.cpp )
whoops.. nvm. It's there!
didn't look under share/Halide/tools/
Sambhav Saxena
@sambhavsaxena
Hi everyone, I'm a second year CS student from a second tier university. I'm eager to contribute to this organization but can't get the slightest idea of where to get started. Can I get some help regarding the same?
vincenzoml
@vincenzoml
Hi there, I have a simple question: is there some library of halide-implemented imaging functions? As a start, connected components in 2d and 3d?
vincenzoml
@vincenzoml
I'm considering using it as a backend for voxlogica (https://github.com/vincenzoml/VoxLogicA) in the future, in place of OpenCL / SimpleITK
But I would definitely love to avoid re-implementing all kernels from scratch
tuscasp
@tuscasp

Hi there. I would like to create a basic inheritance structure based on Halide::Generator, so as to avoid duplicated code.

The idea is that the base generator class should have a (virtual) function that is to be overridden by derived classes. Moreover, each derived class should have a specific input parameter, not available in the base class.

In normal C++ this is quite straightforward. My current Halide implementation is all in a single file, having the following classes:

class Base : public Halide::Generator<Base> {
public:
    Input<Buffer<float>> input{"input", 2};

    Output<Buffer<float>> output{"brighter", 2};

    Var x, y;

    virtual Func process(Func input);

    virtual void generate() {
        output = process(input);
        output.vectorize(x, 16).parallel(y);
    }
};
class DerivedGain : public Base {
    public:
    Input<float> gain{"gain"};

    Func process (Func input) override{
        Func result("result");
        result(x,y) = input(x,y) * gain;
        return result;
    }
};
class DerivedOffset : public Base{
    public:
    Input<float> offset{"offset"};

    Func process (Func input) override{
        Func result("result");
        result(x,y) = input(x,y) + offset;
        return result;
    }
};

Finally, I only register the two derived classes, since I have no direct interest in the Base one:

HALIDE_REGISTER_GENERATOR(DerivedGain, derived_gain)
HALIDE_REGISTER_GENERATOR(DerivedOffset, derived_offset)

But the build fails with a linker error suggesting that class Base is being instantiated (which I do not need to happen):

in function Base::Base():
my_generators.cpp:(.text._ZN4BaseC2Ev[_ZN4BaseC5Ev]+0x2f): undefined reference to vtable for Base

If instead of using a virtual function I implement it in the Base class like so:

class Base : public Halide::Generator<Base> {
public:
    Input<Buffer<float>> input{"input", 2};

    Output<Buffer<float>> output{"brighter", 2};

    Var x, y;

    // Func process(Func input);
    Func process (Func input){
        Func result("result");
        result(x,y) = input(x,y);
        return result;
    }

    virtual void generate() {
        output = process(input);
        output.vectorize(x, 16).parallel(y);
    }
};

Then everything compiles, but the object and header files with the generated code have the wrong function signatures (noticeable as there are missing gain/offset parameters):

derived_gain.h:

int derived_gain(struct halide_buffer_t *_input_buffer, struct halide_buffer_t *_result_buffer);

derived_offset.h:

int derived_offset(struct halide_buffer_t *_input_buffer, struct halide_buffer_t *_result_buffer);

Therefore, I would like to know which mistake I am introducing in the class definitions and how to solve it.
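
A note on the "undefined reference to vtable for Base" error: in plain C++, this typically appears when a class declares a virtual function that is neither defined nor marked pure. Declaring the function pure virtual with `= 0` removes the need for a base-class definition. The sketch below uses plain C++ stand-ins with no Halide types; the constructor arguments and `float` signatures are assumptions made for illustration. It addresses only the link error, not the missing gain/offset parameters in the generated headers.

```cpp
// Plain C++ stand-in for the generator hierarchy; no Halide types involved.
class Base {
public:
    virtual ~Base() = default;

    // "= 0" makes process() pure virtual: Base needs no definition of it,
    // so the linker has no missing symbol to report.
    virtual float process(float input) const = 0;

    // Shared driver code stays in the base class and calls the override.
    float generate(float input) const { return process(input); }
};

class DerivedGain : public Base {
public:
    explicit DerivedGain(float gain) : gain_(gain) {}
    float process(float input) const override { return input * gain_; }

private:
    float gain_;
};

class DerivedOffset : public Base {
public:
    explicit DerivedOffset(float offset) : offset_(offset) {}
    float process(float input) const override { return input + offset_; }

private:
    float offset_;
};
```

With this shape, `DerivedGain(2.0f).generate(3.0f)` runs the shared `generate()` and dispatches to the derived `process()`, while `Base` itself can no longer be instantiated.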

shoaibkamil
@shoaibkamil:matrix.org

@popizdeh:

I was just calling vectorize(x) and getting the error. What I find confusing is that your example works, like how is this split factor useful? Documentation says "Split a dimension by the given factor, then vectorize the inner dimension.", but say I split x by a factor of 4, it still needs to know what the width of the image is to do the split, so why doesn't vectorize(x) work like a split with factor 1? It knows the image width! Why does it throw the error?

A split with factor 1 would result in the inner loop being a single iteration; it doesn't really make sense to vectorize a single iteration. On the other hand, splitting with a static factor > 1 allows vectorization. Note that from the perspective of Halide, we generate code that vectorizes by the width of this inner loop; LLVM then lowers that to native instructions of the native vector width. In addition, the code Halide generates doesn't statically contain the image width; rather, the image width is a parameter (contained in the Buffer struct).

shoaibkamil
@shoaibkamil:matrix.org

If you look at the documentation for split() (https://github.com/halide/Halide/blob/a89041b9563352edfb5e6c8ce4a1de4c490b751f/src/Func.h#L1425) it says

Split a dimension into inner and outer subdimensions with the given names, where the inner dimension iterates from 0 to factor-1

The documentation for the vectorize() overload with an integer factor, in that context, is saying that vectorize(x, n) is equivalent to split(x, x_outer, x_inner, n).vectorize(x_inner)
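
The loop nest that split() describes can be modeled in plain C++ as below. This is only a sketch of the index arithmetic: `split_order` is a hypothetical helper, and the `x < extent` guard is an assumed way of handling a tail, not a statement about which tail strategy Halide picks.

```cpp
#include <cassert>
#include <vector>

// Models split(x, x_outer, x_inner, factor) as a loop nest and returns the
// x values in the order they are visited. `factor` is the extent of the
// INNER loop (the block size), not the number of blocks.
std::vector<int> split_order(int extent, int factor) {
    assert(extent >= 0 && factor > 0);
    const int outer_extent = (extent + factor - 1) / factor;  // ceil division
    std::vector<int> visited;
    for (int x_outer = 0; x_outer < outer_extent; ++x_outer) {
        for (int x_inner = 0; x_inner < factor; ++x_inner) {  // 0 .. factor-1
            const int x = x_outer * factor + x_inner;
            if (x < extent) {  // assumed guard for a partial last block
                visited.push_back(x);
            }
        }
    }
    return visited;
}
```

Calling `split_order(10, 4)` walks x from 0 to 9 using three outer iterations, the last of which is only partly filled.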

shoaibkamil
@shoaibkamil:matrix.org
Your interpretation of split() is incorrect. The documentation of split() that I linked above says the factor is the number of inner iterations, not outer iterations.
Alex Reinking
@alexreinking:matrix.org
it's not "split into N pieces", it's "split into pieces of size N"
Zalman Stern
@zvookin
I'm not sure any amount of documentation will fix the fact that people, even those who've been using Halide for a long time, get confused about how split works, but the behavior is that it splits something into blocks of size N. For some reason tile has less of this confusion, at least for me, though maybe it is that it is less often used. Part of the problem is that sometimes one wants to break something into a specific number of pieces, regardless of the size they happen to be, and Halide doesn't really do that. (If one knows the full extent a priori, one can compute a split factor that does the job, but usually these cases are dynamic. Which is a lot of the reason split works the way it does.)
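
For the case where the full extent is known up front, the split factor for a desired number of pieces can be computed with a ceiling division. The helpers below are a hypothetical plain C++ sketch of that arithmetic, not Halide API.

```cpp
#include <cassert>

// Block size that covers `extent` in at most `pieces` blocks.
int factor_for_pieces(int extent, int pieces) {
    assert(extent > 0 && pieces > 0);
    return (extent + pieces - 1) / pieces;  // ceil(extent / pieces)
}

// Number of blocks a given factor actually produces over `extent`.
int block_count(int extent, int factor) {
    assert(extent > 0 && factor > 0);
    return (extent + factor - 1) / factor;  // ceil(extent / factor)
}
```

For an extent of 100 and 8 pieces this gives a factor of 13 and exactly 8 blocks. The count is an upper bound rather than a guarantee: an extent of 10 with 6 requested pieces gives a factor of 2 and only 5 blocks.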