shoaibkamil
@shoaibkamil:matrix.org
[m]

@popizdeh:

I was just calling vectorize(x) and getting the error. What I find confusing is that your example works, like how is this split factor useful? Documentation says "Split a dimension by the given factor, then vectorize the inner dimension.", but say I split x by a factor of 4, it still needs to know what the width of the image is to do the split, so why doesn't vectorize(x) work like a split with factor 1? It knows the image width! Why does it throw the error?

A split with factor 1 would leave the inner loop with a single iteration, and it doesn't really make sense to vectorize a single iteration. Splitting by a static factor greater than 1, on the other hand, allows vectorization. Note that from Halide's perspective, we generate code that vectorizes by the width of this inner loop; LLVM then lowers that to native instructions of the native vector length. In addition, the code Halide generates doesn't statically contain the image width; rather, the image width is a parameter (contained in the Buffer struct).

shoaibkamil
@shoaibkamil:matrix.org
[m]

If you look at the documentation for split() (https://github.com/halide/Halide/blob/a89041b9563352edfb5e6c8ce4a1de4c490b751f/src/Func.h#L1425) it says

Split a dimension into inner and outer subdimensions with the given names, where the inner dimension iterates from 0 to factor-1

In that context, the documentation for the vectorize() overload that takes an integer factor is saying that vectorize(x, n) is equivalent to split(x, x_outer, x_inner, n).vectorize(x_inner)

shoaibkamil
@shoaibkamil:matrix.org
[m]
Your interpretation of split() is incorrect. The documentation of split() that I linked above says the factor is the number of inner iterations, not outer iterations.
Alex Reinking
@alexreinking:matrix.org
[m]
it's not "split into N pieces", it's "split into pieces of size N"
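To make the "pieces of size N" reading concrete, here is a minimal plain C++ sketch of the loop nest that a split by a given factor corresponds to. It does not use Halide itself; `split_shape` and `visit_order` are hypothetical helper names, and the `x < extent` guard is just one simple way to handle a ragged tail, not a description of the code Halide actually generates.

```cpp
#include <cassert>
#include <vector>

// Shape of the two loops produced by splitting a dimension by `factor`.
struct SplitShape {
    int outer_extent;  // number of blocks (depends on the full extent)
    int inner_extent;  // size of each block (always equal to factor)
};

// The factor fixes the inner extent; the outer extent is derived from it.
SplitShape split_shape(int extent, int factor) {
    return {(extent + factor - 1) / factor, factor};
}

// Walk the split loop nest and record every original index it covers.
std::vector<int> visit_order(int extent, int factor) {
    std::vector<int> visited;
    const SplitShape s = split_shape(extent, factor);
    for (int xo = 0; xo < s.outer_extent; ++xo) {
        for (int xi = 0; xi < s.inner_extent; ++xi) {  // xi runs 0 .. factor-1
            const int x = xo * factor + xi;
            if (x < extent) {  // illustrative guard for the last partial block
                visited.push_back(x);
            }
        }
    }
    return visited;
}
```

With an extent of 10 and a factor of 4, `split_shape` gives 3 blocks of size 4. With a factor of 1 the inner loop has a single iteration, which is why there is nothing to vectorize in that case.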
Zalman Stern
@zvookin
I'm not sure any amount of documentation will fix the fact that people, even those who've been using Halide for a long time, get confused about how split works, but the behavior is that it splits something into blocks of size N. For some reason tile has less of this confusion, at least for me, though maybe it is that it is less often used. Part of the problem is that sometimes one wants to break something into a specific number of pieces, regardless of the size they happen to be, and Halide doesn't really do that. (If one knows the full extent a priori, one can compute a split factor that does the job, but usually these cases are dynamic. Which is a lot of the reason split works the way it does.)
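The parenthetical about computing a split factor from a known extent can be sketched as plain arithmetic. This is only an illustration under the assumption that the full extent is known up front; `factor_for_pieces` and `blocks_for_factor` are hypothetical names, not Halide functions.

```cpp
#include <cassert>

// Choose a block size so that `extent` breaks into at most `pieces` blocks.
// Rounding up guarantees the blocks cover the whole extent.
int factor_for_pieces(int extent, int pieces) {
    assert(extent > 0 && pieces > 0);
    return (extent + pieces - 1) / pieces;
}

// Number of blocks a split by `factor` produces for a given extent.
int blocks_for_factor(int extent, int factor) {
    assert(extent > 0 && factor > 0);
    return (extent + factor - 1) / factor;
}
```

For an extent of 100 and 8 desired pieces this yields a factor of 13, which gives exactly 8 blocks. The count is an upper bound: an extent of 9 with 4 desired pieces yields a factor of 3 and only 3 blocks.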
Jonathan Ragan-Kelley
@jrk
The reason split and tile take the inner not outer sizes is that the size of the block is what determines working set / potential locality; the number of blocks is irrelevant for this. (There are of course contexts, like parallelism, where the number of blocks is relevant, but for compute_at/store_at block sizes, as well as vectorization and unrolling, the size of the block is what you care about.)
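A small arithmetic sketch of that point, in plain C++ rather than Halide: the bytes touched per tile depend only on the tile size, while the number of tiles changes with the image size. The helper names and the 4-byte element size are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Bytes touched by one tile: depends only on the tile dimensions.
std::size_t tile_working_set_bytes(int tile_w, int tile_h, std::size_t bytes_per_elem) {
    return static_cast<std::size_t>(tile_w) * static_cast<std::size_t>(tile_h) * bytes_per_elem;
}

// Number of tiles needed to cover an image: depends on the image size.
int tile_count(int width, int height, int tile_w, int tile_h) {
    const int tiles_x = (width + tile_w - 1) / tile_w;
    const int tiles_y = (height + tile_h - 1) / tile_h;
    return tiles_x * tiles_y;
}
```

A 64x32 tile of 4-byte values is 8192 bytes whether the image is 1920x1080 (1020 tiles) or 3840x2160 (4080 tiles), so only the tile size matters when reasoning about cache fit.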
Ashish Uthama
@ashishUthama
Has anyone used the mex target on windows?
It works wonderfully on Linux. On Windows I managed to get the mex file, but it crashes on load without much useful info in the crash log.
Not sure how to go about debugging this (since there is no source code...)
steven-johnson
@steven-johnson:matrix.org
[m]
The mex/Matlab support is very lightly used -- I wouldn't be surprised if it hasn't ever been tested on Windows, unfortunately.
(So lightly used that we had considered dropping it entirely, in fact)
Ashish Uthama
@ashishUthama
@steven-johnson:matrix.org - thanks for the response. I was initially surprised to see this support (and very happy too :)). It's been really useful on Linux for quick prototyping.
steven-johnson
@steven-johnson:matrix.org
[m]
Thank @dsharletg , he's the one who wrote it :-)
Jonathan Ragan-Kelley
@jrk
The good news is the mex support is pretty simple/lightweight, last I knew. It shouldn’t be too hard to fix as needed.
Do I need the Hexagon SDK to build and run Hexagon code or can I do it with Halide master? I've noticed that I have libhexagon_remote_skel.so and such in the src directory but not under build, and when I modify some code (like setting halide_print to segfault intentionally in the qurt and sim source) I don't see the modification either in the bindump or the side effects during the simulation run. All I really want is to get halide_print to utilize the Hexagon FARF so I can get some debug output from Halide when I run it for Hexagon simulator.
I've even tried renaming some of the halide libs in the Hexagon SDK so they are not accessible, but run_main_on_hexagon just plows ahead like it never even noticed they were gone (the output log even shows it dynamically loading some of these libs)
@aankit-ca @pranavb-ca do y'all know what might be going on?
I also don't see any of the hexagon libs show up in the installation manifest when I cmake --install
steven-johnson
@steven-johnson:matrix.org
[m]
You should not need the Hexagon SDK. libhexagon_remote_skel.so and friends are checked-in in binary form.
@steven-johnson:matrix.org thanks, that is good to know. run_main_on_hexagon is still only in the Hexagon SDK, though, I'd imagine. Any idea why the changes I make in the Halide source aren't appearing in the simulator? It looks like libhexagon_remote_skel.so is the only library I need a copy of, but there are several others (hexagon_sim_remote, libsimqurt.a) that are built in Halide's src/runtime/hexagon_remote dir but whose absence doesn't trigger any linker errors
Another weird thing - when I grep for halide_malloc in binary files, nothing that I've built in Halide matches. But the Halide_TOOLS inside the Hexagon SDK does match. It looks like this symbol isn't being included in my build. My pre-build configuration command is cmake -B ./build/ -S . -G Ninja -DCMAKE_BUILD_TYPE=Release -DTARGET_WEBASSEMBLY=OFF -DWITH_TESTS=OFF -DWITH_TUTORIALS=OFF -DWITH_UTILS=OFF -DWITH_PYTHON_BINDINGS=OFF
Nikola Smiljanić
@popizdeh
Thanks for clearing that up @zvookin and @jrk. I just think the documentation could be much better. When it says "Split a dimension by a given factor" and the function parameter is called factor, I read this as "my dimension is X, so width split by a factor of 4 is width / 4"; I'm not sure how to read it to get the behaviour you're explaining. If, on the other hand, the parameter were called "num_elements_in_inner_dimension", there would be little room for confusion! I'm not proposing this name, but trying to illustrate how a more explicit name (or more text in the docs) could clear up any confusion once and for all.
Alex Reinking
@alexreinking:matrix.org
[m]
I'm at the Cancun airport heading back... will have intermittent internet access on the planes, airport terminals, etc.
Greg Cotten
@gregcotten
Hi all, I cannot find an appropriate place to ask this, but I'm looking for an engineer to port a Metal-based image effect plugin to Halide. High pay, 4-6 weeks estimated, hourly or part-time. It has been really hard to find a candidate for such a niche project! Async/remote work allowed and encouraged. Please email me at greg@videovillage.co if you're interested.
Svenn-Arne Dragly
@dragly
I think I might have asked this before, but I cannot remember: Is there a way to construct a Halide::Runtime::Buffer that is only allocated on device? I have a few AOT compiled functions that will run on device only and it would be nice to not allocate the input and output of these on host.
Bill Pringlemeir
@bpringlemeir
I updated the stackoverflow Halide tag. Hopefully the information is balanced and correct.
Alex Reinking
@alexreinking:matrix.org
[m]

Halide is an open-source domain-specific language which iterates over up to four dimensions to apply computations.

There is no such restriction...

steven-johnson
@steven-johnson:matrix.org
[m]
Early versions of Halide had a restriction of 4 dimensions for buffers passed as inputs or outputs. This restriction has been gone for several years now.
Bill Pringlemeir
@bpringlemeir
I believe you. I am pretty sure it was in a document I read. Perhaps it was a paper (which would refer to older versions). I updated the text. Thanks,
Jonathan Ragan-Kelley
@jrk
If you find the reference, let us know, in case something is out of date — but thanks!
steven-johnson
@steven-johnson:matrix.org
[m]
FYI: The Mac x86 buildbot is going down for maintenance. Back up in a bit.
LJ
@woodknight
Hi, I'm curious why Halide is called Halide? I couldn't find a description of the name's origin anywhere.
Daniel Saier
@saierd
Hi, I'd like to do an "argmax" that returns the maximum + argument and additionally the second highest value + argument. Is there a way to do this in a single loop in Halide?
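As a reference for what a single-loop version has to compute, here is a plain C++ sketch of the per-element update rule that tracks the largest and second-largest values together with their indices. It is not Halide code; `Top2` and `top2` are hypothetical names. In Halide, a reduction carrying several values at once would presumably be written as a Tuple-valued update over an RDom, but that translation is an assumption and is not shown here.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Running state: best and second-best value with their positions.
struct Top2 {
    float best_val;
    float second_val;
    int best_idx;
    int second_idx;
};

// One pass over the data; indices stay -1 if fewer than two elements exist.
Top2 top2(const std::vector<float>& v) {
    const float lowest = std::numeric_limits<float>::lowest();
    Top2 t{lowest, lowest, -1, -1};
    for (int i = 0; i < static_cast<int>(v.size()); ++i) {
        if (v[i] > t.best_val) {
            // New maximum: the old maximum becomes the runner-up.
            t.second_val = t.best_val;
            t.second_idx = t.best_idx;
            t.best_val = v[i];
            t.best_idx = i;
        } else if (v[i] > t.second_val) {
            // Not a new maximum, but better than the current runner-up.
            t.second_val = v[i];
            t.second_idx = i;
        }
    }
    return t;
}
```

For the input {3, 9, 1, 7, 9} this returns 9 at index 1 as the maximum and 9 at index 4 as the runner-up, because ties with the current maximum fall through to the second branch.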
Ashish Uthama
@ashishUthama

I see these in the stmt files:
let t7162 = ((t7077 - input.min.0) + t7170)
So I figured, adding something like this:
would simplify the stmt and hopefully make things faster.

While it did simplify the stmt, the performance was significantly slower. Any thoughts on why just adding that requirement that all buffers have 0 min might impact performance negatively?

steven-johnson
@steven-johnson:matrix.org
[m]
Notice: Windows buildbots will be down for a bit as we upgrade some tooling. Back up later today.
ivangarcia44
@ivangarcia44

Does anyone know if there is a way to map the Func fields in a Halide::Generator class to their corresponding Funcs in a compiled pipeline (as seen in the *schedule.h file dumped by the 2019 auto-scheduler from Andrew Adams)?

Does Halide provide an automated way of pulling this mapping or any other information that could help to derive the map?

For example, the auto-scheduler pipeline has the func1_1 Func local variable, which belongs to the func1 private field in the Halide::Generator class. Another less trivial mapping is with pipeline Func local variables that start with “repeatedge*”. I understand that these are mapped to Func’s fed by a BoundaryConditions expression in the Halide::Generator class, although I am not sure about this. Thanks

ivangarcia44
@ivangarcia44

For the question above, here is the sample code. I am looking for a way to automatically map func1 from the HalideGenerator class to func1_1 in the scheduler pipeline below, and func2 to repeat_edge_1.

class HalideGenerator1 : public Halide::Generator<HalideGenerator1> {
public:
    ...
    void generate() {
        ...
        // func2 wraps func1 in a boundary condition
        func2(x, y) = BoundaryConditions::constant_exterior(func1, 0)(x, y);
        ...
    }
    void schedule() {
        ...
    }
private:
    ...
    Func func1{"func1"};
    Func func2{"func2"};
    ...
};

// Schedule file emitted by the auto-scheduler
inline void apply_schedule_HalideGeneratorName(
    ::Halide::Pipeline pipeline,
    ::Halide::Target target
) {
    using ::Halide::Func;
    ...
    Func func1_1 = pipeline.get_func(28);
    Func repeat_edge_1 = pipeline.get_func(27);
    ...
    func1_1
        .split(...)
        .vectorize(...)
        .compute_root()
        .parallel(...);
    ...
    repeat_edge_1
        .split(...)
        .vectorize(...)
        .compute_root()
        .parallel(...);
    ...
}

Roman Lebedev
@LebedevRI
halide itself is pgo-ignorant, right? i'm failing to find any mention of pgo/-fprofile-generate/-fprofile-instr-generate in the repo. i'm trying to understand what goes wrong in llvm/llvm-project#52845
Alex Reinking
@alexreinking:matrix.org
[m]
That's right... the CMake build has no opinion about those flags
Roman Lebedev
@LebedevRI
thought as much
Roman Lebedev
@LebedevRI
are there any success stories of halide-enabled projects and fuzzing? i'm guessing the C_BACKEND is catch-all escape hatch for this?
Alex Reinking
@alexreinking:matrix.org
[m]
The rungen stuff has a random input generator. You have to provide the output extents though
Not sure why C_BACKEND would be relevant?
You generally don't want to go through the C backend if LLVM has a backend for your target
Roman Lebedev
@LebedevRI
no, i mean, what if i have some code that is currently being fuzzed as part of oss-fuzz project, and now i want to replace some pieces of that code with halide. how do i retain fuzzing coverage?
Alex Reinking
@alexreinking:matrix.org
[m]
I don't know, how does oss-fuzz measure coverage?
Roman Lebedev
@LebedevRI
i mostly mean coverage in a general sense. for plain c code, the ir would then be instrumented by some pass as instructed by clang, but here as far as i can tell there's only asan option
s/option/feature/
well, and tsan
Alex Reinking
@alexreinking:matrix.org
[m]
We could look into enabling that pass via a feature flag, assuming one exists at the LLVM level