steven-johnson
@steven-johnson:matrix.org
[m]

If a Halide Generator class is compiled with the "auto_schedule" argument set to "false", can the Halide engine still apply any default parallelization/scheduling technique (e.g., vectorization, parallelization, tiling, loop reversal)? Or is it guaranteed that no scheduling primitives will be used?

No scheduling primitives will be used unless you specify them.


Even with target=x86-64-linux-disable_llvm_loop_opt, I notice xmm* registers being used in the output assembly file (fileName.s). Does that mean there is auto-vectorization going on somewhere in the generation pipeline?

No: Halide assumes that SSE2 is present on all x86-64 architectures, and uses the XMM registers for scalar floating-point operations.

Soufiane KHIAT
@soufiane.khiat:matrix.org
[m]
Is anyone aware of a 'simple' way to do autodiff in 2D?
ref/details:
https://github.com/halide/Halide/discussions/6347
aalan
@asouza_:matrix.org
[m]
Hello Soufiane KHIAT, if I am not mistaken, you could already do what is being proposed with the region parameter and an auxiliary array (a vjp).
Jonathan Ragan-Kelley
@jrk
I think @BachiLi is the right person for the autodiff question above!
Vlad Levenfeld
@vladl-innopeaktech_gitlab
Do I need the Hexagon SDK in order to generate Hexagon code or can I do this with Halide master branch? I am trying to add some instrumentation to my generated code (maybe there's an easier way to do this)
Vlad Levenfeld
@vladl-innopeaktech_gitlab
And, if I can do it from the master branch, how do I build the tools dir (GenGen.cpp and such)?
Vlad Levenfeld
@vladl-innopeaktech_gitlab
Never mind, I was having some linking issues and just needed a sanity check. Sorry for the noise.
Soufiane KHIAT
@soufiane.khiat:matrix.org
[m]
Another discussion: how could we implement a Halide version of "InsertKey", similar to std::unordered_map or std::set?
Idea: storing unique keys in a contiguous array.
https://github.com/halide/Halide/discussions/6373
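
For reference, the behavior being asked for can be written as a short sequential sketch in plain Python. This is not Halide code and not a proposed implementation; `insert_key`, `keys`, and `index_of` are hypothetical names used only to pin down the expected result (unique keys stored contiguously, in first-seen order).

```python
def insert_key(keys, index_of, key):
    """Insert `key` if it is new and return its position in `keys`.

    `keys` is the contiguous array of unique keys; `index_of` is a lookup
    table from key to position (both hypothetical, for illustration).
    """
    position = index_of.get(key)
    if position is None:
        # New key: append it so storage stays contiguous.
        position = len(keys)
        keys.append(key)
        index_of[key] = position
    return position


keys, index_of = [], {}
positions = [insert_key(keys, index_of, k) for k in [7, 3, 7, 9, 3]]
print(keys)       # unique keys in first-seen order
print(positions)  # position of each input key in `keys`
```

Each insertion depends on the ones before it, which is the part that is awkward to express as a data-parallel pipeline.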
Alex Reinking
@alexreinking:matrix.org
[m]
@dsharletg: Are there any changes to the Hexagon backend that should make the release notes? Trying to push Halide 13 out the door.
Soufiane KHIAT
@soufiane.khiat:matrix.org
[m]

I have this Issue:

Condition failed: in.is_bounded()
Unbounded producer->consumer relationship: Vertices-> FaceNormal

When I try to read an array with a buffer of indices.

ref:
https://github.com/halide/Halide/issues/4108#issuecomment-956546487

Alex Reinking
@alexreinking:matrix.org
[m]
aalan
@asouza_:matrix.org
[m]
Congrats to all the team
Hello Alex Reinking, the link about Photoshop on the web is very interesting. Do you have any more information about the use of Halide in web projects or in Photoshop? Thanks
Alex Reinking
@alexreinking:matrix.org
[m]
I don't work for Adobe, so I'm sorry to say I do not
I know that @steven-johnson and @shoaibkamil have been involved in the WASM backend
shoaibkamil
@shoaibkamil:matrix.org
[m]
The backend was all Steven :). I don’t have more details to share other than those shared on the blog post linked from the release notes.
steven-johnson
@steven-johnson:matrix.org
[m]
well, give a lot of credit to the WebAssembly team for the LLVM backend we use... :-)
steven-johnson
@steven-johnson:matrix.org
[m]
Looks like there's an LLVM top-of-tree failure on some of the bots -- I'll get to it after lunch
Svenn-Arne Dragly
@dragly

I am working on "serializing" a Python object with Expr members to a Halide Func. In the process, I end up with a function that has a large number of explicit definitions in one dimension. Unfortunately, I am not able to get those calculated efficiently: once for each value, while sharing potential pre-calculated values. In particular, this code:

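# Assumed setup (not in the original message)
from halide import *
import numpy as np

row, col = Var("row"), Var("col")

f = Func("f")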
f[row, col] = 0.0
f[row, 0] = 1.0 + sqrt(row*row)
f[row, 1] = 2.0 + sqrt(row*row)
f[row, 2] = 3.0 + sqrt(row*row)
f[row, 3] = 4.0 + sqrt(row*row)

g = Func("g")
g[row, col] = f[row, col] + 42.0

g.compile_to_lowered_stmt("out.txt", [], StmtOutputFormat.Text)
print(np.asanyarray(g.realize(2, 4)))

Leads to the following generated code:

  for (g.s0.col, g.min.1, g.extent.1) {
  ...
   for (g.s0.row, g.min.0, g.extent.0) {
    allocate f[float32 * 1 * (max(t6, 3) + 1)]
    produce f {
     f[t7] = 0.000000f
     f[t8] = (float32)sqrt_f32(float32((g.s0.row*g.s0.row))) + 1.000000f
     f[t9] = (float32)sqrt_f32(float32((g.s0.row*g.s0.row))) + 2.000000f
     f[t10] = (float32)sqrt_f32(float32((g.s0.row*g.s0.row))) + 3.000000f
     f[t11] = (float32)sqrt_f32(float32((g.s0.row*g.s0.row))) + 4.000000f
    }
    consume f {
     g[g.s0.row + t12] = f[t7] + 42.000000f
    }

Unfortunately, Halide does not notice that only one value of f is needed, and calculates all of f for each g. I guess this is expected.

Calling f.compute_root() helps reduce the number of calculations, but results in code with four loops over row instead. This is problematic in my actual use case, because it no longer automatically shares values that can be pre-calculated (such as the sqrt above).

Is there a way to get Halide to calculate f for each explicitly set col in one loop over row?
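
To make the cost difference concrete, here is a toy model in plain Python (not Halide) that counts sqrt evaluations for the three loop structures discussed above. It follows the lowered statement as written, before any later compiler optimization; `count_sqrt_calls` and the strategy names are hypothetical, chosen only for this illustration.

```python
import math


def count_sqrt_calls(rows, cols, strategy):
    """Return (number of sqrt calls, g values) for a toy model of f and g."""
    calls = 0

    def counted_sqrt(value):
        nonlocal calls
        calls += 1
        return math.sqrt(value)

    g = {}
    if strategy == "inline":
        # Mirrors the lowered loop nest: every g point recomputes all of f.
        for col in range(cols):
            for row in range(rows):
                f = {c: 0.0 for c in range(max(col, 3) + 1)}
                for c in range(4):
                    f[c] = (c + 1.0) + counted_sqrt(row * row)
                g[row, col] = f[col] + 42.0
        return calls, g

    f = {(r, c): 0.0 for r in range(rows) for c in range(max(cols, 4))}
    if strategy == "compute_root":
        # One separate loop over row per update definition.
        for c in range(4):
            for row in range(rows):
                f[row, c] = (c + 1.0) + counted_sqrt(row * row)
    elif strategy == "shared":
        # The structure asked for: one loop over row, sqrt computed once.
        for row in range(rows):
            root = counted_sqrt(row * row)
            for c in range(4):
                f[row, c] = (c + 1.0) + root
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    for col in range(cols):
        for row in range(rows):
            g[row, col] = f[row, col] + 42.0
    return calls, g


for name in ("inline", "compute_root", "shared"):
    print(name, count_sqrt_calls(2, 4, name)[0])
```

For the 2x4 realization in the question, the model gives 32, 8, and 2 sqrt calls respectively, with identical g values in all three cases.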

Ashish Uthama
@ashishUthama

Upgrading from Halide 12 to Halide 14 (tip), I am running into a lot of:

Unhandled exception: Error: Cannot split a loop variable resulting from a split using PredicateLoads or PredicateStores.

Right now, it looks like something related to tile() with the TailStrategy omitted (i.e., the default Auto). Does this ring a bell? (Will dig more in a bit.)

Did some defaults change?
@dsharletg - likely related to halide/Halide#6020? (I'll explore by adding explicit tail strategies to the code.)
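
As background on what an explicit tail strategy changes, the sketch below models in plain Python (not Halide) which indices a split loop visits when the extent is not a multiple of the split factor. It covers only RoundUp, GuardWithIf, and ShiftInwards, assumes the loop starts at 0, and reflects the documented behavior rather than Halide's implementation; `split_indices` is a hypothetical helper.

```python
def split_indices(extent, factor, tail):
    """List the x values visited when a loop of `extent` is split by `factor`."""
    if tail not in ("RoundUp", "GuardWithIf", "ShiftInwards"):
        raise ValueError(f"unsupported tail strategy in this sketch: {tail}")
    if tail == "ShiftInwards" and extent < factor:
        raise ValueError("ShiftInwards needs extent >= factor")

    outer_extent = (extent + factor - 1) // factor  # ceil(extent / factor)
    visited = []
    for xo in range(outer_extent):
        base = xo * factor
        if tail == "ShiftInwards":
            # Slide the last tile back so it ends exactly at the extent.
            base = min(base, extent - factor)
        for xi in range(factor):
            x = base + xi
            if tail == "GuardWithIf" and x >= extent:
                continue  # guarded: skip iterations past the end
            visited.append(x)
    return visited


for tail in ("RoundUp", "GuardWithIf", "ShiftInwards"):
    print(tail, split_indices(10, 4, tail))
```

With extent 10 and factor 4, RoundUp visits 0..11 (past the end), GuardWithIf visits exactly 0..9, and ShiftInwards visits 0..9 but recomputes 6 and 7.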
Dillon Sharlet
@dsharletg
That shouldn't happen if you weren't explicitly using PredicateLoads or PredicateStores
Ashish Uthama
@ashishUthama
I am not, will try to create a repro and file an issue
Source code worked as-is on Halide 12
Ashish Uthama
@ashishUthama
@alexreinking:matrix.org - thanks for the comment on the windows build. We build Halide from source, but with the regular versioned release cadence - I am considering just using the released binaries.
Alex Reinking
@alexreinking:matrix.org
[m]
Sure! I'd still like to understand why your build was failing, though :)
Ashish Uthama
@ashishUthama
The downloaded libraries appear to be significantly larger than what I build locally, ~150MB vs ~45MB. Would you have any thoughts on why that may be?
Alex Reinking
@alexreinking:matrix.org
[m]
It could be a difference in how we're building LLVM or what targets are enabled?
Ashish Uthama
@ashishUthama
I would like to as well, but I'm not sure how to proceed :( I checked the definition in VS and it correctly opened up that header.
These are our cmake flags:
CMAKE_HALIDE_OPTIONS:= \
-DLLVM_DIR=${LLVM_ROOT}/release/lib/cmake/llvm \
-DCLANG=${CLANG} \
-DWARNINGS_AS_ERRORS=OFF \
-DWITH_PYTHON_BINDINGS=OFF \
-DWITH_TEST_AUTO_SCHEDULE=ON \
-DWITH_TEST_CORRECTNESS=OFF \
-DWITH_TEST_ERROR=ON \
-DWITH_TEST_WARNING=ON \
-DWITH_TEST_PERFORMANCE=ON \
-DWITH_TEST_OPENGL=OFF \
-DWITH_TEST_GENERATOR=ON \
-DWITH_APPS=OFF \
-DWITH_TUTORIALS=OFF \
-DWITH_DOCS=OFF \
-DWITH_UTILS=OFF .
Oh, LLVM. OK, likely that. Will double-check; it might be that our in-house LLVM build has limited targets.
Alex Reinking
@alexreinking:matrix.org
[m]
That would make sense. Our binaries contain all supported backends
Ashish Uthama
@ashishUthama
Yes, that explains it. Thanks! And thanks for the push to regular versioned releases, much appreciated!
Alex Reinking
@alexreinking:matrix.org
[m]
Totally! Hopefully we'll be able to get the release latency (after LLVM) even tighter for the next releases :)
Derek Gerstmann
@derek-gerstmann
@alexreinking:matrix.org Hiya! I'm looking at backporting PR #6405 from master to create a v13.0.1 release. Should I create a "backports/13.x" branch and merge things in there, and then push the results into "releases/13.x"? Just trying to match what Andrew and I are seeing in the repo to follow conventions. Any suggestions?
Alex Reinking
@alexreinking:matrix.org
[m]
I use backports/N.x for staging changes to release/N.x. When it's ready, I open a PR with release/N.x as the target branch. There is CI set up for this scenario. Be sure to include a commit that bumps the version to 13.0.1.
release/N.x is (or ought to be) protected (like master), so you can't push to it
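
The staging steps described here could look roughly like the following, written as a small Python helper that only prints git commands. The branch names come from the messages above; starting the staging branch from the tip of the release branch, the commit message text, and the `backport_commands` helper itself are assumptions made for illustration.

```python
def backport_commands(major, commit_shas, new_version):
    """Build the git commands for staging a backport (hypothetical helper)."""
    staging = f"backports/{major}.x"  # staging branch named in the chat
    release = f"release/{major}.x"    # protected target branch named in the chat
    commands = [
        "git fetch origin",
        # Assumption: staging starts from the tip of the release branch.
        f"git checkout -b {staging} origin/{release}",
    ]
    # -x records the original commit hash in each cherry-picked message.
    commands += [f"git cherry-pick -x {sha}" for sha in commit_shas]
    # Separate commit for the version bump, made after editing the version by hand.
    commands.append(f'git commit -a -m "Bump version to {new_version}"')
    commands.append(f"git push -u origin {staging}")
    return commands


for line in backport_commands(13, ["<sha-from-master>"], "13.0.1"):
    print(line)
```

After the push, a PR is opened with release/13.x as the target branch, as described above.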
Derek Gerstmann
@derek-gerstmann
Ahh ... okay cool! Does the "release/N.x" branch itself get created automatically?
Alex Reinking
@alexreinking:matrix.org
[m]
No. When creating a new major release, we fork it off of master.
Derek Gerstmann
@derek-gerstmann
Makes sense! Okay, I'll work on getting things merged! Thanks!
Alex Reinking
@alexreinking:matrix.org
[m]
Also, I prefer to not squash commits from backports/N.x into release/N.x. I think the cherry-picking history is valuable (as are any separate/additional patches necessary to correctly backport) as is keeping the version number bump separate.
No problem! Happy to share the release responsibility :)
Derek Gerstmann
@derek-gerstmann
Cool. Yeah, I'm the same. I rarely squash commits unless there's a really good reason to.
Sweet! Happy to help! :)
Alex Reinking
@alexreinking:matrix.org
[m]
I bring it up because the repository default PR-merge mode is to squash
Derek Gerstmann
@derek-gerstmann
Oooh, good to know! I'll turn it off for the release PR. Thanks for the heads up!
Alex Reinking
@alexreinking:matrix.org
[m]
Of course!