These are chat archives for halide/Halide

25th
Apr 2018
Suyog
@suyogsarda_twitter
Apr 25 09:39
Is anyone else seeing this error with the latest upstream Halide and LLVM? "Halide/src/runtime/cuda.cpp:108:5: error: misaligned or large atomic operation may incur significant performance penalty [-Werror,-Watomic-alignment]
atomic_load(&context, &local_val, ATOMIC_ACQUIRE);"
Perhaps it is caused by this change, https://reviews.llvm.org/D45319, which was also mentioned here earlier? Is there a fix for it?
Zalman Stern
@zvookin
Apr 25 15:06
Please see halide/Halide#2915
Trying to get buildbots green to submit it
Suyog
@suyogsarda_twitter
Apr 25 15:09
Thanks
Steven Johnson
@steven-johnson
Apr 25 16:18
pushed the windows fix, then synced the cuda fix to master to rebuild. hopefully all green soon and can merge to master.
Shoaib Kamil
@shoaibkamil
Apr 25 17:28
Hm, what does it mean when a buildbot says "command interrupted, attempting to kill"
Followed by cancelled / 0 no reason?
Steven Johnson
@steven-johnson
Apr 25 17:29
As usual, some buildbot misbehavior this morning — I canceled a bunch of jobs and am trying to restart them
which builds?
Shoaib Kamil
@shoaibkamil
Apr 25 17:30
Looking at the logs here: halide/Halide#2755
Ah, ok, probably because you were restarting builds
Steven Johnson
@steven-johnson
Apr 25 17:30
yeah. simplest is to sync it to head and let it restart naturally.
I'll do that now
Shoaib Kamil
@shoaibkamil
Apr 25 17:31
Makes sense, but I'm sure @slomp can do it
Steven Johnson
@steven-johnson
Apr 25 17:31
done
Shoaib Kamil
@shoaibkamil
Apr 25 17:31
Or you can, thanks!
:)
Marcos Slomp
@slomp
Apr 25 17:32
\o/
Pranav Bhandarkar
@pranavb-ca
Apr 25 17:50
@abadams - In my generator I have an output func that is the result of a simple copy after the update stage of a Func. Is it possible to get rid of the copy and use the output buffer for the pure and the update stages of the previous Func? See https://gist.github.com/pranavb-ca/bda3d35c214b7f30d0620d061a2dc346
@zvookin mentioned copy elision might handle it, but I don't know how to use it.
Zalman Stern
@zvookin
Apr 25 17:56
(I should clarify that I also mentioned that I'm not sure copy elision is good enough yet, but I think it does sort of exist and is my envisioned tool for this job :-))
Andrew Adams
@abadams
Apr 25 18:37
Copy-elision doesn't really exist in that sense yet
You can explicitly schedule the host<->device transfers to avoid a redundant second copy
But you can't currently avoid the copy that you're hitting, I think
Actually you could try output.copy_to_host() as a scheduling primitive
It might do the right thing in this case (skip the device side copy entirely)
Steven Johnson
@steven-johnson
Apr 25 18:56
@abadams : I’m a little confused about the meaning of the ‘dimensions’ field in the halide trace packet; it’s defined as ‘length of the coords array’, and Tracing.cpp just passes the length of coords for it… but halide_trace_helper multiplies it by the trace type’s vector width, which seems like it should always be wrong. Is this a mistake or am I missing something?
Pranav Bhandarkar
@pranavb-ca
Apr 25 19:02
@abadams - That's giving me this error -
User error triggered at /local/mnt/workspace/bots/hexbotmaster-aus-builder-02/halide-20/src/halide/src/Func.cpp:2806
Condition failed: func.output_buffers().size() == 1
Steven Johnson
@steven-johnson
Apr 25 22:07
@abadams so there’s an interesting bug (I think) in tracing: VectorizeLoops vectorizes all the args to Call::trace (normally, this is good); however, if the call is for begin_realization, and the Exprs for the min/extent are vectorized, the coordinates list gets inflated by the vector width… so if you are vectorizing by 16, a 2-dim realization becomes a 2x16==32-dim realization in the trace stream.
(I’m thinking the right answer is to special-case Call::trace vectorization even more than it already is, so that load/store trace calls are handled the current way and all others are scalarized, but am not 100% sure yet)
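The inflation described above can be illustrated with a small standalone model. This is only a sketch of the arithmetic, not Halide's actual tracing code: `TracedExpr`, `flattened_coord_count`, and `scalarized_coord_count` are hypothetical names invented for illustration, and the model assumes that a vectorized coordinate expression contributes one coordinate slot per lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one coordinate expression passed to a trace call.
// lanes == 1 means scalar; lanes == N means an N-wide vector.
struct TracedExpr {
    int lanes;
};

// Coordinate count if every expression is flattened lane by lane.
// This models the inflated count reported in the discussion.
std::size_t flattened_coord_count(const std::vector<TracedExpr> &coords) {
    std::size_t n = 0;
    for (const TracedExpr &e : coords) {
        assert(e.lanes >= 1);
        n += static_cast<std::size_t>(e.lanes);
    }
    return n;
}

// Coordinate count if each expression is scalarized first, so that it
// contributes exactly one slot regardless of its vector width.
std::size_t scalarized_coord_count(const std::vector<TracedExpr> &coords) {
    return coords.size();
}
```

Under these assumptions, two coordinate expressions vectorized by 16 yield a flattened count of 32, while the scalarized count stays at 2, which is the difference the proposed special-casing is meant to remove for non-load/store trace calls.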