These are chat archives for halide/Halide

21st
Aug 2018
Steven Johnson
@steven-johnson
Aug 21 17:29
@pranavb-ca What’s the update on the HVX/Simplifier bugs? At this point I think it’s the only major issue that is in the way of the new Simplifier landing.
Andrew Adams
@abadams
Aug 21 17:32
Well, once we're past correctness there's also the issue of performance (failing to prove things that we were previously able to). I've been holding off on adding new simplifier rules until we discover what the missing cases are.
Steven Johnson
@steven-johnson
Aug 21 17:32
I stand corrected.
Andrew Adams
@abadams
Aug 21 17:32
(via the "failed to prove" statements that should appear in the build logs)
Steven Johnson
@steven-johnson
Aug 21 17:36
on a different note: looks like correctness_simd_op_check is failing on llvm trunk linux; anyone already looking into this?
Andrew Adams
@abadams
Aug 21 17:36
Not to my knowledge
might be a stale branch
(halide/Halide#3224 — ready to merge except for that)
It's definitely broken on master
Steven Johnson
@steven-johnson
Aug 21 17:37
ok
but only on linux gcc5.3
merged it since clearly unrelated
Steven Johnson
@steven-johnson
Aug 21 17:38
wait what?
Andrew Adams
@abadams
Aug 21 17:38
The only trunk builder for PRs is linux gcc5.3
the rest use an llvm release version
Steven Johnson
@steven-johnson
Aug 21 17:39
works on my mac… maybe current trunk tip? I’ll sync and retry
Jing Pu
@jingpu
Aug 21 18:14
@abadams, re: the new simplifier topic, I asked Steven about it because I want to check in a large change (mostly to the google internal codebase) but I am blocked as I cannot check in new simplifier rules to master (some of them are already merged to the new simplifier branch, e.g. #2996). If it is going to take a while for the new simplifier to land, can we add new rules to both master and the new simplifier branch to unblock my progress? I will try to keep the diff small.
Andrew Adams
@abadams
Aug 21 18:14
It's quite possible that there is no performance regression, and I'm being pessimistic (ha!)
But I'm not opposed to adding the rules to master too, as long as we're sure they're also in the PR so we don't regress when merging the PR
Jing Pu
@jingpu
Aug 21 18:22
That's great! I may go ahead and start syncing some PRs back to master. By the way, are there any more comments for PR #3202?
Andrew Adams
@abadams
Aug 21 18:27
Sync it to the current state of the branch to hopefully fix the fft crash
No actual bug - I just set a canary stack limit too aggressively
Steven Johnson
@steven-johnson
Aug 21 18:27
ok, simd_op_check crashing after syncing to llvm tip
Andrew Adams
@abadams
Aug 21 18:29
Not sure what's going on with arm32 yet
Steven Johnson
@steven-johnson
Aug 21 18:32
the simd-op-check failure for me is bad access with RIP=0… we’re jumping into a null JIT function ptr. whee
Zalman Stern
@zvookin
Aug 21 18:32
simd_op_check was failing on one or two of my PRs last week.
Steven Johnson
@steven-johnson
Aug 21 18:33
it seems to be across the board at current llvm trunk
Steven Johnson
@steven-johnson
Aug 21 19:57
looking at the jit crash, but no clue as of yet. the jitted code (eventually) calls into a null ptr, but haven’t tracked down the details so far.
welp:
    0x10a2fc021: movabsq $0x0, %rax
    0x10a2fc02b: movl   $0x18001, %esi            ; imm = 0x18001
    0x10a2fc030: movq   %rcx, 0x10(%rsp)
    0x10a2fc035: movq   %rcx, %rdi
    0x10a2fc038: callq  *%rax
there’s yer problem right there
Andrew Adams
@abadams
Aug 21 20:00
... that'd do it
It looks like a relocation didn't occur
given that it's an immediate 0
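To illustrate why an immediate 0 in a call target points at a missing relocation, here is a minimal toy model in C++. It is not LLVM's JIT or Halide code: `CallSite`, `apply_relocations`, and the symbol names are hypothetical, chosen only to show that a call site whose symbol is never resolved keeps its placeholder address of 0.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model (hypothetical, not LLVM's JIT): a call site whose target
// address is emitted as a placeholder and patched in later.
struct CallSite {
    std::string symbol;     // name the call refers to
    std::uintptr_t target;  // stays 0 until a relocation fills it in
};

// Patch every call site whose symbol is known and return the names of
// the symbols that could not be resolved. Unresolved sites keep target 0.
std::vector<std::string> apply_relocations(
    std::vector<CallSite> &sites,
    const std::map<std::string, std::uintptr_t> &symbols) {
    std::vector<std::string> unresolved;
    for (CallSite &site : sites) {
        auto it = symbols.find(site.symbol);
        if (it != symbols.end()) {
            site.target = it->second;
        } else {
            unresolved.push_back(site.symbol);
        }
    }
    return unresolved;
}
```

In this model, calling through an unresolved site is the analogue of the `movabsq $0x0, %rax` / `callq *%rax` pair above; checking the returned list before running anything would surface the missing symbol by name.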
Steven Johnson
@steven-johnson
Aug 21 20:06
now to figure out what is supposed to be there
halide_malloc is the likely culprit
Steven Johnson
@steven-johnson
Aug 21 20:17
but why would it only be sometimes? the test successfully runs a bunch of jitted calls first. hmmm
Steven Johnson
@steven-johnson
Aug 21 20:38
ok, so this test uses HL_TARGET instead of HL_JIT_TARGET for some reason
Andrew Adams
@abadams
Aug 21 20:44
Because it's mostly a cross-compilation-and-check-the-source test
Steven Johnson
@steven-johnson
Aug 21 20:44
ok, this is weird: my machine’s host target is x86-64-osx-avx-avx2-f16c-fma-sse41; running with that craters as above. If I remove just the f16c feature, though, it runs fine.
Andrew Adams
@abadams
Aug 21 20:44
If it happens that the target is the same as the host target, then it additionally decides to compile and run things to check correctness
f16c? Weird
Steven Johnson
@steven-johnson
Aug 21 20:45
ah, that’s because it’s skipping tests when I remove that, since target != host
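The behaviour described above (the test only compiles and runs when the requested target equals the host target) can be sketched as follows. This is a self-contained approximation, not the test's actual code: `target_tokens` and `should_run_on_host` are hypothetical helpers that compare dash-separated target strings as unordered token sets.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper, not part of Halide: split a target string such as
// "x86-64-osx-avx-avx2" into its dash-separated tokens.
std::set<std::string> target_tokens(const std::string &target) {
    assert(!target.empty());
    std::set<std::string> tokens;
    std::istringstream in(target);
    std::string token;
    while (std::getline(in, token, '-')) {
        if (!token.empty()) tokens.insert(token);
    }
    return tokens;
}

// Hypothetical policy: run the generated code only when the requested
// target matches the host exactly; otherwise only the generated
// assembly would be inspected.
bool should_run_on_host(const std::string &target, const std::string &host) {
    return target_tokens(target) == target_tokens(host);
}
```

With the host string quoted above, `should_run_on_host("x86-64-osx-avx-avx2-fma-sse41", "x86-64-osx-avx-avx2-f16c-fma-sse41")` is false, which is why dropping f16c makes the crash disappear: the run step is skipped rather than fixed.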
Marcos Slomp
@slomp
Aug 21 23:13
I am a bit confused untangling some implementation details here:
Q: In a Call node, there's value_index, which tells me which of the possibly many values the Call refers to. That's fine, but how do I know whether the call refers to the pure definition or to an update definition of FunctionPtr func? Does the FunctionPtr func already contain the appropriate pure-or-update definition in question?
Pranav Bhandarkar
@pranavb-ca
Aug 21 23:29
@steven-johnson - it is a problem in the software pipeliner. Two of the im16(x-2, ..) loads don't seem to be fixed up correctly by the pipeliner. Instead of resulting in the value 5, we load 7, which changes the output from 1760 to 1764. I'm working with the engineer who wrote the software pipeliner, using the debug info from the pipeliner.
Steven Johnson
@steven-johnson
Aug 21 23:30
excellent news, thanks!