These are chat archives for halide/Halide

7th
Dec 2017
Steven Johnson
@steven-johnson
Dec 07 2017 01:06
Bah: looks like I spoke too soon on WinBot2; I’m once again seeing CUDA/OpenCL failures just like before for win32 builds. (Perhaps previous successful builds today happened to schedule these onto WinBot1?) https://buildbot.halide-lang.org/master/#/builders/68/builds/438
Zalman Stern
@zvookin
Dec 07 2017 02:39
The onto thing almost works, though I'm getting a couple conflicts
Trying to do it for the buffer_device_crop branch
Merging from master into this branch is now conflict city
so....
Zalman Stern
@zvookin
Dec 07 2017 02:45
I'm seeing a ton of conflicts
with the onto method
If you want to try it yourself, I'm doing "git rebase --onto 9715583a18be882d55a9c807f808b430f91b6843 c7823b64300b0904c56783b4aa5e63961cad1f9e" in the buffer_device_crop branch
Perhaps I have the wrong commit hashes
Andrew Adams
@abadams
Dec 07 2017 02:46
Possible
Zalman Stern
@zvookin
Dec 07 2017 02:46
but I tried to follow the instructions
Andrew Adams
@abadams
Dec 07 2017 02:46
You could try resetting onto 9715583a18be882d55a9c807f808b430f91b6843 instead and seeing what the diff looks like
If it's the right commit in the new history then it should only contain changes from the branch
Zalman Stern
@zvookin
Dec 07 2017 02:47
ok
At this point, it seems simplest just to recreate the changes and restart the PR
but I'll try this a bit more
But I kinda need to take off now
Steven Johnson
@steven-johnson
Dec 07 2017 15:29
I was hoping to come in to MTV today to try to wrangle WinBot2 back into working, but I feel like I'm still probably contagious to subject people to my presence; will try to keep tweaking things remotely in hopes of getting OpenCL working again.
Andrew Adams
@abadams
Dec 07 2017 15:46
The bots are getting permission denied on writing to the artifacts folder
might need a chmod
I'm trying to delete stale branches. A huge fraction of them are hexagon related, and I'm not sure which ones should be preserved. Could you take a pass through @dsharletg ?
Lots of branches from Ron and Pranav
ronlieb
@ronlieb
Dec 07 2017 16:06
if you have a list of mine handy i could take a quick whack at it, Pranav is out for a few days, so i could look at his as well
ronlieb
@ronlieb
Dec 07 2017 16:17
i may need to leave Pranav's for Pranav to do
btw, the only one i left open of mine is https://github.com/halide/Halide/tree/sim_shlib_addr_print
Andrew Adams
@abadams
Dec 07 2017 16:41
thanks!
ronlieb
@ronlieb
Dec 07 2017 16:45
pranav has 12 still open, if you need this done today, i can make a best effort and delete those that are at least 12 months old
Steven Johnson
@steven-johnson
Dec 07 2017 16:45
@abadams chmod’ed /home/abadams/artifacts to be world-writable.
@abadams : if I just take down WinBot2 entirely (while investigating), I presume all Windows tests will get routed to the one worker left — so builds won’t stop entirely, just process more slowly.
Patricia Suriana
@psuriana
Dec 07 2017 17:23
Can someone approve PR #2597
Steven Johnson
@steven-johnson
Dec 07 2017 17:24
Oooh, could this cause segfaults when trying to use a define_extern function? This might be biting me on a Halide merge
(or is it specific to auto-scheduler?)
Shoaib Kamil
@shoaibkamil
Dec 07 2017 17:44
@abadams Did rewriting history break git tags?
Can someone sanity-check something about the rewritten history? I tried:
$ git checkout https://github.com/halide/halide && cd halide && git checkout release_2017_10_30 && make -j7
make: *** No rule to make target `bin/build/Simplify.o', needed by `lib/libHalide.a'.  Stop.
make: *** Waiting for unfinished jobs....
$ find . -name Simplify.cpp
<none found>
Patricia Suriana
@psuriana
Dec 07 2017 18:21
@steven-johnson which segfault are you referring to? The PR is a fix for the auto-scheduler. Currently, it would segfault when there is extern func in the pipeline due to the extern refactoring (extern func no longer has definition)
Steven Johnson
@steven-johnson
Dec 07 2017 18:22
yes, it’s unrelated to what I’m seeing. sorry for noise.
Andrew Adams
@abadams
Dec 07 2017 18:24
Ugh. I restored the branches but not the tags.
So the tags will have missing files.
I'll fix them
Patricia Suriana
@psuriana
Dec 07 2017 18:31
@abadams are you done reviewing PR #2597?
Andrew Adams
@abadams
Dec 07 2017 18:32
ok, I reverted the tags to their pre-rewrite state
they have the old history
merged #2597
Steven Johnson
@steven-johnson
Dec 07 2017 18:36
re: OpenCL, it’s not just us; a sample NVidia OpenCL app also fails on WinBot2 at the point of trying open a cl context.
Andrew Adams
@abadams
Dec 07 2017 18:45
Well, that's good I guess
Maybe a GPU swap is in order
I left a 750Ti behind, which would be a good card for a buildbot
but is useless for anyone else
Steven Johnson
@steven-johnson
Dec 07 2017 18:46
I noticed there is an option for ‘clean install’ under ‘advanced’ for the driver, trying that now
Andrew Adams
@abadams
Dec 07 2017 18:46
I tried that :(
Steven Johnson
@steven-johnson
Dec 07 2017 18:46
oh
well
it will be extra clean now
Shoaib Kamil
@shoaibkamil
Dec 07 2017 18:48
Thanks @abadams
Steven Johnson
@steven-johnson
Dec 07 2017 18:50
yeah, still no luck. nvidia app fails with “-9999 (Unspecified Error)”.
gonna google a bit more to see if I can find similar failures
Zalman Stern
@zvookin
Dec 07 2017 18:51
What card is in the Windows box? Is just upgrading the hardware the right answer?
Looks like -9999 does have a meaning
it's just nvidia-specific
Not particularly illuminating if demo apps are throwing that though
Zalman Stern
@zvookin
Dec 07 2017 18:52
Either we're reading out of bounds or perhaps something is hardwearily wrong
Andrew Adams
@abadams
Dec 07 2017 18:53
Yeah, at this point I'd swap the card out
Zalman Stern
@zvookin
Dec 07 2017 18:53
Anything in the logs?
Gack I can't even remember the name of the command to bring up the windows log program
Steven Johnson
@steven-johnson
Dec 07 2017 18:53
wait
red herring
apparently that was “run from wrong dir” in this case
how helpful
lemme try our code now
looks promising
Marcos Slomp
@slomp
Dec 07 2017 18:59
so, HALIDE_REGISTER_GENERATOR(...)
how can I pass/specify generator parameters alongside registering the generator itself?
Steven Johnson
@steven-johnson
Dec 07 2017 18:59
there isn’t a way to do so inside the macro; you specify them when invoking the Generator in the build system
Andrew Adams
@abadams
Dec 07 2017 19:01
I guess you could make them template parameters instead of GeneratorParams if you want them concretely bound at C++ compile time, but that would be an anti-pattern I think
Steven Johnson
@steven-johnson
Dec 07 2017 19:01
Usually, yes
If you build your Generator into an executable named foo.generator
Marcos Slomp
@slomp
Dec 07 2017 19:02
hmmm... before I was using GeneratorFactory and GeneratorRegistry::register_factory to route parameters during registration time
Steven Johnson
@steven-johnson
Dec 07 2017 19:02
that… should never have worked. I’m trying to figure out how that worked before.
Marcos Slomp
@slomp
Dec 07 2017 19:02
yeah, we rather have all our generators in a single executable and manage them monolithically
Steven Johnson
@steven-johnson
Dec 07 2017 19:03
welp
Marcos Slomp
@slomp
Dec 07 2017 19:03
@steven-johnson let me share some snippets with you
Steven Johnson
@steven-johnson
Dec 07 2017 19:03
yes please
Andrew Adams
@abadams
Dec 07 2017 19:03
That seems like it would make changing the code in a single generator somewhat expensive, in terms of time taken to rebuild
Shoaib Kamil
@shoaibkamil
Dec 07 2017 19:03
@abadams Yup.
But having to play nicely with existing build systems makes it worse for other reasons to have lots of independent executables
Steven Johnson
@steven-johnson
Dec 07 2017 19:05
ok, assuming I set this up correctly, we are now passing test_correctness with all targets set to host-opencl, so I’m cautiously optimistic this may be unbroken again. Gonna restart the bot on this worker and see.
Shoaib Kamil
@shoaibkamil
Dec 07 2017 19:05
(<insert snarky comment about using a better build system>)
Andrew Adams
@abadams
Dec 07 2017 19:05
msvc solutions do seem to scale poorly with number of executables
I guess we should make sure it is feasible to link lots of generator object files into a single generator executable
So that you at least only pay a link-time cost
Shoaib Kamil
@shoaibkamil
Dec 07 2017 19:07
Yeah, that might be a good tradeoff, but @slomp definitely knows the issues better than I do (especially with windows)
Andrew Adams
@abadams
Dec 07 2017 19:08
We could also do something like make generators accept a flag that dlopens a .so to find generators within
Then compiling GenGen.cpp would be sufficient, and the rest could be compiled as .so files
Though in principle an executable and a shared object should be equivalent complexity in a build system.
Marcos Slomp
@slomp
Dec 07 2017 19:10
if there's a way to call a constructor in the derived generator classes, I think I can adapt things
(and I feel there should be some way, I am just "lost-in-code" :) )
Steven Johnson
@steven-johnson
Dec 07 2017 19:11
Unfortunately, it’s a little complicated.
Opened halide/Halide#2614 to track.
Shoaib Kamil
@shoaibkamil
Dec 07 2017 19:12
Yeah, not sure the dynamic lib approach would be much better... msvc seems to have the same issues with DLLs
Marcos Slomp
@slomp
Dec 07 2017 19:13
msvc has issues with pretty much everything :)
if not for the debugging experience, I wouldn't touch msvc with a 10 ft pole...
Steven Johnson
@steven-johnson
Dec 07 2017 19:14
MSVC + CMake are pretty much a match made in heaven
Shoaib Kamil
@shoaibkamil
Dec 07 2017 19:15
I can't tell if that's heaven or hell
Marcos Slomp
@slomp
Dec 07 2017 19:15
I have an idea of how to work around the issue
it's palliative, but should work
fortunately, only a handful of generators need this specialization
neat song from black sabbath, btw :metal:
Steven Johnson
@steven-johnson
Dec 07 2017 19:16
"only a handful of generators need this specialization”, I haven’t heard that one
Marcos Slomp
@slomp
Dec 07 2017 19:18
well... handful in % terms hehehe
Andrew Adams
@abadams
Dec 07 2017 19:18
@dsharletg What happened with Alina's arm strided store stuff?
I see a branch on it, which I just updated
Zalman Stern
@zvookin
Dec 07 2017 19:18
I actually owned that on vinyl. Dio ruled!
Andrew Adams
@abadams
Dec 07 2017 19:18
and I also see an open PR from Alina's repo
Dillon Sharlet
@dsharletg
Dec 07 2017 19:18
it was failing tests
Steven Johnson
@steven-johnson
Dec 07 2017 19:18
@slomp I can think of several ways to make workarounds that are maintainable, but since AFAICT you are the main client for it, LMK about your ideas (via issue or email).
Marcos Slomp
@slomp
Dec 07 2017 19:20
@zvookin the cover art of that album is awesome!
Zalman Stern
@zvookin
Dec 07 2017 19:24
What would happen if we introduced a new concept at compile time of Generator+params
?
Steven Johnson
@steven-johnson
Dec 07 2017 19:25
It wouldn’t be hard to support.
It just seems… odd, since the whole point of GeneratorParams are “things that a Generator doesn’t know about at compile time"
Marcos Slomp
@slomp
Dec 07 2017 19:28

No worries, need to turn things upside down because of this.
I think I found a simple workaround.

I just asked the question because I thought there was a chance I could be missing something obvious.

Steven Johnson
@steven-johnson
Dec 07 2017 19:29
gotcha
Zalman Stern
@zvookin
Dec 07 2017 19:31
I think the thing I'm thinking off is just a stylized main that instantiates the generator with specific parameters and then compiles it to a .o
The Generator is unchanged
Steven Johnson
@steven-johnson
Dec 07 2017 19:31
right
that’s what I figured
Zalman Stern
@zvookin
Dec 07 2017 19:32
But it might be a way to register another generator that is the same as the target one, but with different default parameters
Steven Johnson
@steven-johnson
Dec 07 2017 19:32
basically a way to add an ‘alias’ for ‘generator + param set'
Zalman Stern
@zvookin
Dec 07 2017 19:32
the differences is the latter would still go through the same build system path
Steven Johnson
@steven-johnson
Dec 07 2017 19:32
yes
Hmm
thinking
What if we had something like
HALIDE_REGISTER_GENERATOR_ALIAS(
   “existing-name”, “new-name”, { { “gp-name”, “gp-value” }, …} )
any instantiation of ‘new-name’ is just like creating an instance of ‘existing-name’ and applying the gp values to it
It should be trivial to support
Zalman Stern
@zvookin
Dec 07 2017 19:36
Yeah, that looks decent
Steven Johnson
@steven-johnson
Dec 07 2017 19:45
prototyping it now
Shoaib Kamil
@shoaibkamil
Dec 07 2017 19:46
@slomp ^
Marcos Slomp
@slomp
Dec 07 2017 19:48
ah, sorry, I was fixing the code here with my workaround
yeah, this alias looks good to me
Steven Johnson
@steven-johnson
Dec 07 2017 20:03
yeah, this will definitely work, I just need to fiddle with the syntax a little to make it happy with macros. look for a PR after lunch.
Shoaib Kamil
@shoaibkamil
Dec 07 2017 20:03
Thanks @steven-johnson!
Marcos Slomp
@slomp
Dec 07 2017 20:33
many thanks (and no rush, I already have a workaround in place)
Steven Johnson
@steven-johnson
Dec 07 2017 21:51
well…. crud: WinBot2 is once again failing OpenCL tests (although my tests directly on the machine this morning appeared to be passing). Either we have intermittent failures, or my tests this morning were mistaken, or something has broken since then.
Zalman Stern
@zvookin
Dec 07 2017 21:55
Lets swap the GPU when we can
Steven Johnson
@steven-johnson
Dec 07 2017 21:55
worth a try
Perhaps we should temporarily disableGPU testing on windows till then
Andrew Adams
@abadams
Dec 07 2017 21:56
Or just turn off that bot
There's another windows bot
Steven Johnson
@steven-johnson
Dec 07 2017 21:56
yes but that will halve windows build times
Andrew Adams
@abadams
Dec 07 2017 21:57
Better than flakiness or missing test failures
Steven Johnson
@steven-johnson
Dec 07 2017 21:57
hrm
fair enough
Andrew Adams
@abadams
Dec 07 2017 21:57
I can stop by and swap out the GPU on my way home, if Google is OK with me meddling with one of their machines :)
Steven Johnson
@steven-johnson
Dec 07 2017 21:57
nice try :-)
Andrew Adams
@abadams
Dec 07 2017 21:57
Assuming the 750 is where I left it
Steven Johnson
@steven-johnson
Dec 07 2017 21:58
I just stopped WInBot2
Andrew Adams
@abadams
Dec 07 2017 21:58
Hey, it's on a sandboxed network designed for this sort of thing, just consider me as a sample malicious actor.
Andrew Adams
@abadams
Dec 07 2017 22:03
@steven-johnson The branch hex-remove-constant-bit-vector-hack doesn't appear to have any commits in it not in master. Am I missing something, or can it be deleted?
Steven Johnson
@steven-johnson
Dec 07 2017 22:04
kill it
Andrew Adams
@abadams
Dec 07 2017 22:17
For some of these branches, the easiest thing to do might be to use the rewritten version, and just add back in the missing files.
Steven Johnson
@steven-johnson
Dec 07 2017 22:17
in hindsight, would have been easier to do the branch pruning before the rewrite
Andrew Adams
@abadams
Dec 07 2017 22:27
Er, @psuriana it looks like the compute_with branch now contains the entire old history and the entire new history
One was merged into the other?
Oh, looks like new master was merged into compute_with, but compute_with was based on old master
Now that it's merged with new master, I'd just do a git reset --soft master
and squash it down to one commit
Patricia Suriana
@psuriana
Dec 07 2017 22:43
@abadams will do once I address all the comments. I didn't notice there were new comments on Nov 3 :worried: I think once I did the reset --soft, it will mess up the comments.
Dillon Sharlet
@dsharletg
Dec 07 2017 23:20
I'm back now I could swap the GPU I guess
I hate digging around in computesr though
Steven Johnson
@steven-johnson
Dec 07 2017 23:20
we can limp by till tomorrow or monday