Andrew Adams
@abadams
@dsharletg the host->device case also works, but there's no benefit for cuda because the version without async already manages to overlap the cpu compute and copies in a subtle way.
Confused me for a while.
CPU compute -> synchronous copy -> async kernel launch -> next batch of CPU compute (overlapped with GPU kernel launch) -> synchronous copy (stalls until kernel launch is done) ->
Wait, so I guess the CPU compute is hidden under the GPU compute
not the copy
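A toy timing model of the sequence described above may make the overlap easier to see. It is only a sketch: the function names and durations are made up for illustration, and it assumes the synchronous copy waits for the previously launched kernel to finish, as the message describes.

```python
def pipeline_time(batches, cpu, copy, kernel):
    """Model: CPU compute -> synchronous copy -> async kernel launch, repeated.

    All durations are hypothetical time units, not measurements.
    """
    t = 0.0          # time on the CPU thread
    gpu_free = 0.0   # time at which the previously launched kernel finishes
    for _ in range(batches):
        t += cpu                 # CPU compute for this batch
        t = max(t, gpu_free)     # the synchronous copy stalls until the kernel is done
        t += copy                # the copy itself runs on the CPU thread's clock
        gpu_free = t + kernel    # async launch returns immediately; the GPU works in the background
    return max(t, gpu_free)      # wait for the last kernel before finishing


def serial_time(batches, cpu, copy, kernel):
    """Baseline with no overlap at all: every stage waits for the previous one."""
    return batches * (cpu + copy + kernel)


if __name__ == "__main__":
    print(pipeline_time(3, cpu=2, copy=1, kernel=2))  # overlapped
    print(serial_time(3, cpu=2, copy=1, kernel=2))    # fully serial
```

In this model a steady-state batch costs roughly max(cpu, kernel) + copy, so the CPU compute and the GPU kernel hide under each other while the copy stays exposed.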
Dillon Sharlet
@dsharletg
That's great news!
Steven Johnson
@steven-johnson
I’m restarting the buildbot master now
Steven Johnson
@steven-johnson
On the recent issue of exported symbols varying between opt levels: it looks like CMake added a feature in 3.4 that attempts to auto-build a .def file for you on Windows, with the net effect of (mostly) acting like the gcc-ish default of “export all symbols”: https://blog.kitware.com/create-dlls-on-windows-without-declspec-using-new-cmake-export-all-feature/
I haven’t tried it (and we are talking about CMake here so who knows)...
Steven Johnson
@steven-johnson
We explicitly forbid using ‘.’ in a Func name since we use that as a separator internally, but we don’t seem to have a similar constraint on Var name. Deliberate or accidental?
Andrew Adams
@abadams
Var names are not uniqued either
Accidental I think
Zalman Stern
@zvookin
Var names are not uniqued by design
They're value types
Steven Johnson
@steven-johnson
Right
Andrew Adams
@abadams
Lack of '.' enforcement is the accidental thing
Steven Johnson
@steven-johnson
Just idly wondering if more constraints on the names allowed would give us more flexibility in the future. (e.g. GeneratorParam names are limited to C-style identifier rules, with additional constraints on underscore usage). Probably overthinking it.
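For illustration, a name check along those lines could look like the sketch below. It is not Halide's actual validation code: the function names are hypothetical, and the extra underscore constraints mentioned above are left out because they are not spelled out here.

```python
import re

# C-style identifier: a letter or underscore, then letters, digits or underscores.
_C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_c_identifier(name: str) -> bool:
    """Return True if `name` follows C-style identifier rules."""
    return _C_IDENTIFIER.fullmatch(name) is not None


def contains_separator(name: str, separator: str = ".") -> bool:
    """Return True if `name` contains the character reserved as a separator."""
    return separator in name


if __name__ == "__main__":
    for candidate in ["x_outer", "x.outer", "1x"]:
        print(candidate, is_c_identifier(candidate), contains_separator(candidate))
```

Any name that passes the C-style check is automatically free of '.', so a single stricter rule would cover the separator case as well.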
Re: the windows buildbots: I updated the scripts and did a buildbot stop and start, but builds completing since then still seem to be using the old, broken windows testing approach. I wonder, do the workers queue up the commands on the worker (and thus this could be just stale builds completing)? Investigating...
Steven Johnson
@steven-johnson
Hmm, this is odd: I stopped buildbot again; when restarting, it is now failing with "could not find buildbot-www; is it installed?" which is something I haven’t seen before. @abadams, is it wise/unwise to restart the entire buildbot VM when updating?
Steven Johnson
@steven-johnson
logout, log back in, now starting it is telling me I need a txrequests package installed. Oy.
Just gonna reboot the VM.
Nope. Still busticated.
Steven Johnson
@steven-johnson
bah: chmod is not my friend
chmod’ing stuff to my user seems to have healed it, per comments in @abadams document — sadly, the failure modes were obscure and unrelated enough that I didn’t think to try that
ronlieb
@ronlieb
Hi Folks, i am seeing a failure building camera_pipe after the most recent commit.
make: *** No rule to make target 'bin/Demosaic.o', needed by 'bin/process'. Stop.
Dillon Sharlet
@dsharletg
just pushed what should fix it
ronlieb
@ronlieb
it did, thx
Suyog
@suyogsarda_twitter

Hi All, I was looking at issue 2317 (halide/Halide#2317), where input.dim(0).set_min(0) was resulting in slower code on CPU. Further digging into the code and some experiments showed that the slowness is due only to input.dim(0).set_min(0) and not to input.dim(1).set_min(0).

In the codegen, i see some checks and asserts for "halide_buffer_is_bounds_query" and these are inserted on CPU side always. Even if the schedule is offloaded to Hexagon, the asserts are always inserted in CPU code. Hence the slowness is always observed on CPU schedule, but not on Hexagon.

Q - For schedules offloaded to Hexagon, even if the asserts are on CPU side, why isn't slowness observed? I assume we are measuring time which involves the CPU to Hexagon and back offload time too. Any idea?

Andrew Adams
@abadams
It'd have to be a really small pipeline for that assert to matter - e.g. processing an 8x8 image
It's an inlined comparison of two of the input buffer fields to zero - should be perfectly branch-predicted too
The no_asserts and no_bounds_query target flags turn off all that code, so you can try those for testing.
Suyog
@suyogsarda_twitter
Thanks, I will try that. However, the effect is observed for every test case (and the image size is large enough). Also, in the generated code, the CPU code differs only in those asserts, accompanied by a bunch of mov instructions at the end of the computations for that function, while for schedules on Hexagon the generated code is exactly the same. Hence my guess was that those asserts cause the slowness in the CPU code.
Andrew Adams
@abadams
@steven-johnson windows builds are now all failing tests (now that we're running them). Can't tell if it's real, or a build config issue. I see cmake reporting steps as failed, but don't see any indication of what failed.
It's not doing something dumb like assuming the word "error" in the output means a failure, is it? I recall that being an issue.
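To make the distinction concrete, here is a minimal sketch of the two ways a step could be judged. It says nothing about what the buildbot or CMake actually does; the function names are hypothetical and the sample output line is invented.

```python
import subprocess
import sys


def run_step(args):
    """Run a command and return (exit code, combined stdout and stderr)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


def failed_by_exit_code(returncode, output):
    """Treat a step as failed only if the process exited with a nonzero code."""
    return returncode != 0


def failed_by_keyword(returncode, output):
    """Fragile heuristic: treat any mention of 'error' in the output as a failure."""
    return "error" in output.lower()


if __name__ == "__main__":
    # A passing step whose output happens to contain the word "error".
    code, out = run_step([sys.executable, "-c", "print('max error: 0.0, passed')"])
    print(failed_by_exit_code(code, out), failed_by_keyword(code, out))
```

A test that prints something like an error metric would be flagged by the keyword heuristic even though it exited cleanly, which is the kind of false failure being asked about.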
Steven Johnson
@steven-johnson
Well, that's progress I guess... Will investigate when I get in today. It's almost certainly a build config issue unless something has injected a platform specific failure in the last week or so as these targets worked correctly on my local box.
Suyog
@suyogsarda_twitter
@abadams arm-64-android-no_asserts-no_bounds_query didn't produce faster code. The problem seems to be something else, then.
Steven Johnson
@steven-johnson
looking at windows failure log now — there are only a handful of failures but the errors aren’t really explicated, gonna have to try to replicate individually by remoting into one of the windows buildbots. (My personal Windows box is down again and I’m in different office today, yay windows)
(given that it’s only a handful, it’s possible these are legit errors that have crept in over the past two weeks of non-testing rather than config stuff)
Steven Johnson
@steven-johnson
@abadams: are the two Windows buildbots interchangeable from a build target standpoint?
Andrew Adams
@abadams
yes
Steven Johnson
@steven-johnson
Unsurprisingly, trying to Chromote into them from a GBus is unusably slow. (Not even glacial, this is more tectonic, several-seconds-to-respond-to-keystroke-slow.) I’m just gonna assume that ssh’ing into them isn’t a thing. Will have to wait until I get in to debug further :-(
Andrew Adams
@abadams
I find them to be super-slow until you turn off the buildbot, just due to windows process priority issues.
But the gbus certainly won't help :)
Should really remember to lower the process priority of the shell the buildbot is launched in, so that everything it spawns is also low-priority.
Steven Johnson
@steven-johnson
ahhh yes stopping the worker makes it usable :)
Steven Johnson
@steven-johnson
I hate CMake with the fury of a million exploding suns
Andrew Adams
@abadams
Because you discovered what was wrong and it's dumb, or because you can't figure out what's wrong because it sucks?
Steven Johnson
@steven-johnson
If you run something via add_custom_target(), it considers a nonzero error code to be failure, as it should
but