These are chat archives for bjz/gfx-rs

18th
Jul 2014
Dzmitry Malyshau
@kvark
Jul 18 2014 00:04
@bjz so where would be the best place for me to implement syntax extensions?
Corey Richardson
@cmr
Jul 18 2014 00:04
@kvark I think the only one who knows how deriving works is huon
Dzmitry Malyshau
@kvark
Jul 18 2014 00:05
@cmr thanks, I'll try to bug him later tonight
Dzmitry Malyshau
@kvark
Jul 18 2014 01:02
@cmr I've made up my mind on the texture subtyping - I'm for a single type. Here is why:
  1. Imagine what device::target::Plane will look like with multiple types, or envir::Storage::textures
  2. Looking into the future, it seems like everything converges to having raw data (chunk of memory) and different ways to view it or render it. These shader/render views can be multiple per texture, hence there is no point in putting one of them into the texture type.
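A minimal sketch of the "single texture type plus views" idea argued for above, written in current Rust syntax. All names here (`Texture`, `TextureKind`, `View`, `shader_view`, `render_view`) are hypothetical and are not gfx-rs API; the point is only that the kind lives in an enum field and views are separate values that can be created many times per texture.

```rust
// Hypothetical types for illustration only; not the gfx-rs API.

/// The dimensionality of the raw storage, kept as data instead of as a type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum TextureKind {
    D1,
    D2,
    D3,
    Cube,
}

/// One texture type: a handle to raw memory plus its description.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Texture {
    pub handle: u32,
    pub kind: TextureKind,
    pub levels: u8,
}

/// Different ways to look at the same storage. Many views may share one texture.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum View {
    /// Sampled from a shader, restricted to a range of mip levels.
    Shader { texture: Texture, first_level: u8, level_count: u8 },
    /// Rendered into, at a single mip level.
    Render { texture: Texture, level: u8 },
}

impl Texture {
    /// Returns a shader view if the requested mip range exists.
    pub fn shader_view(self, first_level: u8, level_count: u8) -> Option<View> {
        // Widen to u16 so the range check cannot overflow.
        let end = u16::from(first_level) + u16::from(level_count);
        if level_count > 0 && end <= u16::from(self.levels) {
            Some(View::Shader { texture: self, first_level, level_count })
        } else {
            None
        }
    }

    /// Returns a render view if the requested mip level exists.
    pub fn render_view(self, level: u8) -> Option<View> {
        if level < self.levels {
            Some(View::Render { texture: self, level })
        } else {
            None
        }
    }
}
```

With this shape, a container such as a list of render targets or a texture table holds one element type, which is the concern raised in point 1.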
Corey Richardson
@cmr
Jul 18 2014 01:04
@kvark the first I wasn't worried about since it'd be an enum, and I agree with the second.
Dzmitry Malyshau
@kvark
Jul 18 2014 01:07
In fact, it's not a future. It's been like that since DX10 days or even earlier, AFAIK
it's just GL that stayed behind
Dzmitry Malyshau
@kvark
Jul 18 2014 02:09
Does this doc link work for anyone?
Corey Richardson
@cmr
Jul 18 2014 02:09
it works for me
Dzmitry Malyshau
@kvark
Jul 18 2014 02:10
My search stopped working altogether :(
Dzmitry Malyshau
@kvark
Jul 18 2014 02:27
ok, turned out I was out of free HDD space
Corey Richardson
@cmr
Jul 18 2014 04:59
@kvark do you think we'll need to expose anything around things like sync objects?
Andy Trevorah
@trevorah
Jul 18 2014 09:22
@bjz gitter will stay free for open source projects
we will start charging for private orgs/rooms soon
Dzmitry Malyshau
@kvark
Jul 18 2014 11:37
@cmr what sync objects do you have in mind? GPU queries and fences definitely need to be exposed in some way.
Dzmitry Malyshau
@kvark
Jul 18 2014 12:04
Here it comes - prototype 3 of the shader parameters. @csherratt @cmr @bjz
Marvin Löbel
@Kimundi
Jul 18 2014 12:09
(And @Kimundi, even though he is completely unqualified for it ;) )
Dzmitry Malyshau
@kvark
Jul 18 2014 12:11
@Kimundi sorry, really everyone is invited to look! I just listed pals who used to discuss these things with me.
Marvin Löbel
@Kimundi
Jul 18 2014 12:12
No need to be sorry! This was just a joke about me idling in here for like a week :D
Dzmitry Malyshau
@kvark
Jul 18 2014 12:14
@Kimundi I was wondering what you might want to tackle next since GL compat is working now.
Marvin Löbel
@Kimundi
Jul 18 2014 12:18
@kvark: Good question. Next goal for me would probably be to actually write some toy code using gfx-rs. Like, a few boxes, a camera, etc
I keep getting sidetracked though xD
Dzmitry Malyshau
@kvark
Jul 18 2014 12:21
@Kimundi nice, I see. I believe @bjz wanted to do the terrain example, but we need more ;)
(client3.rs code updated)
Marvin Löbel
@Kimundi
Jul 18 2014 12:28
@kvark: So, if I understand your prototype 3 right it basically provides a type-safe way to provide arbitrary data structures as shader parameters?
Dzmitry Malyshau
@kvark
Jul 18 2014 12:30
A shader program is supposed to be represented by a single data structure that has all the free-standing (not in uniform blocks) uniforms, blocks, and textures.
All the verification is done by the macro that generates the ShaderParam implementation. Once the macro is done, the structure is extremely safe and simple to use. There is no way to "forget" a parameter or to provide an incorrect type.
Marvin Löbel
@Kimundi
Jul 18 2014 12:32
Sounds good
So that Data struct example contains a texture and uniform, correct?
Dzmitry Malyshau
@kvark
Jul 18 2014 12:37
yes, a texture and a vec4 uniform. Their names are supposed to match shader names 1:1
Marvin Löbel
@Kimundi
Jul 18 2014 12:37
okay
Dzmitry Malyshau
@kvark
Jul 18 2014 12:40
@Kimundi I'm wrong. verification is done in the create_link at run time, the code for which is generated.
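To make the run-time check concrete, here is a rough sketch of what a `create_link` generated for a parameter struct might do, in current Rust syntax. Only the names `ShaderParam` and `create_link` come from the discussion; `ParamType`, `ProgramInfo`, `LinkError`, `fields`, and the `Data` field names are assumptions made up for this example, and the real prototype may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical subset of parameter types a program can report.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ParamType {
    Vec4,
    Texture,
}

/// Stand-in for program introspection results: parameter name -> type.
pub type ProgramInfo = HashMap<String, ParamType>;

#[derive(Debug, PartialEq)]
pub enum LinkError {
    /// The struct has a field the program does not declare.
    Missing(&'static str),
    /// The field exists in the program but with another type.
    WrongType(&'static str),
    /// The program declares a parameter the struct does not provide.
    Unused(String),
}

/// Hypothetical shape of the trait the macro would implement.
pub trait ShaderParam {
    /// (field name, type) pairs; this is the part a macro would generate.
    fn fields() -> Vec<(&'static str, ParamType)>;

    /// Checks the struct against the program, by name and by type.
    fn create_link(info: &ProgramInfo) -> Result<(), LinkError> {
        let fields = Self::fields();
        for &(name, ty) in &fields {
            match info.get(name) {
                None => return Err(LinkError::Missing(name)),
                Some(&found) if found != ty => return Err(LinkError::WrongType(name)),
                Some(_) => {}
            }
        }
        // Every program parameter must be covered, so nothing can be forgotten.
        for name in info.keys() {
            if !fields.iter().any(|&(f, _)| f == name.as_str()) {
                return Err(LinkError::Unused(name.clone()));
            }
        }
        Ok(())
    }
}

/// Example struct: a vec4 uniform and a texture, named as in the shader.
pub struct Data {
    pub color: [f32; 4],
    pub diffuse: u32,
}

impl ShaderParam for Data {
    fn fields() -> Vec<(&'static str, ParamType)> {
        vec![("color", ParamType::Vec4), ("diffuse", ParamType::Texture)]
    }
}
```

The check runs once when the program and the struct are linked, so a mismatch surfaces at that point and not at every draw call.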
Marvin Löbel
@Kimundi
Jul 18 2014 12:41
Hm, I'm wondering how #[shader_param] is going to work. Does it just have hardcoded knowledge of legal types, or will all fields of the struct have to implement a trait?
Dzmitry Malyshau
@kvark
Jul 18 2014 12:42
(gist is updated)
@Kimundi it will have a range of types it supports, yes, hard-coded
sorry, bb in 1h
Marvin Löbel
@Kimundi
Jul 18 2014 12:43
ah, I see
Dzmitry Malyshau
@kvark
Jul 18 2014 13:13
Also note that the whole concept of Environment goes away with that, with envir.rs file going down in favor of shade.rs
Corey Richardson
@cmr
Jul 18 2014 13:19
@kvark I don't know the details of how program introspection works when your shaders have structs -- can it work with this? What would the method look like on Uploader?
Dzmitry Malyshau
@kvark
Jul 18 2014 13:21
@cmr good question. GL reports struct fields by full paths (struct_name.field_name), and we can query those by traversing user structure recursively. Honestly, I don't think this is a big deal since we are only talking about free-standing parameters. If the user needs to use a struct so badly, he can just use a uniform block.
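The recursive traversal mentioned here could be sketched as follows, in current Rust syntax. `Param` and `flatten` are hypothetical names; the only thing taken from the message is that nested fields are addressed by dotted full paths such as `struct_name.field_name`.

```rust
/// Hypothetical description of a user parameter: a plain value or a nested struct.
pub enum Param {
    Leaf,
    Struct(Vec<(&'static str, Param)>),
}

/// Collects the full dotted path of every leaf, depth first, in declaration order.
pub fn flatten(prefix: &str, param: &Param, out: &mut Vec<String>) {
    match param {
        Param::Leaf => out.push(prefix.to_string()),
        Param::Struct(fields) => {
            for (name, child) in fields {
                // Join with a dot unless we are still at the root.
                let path = if prefix.is_empty() {
                    name.to_string()
                } else {
                    format!("{}.{}", prefix, name)
                };
                flatten(&path, child, out);
            }
        }
    }
}
```

Each resulting path is a name that could then be looked up in the program's introspection data.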
Corey Richardson
@cmr
Jul 18 2014 13:22
ah ok
Dzmitry Malyshau
@kvark
Jul 18 2014 13:26
@cmr I'm almost completely unaware of how D3D reports the variables back. Our production code here parses the sources and extracts all the info from them (instead of asking the API).
Another (arguably) good thing about prototype-3 is that the draw call has a single parameter for the bundle, which is the program + data.
Dzmitry Malyshau
@kvark
Jul 18 2014 13:31
ah, we are actually using this too
well, seems like D3D will be just fine then ;)
Corey Richardson
@cmr
Jul 18 2014 13:41
The prototype looks neat.
Dzmitry Malyshau
@kvark
Jul 18 2014 13:42
@cmr awesome, thanks for having a look! The implementation may take me a while, but so far it at least seems possible.
Corey Richardson
@cmr
Jul 18 2014 13:52
It looks like Metal will also be fine, but I honestly cannot tell.
The ObjC runs deep in its interface :fire:
Corey Richardson
@cmr
Jul 18 2014 13:58
@kvark another annoyance is compressed textures.
which is very backend specific
mobile doesn't have s3tc
Dzmitry Malyshau
@kvark
Jul 18 2014 14:00
well that's why we have device caps, right?
Corey Richardson
@cmr
Jul 18 2014 14:01
you're right, that should work.
even if you're compressing the textures at runtime that's still better than using uncompressed textures
Dzmitry Malyshau
@kvark
Jul 18 2014 14:03
if you can bear with the driver-produced compression quality, yeah :) basically trading loading time and quality for the frame time and memory afterwards
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:06
@kvark I need to see the ShaderParam trait and #[shader_param] macro to get an idea of what this actually looks like. But this is very neat.
what is envir::Uploader doing?
Dzmitry Malyshau
@kvark
Jul 18 2014 15:10
@csherratt thanks for having a look! ShaderParam trait is there (note multiple files), and the macro is just going to auto-implement it.
The Uploader is perhaps the weakest part of the interface here. It's a visitor from the Renderer side that will translate the incoming assignments from the ShaderParam into device cast messages.
@csherratt your crazy idea becomes more of a reality, slowly :)
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:21
r+
This is going to be very neat to use.
Corey Richardson
@cmr
Jul 18 2014 15:22
I'm worried that we are going to be bottlenecked by channel usage.
The fastest a channel is ever going to be is about 30ns
just sending a uint back and forth
Dzmitry Malyshau
@kvark
Jul 18 2014 15:23
@cmr we'll start packing multiple calls into one, that is invisible to the user so can be postponed
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:23
@cmr is it actually that fast?
Corey Richardson
@cmr
Jul 18 2014 15:24
@csherratt on my machine, yes.
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:24
30ns is faster than an L3 cache miss....
do you have a context switch?
Corey Richardson
@cmr
Jul 18 2014 15:25
nope
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:25
in your benchmark, or is it just send, recv?
ah...
Corey Richardson
@cmr
Jul 18 2014 15:25
just send/recv within a single task
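A single-thread send/recv loop of the kind described could look roughly like this. It uses today's `std::sync::mpsc`, whose implementation differs from the 2014 channel being discussed, so the numbers it produces are not comparable to the ones quoted in this chat. `bench_send_recv` is a made-up name and this is not the benchmark that was actually run.

```rust
use std::sync::mpsc::channel;
use std::time::{Duration, Instant};

/// Sends and immediately receives `rounds` integers on one thread.
/// Returns the elapsed time and a checksum so the loop cannot be optimized away.
pub fn bench_send_recv(rounds: u64) -> (Duration, u64) {
    let (tx, rx) = channel::<u64>();
    let start = Instant::now();
    let mut sum = 0u64;
    for i in 0..rounds {
        tx.send(i).expect("receiver is alive");
        sum += rx.recv().expect("sender is alive");
    }
    (start.elapsed(), sum)
}
```

Dividing the returned duration by `rounds` gives a per-round-trip cost. Because there is no second thread, it measures queue overhead only and excludes any context switch or cross-core traffic.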
Dzmitry Malyshau
@kvark
Jul 18 2014 15:25
so 10-20 messages per call, resulting in about 0.5us per call, allowing about 2000 calls to reach 1ms mark. I think we are good ;)
@cmr especially since we don't receive anything typically during rendering, so even faster
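The back-of-the-envelope estimate above can be written out as a small function. The inputs are the figures quoted in the chat (about 30 ns per message, 10 to 20 messages per call, a 1 ms budget); `calls_per_budget` is a name invented for this sketch.

```rust
/// How many draw calls fit into `budget_ns` if each call costs
/// `messages_per_call` channel messages of `ns_per_message` each.
pub fn calls_per_budget(ns_per_message: u64, messages_per_call: u64, budget_ns: u64) -> u64 {
    budget_ns / (ns_per_message * messages_per_call)
}
```

At 30 ns per message this gives between roughly 1,600 calls (20 messages each) and 3,300 calls (10 messages each) per millisecond, which brackets the "about 2000" figure.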
Corey Richardson
@cmr
Jul 18 2014 15:26
If you aren't worried about it, then I'm not worried about it :)
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:28
Mutex lock/unlock 25 ns
Dzmitry Malyshau
@kvark
Jul 18 2014 15:29
@cmr Frankly, I'm not the person you should align your worries with, I'm mostly as calm as an elephant
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:36
Reading how channel is implemented is always like :O for me. They have done a pretty crazy good job of optimizing it.
Dzmitry Malyshau
@kvark
Jul 18 2014 15:36
@csherratt link?
Corey Richardson
@cmr
Jul 18 2014 15:36
"They" specifically being Alex :P
Are fractional values for GL_TEXTURE_MAX_ANISOTROPY_EXT useful? It'd be nice to keep it to a u8 size, instead of a huge f32
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:37
The meat of channel is implemented in spsc_queue.rs
hmm that is not a pretty link
The high level api that you use is here. stream.rs
Coraline Sherratt
@removed~csherratt
Jul 18 2014 15:43
Also take a look at the mpmc queue. Alex forced the data structure to be split across cache lines to avoid cache contention.
Dzmitry Malyshau
@kvark
Jul 18 2014 15:43

@cmr from GL extension

N = min(ceil(Pmax/Pmin),maxAniso)
Lambda' = log2(Pmax/N)

It seems that floating point makes sense for the calculation there
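The two formulas quoted from the extension can be transcribed directly, which also shows where a fractional maximum would matter. The function names are made up for this sketch; `p_max` and `p_min` stand for Pmax and Pmin.

```rust
/// N = min(ceil(Pmax / Pmin), maxAniso)
pub fn sample_count(p_max: f32, p_min: f32, max_aniso: f32) -> f32 {
    (p_max / p_min).ceil().min(max_aniso)
}

/// Lambda' = log2(Pmax / N)
pub fn lod(p_max: f32, n: f32) -> f32 {
    (p_max / n).log2()
}
```

A fractional `max_aniso` only takes effect when it is the smaller argument of the `min`, in which case N itself becomes fractional and shifts Lambda' slightly. With an integer maximum, N is always a whole number.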

Corey Richardson
@cmr
Jul 18 2014 15:44
d3d uses a uint as well.
Dzmitry Malyshau
@kvark
Jul 18 2014 15:45
in this case we don't have a choice, we need the common denominator. Uint it is then!
Dzmitry Malyshau
@kvark
Jul 18 2014 15:50
@csherratt interesting. He must have used some heavy valgrind profiling for it
Coraline Sherratt
@removed~csherratt
Jul 18 2014 16:04
Considering at one point all queues were implemented as a linked list I am super happy with it.
Corey Richardson
@cmr
Jul 18 2014 16:04
heheh
Coraline Sherratt
@removed~csherratt
Jul 18 2014 16:05
30ns is faster than LTE's sample rate, so it's pretty fast.
That being said, I imagine you will want more coarse commands to allow for optimizations on the render server side when the project gets to that point. There will probably be quite a bit of tweaking to get it right.
Coraline Sherratt
@removed~csherratt
Jul 18 2014 16:12
@cmr would you be able to bench using two channels? Like writing to a cloned channel; that should turn the queue into an mpsc queue, which might be slower.
Corey Richardson
@cmr
Jul 18 2014 16:24
@csherratt weirdly, doesn't seem to be much of a perf difference, seeing 48ns and 46ns respectively.
It may have been go that was ~30ns ...
Coraline Sherratt
@removed~csherratt
Jul 18 2014 16:43
Poking around with this, I get around 60-65ns in both cases. But I replaced datum with an array to see how memory bandwidth scaled. A block of 64bytes => ~100ns, 256bytes => ~150ns and 1024 => ~275ns
Corey Richardson
@cmr
Jul 18 2014 16:45
Looks like logarithmic scaling? Do you have an explanation for that? I would have figured it'd be roughly linear.
Coraline Sherratt
@removed~csherratt
Jul 18 2014 16:49
64 => 640MB/s 256 => 1700MB/s 1024 => 3700MB/s
I think we are actually benchmarking the memory copy at that point.
Corey Richardson
@cmr
Jul 18 2014 16:49
Oh it's obvious, it's scaling up to memory bandwidth.
And the overhead of the channel is being amortized.
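The throughput figures quoted above follow directly from the measured times: bytes per nanosecond equals decimal GB/s, so multiplying by 1000 gives MB/s. A small helper (the name is invented here) reproduces them.

```rust
/// Converts a payload size and a measured time per message into MB/s
/// (decimal megabytes, integer division).
pub fn throughput_mb_per_s(bytes: u64, ns: u64) -> u64 {
    // bytes / ns is GB/s; scale by 1000 first to keep integer precision.
    bytes * 1000 / ns
}
```

The payload grows 16x from 64 to 1024 bytes while the time grows less than 3x, which is what a fixed per-message cost plus a per-byte copy cost would produce.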
cmr @cmr loves it when structures fit nicely into a u64
Coraline Sherratt
@removed~csherratt
Jul 18 2014 17:00
For memcpy I am getting at least 24GB/s with rust. So the queue is nearly an order of magnitude slower...
Corey Richardson
@cmr
Jul 18 2014 17:06
That's about what I'd expect, given that it has a bunch of branches and swaps.
Corey Richardson
@cmr
Jul 18 2014 17:30
Bikeshed time! Default texture filtering method? I think 8x aniso, for quality. If you don't know what you're doing, it seems reasonable that we give you the (almost) highest quality.
Dzmitry Malyshau
@kvark
Jul 18 2014 18:11
@cmr I'd say just trilinear for mipmapped textures and bilinear for others, but I don't really care that much
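For illustration, the two proposals could be expressed like this in current Rust syntax. `FilterMethod`, `default_filter`, and `anisotropic` are hypothetical names, the chat does not settle which default wins, and the `u8` payload reflects the earlier decision to store the anisotropy level as an unsigned integer.

```rust
/// Hypothetical filtering modes.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum FilterMethod {
    Bilinear,
    Trilinear,
    /// Maximum anisotropy as a small integer, e.g. 8 for "8x".
    Anisotropic(u8),
}

/// The suggestion in this message: trilinear when mipmaps exist, bilinear otherwise.
pub fn default_filter(has_mipmaps: bool) -> FilterMethod {
    if has_mipmaps {
        FilterMethod::Trilinear
    } else {
        FilterMethod::Bilinear
    }
}

/// The alternative: anisotropic filtering, clamped to what the device reports.
/// Both values are raised to at least 1, the lowest meaningful level.
pub fn anisotropic(requested: u8, device_max: u8) -> FilterMethod {
    FilterMethod::Anisotropic(requested.max(1).min(device_max.max(1)))
}
```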
Corey Richardson
@cmr
Jul 18 2014 18:34
So forgive my ignorance, but can someone explain the precise separation between device and render?
It seems render is just a nice wrapper around device?
Dzmitry Malyshau
@kvark
Jul 18 2014 18:41
render is:
  1. provides a higher level abstraction. Operates on meshes and shader bundles, instead of separate attributes and parameters.
  2. provides the interface for deferred resources
  3. hides the internals of message passing from the user
Corey Richardson
@cmr
Jul 18 2014 19:50
I think we'll need associated types to have multiple backends with #[cfg]
There's no good way to say -> SomePerBackendSamplerType for ApiBackEnd
Dzmitry Malyshau
@kvark
Jul 18 2014 19:55
@cmr but it's not a diversity issue, it's just the lack of support for some heavy sampler types on weak platforms, right? In which case you can just have them all in one enum
Corey Richardson
@cmr
Jul 18 2014 19:55
@kvark well I don't care about that specifically, it's for every type we use with dev::Foo
also I don't mean sampler type like 2d or 3d
I mean dev::Sampler
it can have different sizes for different backends
d3d is a pointer, ogl is a u32, metal is some horrendous objc object (probably also a pointer)
Dzmitry Malyshau
@kvark
Jul 18 2014 19:58
@cmr yeah, and for Gnm it's a bare struct
but why do you care about different sizes?
Corey Richardson
@cmr
Jul 18 2014 19:58
Because if you are going to have the trait ApiBackEnd implemented for all of these, what type does create_buffer return?
Dzmitry Malyshau
@kvark
Jul 18 2014 19:59
hmm I see. Perhaps ApiBackEnd will have to wait then...
Corey Richardson
@cmr
Jul 18 2014 19:59
Not necessarily, just the ability to have a single compiled copy of gfx-rs containing support for multiple backends.
What we have now with #[cfg] works.
and I think it will be straightforward once Rust has associated types
(so you can say that a trait implementation provides a specific type for the trait to use)
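Associated types have since landed in Rust, so the idea can be sketched in current syntax. `ApiBackEnd` and `create_buffer` are names from the discussion; everything else (`create_sampler`, `GlLike`, `StructLike`, `SamplerDesc`, `make_sampler`) is invented for this example and is not gfx-rs code.

```rust
/// Each backend names its own handle types.
pub trait ApiBackEnd {
    type Buffer;
    type Sampler;
    fn create_buffer(&mut self) -> Self::Buffer;
    fn create_sampler(&mut self) -> Self::Sampler;
}

/// A GL-like backend: objects are plain u32 names.
pub struct GlLike {
    next_name: u32,
}

impl GlLike {
    pub fn new() -> Self {
        GlLike { next_name: 0 }
    }
}

impl ApiBackEnd for GlLike {
    type Buffer = u32;
    type Sampler = u32;
    fn create_buffer(&mut self) -> u32 {
        self.next_name += 1;
        self.next_name
    }
    fn create_sampler(&mut self) -> u32 {
        self.next_name += 1;
        self.next_name
    }
}

/// A backend whose sampler is a bare struct, with a different size than a u32.
#[derive(Debug, PartialEq)]
pub struct SamplerDesc {
    pub max_anisotropy: u8,
}

pub struct StructLike;

impl ApiBackEnd for StructLike {
    type Buffer = usize;
    type Sampler = SamplerDesc;
    fn create_buffer(&mut self) -> usize {
        0
    }
    fn create_sampler(&mut self) -> SamplerDesc {
        SamplerDesc { max_anisotropy: 1 }
    }
}

/// Generic code can name the per-backend type without knowing its size.
pub fn make_sampler<B: ApiBackEnd>(backend: &mut B) -> B::Sampler {
    backend.create_sampler()
}
```

Because the return type is resolved per implementation, both backends can live in one compiled crate without `#[cfg]` selecting between them.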
Dzmitry Malyshau
@kvark
Jul 18 2014 20:00
yeah, that'd be great
Corey Richardson
@cmr
Jul 18 2014 20:01
Also, what is Gnm?
Also, do we currently lack any way of destroying objects?
Dzmitry Malyshau
@kvark
Jul 18 2014 20:08
@cmr yeah, destroying objects was supposed to come after/with resource management (#22).
Gnm is PS4 graphics API
Corey Richardson
@cmr
Jul 18 2014 20:09
@bjz is @photex still working on a resource manager?
Corey Richardson
@cmr
Jul 18 2014 20:44
Is anyone against me porting all the gl stuff to use hgl? It's killing me seeing all of this duplication for things hgl could be doing for everyone.
@kvark ?
meh
Corey Richardson
@cmr
Jul 18 2014 21:38
@bjz r? bjz/gl-rs#106
Looks like gl_generator!(..., "GL_EXT_texture_filter_anisotropic,GL_ARB_something")
Dzmitry Malyshau
@kvark
Jul 18 2014 22:53
@cmr I'd rather keep it without hgl, even though I haven't looked deep into it. What string duplication are you talking about?
Corey Richardson
@cmr
Jul 18 2014 22:53
Not string duplication, just almost all of the gl that gfx-rs is doing hgl also provides rusty wrappers for
I decided I didn't care, though.
Dzmitry Malyshau
@kvark
Jul 18 2014 22:57
I see that it could fit, but I've been working with raw GL for too long.
brb 1h