These are chat archives for rust-lang/rust

8th
May 2019
octave99
@octave99
May 08 12:42
Could someone help me understand, in the code below, why the "new" call from main works while the same call from within the implementation gives an error saying "expected type parameter, found struct `std::vec::Vec`"?
octave99
@octave99
May 08 12:50
#[derive(Debug)]
pub struct Animal<T>(T);

impl<T> Animal<T> where T: AsRef<[u8]>{
    fn new(n: T) -> Self {
        Animal(n)
    }
    // ERROR : expected type parameter, found struct `std::vec::Vec`
    fn from_vec() -> Self {
        Self::new(vec![1, 2, 3])
    }
}

fn main() {
    // WORKS
    print!("{:?}", Animal::new(vec![1, 2, 3]));
}
Denis Lisov
@tanriol
May 08 12:56
Your from_vec returns Self, which is Animal<T>, but you're trying to return Animal<Vec<_>>. What if T is not a Vec<_>?
octave99
@octave99
May 08 13:21
@tanriol T is anything which implements AsRef, and I think Vec implements it, so I thought it should work, the same as the call from main.
Denis Lisov
@tanriol
May 08 13:29
T is anything the caller asks for that implements AsRef<[u8]>. If the caller calls Animal::<Bytes>::from_vec(), your code has to return Animal<Bytes>, but it cannot do that.
octave99
@octave99
May 08 13:31
@tanriol I see. So what could be a good pattern here? Write individual functions with specific return types? Like: fn from_vec() -> Animal<Vec<u8>>
Denis Lisov
@tanriol
May 08 13:32
Depending on what you want to do, that's one of the options.
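A minimal sketch of that option (illustrative only, not from the chat), fixing the type parameter in a separate impl block so new can be called with a concrete Vec<u8>:

#[derive(Debug)]
pub struct Animal<T>(T);

impl<T: AsRef<[u8]>> Animal<T> {
    fn new(n: T) -> Self {
        Animal(n)
    }
}

// The type parameter is fixed here, so the return type is known
// and is not chosen by the caller.
impl Animal<Vec<u8>> {
    fn from_vec() -> Self {
        Animal::new(vec![1, 2, 3])
    }
}

fn main() {
    println!("{:?}", Animal::<Vec<u8>>::from_vec()); // Animal([1, 2, 3])
}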
Zakarum
@omni-viral
May 08 13:36

Can somebody think of a simplified function that does the same?

Here is the function I have now.

impl TheType {
    fn chunk_size(&self, size: u64) -> u64 {
        let size_exp = ((size / 2 + 1).next_power_of_two()).trailing_zeros();
        let max_exp_diff = self.max_chunk_size_exp - size_exp;
        let align_exp = self.blocks_per_chunk_exp / (((max_exp_diff - 1) / self.blocks_per_chunk_exp + 1));
        let aligned_max_exp_diff = (max_exp_diff + align_exp - 1) / align_exp * align_exp;
        let aligned_size_exp = self.max_chunk_size_exp + self.blocks_per_chunk_exp - aligned_max_exp_diff;
        let chunk_size = 1 << aligned_size_exp;

        debug_assert!(chunk_size <= (size * 1 << self.blocks_per_chunk_exp));
        debug_assert!(chunk_size <= 1 << self.max_chunk_size_exp);
        chunk_size
    }
}
Denis Lisov
@tanriol
May 08 13:37
Does it have to return exactly the same as this one?
Zakarum
@omni-viral
May 08 13:38
No
But there are requirements
Denis Lisov
@tanriol
May 08 13:38
Then what are the requirements?
Zakarum
@omni-viral
May 08 13:39
It should return a value up to size * (1 << self.blocks_per_chunk_exp)
As size gets closer to 1 << self.max_chunk_size_exp it should round more values down to the same result
This function's result is used as an argument to create a hierarchy of block sizes
And these hierarchies should converge to the same big values
No matter what the starting size was
At the same time it should result in as big a value as possible each time. Especially for a small argument
Denis Lisov
@tanriol
May 08 13:42
What do you mean by "converge to"? Are you passing its output back in as its input?
Zakarum
@omni-viral
May 08 13:42
Yes
When I call this function with a block size it gives me the size of the chunk where this block will be allocated
And then I would allocate this chunk if there are no chunks of this size, so I call it again with the result as the argument
And so on until a certain limit is reached
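A rough sketch of that chaining (hypothetical, with a stand-in chunk_size; not the author's code):

// Feed each resulting chunk size back in as the next block size until a limit is hit.
fn chunk_hierarchy(mut size: u64, limit: u64, chunk_size: impl Fn(u64) -> u64) -> Vec<u64> {
    let mut sizes = Vec::new();
    while size < limit {
        size = chunk_size(size).min(limit);
        sizes.push(size);
    }
    sizes
}

fn main() {
    // Stand-in for the real function: grow by a factor and round down to a power of two.
    let chunk_size = |s: u64| (s * 64 + 1).next_power_of_two() / 2;
    println!("{:?}", chunk_hierarchy(256, 1 << 28, chunk_size));
}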
Denis Lisov
@tanriol
May 08 13:43
Hope this is not a general purpose memory allocator :-)
Zakarum
@omni-viral
May 08 13:43
Which is never larger than 1 << (self.max_chunk_size_exp - 1)
No. This is for GPU )
General purpose allocators may use similar techniques. You never want to allocate just a few bytes from the OS
So you allocate from chunks, which you allocate from bigger chunks, and so forth until you reach a size fitting for the OS allocator (usually a few MBs)
For GPU this allocator will be configured. Demanding apps (like games) should allocate from the GPU driver using very big blocks (256 MB for modern hardware)
If an app does not use much GPU memory this allocator can be configured to stop at a chunk size of a few MBs
Denis Lisov
@tanriol
May 08 13:50
There are probably some harsh alignment requirements, aren't there?
Zakarum
@omni-viral
May 08 13:50
There are. Up to 8KB for a small image )
That's why I want to allocate only equally sized blocks from one chunk, to ensure they are all aligned to their size
Then I can just allocate max(size, alignment), which is usually equal to size, from the allocator
Zakarum
@omni-viral
May 08 13:55
This also results in near-zero fragmentation
Denis Lisov
@tanriol
May 08 14:23
(gone to PM)
But got an error:
error: expected a table, but found a string for `cc` in /home/drasko/.cargo/config
Denis Lisov
@tanriol
May 08 17:33
This file is not a cargo configuration file, you cannot just enable options from it in .cargo/config
Drasko DRASKOVIC
@drasko
May 08 17:43
Where and how can I set CC used for a target?
I can see that linker can be set
but I do not see how I can force the compiler
Denis Lisov
@tanriol
May 08 17:52
Probably depends on how the specific crate is built...
Drasko DRASKOVIC
@drasko
May 08 17:53
I mean system-wide
for example, for the target mips-unknown-linux-musl I want my mips-openwrt-linux-musl-gcc to be used
Denis Lisov
@tanriol
May 08 17:54
There is no such thing. rustc and cargo do not call C compilers, the crate build script / C code build system calls them.
Drasko DRASKOVIC
@drasko
May 08 17:55
OK, and is there a way to tell Cargo where the C compiler should look for includes?
because it currently looks in my host system's /usr/include or something
Denis Lisov
@tanriol
May 08 17:56
If it calls your system gcc, it won't be able to build for a different arch anyway.
If the crate uses the cc crate for build, CC and/or CXX (depending on whether it's built as C or C++) environment variables should be able to force the compiler.
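A hedged sketch of what such a build script typically looks like (the file name src/native.c is made up); the point is that the cc crate, not rustc, invokes the C compiler, and it honours CC / TARGET_CC-style environment variables:

// build.rs
fn main() {
    // The cc crate picks the C compiler from environment variables
    // (CC, TARGET_CC, or a target-prefixed variant), so a cross compiler
    // can be forced per invocation, e.g.
    //   CC=mips-openwrt-linux-musl-gcc cargo build --target mips-unknown-linux-musl
    cc::Build::new()
        .file("src/native.c")
        .compile("native");
}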
Drasko DRASKOVIC
@drasko
May 08 18:16
to whom is this include = "/usr/local/opt/openssl/include" passed?
i.e. who uses it?
Denis Lisov
@tanriol
May 08 18:20
Probably either no one or some crate used during build...
The two rustc keys are build script overrides and, IIUC, this means that the build script will not be run at all, so it will not compile anything.
Ichoran
@Ichoran
May 08 18:25
@omni-viral - If you want to converge to 2^m, I would, for k = m - r, restrict the solutions to the form a_k 2^k + a_{k-1} 2^{k-1} + ... + a_{k-r} 2^{k-r}, which can be done with a few shifts and bitmasks.
If you're rounding for memory alignment that's kind of what you want anyway, right? The larger your number, the more zeros on the end.
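A rough sketch of that shift trick, assuming unsigned 64-bit values (not Ichoran's exact construction):

// Keep only the top r + 1 bits of x, i.e. restrict it to the form
// a_k 2^k + a_{k-1} 2^{k-1} + ... + a_{k-r} 2^{k-r}.
fn keep_top_bits(x: u64, r: u32) -> u64 {
    if x == 0 {
        return 0;
    }
    let k = 63 - x.leading_zeros();  // position of the leading bit
    let shift = k.saturating_sub(r); // clear everything below bit k - r
    (x >> shift) << shift
}

fn main() {
    assert_eq!(keep_top_bits(0b1011_0111, 2), 0b1010_0000);
    assert_eq!(keep_top_bits(1000, 0), 512); // only the leading bit survives
}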
Zakarum
@omni-viral
May 08 19:29
@Ichoran It seems I need to converge even more
2 ^ m is what I did. It was just (x * N + 1).next_power_of_two() / 2 where x is the input and N is a constant.
This way the output is always a power of two and lies in between x * N / 2 .. x * N
But at larger input values I need to converge to 2 ^ (m * k)
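A quick check of that formula (arbitrary values, just to illustrate the range claim):

// (x * n + 1).next_power_of_two() / 2 is the largest power of two <= x * n,
// so it always lands in x * n / 2 .. x * n.
fn main() {
    let n = 2u64;
    for x in [3u64, 100, 1000, 1 << 20] {
        let r = (x * n + 1).next_power_of_two() / 2;
        assert!(r.is_power_of_two());
        assert!(x * n / 2 <= r && r <= x * n);
        println!("{} -> {}", x, r);
    }
}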
Ichoran
@Ichoran
May 08 19:32
Yeah, just clip off the bottom bits.
*N is pretty rapid even for N == 2.
Zakarum
@omni-viral
May 08 19:33
?
Ichoran
@Ichoran
May 08 19:34
Oh, you're dividing by two. Okay.
Zakarum
@omni-viral
May 08 19:34
The requirement for the function is for the output to be less than or equal to x * N
But as high as possible
Ichoran
@Ichoran
May 08 19:34
That still grows pretty fast.
Zakarum
@omni-viral
May 08 19:35
It was just x * N originally )
But then I realized I need it to output similarly big values
Ichoran
@Ichoran
May 08 19:36
So you clip off all the bits except the leading one, which is great for converging to a single value (you do it after a single iteration), but not so great for "as high as possible".
Zakarum
@omni-viral
May 08 19:36
Yes. But even then converging is not enough
Ichoran
@Ichoran
May 08 19:36
Why does N need to be variable instead of a fixed value?
Zakarum
@omni-viral
May 08 19:37
So my current implementation not only clips off everything except the leading bit, but also shifts to the right to make that bit stand at an even position if the value is big enough
Ichoran
@Ichoran
May 08 19:38
Is your existing algorithm okay, or does it fail to have some property you desire?
I mean it's a bit of a chunk of code, but six lines isn't that bad, especially if you add a couple of lines of comments explaining the goal.
Zakarum
@omni-viral
May 08 19:39
I changed the solution to make it easier to read and understand.
fn chunk_size(&self, size: u64) -> u64 {
    let size_exp = 63 - size.leading_zeros();
    debug_assert!(size_exp < self.max_chunk_size_exp - 3);
    let chunk_size_exp = size_exp + self.blocks_per_chunk_exp;
    let converging_chunk_size_exp = if chunk_size_exp >= self.max_chunk_size_exp {
        self.max_chunk_size_exp
    } else if chunk_size_exp > self.max_chunk_size_exp - self.blocks_per_chunk_exp / 2 {
        self.max_chunk_size_exp - self.blocks_per_chunk_exp / 2
    } else {
        chunk_size_exp
    };

    let chunk_size = 1 << converging_chunk_size_exp;

    debug_assert!(chunk_size <= (size * 1 << self.blocks_per_chunk_exp));
    debug_assert!(chunk_size <= 1 << self.max_chunk_size_exp);
    chunk_size
}
This way it shifts the result if the value is close to the limit
And for very big input the output may be just size * 4
Even though self.blocks_per_chunk_exp is 6, i.e. 64 blocks per chunk max
Very large blocks may be allocated from chunks that fit just 4 blocks
Ichoran
@Ichoran
May 08 19:41
Okay. That looks pretty reasonable to me.
Zakarum
@omni-viral
May 08 19:42
While smaller ones always use a chunk that holds 32 .. 64 blocks
Ichoran
@Ichoran
May 08 19:43
it's kind of inefficient, though. (1 << 20) + 1 is going to waste almost half the space, isn't it?
I mean if the input chunk size is (1 << 20) + 1
Zakarum
@omni-viral
May 08 19:44
Not at all. It can never waste more space in a chunk than the size of the block - 1
Ichoran
@Ichoran
May 08 19:44
Oh, okay, you pack it that way. But then how do you get the alignment right?
Zakarum
@omni-viral
May 08 19:44
Otherwise it will fit in another block )
All blocks are aligned to their size
Ichoran
@Ichoran
May 08 19:45
Okay, but if they're an odd size that's not very good for memory access. I guess you can take care of that at the block level.
Zakarum
@omni-viral
May 08 19:45
On request I first do size = (size - 1) | (align - 1) | (granularity - 1)
Where align is part of the request and granularity is the minimal allocation parameter
Ichoran
@Ichoran
May 08 19:46
Okay. I'm assuming you add one again at the end of that.
Zakarum
@omni-viral
May 08 19:46
Oh, right, ofc )
There could be some waste of space when align is bigger than size
But this is rare
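A small sketch of that rounding step, assuming power-of-two align and granularity and size >= 1 (names are only illustrative):

// Round a requested size up to a multiple of the stricter of align and granularity.
fn round_request(size: u64, align: u64, granularity: u64) -> u64 {
    debug_assert!(size >= 1 && align.is_power_of_two() && granularity.is_power_of_two());
    ((size - 1) | (align - 1) | (granularity - 1)) + 1
}

fn main() {
    assert_eq!(round_request(1000, 256, 64), 1024);
    assert_eq!(round_request(1024, 256, 64), 1024);
}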
Ichoran
@Ichoran
May 08 19:47
Looks good to me then. :thumbsup:
Not exactly how I'd do it but quite reasonable.
Zakarum
@omni-viral
May 08 19:47
Except for very small sizes, because in my use case align typically starts at 256
This is the simplest strategy I've come up with actually. But it works well enough for now. Except for occasional OOM because of the lack of convergence at large sizes
Now it should be fixed
Ichoran
@Ichoran
May 08 19:50
I haven't ever done this kind of thing in Rust before. In C++ I end up using memory arenas, which makes memory fragmentation much easier to avoid. I can use different strategies for different arenas depending on size etc.
Though I guess I don't actually know your use case well enough to know whether that would help.
But the basic idea is you can get a tiered structure: you always allocate blocks of some huge size, and then when you allocate a small thing, you allocate out of a memory arena that lives in one of those blocks.
Zakarum
@omni-viral
May 08 20:27
@Ichoran That's basically what I do. I allocate a block from a chunk, which itself gets allocated as a block from a larger chunk, until the chunk is large enough to allocate from the driver
Bright side - I don't need to store allocator structures in the allocated blocks
And I can't anyway
Because it's a different kind of memory
it's GPU memory
Ichoran
@Ichoran
May 08 21:41
Ah.
David O'Connor
@David-OConnor
May 08 21:47
Got an easy one, I hope. Trying to implement a print macro that will print normally (fmt::Display?) if it exists, otherwise debug. Is there a convenient way to do this? I notice that just using format etc. with {:?} appears to work for both cases - is this the right approach?
Zakarum
@omni-viral
May 08 21:51
{:?} prints using Debug always
It's just that types that implement Display typically implement Debug too
But it may be implemented differently
IIRC you can't do any sort of branching based on the existence of a trait implementation for a given type
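A tiny example (unrelated to the chat) of a type whose Display and Debug output differ:

use std::fmt;

struct Celsius(f64);

impl fmt::Display for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}°C", self.0)
    }
}

impl fmt::Debug for Celsius {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Celsius({})", self.0)
    }
}

fn main() {
    let t = Celsius(21.5);
    println!("{}", t);   // 21.5°C
    println!("{:?}", t); // Celsius(21.5)
}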
toxicafunk
@toxicafunk
May 08 21:59
i.e. println!("{}", ot.encode(p2)); println!("{:?}", ot.encode(p2));
where encode returns Vec<u8>, {} would print the characters while {:?} would print the integer values
David O'Connor
@David-OConnor
May 08 22:02
Thank you!
I think I may use separate funcs, if indeed there's no branching logic
Dirk Van Haerenborgh
@vhdirk
May 08 22:20
hi guys
is there some document that explains the api changes going from futures_preview 0.2 to 0.3?