These are chat archives for rust-lang/rust

25th
Mar 2019
Sam Johnson
@sam0x17
Mar 25 06:01
how can I write this as an if let (process_chunk returns an io::Result<()>)?
                let res = self.process_chunk(input.slice(0, self.remaining_flat() as usize));
                match res {
                    Ok(()) => {},
                    _ => { return res; }
                }
Paul Masurel
@fulmicoton
Mar 25 06:03
if let Err(io_error) = self.process_chunk(input.slice(0, self.remaining_flat() as usize)) {
    return Err(io_error);
}
does not work?
Sam Johnson
@sam0x17
Mar 25 06:07
ah it does -- thanks!
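For reference, a minimal sketch of the if let form next to the `?` operator, which is the usual shorthand for returning early on an error. The `process_chunk` below is a hypothetical stand-in that takes a byte slice, since the real method is not shown in the chat.

```rust
use std::io;

// Hypothetical stand-in for the method from the question: fails on empty input.
fn process_chunk(chunk: &[u8]) -> io::Result<()> {
    if chunk.is_empty() {
        Err(io::Error::new(io::ErrorKind::UnexpectedEof, "empty chunk"))
    } else {
        Ok(())
    }
}

// The `if let` form: return early only when the call failed.
fn with_if_let(chunk: &[u8]) -> io::Result<()> {
    if let Err(io_error) = process_chunk(chunk) {
        return Err(io_error);
    }
    Ok(())
}

// The `?` operator does the same early return in one token.
fn with_question_mark(chunk: &[u8]) -> io::Result<()> {
    process_chunk(chunk)?;
    Ok(())
}
```

Both functions behave the same here; `?` additionally converts the error type through `From` when the caller's error type differs.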
Sam Johnson
@sam0x17
Mar 25 06:13
next question -- I am getting unused import warnings for use bytes::{Bytes, BufMut, BytesMut};, but when I remove the parts it says are unused, I get errors because they were actually being used -- why is the compiler giving me this warning?
Paul Masurel
@fulmicoton
Mar 25 06:14
Usually it happens when you use it in parts of the code that are guarded by a #[cfg()] flag
typically: you use it in tests, but you import it in non-test code
Sam Johnson
@sam0x17
Mar 25 06:15
ah ok so do I put a test flag on the use line as well?
(that is the case here)
Paul Masurel
@fulmicoton
Mar 25 06:15
yeah, or you move the use statement within the test module
Sam Johnson
@sam0x17
Mar 25 06:15
ok thanks

hmm this is not allowed:

#[test]
use bytes::BufMut;

how do I do it? my tests are in the module they are testing, so I can't simply move to a different module

Paul Masurel
@fulmicoton
Mar 25 06:18
#[cfg(test)]
mod tests {
    use super::*;
    use bytes::BufMut;

    #[test]
    fn test_mytest() {
    }
}
Sam Johnson
@sam0x17
Mar 25 06:19
:+1:
thanks
that works
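The other option mentioned above, putting the flag on the use line itself, might look like the sketch below. `#[test]` only applies to functions, so the import needs `#[cfg(test)]` instead. The `squares` function and the `HashSet` import are hypothetical stand-ins for the real code and the bytes import.

```rust
// Only compiled for `cargo test`, so normal builds never see an unused import.
#[cfg(test)]
use std::collections::HashSet;

/// Hypothetical function under test; it does not need `HashSet` itself.
pub fn squares(n: u32) -> Vec<u32> {
    (1..=n).map(|x| x * x).collect()
}

// A test living in the same module as the code it tests.
#[test]
fn squares_are_distinct() {
    let unique: HashSet<u32> = squares(5).into_iter().collect();
    assert_eq!(unique.len(), 5);
}
```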
silmanduin66
@silmanduin66
Mar 25 10:12

in python i have something like that:

data = [["name1", "code1"],["name2", "code2"],...]

how can i have something similar in rust ?

Denis Lisov
@tanriol
Mar 25 10:14
Without additional information my first guess would be let data: Vec<(String, String)> = vec![("name1".into(), "code1".into()), ("name2".into(), "code2".into()), ...];
silmanduin66
@silmanduin66
Mar 25 10:15
ok i didn't know I could simply make a Vec<(String, String)>
but now i have to add into() to all my elements by hand?
my list has around 2500 items
Sergie Kardashov
@leorik
Mar 25 10:16
What problem are you trying to solve here?
Denis Lisov
@tanriol
Mar 25 10:17
Is this list ever modified?
silmanduin66
@silmanduin66
Mar 25 10:19
i have this list and i want to use that data in my rust ( -> wasm ) program
is there a more efficient way to store that data ?
Sergie Kardashov
@leorik
Mar 25 10:21
Couple of questions: 1. Where does this list come from? 2. How do you use it?
silmanduin66
@silmanduin66
Mar 25 10:22
this list is currently stored in json format in a file, but i will shrink it to get only the name and attribute value
i will sort this list by name and turn it into a balanced binary search tree ( array format )
in my actual program i will only do binary search
Sergie Kardashov
@leorik
Mar 25 10:24
@silmanduin66 So it's a dictionary?
silmanduin66
@silmanduin66
Mar 25 10:24
yes
Denis Lisov
@tanriol
Mar 25 10:26
Then you can do just
static DATA: &[(&str, &str)] = &[
    ("name1", "value1"),
    ("name2", "value2"),
];
silmanduin66
@silmanduin66
Mar 25 10:27
wow i have no idea what that is :-P ( new to rust )
Denis Lisov
@tanriol
Mar 25 10:27
Technically you could probably turn it at compile time into some kind of searching automaton, but that's probably overkill.
Sergie Kardashov
@leorik
Mar 25 10:32
@silmanduin66 One more thing - is there a specific reason not to use HashMap (e.g. you just want to make your own impl)?
Denis Lisov
@tanriol
Mar 25 10:35
@leorik If the data structure is static, it's perfectly understandable that one wants to allocate less, especially in wasm context.
silmanduin66
@silmanduin66
Mar 25 10:36
this is what i did for the moment :
 let words = vec!["airplane", "barbecue", "cinema", "destiny", "dictionnary", "emirate", "fangorn", "grizlly", "hero", "icarius"];
let mut my_tree: Vec<&str> = vec![""; (fill(words.len() as u32)) as usize];
my_tree = fill_tree(&words, my_tree, 0, (words.len() as u32) -1, 1, 0);


fn fill_tree<'a>(v: &Vec<&'a str>, mut t: Vec<&'a str>, start: u32, end: u32, side: u32, old_position: u32) -> Vec<&'a str> {

    if start <= end {
        // get middle
        let middle = (end + start) / 2;

        // calculate new position
        let position = 2 * old_position + side;

        if (position-1) < fill(v.len() as u32) {

            // add middle element of array to return array
            t[(position - 1) as usize] = v[middle as usize];

            if start < end {
                // recursively do the same with the left and right part
                t = fill_tree(&v, t, start, middle-1, 0, position);
                t = fill_tree(&v, t, middle +1, end, 1, position);
            }

        } 
    }
    t
}

fn fill(mut x: u32) -> u32{
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    x
}
Sergie Kardashov
@leorik
Mar 25 10:39
@tanriol Yeah, you're right. But one rarely wants to sort a static "dictionary" at run time
silmanduin66
@silmanduin66
Mar 25 10:39
and i have a search_tree function
Denis Lisov
@tanriol
Mar 25 10:39
@silmanduin66 You do know about [T]::binary_search, right?
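As a rough sketch of what that could look like on the static slice from earlier: the standard library's `binary_search_by_key` does an exact-match lookup on a sorted slice with no extra allocation. The sample entries and the `lookup` helper are hypothetical.

```rust
// Hypothetical sample data; it must stay sorted by name for binary search to work.
static DATA: &[(&str, &str)] = &[
    ("name1", "value1"),
    ("name2", "value2"),
    ("name3", "value3"),
];

/// Looks up the value stored for `name`, if any (hypothetical helper).
fn lookup(name: &str) -> Option<&'static str> {
    DATA.binary_search_by_key(&name, |&(key, _)| key)
        .ok() // Err(_) only carries the insertion point, which is not needed here
        .map(|index| DATA[index].1)
}
```

This covers exact matches only; a prefix search needs a range rather than a single index.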
Michal 'vorner' Vaner
@vorner
Mar 25 10:40
Might be an overkill, but the fst crate (search automaton for bytestrings) can be constructed from a byte slice and you could compile-time generate it and include it as static constant too, making it even smaller. But that would be more complex to do.
silmanduin66
@silmanduin66
Mar 25 10:41

yeah i took a look but the thing is that my search function is a bit different i want to get a vec of elements :

if i enter " ab "

it can get me the words:

"abcefg"
"abgldj"
"abjklfdg"

Denis Lisov
@tanriol
Mar 25 10:41
One interesting option I'd try to benchmark is perfect hash function based lookup tables.
silmanduin66
@silmanduin66
Mar 25 10:41
i'm not just searching for a matching element, but for all elements that start with the string
fn search_tree<'a>(t: &Vec<&'a str>, mut r: Vec<&'a str>, search: &str, node: u32) -> Vec<&'a str> {
    let predicate_fn = predicate::str::starts_with(search); // ----> to be optimized ( called each loop !?) 
    let node_value = t[(node-1) as usize];
    // "" value is empty node
    if node_value != "" {
        // calculate left child node
        let target = 2*node;
        // search is smaller
        if search < node_value {
            println!("value is smaller than {:?}", node_value);
            if predicate_fn.eval(node_value) {
                r.push(node_value);
            }
            // search left child node
            if target < t.len() as u32 {
                r = search_tree(&t, r, search, target);
            }        
        // search is bigger or equal
        } else {
            // found search
            if search == node_value {
                println!("found value {:?}", node_value);
                // stop search here
                r.push(node_value);
            // search is bigger
            } else {
                println!("value is bigger than {:?}", node_value);
                // search right child node
                if target +1 < t.len() as u32 {
                    r = search_tree(&t, r, search, target + 1);
                }
            }
        }

    }
    r
}
Denis Lisov
@tanriol
Mar 25 10:46
Is this code performance critical? Also, what's the average number of matches - is it something like 1-2 (almost complete string) or 50-100 (single letter search)?
silmanduin66
@silmanduin66
Mar 25 10:50

i have an input box, and at each letter that is entered it displays all the matching elements (startswith) :

so for "a" maybe 100 elements

and the number of found elements decreases the more letters the user inputs
this code will be used in webassembly
Denis Lisov
@tanriol
Mar 25 10:55
Then I'd guess that the search performance does not really matter as the bottleneck will be adding the elements to the input box...
silmanduin66
@silmanduin66
Mar 25 10:56
but my search will run in log2(n) right ?
Denis Lisov
@tanriol
Mar 25 10:57
Your search will run in O(match_count), which is probably more significant :-)
silmanduin66
@silmanduin66
Mar 25 11:04
but i thought any balanced binary tree would be log2 n, the number of matches doesn't matter because it only runs once, no?
Denis Lisov
@tanriol
Mar 25 11:20
I'd probably use something like this
Denis Lisov
@tanriol
Mar 25 11:34
Or, slightly tweaked, like this
This is not the most efficient version possible, but close enough and readable enough at the same time.
Or really use the fst crate as it seems to support your use case :-)
silmanduin66
@silmanduin66
Mar 25 15:24
ok i ll try that :-)
silmanduin66
@silmanduin66
Mar 25 17:07
@tanriol how can i apply your function to :
static KNOWN_NAMES: &[(&str, &str)] = &[
    ("name1", "value1"),
    ("name2", "value2"),
];
Mikail Bagishov
@MikailBag
Mar 25 17:12
In the comparison, you should just check that key.0 starts_with the item you are searching for. Note also that KNOWN_NAMES must be sorted, otherwise tanriol's approach will not work.
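The linked snippets did not survive in this archive, so the following is not tanriol's code, only one possible sketch of a prefix search over a sorted slice. It assumes the entries are sorted by name in byte order and uses `partition_point` from the standard library; the sample data and the `prefix_matches` name are hypothetical.

```rust
// Hypothetical sample data, sorted by name (required for the search below).
static KNOWN_NAMES: &[(&str, &str)] = &[
    ("abcefg", "value1"),
    ("abgldj", "value2"),
    ("abjklfdg", "value3"),
    ("airplane", "value4"),
    ("barbecue", "value5"),
];

/// Returns the contiguous run of entries whose name starts with `prefix`.
fn prefix_matches(prefix: &str) -> &'static [(&'static str, &'static str)] {
    // First entry whose name is not smaller than the prefix.
    let start = KNOWN_NAMES.partition_point(|(name, _)| *name < prefix);
    // In a sorted slice, all names sharing the prefix sit right after `start`.
    let len = KNOWN_NAMES[start..].partition_point(|(name, _)| name.starts_with(prefix));
    &KNOWN_NAMES[start..start + len]
}
```

Both steps are binary searches, so finding the range costs O(log n); copying or displaying the matches still costs time proportional to their count.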
John
@onFireForGod_gitlab
Mar 25 19:04
I have multiple vectors of structs, and I want to find the intersection of all of them on a particular struct field
all of the structs in the vecs have a time field, I want to intersect all of the vecs on the time field
Denis Lisov
@tanriol
Mar 25 19:06
What do you mean? Find a time for which every vector has a struct?
John
@onFireForGod_gitlab
Mar 25 19:07
not a particular time but all of the times
but yes that is the idea
each vec has approximately 200,000 entries
Denis Lisov
@tanriol
Mar 25 19:08
Are the vectors sorted?
John
@onFireForGod_gitlab
Mar 25 19:08
yeah
I see where you're going
How would you approach that?
the time is NaiveDateTime from chrono which implements PartialOrd and all the other traits for comparisons
Denis Lisov
@tanriol
Mar 25 19:12
Are there a few vectors or a significant number of them?
John
@onFireForGod_gitlab
Mar 25 19:14
5 vectors
Oh and some of them could have duplicate entries
Denis Lisov
@tanriol
Mar 25 19:16
But we're interested in times only, so the duplicates do not matter, correct?
John
@onFireForGod_gitlab
Mar 25 19:19
No I actually need the whole structs
Ichoran
@Ichoran
Mar 25 19:19
You can pick them out again once you know the times.
John
@onFireForGod_gitlab
Mar 25 19:20
well that will require another traversal +cn
Ichoran
@Ichoran
Mar 25 19:20
Are traversals expected to be expensive compared to the rest of what you're doing?
I would be very tempted to just start up five iterators and write the logic to pull out the current set when they're all equal.
(All equal in time, assuming they're sorted in time so the in-order traversal is also in-order in time.)
Ichoran
@Ichoran
Mar 25 19:25
The more expensive but harder to muck up thing you could do is put them all into a BTreeMap (or a hash map, really) with the time as the index and a Vec as the contents; the logic would be to just copy everything you've got in your vectors in there. Then you just need to pick out the ones that have a full set represented (maybe you need to store (index, struct) and check to make sure all indices are present).
Denis Lisov
@tanriol
Mar 25 19:25
Yeah, sounds like that. Five peekable iterators; advance each one to the current max time until the times are all equal; take a set of values (and advance every iterator to the next)
Ichoran
@Ichoran
Mar 25 19:26
Yeah, that's what I'd do for efficiency.
If there are duplicates, you figure out how to handle them on the advancing logic (e.g. if you want all of them, when all five hit max time, you collect everything that is the same time until you run out in all five iterators).
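A minimal sketch of the peekable-iterator approach described above, assuming every input is sorted by time. The `Entry` struct is hypothetical, and a plain `u32` stands in for chrono's NaiveDateTime. Duplicates are handled by collecting every entry that shares the common time.

```rust
use std::iter::Peekable;
use std::slice::Iter;

// Hypothetical struct; `u32` stands in for chrono's NaiveDateTime.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    time: u32,
    label: &'static str,
}

/// For every time present in all inputs, returns the entries with that time
/// gathered from every input (duplicates included).
fn intersect_on_time<'a>(inputs: &[&'a [Entry]]) -> Vec<Vec<&'a Entry>> {
    let mut iters: Vec<Peekable<Iter<'a, Entry>>> =
        inputs.iter().map(|&v| v.iter().peekable()).collect();
    let mut out = Vec::new();
    loop {
        // Current head of every iterator; None as soon as one is exhausted.
        let heads: Option<Vec<u32>> =
            iters.iter_mut().map(|it| it.peek().map(|e| e.time)).collect();
        let target = match heads.and_then(|h| h.into_iter().max()) {
            Some(t) => t,
            None => return out,
        };
        // Advance every iterator to the first entry not before the max time.
        let mut all_equal = true;
        for it in iters.iter_mut() {
            while it.peek().map_or(false, |e| e.time < target) {
                it.next();
            }
            match it.peek() {
                Some(e) if e.time == target => {}
                Some(_) => all_equal = false,
                None => return out,
            }
        }
        // When all heads agree, take every entry with that time.
        if all_equal {
            let mut group = Vec::new();
            for it in iters.iter_mut() {
                while it.peek().map_or(false, |e| e.time == target) {
                    group.push(it.next().unwrap());
                }
            }
            out.push(group);
        }
    }
}
```

Each entry is visited at most once, so the whole pass is linear in the total number of entries.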
John
@onFireForGod_gitlab
Mar 25 19:28
thanks, will be working on it
Denis Lisov
@tanriol
Mar 25 19:30
The "hard to muck up" version would be to make a set of times for every vec, intersect them, then extract the structs by binary search :-)
Ichoran
@Ichoran
Mar 25 19:30
Yeah, that might be even harder to muck up :)
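The set-based variant could be sketched roughly as follows, again with `u32` standing in for NaiveDateTime and with the times already pulled out of the structs. `common_times` is a hypothetical helper name.

```rust
use std::collections::BTreeSet;

/// Times present in every input; `u32` stands in for chrono's NaiveDateTime.
fn common_times(inputs: &[Vec<u32>]) -> BTreeSet<u32> {
    // One set of times per input vector; duplicates collapse automatically.
    let mut sets = inputs
        .iter()
        .map(|times| times.iter().copied().collect::<BTreeSet<u32>>());
    let first = match sets.next() {
        Some(set) => set,
        None => return BTreeSet::new(),
    };
    // Intersect the remaining sets into the first one, one at a time.
    sets.fold(first, |acc, set| acc.intersection(&set).copied().collect())
}
```

With the common times in hand, the matching structs can be pulled from each sorted vector by binary search, as suggested above.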
John
@onFireForGod_gitlab
Mar 25 20:19
is there a way to set a mem usage for cargo run?
Denis Lisov
@tanriol
Mar 25 21:00
There may be OS-specific ways to limit memory usage. Nothing universal, AFAIK.