These are chat archives for rust-lang/rust

5th
Jul 2017
Sebastian Blei
@iamsebastian
Jul 05 2017 08:30
Hello.
Could somebody tell me the most efficient way to group_by 4 levels?
I need to group data in the API before sending it to the frontend. With 3 levels, it was OK and took about 7 secs to group ~70_000 elements. But now, with a fourth level, it takes about 600 secs to complete the operation.
I used a BTreeMap<(i32, i32, i32, i32), i32> to group the data. Is there any faster way to do this? Maybe with String as the BTreeMap's key? Then I would build the key as something like format!("{}::{}::{}::{}", of_type1, of_type2, of_type3, of_type4).
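For reference, a minimal sketch of the kind of grouping described in this question, counting records per four-level tuple key. The `Record` struct, its field names, and `group_counts` are hypothetical; the actual grouping code from the gist is not shown in this archive.

```rust
use std::collections::BTreeMap;

/// Hypothetical record type; the four fields stand in for the four grouping levels.
#[derive(Debug, Clone, Copy)]
struct Record {
    level1: i32,
    level2: i32,
    level3: i32,
    level4: i32,
}

/// Counts records per (level1, level2, level3, level4) combination.
/// Only combinations that actually occur in `records` get an entry.
fn group_counts(records: &[Record]) -> BTreeMap<(i32, i32, i32, i32), i32> {
    let mut groups: BTreeMap<(i32, i32, i32, i32), i32> = BTreeMap::new();
    for r in records {
        // One lookup per record: `entry` inserts 0 on first sight, then we increment.
        *groups
            .entry((r.level1, r.level2, r.level3, r.level4))
            .or_insert(0) += 1;
    }
    groups
}

fn main() {
    let records = [
        Record { level1: 1, level2: 1, level3: 2, level4: 7 },
        Record { level1: 1, level2: 1, level3: 2, level4: 7 },
        Record { level1: 1, level2: 3, level3: 2, level4: 9 },
    ];
    for (key, count) in &group_counts(&records) {
        println!("{:?} -> {}", key, count);
    }
}
```

A tuple of integers is cheap to compare, so a formatted String key is unlikely to be faster than this; it would add an allocation per record.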
Joonas Koivunen
@koivunej
Jul 05 2017 08:32
@iamsebastian could you publish the grouping code as a gist?
Sebastian Blei
@iamsebastian
Jul 05 2017 08:33
I could try, yes. Just a moment.
Joonas Koivunen
@koivunej
Jul 05 2017 08:43
@iamsebastian no obvious issues i can see except for the many lookups needed... have you actually measured that the "grouping by" part became slower with the 4th element, or could it have been some random delay, as you seem to be downloading a lot of data from the database?
@iamsebastian another idea is to handle the grouping by in the database, in case it was an sql database.. if you have a lot of data to group by, it must be faster/more economical to group it closer to the data
Sebastian Blei
@iamsebastian
Jul 05 2017 08:53
Yes, you're right, @koivunej. The 4th level slows it down by ~10_000%. I actually benchmarked it.
But sometimes Rust is slow in development mode. Can I enable optimization flags in dev mode?
The group_bys are much faster in production builds. But I can't run production mode on this dev system, as the environment flags are different.
Joonas Koivunen
@koivunej
Jul 05 2017 08:57
@iamsebastian i often find myself even running some specific test cases with cargo test --release -- sometestcasename just to see how some change i made affects the outcome. if you cannot use --release build while developing i think you should take a look at fixing that
Sebastian Blei
@iamsebastian
Jul 05 2017 08:57
Doing the group_by in PostgreSQL is possible, but it makes the code untestable.
Joonas Koivunen
@koivunej
Jul 05 2017 08:59
@iamsebastian you can end up in a pretty testable situation, but those tests will of course always need a database; setting up the first test takes some effort, but it's not very different from the case where you want an integration test with your app and a database with some existing data
Sebastian Blei
@iamsebastian
Jul 05 2017 09:00
Yeah, I already have some test data in test institutes, etc.
But I don't think writing the group_by in SQL would speed up the operation much, as the SQL statement would include about 9 joins.
But thanks so far. I will see whether I can speed up the operation some other way.
I will add some inline measurements to the code and look at where the bottleneck is.
Joonas Koivunen
@koivunej
Jul 05 2017 09:06
@iamsebastian postgresql query optimizer is really good and might have the opportunity to even reduce the amount of joins after seeing how you use the data. hopefully you'll find a solution. for timing long running sections i've found let start = Instant::now(); and let elapsed = start.elapsed(); good enough.
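The Instant-based timing mentioned here could be wrapped up roughly as follows. This is a sketch: the helper name `timed` is made up for illustration, and only `Instant::now()` and `elapsed()` come from the message above.

```rust
use std::time::{Duration, Instant};

/// Runs `work` and returns its result together with the elapsed wall-clock time.
fn timed<T, F: FnOnce() -> T>(work: F) -> (T, Duration) {
    let start = Instant::now();
    let result = work();
    (result, start.elapsed())
}

fn main() {
    // Stand-in for a long-running section such as the grouping step.
    let (sum, elapsed) = timed(|| (0..1_000_000u64).sum::<u64>());
    println!("sum = {}, took {:?}", sum, elapsed);
}
```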
Sebastian Blei
@iamsebastian
Jul 05 2017 09:08
Yes, thanks. I have a custom Struct for calculation durations, as this information also needs to get inserted into database, to let the API estimate the duration of new calculations, based on used amount of records.
Sebastian Blei
@iamsebastian
Jul 05 2017 09:20
@koivunej Found where I can declare optimization levels for dev mode. I had only declared them for test and release builds.
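As an illustration of what such a declaration can look like, here is a Cargo.toml fragment that raises the optimization level of the dev profile. The value 2 is an arbitrary choice for this sketch; the settings actually used in this project are not shown in the chat.

```toml
# Cargo.toml
[profile.dev]
opt-level = 2   # the dev profile defaults to 0; release defaults to 3
```

Higher optimization levels make builds slower and stepping through code in a debugger harder, which is the trade-off behind having separate profiles.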
Sebastian Blei
@iamsebastian
Jul 05 2017 09:39
Btw: The bottleneck was the preparation of the BTreeMap, as this produced a huge number of combinations.
I had pre-filled the map so the frontend would always show all group-by combinations initially with a count of 0. Now I will fill in the missing entries in the frontend.
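A rough sketch of why pre-filling gets expensive and of one alternative: the number of pre-filled entries is the product of the distinct values per level, whereas a sparse map can simply report 0 for keys it never saw. The cardinalities below are invented for illustration, and `combinations` and `count_for` are hypothetical helpers, not the code from this project.

```rust
use std::collections::BTreeMap;

type Key = (i32, i32, i32, i32);

/// Number of entries a fully pre-filled map would need: the product of the
/// number of distinct values on each level.
fn combinations(distinct_per_level: [u64; 4]) -> u64 {
    distinct_per_level.iter().product()
}

/// Looks up a count, treating combinations that never occurred as 0.
fn count_for(groups: &BTreeMap<Key, i32>, key: Key) -> i32 {
    groups.get(&key).copied().unwrap_or(0)
}

fn main() {
    // Hypothetical cardinalities, chosen only to show how fast the product grows.
    println!("3 levels: {} entries", combinations([40, 40, 40, 1]));
    println!("4 levels: {} entries", combinations([40, 40, 40, 40]));

    let mut groups: BTreeMap<Key, i32> = BTreeMap::new();
    groups.insert((1, 1, 2, 7), 2);
    println!("seen key:   {}", count_for(&groups, (1, 1, 2, 7)));
    println!("unseen key: {}", count_for(&groups, (9, 9, 9, 9)));
}
```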
Joonas Koivunen
@koivunej
Jul 05 2017 09:40
@iamsebastian i guess there are separate profiles because you might not want to have to debug heavily optimized code; you should be able to run tests in release mode just fine without copying the optimizations to dev profile
yeah that sounds reasonable
Sebastian Blei
@iamsebastian
Jul 05 2017 09:41
Yes. I only used profiles for release, test and bench mode, but not for dev. I didn't know there was a dev profile.
Non-preparation of map reduced the operation time to ~ 3secs.
Joonas Koivunen
@koivunej
Jul 05 2017 09:42
@iamsebastian sounds pretty high (assuming those need to be converted to json and shipped over the wire still) ... did you test how fast postgresql will do?
Sebastian Blei
@iamsebastian
Jul 05 2017 09:43
I could. But it's much slower. I migrated most of the calculations away from raw SQL some weeks ago,
as they underperformed and the calculations had become too complex.
But this is not a much-frequented API.
It's just kinda static-calculation-based-financial-data API.
Some days, there are only half a dozen users doing some heavy-lifting calculations.
The response times for calculations and results / reports are mostly long, as we are talking about millions of records related to just one calculation, configured by dozens of tables.
Sebastian Blei
@iamsebastian
Jul 05 2017 09:49
Atm I have 84 tables and have migrated only a small part of the existing, but quite bad, old PHP solution that another person created some time before I took this position.
stevensonmt
@stevensonmt
Jul 05 2017 18:19

This seems like a very simple problem, but I can't seem to figure it out. I have a HashMap<String, Vec<String>> and I want to access each item of the Vec<String> value for each key.

let my_keys = my_hashmap.keys();
let my_keys = vec![my_keys]; //necessary b/c no .to_iter method for Keys
let my_keys = my_keys.iter();
for x in my_keys {
  println!("{:?}", x)
}

The above prints a single line of x as a vector of all the keys rather than printing a line for each key. What am I missing?

Denis Lisov
@tanriol
Jul 05 2017 18:24
Keys is already an iterator, you don't need anything like iter
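To illustrate the point, a small sketch that loops over the keys directly and then over every value of every key. The map contents and the helper name `all_pairs` are made up for the example.

```rust
use std::collections::HashMap;

/// Flattens the map into sorted (department, employee) pairs.
fn all_pairs(map: &HashMap<String, Vec<String>>) -> Vec<(&str, &str)> {
    let mut pairs = Vec::new();
    // Iterating `map` by reference yields (&String, &Vec<String>) and leaves the map intact.
    for (dept, employees) in map {
        for employee in employees {
            pairs.push((dept.as_str(), employee.as_str()));
        }
    }
    // HashMap iteration order is unspecified, so sort for a stable result.
    pairs.sort();
    pairs
}

fn main() {
    let mut my_hashmap: HashMap<String, Vec<String>> = HashMap::new();
    my_hashmap.insert("Engineering".to_string(), vec!["Sally".to_string()]);
    my_hashmap.insert("Sales".to_string(), vec!["Amir".to_string()]);

    // `keys()` is already an iterator, so it can drive the loop directly.
    for key in my_hashmap.keys() {
        println!("{:?}", key);
    }

    for (dept, employee) in all_pairs(&my_hashmap) {
        println!("{}: {}", dept, employee);
    }
}
```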
stevensonmt
@stevensonmt
Jul 05 2017 18:26
When I tried without it I got the error ^ the trait `std::iter::Iterator` is not implemented for `&std::collections::hash_map::Keys<'_, std::string::String, std::vec::Vec<std::string::String>>`
n/m I was calling &my_hashmap.keys()
stevensonmt
@stevensonmt
Jul 05 2017 18:32
oh, right. I had to call &my_hashmap.keys() b/c the function I'm actually trying to pass x to also takes the hashmap as a parameter.
Denis Lisov
@tanriol
Jul 05 2017 18:33
What are you actually trying to do?
You can use let my_keys: Vec<_> = my_hashmap.keys().collect();, but that may be suboptimal.
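A sketch of that collect call in context, with a sort added so the keys come out alphabetically. The function name `sorted_keys` and the sample data are illustrative only.

```rust
use std::collections::HashMap;

/// Collects the keys into a Vec of references and sorts them alphabetically.
fn sorted_keys(map: &HashMap<String, Vec<String>>) -> Vec<&String> {
    let mut keys: Vec<_> = map.keys().collect();
    keys.sort();
    keys
}

fn main() {
    let mut my_hashmap: HashMap<String, Vec<String>> = HashMap::new();
    my_hashmap.insert("Sales".to_string(), vec!["Amir".to_string()]);
    my_hashmap.insert("Engineering".to_string(), vec!["Sally".to_string()]);

    for key in sorted_keys(&my_hashmap) {
        println!("{}", key);
    }
}
```

Collecting allocates a Vec, which is the "suboptimal" part; it is worth it only when the keys have to be sorted or reused.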
stevensonmt
@stevensonmt
Jul 05 2017 18:35

Just working on the exercise from TRPL:

Using a hash map and vectors, create a text interface to allow a user to add employee names 
to a department in the company. For example, “Add Sally to Engineering” or “Add Amir to Sales.”
Then let the user retrieve a list of all people in a department or all people in the company by 
department, sorted alphabetically.

I have a function to store employees as Strings in the HashMap<String, Vec<String>> with the department as key String.

I have a function to list employees by department.
I am trying to use that function iteratively to list all employees in the company by department.
Denis Lisov
@tanriol
Jul 05 2017 18:38
Does this function take your HashMap by value? That's probably not what you want -- you most likely don't want it to destroy the employee list!
stevensonmt
@stevensonmt
Jul 05 2017 18:40
hmm. How else do I give the function access to the list?
This is my function for listing employees by dept:
pub fn department_list(data: &mut HashMap<String, Vec<String>>, dept: String) {

    match data.entry(dept.to_string()) {
        Entry::Occupied(mut entry) => {
            let vals = entry.get_mut().iter();
            println!("Employees in department {}:", dept);
            for x in vals {
                println!("{}", x)
            }
        },
        Entry::Vacant(entry) => println!("The department {} does not 
        exist or does not have any employees yet.", dept)
    }

}
Arthur
@Biacode
Jul 05 2017 18:50
Hello, what does the ? symbol mean in let mut core = Core::new()?; Is there a documentation link? Thanks.
Arthur
@Biacode
Jul 05 2017 19:04
Also, I got an error following this tutorial - https://hyper.rs/guides/client/basic/
error[E0277]: the trait bound `(): std::ops::Try` is not satisfied
  --> examples/hyper.rs:11:20
   |
11 |     let mut core = Core::new()?;
   |                    ------------
   |                    |
   |                    the trait `std::ops::Try` is not implemented for `()`
   |                    in this macro invocation
   |
   = note: required by `std::ops::Try::from_error`

error[E0277]: the trait bound `(): std::ops::Try` is not satisfied
  --> examples/hyper.rs:14:15
   |
14 |     let uri = "http://httpbin.org/ip".parse()?;
   |               --------------------------------
   |               |
   |               the trait `std::ops::Try` is not implemented for `()`
   |               in this macro invocation
   |
   = note: required by `std::ops::Try::from_error`

error[E0277]: the trait bound `(): std::ops::Try` is not satisfied
  --> examples/hyper.rs:24:5
   |
24 |     core.run(work)?;
   |     ---------------
   |     |
   |     the trait `std::ops::Try` is not implemented for `()`
   |     in this macro invocation
   |
   = note: required by `std::ops::Try::from_error`

error: aborting due to 3 previous errors
Denis Lisov
@tanriol
Jul 05 2017 19:06
@stevensonmt I'd suggest you want
pub fn department_list(data: &HashMap<String, Vec<String>>, dept: &str)
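One way a function with that signature might be filled in, as a sketch: a shared reference is enough for reading, and `get` replaces the `entry` lookup, which needs `&mut`. The helper `sorted_department` is hypothetical and exists only so the lookup can be checked separately from the printing.

```rust
use std::collections::HashMap;

/// Returns the employees of `dept` sorted alphabetically, or None if the
/// department is unknown. Hypothetical helper, not part of the original code.
fn sorted_department<'a>(
    data: &'a HashMap<String, Vec<String>>,
    dept: &str,
) -> Option<Vec<&'a str>> {
    data.get(dept).map(|employees| {
        let mut names: Vec<&str> = employees.iter().map(String::as_str).collect();
        names.sort();
        names
    })
}

pub fn department_list(data: &HashMap<String, Vec<String>>, dept: &str) {
    match sorted_department(data, dept) {
        Some(names) => {
            println!("Employees in department {}:", dept);
            for name in names {
                println!("{}", name);
            }
        }
        None => println!(
            "The department {} does not exist or does not have any employees yet.",
            dept
        ),
    }
}

fn main() {
    let mut data: HashMap<String, Vec<String>> = HashMap::new();
    data.insert(
        "Engineering".to_string(),
        vec!["Sally".to_string(), "Bob".to_string()],
    );

    // The map is only borrowed, so it can be used for several calls in a row.
    department_list(&data, "Engineering");
    department_list(&data, "Sales");
}
```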
@Biacode Error propagation, see the book
Arthur
@Biacode
Jul 05 2017 19:09
@tanriol thanks
Denis Lisov
@tanriol
Jul 05 2017 19:09
Your error is caused by the fact that ? can be used in functions returning a Result only (will be usable in some other cases in future versions)
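As a rough illustration of what ? does inside a function that returns a Result, the two functions below behave the same way. The explicit match is a simplification of the real desugaring, and the function names are made up for the example.

```rust
use std::num::ParseIntError;

/// Uses `?`: on Err, the error is converted with `From` and returned early.
fn double_with_question_mark(s: &str) -> Result<i32, ParseIntError> {
    let n = s.trim().parse::<i32>()?;
    Ok(n * 2)
}

/// Roughly the same logic written out by hand.
fn double_with_match(s: &str) -> Result<i32, ParseIntError> {
    let n = match s.trim().parse::<i32>() {
        Ok(value) => value,
        Err(e) => return Err(From::from(e)),
    };
    Ok(n * 2)
}

fn main() {
    println!("{:?}", double_with_question_mark("21"));
    println!("{:?}", double_with_match("21"));
    println!("{:?}", double_with_match("not a number"));
}
```

Because of the early `return Err(...)`, the enclosing function has to return a type that can carry the error, which a function returning `()` cannot do.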
Arthur
@Biacode
Jul 05 2017 19:12
weird, what if I don't need a method which returns Result... I just need to try that example...
Denis Lisov
@tanriol
Jul 05 2017 19:15
Returning a Result is the standard way of error propagation. You can make your main just
fn main() {
    run().unwrap();
}
And create a
fn run() -> io::Result<()> {
    // Place your code here
    Ok(())
}
Arthur
@Biacode
Jul 05 2017 19:20
extern crate futures;
extern crate hyper;
extern crate tokio_core;

use std::io::{self, Write};
use futures::{Future, Stream};
use hyper::Client;
use tokio_core::reactor::Core;

fn main() {
    run().unwrap();
}

fn run() -> io::Result<()> {
    let mut core = Core::new()?;
    let client = Client::new(&core.handle());

    let uri = "http://httpbin.org/ip".parse()?;
    let work = client.get(uri).and_then(|res| {
        println!("Response: {}", res.status());

        res.body().for_each(|chunk| {
            io::stdout()
                .write_all(&chunk)
                .map_err(From::from)
        })
    });
    core.run(work)?;
    Ok(())
}
error[E0277]: the trait bound `std::io::Error: std::convert::From<hyper::error::UriError>` is not satisfied
  --> examples/hyper.rs:18:15
   |
18 |     let uri = "http://httpbin.org/ip".parse()?;
   |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `std::convert::From<hyper::error::UriError>` is not implemented for `std::io::Error`
   |
   = help: the following implementations were found:
             <std::io::Error as std::convert::From<std::io::ErrorKind>>
             <std::io::Error as std::convert::From<std::io::IntoInnerError<W>>>
             <std::io::Error as std::convert::From<std::ffi::NulError>>
   = note: required by `std::convert::From::from`

error[E0277]: the trait bound `std::io::Error: std::convert::From<hyper::Error>` is not satisfied
  --> examples/hyper.rs:28:5
   |
28 |     core.run(work)?;
   |     ^^^^^^^^^^^^^^^ the trait `std::convert::From<hyper::Error>` is not implemented for `std::io::Error`
   |
   = help: the following implementations were found:
             <std::io::Error as std::convert::From<std::io::ErrorKind>>
             <std::io::Error as std::convert::From<std::io::IntoInnerError<W>>>
             <std::io::Error as std::convert::From<std::ffi::NulError>>
   = note: required by `std::convert::From::from`

error: aborting due to 2 previous errors
still getting errors
too complicated for doing just really simple things...
fn run() -> io::Result<()> {
    let mut core = Core::new()?;
    let client = Client::new(&core.handle());

    let uri = "http://httpbin.org/ip".parse();
    let work = client.get(uri.unwrap()).and_then(|res| {
        println!("Response: {}", res.status());

        res.body().for_each(|chunk| {
            io::stdout()
                .write_all(&chunk)
                .map_err(From::from)
        })
    });
    core.run(work);
    Ok(())
}
this version works: it drops some of the ? operators and unwraps uri instead of using ?
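The E0277 errors above come from ? trying to convert hyper's error types into std::io::Error, for which no From implementation exists. One common way around this is to return a boxed error trait object, so that any error type implementing std::error::Error converts automatically. The sketch below uses only standard-library errors as stand-ins (AddrParseError and ParseIntError); whether hyper's UriError and hyper::Error convert the same way is an assumption that is not verified here. The `dyn` keyword is newer syntax; compilers from 2017 spelled the type Box<Error>.

```rust
use std::error::Error;
use std::net::IpAddr;

/// Two different error types flow through `?` into one boxed error type.
/// `parse_endpoint` is a made-up example function.
fn parse_endpoint(ip: &str, port: &str) -> Result<(IpAddr, u16), Box<dyn Error>> {
    let ip: IpAddr = ip.parse()?; // may fail with AddrParseError
    let port: u16 = port.parse()?; // may fail with ParseIntError
    Ok((ip, port))
}

fn main() {
    match parse_endpoint("127.0.0.1", "8080") {
        Ok((ip, port)) => println!("parsed {}:{}", ip, port),
        Err(e) => println!("error: {}", e),
    }
    if let Err(e) = parse_endpoint("not-an-ip", "8080") {
        println!("error: {}", e);
    }
}
```

Note also that a bare `core.run(work);` discards the returned Result, so a failed request would go unnoticed in the version above.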