These are chat archives for rust-lang/rust

11th
Mar 2019
Edvin Malinovskis
@nCrazed
Mar 11 08:21 UTC
kylegoetz
@kylegoetz
Mar 11 18:24 UTC

My IDE underlines the p in p.path() and reports that p does not live long enough

paths.filter_map(|dir| {
        return match dir {
            Err(_e) => None,
            Ok(p) => Some(Image { path: p.path() }),
        };
    })

I don't understand this. Isn't the Ok(p) => ... a function/closure that takes ownership of p and then only does one thing with p (gets a Path) and never needs it again? (p.path() returns an object of type Path)

I have a feeling someone will ask to see what Image is:
pub struct Image<'a> {
    pub path: &'a Path,
}
Ingvar Stepanyan
@RReverser
Mar 11 18:25 UTC
You need to use move |dir| { ... } if you want to take ownership of the captured variable
kylegoetz
@kylegoetz
Mar 11 18:26 UTC
I don't want or not-want to take ownership of the captured variable. I want to take ownership of what p.path() returns. I could care less what happens to p after I extract the path via p.path().
Ingvar Stepanyan
@RReverser
Mar 11 18:27 UTC
yes but p.path() is just a borrow of actual p (assuming your p is PathBuf)
kylegoetz
@kylegoetz
Mar 11 18:27 UTC
p is a walkdir::DirEntry, a third party thing
Ingvar Stepanyan
@RReverser
Mar 11 18:27 UTC
same - closure wants to own the data too to make sure that the p.path() borrow doesn't outlive it
alternatively, if you do need p outside of this function, you can calculate p.path().to_owned() outside of the closure and then capture that as another variable, but that would involve a data clone.
kylegoetz
@kylegoetz
Mar 11 18:31 UTC
I don't need p. Basically I've got an iter of Result<DirEntry, Error>s representing a list of files in a directory structure, which is the dir. I'm trying to filter_map to get rid of the errors and return the DirEntrys converted into Images. I will never use any of those ps or dirs again. I'm thinking of this from a functional POV: get some iter, transform it to a vector of other things, and the next step will use the vector and never look at that iter or its contents again.
only thing I need is the vector of Image structs that have a path property that refers to the image on disk
Ingvar Stepanyan
@RReverser
Mar 11 18:32 UTC

I don't need p.

You don't, but p.path() needs it.

kylegoetz
@kylegoetz
Mar 11 18:33 UTC
Right. But in Ok(p) => Some(Image { path: p.path() }) that's the first occurrence of p and it's not got & beside it in the function Ok(p), so it should be owned, not borrowed, for the lifetime of the function Ok(p) => { ... }
Ichoran
@Ichoran
Mar 11 18:34 UTC
@kylegoetz - It's often easier to understand if you just manually loop through the iterator. Ownership is certainly simpler that way.
Ingvar Stepanyan
@RReverser
Mar 11 18:34 UTC
Unfortunately, it's not the default - by default Fn / FnMut closures (which flat_map accepts) would try to borrow data and not own it, unless you specify move.
kylegoetz
@kylegoetz
Mar 11 18:35 UTC
I read this as the function called Ok is passed parameter p and given ownership of it. The function called Ok starts with ownership of p, creates an Image with path property set to the return of p.path(), put sthat in an Option, and after that, the lifetime of p ends because the function that has ownership, Ok, terminates.
So I don't understand how p isn't known to last long enough to fetch p.path()
Ingvar Stepanyan
@RReverser
Mar 11 18:37 UTC

I read this as the function called Ok is passed parameter p and given ownership of it.

This is true only for FnOnce closures, but flat_map wants Fn / FnMut because it can execute it multiple times, and by default these borrow the environment.

That is, your function is not "passed" p but rather just &p.
Or, rather, &mut p (but that doesn't matter for you).
kylegoetz
@kylegoetz
Mar 11 18:37 UTC
ah
Ingvar Stepanyan
@RReverser
Mar 11 18:38 UTC
It's useful to imagine how closure translates to a struct plus a method (which it essentially is)
Ichoran
@Ichoran
Mar 11 18:39 UTC
Rust also tries pretty hard to not make you write & everywhere even if you are borrowing. For example, the match statement in here: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=dbb1a3fe002973add8b81868e8bf9ff9
kylegoetz
@kylegoetz
Mar 11 18:45 UTC
good_paths.filter_map(move |dir| {
        return match dir {
            Err(_e) => None,
            Ok(p) => Some(Image { path: p.path() }),
        };
    })
Same error. I've never seen the move keyword before, so I'm reading about it now but almost a mortal lock I won't understand it for a while. my_concentration does not live long enough, haha. I'll keep at it, though. Thanks.
Ingvar Stepanyan
@RReverser
Mar 11 18:45 UTC
that's odd...
Oh wait. How do you use the result?
Is p by any chance dropped while Image that borrowed from it still lives?
kylegoetz
@kylegoetz
Mar 11 18:46 UTC
I've noticed VS Code will occasionally show errors where lines of code used to exist and don't anymore, so maybe it's not refreshing properly.
Ingvar Stepanyan
@RReverser
Mar 11 18:46 UTC
It takes time with RLS to catch up, yeah.
Can you show the full code snippet?
kylegoetz
@kylegoetz
Mar 11 18:47 UTC
Yeah, but it's been like 5 minutes without update.
OK sure
let walker_iter = walkdir::WalkDir::new("/Users/kylegoetz/Pictures").into_iter();
    let good_paths = walker_iter.filter_entry(|e| is_valid_path(e.path()));

    let file_list: Vec<Image> = good_paths.filter_map(move |dir| {
        return match dir {
            Err(_e) => None,
            Ok(p) => Some(Image { path: p.path() }),
        };
    }).collect();
I don't have anything else written that uses file_list. But file_list will eventually have things performed on it. Image is a struct with a function calculate_checksum and will do some DB selects and inserts and possibly some file moves.

and Image:

pub struct Image<'a> {
    pub path: &'a Path,
}

There's also some traits implemented for Image, but doubt that matters here.

Ingvar Stepanyan
@RReverser
Mar 11 18:49 UTC
Right, so problem in your case is that you're collecting images which borrow paths from temporary items
Each particular DirEntry is lazily created and dropped as the iterator goes through, but you're trying to collect its Paths into a longer-living vector
kylegoetz
@kylegoetz
Mar 11 18:50 UTC
But isn't p guaranteed to exist until the filter_map is done? But the IDE is indicating that p doesn't live long enough within the filter_map closure itself, right? Or is the IDE warning me that should I use it later, there's no guarantee it will exist?
Ingvar Stepanyan
@RReverser
Mar 11 18:51 UTC
You have two options:
1) Handle each Image inside the loop (using .map / .for_each) instead of collecting somewhere.
2) If you do need a vector, then clone each p.path() via .to_owned() and store that on the Image.

But isn't p guaranteed to exist until the filter_map is done?

Nope, only within the particular closure execution. With move you're extending it to allow using it in subsequent .map / .for_each / for as well, but it's still not enough for collecting into an outer Vec.

kylegoetz
@kylegoetz
Mar 11 19:01 UTC
OK so p.path().to_owned() is the same issue: borrowed p doesn't live long enough. Do I have to do this to_owned before I even start the filter_map?
Ingvar Stepanyan
@RReverser
Mar 11 19:04 UTC
No that one should be fine. Did you change your type to hold PathBuf?
kylegoetz
@kylegoetz
Mar 11 19:09 UTC
I did and noticed that type mismatch error. So I added on an .as_path() afterwards and it reverted to the same "not long enough" error. I didn't want to have to change my structs if I didn't have to for no other reason than I could imagine a future scenario where I might have to change a bunch of structs (end goal is some polymorphic code)
I'll change my type to hold PathBuf and remove the .as_path and see what happens!
Ingvar Stepanyan
@RReverser
Mar 11 19:11 UTC
Yeah that's what I meant - if you do as_path() again, you're going back to reference with its issues. You just want an owned PathBuf for now.
kylegoetz
@kylegoetz
Mar 11 19:12 UTC
Oh I thought the as_path would end up with a clone of the original path. Thought PathBuf was a new thing divorced from the original path, and then as_path would be based off this new PathBuf that was separated from the original path
Ingvar Stepanyan
@RReverser
Mar 11 19:12 UTC
Yeah but then it's again a reference to a temporary PathBuf, and you've gone the entire circle.
kylegoetz
@kylegoetz
Mar 11 19:12 UTC
OK if I make it a PathBuf lots of my other code blows up because it needs Path everywhere. Live and learn about picking data types, huh? Should I just be defaulting to PathBuf all the time instead of Path?
Ingvar Stepanyan
@RReverser
Mar 11 19:13 UTC
No, not really - as I mentioned, another alternative is just not collecting images into an array, but depends if it works in your usecase
kylegoetz
@kylegoetz
Mar 11 19:13 UTC
Don't bother answering, haha. I should read up on what PathBuf actually is first since this is the first time I've ever used it.
Ingvar Stepanyan
@RReverser
Mar 11 19:13 UTC
If you can process them one-by-one, that would be preferred as now you don't need to change any types and can just use the borrowed Path as-is
kylegoetz
@kylegoetz
Mar 11 19:15 UTC

So I'm porting some Python code and trying to get step by step working before introducing more complications.

The next complication would be async stuff (DB query) and then parallelization (to push the presumably CPU-bound checksum calculations to different threads in a thread pool or whatever the Rust equivalent is)

I was just trying to get the images, then was going to learn the polymorphism aspect because this will really be a collection of images and videos but checksum calculations are different for the two, then was going to learn the async stuff, etc.

My thinking was

  1. file_list = the filter map.
  2. new_files = file_list.filter(db's checksum != file.calculate_checksum())
  3. Then new_files.foreach(insert into DB)

But I suppose I could do

file_list.foreach( get db checksum THEN if not same checksum do DB insert)

Thanks for the help. My brain is newly wrinkled. :)