Nicolas Bigaouette
@nbigaouette
My feeling is that as soon as I put a custom error, pain follows
Nicolas Bigaouette
@nbigaouette
fold_many_m_n() worked beautifully, thank you!
emmanueltouzery
@emmanueltouzery
hello, I'm trying to write a parser that would take a dynamic parameter. So I take the parameter (a reference to a HashSet), then I return an impl FnMut. But since the function I return needs to capture the HashSet parameter, the compiler complains that the dynamic parameter must have the 'static lifetime. But I'm looking at the source of the tag() function and it seems to accept non-static tag parameters? I could clone the parameter so that the closure will have it for sure, but I intend to pass the parameter a few parsers down so I'd rather not. How am I supposed to handle that?
xiretza
@xiretza:xiretza.xyz
[m]
the impl FnMut you return has to have the lifetime of the reference you pass in - e.g. &'a HashSet would become impl 'a + FnMut(...)
emmanueltouzery
@emmanueltouzery
oh, interesting. that will presumably force me to carry that lifetime up through all the callers. i'll try that. for now i've made it so that the closures take the parameter by value. it gets cloned every time, which is not ideal :(
thank you for the answer, I'll try that tonight and report!
emmanueltouzery
@emmanueltouzery
@xiretza:xiretza.xyz seems to work, great, thanks a lot!
xiretza
@xiretza:xiretza.xyz
[m]
great!
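The lifetime advice above can be sketched in plain Rust without nom. This is a minimal illustration, not the poster's code: `allowed_word` and `two_words` are hypothetical names, and the parsers return `Option<(rest, output)>` instead of nom's `IResult`. The `+ 'a` bound says the returned closure may borrow the set for `'a`, so the set needs neither `'static` nor a clone.

```rust
use std::collections::HashSet;

/// Hypothetical parser factory (not part of nom): the returned parser
/// accepts a leading alphabetic word only if `allowed` contains it.
/// `+ 'a` ties the closure to the borrowed set, so no clone is needed.
fn allowed_word<'a, 'i>(
    allowed: &'a HashSet<String>,
) -> impl FnMut(&'i str) -> Option<(&'i str, &'i str)> + 'a {
    move |input: &'i str| {
        // Length of the leading run of ASCII letters.
        let end = input
            .find(|c: char| !c.is_ascii_alphabetic())
            .unwrap_or(input.len());
        let (word, rest) = input.split_at(end);
        if allowed.contains(word) {
            Some((rest, word))
        } else {
            None
        }
    }
}

/// A caller that forwards the set carries the same lifetime upwards.
fn two_words<'a, 'i>(
    allowed: &'a HashSet<String>,
) -> impl FnMut(&'i str) -> Option<(&'i str, (&'i str, &'i str))> + 'a {
    move |input: &'i str| {
        let (rest, first) = allowed_word(allowed)(input)?;
        let rest = rest.strip_prefix(' ')?;
        let (rest, second) = allowed_word(allowed)(rest)?;
        Some((rest, (first, second)))
    }
}
```

The reference is `Copy`, so passing it "a few parsers down" only copies a pointer, never the set itself.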
Wojciech Niedźwiedź
@Niedzwiedzw
oof
I have a nice parser like this
       // sale_entries()
        separated_list1(
            separator::separator,
            sale_entry
        )
I'm completely blown away by how the first test fails and the second passes
    #[test]
    fn test_pair_of_sale_entry_petrol_short_separated() {
        let input = "some_input";
        let (_, sale_entries) = sale_entries(input).unwrap(); // THIS PANICS
        assert_eq!(sale_entries.len(), 2);
    }

    #[test]
    fn test_pair_of_sale_entry_petrol_short_separated_manual() {
        let input = "some_input";

        let (input, _entry) = sale_entry(input).unwrap();
        let (input, _sep) = separator::separator(input).unwrap();
        let (input, _entry) = sale_entry(input).unwrap();
        assert_eq!(input, "");

    }
aren't those equivalent?
Wojciech Niedźwiedź
@Niedzwiedzw
ok they're not if the separator is empty
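Why an empty separator match breaks the list but not the manual chain can be modelled in plain Rust. This is a simplified sketch, not nom's source: it assumes the list combinator rejects a separator that succeeds without consuming input (a guard against infinite loops), and the exact check and error differ between nom versions. All names here (`separated_list1_model`, `digit`, `comma`, `maybe_comma`) are hypothetical.

```rust
/// A parser here is any function from input to `Some((rest, output))`.
type PResult<'i, O> = Option<(&'i str, O)>;

/// Simplified model of a "separated list" loop, NOT nom's real code.
/// Assumption: a separator that matches without consuming input is
/// treated as an error, because the loop could otherwise never end.
fn separated_list1_model<'i, O>(
    sep: impl Fn(&'i str) -> PResult<'i, ()>,
    item: impl Fn(&'i str) -> PResult<'i, O>,
    input: &'i str,
) -> Result<(&'i str, Vec<O>), &'static str> {
    let (mut rest, first) = item(input).ok_or("first item failed")?;
    let mut out = vec![first];
    loop {
        let after_sep = match sep(rest) {
            Some((r, ())) => r,
            None => return Ok((rest, out)),
        };
        if after_sep.len() == rest.len() {
            return Err("separator matched without consuming input");
        }
        match item(after_sep) {
            Some((r, o)) => {
                out.push(o);
                rest = r;
            }
            None => return Ok((rest, out)),
        }
    }
}

/// Parses a single ASCII digit.
fn digit<'i>(input: &'i str) -> PResult<'i, u32> {
    let d = input.chars().next()?.to_digit(10)?;
    Some((&input[1..], d))
}

/// Mandatory comma: always consumes one byte on success.
fn comma<'i>(input: &'i str) -> PResult<'i, ()> {
    input.strip_prefix(',').map(|r| (r, ()))
}

/// Optional comma: also succeeds on nothing (a zero-width match).
fn maybe_comma<'i>(input: &'i str) -> PResult<'i, ()> {
    Some((input.strip_prefix(',').unwrap_or(input), ()))
}
```

In this model, chaining `digit`, `maybe_comma`, `digit` by hand succeeds on `"1,2"`, while the list version fails once the separator matches the empty remainder. Making the separator consume at least one character removes the difference.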
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter

I've been making a really simple parser for parsing boolean expressions with bools and C style identifiers that represent bools at runtime.

I've got a few outstanding questions that I could use some help with. I'll post them as separate comments for easier threading.
The code
https://gist.github.com/alexmadeathing/63ac3ba02d682dd791cba9918958c671

Question 1: I don't like the way the operator precedence is enforced via the recursive functions. I'm wondering, is it possible to implement operator precedence in a more maintainable way?
Question 2: Although it seems to work nicely, I feel like I'm abusing the Nom error architecture by inserting the error message content via the context system. I then propagate the context back to the user by only setting context in add_context() if there currently is no context - this seems to ensure the most relevant error is shown after the parse. Is that a poor approach? It seems to give me specific and fairly obvious errors, and frankly I can't work out how else to interpret the error information to get the same result.
Question 3: Would it be possible to adjust the context API so that it could take types other than &str? This would simplify some of my code.
I appreciate there is a lot here. So thanks in advance to anyone able to consider my questions.
So far the parser is working nicely and I'm glad I stumbled upon Nom (error management made me tear my hair out though, ngl).
xiretza
@xiretza:xiretza.xyz
[m]
for the error handling, I'd write a trait that allows you to construct an error from your ExpressionErrorKind and an input position - basically ParseError, but for your own error kind. Then make all your parsers generic on the error type, but restrict it to ParseError + ContextError + YourErrorTrait
that way the library user can use any error type they want
if you want, you can even provide dummy implementation of your trait for the nom-provided error types (Error, VerboseError et al)
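The shape of that suggestion can be sketched in plain Rust without nom. Everything below is an assumption made for illustration: the variants of `ExpressionErrorKind`, the trait name `ExpressionError`, and the `SimpleError` type are hypothetical. In real nom code the parser's bound would additionally include `ParseError<&str>` and `ContextError<&str>`, and the result would be an `IResult`.

```rust
/// Hypothetical domain-specific error kinds.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ExpressionErrorKind {
    ExpectedOperand,
    UnbalancedParen,
}

/// Hypothetical trait with the same shape as nom's `ParseError`:
/// build an error from an input position and a custom kind.
trait ExpressionError<I>: Sized {
    fn from_expression_kind(input: I, kind: ExpressionErrorKind) -> Self;
}

/// One concrete error type a library user might pick.
#[derive(Debug, PartialEq)]
struct SimpleError<I> {
    input: I,
    kind: ExpressionErrorKind,
}

impl<I> ExpressionError<I> for SimpleError<I> {
    fn from_expression_kind(input: I, kind: ExpressionErrorKind) -> Self {
        SimpleError { input, kind }
    }
}

/// "Dummy" implementation for an error type that ignores the details.
impl<I> ExpressionError<I> for () {
    fn from_expression_kind(_input: I, _kind: ExpressionErrorKind) -> Self {}
}

/// A parser that is generic over the error type chosen by the caller.
fn operand<'a, E: ExpressionError<&'a str>>(input: &'a str) -> Result<(&'a str, bool), E> {
    if let Some(rest) = input.strip_prefix("true") {
        Ok((rest, true))
    } else if let Some(rest) = input.strip_prefix("false") {
        Ok((rest, false))
    } else {
        Err(E::from_expression_kind(
            input,
            ExpressionErrorKind::ExpectedOperand,
        ))
    }
}
```

The caller picks the error type at the call site, for example `operand::<SimpleError<&str>>(input)` for detailed errors or `operand::<()>(input)` when only success or failure matters.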
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter

Hi @xiretza:xiretza.xyz I've modified my code to use a custom trait instead of context() and that's working much nicer. Thanks for your help there! At the moment, for the purposes of testing, I'm keeping my concrete error type too, but I may follow your advice and simply expose the error trait once I've got everything working.

I guess the big question then is that failing test (I have now added a test case for: a & b & c). I have been able to modify the grammar on paper to support the test case. It seems like it would work in Nom using right recursion, but that would result in an incorrect sequence of evaluations of the & terms.

Left recursion has me stumped.

Does Nom have any tools to help with this situation? I can see it working if I use maybe a vector of operands and a post processing step to generate the output AST. Is there a way to do that without allocations?

Actually I think the sequence of evaluations does not matter for my particular application - so I may just roll with right recursion for this. However... I'm kinda interested now. How are people tackling this issue in Nom?
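One common way around left recursion is to rewrite `and_expr := and_expr '&' operand` as `and_expr := operand ('&' operand)*` and fold the repetition to the left while parsing, which needs no intermediate vector. A minimal sketch in plain Rust without nom, where `Token`, `operand`, and `and_expr` are hypothetical stand-ins for the types in the gist:

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
    And(Box<Token>, Box<Token>),
}

/// Hypothetical operand parser: one or more ASCII letters or underscores.
fn operand(input: &str) -> Option<(&str, Token)> {
    let end = input
        .find(|c: char| !(c.is_ascii_alphabetic() || c == '_'))
        .unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    Some((&input[end..], Token::Ident(input[..end].to_string())))
}

/// and_expr := operand ('&' operand)*
/// Each new operand becomes the right child and the tree built so far
/// becomes the left child, so `a&b&c` parses as `(a & b) & c`.
fn and_expr(input: &str) -> Option<(&str, Token)> {
    let (mut rest, mut acc) = operand(input)?;
    while let Some(after_amp) = rest.strip_prefix('&') {
        // A dangling `&` is a hard failure, similar in spirit to `cut`.
        let (next, rhs) = operand(after_amp)?;
        acc = Token::And(Box::new(acc), Box::new(rhs));
        rest = next;
    }
    Some((rest, acc))
}
```

The only allocations are the `Box`es that end up in the final tree. In nom, the same loop is what a fold-style combinator expresses.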
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
Hmmmmmmmmmm Maybe fold_many will work for me...
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
Oh man, I'm so close to having this fold_many implementation working. I'm tripping up on some lifetime stuff.
    let (input, init) = operand(input)?;
    fold_many0(
        preceded(
            tag("&"),
            cut(specific(Specific::ExpectAndExpression, operand)),
        ),
        ||init, // Lifetime error here
        |a, b| Token::And(Box::new(a), Box::new(b)),
    )(input)
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
It works if I use clone(), but since I know ownership of init should be moved, cloning would be a waste. It would also work if the init parameter of fold_many() were FnOnce instead of FnMut, although I don't know if that would break some combinator usage.
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
Ok init parameter being FnOnce wouldn't work because return type of fold_many is FnMut. Hmmmm
xiretza
@xiretza:xiretza.xyz
[m]
yeah that's a frequent pain point with nom's combinators, even if you only need the resultant parser once, the combinator still needs to return a "FnNonOnce", so all inputs need to be Clone/"FnNonOnce"
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
Ironically, this near exact pattern is used in nom/tests/arithmetic
Sadly that's using i64, so it's copy and cloneable
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
I can probably solve this by making a container object to manage the token structure
Specifically for use in the fold functions. It's just a shame as it's extra boilerplate
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
omg I think I have a solution
You're going to vomit
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
fn and_term<'a, E>(input: &'a str) -> IResult<&'a str, Token, E>
where
    E: nom::error::ParseError<&'a str> + SpecificError<&'a str>,
{
    let (input, init) = term_or_not_term(input)?;
    let (input, mut root) = fold_many0(
        preceded(
            ws_tag("&"),
            cut(specific(Specific::ExpectAndExpression, term_or_not_term)),
        ),
        || Token::Bool(true),
        |a, b| Token::And(Box::new(a), Box::new(b)),
    )(input)?;

    // Well well well, what is this hot mess?
    // Because the left side is always either And or Bool, we can
    // simply loop until we find the bool then replace it with init
    //
    // #NOTE This is necessary because it's not possible to move
    // init into the closures passed to fold_many0 as they expect
    // FnMut, not FnOnce
    let mut t = &mut root;
    while let Token::And(a, _) = t {
        t = a.as_mut();
    }
    debug_assert!(matches!(t, Token::Bool(_)));
    *t = init;

    Ok((input, root))
}
This passes the tests, but it's so dirty
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
Someone please tell me if there's some closure magic that will convince rust to move init directly into fold_many() instead of this filth you see above
tanriol
@tanriol:matrix.org
[m]
@AWildAudioNerd_twitter: I'd try let init = Some(init) outside and || init.take().unwrap() as the initializing closure.
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
WTF that worked (with let mut init = Some(init) outside and move || init.take().unwrap() as the initializing closure)
tanriol
@tanriol:matrix.org
[m]
Sorry, I missed these details :-) the basic idea here is to be explicit: "take the ownership the first time the closure is called; you're free to panic if it's called again"
Alex Blunt ✊🏿 BLM
@AWildAudioNerd_twitter
Ok I understand now
The option allows the ownership transfer to be explicit. If the closure were to be called again (which it won't in this instance), the option would be None and the method would panic
Ok, that's also dirty. But it's less dirty than my implementation.
Thank you! You folks have helped me level up in Rust during this process too
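The `Option::take` trick can be reproduced in plain Rust. In this sketch `fold_with` is a hypothetical stand-in for a combinator that, like `fold_many0`, requests its initial value through an `FnMut` closure. It is not nom's API, and `Token` deliberately does not implement `Clone`.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
    And(Box<Token>, Box<Token>),
}

/// Hypothetical stand-in for a fold combinator whose initial value
/// comes from an `FnMut` closure rather than an `FnOnce` one.
fn fold_with<T, I, A>(
    items: I,
    mut init: impl FnMut() -> A,
    mut step: impl FnMut(A, T) -> A,
) -> A
where
    I: IntoIterator<Item = T>,
{
    let mut acc = init();
    for item in items {
        acc = step(acc, item);
    }
    acc
}

/// Builds a left-nested `And` chain without cloning `first`.
fn and_chain(first: Token, rest: Vec<Token>) -> Token {
    // `move || first` would only be `FnOnce`, because calling it gives
    // the value away. Wrapping it in an `Option` makes the closure
    // `FnMut`: the first call takes the value, a second call would panic.
    let mut first = Some(first);
    fold_with(
        rest,
        move || first.take().expect("init closure called more than once"),
        |a, b| Token::And(Box::new(a), Box::new(b)),
    )
}
```

The trade-off is a runtime panic instead of a compile-time guarantee, so this is only safe when the resulting parser is invoked once, as it is in `and_term` above.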
Jozef Miklos
@sparatko_gitlab

hello, any nom_locate users? i have trouble wrapping my mind around the error i am getting...
Original &str input parser:

type MyResult<'a, T> = nom::IResult<&'a str, T>;
fn yang_version_arg(input: &str) -> MyResult<&str> {
    tag("1.1")(input)
}

seems to work fine...

Then i try to use nom_locate and follow the GitHub readme with Span 'type':

type MyResult<'a, T> = nom::IResult<Span<'a>, T>;
fn yang_version_arg(input: Span) -> MyResult<&str> {
    tag("1.1")(input)
}

but i get errors across whole app codebase wherever i try to use any nom combinator/parser:

error[E0308]: mismatched types
  --> src\parser\rfc7950\yang_version_stmt.rs:29:5
   |
28 | fn yang_version_arg(input: Span) -> MyResult<&str> {
   |                                     -------------- expected `Result<(LocatedSpan<&str>, &str), nom::Err<nom::error::Error<LocatedSpan<&str>>>>` because of return type
29 |     tag("1.1")(input)
   |     ^^^^^^^^^^^^^^^^^ expected `&str`, found struct `LocatedSpan`
   |
   = note: expected enum `Result<(LocatedSpan<&str>, &str), nom::Err<nom::error::Error<LocatedSpan<&str>>>>`
              found enum `Result<(LocatedSpan<&str>, LocatedSpan<&str>), nom::Err<_>>`

do i miss some piece of puzzle? isn't some deref supposed to happen to allow usage of all pre-build combinators in nom?

tanriol
@tanriol:matrix.org
[m]
Looks like tag here returns LocatedSpan<&str> instead of &str, you may want to change the return type or convert it somehow.
Jozef Miklos
@sparatko_gitlab
my understanding was that nom_locate modifies the input for combinators, not the output, but yes, the error message suggests the opposite, like you say
tanriol
@tanriol:matrix.org
[m]
I'd guess tag and take are a special case here because their input and output are the same type?..
Jozef Miklos
@sparatko_gitlab
there may be some pattern i am not following, or one that is not explained very well in the nom_locate "tutorial", as i get hundreds of similar errors for basically any combinator from nom that i use across my parser :( (with the same rustc error pattern)
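The mismatch follows from `tag` returning the matched text in the same type as its input, so a span goes in and a span comes out. The sketch below models that in plain Rust. `Span`, `Input`, `take_prefix`, and this `tag` are simplified stand-ins written for illustration, not the real nom or nom_locate items.

```rust
/// Minimal stand-in for nom_locate's `LocatedSpan<&str>` (assumption:
/// only the text fragment and its byte offset are modelled).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span<'a> {
    fragment: &'a str,
    offset: usize,
}

impl<'a> Span<'a> {
    fn new(fragment: &'a str) -> Self {
        Span { fragment, offset: 0 }
    }
}

/// Hypothetical input abstraction so one `tag` serves both input types.
trait Input: Sized {
    /// Returns `(rest, matched)` if the input starts with `prefix`.
    fn take_prefix(self, prefix: &str) -> Option<(Self, Self)>;
}

impl<'a> Input for &'a str {
    fn take_prefix(self, prefix: &str) -> Option<(Self, Self)> {
        if self.starts_with(prefix) {
            let (matched, rest) = self.split_at(prefix.len());
            Some((rest, matched))
        } else {
            None
        }
    }
}

impl<'a> Input for Span<'a> {
    fn take_prefix(self, prefix: &str) -> Option<(Self, Self)> {
        let (rest, matched) = self.fragment.take_prefix(prefix)?;
        Some((
            Span { fragment: rest, offset: self.offset + prefix.len() },
            Span { fragment: matched, offset: self.offset },
        ))
    }
}

/// As with nom's `tag`, the output type equals the input type.
fn tag<I: Input>(prefix: &'static str) -> impl Fn(I) -> Option<(I, I)> {
    move |input: I| input.take_prefix(prefix)
}

/// Option 1: keep the span as the output type.
fn version_span<'a>(input: Span<'a>) -> Option<(Span<'a>, Span<'a>)> {
    tag("1.1")(input)
}

/// Option 2: convert the matched span back to `&str`.
fn version_str<'a>(input: Span<'a>) -> Option<(Span<'a>, &'a str)> {
    let (rest, matched) = tag("1.1")(input)?;
    Some((rest, matched.fragment))
}
```

With the real crates, the first option would correspond to declaring the return type as `MyResult<Span>`, and the second to mapping the output of `tag` through the span's `fragment()` accessor. The same choice applies to every combinator whose output is a slice of the input, which would explain why the error shows up across the whole parser.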