Simon Thornington
@sthornington
because I am not good at gitter
would be curious to hear better ways of doing this
(I've been using nom for like 3 days so this was practice)
working through https://bodil.lol/parser-combinators/ was helpful for me to understand the whole paradigm in Rust
I think you could make a more efficient one_of for ascii A-Z by implementing your own FindToken on 'A' <= c <= 'Z' for instance
or maybe something like that already exists
Jordan Mack
@jordanmack
I actually need a-zA-Z0-9_, which is equiv to \w in regex.
I'm trying to learn Nom as well, so I want to do it the Nom way.
Simon Thornington
@sthornington
maybe implement a FindToken using is_alphanumeric ?
Jordan Mack
@jordanmack
Yes, I think I can tie that with alt or something to get the _ in there.
that simplifies it to:
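use nom::character::complete::alphanumeric0;
use nom::sequence::terminated;
use nom::IResult;

// Parse a run of alphanumerics followed by ':'; the ':' is consumed but not returned.
fn paz(input: &str) -> IResult<&str, &str> {
    terminated(alphanumeric0, nom::character::complete::char(':'))(input)
}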

#[test]
fn test_az() {
    assert_eq!(paz("ABC:"), Ok(("", "ABC")));
}
Alberto
@0X1A
Hi folks. I want to use the error propagation operator in a parser function. The function using the propagator returns a custom error I've defined (we'll call it MyError). The parser function returns the usual IResult. I've tried looking at other examples for nom 5 but I can't quite figure out how this can be done. I can obviously define the From trait for MyError -> nom::Err<_> but then I would have to return an ErrorKind, where the ErrorKind would be totally unrelated since errors returned by MyError have essentially nothing to do with parsing.
peckpeck
@peckpeck
the recommended way is to define a global error type and implement ParseError on it
then all my parsers return an IResult<I,O,MyError>
this is done automatically since I use nom5 functions and use the results directly as a MyError
Alberto
@0X1A
@peckpeck See, that's what I thought as well but then I seem to get :
`?` couldn't convert the error to `nom::Err<MyError>`
Whenever I use the ? operator with a nom parser function. And I can't exactly impl From<Err<_>> for Err<MyError>, since both the trait and the type are defined outside my crate
peckpeck
@peckpeck
did you implement ParseError ?
Alberto
@0X1A
Yep, have impl<'a> ParseError<MyInputType<&'a str>> for MyError in my crate
? does an implicit conversion using the From trait, so I wonder if this simply isn't possible without defining From, which of course I can't do since Err is nom's
peckpeck
@peckpeck
i didn't have to define it in my code
and that would imply defining it for nom::Err not for your Error type
what did you use the ? on?
Alberto
@0X1A
preceded(_, _)(input)?
peckpeck
@peckpeck
are the _ things nom parsers or ones you wrote ?
Alberto
@0X1A
Oh fml, the parser I was using there was still returning the generic IResult
Alberto
@0X1A
Okay, so that resolved that. I can now use IResult<I, O, MyError>, but now I'm back at not being able to use the ? propagator for my functions that return non-IResults. I'm guessing the only way to fix that would be to convert it manually because of IResult...
peckpeck
@peckpeck
for my part I strictly separated parser code from other code, so there are only a few places where I had to manually translate my parser errors into my application error
anything I used within a parser either returned a MyError, or I converted it to one using an equivalent of cut, or I simply matched on it for error handling
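A dependency-free sketch of that separation, for illustration only. It does not use nom: `MyError` stands in for the parser error from the messages above, while `AppError`, `number` and `read_port` are hypothetical names. The idea is that non-parser failures are converted by hand inside the parser, and a single `From` impl between two local types lets `?` work in application code.

```rust
use std::num::ParseIntError;

// Hypothetical parser-side error, standing in for the chat's `MyError`.
#[derive(Debug, PartialEq)]
enum MyError {
    Empty,
    BadNumber(String),
}

// Hypothetical application-level error, kept separate from parsing concerns.
#[derive(Debug, PartialEq)]
enum AppError {
    Parse(MyError),
    TooLarge(u32),
}

// The one place where parser errors become application errors.
// Both types are local, so this impl is allowed, and `?` picks it up.
impl From<MyError> for AppError {
    fn from(e: MyError) -> Self {
        AppError::Parse(e)
    }
}

// Parser-shaped function: returns (remaining input, value) or a MyError.
// The non-parser failure (ParseIntError) is converted by hand with map_err.
fn number(input: &str) -> Result<(&str, u32), MyError> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err(MyError::Empty);
    }
    let (digits, rest) = input.split_at(end);
    let value = digits
        .parse::<u32>()
        .map_err(|e: ParseIntError| MyError::BadNumber(e.to_string()))?;
    Ok((rest, value))
}

// Application code: `?` converts MyError into AppError via the From impl.
fn read_port(input: &str) -> Result<u16, AppError> {
    let (_rest, n) = number(input)?;
    if n > u16::MAX as u32 {
        Err(AppError::TooLarge(n))
    } else {
        Ok(n as u16)
    }
}
```

With these definitions, `read_port("8080")` yields `Ok(8080)`, while `read_port("")` yields `Err(AppError::Parse(MyError::Empty))`.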
Evgeny Fomin
@fominok
Hello! I'm new to Nom and have a basic question, I've described it here: https://stackoverflow.com/questions/57608442/rust-nom-many-and-end-of-input. Thank you in advance!
peckpeck
@peckpeck
with nom 5 all macros use the incomplete (streaming) parsers; if you want the complete versions, you have to switch to functions
Evgeny Fomin
@fominok
@peckpeck is there any alternative to do_parse to deny some of the inputs?
peckpeck
@peckpeck
do_parse is not meant to reject input but to write a sequence of parsers; the recommended alternative now is to write a parser as a sequence of let (i, xxx) = someParser(i)?;
I created a sequence! alternative in https://github.com/peckpeck/nomplus
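The `let (i, xxx) = someParser(i)?;` style can be sketched without nom. Everything here is a hypothetical stand-in: `PResult` mimics the shape of an `IResult`, and `word`, `ch` and `key_value` are made-up parsers, not nom functions.

```rust
// Minimal stand-in for a parser result: Ok((remaining input, output)) or an error message.
type PResult<'a, O> = Result<(&'a str, O), String>;

// Hypothetical helper: consume a non-empty run of ASCII letters.
fn word<'a>(i: &'a str) -> PResult<'a, &'a str> {
    let end = i
        .find(|c: char| !c.is_ascii_alphabetic())
        .unwrap_or(i.len());
    if end == 0 {
        Err(format!("expected a word at {:?}", i))
    } else {
        Ok((&i[end..], &i[..end]))
    }
}

// Hypothetical helper: consume exactly the character `expected`.
fn ch<'a>(expected: char, i: &'a str) -> PResult<'a, char> {
    match i.chars().next() {
        Some(c) if c == expected => Ok((&i[c.len_utf8()..], c)),
        _ => Err(format!("expected {:?} at {:?}", expected, i)),
    }
}

// Sequencing: thread the remaining input `i` through each step,
// and let `?` return early on the first failure.
fn key_value<'a>(i: &'a str) -> PResult<'a, (&'a str, &'a str)> {
    let (i, key) = word(i)?;
    let (i, _) = ch('=', i)?;
    let (i, value) = word(i)?;
    Ok((i, (key, value)))
}
```

Shadowing `i` at each step keeps the "remaining input" explicit, which is the role do_parse's chaining used to play.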
Evgeny Fomin
@fominok
Both versions (with take_till and eof) and without macros work, thank you!
matrixbot
@matrixbot
bspeice Is there a good way to handle bit-tuple parsing? As an example, in IPv4, there's a 4-bit version number and 4-bit header size as the first byte. It's not hard to just mask off the upper 4 bits, but I was curious about trying to do it in Nom style.
Denis Lisov
@tanriol
There's the nom::bits module with bit-granularity parsers.
matrixbot
@matrixbot
bspeice Right, but the impression was that I needed to surround things with a nom::bits::bits function in order to actually use bits without resetting the alignment each time.
bspeice Specifically, I want something like tuple((take_bits(4), take_bits(4)))
matrixbot
@matrixbot
bspeice Currently I'm getting type errors with both let (_, (a, b)): (&[u8], (u8, u8)) = tuple((bits(take(4usize)), bits(take(4usize))))(buf).expect("Parse failed"); and let (_, (a, b)): (&[u8], (u8, u8)) = bits(tuple((take(4usize), take(4usize))))(buf).expect("Parse failed");.
bspeice (Noting that the take in question has been imported from the bits::complete)
Denis Lisov
@tanriol
bits(tuple(...)) should work. However, you may need to use some dark magic for type inference to work... for me it required bits::<_, _, (_, _), _, _>(tuple(...))
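For comparison, the plain masking approach mentioned at the start of this thread can be sketched as follows. This does not use nom's bits module; `version_and_ihl` is a hypothetical helper that returns the result in the same (rest, value) shape a parser would.

```rust
// Split the first byte of an IPv4 header into (version, header length in 32-bit words).
// Returns None on empty input; otherwise the remaining bytes plus the two 4-bit fields.
fn version_and_ihl(buf: &[u8]) -> Option<(&[u8], (u8, u8))> {
    let (&first, rest) = buf.split_first()?;
    let version = first >> 4; // upper 4 bits
    let ihl = first & 0x0F; // lower 4 bits
    Some((rest, (version, ihl)))
}
```

For a typical first byte of `0x45` this gives version 4 and a header length of 5 words.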
James Carl
@crazycarl

I like Nom. It's incredibly fast and very maintainable.
I think the documentation for the whitespace module hasn't been updated yet. I just know that in the source code, ws! is marked as deprecated.

Looking at the source... is sep the right function to use here? Is this function finished?

prbs23
@prbs23_gitlab
I am trying to figure out if an issue with unicode characters I'm running into is a bug, or intentional. Specifically I'm looking at the "character" parsers (nom::character::streaming::{char, anychar}). It looks like when using &[u8] as the input type these parsers don't correctly handle multi-byte characters. For example the following works:
assert_eq!(nom::character::streaming::anychar::<_, (_, nom::error::ErrorKind)>(&b"A"[..]), Ok((&b""[..], 'A')));
But this does not:
assert_eq!(nom::character::streaming::anychar::<_, (_, nom::error::ErrorKind)>("\u{2140}".as_bytes()), Ok((&b""[..], '\u{2140}')));
Instead it returns: Ok(([133, 128], 'â'))
This issue doesn't appear to happen when the input is a String type
Is this expected or intended behavior? It seems a little incorrect given that rust characters are by definition utf8 encoded
peckpeck
@peckpeck
this seems normal to me, when you parse a byte slice, you intend to process bytes one by one, whereas when you parse a string you intend to process characters (in rust sense) one by one
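The difference can be reproduced with the standard library alone. This is a sketch with two hypothetical helpers, not nom's implementation: U+2140 is encoded in UTF-8 as the bytes E2 85 80, so a byte-oriented reader sees 0xE2 first (which is 'â' when cast to a char), leaving 0x85 and 0x80 (133 and 128) behind.

```rust
// Byte-oriented view: take one byte and reinterpret it as a char (U+0000..=U+00FF).
fn first_as_byte(input: &[u8]) -> Option<char> {
    input.first().map(|&b| b as char)
}

// Character-oriented view: decode one full Unicode scalar value from the string.
fn first_as_char(input: &str) -> Option<char> {
    input.chars().next()
}
```

Calling `first_as_byte("\u{2140}".as_bytes())` gives `Some('â')`, while `first_as_char("\u{2140}")` gives `Some('\u{2140}')`.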
Adrien FAURE
@adfaure
Hello, I am trying to write a simple parser that consumes alphanumeric chars and "-". It is not clear to me how to do that.
Is it possible to write a function like alphanumeric1 dedicated to my use case ? Thanks!
Adrien FAURE
@adfaure

I did something like:

fn uuid_parser(s: &str) -> IResult<&str, Uuid, VerboseError<&str>> {
     map_res(recognize(uuidchar), Uuid::parse_str)(s)
}

fn uuidchar<T, E: ParseError<T>>(s: T) -> IResult<T, T, E>
where
  T: InputTakeAtPosition,
  <T as InputTakeAtPosition>::Item: AsChar,
{
  s.split_at_position1_complete(|item| {
      let ch = item.as_char();
      !(ch == '-' || is_alphanumeric(ch as u8))
  }, ErrorKind::AlphaNumeric)
}

#[test]
fn test() {
    println!("{:?}", uuid_parser("c15a23cd-22d8-4351-b738-396b274599f8"));
}

However, it looks a bit hacky. Is there a simpler way to do that?
I basically copied the function : https://docs.rs/nom/5.0.1/src/nom/character/complete.rs.html#571

Denis Lisov
@tanriol
take_while1 will probably be more readable.
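To show what a take_while1-style parser does with such a predicate, here is a dependency-free sketch. Note the assumption: this `take_while1` is a hand-written stand-in that takes the input directly, whereas nom's own `take_while1` takes only the predicate and returns a parser; `uuid_chars` is a hypothetical name.

```rust
// Stand-in for a take_while1-style parser: returns Ok((rest, matched)) for the
// longest non-empty prefix whose characters all satisfy `pred`.
fn take_while1<F>(pred: F, input: &str) -> Result<(&str, &str), &'static str>
where
    F: Fn(char) -> bool,
{
    // Byte index of the first character that fails the predicate.
    let end = input
        .char_indices()
        .find(|&(_, c)| !pred(c))
        .map(|(idx, _)| idx)
        .unwrap_or(input.len());
    if end == 0 {
        Err("expected at least one matching character")
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

// Alphanumeric characters plus '-', as asked for above.
fn uuid_chars(input: &str) -> Result<(&str, &str), &'static str> {
    take_while1(|c| c == '-' || c.is_ascii_alphanumeric(), input)
}
```

The same shape covers the earlier a-zA-Z0-9_ question by swapping the predicate for `|c| c.is_ascii_alphanumeric() || c == '_'`.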