Denis Lisov
@tanriol
Interesting... this sounds like a good use case for a custom derive, but I don't see a fully functional one.
However, how about two-stage parsing?
You have multiple parse_ignored_table, so I assume you can skip a table even if you don't know its type yet... and you probably have a type code in the table header that lets you identify the specific table.
Denis Lisov
@tanriol
How about the following logic? You parse the 54 "unknown" tables in a loop and put them in a HashMap with their type code as the key. Only then do you start building the structure: you get each table by its ID and parse its contents, returning an error if any of the required tables is not found or fails to parse.
By the way, if that's not a secret, what were the previous two crates you had to modify?
Kubik
@xNxExOx
pub struct AbstractCffTable<'a> {
    pub id: u32,
    pub unknown1: u16,
    pub compressed_size: u32,
    pub unknown2: u16,
    pub decompressed_size: u32,
    pub data: &'a [u8],
}
All tables look like this, so I can use id and compressed_size for easy skipping.
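The two-stage idea discussed above can be sketched in plain Rust without nom. This is only an illustration under stated assumptions: the header is taken to be 16 packed little-endian bytes in the field order of the struct above, the payload is taken to be exactly `compressed_size` bytes long, and the names `RawTable`, `index_tables`, and `required` are hypothetical.

```rust
use std::collections::HashMap;

/// Raw table with an undecoded payload (hypothetical, modelled on the struct above).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct RawTable<'a> {
    pub id: u32,
    pub compressed_size: u32,
    pub decompressed_size: u32,
    pub data: &'a [u8],
}

// ASSUMPTION: 16-byte packed little-endian header, then `compressed_size` payload bytes.
const HEADER_LEN: usize = 16;

fn read_u32(bytes: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]])
}

/// Stage 1: split the input into raw tables keyed by id, without decoding any payload.
pub fn index_tables(mut input: &[u8]) -> Result<HashMap<u32, RawTable<'_>>, String> {
    let mut tables = HashMap::new();
    while !input.is_empty() {
        if input.len() < HEADER_LEN {
            return Err("truncated table header".to_string());
        }
        let id = read_u32(input, 0); // bytes 4..6 hold unknown1
        let compressed_size = read_u32(input, 6); // bytes 10..12 hold unknown2
        let decompressed_size = read_u32(input, 12);
        // The payload must fit inside the remaining input.
        let end = HEADER_LEN
            .checked_add(compressed_size as usize)
            .filter(|&e| e <= input.len())
            .ok_or_else(|| format!("table {} is truncated", id))?;
        let data = &input[HEADER_LEN..end];
        tables.insert(id, RawTable { id, compressed_size, decompressed_size, data });
        input = &input[end..];
    }
    Ok(tables)
}

/// Stage 2 helper: fetch a required table by id, or report which one is missing.
pub fn required<'a>(tables: &HashMap<u32, RawTable<'a>>, id: u32) -> Result<RawTable<'a>, String> {
    tables
        .get(&id)
        .copied()
        .ok_or_else(|| format!("required table {} not found", id))
}
```

Each concrete table parser would then take the `data` slice returned by `required` and decode it, so a missing or malformed table surfaces as an error while the final structure is being built.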
Kubik
@xNxExOx
:O I really like this :) thank you very much for this great simplification @tanriol
Elliott Slaughter
@elliottslaughter
is there a simple example somewhere showing how to hook up nom to a Read instead of using &str or similar?
or maybe I should just not bother and read the entire file into memory
matrixbot
@matrixbot
bspeice Likely easier to read into memory, but using something like BufReader would make that easier. Most of the nom functions are abstracted over various Input* traits, so maybe have a concrete Reader that implements those?
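A minimal sketch of the "read it all into memory" route, using only the standard library. `read_all` and `parse_number` are hypothetical names. `parse_number` is a hand-written stand-in for whatever slice-based parser would run on the buffer, not a nom function.

```rust
use std::io::{self, Read};

/// Drain any `Read` source into a buffer so that a slice-based parser can run on it.
pub fn read_all<R: Read>(mut source: R) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    source.read_to_end(&mut buf)?;
    Ok(buf)
}

/// Hypothetical stand-in for a slice-based parser: reads leading ASCII digits.
/// Returns (remaining input, parsed value).
pub fn parse_number(input: &[u8]) -> Option<(&[u8], u64)> {
    let digits = input.iter().take_while(|b| b.is_ascii_digit()).count();
    if digits == 0 {
        return None;
    }
    let text = std::str::from_utf8(&input[..digits]).ok()?;
    let value = text.parse::<u64>().ok()?;
    Some((&input[digits..], value))
}
```

For a file on disk, `std::fs::read(path)` produces the same kind of `Vec<u8>` in one call. The parser then borrows from that buffer, so the buffer has to outlive whatever the parser returns.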
Denis Lisov
@tanriol
@estrogently I'd do that manually (as in "take a string with the last two fields and split it manually on the last separator").
Giuseppe Longo
@glongo
I have a problem with this macro: cond!(len > 20,flat_map!(take!(len - 20),complete!(dbg_dmp!(many1!(parse_radius_attribute)))))
I can't understand why it returns an incomplete error: Error(Incomplete(Size(1))) at l.50 by ' many1 ! (parse_radius_attribute)
I've tried to print the input bytes to figure out what's going on, and I've noticed that after the last sequence of bytes the function parse_radius_attribute is called with an empty array, an example below:
i: [20, 1f, 4d, 49, 57, 56, 42, 41, 42, 2d, 53, 57, 30, 31, 2d, 4d, 49, 54, 43, 48, 45, 4c, 4c, 2d, 39, 31, 35, 41, 2d, 57, 49]
i: []
Can anyone point out what the problem is?
I think that the parser should stop here:
i: [20, 1f, 4d, 49, 57, 56, 42, 41, 42, 2d, 53, 57, 30, 31, 2d, 4d, 49, 54, 43, 48, 45, 4c, 4c, 2d, 39, 31, 35, 41, 2d, 57, 49]
Denis Lisov
@tanriol
@glongo What's your nom version?
Giuseppe Longo
@glongo
@tanriol 4.2
Denis Lisov
@tanriol
Hm, don't remember how exactly this worked in nom 4. I'd probably try many1!(complete!(parse_radius_attribute))
Giuseppe Longo
@glongo
@tanriol you remember correctly :D thank you!
Denis Lisov
@tanriol
You probably also need eof in the nested parser, to reject trailing junk if that part is malformed.
Giuseppe Longo
@glongo
you mean something like many1!(complete!(eof!(parse_radius_attribute)))
Denis Lisov
@tanriol
No, I mean something like terminated!(many1!(complete!(parse_radius_attribute)), eof!())
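To make the intent of that combination concrete, here is a hand-written plain Rust equivalent, not nom code: parse one or more attributes and fail if anything is left over. It assumes the usual RADIUS attribute layout (one type byte, one length byte that counts the two header bytes, then the value). `Attribute`, `parse_attribute`, and `parse_all_attributes` are hypothetical names.

```rust
/// One RADIUS attribute: type byte plus its value bytes.
#[derive(Debug, PartialEq)]
pub struct Attribute<'a> {
    pub typ: u8,
    pub value: &'a [u8],
}

/// Parse a single attribute. Returns (remaining input, attribute),
/// or None if the input is too short or the length byte is invalid.
pub fn parse_attribute(input: &[u8]) -> Option<(&[u8], Attribute<'_>)> {
    let (&typ, rest) = input.split_first()?;
    let (&len, rest) = rest.split_first()?;
    // The length field includes the type and length bytes themselves.
    let value_len = (len as usize).checked_sub(2)?;
    if rest.len() < value_len {
        return None;
    }
    let (value, rest) = rest.split_at(value_len);
    Some((rest, Attribute { typ, value }))
}

/// One or more attributes that must consume the whole input
/// (the role played by many1 plus eof in the macro version).
pub fn parse_all_attributes(mut input: &[u8]) -> Option<Vec<Attribute<'_>>> {
    let mut attributes = Vec::new();
    while !input.is_empty() {
        // A failure here covers both a malformed attribute and trailing junk.
        let (rest, attribute) = parse_attribute(input)?;
        attributes.push(attribute);
        input = rest;
    }
    if attributes.is_empty() {
        None
    } else {
        Some(attributes)
    }
}
```

The loop stops on empty input instead of calling the attribute parser once more, which is the situation that produced the `Incomplete` error in the macro version.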
Giuseppe Longo
@glongo
let me give it a try
Matwey V. Kornilov
@matwey
Hi everyone
I have my input as a slice of slices: &[&[u8]]. How can I make it work with nom?
Denis Lisov
@tanriol
IIRC, you'd need to implement the corresponding input traits for (a newtype wrapping) this slice of slices.
Matwey V. Kornilov
@matwey
Is there any example to follow?
Denis Lisov
@tanriol
Not sure. I'd first ask myself whether I could just collect that into a single array... and if I cannot, whether I could parse it in parts (separate packets or something like that) and keep a buffer manually.
That's likely easier.
Matwey V. Kornilov
@matwey
Easier, but less efficient due to extra memory copy.
Denis Lisov
@tanriol
I'd not be so sure. Extra memory copy on the one hand, yes... more complex parsers, more branches and more code bloat on the other.
Virgile Andreani
@Armavica
Hello! I am looking for a way to discard the beginning of the input until a given parser applies. Is there a way to do that?
Denis Lisov
@tanriol
You can, but there's no easy way... and the performance may be pretty bad. What do you know about the parser that has to work?
Virgile Andreani
@Armavica
Ok, thank you. I am looking for a DER object with der-parser. For now I am trying the parser at each successive offset of the input, which is indeed pretty inefficient. But maybe I could first search for a pattern in my data and then parse only from there.
Denis Lisov
@tanriol
Sounds a bit dangerous... I don't know this area well enough, so I'm not sure whether you're likely to have false positives.
Virgile Andreani
@Armavica
Why dangerous? I think false positives are fine, because they are going to be invalidated by the parser.
Denis Lisov
@tanriol
That probably depends on whether you know the expected structure ahead of time so that you could compare the one found to the one expected.
If you know it, you're probably safe from the "noise" prefix accidentally containing a matching structure, but, on the other hand, this sounds like a security vulnerability if the attacker could somehow control the "noise" part and inject the bad version of the structure.
Virgile Andreani
@Armavica
I see
However I am in fact scanning for all such structures in the input, so even if the attacker can insert bad cases, I think I should be able to still find the good ones, which is what matters in my case.
Denis Lisov
@tanriol
I'd probably not bet on that - reading the same data in different ways (or, for example, different instances of this structure) is one of the frequent sources of security vulnerabilities.
Returning to the original question, I'd take the specification and the expected data structure and find some sequence of 4+ bytes that's known ahead of time, so that I could seek to it with take_until.
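A rough plain Rust sketch of the "seek to a known byte sequence, then parse" approach, assuming such a marker exists for the format in question. `seek_to` and `scan_all` are hypothetical helpers. Like take_until, `seek_to` leaves the marker itself in the returned input.

```rust
/// Find the first occurrence of `marker` and return the input starting at it.
/// An empty marker never matches.
pub fn seek_to<'a>(input: &'a [u8], marker: &[u8]) -> Option<&'a [u8]> {
    if marker.is_empty() {
        return None;
    }
    input
        .windows(marker.len())
        .position(|window| window == marker)
        .map(|at| &input[at..])
}

/// Run `parse` at every occurrence of `marker` and collect the successful results.
/// False positives are simply the candidates that `parse` rejects.
pub fn scan_all<'a, T>(
    mut input: &'a [u8],
    marker: &[u8],
    parse: impl Fn(&'a [u8]) -> Option<T>,
) -> Vec<T> {
    let mut found = Vec::new();
    while let Some(candidate) = seek_to(input, marker) {
        if let Some(value) = parse(candidate) {
            found.push(value);
        }
        // Step one byte past the match so that later occurrences are still visited.
        input = &candidate[1..];
    }
    found
}
```

This only replaces "try the parser at every offset" with "try the parser at every marker hit". It does not address the concern about attacker-controlled data containing a well-formed but unwanted structure.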
Restioson
@Restioson
hi
is there any way to have alt((a, b)) not return (&str, &str)
this is what i have
45 | pub fn alt<I: Clone, O, E: ParseError<I>, List: Alt<I, O, E>>(l: List) -> impl Fn(I) -> IResult<I, O, E> {
   |                                                 ------------ required by this bound in `nom::branch::alt`
   |
   = note: expected enum `std::result::Result<(&str, &str), nom::internal::Err<(&str, nom::error::ErrorKind)>>`
              found enum `std::result::Result<(&str, screen::settings::administration::parse_search::Criteria<'_>), nom::internal::Err<(&str, nom::error::ErrorKind)>>`
i think the issue is the two combinators return diff types, do they each need to return Either or something
Denis Lisov
@tanriol
All the alt branches need to return the same type. For me it's usually an enum.
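The shape of that fix, sketched with hand-written parsers in plain Rust instead of nom so that it stands alone: each branch produces its own type, and each is mapped into a shared enum before the alternatives are combined. This `Criteria` enum and the `number`, `word`, and `criteria` functions are hypothetical and are not the types from the error message above.

```rust
/// Hypothetical common output type: one variant per alternative.
#[derive(Debug, PartialEq)]
pub enum Criteria<'a> {
    Number(u32),
    Word(&'a str),
}

/// Result shape used here: (remaining input, parsed value).
type ParseResult<'a, T> = Option<(&'a str, T)>;

/// Split off the longest non-empty prefix whose characters all satisfy `pred`.
fn split_while<'a>(input: &'a str, pred: impl Fn(char) -> bool) -> ParseResult<'a, &'a str> {
    let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}

/// Branch 1 yields a u32.
fn number<'a>(input: &'a str) -> ParseResult<'a, u32> {
    let (rest, digits) = split_while(input, |c| c.is_ascii_digit())?;
    Some((rest, digits.parse::<u32>().ok()?))
}

/// Branch 2 yields a &str.
fn word<'a>(input: &'a str) -> ParseResult<'a, &'a str> {
    split_while(input, |c| c.is_ascii_alphabetic())
}

/// The alternative: both branches are mapped into `Criteria` so their types agree.
pub fn criteria<'a>(input: &'a str) -> ParseResult<'a, Criteria<'a>> {
    number(input)
        .map(|(rest, n)| (rest, Criteria::Number(n)))
        .or_else(|| word(input).map(|(rest, w)| (rest, Criteria::Word(w))))
}
```

With nom the same idea would typically mean wrapping each branch so that it returns the enum before passing the branches to alt.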
Restioson
@Restioson
ok thx
Michael Dale Long
@nikarul
Hello, I'm trying to write a parser that needs to parse Python-style variable names ([A-Za-z0-9_] characters, except the first character cannot be a number). I thought I would start by writing a custom version of nom::character::complete::alphanumeric1 that also accepts underscores, but when I tried copying it to my code and modifying it, I found that nom::traits is private. Any suggestions on the idiomatic way to do this with Nom?
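For reference, the rule itself is small enough to write by hand. The sketch below is plain Rust with no nom types and treats identifiers as ASCII-only, matching the [A-Za-z0-9_] description. `identifier` is a hypothetical name, and whether a hand-rolled function or a predicate-based combinator is the more idiomatic choice in nom is left open here.

```rust
/// Split a leading Python-style identifier off `input`.
/// Returns (remaining input, identifier), or None if `input` does not start with one.
pub fn identifier(input: &str) -> Option<(&str, &str)> {
    let mut chars = input.char_indices();
    // First character: an ASCII letter or underscore, never a digit.
    match chars.next() {
        Some((_, c)) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return None,
    }
    // Following characters: ASCII letters, digits, or underscores.
    let end = chars
        .find(|&(_, c)| !(c.is_ascii_alphanumeric() || c == '_'))
        .map_or(input.len(), |(index, _)| index);
    Some((&input[end..], &input[..end]))
}
```

The return order (rest first, then the matched text) follows the convention nom parsers use, so the function could later be adapted to nom's result type.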