Christopher Durham
@CAD97
With nom your parser can recognize any recognizable language (i.e. it's Turing complete and could compute anything if you let it)
Whereas with a grammar description you're limited to CFG at most, or sometimes "semi context sensitive" (don't ask)
I'm not really comfortable suggesting one way or another. I've helped improve pest and syn::parse, used nom, and am working on my own experimental tool
But nom has increased significantly in quality recently and has the biggest userbase
Arvid E. Picciani
@aep
is there an example for a prec climbing parser?
rule expr is left-recursive (expr -> term -> expr); pest::prec_climber might be useful in this case
but there's no example in the docs of how to use it
RS
@sayrer
there's an example in the book: examples/calculator/src/main.rs
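For readers who want the idea behind precedence climbing before opening the book example, here is a minimal sketch in plain Rust. It is library-independent and does not use pest's prec_climber module; the operator table and the names op_info, primary, climb and eval are hypothetical, chosen only for illustration. It assumes whitespace-free input made of unsigned integers and the operators + - * / ^.

```rust
use std::convert::TryFrom;

// Library-independent sketch of precedence climbing. It does not use pest's
// `prec_climber` module, and every name below is hypothetical.

/// Returns (precedence, is_left_associative) for a supported binary operator.
fn op_info(op: u8) -> Option<(u8, bool)> {
    match op {
        b'+' | b'-' => Some((1, true)),
        b'*' | b'/' => Some((2, true)),
        b'^' => Some((3, false)),
        _ => None,
    }
}

/// Parses an unsigned integer literal (the "primary" of this tiny language).
fn primary(input: &[u8], pos: &mut usize) -> Option<i64> {
    let start = *pos;
    while *pos < input.len() && input[*pos].is_ascii_digit() {
        *pos += 1;
    }
    // An empty slice fails to parse, so a missing operand yields None.
    std::str::from_utf8(&input[start..*pos]).ok()?.parse().ok()
}

/// Parses an expression whose operators all have precedence >= `min_prec`.
fn climb(input: &[u8], pos: &mut usize, min_prec: u8) -> Option<i64> {
    let mut lhs = primary(input, pos)?;
    while *pos < input.len() {
        let op = input[*pos];
        let (prec, left_assoc) = match op_info(op) {
            Some(info) if info.0 >= min_prec => info,
            _ => break, // unknown or lower-precedence operator: let the caller handle it
        };
        *pos += 1;
        // A left-associative operator only accepts tighter operators on its right.
        let next_min = if left_assoc { prec + 1 } else { prec };
        let rhs = climb(input, pos, next_min)?;
        lhs = match op {
            b'+' => lhs.checked_add(rhs)?,
            b'-' => lhs.checked_sub(rhs)?,
            b'*' => lhs.checked_mul(rhs)?,
            b'/' => lhs.checked_div(rhs)?,
            // Only '^' remains, because op_info accepted the operator.
            _ => lhs.checked_pow(u32::try_from(rhs).ok()?)?,
        };
    }
    Some(lhs)
}

/// Evaluates an expression and requires that the whole input is consumed.
fn eval(src: &str) -> Option<i64> {
    let bytes = src.as_bytes();
    let mut pos = 0;
    let value = climb(bytes, &mut pos, 1)?;
    if pos == bytes.len() { Some(value) } else { None }
}
```

With this sketch, eval("1+2*3") gives Some(7), eval("10-3-2") gives Some(5) because subtraction is left-associative, and eval("2^3^2") gives Some(512) because ^ is treated as right-associative. Note that the grammar side stays flat (a primary followed by operator/primary pairs), which is what avoids the left recursion.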
mqnfred
@mqnfred

hey folks, I have a problem with a parser I just wrote. This is my grammar:

journal = { SOI ~ ((journal_item | price) ~ space* ~ NEWLINE* ~ space*)* ~ EOI }

journal_item = @{ date ~ space+ ~ description ~ entry* }

date = @{
    ASCII_DIGIT{4}
    ~ date_delimiter
    ~ ASCII_DIGIT{2}
    ~ date_delimiter
    ~ ASCII_DIGIT{2}
}
date_delimiter = _{ "/" | "-" }

description = @{ (char+ ~ space*)+ ~ NEWLINE }

entry = @{
    space{2,}
    ~ path ~ space+
    ~ float ~ space+
    ~ currency ~ space*
    ~ (entry_price ~ space*)?
    ~ NEWLINE
}

path = @{ ((char+ | ":")+ ~ space?)+ }

float = @{ minus? ~ ASCII_DIGIT+ ~ (dot ~ ASCII_DIGIT+)? }

currency = @{ quote? ~ ASCII_ALPHA_UPPER+ ~ quote? }

entry_price = @{ entry_price_prefix ~ space+ ~ float ~ space+ ~ currency }
entry_price_prefix = _{ "@@" | "@" }

price = @{
    price_prefix ~ space+
    ~ date ~ space+
    ~ currency ~ space+
    ~ float ~ space+
    ~ currency ~ NEWLINE
}

minus = { "-" }
price_prefix = _{ "P" }
space = _{ " " }
dot = _{ "." }
quote = _{ "\"" }
char = {
    ASCII_ALPHANUMERIC
    | "-"
    | "("
    | ")"
    | "/"
    | "'"
    | "î"
    | "$"
    | "&"
    | "."
    | ":"
    | "!"
}

This is a snapshot of code that I use to execute the parsing:

#[derive(Parser)]
#[grammar = "ledger.pest"]
pub struct LedgerParser;

pub fn parse(contents: String) -> Result<Ledger, LedgerError> {
    let journal = LedgerParser::parse(
        Rule::journal,
        &contents,
    ).expect("unsuccessful parsing").next().unwrap();
    extract_journal(journal)
}

fn extract_journal(journal: pest::iterators::Pair<Rule>) -> Result<Ledger, LedgerError> {
    let mut ldg = Ledger::new();

    println!("{:#?}", journal);

I get an output that makes me think the outer rule (journal) is matched:

Pair {
    rule: journal,
    span: Span {
        str: "2018/08/01 Rent from another account to start\n  Expenses:House:Rent    2400.00 USD\n  Capital:Starting balance:USD    -2400.00 USD\n\n2018/08/01 Initial USD balance of Revolut\n  Assets:Danna:Revolut:USD    80.99 USD\n  Capital:Starting balance:USD    -80.99 USD\n\n2018/08/02 Walmart groceries\n  Expenses:House:Cleaning    3.98 USD\n  Expenses:Self:Self-care:Beauty:Creams    11.97 USD\n  Expenses:House:Cleaning    0.98 USD\n  Expenses:House:Cleaning    7.48 USD\n  Expenses:Value Added Tax    2.20 USD\n  Assets:Danna:Revolut:USD    -26.61 USD\n\n2018/08/03 Uber for toto car rental\n  Expenses:Transportation:Uber    9.3 [............]
        start: 0,
        end: 161060
    },

So clearly the journal rule properly matches my whole file. The journal_item rule is also matched, as seen in the rest of the output:

    inner: [
        Pair {
            rule: journal_item,
            span: Span {
                str: "2018/08/01 Rent from another account to start\n  Expenses:House:Rent    2400.00 USD\n  Capital:Starting balance:USD    -2400.00 USD\n",
                start: 0,
                end: 130
            },
            inner: []
        },

However, as you can see, the journal_item rule's inner vector is empty. I believe that my grammar is proper given that the journal and journal_item rules match properly. But the inside of journal_item is not "captured" for me to retrieve later on. If I execute into_inner on the journal rule I get a None object.

The data I use takes the following form (I won't post too much as it happens to be my personal accounting information, in ledger-cli form):

2018/08/01 Rent from another account to start
  Expenses:House:Rent    2400.00 USD
  Capital:Starting balance:USD    -2400.00 USD

2018/08/01 Initial USD balance of Revolut
  Assets:Danna:Revolut:USD    80.99 USD
  Capital:Starting balance:USD    -80.99 USD

P 2019-07-03 EUR 7.4074 DKK
P 2019-07-03 USD 6.6489 DKK
...

I tried to quickly tweak atomicity(@)/silence(_) of the rule but nothing has changed. I am obviously missing something, what is it? Thank you for your attention

I hope this is the right place to ask these kinds of support questions; otherwise, would you be so kind as to direct me to a place where I can ask them?
Christopher Durham
@CAD97
@mqnfred by making journal_item @-atomic, pest doesn't emit any child rules. You want $-compound-atomic.
0paIescent
@0paIescent
I'm trying to convert a nom project to use pest instead, and I'm having a bit of trouble converting things over easily. Would someone be able to point me in the right direction?
mqnfred
@mqnfred

@mqnfred by making journal_item @-atomic, pest doesn't emit any child rules. You want $-compound-atomic.

Thank you that solved the problem. I did not know of the compound-atomic rules.

0paIescent
@0paIescent
I've found that when nesting repetitions I end up with very large file sizes from cargo expand, and compiling and building use up a great deal of memory. Once it even used up all 16 GB I have and I had to do a hard shutdown. Has anyone else experienced this too?
Dragoș Tiselice
@dragostis
How big is the grammar? The next version of the generator should be quite a bit more efficient.
Vincent Prouillet
@Keats
I've found an infinite loop issue (I guess) while fuzzing Tera: https://pest.rs/?bin=6yr6p#editor if you change the Rule to be if_tag it will hang forever
Vincent Prouillet
@Keats
in the generated lexer I meant
Arvid E. Picciani
@aep
is there any example of a programming language in pest? i find it extremely difficult to work around the recursion restriction
Christopher Durham
@CAD97
@aep there's no general recursion restriction
The restriction is on specifically "left recursion" or "non-advancing recursion"
The big trick is that instead of parsing e.g. expr = expr binop expr | non_binary_expr in a traditional CFG specification manner, you line up your grammar more with how you'd write a parser: maybe_binary_expr = non_binary_expr {binop non_binary_expr}*
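To make the loop-shaped rule above concrete, here is a hand-written sketch in plain Rust of what maybe_binary_expr = non_binary_expr {binop non_binary_expr}* amounts to. It is not pest-generated code: the function names mirror the rule names from the message, and the choice of unsigned integers as operands and + / - as the only operators is an assumption made for illustration.

```rust
// Hand-written sketch of `maybe_binary_expr = non_binary_expr {binop non_binary_expr}*`.
// Not pest output; operand and operator choices are assumptions for illustration.

/// Parses an unsigned integer starting at `pos`; returns the value and the next position.
fn non_binary_expr(input: &[u8], pos: usize) -> Option<(i64, usize)> {
    let mut end = pos;
    while end < input.len() && input[end].is_ascii_digit() {
        end += 1;
    }
    if end == pos {
        return None; // nothing consumed: fail rather than continue without advancing
    }
    let text = std::str::from_utf8(&input[pos..end]).ok()?;
    let value: i64 = text.parse().ok()?;
    Some((value, end))
}

/// Parses one operand, then loops over `binop operand` pairs, folding left to right.
/// There is no call back into `maybe_binary_expr` at the same position, so the
/// parser always consumes input before it repeats.
fn maybe_binary_expr(input: &[u8], pos: usize) -> Option<(i64, usize)> {
    let (mut acc, mut pos) = non_binary_expr(input, pos)?;
    while pos < input.len() {
        let op = input[pos];
        if op != b'+' && op != b'-' {
            break; // not a binop: stop and leave the rest to the caller
        }
        let (rhs, next) = non_binary_expr(input, pos + 1)?;
        acc = if op == b'+' { acc + rhs } else { acc - rhs };
        pos = next;
    }
    Some((acc, pos))
}
```

Calling maybe_binary_expr(b"10-3-2", 0) returns Some((5, 6)): the loop groups the operands as (10-3)-2, which is the same result a left-recursive expr = expr binop expr rule would be meant to produce.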
Dardan
@darddan
Hi, I'm having trouble writing a pest file for a Python-like syntax. I am working on a language where more deeply indented (tabs only) expressions are children of the less indented ones (the first less indented expression above them).
I've tried stuff with using PUSH-POP but I'm not having any success. Has anyone done something similar? can you share some example code so I can look into it?
Caleb Winston
@calebwin
When text is parsed, Pest can automatically report errors - right?
Is there a way to report errors after the text is parsed?
SasakiSaki
@GalAster
How do I define SOL (start of line)? Some expressions must start on a new line
mental
@mental32
Is it possible to not match against a rule? for instance "some" ~ NOT(ASCII_DIGIT+) ~ ("thing" | "one") to match something or someone but not some1
Laurent Wandrebeck
@lwandrebeck
@mental32: "not" is written ! in pest
mental
@mental32
@lwandrebeck Thanks!
mental
@mental32
Is it possible to conditionally silence rules?
for example, rules a and b both require that rule c matches too, but I want to silence c when matching a and not when matching b
Tesla Ice Zhang‮
@ice1000
You can ignore the .next()'s return value conditionally
mental
@mental32
yeah I ended up doing just that, thanks :)
mental
@mental32
@ice1000 does pest support stream parsing?
rbenua
@rbenua
Hi there, I'm seeing this strange message when trying to use the & behavior: https://pest.rs/?bin=1bsukd#editor
Is this a pest bug or just a confusing error message?
(and, if i'm doing it wrong, what would be the standard way to exclude a list of specified keywords from the space of identifiers? I feel like that must be a pretty common thing to have to do)
mental
@mental32
@rbenua I use this trick https://pest.rs/?bin=slfd1#editor
have a rule that matches all identifiers without discrimination, then just pair it as !keyword ~ raw_ident
If you want to allow identifiers that merely resemble keywords (e.g. True being a keyword and True_ being a valid identifier), then only count something as a keyword when it is not followed by another identifier character, i.e. exclude any keyword match that is followed by a rule such as ASCII_ALPHA | "_"
you may want to silence the raw_ident rule in that example
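The logic of the !keyword ~ raw_ident trick described above can be sketched in plain Rust as follows. This is a hand-written illustration, not pest code: the KEYWORDS list and the helper names are hypothetical, and identifiers are assumed to be ASCII letters, digits and underscores that do not start with a digit.

```rust
// Hand-written sketch of `ident = !keyword ~ raw_ident`.
// The keyword list and all function names are hypothetical.

const KEYWORDS: [&str; 3] = ["True", "False", "None"];

fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Length of the raw identifier at the start of `input`, keyword or not.
/// Identifier characters are ASCII here, so the char count equals the byte length.
fn raw_ident_len(input: &str) -> Option<usize> {
    let first = input.chars().next()?;
    if !(first.is_ascii_alphabetic() || first == '_') {
        return None;
    }
    Some(input.chars().take_while(|&c| is_ident_char(c)).count())
}

/// True only if a keyword starts here and is NOT followed by an identifier
/// character, so "True" counts as a keyword but "True_" does not.
fn keyword_at(input: &str) -> bool {
    KEYWORDS.iter().any(|&kw| {
        input.starts_with(kw)
            && !input[kw.len()..].chars().next().map_or(false, is_ident_char)
    })
}

/// Negative lookahead on `keyword`, followed by the unrestricted identifier rule.
fn ident(input: &str) -> Option<&str> {
    if keyword_at(input) {
        return None;
    }
    raw_ident_len(input).map(|len| &input[..len])
}
```

Here ident("True") is None, ident("True_") is Some("True_"), and ident("foo bar") is Some("foo"). The check in keyword_at plays the role of the "not followed by ASCII_ALPHA | "_"" filter: without it, every identifier that merely begins with a keyword would be rejected.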
Vincent Prouillet
@Keats
Has anyone got an idea on where the issue in pest-parser/pest#402 could be? Is there a way to output a generated parser as a valid Rust file?
Dragoș Tiselice
@dragostis
It probably has something to r
Dragoș Tiselice
@dragostis
... with a rule that recurses a lot. I'll try to take a better look when I get the chance. You can try to debug this with the debugger branch on pest.
Vincent Prouillet
@Keats
I'm going on holiday very soon, I'll have a look when I get back
I might have time to create a testcase at least though, where should it be?
Y0hy0h
@Y0hy0h
Hi! I have written a grammar for the R language, and while testing it with a real-world code file I noticed that nested if statements make the parser run noticeably slowly. I'm not surprised that this happens (it feels like exponential complexity growth), but I was wondering whether I could make some improvements to my grammar. I have no previous experience with parsers or PEGs.
If you'd like to have a look, I've created a bin at https://pest.rs/?bin=10fbo7#editor. You can copy and paste more while (TRUE) to the end of the file to get a feel for when it starts to become slow. (It takes a lot of nestings! :) )
By the way, pest is pretty awesome, no performance issues whatsoever except for this excessive nesting! Really cool!
Christopher Durham
@CAD97
>.> I got all excited because the pest-parser org has Actions but I still don't have access for my normal account >.>