Christopher Durham
@CAD97
@emoon you're trying to match \n in code but it's already been eaten by WHITESPACE
You either need to make code (compound) atomic to opt out of trivia or remove newlines from WHITESPACE
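To make the point above concrete, here is a minimal hand-written model in plain Rust. It is not pest's generated code: the function names and the `whitespace` parameter are hypothetical, and the only behaviour it assumes is that a normal (non-atomic) rule skips WHITESPACE between the operands of `~` while an atomic rule does not.

```rust
/// Toy model of a normal (non-atomic) sequence `a ~ b`: every character in
/// `whitespace` is skipped between the two literals, the way implicit
/// WHITESPACE* is applied between the operands of `~`.
pub fn seq_with_trivia(input: &str, a: &str, b: &str, whitespace: &[char]) -> bool {
    // Match the first literal.
    let rest = match input.strip_prefix(a) {
        Some(rest) => rest,
        None => return false,
    };
    // Skip every character that counts as trivia.
    let rest = rest.trim_start_matches(|c: char| whitespace.contains(&c));
    // Match the second literal on whatever is left.
    rest.starts_with(b)
}

/// Toy model of an atomic sequence: nothing is skipped between the literals.
pub fn seq_atomic(input: &str, a: &str, b: &str) -> bool {
    input
        .strip_prefix(a)
        .map_or(false, |rest| rest.starts_with(b))
}
```

With `'\n'` in the trivia set, `seq_with_trivia("code\n", "code", "\n", &[' ', '\n'])` fails because the newline is consumed before the second literal is tried. Dropping `'\n'` from the set, or using `seq_atomic`, makes the same input match.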
Daniel Collin
@emoon
Alright. Thanks!
Gerred Dillon
@gerred
:wave: Hi! When might I use nom or pest over each other? Trying to understand in general the benefits of different approaches in parsing. :)
Christopher Durham
@CAD97
@gerred pest is a grammar-based parser, so you write a grammar and pest spits out the parsing code. Nom is a parsing framework, so that means you write the parser using tools it gives you.
In theory, a grammar tool is easier to use and more optimizable, because you just describe what you want to parse, and the tool has total control.
Conversely, a parsing framework is more powerful because you can do whatever you want, and there exist grammar-to-nom tools as well to "automate the boring parts".
In practice, it comes down to taste
I enjoy having an automatically up-to-date, semiformal description of my grammar, but the benefits of that are almost entirely intangible
Though it also helps control complexity of the parser
Christopher Durham
@CAD97
With nom your parser can recognize any recognizable language (i.e. it's Turing complete and could compute anything if you let it)
Whereas with a grammar description you're limited to CFG at most, or sometimes "semi context sensitive" (don't ask)
I'm not really comfortable suggesting one way or another. I've helped improve pest and syn::parse, used nom, and am working on my own experimental tool
But nom has increased significantly in quality recently and has the biggest userbase
Arvid E. Picciani
@aep
is there an example for a prec climbing parser?
rule expr is left-recursive (expr -> term -> expr); pest::prec_climber might be useful in this case
but there's no example in the docs how to use it
RS
@sayrer
there's an example in the book: examples/calculator/src/main.rs
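For readers who want the idea without opening the book, here is a rough sketch of precedence climbing itself, written in plain Rust with no pest types. It is not the `pest::prec_climber` API and not the book's calculator code: `Op`, `Tok`, `binding`, `apply` and `climb` are all hypothetical names, and the flat token list stands in for the `term (op term)*` sequence a non-left-recursive grammar would produce.

```rust
use std::convert::TryFrom;

/// Binary operators of the toy expression language (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Op { Add, Sub, Mul, Pow }

/// A flat token stream: numbers separated by binary operators.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Tok { Num(i64), Bin(Op) }

/// (precedence, is_right_associative) for each operator.
fn binding(op: Op) -> (u8, bool) {
    match op {
        Op::Add | Op::Sub => (1, false),
        Op::Mul => (2, false),
        Op::Pow => (3, true),
    }
}

/// Evaluate one operator; `None` on overflow or a negative exponent.
fn apply(op: Op, lhs: i64, rhs: i64) -> Option<i64> {
    match op {
        Op::Add => lhs.checked_add(rhs),
        Op::Sub => lhs.checked_sub(rhs),
        Op::Mul => lhs.checked_mul(rhs),
        Op::Pow => lhs.checked_pow(u32::try_from(rhs).ok()?),
    }
}

/// Evaluate the whole token list; `None` if it is malformed.
pub fn climb(tokens: &[Tok]) -> Option<i64> {
    let mut pos = 0;
    let value = climb_from(tokens, &mut pos, 1)?;
    if pos == tokens.len() { Some(value) } else { None }
}

fn climb_from(tokens: &[Tok], pos: &mut usize, min_prec: u8) -> Option<i64> {
    // Every (sub)expression starts with a primary, here a number.
    let mut lhs = match tokens.get(*pos)? {
        Tok::Num(n) => *n,
        Tok::Bin(_) => return None,
    };
    *pos += 1;
    // Fold in operators that bind at least as tightly as `min_prec`.
    while let Some(Tok::Bin(op)) = tokens.get(*pos) {
        let (prec, right_assoc) = binding(*op);
        if prec < min_prec {
            break;
        }
        *pos += 1;
        // Left-associative operators require a strictly tighter right side.
        let next_min = if right_assoc { prec } else { prec + 1 };
        let rhs = climb_from(tokens, pos, next_min)?;
        lhs = apply(*op, lhs, rhs)?;
    }
    Some(lhs)
}
```

Calling `climb` on the tokens for `1 + 2 * 3` groups the multiplication first, and the tokens for `2 ^ 3 ^ 2` group to the right because `Pow` is marked right-associative.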
mqnfred
@mqnfred

hey folks, I have a problem with a parser I just wrote. This is my grammar:

journal = { SOI ~ ((journal_item | price) ~ space* ~ NEWLINE* ~ space*)* ~ EOI }

journal_item = @{ date ~ space+ ~ description ~ entry* }

date = @{
    ASCII_DIGIT{4}
    ~ date_delimiter
    ~ ASCII_DIGIT{2}
    ~ date_delimiter
    ~ ASCII_DIGIT{2}
}
date_delimiter = _{ "/" | "-" }

description = @{ (char+ ~ space*)+ ~ NEWLINE }

entry = @{
    space{2,}
    ~ path ~ space+
    ~ float ~ space+
    ~ currency ~ space*
    ~ (entry_price ~ space*)?
    ~ NEWLINE
}

path = @{ ((char+ | ":")+ ~ space?)+ }

float = @{ minus? ~ ASCII_DIGIT+ ~ (dot ~ ASCII_DIGIT+)? }

currency = @{ quote? ~ ASCII_ALPHA_UPPER+ ~ quote? }

entry_price = @{ entry_price_prefix ~ space+ ~ float ~ space+ ~ currency }
entry_price_prefix = _{ "@@" | "@" }

price = @{
    price_prefix ~ space+
    ~ date ~ space+
    ~ currency ~ space+
    ~ float ~ space+
    ~ currency ~ NEWLINE
}

minus = { "-" }
price_prefix = _{ "P" }
space = _{ " " }
dot = _{ "." }
quote = _{ "\"" }
char = {
    ASCII_ALPHANUMERIC
    | "-"
    | "("
    | ")"
    | "/"
    | "'"
    | "î"
    | "$"
    | "&"
    | "."
    | ":"
    | "!"
}

This is a snapshot of code that I use to execute the parsing:

#[derive(Parser)]
#[grammar = "ledger.pest"]
pub struct LedgerParser;

pub fn parse(contents: String) -> Result<Ledger, LedgerError> {
    let journal = LedgerParser::parse(
        Rule::journal,
        &contents,
    ).expect("unsuccessful parsing").next().unwrap();
    extract_journal(journal)
}

fn extract_journal(journal: pest::iterators::Pair<Rule>) -> Result<Ledger, LedgerError> {
    let mut ldg = Ledger::new();

    println!("{:#?}", journal);

I get an output that makes me think the outer rule (journal) is matched:

Pair {
    rule: journal,
    span: Span {
        str: "2018/08/01 Rent from another account to start\n  Expenses:House:Rent    2400.00 USD\n  Capital:Starting balance:USD    -2400.00 USD\n\n2018/08/01 Initial USD balance of Revolut\n  Assets:Danna:Revolut:USD    80.99 USD\n  Capital:Starting balance:USD    -80.99 USD\n\n2018/08/02 Walmart groceries\n  Expenses:House:Cleaning    3.98 USD\n  Expenses:Self:Self-care:Beauty:Creams    11.97 USD\n  Expenses:House:Cleaning    0.98 USD\n  Expenses:House:Cleaning    7.48 USD\n  Expenses:Value Added Tax    2.20 USD\n  Assets:Danna:Revolut:USD    -26.61 USD\n\n2018/08/03 Uber for toto car rental\n  Expenses:Transportation:Uber    9.3 [............]
        start: 0,
        end: 161060
    },

So clearly the journal rule properly matches my whole file. The journal_item rule is also matched, as seen in the rest of the output:

    inner: [
        Pair {
            rule: journal_item,
            span: Span {
                str: "2018/08/01 Rent from another account to start\n  Expenses:House:Rent    2400.00 USD\n  Capital:Starting balance:USD    -2400.00 USD\n",
                start: 0,
                end: 130
            },
            inner: []
        },

However, as you can see, the journal_item rule's inner vector is empty. I believe that my grammar is proper given that the journal and journal_item rules match properly. But the inside of journal_item is not "captured" for me to retrieve later on. If I execute into_inner on the journal rule I get a None object.

The data I use takes the following form (I won't post too much as it happens to be my personal accounting information, in ledger-cli form):

2018/08/01 Rent from another account to start
  Expenses:House:Rent    2400.00 USD
  Capital:Starting balance:USD    -2400.00 USD

2018/08/01 Initial USD balance of Revolut
  Assets:Danna:Revolut:USD    80.99 USD
  Capital:Starting balance:USD    -80.99 USD

P 2019-07-03 EUR 7.4074 DKK
P 2019-07-03 USD 6.6489 DKK
...

I tried to quickly tweak atomicity(@)/silence(_) of the rule but nothing has changed. I am obviously missing something, what is it? Thank you for your attention

I hope this is the right place to ask these kinds of support questions; otherwise, would you be so kind as to direct me to a place where I can ask them?
Christopher Durham
@CAD97
@mqnfred by making journal_item @-atomic, pest doesn't emit any child rules. You want $-compound-atomic.
0paIescent
@0paIescent
I'm trying to convert a nom project to use pest instead, and I'm having a bit of trouble converting things over easily. Would someone be able to point me in the right direction?
mqnfred
@mqnfred

@mqnfred by making journal_item @-atomic, pest doesn't emit any child rules. You want $-compound-atomic.

Thank you, that solved the problem. I did not know about compound-atomic rules.

0paIescent
@0paIescent
I've found that when nesting repetitions I end up with very large output from cargo expand, and compiling uses a great deal of memory. Once it even used up all 16 GB I have and I had to do a hard shutdown. Has anyone else experienced this too?
Dragoș Tiselice
@dragostis
How big is the grammar? The next version of the generator should be quite a bit more efficient.
Vincent Prouillet
@Keats
I've found an infinite loop issue (I guess) while fuzzing Tera: https://pest.rs/?bin=6yr6p#editor if you change the Rule to be if_tag it will hang forever
Vincent Prouillet
@Keats
in the generated lexer I meant
Arvid E. Picciani
@aep
is there any example of a programming language in pest? i find it extremely difficult to work around the recursion restriction
Christopher Durham
@CAD97
@aep there's no general recursion restriction
The restriction is on specifically "left recursion" or "non-advancing recursion"
The big trick is that instead of parsing e.g. expr = expr binop expr | non_binary_expr in a traditional CFG specification manner, you line up your grammar more with how you'd write a parser: maybe_binary_expr = non_binary_expr {binop non_binary_expr}*
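As a sketch of what that reshaped rule looks like when written as a parser, here is a hand-written version in plain Rust. In pest syntax the rule would presumably be spelled non_binary_expr ~ (binop ~ non_binary_expr)*. The function names mirror the rule names above, but the details are assumptions for illustration only: operands are plain digit runs, the only operators are `+` and `-`, and no whitespace is allowed.

```rust
/// non_binary_expr: one or more ASCII digits (an assumption for this sketch).
/// Returns the parsed value and the unconsumed rest of the input.
fn non_binary_expr(input: &str) -> Option<(i64, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None; // no digits: the rule does not match
    }
    let value = input[..end].parse::<i64>().ok()?;
    Some((value, &input[end..]))
}

/// maybe_binary_expr = non_binary_expr (binop non_binary_expr)*
/// The loop replaces the left-recursive `expr = expr binop expr`, and folding
/// into `acc` as we go keeps the result left-associative.
pub fn maybe_binary_expr(input: &str) -> Option<i64> {
    let (mut acc, mut rest) = non_binary_expr(input)?;
    while let Some(op) = rest.chars().next() {
        if op != '+' && op != '-' {
            return None; // trailing input that is not a binop
        }
        // Both operators are one byte long, so slicing at 1 is safe.
        let (rhs, tail) = non_binary_expr(&rest[1..])?;
        acc = if op == '+' {
            acc.checked_add(rhs)?
        } else {
            acc.checked_sub(rhs)?
        };
        rest = tail;
    }
    Some(acc)
}
```

Every pass through the loop consumes at least one operator and one digit, so the parser always advances. `maybe_binary_expr("10-4-3")` evaluates as `(10-4)-3`.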
Dardan
@darddan
Hi, I'm having trouble writing a pest file for a Python-like syntax. I am working on a language where more deeply indented (tabs only) expressions are children of the less indented ones (the first less-indented expression above them).
I've tried stuff with using PUSH-POP but I'm not having any success. Has anyone done something similar? can you share some example code so I can look into it?
caleb winston
@calebwin
When text is parsed, Pest can automatically report errors - right?
Is there a way to report errors after the text is parsed?
SasakiSaki
@GalAster
How do I define SOL (start of line)? Some expressions must start on a new line
Tesla Ice Zhang
@ice1000
mental
@mental32
Is it possible to not match against a rule? for instance "some" ~ NOT(ASCII_DIGIT+) ~ ("thing" | "one") to match something or someone but not some1
Laurent Wandrebeck
@lwandrebeck
@mental32 : not is ! with pest
mental
@mental32
@lwandrebeck Thanks!
mental
@mental32
Is it possible to conditionally silence rules?
For example, rules a and b both require that rule c matches too, but I want to silence c when matching a and not when matching b
Tesla Ice Zhang
@ice1000
You can ignore the .next()'s return value conditionally
mental
@mental32
yeah I ended up doing just that, thanks :)
mental
@mental32
@ice1000 does pest support stream parsing?
rbenua
@rbenua
Hi there, I'm seeing this strange message when trying to use the & behavior: https://pest.rs/?bin=1bsukd#editor
Is this a pest bug or just a confusing error message?
(and, if i'm doing it wrong, what would be the standard way to exclude a list of specified keywords from the space of identifiers? I feel like that must be a pretty common thing to have to do)
mental
@mental32
@rbenua I use this trick https://pest.rs/?bin=slfd1#editor
have a rule that matches all identifiers without discrimination, then just pair it as !keyword ~ raw_ident
If you want to allow identifiers that merely start with a keyword (say True is a keyword and True_ is a valid identifier), then only count a keyword as matched when it is not followed by another identifier character such as ASCII_ALPHA | "_"
you may want to silence the raw_ident rule in that example
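The same trick can be modelled by hand in plain Rust, which may help show why the extra lookahead matters. This is only a sketch of the logic, not the grammar behind the link above: the `KEYWORDS` list and the function names are hypothetical, and identifier characters are assumed to be ASCII letters, digits and `_`.

```rust
/// Hypothetical keyword list for this sketch.
const KEYWORDS: &[&str] = &["True", "False", "if"];

fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// raw_ident: a letter or underscore followed by identifier characters,
/// with no keyword check at all.
fn raw_ident(input: &str) -> Option<&str> {
    let first = input.chars().next()?;
    if !(first.is_ascii_alphabetic() || first == '_') {
        return None;
    }
    let end = input
        .find(|c: char| !is_ident_char(c))
        .unwrap_or(input.len());
    Some(&input[..end])
}

/// keyword: a keyword literal that is NOT followed by another identifier
/// character, so "True" is a keyword but the "True" inside "True_" is not.
fn keyword(input: &str) -> Option<&'static str> {
    KEYWORDS.iter().copied().find(|kw| match input.strip_prefix(*kw) {
        Some(rest) => rest.chars().next().map_or(true, |c| !is_ident_char(c)),
        None => false,
    })
}

/// ident = !keyword ~ raw_ident
/// The negative lookahead consumes nothing; it only vetoes the match.
pub fn ident(input: &str) -> Option<&str> {
    if keyword(input).is_some() {
        return None;
    }
    raw_ident(input)
}
```

Here `ident("True")` is rejected, while `ident("True_")` and `ident("Truest = 1")` are accepted as `True_` and `Truest`, because the keyword check fails as soon as another identifier character follows.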