Samuel Kyletoft
I'm trying to figure out how to match a string
As in normal string rules. Starting and ending with '"', and anything other than " in between.
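A minimal sketch of such a rule, assuming no escape sequences are needed (the rule names `string` and `inner` are made up for illustration):

```pest
// A double-quoted string without escapes.
// `$` keeps implicit WHITESPACE out of the quotes; `@` makes the content a single token.
string = ${ "\"" ~ inner ~ "\"" }
inner  = @{ (!"\"" ~ ANY)* }
```

`!"\"" ~ ANY` reads as "any character, as long as it is not a double quote".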
Jason Qin
would it be viable to build something like Scala in pest, given that it's a context-sensitive grammar?
is anyone here?
I'm trying to use this in an async environment
but Pairs is not Send so I'm getting this error
    = help: within `impl futures::Future<Output = ()>`, the trait `std::marker::Send` is not implemented for `std::rc::Rc<std::vec::Vec<pest::iterators::queueable_token::QueueableToken<parse::Rule>>>`
note: future is not `Send` as this value is used across an await
   --> src/parse.rs:113:16
70  |       let pairs = if let Ok(pairs) = TimeParser::parse_time(&message.content) {
    |           ----- has type `pest::iterators::Pairs<'_, parse::Rule>` which is not `Send`
113 |           .exec()
    |  ________________^
114 | |         .await?;
    | |______________^ await occurs here, with `pairs` maybe used later
128 |   }
    |   - `pairs` is later dropped here
Andrzej Lichnerowicz

Hi y'all. What is the PEG/pest way to recognize user-defined types in a parser? As in, I can have a rule that goes like

type = { ^"void" | ^"bool" | ^"int" | ... }

but the type should also include "user_defined_type", something that the parser picks up as it goes. Is it doable at the parser/grammar level, or should I allow any IDENTIFIER and validate user-defined types at run time while traversing the AST, or even do something like an AST pre-run?
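A pest grammar has no symbol table, so the second option is probably the practical one: accept any identifier where a type may appear and check it against the declared types afterwards. A rough sketch, where `type_name`, `builtin_type` and `identifier` are illustrative names rather than anything from the original grammar:

```pest
// Built-in names first, then any identifier. Whether the identifier names a
// declared type is checked after parsing, not by the grammar.
type_name    = { builtin_type | identifier }
// The lookahead keeps `int` from matching the start of a longer name such as `integer_t`.
builtin_type = @{ (^"void" | ^"bool" | ^"int") ~ !(ASCII_ALPHANUMERIC | "_") }
identifier   = @{ (ASCII_ALPHA | "_") ~ (ASCII_ALPHANUMERIC | "_")* }
```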

Scott Tadman
If I have a grammar that has one weird rule, is there a way to break that out and express it in code instead of deriving it?
Scott Tadman
Looking at a situation where I'm looking to parse some framing, where {4}xxxxYYY captures the 4 x characters, but not the Y. In other words, the repeat count is defined in the {...} part.

Hello I am testing a simple example to get familiar with PEST grammar syntax. I am trying to get every instance of ++ throughout the string but I am running into some issues. I think it may be an issue with the ANY keyword but I am not sure. Can anyone help point me in the right direction as to what is going wrong?

Here is my grammar.pest file

incrementing = {(prefix ~ ANY+ ~ "++" ~ suffix)}

prefix = {(NEWLINE | WHITESPACE)*}
suffix = {(NEWLINE | WHITESPACE)*}
WHITESPACE = _{ " " }

Here is my test case

//parses a file with a matching rule and returns all instances of the rule
fn parse_file_contents_for_rule(rule: Rule, file_contents: &str) -> Option<Pairs<Rule>> {
    SolgaParser::parse(rule, file_contents).ok()
}

fn parse_incrementing(file_contents: &str) {
    //parse the file for the rule
    let targets = parse_file_contents_for_rule(Rule::incrementing, file_contents);

    //if there are matches
    if let Some(targets) = targets {
        //iterate through all of the matches
        for target in targets {
            println!("{}", target.as_str());
        }
    }
}

fn test_parse_incrementing() {
    let file_contents = r#"






Scott Tadman
You probably need a matcher on "all but +" and then a specific test for ++ vs. + and something else.
As in: dplus = { "++" } and string = { (dplus | ANY_minus_plus)* } and then count dplus tokens.
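The original rule fails because `ANY+` is greedy: it consumes the rest of the input and PEG repetitions do not give characters back, so `"++"` never gets a chance to match. A sketch of the suggestion above, with `other` standing in for "ANY_minus_plus" and `source` as a made-up top-level rule:

```pest
// Every "++" becomes a `dplus` pair; everything else is consumed silently.
dplus  = { "++" }
other  = _{ !"++" ~ ANY }
source = { SOI ~ (dplus | other)* ~ EOI }
```

Counting the `dplus` pairs inside `source` then gives the number of `++` occurrences.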
does anyone know how I would write this in pest grammar?

 * Any printable character except single quote or back slash.
fragment SingleQuotedPrintable: [\u0020-\u0026\u0028-\u005B\u005D-\u007E];
 * Any printable character except double quote or back slash.
fragment DoubleQuotedPrintable: [\u0020-\u0021\u0023-\u005B\u005D-\u007E];
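One possible translation, assuming that silent rules are an acceptable stand-in for ANTLR fragments (the snake_case names are my own choice):

```pest
// Printable ASCII except single quote (U+0027) and backslash (U+005C).
single_quoted_printable = _{ '\u{20}'..'\u{26}' | '\u{28}'..'\u{5B}' | '\u{5D}'..'\u{7E}' }
// Printable ASCII except double quote (U+0022) and backslash (U+005C).
double_quoted_printable = _{ '\u{20}'..'\u{21}' | '\u{23}'..'\u{5B}' | '\u{5D}'..'\u{7E}' }
```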
Is there a way to automatically convert the Pair<Rules> into an enum I have made without having to manually do it myself?

why doesn't assignment work?

assignmentOperator = { "=" }
declarationKeyword = _{ "const" | "global" }
assignment = {
    (declarationKeyword ~ WHITESPACE+)? ~ "test" ~ assignmentOperator ~ "test"
}

I'm trying to make it so an assignment will be test = test (with any whitespace in between), but you can add optional keywords (const or global) before test, as long as you have one or more pieces of whitespace between that keyword and the first test
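A likely cause, assuming `WHITESPACE` is defined elsewhere in the grammar: inside a normal rule every `~` already skips whitespace implicitly, so by the time the explicit `WHITESPACE+` runs there is nothing left for it to match. A sketch that moves the mandatory gap into a compound-atomic rule (`declaration` is a made-up name, and the `WHITESPACE` definition is an assumption):

```pest
WHITESPACE         = _{ " " | "\t" }
assignmentOperator = { "=" }
declarationKeyword = _{ "const" | "global" }
// `$` switches implicit whitespace off, so the explicit WHITESPACE+ is what consumes the gap.
declaration        = ${ declarationKeyword ~ WHITESPACE+ }
assignment         = { declaration? ~ "test" ~ assignmentOperator ~ "test" }
```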

Anyone have any potential solutions for the problem I describe in #601, or any thoughts about the potential addition of anchors?
Is there a way to implicitly add NEWLINE between tokens like you can with WHITESPACE?
Couldn't you define WHITESPACE as ( " " | "\n" ) or smth similar?
@c0ba1t:matrix.org wow, this is so simple, I am not sure how I didn't think of this. Thank you, it works perfectly!
Artavazd Balaian

Hello, team. I'm upgrading a library that uses pest from 1.0 to 2.1.3 and I'm getting the following error for the grammar:

19 | #[derive(Parser)]
   |          ^^^^^^
   = help: message: grammar error

              --> 496:5
           496 |     (":" ~ value_name)? ~ ("supports" ~ interface_name)?␊
               |     ^--------------------------------------------------^
               = expression cannot fail; following choices cannot be reached

Link to the grammar: https://gist.github.com/REASY/f047054e7e19cb7e02acad0caaaaa356

I'm not sure I understand the error. Could you, please, help to understand? Thanks.
(Unfortunately, https://pest.rs/#editor does not work for me)
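A guess at what the message means, without having checked the linked grammar: the highlighted expression consists only of optional parts, so it also matches the empty string and can never fail. If it appears as an alternative in a `|` choice, every alternative after it is dead code, and pest 2 rejects that. A reduced sketch, where `header` and `fallback` are hypothetical names:

```pest
// Rejected: the first alternative always succeeds, so `fallback` is unreachable.
// header = { (":" ~ value_name)? ~ ("supports" ~ interface_name)? | fallback }

// Accepted: the alternative that cannot fail comes last.
header = { fallback | (":" ~ value_name)? ~ ("supports" ~ interface_name)? }
```

Reordering changes which alternative wins, so whether this is the right fix depends on what the rule is meant to accept.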

Garret Fick

I'm trying to parse a grammar that's not working and I've not been able to figure out why. The pest editor shows the same behaviour. I expect this to resolve to a configuration_declaration.


WHITESPACE = _{ " " | "\n" | "\t" | "\r" }
identifier = { ASCII_ALPHA+ }
configuration_name = { identifier }
configuration_declaration = { "CONFIGURATION" ~ configuration_name ~ "END_CONFIGURATION" }



This doesn't match, but I don't know why.
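The test input is not shown, so this is only a likely cause, assuming something like `CONFIGURATION config END_CONFIGURATION`: in a normal rule `ASCII_ALPHA+` allows implicit whitespace between the letters, so `identifier` runs on through the space and into `END`, and the closing literal can no longer match. Making `identifier` atomic is the usual fix:

```pest
WHITESPACE = _{ " " | "\n" | "\t" | "\r" }
// `@` turns off implicit whitespace inside the rule, so an identifier stops at the first space.
identifier = @{ ASCII_ALPHA+ }
configuration_name = { identifier }
configuration_declaration = { "CONFIGURATION" ~ configuration_name ~ "END_CONFIGURATION" }
```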

Sreeja S Nair

Hello, I am testing a simple example to get familiar with PEST. I have the following.

SLASH = {"/"}
EMPTY = {" "}

double_wild = {"**"}
single_wild = {"*"}

multiple_single_wild = {single_wild ~ SLASH ~ multiple_single_wild | single_wild}

expression = {
    double_wild ~ EMPTY
    | multiple_single_wild ~ SLASH ~ double_wild ~ EMPTY
    | multiple_single_wild ~ EMPTY }

When I use */** as the test string, ideally it should match the second line of the expression. Unfortunately I get an error expected EMPTY or SLASH. My understanding is that multiple_single_wild is getting the precedence, and the parser is trying to match the third line of expression. Any idea on how to fix this? Is it a known issue with PEGs?
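This is ordinary PEG behaviour rather than a pest bug: `single_wild` also matches the first `*` of `**`, so `multiple_single_wild` consumes `*/*` and leaves a lone `*`, and the parser does not go back into a rule that has already succeeded. One possible fix is a negative lookahead so that a single wildcard never matches the start of `**`. Only `single_wild` changes in this sketch:

```pest
SLASH = { "/" }
EMPTY = { " " }

double_wild = { "**" }
// A "*" that is not immediately followed by another "*".
single_wild = { "*" ~ !"*" }

multiple_single_wild = { single_wild ~ SLASH ~ multiple_single_wild | single_wild }

expression = {
    double_wild ~ EMPTY
    | multiple_single_wild ~ SLASH ~ double_wild ~ EMPTY
    | multiple_single_wild ~ EMPTY }
```

Every alternative still ends in `EMPTY`, so the input needs a trailing space to match.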

Abhijit Sarkar

I just came across pest today. I've a rule as follows:

defn = { COLON ~ (word+ ~ (int ~ binOp)? | cmd*) ~ SEMICOLON }

This produces a match with everything in the curly braces. Is there a way to exclude COLON and SEMICOLON?
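Two things that may help, assuming `COLON` and `SEMICOLON` are plain rules for ":" and ";" (their definitions are not shown): marking them silent with `_` keeps them out of the produced pairs, and wrapping the inside in its own rule gives a span without the delimiters. `defn_body` is a made-up name:

```pest
COLON     = _{ ":" }
SEMICOLON = _{ ";" }
defn      = { COLON ~ defn_body ~ SEMICOLON }
// Spans only what sits between the delimiters.
defn_body = { word+ ~ (int ~ binOp)? | cmd* }
```

The text of `defn` itself still covers the colon and semicolon; the text of `defn_body` does not.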

Christopher Durham
@/all sorry for the ping, but this is important. https://github.com/pest-parser/pest/discussions/606
if I have something such as this to match expressions made up of terms and operators, how can I stop expression from capturing excess whitespace at the end of it? I don't want it to not match if there is whitespace after the last term, but I don't want that whitespace to be included in the expression rule
program = {
    SOI ~ expression ~ EOI

expression = {
    term ~ (operator ~ term)*

term = {

operator = {
    "+" | "-"

    " "
If I were to parse something such as this "1 + 2 " it parses successfully, but the span of the expression rule includes the final whitespace when I do not want it to
hello, where could i find the definition of the builtin rules such as TITLECASE_LETTER
# field with intended
Is it possible to create pattern like this?


# this is comment

section comment


# this is comment
i cant figure out why pest wont recognize the whitespace.
WHITESPACE = { " "|"\t" }
attributeType = { typing ~ "Attribute" }
typing = {("Sync" | "Persistent")}
attributeName = {'A'..'Z'~'a'..'z'}
attributeRow = { attributeType ~ WHITESPACE+ ~ attributeName }
it wont work without the + and it does not work with it. But i have to check for spaces between attributeType and attributeName
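What probably happens: in a normal rule the `~` before `WHITESPACE+` already skips the spaces implicitly, so the explicit `WHITESPACE+` finds nothing. Also, `attributeName` as written matches exactly one uppercase letter followed by one lowercase letter. A sketch with the mandatory gap in a compound-atomic rule, assuming names look like `Name` (one capital, then lowercase letters):

```pest
WHITESPACE    = _{ " " | "\t" }
typing        = { "Sync" | "Persistent" }
attributeType = ${ typing ~ "Attribute" }
attributeName = @{ 'A'..'Z' ~ ('a'..'z')* }
// `$` disables implicit whitespace, so WHITESPACE+ enforces at least one space or tab.
attributeRow  = ${ attributeType ~ WHITESPACE+ ~ attributeName }
```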
hey, could you guys help me to parse something like https://www.juniper.net/documentation/us/en/software/junos/junos-xml-protocol/topics/concept/junos-xml-protocol-configuration-mapping-to-json.html
those "CLI Configuration Statements" ones. esp. those sections like system { .... } or sub section { ... } ones
it's my first time with PEG
and it looks like either the editor at pest.rs is broken or the INI example doesn't work anymore
other examples also seem to be broken there so i suppose it's the editor on pest.rs that's broken then?
Tomas Tauber
there's indeed one open issue about the INI example: pest-parser/book#19
Hi friends! Does anyone know how I can write a rule that matches anything but X?
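The usual idiom is a negative lookahead followed by `ANY`. A small sketch, with `"X"` as a placeholder for whatever should be excluded and made-up rule names:

```pest
// One character, provided the input does not start with "X" at this point.
not_x   = { !"X" ~ ANY }
// Everything up to, but not including, the next "X".
until_x = @{ (!"X" ~ ANY)* }
```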
any ideas how to parse that properly? http://dpaste.com/D4W46XL4J that's just a very basic/brief example. esp. those sections like protocols->bgp->group->someisp->neighbor and i need to access those values afterwards

good morning. I'm trying to parse a simple return statement, but I'm having trouble with significant whitespace.

my grammar:

return_statement = 


RETURN_KEYWORD = _{ "return" }

expression = {

"return 0" should parse successfully, but "return0" should not. with the WHITESPACE in the return_statement rule it doesnt match at all, but if I remove it "return0" without whitespace matches.

Yes, it has to do with the WHITESPACE name; it succeeds when you rename the rule to something else
James Harton
Is there a way to trace the execution of the parser to understand what productions it tried before failing?
Jesse Jafa
Hello, I'm trying to match a string that starts with one or more #, contains any WHITE_SPACE | ALPHANUMERIC characters & zero or one NEWLINE. But this seems to match more than one newline...?
comment = {"#"+ ~ (ASCII_ALPHANUMERIC | WHITE_SPACE)+ ~ NEWLINE{0, 1}}
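The built-in `WHITE_SPACE` is the Unicode White_Space property, which includes line breaks, so the repetition swallows the newlines before `NEWLINE{0, 1}` is ever tried. A sketch that assumes only spaces and tabs should be allowed inside the comment:

```pest
// " " and "\t" instead of WHITE_SPACE, so the body cannot consume line breaks.
comment = @{ "#"+ ~ (ASCII_ALPHANUMERIC | " " | "\t")+ ~ NEWLINE? }
```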
Anchor Modeling
@lwandrebeck Do you have any insight into the current development status, i.e. whether the project is or will be actively maintained?
Vladimir Uogov

When trying to run samples from book, got a message:

error[E0468]: an extern crate loading macros must be at the crate root
--> src/lang/mod.rs:4:1
4 | extern crate pest_derive;
| ^^^^^^^^^^^^^^^^^^^^^^^^^

Laurent Wandrebeck
@anchormodeling_twitter project is active, several releases these last weeks :)
Raphaël Duhen
Hi, is there a way to have ~ mean mandatory whitespace instead of optional? I'm making an esolang that's almost only words from a conlang and it's a bit unwieldy to make nearly every rule atomic and then adding WHITESPACE+ between tokens and rules...
Omid Rad

Since I'm new here, let me thank the team for the library <3
And then ask my question...
Is it possible to create pest grammar with macros? Or even dynamically at runtime?

(use case: I would like to give my lib's users the ability to add some specific logic to the defined grammar)