Erez Shinan
@erezsh
Speed, mostly
Timo Furrer
@timofurrer
considering my gherkin parsing? (you might remember that)
okay
I'll give it a shot
maybe my grammar already works with Earley
Erez Shinan
@erezsh
It probably does. Maybe you'll need to adjust some rule weights (for example where STRING_NO_NL is used)
Timo Furrer
@timofurrer
okay
so the weight has to be higher then?
Erez Shinan
@erezsh
No, still lower. But Earley doesn't accept weights on terminals, only on rules
Timo Furrer
@timofurrer
okay
so what happens if I have a rule with a weight and it only matches a single terminal?
Erez Shinan
@erezsh
Weights on rules only help Earley disambiguate
When two rules can match, it will choose the higher priority
Oh, also you'll probably need lexer="dynamic_complete"
Timo Furrer
@timofurrer
okay
what's that?
Erez Shinan
@erezsh
It will make sure Earley considers partial token matches as well. I'm not sure you need it, but maybe it will be useful for STRING_NO_NL
Timo Furrer
@timofurrer
Okay
I'll check it out later
Erez Shinan
@erezsh
Okay. Although if you're considering all this just to avoid "Scenario:", it's probably not worth it
Oliver Epper
@oliverepper

Hi all! I've just recently learned about lark. I want to use it as a substitute for a hand-crafted lexer/parser for very simple commands in an app. Is it somehow possible to have a grammar that works like this:

start: (("LOAD"|"CREATE") WS text WS APPEND WS)? text

text: ESCAPED_STRING

APPEND: "->"

%import common.ESCAPED_STRING
%import common.WS

but without the quotes that escape the string? 'text' should basically match everything but APPEND.

Erez Shinan
@erezsh

@oliverepper Yes, it's possible. Using terminals that match everything can be a little tricky, but it's usually possible to make it work. You can do something like

text.0: /.*/

That means it should try to match everything, but with a lower priority than other rules. Also read my conversation with timo, just before your message. It's somewhat relevant

Oliver Epper
@oliverepper

@erezsh Thanks a lot for your answer. I guess I am still missing something.

text.0: /.*/

gives me the following error: Dynamic Earley doesn't allow zero-width regexps.

text.0: /.+/

matches the whole input.

my_grammar="""
        start.1: (("LOAD"|"CREATE") WS text WS APPEND WS)? text

        text.0: /.*/
        APPEND: "->"

        %import common.WS
        """

I am testing the following text:

CREATE Name of a thing -> Entry for the thing
Erez Shinan
@erezsh
Yeah, my bad, it's .+
You should probably use %ignore WS (unless you have a very good reason not to)
Erez Shinan
@erezsh
Try this code:
from lark import Lark
g = r"""
    start: load | create
    load: "LOAD" text "->" text
    create: "CREATE" text "->" text

    text.-100: /.+/

    %import common.WS
    %ignore WS
"""

p = Lark(g, lexer="dynamic_complete")
print(p.parse("CREATE Name of a thing -> Entry for the thing"))
@oliverepper
Oliver Epper
@oliverepper

@erezsh Perfect!

g=r"""
    start: load | create | text
    load.100: "LOAD" text "->" text
    create.100: "CREATE" text "->" text

    text.-100: /.+/

    %import common.WS
    %ignore WS
"""

That does exactly what I want. Lark is really nice. I love adding lark to my toolbox. Thanks for your help!

Erez Shinan
@erezsh
Happy to help
João Henrique
@JohnnyonFlame
Hello, first, thanks for the great library
I'm writing a little DSL for a thesis, and I'm wondering if there's any more "gracious" way of handling the Transformer class items, right now I'm doing things like:
    def binary_op(self, items):
        left, op, right = items
        binop = ast.BinOp()
        binop.left = left
        binop.right = op
        binop.op = right
        binop.lineno, binop.col_offset = find_first_loc([left, right])
        return binop
Erez Shinan
@erezsh
You should look into Python's dataclass
João Henrique
@JohnnyonFlame
but things start getting a little.... hairy, especially when I have one or more optional rules in the grammar
Erez Shinan
@erezsh
Even just creating an __init__ method for each object will really help
Also, you have a bug.. binop.right = op can't be what you meant
João Henrique
@JohnnyonFlame
you are right, I was refactoring some code and didn't mean that
Erez Shinan
@erezsh
What I usually do, is
class BinOp(Expr):
    def __init__(self, left, op, right):
      self.left = left
      .....

class MyTransformer(Transformer):
    binary_op = BinOp
João Henrique
@JohnnyonFlame
I'm abusing Python's AST a lil bit, they do have init but it gets really long with some of the longer ast nodes
Erez Shinan
@erezsh
You'll have some special cases, but it will really clean things up
Dataclasses can help you do this without writing an init
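A minimal sketch of the dataclass suggestion above, assuming a hypothetical `BinOp` node whose `left`, `op`, and `right` fields mirror the snippet earlier in the conversation (none of these names come from Lark itself). The `@dataclass` decorator generates `__init__`, so the callback only has to unpack `items`:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class BinOp:
    # Hypothetical node; field order matches the order of the rule's children.
    left: Any
    op: str
    right: Any


def binary_op(items):
    # Plays the role of a Transformer callback: it receives the list of children.
    return BinOp(*items)


node = binary_op([1, "+", 2])
print(node)  # BinOp(left=1, op='+', right=2)
```

To use the class itself as the callback (`binary_op = BinOp`), the children have to arrive as positional arguments; Lark's `v_args(inline=True)` decorator on the transformer is one way to arrange that.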
João Henrique
@JohnnyonFlame
that looks like a good start, thanks
Erez Shinan
@erezsh
np
João Henrique
@JohnnyonFlame
any best practices for optional rules (eg, should I just alias each production?) and tokens that I absolutely need positional data of (even tho they don't have relevant semantic info)
since Tree instances don't have positional info and defining "!rule -> 'MATCH'" seems to be considered a code smell
João Henrique
@JohnnyonFlame
guess I'll just resort to aliasing even tho it causes some minor code dupe
Erez Shinan
@erezsh
I'm not sure what you're asking
But you should be aware of the maybe_placeholders flag
It's still new, so not well documented, but basically it means that optionals defined like [rule] (rule can be anything) return None if there's no match, instead of disappearing from the tree
João Henrique
@JohnnyonFlame
that is precisely what I need