    Andy Grove
    @andygrove
    instead of having one ToString implementation to write an AST back out to SQL, it should be dialect specific
    so the Dialect trait could have a write_sql(ast: &SQLQuery) for example
    Really, the Dialect trait should have a parse_sql(sql: &str) -> Result<Box<SQLQuery>> too
    and then we can use composition
    so the HiveQLDialect could delegate to ANSIDialect for many things
    where it gets tricky though is the lack of inheritance
    once HiveQLDialect delegates to ANSIDialect we can't really call back into HiveQLDialect
    that's the part I was struggling with before
    in Java I would have HiveQLDialect extend ANSIDialect and just override certain methods
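The delegation problem described above can be shown with a minimal sketch. The types and method bodies here are hypothetical stand-ins (the "parsers" just return labels), not the sqlparser-rs API:

```rust
// Hypothetical, heavily simplified dialects: "parsing" just returns a label.
struct AnsiDialect;

impl AnsiDialect {
    fn parse_expr(&self) -> String {
        "ansi-expr".to_string()
    }

    // The shared SELECT logic calls parse_expr on AnsiDialect itself.
    fn parse_select(&self) -> String {
        format!("select({})", self.parse_expr())
    }
}

// Composition: the HiveQL dialect wraps an ANSI dialect.
struct HiveQlDialect {
    base: AnsiDialect,
}

impl HiveQlDialect {
    // HiveQL-specific expression handling.
    fn parse_expr(&self) -> String {
        "hive-expr".to_string()
    }

    // Delegation: reuse the ANSI SELECT logic unchanged.
    fn parse_select(&self) -> String {
        self.base.parse_select()
    }
}
```

Calling parse_select on a HiveQlDialect yields "select(ansi-expr)": once control is inside AnsiDialect, it has no reference back to the wrapper, so the HiveQL parse_expr is never used.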
    Nickolay Ponomarev
    @nickolay
    Yea. I'm curious though: why did you start with ToString? I thought your interest was in parsing SQL in order to execute it
    I think of the same problem you brought up in terms of the AST layer: if I want to add an optional attribute to an AST struct (like this one), where do I put it
    Andy Grove
    @andygrove
    For DataFusion I am only interested in parsing+executing. In my day job I build some services in Scala that do a lot of parsing, query manipulation and rewriting, e.g. translating HiveQL to other dialects, so I'm just thinking about how that can be handled in Rust
    Some problems really are easier with an OO language
    Nickolay Ponomarev
    @nickolay
    ...regarding AST storage: I like the approach the author of rust-analyzer is promoting, where the parser output is an untyped tree of "tokens" which stores only the location in the tree, the token type and its location in the source text. On top of that there's a typed layer.
    I found it while thinking about the rewriting use-case. Rewriting ideally shouldn't lose the formatting, and storing the whitespace in the typed AST is rather inconvenient
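A rough sketch of the two-layer idea: an untyped tree that records only kinds and source ranges (whitespace included), with a thin typed wrapper on top. All names here are made up for illustration; this is not the rust-analyzer API, and the tree is built by hand rather than by a parser:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Keyword,
    Ident,
    Whitespace,
    Comma,
    SelectStmt,
}

// Untyped layer: tokens store only a kind and a byte range into the source.
#[derive(Debug)]
enum Node {
    Token { kind: Kind, start: usize, end: usize },
    Tree { kind: Kind, children: Vec<Node> },
}

impl Node {
    // Rebuild the source text by concatenating token ranges in order.
    fn text(&self, src: &str) -> String {
        match self {
            Node::Token { start, end, .. } => src[*start..*end].to_string(),
            Node::Tree { children, .. } => children.iter().map(|c| c.text(src)).collect(),
        }
    }
}

// Typed layer: a view over an untyped node of the right kind.
struct SelectStmt<'a> {
    node: &'a Node,
}

impl<'a> SelectStmt<'a> {
    fn cast(node: &'a Node) -> Option<Self> {
        match node {
            Node::Tree { kind: Kind::SelectStmt, .. } => Some(SelectStmt { node }),
            _ => None,
        }
    }

    // Column names are the identifier tokens directly under the statement.
    fn columns(&self, src: &'a str) -> Vec<&'a str> {
        match self.node {
            Node::Tree { children, .. } => children
                .iter()
                .filter_map(|c| match c {
                    Node::Token { kind: Kind::Ident, start, end } => Some(&src[*start..*end]),
                    _ => None,
                })
                .collect(),
            _ => Vec::new(),
        }
    }
}

// Hand-built tree for the source "SELECT  a, b" (note the double space).
fn sample_tree() -> Node {
    Node::Tree {
        kind: Kind::SelectStmt,
        children: vec![
            Node::Token { kind: Kind::Keyword, start: 0, end: 6 },
            Node::Token { kind: Kind::Whitespace, start: 6, end: 8 },
            Node::Token { kind: Kind::Ident, start: 8, end: 9 },
            Node::Token { kind: Kind::Comma, start: 9, end: 10 },
            Node::Token { kind: Kind::Whitespace, start: 10, end: 11 },
            Node::Token { kind: Kind::Ident, start: 11, end: 12 },
        ],
    }
}
```

Because whitespace lives in the untyped tree, text() reproduces the original formatting exactly, while the typed layer only looks at the tokens it cares about.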
    Andy Grove
    @andygrove
    that sounds more like the approach that projects like ANTLR use as well
    Nickolay Ponomarev
    @nickolay
    doing that seems like a large project though, so I was hoping to come up with a way to store such extensions somewhere outside the main AST tree
    Nickolay Ponomarev
    @nickolay

    in Java I would have HiveQLDialect extend ANSIDialect and just override certain methods

    can't you do that with traits? With the default implementation into the Dialect trait, you should be able to override it from trait impls

    I'm thinking of Dialect::write_sql(*) -> String, not a ToString implementation
    dozens of write_sql_<asttype>(<asttype>) -> String really
    Nickolay Ponomarev
    @nickolay
    the problem I see with forcing complete separation of different dialects is that much of the code will have to be "forked" even if the difference is very small
    perhaps forking some parts, while keeping certain tweaks in the base implementations (controlled by flags), would work
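One way the "tweaks controlled by flags" idea might look, as a sketch only. DialectFlags, its single field and parse_identifier are hypothetical names, and real identifier rules are more involved than this:

```rust
// Hypothetical per-dialect switches consulted by one shared implementation.
#[derive(Clone, Copy)]
struct DialectFlags {
    // Whether `backtick-quoted` identifiers are accepted (HiveQL accepts them).
    backtick_identifiers: bool,
}

const ANSI: DialectFlags = DialectFlags { backtick_identifiers: false };
const HIVEQL: DialectFlags = DialectFlags { backtick_identifiers: true };

// Shared routine: returns the identifier name, or None if the token is not
// a valid identifier under the given flags.
fn parse_identifier(flags: DialectFlags, token: &str) -> Option<String> {
    let backticked = token.len() >= 2 && token.starts_with('`') && token.ends_with('`');
    if backticked {
        if flags.backtick_identifiers {
            // Strip the surrounding backticks.
            Some(token[1..token.len() - 1].to_string())
        } else {
            None
        }
    } else if !token.is_empty() && token.chars().all(|c| c.is_alphanumeric() || c == '_') {
        Some(token.to_string())
    } else {
        None
    }
}
```

The small difference stays inside the shared function, so nothing is forked; the cost is that the base code has to know about every dialect's quirks.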
    Nickolay Ponomarev
    @nickolay
    Mind if I rename andygrove/sqlparser-rs#5 to "Support EXISTS()"?
    Andy Grove
    @andygrove
    Sure, go for it
    Andy Grove
    @andygrove
    I had another thought about how dialects could be implemented. We could have a struct containing functions for parsing different things e.g.
    struct ParserFuncs {
      parse_literal: Box<ParseFn>,
      parse_identifier: Box<ParseFn>,
      parse_select: Box<ParseFn>,
      ..
    }
    where type ParseFn = Fn(tokens: TokenStream, funcs: Rc<ParserFuncs>) -> Box<AST>
    this would allow the kind of recursion + inheritance required
    we would have generic ParserFuncs and dialects could replace some of the functions as needed
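A compilable sketch of the function-table idea above, under simplifying assumptions: tokens are plain string slices, the "AST" is a String, and the table is passed by reference instead of Rc. ansi_funcs and acme_funcs are illustrative names:

```rust
// Every parse function receives the whole table, so it can call back into
// whatever functions the dialect has installed.
type ParseFn = dyn Fn(&[&str], &ParserFuncs) -> String;

struct ParserFuncs {
    parse_literal: Box<ParseFn>,
    parse_select: Box<ParseFn>,
}

// Generic (ANSI-like) table. Assumes a non-empty token slice for brevity.
fn ansi_funcs() -> ParserFuncs {
    ParserFuncs {
        parse_literal: Box::new(|tokens: &[&str], _funcs: &ParserFuncs| {
            format!("Literal({})", tokens[0])
        }),
        parse_select: Box::new(|tokens: &[&str], funcs: &ParserFuncs| {
            // Skip the SELECT keyword, then dispatch through the table.
            let literal = (funcs.parse_literal)(&tokens[1..], funcs);
            format!("Select({})", literal)
        }),
    }
}

// A dialect starts from the generic table and replaces only what differs.
fn acme_funcs() -> ParserFuncs {
    let mut funcs = ansi_funcs();
    funcs.parse_literal = Box::new(|tokens: &[&str], _funcs: &ParserFuncs| {
        format!("AcmeLiteral({})", tokens[0])
    });
    funcs
}
```

With the tokens ["SELECT", "1"], the generic table produces "Select(Literal(1))" and the Acme table produces "Select(AcmeLiteral(1))", even though parse_select itself was never replaced.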
    Nickolay Ponomarev
    @nickolay
    I'm still not sure why trait Dialect { fn parse_select(&self, tokens) -> AST { /* default impl */ }; impl Dialect for HiveDialect { fn parse_select(...) { /* override */ } } won't work. Granted I'm not very experienced in Rust, but a tiny test worked fine.
    Andy Grove
    @andygrove
    yeah I can explain the problem
    let's use a fictional AcmeSQL where SELECT is 99% the same as ANSI but has a different ALIAS syntax e.g. SELECT col1 ALIASED AS 'abc', .. FROM ... WHERE ...
    we don't want to implement parse_select for AcmeSQL, we just want to implement a custom parse_expression most likely
    you know, I should just create a little project to demonstrate
    maybe in rust playground
    Andy Grove
    @andygrove
    well I'd already started on some code, let's see if this makes it clearer
    pub struct Parser {
        parse_select: Box<ParserFunc>,
        parse_identifier: Box<ParserFunc>,
        parse_literal: Box<ParserFunc>,
    }
    
    pub trait Dialect {
        fn create_parser(&self) -> Box<Parser>;
    }
    impl Dialect for CustomDialect {
    
        fn create_parser(&self) -> Box<Parser> {
            // start with base ANSI parser
            let base: Box<dyn Dialect> = Box::new(BaseDialect {});
            let mut parser = base.create_parser();
            // override how literals are parsed in my dialect
            parser.parse_literal = Box::new(|tokens| Box::new(ASTNode::SQLValue(Value::Long(123))));
            parser
        }
    }
    so if we just want to change how literals are parsed we can do this
    no need to duplicate other logic
    in the parser, the parse_select method from the base parser will end up delegating to self.parse_literal() which we have overridden
    I think I've finally figured out how to do inheritance with Rust
    does that make sense?
    I think I can create a small PoC tomorrow for this idea
    Nickolay Ponomarev
    @nickolay
    I understand the problem, I think. What I don't get is why you don't just put the base behavior in a trait
    I get the solution as well, it's more complex than a trait and I'm struggling to understand why
    this is the "tiny test" I mentioned above: nickolay/sqlparser-rs@c8484f2
    I defined the base behavior in Dialect, the methods can be overridden, like virtual functions in C-like languages
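A minimal sketch of the trait approach, reusing the fictional AcmeSQL from earlier. This is not the code from the linked commit; the method names and string "AST" are assumptions made for illustration:

```rust
trait Dialect {
    // Default behaviour shared by every dialect.
    fn parse_expr(&self, token: &str) -> String {
        format!("Expr({})", token)
    }

    // Calls self.parse_expr, which dispatches to the implementing type's
    // version, including any override.
    fn parse_select(&self, tokens: &[&str]) -> String {
        let exprs: Vec<String> = tokens.iter().map(|t| self.parse_expr(t)).collect();
        format!("Select[{}]", exprs.join(", "))
    }
}

// Uses every default unchanged.
struct AnsiDialect;
impl Dialect for AnsiDialect {}

// Overrides only expression parsing; parse_select is inherited.
struct AcmeDialect;
impl Dialect for AcmeDialect {
    fn parse_expr(&self, token: &str) -> String {
        format!("AcmeExpr({})", token)
    }
}
```

AcmeDialect.parse_select(&["a", "b"]) gives "Select[AcmeExpr(a), AcmeExpr(b)]": the default parse_select calls back into the overriding parse_expr, which is the behaviour the delegation-based design could not provide. The same holds when the dialect is used through a Box<dyn Dialect>.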
    Andy Grove
    @andygrove
    Ah, I see. I think you are right.. that is a better approach.
    I haven't used traits with default implementations before. I don't think that existed when I started learning Rust
    Nickolay Ponomarev
    @nickolay
    May well be. It's an interesting experience -- seeing the language evolve while you're still learning it