    jcmincke
    @jcmincke

    Hi everybody
    I am quite new to ibis.
    I am investigating whether I could use ibis with a bigquery backend. The goal is to replace pyspark with bigquery while keeping the ability to build queries programmatically.

    However I am facing some difficulties in understanding some parts of the ibis api.

    Is this chat a good place to ask technical questions?

    Scott Hajek
    @scottcode
    Hi @jcmincke, ibis does support spark and bigquery backends. What parts of the API are you having difficulty with?
    jcmincke
    @jcmincke
    Hi @scottcode. Thanks for your help
    This is one of my issues, others are about date/datetime operations. It seems that date diff is not supported by the bigquery backend yet.
    kovar-ursa
    @kovar-ursa
    Is this still the place to ask questions about Ibis?
    Saul Pwanson
    @saulpw
    Hi Kovar, what question did you have?
    Mike Graham
    @mikegraham

    Hi all!

    My team at Google is in the early stages of an extremely Ibis-heavy project. If anyone is interested in working with Ibis, contributing back to Ibis, and building on top of Ibis as well as scalable data science tooling in general, we are looking to hire at least one software engineer right now. You can reach out to me at grahammichael@google.com if you want to discuss further.

    I hope posting this here is appropriate.

    Wes McKinney
    @wesm
    @mikegraham let me know if you'd like to promote this role on Twitter -- always keen to get more people involved in Ibis
    cloud
    @pcloud:matrix.org
    Indeed, we are investing in ibis and would love to have more contributors. Happy to help in any way we can!
    robbc
    @robbc_twitter
    Hi - I've been using Ibis for a while in a fairly large project involving timeseries and geospatial data and am trying to use the Geospatial distance method to find the distance to a literal point. How do I make an ibis literal for geometry that I can pass to the distance method?
    1 reply
    cpcloud (Phillip Cloud)
    @pcloud:matrix.org
    Let me know if that answers your question
    robbc
    @robbc_twitter
    Yes and thank you!
    Kevin
    @kevinglasson
    Hi, I couldn't find anywhere whether this can target data warehouses like snowflake and bigquery. I would like to create an ELT tool like DBT that uses Python to generate SQL instead of jinja-templated SQL. Is that possible? Thanks
    4 replies
    Daniel Kim
    @pybokeh
    Hi there! I searched but maybe I missed it in the API docs, but is there an equivalent to pandas ffill() / forward fill method?
    2 replies
    Mike Graham
    @mikegraham
    Thanks a billion, Wes -- I think I've filled the position, but hopefully I'll have another one soon.
    1 reply
    Sayan Sanyal
    @shaayohn

    Hi folks, I'm super excited to be using ibis in my day to day over dbt/pypika for generating bigquery/spark jobs from python. I have a question about the UNNEST/Flatten issue (ibis-project/ibis#1146) -- I understand that there has been a lot of thought put there.

    Wanted to get an understanding of the current perspective here. Do we think that this is:
    (1) impossible in current scope, and therefore unlikely in foreseeable future
    (2) will take significant work, but is a planned work in a foreseeable release
    (3) Already in the works, and might be available sooner than you'd expect?

    Just trying to get a sense of it as this is a blocker before I recommend larger adoption of this within our group at Twitter.

    7 replies
    jcmincke
    @jcmincke

    Hi,

    I am using ibis with BigQuery. BigQuery has data types 'datetime' and 'timestamp'
    whereas ibis has just one 'timestamp' data type.

    That causes some trouble when manipulating a BQ table with datetime attributes.
    Please, look at the example below. Table "test" has the attribute (date: datetime).

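    import datetime as dt

    import ibis_bigquery

    # Connect to the BigQuery dataset that holds the "test" table.
    conn = ibis_bigquery.connect(
        project_id='my_project',
        dataset_id='my_dataset')

    t = conn.table("test")

    # Compare the DATETIME column against a Python datetime value.
    e = t.filter(t.date == dt.datetime.now())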

    The generated sql is:

    SELECT *
    FROM `my_project.my_dataset.test`
    WHERE `date` = TIMESTAMP '2022-02-23 12:13:28.163721'

    which raises the error:

    No matching signature for operator = for argument types: DATETIME, TIMESTAMP...

    Has anyone any idea how to circumvent that problem?

    Thanks!

    9 replies
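
A note on the type mismatch above: BigQuery's DATETIME is a timezone-naive civil time, while TIMESTAMP is an absolute point in time, so the `=` operator has no signature that accepts one of each. The sketch below is not ibis API. It is a hypothetical, stdlib-only helper (the name `bigquery_datetime_literal` is made up for illustration) that shows the kind of literal the WHERE clause would need in order to match a DATETIME column.

```python
import datetime as dt


def bigquery_datetime_literal(value: dt.datetime) -> str:
    """Render a naive Python datetime as a BigQuery DATETIME literal.

    Hypothetical helper for illustration only; not part of ibis or ibis_bigquery.
    """
    # DATETIME carries no timezone, so refuse timezone-aware inputs
    # rather than silently dropping the offset.
    if value.tzinfo is not None:
        raise ValueError("DATETIME is timezone-naive; pass a naive datetime")
    # isoformat(sep=" ") yields 'YYYY-MM-DD HH:MM:SS[.ffffff]'.
    return f"DATETIME '{value.isoformat(sep=' ')}'"


# Same instant as in the generated SQL above, but typed as DATETIME.
literal = bigquery_datetime_literal(dt.datetime(2022, 2, 23, 12, 13, 28, 163721))
where_clause = f"WHERE `date` = {literal}"
```

A clause built this way could be used in hand-written SQL. Whether ibis can be made to emit it directly is not something this sketch addresses.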
    Phillip Cloud
    @cpcloud
    Hi all, I've put up a PR to remove the distinct() method from column expressions because it has extremely limited utility while being incredibly easy to misuse. I'd like to understand how disruptive this might be ibis-project/ibis#3545. distinct will remain a method on table expressions, which doesn't have the same issues as the column expression version.
    Jayce Slesar
    @jayceslesar
    Hey, something I've come across pretty regularly when installing ibis (not sure if it's just old versions of ibis or a bad install by me) is that I get ibis in my pip environment and end up getting errors like projection = expr.projection([expr['timestamp_sent'], expr['timestamp_received'], AttributeError: 'FloatingColumn' object has no attribute 'projection' and have to uninstall ibis, as I have something like ibis==3.2.0 in my environment
    Phillip Cloud
    @cpcloud
    That might be because you need to use pip install ibis-framework
    The ibis name on PyPI is another unrelated project
    @jayceslesar Let me know if that helps
    2 replies
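
To check which of the two PyPI distributions is actually present in an environment, something like the following stdlib-only sketch may help. The helper names `installed_version` and `diagnose` are made up for illustration, and the suggested pip commands are an assumption about a sensible cleanup, not an official procedure.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def diagnose(ibis_version: Optional[str], framework_version: Optional[str]) -> str:
    """Describe the install state; None means the distribution is not installed."""
    if ibis_version is not None:
        # The distribution named 'ibis' on PyPI is the unrelated project.
        return (
            f"unrelated 'ibis' {ibis_version} found: "
            "pip uninstall ibis, then reinstall ibis-framework"
        )
    if framework_version is None:
        return "ibis-framework not installed: pip install ibis-framework"
    return f"ok: ibis-framework {framework_version}"


if __name__ == "__main__":
    print(diagnose(installed_version("ibis"), installed_version("ibis-framework")))
```

Both distributions install an importable module called `ibis`, which is why the wrong one can go unnoticed until an attribute error like the one above appears.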
    Phillip Cloud
    @cpcloud
    Hey everyone, I've put up a PR to improve QoL around the expression __repr__: ibis-project/ibis#3594. The repr has had numerous problems in the past around performance and readability, and this PR addresses those issues.
    If anyone is up for tinkering with it that'd be really great. You should see a large performance increase when repring expressions with lots of joins or set operations like UNION etc.
    Tory Haavik
    @toryhaavik
    greetings @cpcloud :)
    5 replies
    Jeff Reback
    @jreback
    well we r on near master actually
    Tory Haavik
    @toryhaavik
    i could swear there was documentation in the official docs that describes how to translate a pandas pivot operation into generic Ibis, but i can't find it. does that ring a bell to anyone?
    5 replies
    Phillip Cloud
    @cpcloud
    Hey all, I've put up a PR to support mixing SQL and ibis expressions: ibis-project/ibis#3642. That PR enables writing a SQL string and then using that in a subsequent expression as well as the reverse: writing an ibis table expression and using that in a subsequent SQL string.
    Currently it's implemented for the Postgres, PySpark and DuckDB backends
    Pushkar Nimkar
    @pushkarnimkar
    Hi, All! I'm new here and wondering if there's a standard, language-agnostic (maybe like JSON) interface on top of Ibis? The use-case is: we are building a front-end to allow an end-user to create queries in a drag-and-drop manner. The queries would be recorded as JSON objects and sent to a Python backend. There they would be translated into Ibis expressions and run on the respective backends.
    3 replies
    gil
    @gforsyth:matrix.org
    Hey pushkarnimkar (Pushkar Nimkar) -- not sure if this is exactly what you were looking for, and it's still in the early stages, but you might be interested in looking at https://substrait.io/
    Pushkar Nimkar
    @pushkarnimkar
    Thanks a lot, @gforsyth:matrix.org! The Substrait project looks very helpful!
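
For the JSON-to-expression idea discussed above, a minimal sketch of a hand-rolled translator might look like this. The JSON shape (`op`, `column`, `value`, `left`, `right`) is entirely hypothetical and is not a format defined by Ibis or Substrait. The translator relies only on Python operators, so it works on any object that supports `table[name]` and the usual comparison operators. It is demonstrated here on a plain dict standing in for a table.

```python
import json
import operator
from typing import Any, Mapping

# Leaf comparisons and boolean combinators, keyed by the (hypothetical) JSON op name.
_COMPARISONS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}
_COMBINATORS = {"and": operator.and_, "or": operator.or_}


def build_predicate(table: Any, spec: Mapping[str, Any]) -> Any:
    """Recursively translate a JSON-like spec into a predicate over `table`."""
    op = spec["op"]
    if op in _COMBINATORS:
        left = build_predicate(table, spec["left"])
        right = build_predicate(table, spec["right"])
        return _COMBINATORS[op](left, right)
    if op in _COMPARISONS:
        return _COMPARISONS[op](table[spec["column"]], spec["value"])
    raise ValueError(f"unsupported op: {op!r}")


# Stand-in for a table: a single row represented as a dict.
row = {"age": 30, "country": "BE"}
spec = json.loads(
    '{"op": "and",'
    ' "left": {"op": "gt", "column": "age", "value": 18},'
    ' "right": {"op": "eq", "column": "country", "value": "BE"}}'
)
matches = build_predicate(row, spec)
```

Because Ibis expressions overload the same operators, passing a table expression instead of the dict should yield a boolean expression rather than a Python bool. That use is an assumption here and has not been checked against any particular Ibis version.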
    Sandro Loch
    @esloch

    Hello everyone, I try to connect the schema with con.schema(name) but it is not possible to connect:

    In [13]: con = ibis.postgres.connect(url=PSQL_URI)
    In [14]: con.schema('Dengue_global')
    /docker-env/miniconda/envs/alertadengue-dev/lib/python3.8/site-packages/ibis/backends/base/__init__.py:166: FutureWarning: The `database` method and the `Database` object are deprecated and will be removed in a future version of Ibis. Use the equivalent methods in the backend instead.
      warnings.warn(
    ---------------------------------------------------------------------------
    NoSuchTableError                          Traceback (most recent call last)
    Input In [16], in <cell line: 1>()
    ----> 1 con.schema('Dengue_global')

    What has changed?

    gil
    @gforsyth:matrix.org
    Hey esloch (Sandro Loch) -- I think you first want to grab the table and then you can retrieve the schema, e.g.:
    con = ibis.postgres.connect(...)
    dengue_global = con.table("Dengue_global")
    dengue_global.schema()
    Ivan Ogasawara
    @xmnlab
    @gforsyth:matrix.org , I think @esloch is talking about the postgres schema (kind of namespace)
    what would be the correct new way to specify that? maybe con.table("myschema.mytable")? PS: I didn't try it ... just asking :D
    Sandro Loch
    @esloch

    Hey esloch (Sandro Loch) -- I think you first want to grab the table and then you can retrieve the schema, e.g.:

    con = ibis.postgres.connect(...)
    dengue_global = con.table("Dengue_global")
    dengue_global.schema()

    Thanks @gforsyth:matrix.org !

    @xmnlab, I have this:

    @gforsyth:matrix.org , I think @esloch is talking about the postgres schema (kind of namespace)

    schema_dglob = con.schema('Dengue_global')
    t_parameters = schema_dglob.table('parameters')
    Sandro Loch
    @esloch

    maybe you can try this: https://github.com/ibis-project/ibis/blob/master/ibis/backends/base/sql/alchemy/__init__.py#L369

    it works!!!

    In [99]: con.table('parameters','dengue','Dengue_global')
    Out[99]: 
    AlchemyTable[table]
      name: parameters
      schema:
    ....

    Thanks @xmnlab!

    Sandro Loch
    @esloch

    Hi all!
    I have a web server that receives multiple requests and sometimes doesn't load the entire table and returns an empty schema:

    (Pdb) cache_res = (table_hist_uf[table_hist_uf['state_abbv'] == state_abbv].sort_by('SE').execute())
    *** ibis.common.exceptions.IbisTypeError: 'state_abbv' is not a field in []
    
    (Pdb) table_hist_uf
    AlchemyTable[table]
      name: hist_uf_dengue_materialized_view
      schema:

    ...but if I use pdb inside a try/except and assign the connection to the variable, it returns the table:

    (Pdb) _table_hist_uf = con.table(f'hist_uf{_disease}_materialized_view')
    
    (Pdb) _table_hist_uf
    AlchemyTable[table]
      name: hist_uf_dengue_materialized_view
      schema:
        state_abbv : string
        state_name : string
        ...
    
    (Pdb) cache_res = (_table_hist_uf[_table_hist_uf.state_abbv == state_abbv].sort_by('SE').execute())
    
    (Pdb) cache_res
        state_abbv state_name  municipio_geocodigo      SE data_iniSE  casos_est  casos  nivel  receptivo
    0           AC       Acre              1200013  202206 2022-02-06        0.0      0      1          0
    ...

    Any connection failure with the database?

    Ivan Ogasawara
    @xmnlab
    it looks like, for some reason, the information is not ready at the time you are getting the table object
    but I'm not sure why it is not raising an error
    Tory Haavik
    @toryhaavik
    i really like the new https://ibis-project.org/docs/dev/backends/support_matrix/ ! i wonder if there's a way to mess with the HTML table to keep the header when you scroll down, because it can be hard to tell what backends support the operations further down the page
    2 replies
    florent-martineau
    @florent-martineau

    Hi!

    I wanted to say thank you for building the Ibis project, it's really awesome and a great step forward for the data ecosystem!

    I was wondering if you knew a "reverse Ibis" project ? By "reverse Ibis", I mean a library that would take SQL and transform it into pandas code.

    The idea is to take my existing SQL workflows and convert them into Ibis code.

    5 replies
    Jeff Reback
    @jreback
    here is a sql to ibis translator (i thought we referenced it in the docs but not sure)
    https://github.com/zbrookle/sql_to_ibis
    1 reply
    Phillip Cloud
    @cpcloud
    ibis 3.0.2 also contains a .sql method, which allows you to run SQL against an ibis expression like t.group_by('a').aggregate(c=t.b.sum()).alias("foo").sql("SELECT * FROM foo")
    Sandro Loch
    @esloch

    Hello everyone!
    I'm upgrading from ibis2.0.0 to ibis3.0.2, and the udf attribute no longer exists...

    from typing import Callable

    import ibis

    def get_epi_week_expr() -> Callable:
        """
        Returns a UDF expression for the epi_week function.

        Returns
        -------
        Callable
        """
        return ibis.postgres.udf.existing_udf(
            "epi_week", input_types=["date"], output_type="int64"
        )

    traceback:

    AttributeError: object 'function' has no attribute 'existing_udf'

    Is there a change of route?

    gil
    @gforsyth:matrix.org
    Hey esloch (Sandro Loch) -- there's some kind of import renaming shenanigans going on. For some reason ibis.postgres.udf is pointing to ibis.backends.postgres.udf.udf (which is a function with no existing_udf attribute or method).
    For now you can work around it by importing directly: from ibis.backends.postgres.udf import existing_udf
    2 replies
    Sandro Loch
    @esloch
    Thanks for working on it @gforsyth:matrix.org (Gil)!