This repository was archived by the owner on Mar 29, 2023. It is now read-only.

Ibis 2.0: Can't connect to table - _parse_project_and_dataset called without database #107

@saschahofmann

I am trying to update our codebase to ibis 2.0 and started getting ValueError("Unable to determine BigQuery dataset.") when fetching a table from BigQuery like this: ibis_client.table("table", database="dataset").

I followed the traceback as follows, starting at /ibis_bigquery/__init__.py, line 177:

def table(self, name, database=None) -> ir.TableExpr:
    t = super().table(name, database=database)
    ...

This calls the standard ibis table method in /ibis/backends/base/sql/__init__.py, line 41:

def table(self, name, database=None):
    qualified_name = self._fully_qualified_name(name, database)
    schema = self.get_schema(qualified_name)
    ...

This in turn calls the BigQuery get_schema method in /ibis_bigquery/__init__.py, line 329, but passes only the qualified name and no database (this matters in the next step):

def get_schema(self, name, database=None):
    table_id = self._fully_qualified_name(name, database)
    ...

get_schema now builds the fully qualified name a second time (/ibis_bigquery/__init__.py, line 183), but this time database is None:

def _fully_qualified_name(self, name, database):
    default_project, default_dataset = self._parse_project_and_dataset(database)
    ...

This leads to _parse_project_and_dataset in /ibis_bigquery/__init__.py, line 161:

def _parse_project_and_dataset(self, dataset) -> Tuple[str, str]:
    if not dataset and not self.dataset:
        raise ValueError("Unable to determine BigQuery dataset.")
    ...

As mentioned earlier, dataset is now None, and nothing in the call chain above sets self.dataset, so the ValueError is raised. I can probably work around this by setting the dataset before calling the table method, but this looks like a bug to me.
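To make the call chain easier to follow, here is a minimal, self-contained sketch of it using stand-in classes rather than the real ibis code. The class names, the constructor, the `try_table` helper, and the pass-through for already-qualified names are all assumptions made for illustration. Only the method names and the ValueError check come from the excerpts above.

```python
from typing import Optional, Tuple


class BaseSQLBackend:
    """Simplified stand-in for the generic ibis SQL backend (hypothetical)."""

    def table(self, name: str, database: Optional[str] = None) -> str:
        qualified_name = self._fully_qualified_name(name, database)
        # Only the qualified name is forwarded; `database` is dropped here.
        return self.get_schema(qualified_name)


class BigQueryBackend(BaseSQLBackend):
    """Simplified stand-in for the BigQuery backend (hypothetical)."""

    def __init__(self, project: str, dataset: Optional[str] = None):
        self.project = project
        self.dataset = dataset  # default dataset, may be unset

    def _parse_project_and_dataset(self, dataset: Optional[str]) -> Tuple[str, str]:
        # Same check as in the excerpt above.
        if not dataset and not self.dataset:
            raise ValueError("Unable to determine BigQuery dataset.")
        return self.project, dataset or self.dataset

    def _fully_qualified_name(self, name: str, database: Optional[str]) -> str:
        # As in the excerpt, the dataset is resolved before `name` is inspected.
        project, dataset = self._parse_project_and_dataset(database)
        if name.count(".") == 2:
            # Assumption: an already-qualified name is returned unchanged.
            return name
        return f"{project}.{dataset}.{name}"

    def get_schema(self, name: str, database: Optional[str] = None) -> str:
        # Second qualification: `database` is None when called from table().
        return self._fully_qualified_name(name, database)


def try_table(client: BigQueryBackend, name: str, database: str) -> str:
    """Hypothetical helper: return the result or the error message."""
    try:
        return client.table(name, database=database)
    except ValueError as exc:
        return f"ValueError: {exc}"


no_default = BigQueryBackend("proj")
with_default = BigQueryBackend("proj", dataset="dataset")
print(try_table(no_default, "table", "dataset"))
print(try_table(with_default, "table", "dataset"))
```

In this sketch the first call fails even though a database was passed, because the second qualification never sees it. The second call succeeds only because the client carries a default dataset.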

This code path also means that the BigQuery table method makes two requests to BigQuery: one in the table method itself and one in the get_schema method called via super(). Maybe bq_table could be made a cached property to avoid that?
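One way to avoid the duplicate request could look roughly like the sketch below. It is not the ibis_bigquery implementation: `FakeBigQueryClient`, `Backend`, and `_get_bq_table` are hypothetical names, and the fake client only counts calls instead of contacting BigQuery. Because the lookup depends on the table id, the sketch memoizes per id with `functools.lru_cache` rather than using a single cached property.

```python
from functools import lru_cache


class FakeBigQueryClient:
    """Hypothetical stand-in that counts metadata requests."""

    def __init__(self) -> None:
        self.requests = 0

    def get_table(self, table_id: str) -> dict:
        self.requests += 1  # each call represents one round trip
        return {"table_id": table_id}


class Backend:
    """Hypothetical backend that memoizes table metadata per table id."""

    def __init__(self, client: FakeBigQueryClient) -> None:
        self.client = client
        # Per-instance cache keyed by the fully qualified table id.
        self._get_bq_table = lru_cache(maxsize=None)(client.get_table)

    def get_schema(self, table_id: str) -> dict:
        return self._get_bq_table(table_id)

    def table(self, table_id: str):
        schema = self.get_schema(table_id)       # first lookup hits the client
        bq_table = self._get_bq_table(table_id)  # second lookup is a cache hit
        return schema, bq_table


client = FakeBigQueryClient()
backend = Backend(client)
backend.table("proj.dataset.table")
print(client.requests)
```

A trade-off with this approach is that cached metadata can go stale if the table changes while the client is alive, so a real implementation would probably need some way to invalidate the cache.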
