I am trying to update our codebase to ibis 2.0 and started getting ValueError("Unable to determine BigQuery dataset.") when fetching a table from BigQuery like this: ibis_client.table("table", database="dataset").
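For context, here is a minimal reproduction (project and table names are made up, and I am assuming the ibis_bigquery.connect entry point; note the client is created without a default dataset):

import ibis_bigquery

# hypothetical project id, no default dataset configured on the client
client = ibis_bigquery.connect(project_id="my-project")

# raises ValueError("Unable to determine BigQuery dataset.")
t = client.table("table", database="dataset")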
I followed the traceback, starting at /ibis_bigquery/__init__.py, line 177:
def table(self, name, database=None) -> ir.TableExpr:
    t = super().table(name, database=database)
    ...
This calls the standard ibis table method in /ibis/backends/base/sql/__init__.py, line 41:
def table(self, name, database=None):
    qualified_name = self._fully_qualified_name(name, database)
    schema = self.get_schema(qualified_name)
    ...
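With the hypothetical client from the repro above, this first qualification succeeds because database is still set, returning something like (exact format is an assumption on my part):

# first call: database="dataset" is passed through, so this works
client._fully_qualified_name("table", "dataset")  # -> 'my-project.dataset.table'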
The base method then calls the BigQuery get_schema method (/ibis_bigquery/__init__.py, line 329), BUT only with the qualified name, not the database (this will be important in a second):
def get_schema(self, name, database=None):
    table_id = self._fully_qualified_name(name, database)
    ...
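Since database defaults to None here, you can reproduce the failure directly by calling get_schema with an already qualified name (again using the hypothetical client from above, which has no default dataset):

# same ValueError: the name gets re-qualified with database=None
client.get_schema("my-project.dataset.table")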
Now we try to build the fully qualified name again (/ibis_bigquery/__init__.py, line 183), but this time database is None!
def _fully_qualified_name(self, name, database):
    default_project, default_dataset = self._parse_project_and_dataset(database)
    ...
That leads us to _parse_project_and_dataset in /ibis_bigquery/__init__.py, line 161:
def _parse_project_and_dataset(self, dataset) -> Tuple[str, str]:
    if not dataset and not self.dataset:
        raise ValueError("Unable to determine BigQuery dataset.")
    ...
As mentioned earlier, dataset is now None, and nowhere in the stack trace above did we set self.dataset. I could probably set the dataset before calling the table method, but this feels like a bug to me.
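For anyone hitting the same thing, the workaround I have for now is to configure a default dataset at connect time so self.dataset is set (a sketch, assuming connect accepts dataset_id):

import ibis_bigquery

# with a default dataset, _parse_project_and_dataset(None) falls back to
# self.dataset instead of raising
client = ibis_bigquery.connect(project_id="my-project", dataset_id="dataset")
t = client.table("table")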
Separately, this code path also means that the BigQuery table method makes two requests to BigQuery: one in the table method itself and one in the get_schema method called via super(). Maybe we could make bq_table a cached property to avoid that?
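Since the fetched metadata is keyed by the fully qualified table id rather than being a single value, a memoized helper might actually fit better than a property. A rough sketch of the idea (the helper name _get_bq_table is made up, not the actual API):

from functools import lru_cache

# inside the BigQuery backend class
@lru_cache(maxsize=None)
def _get_bq_table(self, table_id):
    # one API request per distinct table id, shared by table() and get_schema()
    return self.client.get_table(table_id)

The obvious trade-off is stale metadata: a cached entry would not notice schema changes on the server, so the cache would probably need to be scoped or invalidated somehow.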