feat: support interactive execution of deferred DataFrames in TableWidget#17486
feat: support interactive execution of deferred DataFrames in TableWidget#17486shuoweil wants to merge 50 commits into
Conversation
# Conflicts: # packages/bigframes/bigframes/display/table_widget_angular.js # packages/bigframes/bigframes/display/table_widget_angular/README.md # packages/bigframes/bigframes/display/table_widget_angular/src/app/app.spec.ts # packages/bigframes/bigframes/display/table_widget_angular/src/app/app.ts
…app/widget-state.service.ts Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…app/app.ts Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…app/app.ts Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…app/app.ts Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
# Conflicts: # packages/bigframes/bigframes/dataframe.py # packages/bigframes/bigframes/display/html.py # packages/bigframes/bigframes/display/table_widget_angular.js # packages/bigframes/bigframes/display/table_widget_angular/README.md # packages/bigframes/bigframes/display/table_widget_angular/src/app/app.ts # packages/bigframes/bigframes/display/table_widget_angular/src/app/widget-state.service.ts # packages/bigframes/bigframes/display/table_widget_angular/src/main.ts # packages/bigframes/tests/js/table_widget_angular.test.js # packages/bigframes/tests/unit/display/test_anywidget.py # packages/bigframes/tests/unit/display/test_html.py
There was a problem hiding this comment.
Code Review
This pull request introduces a deferred execution mode for BigQuery DataFrames, allowing users to view dry-run estimates and trigger query execution interactively via an Angular-based table widget. The review feedback highlights several critical issues, including potential AttributeErrors when processing deferred DataFrames or non-string SQLGlot types, background thread execution bugs (such as premature state transitions and uncleared error messages), and frontend issues where the loading state is not reset on failure and container height locking prevents proper resizing.
| widget = display.TableWidget(df, dry_run_info=dry_run_info) | ||
| display_df, _ = df._process_display_df() |
There was a problem hiding this comment.
In get_anywidget_bundle, df is passed to TableWidget before calling _process_display_df(), and the resulting display_df is discarded. This means for non-deferred DataFrames, the unprocessed df is used, which will fail to display JSON/nested JSON columns correctly. Additionally, if df is a DeferredBigQueryDataFrame, calling df._process_display_df() on line 249 will raise an AttributeError because deferred DataFrames do not have this method.
We should only process the DataFrame if it is a standard (non-deferred) DataFrame and we are not in deferred representation mode.
if not isinstance(df, bigframes.session.deferred.DeferredBigQueryDataFrame) and bigframes.options.display.repr_mode != "deferred":
display_df, _ = df._process_display_df()
else:
display_df = df
widget = display.TableWidget(display_df, dry_run_info=dry_run_info)There was a problem hiding this comment.
Done. Added a check to only call _process_display_df() for standard, non-deferred DataFrames when we are not in deferred representation mode, and successfully initialized the TableWidget with the resulting display_df.
| def run_execution(): | ||
| try: | ||
| if self.is_deferred_mode: | ||
| self.is_deferred_mode = False | ||
| if self._deferred_dataframe is not None: | ||
| result = self._deferred_dataframe.execute() | ||
| if isinstance(result, bigframes.series.Series): | ||
| df = result.to_frame() | ||
| else: | ||
| df = result | ||
| self._dataframe, _ = df._process_display_df() | ||
| self._initialize_from_dataframe() | ||
| elif self._dataframe is not None: | ||
| self._dataframe, _ = self._dataframe._process_display_df() | ||
| self._initialize_from_dataframe() | ||
| elif not self.is_deferred_mode and self._dataframe is not None: | ||
| self._initial_load() | ||
| except Exception as e: | ||
| self._error_message = str(e) | ||
| finally: | ||
| self.is_deferred_mode = False | ||
| self.start_execution = False |
There was a problem hiding this comment.
There are a few issues in the background execution thread:
- If the execution fails,
self.is_deferred_modeis set toFalsein thefinallyblock. This transitions the widget out of deferred mode prematurely, resulting in an empty table display and a confusing 'Internal Error: DataFrame is missing' message. Instead,self.is_deferred_modeshould only be set toFalseupon successful execution and initialization. - If a previous execution failed,
self._error_messageis not cleared when starting a new execution, so the old error message will remain visible while the new query is running. We should clearself._error_messageat the start ofrun_execution. - Avoid broad
except Exception:blocks that silently handle errors without logging. We should log the exception usinglogger.warningto aid in debugging.
def run_execution():
try:
self._error_message = None
if self.is_deferred_mode:
if self._deferred_dataframe is not None:
result = self._deferred_dataframe.execute()
if isinstance(result, bigframes.series.Series):
df = result.to_frame()
else:
df = result
self._dataframe, _ = df._process_display_df()
self._initialize_from_dataframe()
self.is_deferred_mode = False
elif self._dataframe is not None:
self._dataframe, _ = self._dataframe._process_display_df()
self._initialize_from_dataframe()
self.is_deferred_mode = False
elif not self.is_deferred_mode and self._dataframe is not None:
self._initial_load()
except Exception as e:
logger.warning(f"Error in background execution: {e}")
self._error_message = str(e)
finally:
self.start_execution = FalseReferences
- Avoid broad
except Exception:blocks that silently returnNone. Instead, log the exception (e.g., usinglogger.warning) to aid in debugging and prevent masking underlying issues.
There was a problem hiding this comment.
Done. Reset _error_message to None at the start of execution, transitioned is_deferred_mode to False only on successful execution/initialization, and added a logger.warning for background execution exceptions. Also updated test assertions to verify this behavior.
| constructor() { | ||
| effect(() => { | ||
| // Setup dependencies for reactive effect | ||
| const _html = this.state.tableHtml(); | ||
| const _sort = this.state.sortContext(); | ||
| const _orderable = this.state.orderableColumns(); | ||
| const deferred = this.isDeferredMode(); | ||
|
|
||
| if (deferred) { | ||
| this.isHeightInitialized = false; | ||
| } | ||
|
|
||
| // Schedule DOM post-processing once the innerHTML render completes | ||
| setTimeout(() => { | ||
| this.applySortIndicators(); | ||
| this.lockInitialHeight(); | ||
| }, 0); | ||
| }); | ||
| } |
There was a problem hiding this comment.
The isLoading signal is set to true when the user clicks 'Run Query', but it is never reset to false. If the query execution fails, the button will remain permanently disabled with a spinner.
To fix this, we should reset isLoading to false when startExecution becomes false. This can be done reactively using an Angular effect in the constructor.
constructor() {
effect(() => {
// Setup dependencies for reactive effect
const _html = this.state.tableHtml();
const _sort = this.state.sortContext();
const _orderable = this.state.orderableColumns();
const deferred = this.isDeferredMode();
if (deferred) {
this.isHeightInitialized = false;
}
// Schedule DOM post-processing once the innerHTML render completes
setTimeout(() => {
this.applySortIndicators();
this.lockInitialHeight();
}, 0);
});
effect(() => {
if (!this.state.startExecution()) {
this.isLoading.set(false);
}
});
}There was a problem hiding this comment.
Done. Added a reactive effect in the constructor to reset isLoading to false whenever startExecution changes to false.
| private lockInitialHeight() { | ||
| if (this.isHeightInitialized) return; | ||
| const container = this.tableContainerRef?.nativeElement; | ||
| if (!container) return; | ||
|
|
||
| const table = container.querySelector('table'); | ||
| if (table) { | ||
| const tableHeight = (table as HTMLElement).offsetHeight; | ||
| if (tableHeight > 0) { | ||
| container.style.height = `${tableHeight + 2}px`; | ||
| this.isHeightInitialized = true; | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Locking the container height to the initial table height via this.isHeightInitialized = true prevents the container from resizing when the user changes the page size or page. For example, if the initial page has 10 rows and the user changes the page size to 50, the container height remains locked to the 10-row height, forcing the user to scroll within a tiny container.
Instead of locking the height permanently in JS, it is recommended to rely on CSS max-height (which was removed from .table-container in this PR) to allow the container to auto-expand up to a reasonable limit, or reset this.isHeightInitialized = false whenever the table HTML changes so the height can be recalculated.
There was a problem hiding this comment.
Done. Changed the constructor effect to unconditionally reset isHeightInitialized = false whenever the table HTML or sort parameters change, allowing the container height to correctly recalculate and expand/shrink.
** this is a testing pr, not ready for review"
This PR introduces support for deferred execution rendering in TableWidget (the anywidget-based interactive table viewer for Jupyter notebooks). Users can now view dry-run estimations (e.g., query size/cost) and trigger execution directly from the notebook output.
Fixes #<460865443> 🦕