[flink] Add restore_as_latest procedure#8139
Conversation
ecd0935 to
0f821ed
Compare
0f821ed to
1492902
Compare
|
I think The new snapshot writes the target snapshot's files into the base manifest list, but writes an empty delta manifest list and marks the commit as Could |
|
@JingsongLi The new snapshot currently has the target snapshot's complete data manifests in baseManifestList, but its deltaManifestList is empty. This makes the final table state correct for batch/full scans, but the restore can be invisible to streaming overwrite readers. I will update restoreAsLatest to generate an overwrite delta relative to the previous latest snapshot: DELETE files that exist only in the previous latest snapshot, and ADD files that exist only in the target snapshot. The baseManifestList will contain the previous latest snapshot's merged effective ADD files, while deltaManifestList will describe the previous-latest-to-target transition. |
Ensure restore_as_latest writes an overwrite delta so streaming overwrite readers can observe restored file changes.
439db01 to
1c523ac
Compare
|
@JingsongLi I also added IT coverage to verify both DELETE-only and ADD-only restore deltas. |
Skip automatic expiration on the restore-as-latest path so it no longer deletes the snapshots/tags it promises to keep (e.g. with snapshot.num-retained.max=1), and keep nextRowId monotonic by taking the max of the previous latest and target snapshot, preventing row id reuse that breaks _ROW_ID global uniqueness on row-tracking tables. Add IT cases covering both fixes.
| targetSnapshot.properties(), | ||
| nextRowId); | ||
|
|
||
| return commitSnapshotImpl(newSnapshot, new ArrayList<>(PartitionEntry.merge(deltaFiles))); |
There was a problem hiding this comment.
This restore commit bypasses the normal post-commit callbacks. Regular commits call commitCallbacks with the committed snapshot and delta files after commitSnapshotImpl succeeds, and those callbacks keep external state in sync (for example Iceberg compatibility metadata uses context.snapshot/context.deltaFiles, and chain-table overwrite handling reacts to CommitKind.OVERWRITE). restoreAsLatest also changes the table state with an overwrite delta, but it returns immediately after writing the Paimon snapshot, so those external views can remain at the pre-restore state. Could we trigger the same commit callback path after a successful restore, using the restored base/delta/index files and the new snapshot context?
There was a problem hiding this comment.
Good catch, thanks. restoreAsLatest committed via commitSnapshotImpl directly and skipped the commit callbacks, so external views (Iceberg metadata, chain-table overwrite) could stay at the pre-restore state.
Fixed by notifying the callbacks after a successful restore, like a regular commit. The context uses the restore delta (DELETE previous-only + ADD target-only) and an index delta derived the same way from the previous-latest and target index manifests. Both callbacks are idempotent, so retries stay correct.
Added an IT case testRestoreTriggersCommitCallback asserting Iceberg metadata is generated for the restore snapshot.
restoreAsLatest committed the restore snapshot directly via commitSnapshotImpl, bypassing the commit callbacks that a regular commit runs. External views that depend on those callbacks (Iceberg compatibility metadata, chain-table overwrite handling) could therefore stay at the pre-restore state. Notify the callbacks after a successful restore using the restored base/delta/index files and the new snapshot. The index changes are derived from the previous latest and target index manifests, mirroring how the data delta files are computed. Add an IT case asserting Iceberg metadata is generated for the restore snapshot.
Purpose
This PR adds a non-destructive restore procedure for Flink:
Unlike
rollback_to, this procedure restores the table to the state of a target snapshot or tag by creating a new latest snapshot. Later snapshots and tags are preserved.What is changed
RestoreAsLatestProcedureand register it in the Flink procedure factory list.Tests
mvn -pl paimon-flink/paimon-flink-common -am -Pfast-build -DfailIfNoTests=false -Dtest=RestoreAsLatestProcedureITCase test git diff --checkNotes
This PR is not associated with an issue yet. If the community prefers following the discussion-first flow strictly, I can open or join an issue/discussion and adjust the design accordingly.