168 changes: 168 additions & 0 deletions src/iceberg/SNAPSHOT_UPDATE_API_COVERAGE.md
@@ -0,0 +1,168 @@
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->

# SnapshotUpdate API Coverage

This document tracks the implementation status of the `SnapshotUpdate` interface compared to the Java Iceberg implementation.

## Java SnapshotUpdate Interface

The Java `SnapshotUpdate<ThisT>` interface defines 6 methods:

### 1. `set(String property, String value)` ✅ IMPLEMENTED
**Java:**
```java
/**
* Set a summary property in the snapshot produced by this update.
*
* @param property a String property name
* @param value a String property value
* @return this for method chaining
*/
ThisT set(String property, String value);
```

**C++:**
```cpp
Derived& Set(std::string_view property, std::string_view value);
```
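The chaining behaviour can be sketched with a small CRTP stand-in. This is an illustrative sketch only: `SnapshotUpdateSketch`, `AppendFilesSketch`, and `AppendFile()` are hypothetical names, not part of the project. It shows why `Set()` returns `Derived&`, so that calls to derived-class methods can follow it in the same chain.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the real base class; only Set() is modeled.
template <typename Derived>
class SnapshotUpdateSketch {
 public:
  Derived& Set(std::string_view property, std::string_view value) {
    // Later calls with the same property overwrite earlier values.
    summary_[std::string(property)] = std::string(value);
    return static_cast<Derived&>(*this);
  }

  const std::unordered_map<std::string, std::string>& summary() const {
    return summary_;
  }

 protected:
  SnapshotUpdateSketch() = default;

 private:
  std::unordered_map<std::string, std::string> summary_;
};

// Hypothetical derived update; AppendFile() exists only for illustration.
class AppendFilesSketch : public SnapshotUpdateSketch<AppendFilesSketch> {
 public:
  AppendFilesSketch& AppendFile(std::string_view path) {
    files_.emplace_back(path);
    return *this;
  }

  std::size_t file_count() const { return files_.size(); }

 private:
  std::vector<std::string> files_;
};
```

With this shape, `update.Set("k", "v").AppendFile("a.parquet")` compiles because `Set()` yields the derived type rather than the base.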

### 2. `deleteWith(Consumer<String> deleteFunc)` ✅ IMPLEMENTED
**Java:**
```java
/**
* Set a callback to delete files instead of the table's default.
*
* @param deleteFunc a String consumer used to delete locations.
* @return this for method chaining
*/
ThisT deleteWith(Consumer<String> deleteFunc);
```

**C++:**
```cpp
Derived& DeleteWith(std::function<void(std::string_view)> delete_func);
```
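A minimal sketch of how a custom callback might take precedence over a default deleter is shown here. `DeleteDispatcherSketch` and `DeleteAll()` are hypothetical; the real class is only known to store the callback in an optional member.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical holder mirroring an optional delete callback; not the project's class.
class DeleteDispatcherSketch {
 public:
  using DeleteFunc = std::function<void(std::string_view)>;

  explicit DeleteDispatcherSketch(DeleteFunc default_delete)
      : default_delete_(std::move(default_delete)) {}

  DeleteDispatcherSketch& DeleteWith(DeleteFunc delete_func) {
    delete_func_ = std::move(delete_func);
    return *this;
  }

  // Routes each location to the custom callback if one was set,
  // otherwise to the default deleter.
  void DeleteAll(const std::vector<std::string>& locations) const {
    const DeleteFunc& fn =
        delete_func_.has_value() ? *delete_func_ : default_delete_;
    for (const auto& location : locations) {
      fn(location);
    }
  }

 private:
  DeleteFunc default_delete_;
  std::optional<DeleteFunc> delete_func_;
};
```

A caller could pass a lambda that appends each location to an audit list instead of removing the file.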

### 3. `stageOnly()` ✅ IMPLEMENTED
**Java:**
```java
/**
* Called to stage a snapshot in table metadata, but not update the current snapshot id.
*
* @return this for method chaining
*/
ThisT stageOnly();
```

**C++:**
```cpp
Derived& StageOnly();
```
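The effect described in the Java doc comment (the snapshot is recorded, but the current snapshot ID is left alone) can be sketched as follows. `TableStateSketch` and `CommitSnapshot()` are hypothetical and stand in for whatever the commit path eventually does with the stage-only flag.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, heavily simplified view of table metadata.
struct TableStateSketch {
  std::vector<std::int64_t> snapshot_ids;
  std::optional<std::int64_t> current_snapshot_id;
};

// Records the new snapshot; only moves the current pointer when not staging.
inline void CommitSnapshot(TableStateSketch& table, std::int64_t new_id,
                           bool stage_only) {
  table.snapshot_ids.push_back(new_id);
  if (!stage_only) {
    table.current_snapshot_id = new_id;
  }
}
```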

### 4. `scanManifestsWith(ExecutorService executorService)` ⏸️ DEFERRED
**Java:**
```java
/**
* Use a particular executor to scan manifests. The default worker pool will be used by default.
*
* @param executorService the provided executor
* @return this for method chaining
*/
ThisT scanManifestsWith(ExecutorService executorService);
```

**C++:** NOT IMPLEMENTED

**Reason:** Requires executor/thread pool infrastructure which is not yet available in the codebase.

**Future Implementation:**
```cpp
// To be added when executor infrastructure is available
Derived& ScanManifestsWith(std::shared_ptr<Executor> executor);
```
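Since no `Executor` type exists yet, the following is only a sketch of what such an abstraction could look like. `ExecutorSketch`, `InlineExecutorSketch`, and `ScanManifestSizes()` are all assumptions made for illustration; the eventual interface may differ.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical interface; the project has no executor type yet.
class ExecutorSketch {
 public:
  virtual ~ExecutorSketch() = default;
  virtual void Submit(std::function<void()> task) = 0;
};

// Runs each task immediately on the calling thread.
class InlineExecutorSketch : public ExecutorSketch {
 public:
  void Submit(std::function<void()> task) override { task(); }
};

// Hypothetical scan: counts the entries of each manifest via the executor.
// A real thread pool would also need a way to wait for all tasks to finish
// before the results are read; the inline executor sidesteps that.
inline std::vector<std::size_t> ScanManifestSizes(
    const std::vector<std::vector<std::string>>& manifests,
    const std::shared_ptr<ExecutorSketch>& executor) {
  std::vector<std::size_t> sizes(manifests.size(), 0);
  for (std::size_t i = 0; i < manifests.size(); ++i) {
    executor->Submit(
        [&sizes, &manifests, i] { sizes[i] = manifests[i].size(); });
  }
  return sizes;
}
```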

### 5. `toBranch(String branch)` ✅ IMPLEMENTED
**Java:**
```java
/**
* Perform operations on a particular branch
*
* @param branch which is name of SnapshotRef of type branch.
*/
default ThisT toBranch(String branch) {
  throw new UnsupportedOperationException(
      String.format(
          "Cannot commit to branch %s: %s does not support branch commits",
          branch, this.getClass().getName()));
}
```

**C++:**
```cpp
Derived& ToBranch(std::string_view branch);
```

**Note:** Java's default implementation throws `UnsupportedOperationException`, so only subclasses that override it support branch commits.
The C++ base class only records the branch name; each derived class must apply it when committing.
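As a sketch of how a derived class might consume the recorded branch, the helper below falls back to the main branch when none was set. `BranchTargetSketch` and `ResolvedBranch()` are hypothetical names; the fallback reflects the header's statement that an unset branch means main.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical holder for the target branch; not the project's class.
class BranchTargetSketch {
 public:
  BranchTargetSketch& ToBranch(std::string_view branch) {
    target_branch_ = std::string(branch);
    return *this;
  }

  // An unset branch is treated as the main branch.
  std::string ResolvedBranch() const { return target_branch_.value_or("main"); }

 private:
  std::optional<std::string> target_branch_;
};
```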

### 6. `validateWith(SnapshotAncestryValidator validator)` ❌ MISSING
**Java:**
```java
/**
* Validate snapshot ancestry before committing.
*/
default ThisT validateWith(SnapshotAncestryValidator validator) {
  throw new UnsupportedOperationException(
      "Snapshot validation not supported by " + this.getClass().getName());
}
```

**C++:** NOT IMPLEMENTED

**Reason:** Not identified during initial implementation review.

**Future Implementation:**
```cpp
// To be added when SnapshotAncestryValidator infrastructure is available
// Note: Java has default implementation that throws UnsupportedOperationException
// Consider whether to provide similar default behavior or omit until needed
```
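One possible shape for a future `ValidateWith()` is sketched here. It assumes the validator can be modeled as a predicate over ancestor snapshot IDs; that is an assumption for illustration, and the Java `SnapshotAncestryValidator` type may look different. `AncestryValidatorSketch` and `ValidatingUpdateSketch` are hypothetical names.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Assumed shape: a predicate over the IDs of the ancestor snapshots.
using AncestryValidatorSketch =
    std::function<bool(const std::vector<std::int64_t>&)>;

// Hypothetical update type; not the project's class.
class ValidatingUpdateSketch {
 public:
  ValidatingUpdateSketch& ValidateWith(AncestryValidatorSketch validator) {
    validator_ = std::move(validator);
    return *this;
  }

  // True when no validator is set or the validator accepts the ancestry.
  bool AncestryIsValid(const std::vector<std::int64_t>& ancestor_ids) const {
    return !validator_.has_value() || (*validator_)(ancestor_ids);
  }

 private:
  std::optional<AncestryValidatorSketch> validator_;
};
```

Treating a missing validator as "valid" is one design choice; mirroring Java's throwing default would be another.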

## Summary

| Method | Java | C++ | Status | Notes |
|--------|------|-----|--------|-------|
| set() | ✅ | ✅ | Implemented | |
| deleteWith() | ✅ | ✅ | Implemented | |
| stageOnly() | ✅ | ✅ | Implemented | |
| scanManifestsWith() | ✅ | ❌ | Deferred | Needs executor infrastructure |
| toBranch() | ✅ (default throws) | ✅ | Implemented | C++ base class stores the branch; derived classes must apply it |
| validateWith() | ✅ (default throws) | ❌ | Missing | Needs SnapshotAncestryValidator |

**Implementation Coverage:** 4/6 methods (67%)
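**Required-Method Coverage:** 3/4 methods without a Java default (75%). Of the two missing methods, only `validateWith()` has a throwing default in Java; `scanManifestsWith()` is abstract in Java and remains deferred here.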

## Next Steps

1. **ScanManifestsWith()**: Add when executor/thread pool infrastructure is available
2. **ValidateWith()**: Add when SnapshotAncestryValidator is implemented
   - Consider whether to provide a no-op implementation initially
   - Java's default implementation throws `UnsupportedOperationException`
   - May be better to omit until validation infrastructure exists
139 changes: 139 additions & 0 deletions src/iceberg/snapshot_update.h
@@ -0,0 +1,139 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

#pragma once

/// \file iceberg/snapshot_update.h
/// API for table updates that produce snapshots

#include <functional>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

#include "iceberg/iceberg_export.h"
#include "iceberg/pending_update.h"
#include "iceberg/type_fwd.h"

namespace iceberg {

/// \brief Interface for updates that produce a new table snapshot
///
/// SnapshotUpdate extends PendingUpdate to provide common methods for all
/// updates that create a new table Snapshot. Implementations include operations
/// like AppendFiles, DeleteFiles, OverwriteFiles, and RewriteFiles.
///
/// This interface uses CRTP (Curiously Recurring Template Pattern) to enable
/// fluent API method chaining in derived classes, matching the Java pattern
/// where SnapshotUpdate<ThisT> allows methods to return the actual derived type.
///
/// Methods included from Java API (4/6):
/// - Set(): Set summary properties
/// - StageOnly(): Stage without updating current snapshot
/// - DeleteWith(): Custom file deletion callback
/// - ToBranch(): Commit to a specific branch
///
/// Methods not yet implemented (2/6):
/// - ScanManifestsWith(): Custom executor for parallel manifest scanning
/// (deferred: requires executor/thread pool infrastructure)
/// - ValidateWith(): Custom snapshot ancestry validation
/// (deferred: requires SnapshotAncestryValidator infrastructure)
///
/// See SNAPSHOT_UPDATE_API_COVERAGE.md for detailed comparison with Java API
///
/// \tparam Derived The actual implementation class (e.g., AppendFiles)
template <typename Derived>
class ICEBERG_EXPORT SnapshotUpdate : public PendingUpdateTyped<Snapshot> {
 public:
  ~SnapshotUpdate() override = default;

  /// \brief Set a summary property on the snapshot
  ///
  /// Summary properties provide metadata about the changes in the snapshot,
  /// such as the operation type, number of files added/deleted, etc.
  /// Setting the same property again overwrites the previous value.
  ///
  /// \param property The property name
  /// \param value The property value
  /// \return Reference to derived class for method chaining
  Derived& Set(std::string_view property, std::string_view value) {
    summary_[std::string(property)] = std::string(value);
    return static_cast<Derived&>(*this);
  }

  /// \brief Stage the snapshot without updating the table's current snapshot
  ///
  /// When StageOnly() is called, the snapshot will be committed to table metadata
  /// but will not update the current snapshot ID. The snapshot will not be added
  /// to the table's snapshot log. This is useful for Write-Audit-Publish (WAP)
  /// workflows, where changes are validated before they are made current.
  ///
  /// \return Reference to derived class for method chaining
  Derived& StageOnly() {
    stage_only_ = true;
    return static_cast<Derived&>(*this);
  }

  /// \brief Set a custom file deletion callback
  ///
  /// By default, files are deleted using the table's FileIO implementation.
  /// This method allows providing a custom deletion callback for use cases like:
  /// - Tracking deleted files for auditing
  /// - Implementing custom retention policies
  /// - Delegating deletion to external systems
  ///
  /// \param delete_func Callback function that will be called for each file to delete
  /// \return Reference to derived class for method chaining
  Derived& DeleteWith(std::function<void(std::string_view)> delete_func) {
    delete_func_ = std::move(delete_func);
    return static_cast<Derived&>(*this);
  }

  /// \brief Commit the snapshot to a specific branch
  ///
  /// By default, snapshots are committed to the table's main branch.
  /// This method allows committing to a named branch instead, which is useful for:
  /// - Write-Audit-Publish (WAP) workflows
  /// - Feature branch development
  /// - Testing changes before merging to main
  ///
  /// \param branch The name of the branch to commit to
  /// \return Reference to derived class for method chaining
  Derived& ToBranch(std::string_view branch) {
    target_branch_ = std::string(branch);
    return static_cast<Derived&>(*this);
  }

 protected:
  SnapshotUpdate() = default;

  /// \brief Summary properties to set on the snapshot
  std::unordered_map<std::string, std::string> summary_;

  /// \brief Whether to stage only without updating current snapshot
  bool stage_only_ = false;

  /// \brief Custom file deletion callback (nullopt means use the table's default)
  std::optional<std::function<void(std::string_view)>> delete_func_;

  /// \brief Target branch name for commit (nullopt means main branch)
  std::optional<std::string> target_branch_;
};

} // namespace iceberg
1 change: 1 addition & 0 deletions src/iceberg/test/CMakeLists.txt
@@ -82,6 +82,7 @@ add_iceberg_test(table_test
json_internal_test.cc
pending_update_test.cc
schema_json_test.cc
snapshot_update_test.cc
table_test.cc
table_metadata_builder_test.cc
table_requirement_test.cc