Merged
46 changes: 29 additions & 17 deletions src/iceberg/catalog/memory/in_memory_catalog.cc
@@ -21,7 +21,6 @@

#include <algorithm>
#include <iterator>
- #include <mutex>

#include "iceberg/table.h"
#include "iceberg/table_metadata.h"
@@ -337,42 +336,42 @@ std::string_view InMemoryCatalog::name() const { return catalog_name_; }

Status InMemoryCatalog::CreateNamespace(
const Namespace& ns, const std::unordered_map<std::string, std::string>& properties) {
- std::lock_guard guard(mutex_);
+ std::unique_lock lock(mutex_);
Contributor

Keep std::lock_guard if you only need an exclusive (write) lock here.

Contributor Author

std::lock_guard cannot be used to acquire a write lock on a std::shared_mutex

Contributor
@HuaHuaY, Dec 10, 2025

std::lock_guard is a template wrapper (cppreference link). Any mutex type that meets BasicLockable can be used with std::lock_guard, including std::shared_mutex.
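
The point about BasicLockable can be checked with a small sketch. `Counter` and `IncrementTwice` are hypothetical names used only for illustration; they are not part of the PR. The sketch assumes C++17 and shows `std::lock_guard` taking the exclusive lock on a `std::shared_mutex`, with `std::shared_lock` used for reads.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

// Hypothetical counter guarded by a std::shared_mutex (not from the PR).
class Counter {
 public:
  // Exclusive (write) access: std::lock_guard only needs lock()/unlock(),
  // which std::shared_mutex provides.
  void Increment() {
    std::lock_guard<std::shared_mutex> guard(mutex_);
    ++value_;
  }

  // Shared (read) access: std::shared_lock uses lock_shared()/unlock_shared().
  int Get() const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return value_;
  }

 private:
  mutable std::shared_mutex mutex_;  // mutable so const readers can lock it
  int value_ = 0;
};

// Single-threaded smoke test of both lock types.
int IncrementTwice() {
  Counter c;
  c.Increment();
  c.Increment();
  return c.Get();
}
```

There is no standard `lock_guard`-style wrapper for shared ownership, so the read side needs `std::shared_lock` either way; the thread is only debating which wrapper to use on the write side.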

Contributor Author

I was wrong: lock_guard can indeed lock a shared_mutex. I would still recommend unique_lock and shared_lock for read-write locking on a shared_mutex, for better code readability.

Contributor
@HuaHuaY, Dec 10, 2025

I don't think there is any difference in code readability here. std::unique_lock has additional overhead (it tracks whether it currently owns the mutex so that it can be unlocked, relocked, and moved), whereas std::scoped_lock and std::lock_guard do not. It is not a like-for-like substitute for std::scoped_lock or std::lock_guard.
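
A minimal sketch of the extra state that `std::unique_lock` carries may help here. The function name `UniqueLockTracksOwnership` is hypothetical. The sketch uses only standard members (`owns_lock`, `unlock`, `lock`, move construction); `std::lock_guard` offers none of these, which is why it can be implemented as a bare reference to the mutex.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <utility>

// Returns true if every step behaves as the comments describe.
bool UniqueLockTracksOwnership() {
  std::shared_mutex mutex;

  std::unique_lock<std::shared_mutex> lock(mutex);  // locks exclusively
  bool ok = lock.owns_lock();

  lock.unlock();  // early release; std::lock_guard has no such member
  ok = ok && !lock.owns_lock();

  lock.lock();  // re-acquire through the same object
  ok = ok && lock.owns_lock();

  // Ownership can be transferred; the moved-from object is left with no
  // associated mutex, so owns_lock() on it returns false.
  std::unique_lock<std::shared_mutex> moved(std::move(lock));
  ok = ok && moved.owns_lock() && !lock.owns_lock();

  return ok;
}  // `moved` releases the mutex when it goes out of scope
```

When a function locks on entry and unlocks on exit, none of this flexibility is used. As noted later in the thread, an optimizer can often remove the bookkeeping in such simple cases.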

Contributor Author

Using unique_lock and shared_lock with shared_mutex makes it clear that this is a read-write lock: seeing unique_lock immediately indicates that exclusive (write) access is being acquired. For other mutex types, use lock_guard.

Member

+1 on std::lock_guard if we care about performance here.

Contributor

The compiler can optimize the simple case (godbolt), but I still suggest keeping the code from doing unnecessary work. The simpler the better.

Contributor Author

I don't think it's unnecessary, and I recommend using the unique_lock/shared_lock pair when locking a shared_mutex.
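
Whichever wrapper is chosen for writes, the semantics of the read-write pair discussed in this thread can be sketched as follows. `Observed` and `ObserveWhileReadLocked` are hypothetical names, and the sketch assumes a platform with `std::thread` support. A second thread is needed because locking a `std::shared_mutex` again from a thread that already owns it is undefined behavior.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>

struct Observed {
  bool reader_got_in = false;  // did a second reader acquire a shared lock?
  bool writer_got_in = true;   // did a writer acquire the exclusive lock?
};

// Holds a shared lock on the calling thread and records what another
// thread can acquire in the meantime.
Observed ObserveWhileReadLocked() {
  std::shared_mutex mutex;
  Observed result;

  std::shared_lock<std::shared_mutex> first_reader(mutex);

  std::thread other([&] {
    {
      // Blocking shared acquisition: it returns because shared owners coexist.
      std::shared_lock<std::shared_mutex> second_reader(mutex);
      result.reader_got_in = second_reader.owns_lock();
    }
    // Non-blocking exclusive attempt: it cannot succeed while
    // first_reader still holds the mutex in shared mode.
    std::unique_lock<std::shared_mutex> writer(mutex, std::try_to_lock);
    result.writer_got_in = writer.owns_lock();
  });
  other.join();  // join makes the writes to `result` visible here

  return result;
}
```

This is the behavior the PR relies on: read-only catalog calls such as ListNamespaces or TableExists can proceed concurrently, while mutating calls wait for exclusive access.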

return root_namespace_->CreateNamespace(ns, properties);
}

Result<std::unordered_map<std::string, std::string>>
InMemoryCatalog::GetNamespaceProperties(const Namespace& ns) const {
- std::lock_guard guard(mutex_);
+ std::shared_lock lock(mutex_);
return root_namespace_->GetProperties(ns);
}

Result<std::vector<Namespace>> InMemoryCatalog::ListNamespaces(
const Namespace& ns) const {
- std::lock_guard guard(mutex_);
+ std::shared_lock lock(mutex_);
return root_namespace_->ListNamespaces(ns);
}

Status InMemoryCatalog::DropNamespace(const Namespace& ns) {
- std::lock_guard guard(mutex_);
+ std::unique_lock lock(mutex_);
return root_namespace_->DropNamespace(ns);
}

Result<bool> InMemoryCatalog::NamespaceExists(const Namespace& ns) const {
- std::lock_guard guard(mutex_);
+ std::shared_lock lock(mutex_);
return root_namespace_->NamespaceExists(ns);
}

Status InMemoryCatalog::UpdateNamespaceProperties(
const Namespace& ns, const std::unordered_map<std::string, std::string>& updates,
const std::unordered_set<std::string>& removals) {
- std::lock_guard guard(mutex_);
+ std::unique_lock lock(mutex_);
return root_namespace_->UpdateNamespaceProperties(ns, updates, removals);
}

Result<std::vector<TableIdentifier>> InMemoryCatalog::ListTables(
const Namespace& ns) const {
- std::lock_guard guard(mutex_);
+ std::shared_lock lock(mutex_);
const auto& table_names = root_namespace_->ListTables(ns);
ICEBERG_RETURN_UNEXPECTED(table_names);
std::vector<TableIdentifier> table_idents;
@@ -387,36 +386,40 @@ Result<std::unique_ptr<Table>> InMemoryCatalog::CreateTable(
const TableIdentifier& identifier, const Schema& schema, const PartitionSpec& spec,
const std::string& location,
const std::unordered_map<std::string, std::string>& properties) {
+ std::unique_lock lock(mutex_);
return NotImplemented("create table");
}

Result<std::unique_ptr<Table>> InMemoryCatalog::UpdateTable(
const TableIdentifier& identifier,
const std::vector<std::unique_ptr<TableRequirement>>& requirements,
const std::vector<std::unique_ptr<TableUpdate>>& updates) {
+ std::unique_lock lock(mutex_);
return NotImplemented("update table");
}

Result<std::shared_ptr<Transaction>> InMemoryCatalog::StageCreateTable(
const TableIdentifier& identifier, const Schema& schema, const PartitionSpec& spec,
const std::string& location,
const std::unordered_map<std::string, std::string>& properties) {
+ std::unique_lock lock(mutex_);
return NotImplemented("stage create table");
}

Result<bool> InMemoryCatalog::TableExists(const TableIdentifier& identifier) const {
- std::lock_guard guard(mutex_);
+ std::shared_lock lock(mutex_);
return root_namespace_->TableExists(identifier);
}

Status InMemoryCatalog::DropTable(const TableIdentifier& identifier, bool purge) {
- std::lock_guard guard(mutex_);
+ std::unique_lock lock(mutex_);
// TODO(Guotao): Delete all metadata files if purge is true.
return root_namespace_->UnregisterTable(identifier);
}

Status InMemoryCatalog::RenameTable(const TableIdentifier& from,
const TableIdentifier& to) {
+ std::unique_lock lock(mutex_);
return NotImplemented("rename table");
}

@@ -426,31 +429,40 @@ Result<std::unique_ptr<Table>> InMemoryCatalog::LoadTable(
return InvalidArgument("file_io is not set for catalog {}", catalog_name_);
}

- Result<std::string> metadata_location;
+ std::string metadata_location;
{
- std::lock_guard guard(mutex_);
+ std::shared_lock lock(mutex_);
ICEBERG_ASSIGN_OR_RAISE(metadata_location,
root_namespace_->GetTableMetadataLocation(identifier));
}

ICEBERG_ASSIGN_OR_RAISE(auto metadata,
- TableMetadataUtil::Read(*file_io_, metadata_location.value()));
+ TableMetadataUtil::Read(*file_io_, metadata_location));

- return std::make_unique<Table>(identifier, std::move(metadata),
- metadata_location.value(), file_io_,
+ return std::make_unique<Table>(identifier, std::move(metadata), metadata_location,
+ file_io_,
std::static_pointer_cast<Catalog>(shared_from_this()));
}

Result<std::shared_ptr<Table>> InMemoryCatalog::RegisterTable(
const TableIdentifier& identifier, const std::string& metadata_file_location) {
- std::lock_guard guard(mutex_);
+ if (!file_io_) [[unlikely]] {
+ return InvalidArgument("file_io is not set for catalog {}", catalog_name_);
+ }
+
+ ICEBERG_ASSIGN_OR_RAISE(auto metadata,
+ TableMetadataUtil::Read(*file_io_, metadata_file_location));
Comment on lines +453 to +454
Member

Reading the metadata could be delayed until the registration succeeds. Alternatively, we could add a new LoadTableImpl that does not acquire the lock and wraps the internal logic.
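
The second suggestion matters because `std::shared_mutex`, unlike the `std::recursive_mutex` it replaces, must not be locked again by a thread that already owns it. One way to read the suggestion is sketched below. `Registry`, `Load`, `Register`, and `LoadImpl` are hypothetical names chosen for illustration and are not the InMemoryCatalog API.

```cpp
#include <cassert>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <unordered_map>

// Hypothetical registry illustrating "public methods lock, Impl does not".
class Registry {
 public:
  std::optional<std::string> Load(const std::string& name) const {
    std::shared_lock lock(mutex_);
    return LoadImpl(name);
  }

  // Stores the location and returns it; std::nullopt if the name is taken.
  std::optional<std::string> Register(const std::string& name,
                                      const std::string& location) {
    std::unique_lock lock(mutex_);
    if (!locations_.emplace(name, location).second) return std::nullopt;
    // Calling Load() here would lock mutex_ a second time on this thread,
    // which is undefined behavior for std::shared_mutex.
    return LoadImpl(name);
  }

 private:
  // Precondition: the caller already holds mutex_ (shared or exclusive).
  std::optional<std::string> LoadImpl(const std::string& name) const {
    auto it = locations_.find(name);
    if (it == locations_.end()) return std::nullopt;
    return it->second;
  }

  mutable std::shared_mutex mutex_;
  std::unordered_map<std::string, std::string> locations_;
};
```

The diff takes a different route for RegisterTable: it reads the metadata before locking and constructs the Table directly instead of calling LoadTable, which avoids the re-entrant lock as well.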

Contributor Author

There is an issue with that: if the file is invalid, register returns a failure, but the map may already contain an entry, so subsequent calls to register will return "already exists".
Therefore, I think we should validate the metadata file in advance, ensuring that only valid table metadata can be registered.
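
The ordering argued for here (validate first, mutate the map second) can be sketched as follows. `TableRegistry`, `RegisterStatus`, and `IsValidMetadata` are hypothetical; in particular, `IsValidMetadata` is a stand-in that treats any non-empty location as valid, whereas the PR reads and parses the metadata file through TableMetadataUtil::Read.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>

enum class RegisterStatus { kOk, kInvalidMetadata, kAlreadyExists };

// Stand-in for reading and parsing a metadata file (assumption: any
// non-empty location counts as valid in this sketch).
inline bool IsValidMetadata(const std::string& location) {
  return !location.empty();
}

class TableRegistry {
 public:
  RegisterStatus Register(const std::string& name, const std::string& location) {
    // Step 1: validate before touching shared state. No lock is held, so
    // slow I/O would not block other callers, and a failure leaves no entry.
    if (!IsValidMetadata(location)) return RegisterStatus::kInvalidMetadata;

    // Step 2: only now take the exclusive lock and insert.
    std::unique_lock lock(mutex_);
    const bool inserted = tables_.emplace(name, location).second;
    return inserted ? RegisterStatus::kOk : RegisterStatus::kAlreadyExists;
  }

  bool Contains(const std::string& name) const {
    std::shared_lock lock(mutex_);
    return tables_.count(name) > 0;
  }

 private:
  mutable std::shared_mutex mutex_;
  std::unordered_map<std::string, std::string> tables_;
};
```

With this ordering, a failed registration can be retried with a corrected file instead of hitting "already exists". The cost is that the file is read even when the registration is later rejected for another reason, such as a missing namespace.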


+ std::unique_lock lock(mutex_);
if (!root_namespace_->NamespaceExists(identifier.ns)) {
return NoSuchNamespace("table namespace does not exist.");
}
if (!root_namespace_->RegisterTable(identifier, metadata_file_location)) {
return UnknownError("The registry failed.");
}
- return LoadTable(identifier);
+ return std::make_unique<Table>(identifier, std::move(metadata), metadata_file_location,
+ file_io_,
+ std::static_pointer_cast<Catalog>(shared_from_this()));
}

} // namespace iceberg
4 changes: 2 additions & 2 deletions src/iceberg/catalog/memory/in_memory_catalog.h
@@ -19,7 +19,7 @@

#pragma once

- #include <mutex>
+ #include <shared_mutex>

#include "iceberg/catalog.h"

@@ -103,7 +103,7 @@ class ICEBERG_EXPORT InMemoryCatalog
std::shared_ptr<FileIO> file_io_;
std::string warehouse_location_;
std::unique_ptr<class InMemoryNamespace> root_namespace_;
- mutable std::recursive_mutex mutex_;
+ mutable std::shared_mutex mutex_;
};

} // namespace iceberg