Over the past months, we've collaborated with a wide range of stakeholders—companies, developers, and users—who are invested in the evolution of Apache Polaris. This roadmap consolidates those insights into a shared vision, ensuring that our efforts address the most impactful and widely supported improvements. We appreciate the valuable feedback and collaboration that have shaped this direction.
Note that features can move in and out of milestones based on prioritization and available resources.
The roadmap items fall broadly into the following categories:
Core Polaris functions
Catalog Federation and Integrations
Data Security, Data Governance and Compliance
Observability and Reliability
AI/ML
Feature List
| Category | Feature | 0.9* | 1.0* | 1.1* | 1.2* | 1.3* | 1.4* | 1.5+* |
|---|---|---|---|---|---|---|---|---|
| Core Polaris | Iceberg REST Spec support (including view and multi-table transactions) | X | X | X | X | X | X | X |
| | Support for Delta Format as Foreign Tables | | | X | X | X | X | X |
| | Policy Store | | X | X | X | X | X | X |
| | Table Maintenance Framework | | | X | X | X | X | X |
| | SQL & NoSQL Persistence layer | | X | X | X | X | X | X |
| | Catalog Browser experience (UI) | | | | | | | X |
| Catalog Federation & Integrations | Catalog Federation | | X | X | X | X | X | X |
| | Catalog Migrator | | X | X | X | X | X | X |
| | Identity Federation and SSO | | | X | X | X | X | X |
| Data Security, Governance and Compliance | Role-based Access Control (RBAC) | X | X | X | X | X | X | X |
| | Credential Vending (including support for S3, ADLS and GCP) | X | X | X | X | X | X | X |
| | Table Governance Policies | | | X | X | X | X | X |
| | Row level and column level policies | | | | X | X | X | X |
| | Audit and events interface | | | X | X | X | X | X |
| | Data Lineage | | | | | X | X | X |
| | Data tagging and classifications | | | | | | | X |
| Observability and Telemetry | Data lake operational metrics | | | | | | | X |
| | Data health monitoring and alerts | | | | | | | X |
| AI/ML | Volumes/Directory Tables | | | | | | | X |

*This is a tentative proposed version for these features.
Core Polaris Functions
Foreign Tables
Enable support for non-Iceberg table formats through the concept of Foreign Tables. Foreign tables behave much like regular tables, with an additional format attribute that identifies the table format. This makes it possible to support additional table formats over time.
Milestone: 1.0
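As a rough illustration of the "additional format attribute" idea, here is a minimal Python sketch. The class and field names (`TableEntry`, `TableFormat`, `is_foreign`) are hypothetical and chosen only for this example; they are not taken from the Polaris entity model or its API.

```python
from dataclasses import dataclass
from enum import Enum


class TableFormat(Enum):
    """Table formats known to this sketch (an assumed, non-exhaustive set)."""
    ICEBERG = "iceberg"
    DELTA = "delta"


@dataclass(frozen=True)
class TableEntry:
    """Hypothetical catalog entry; not the actual Polaris entity model."""
    namespace: str
    name: str
    location: str
    # The extra attribute that distinguishes a foreign table from a regular one.
    format: TableFormat = TableFormat.ICEBERG

    @property
    def is_foreign(self) -> bool:
        # In this sketch, any non-Iceberg table counts as a foreign table.
        return self.format is not TableFormat.ICEBERG


def list_by_format(entries, fmt):
    """Return the entries registered with the given table format."""
    return [entry for entry in entries if entry.format is fmt]


orders = TableEntry("sales", "orders", "s3://bucket/sales/orders")
clicks = TableEntry("web", "clicks", "s3://bucket/web/clicks", TableFormat.DELTA)
delta_tables = list_by_format([orders, clicks], TableFormat.DELTA)
```

Everything else about the entry (namespace, name, location) is shared between regular and foreign tables, which is what would let governance features treat both kinds uniformly.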
Delta format Support
Apache Polaris supports the Iceberg REST catalog, which provides a flexible way to manage and query large datasets. Supporting additional table formats, such as Delta, would extend this by allowing governance, compliance, data management, disaster recovery, and migrations for all tables in the same catalog.
This includes reading data from Delta tables by generating Iceberg metadata (more details: Delta (and other format) Table Support in Polaris), as well as both Delta reads and writes from engines such as Apache Spark.
Milestone: 1.0
Policy Store
Apache Polaris' support for a Policy Store allows it to serve as a centralized repository for all policies related to data assets, ensuring consistent governance and compliance across the organization. This includes policies for table maintenance, access control, data security, and overall data governance, enabling administrators to enforce, track, and audit these policies. By consolidating policy management in Polaris, organizations can streamline their data management processes while maintaining compliance and security standards.
More details: Policy Management in Apache Polaris
Milestone: 1.0
Table Maintenance Framework
The Table Maintenance Framework stores the table maintenance policies, properties, statistics, and events needed to perform table maintenance and optimization. It does not include the maintenance operations themselves, which need compute infrastructure to run. More details: Table Maintenance in Polaris
Milestone: 1.1
SQL and NoSQL Persistence
Enable SQL (e.g. Postgres) and NoSQL (e.g. DynamoDB, Cassandra) persistence backends for Polaris. More details: Apache Polaris (incubating) - SQL/NoSQL persistence backend support
Milestone: 1.0
S3-compatible storage support
Support S3-compatible storage such as MinIO, Ceph, and Dell ECS. More details: apache/polaris#389.
Milestone: 1.0
Catalog Browser experience (UI)
A user interface for Apache Polaris that lets users browse catalogs, databases, and tables, and provides basic operations for policy management and other governance functions.
Milestone: 1.5+
Catalog Federation and Integrations
Catalog Federation
Enable federation of reads and writes to any remote catalog, making Apache Polaris a catalog of catalogs. This primarily covers catalogs that support the Iceberg REST Catalog (IRC) and Hive protocols. Some details: Polaris Roadmap and Catalog Federation Diagrams
Milestone: 1.1
Catalog Migrator
Users may want to move Iceberg tables from other catalog solutions into Apache Polaris. The Catalog Migrator enables migration of tables registered in catalogs such as Glue, Hive, or other Iceberg REST catalogs into Apache Polaris.
Milestone: 1.0
Data Security, Data Governance and Compliance
Governance Policies for Tables
Polaris will provide the ability to define access policies and other governance policies (such as retention) per table. More details: Policy Management in Apache Polaris
Milestone: 1.2
Column level and Row level Policies
Provide the capability to define and enforce column-level and row-level access policies and other governance policies. More details: Policy Management in Apache Polaris
Milestone: 1.2
Identity federation, SCIM, SSO and OAuth support
Supporting SCIM and SAML is essential for efficient user provisioning, seamless access management, and enhanced security, ensuring that users can securely access and manage data resources while complying with organizational policies. This also enables identity federation and OAuth federation with third-party identity providers. More details: Adding Federated User and Role Support in Polaris
Milestone: 1.0
Audit and Events Interface
Enable audit logs and history for catalog, database, table, property, and policy changes through an events interface. Initial spec details: Polaris Event Listeners
Milestone: 1.2
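To make the idea of an events interface feeding an audit history more concrete, here is a small Python sketch. It is an illustration only: `CatalogEvent`, `AuditLog`, and `on_event` are hypothetical names invented for this example and do not reflect the Polaris Event Listeners spec.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class CatalogEvent:
    """Hypothetical change event; field names are assumptions for this sketch."""
    entity_type: str   # e.g. "catalog", "database", "table", "property", "policy"
    entity_name: str
    action: str        # e.g. "create", "update", "drop"
    principal: str     # who made the change
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class AuditLog:
    """In-memory listener that records every event it receives."""

    def __init__(self):
        self._events = []

    def on_event(self, event: CatalogEvent) -> None:
        # Append-only: audit records are never modified after the fact.
        self._events.append(event)

    def history(self, entity_type: str, entity_name: str):
        """Return all recorded events for one entity, in arrival order."""
        return [
            e for e in self._events
            if e.entity_type == entity_type and e.entity_name == entity_name
        ]


log = AuditLog()
log.on_event(CatalogEvent("table", "sales.orders", "create", "alice"))
log.on_event(CatalogEvent("policy", "retention-90d", "update", "bob"))
log.on_event(CatalogEvent("table", "sales.orders", "update", "bob"))
orders_history = log.history("table", "sales.orders")
```

A real listener would presumably write to durable storage rather than a Python list; the in-memory list is used here only to keep the sketch self-contained.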
Data Lineage
Data Lineage functionality allows users to trace the flow of data across different systems and tables, providing visibility into its origin and usage. This feature enhances data governance, auditability, and troubleshooting by visually representing the data's lifecycle from source to destination. This includes table and column lineage.
Milestone: 1.5+
Data Tagging and Classification
Enable categorizing and labeling data assets based on predefined criteria, such as data type, sensitivity, or usage. This helps organizations organize, search, and secure their data by assigning meaningful tags and classifications, enabling better governance and compliance management. Through this feature, users can quickly locate relevant data and ensure appropriate access controls are in place.
Milestone: 1.5+
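The "locate relevant data by tag" use case can be sketched as a simple inverted index from tags to assets. This is a hypothetical illustration; `TagIndex` and the tag strings below are made up for the example and say nothing about how Polaris will store or query tags.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical inverted index mapping each tag to the assets carrying it."""

    def __init__(self):
        self._assets_by_tag = defaultdict(set)

    def tag(self, asset: str, *tags: str) -> None:
        """Attach one or more tags to an asset."""
        for t in tags:
            self._assets_by_tag[t].add(asset)

    def find(self, *tags: str) -> set:
        """Return the assets that carry all of the given tags."""
        if not tags:
            return set()
        # .get avoids creating empty entries for unknown tags.
        matches = [self._assets_by_tag.get(t, set()) for t in tags]
        return set.intersection(*matches)


index = TagIndex()
index.tag("sales.customers", "sensitivity:pii", "domain:sales")
index.tag("sales.orders", "domain:sales")
index.tag("hr.employees", "sensitivity:pii", "domain:hr")

pii_assets = index.find("sensitivity:pii")
sales_pii = index.find("sensitivity:pii", "domain:sales")
```

A lookup like `find("sensitivity:pii")` is the kind of query that would let an administrator review access controls on every sensitive asset at once.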
Encryption Support
Enable support for encrypted Iceberg tables by managing Key Management Service (KMS) integrations. Facilitate the vending of encryption keys, ensuring seamless key retrieval and rotation for standard KMS solutions.
Milestone: 1.3
Observability, Telemetry and Reliability
Data Lake Operational Metrics
Provide operational metrics on catalogs, databases, and tables to support operational manageability of the data lake. This includes:
Data-level metrics: number of files, number of partitions, partition sizes, total table size, and more
Access metrics: number of read/write accesses per table, query load per file, read/write latency
Data health metrics: data skew, data freshness, and more
Storage-level metrics: storage utilization, number of small files, storage growth, hot partitions, and more
Milestone: 1.5
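Several of the data-level and storage-level metrics listed above can be derived from a listing of a table's data files. The Python sketch below shows one way to compute them; the `DataFile` shape and the 32 MiB small-file threshold are assumptions made for this example, not values defined by Polaris.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed threshold below which a file counts as "small" in this sketch.
SMALL_FILE_BYTES = 32 * 1024 * 1024


@dataclass(frozen=True)
class DataFile:
    """Hypothetical file listing entry: which partition it is in and its size."""
    partition: str
    size_bytes: int


def table_metrics(files, small_file_bytes=SMALL_FILE_BYTES):
    """Aggregate per-table metrics from an iterable of DataFile entries."""
    files = list(files)
    partition_sizes = defaultdict(int)
    for f in files:
        partition_sizes[f.partition] += f.size_bytes
    return {
        "file_count": len(files),
        "partition_count": len(partition_sizes),
        "partition_sizes": dict(partition_sizes),
        "total_size_bytes": sum(partition_sizes.values()),
        "small_file_count": sum(
            1 for f in files if f.size_bytes < small_file_bytes
        ),
    }


MIB = 1024 * 1024
metrics = table_metrics([
    DataFile("day=2025-01-01", 128 * MIB),
    DataFile("day=2025-01-01", 4 * MIB),
    DataFile("day=2025-01-02", 64 * MIB),
])
```

Access metrics such as read/write latency cannot be derived this way, since they depend on query-time observations rather than file listings.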
Data Health Monitoring and Alerts
Provide capabilities to monitor and alert on the health of the data, including:
Table size and growth monitoring
Uncompacted partitions and files
Data skew monitoring and alerts
Milestone: 1.5
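As one possible reading of "data skew monitoring and alerts", the Python sketch below flags tables whose largest partition is much bigger than the average partition. The skew definition (largest partition size divided by mean partition size) and the default threshold of 3.0 are assumptions chosen for illustration; the roadmap does not define either.

```python
from statistics import mean


def skew_ratio(partition_sizes):
    """Largest partition size divided by the mean partition size.

    A ratio of 1.0 means perfectly even partitions; larger values mean more
    skew. This definition is an assumption made for this sketch.
    """
    sizes = list(partition_sizes.values())
    if not sizes:
        return 0.0
    average = mean(sizes)
    return max(sizes) / average if average > 0 else 0.0


def skew_alerts(tables, threshold=3.0):
    """Return (table name, ratio) pairs for tables whose skew exceeds threshold.

    `tables` maps a table name to a {partition: size_in_bytes} dict.
    """
    alerts = []
    for name, partition_sizes in tables.items():
        ratio = skew_ratio(partition_sizes)
        if ratio > threshold:
            alerts.append((name, ratio))
    return alerts


tables = {
    "sales.orders": {"p1": 10, "p2": 10, "p3": 40},
    "web.clicks": {"p1": 1, "p2": 1, "p3": 1, "p4": 1, "p5": 96},
}
alerts = skew_alerts(tables, threshold=3.0)
```

In this example `sales.orders` has a ratio of 2.0 and stays below the threshold, while `web.clicks` has a ratio of 4.8 and is reported.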
AI/ML
Volumes/Directory Tables
A table-like entity such as a volume can be used to organize and manage unstructured data. Volumes provide a way to group related data files logically, similar to directories or containers. More details: Unstructured Data Support in Polaris
Milestone: 1.5
sfc-gh-ygu changed the title from "Apache Polaris OSS Roadmap 2025" to "Apache Polaris OSS Roadmap Proposal" on Feb 19, 2025