Description
This roadmap tracks the upcoming features and improvements for VIDEX - The Disaggregated, Extensible Virtual Index Engine for What-If Analysis in MySQL. Version 0.2.0 focuses on MySQL 8.0 adaptation, standalone MySQL 5.7 support, and CI/CD improvements.
If you're interested in working on any of these issues, please respond to the related issue or create a new one.
🚀 Version 0.2.0
System Features
- ✴️ Enhance Python usability via a PyPI package to simplify installation and broaden version compatibility. ([Initiative] Enhance Python Usability and Simplify Installation #71)
- Concurrency Support in Statistic-Server (#28)
Description: Validate and enhance VIDEX-Statistic-Server's concurrency capabilities to handle numerous requests with minimal instances.
- Plugin-side Request Similarity Cache (#29)
Description: Implement caching to intercept duplicate requests on the plugin side.
- Standalone Mode for MySQL 5.7 (Inquiry about VIDEX supporting MySQL 5.x versions #3)
Description: Leverage MySQL 8.0 VIDEX-optimizer to support MySQL 5.7 in Standalone mode.
PR: ([Stats|Pipe|Perf] support MySQL 57 in standalone mode, support descending indexes #14)
- Descending Index Support
Description: Add support for descending indexes introduced in MySQL 8.0.
PR: ([Stats|Pipe|Perf] support MySQL 57 in standalone mode, support descending indexes #14)
- Plugin Mode for MySQL 8.0.42 (Plugin build failures on MySQL 8.0.42 #9)
Description: Ensure compatibility with MySQL 8.0.42 in plugin mode.
PR: ([Compatibility] MySQL 8.0.42 #17)
- Multi-version Plugin Support (Supporting Multiple MySQL/Percona Versions in C++ Plugin #10)
Description: Address minor differences between Percona and MySQL versions through compilation parameters, enabling VIDEX-plugin to support various MySQL and Percona versions.
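To make the request cache item above more concrete, here is a minimal sketch of the idea, assuming the simplest notion of "similar": two requests count as duplicates when their payloads are identical after canonicalization. It is written in Python for brevity even though the plugin itself is C++; the class name `RequestCache`, the `fetch` callback, and the payload fields are hypothetical and not part of VIDEX.

```python
import json
from collections import OrderedDict
from typing import Any, Callable, Dict


class RequestCache:
    """Small LRU cache keyed by a canonical form of the request payload.

    Hypothetical illustration only; not the VIDEX plugin implementation.
    """

    def __init__(self, fetch: Callable[[Dict[str, Any]], Any], capacity: int = 1024):
        self._fetch = fetch            # called only on a cache miss
        self._capacity = capacity
        self._entries: "OrderedDict[str, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(request: Dict[str, Any]) -> str:
        # Sorting keys maps logically identical payloads to the same cache key.
        return json.dumps(request, sort_keys=True, separators=(",", ":"))

    def get(self, request: Dict[str, Any]) -> Any:
        key = self._key(request)
        if key in self._entries:
            self._entries.move_to_end(key)      # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._fetch(request)            # forward to the statistics server
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)   # evict the least recently used entry
        return value
```

A real similarity cache could go further, for example by treating requests that differ only in literal values as equivalent, but that depends on which statistics are sensitive to those values.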
Algorithm Features
- Sampling-based Statistic Collection (NDV by sampling instead of full table scan #12)
Description: Develop efficient methods using sampling or modeling to generate statistical information without querying the original database. Requires knowledge of data sampling, single-column NDV estimation, histogram generation, and distribution fitting.
- Multi-Column Cardinality Estimation Enhancement
Description: Implement improved calculation of multi-column cardinality based on sampled data, with special optimization for correlated columns. Requires knowledge of data-driven cardinality estimation.
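As a rough illustration of what sampling-based NDV estimation can look like, the sketch below uses the GEE estimator (Charikar et al.), which scales the number of values seen exactly once in the sample by sqrt(N/n) and adds the number of values seen more than once. This is one well-known estimator chosen for illustration, not necessarily the method VIDEX will adopt; the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def estimate_ndv_gee(sample: Sequence[Hashable], table_rows: int) -> float:
    """Estimate the number of distinct values (NDV) from a uniform row sample.

    GEE: sqrt(N / n) * f1 + sum(f_j for j >= 2), where f_j is the number of
    values that occur exactly j times in the sample.
    """
    n = len(sample)
    if n == 0:
        return 0.0
    if table_rows < n:
        raise ValueError("table_rows must be at least the sample size")
    value_counts = Counter(sample)                  # value -> occurrences in sample
    freq_of_freq = Counter(value_counts.values())   # j -> number of values seen j times
    f1 = freq_of_freq.get(1, 0)
    repeated = sum(c for j, c in freq_of_freq.items() if j >= 2)
    estimate = math.sqrt(table_rows / n) * f1 + repeated
    return min(estimate, float(table_rows))         # NDV cannot exceed the row count


def estimate_joint_ndv(col_a: Sequence[Hashable], col_b: Sequence[Hashable],
                       table_rows: int) -> float:
    """Multi-column NDV estimated on sampled value pairs rather than as a
    product of per-column NDVs, so correlation between columns is respected."""
    return estimate_ndv_gee(list(zip(col_a, col_b)), table_rows)
```

For two perfectly correlated columns, the pair-based estimate stays close to the single-column NDV, whereas multiplying per-column NDVs under an independence assumption would overestimate it. That gap is the kind of error the multi-column item above aims to reduce.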
Documentation
- Protocol & Interface & AI Documentation
Description: Create comprehensive documentation for the RESTful layer to help system developers implement plugins or Statistic-Servers, and for the algorithm interfaces to help algorithm developers integrate new algorithms. (Add API Documentation for VIDEX RESTful Services #25)
Benchmark and Performance
- Comprehensive Testing on TPCH and JOB
Description: Validate the completeness of VIDEX 8.0's support for MySQL 5.7 using standard benchmarks.
CI/CD and Developer Productivity
- Improve Build and Test Automation (Add Basic CI/CD Pipeline for VIDEX Testing #36, [CI] Implement a Basic CI Pipeline for VIDEX testing #40)
Description: Enhance CI/CD pipelines to ensure consistent quality across supported database versions.
- Community Contribution Framework (Add CONTRIBUTING.md #34)
Description: Establish clear guidelines and processes for external contributions to both system and algorithm components.
Code Refactoring & Documentation Enhancement
- [Warning] Normalize Dockerfile casing: as → AS to resolve build warning #39
- [Docs]: resolve broken anchor link for TPC-H example (#11) #30
- [Bug] [Docs] Improve Docker build and installation doc #27
VIDEX 0.3.0
- ✴️ Integrate core statistics logic into the plugin, reducing dependency on the external Python server for non-AI parts. ([Initiative] Integrate Core Statistics Logic into the Plugin #72)
- VIDEX Web Tool
Description: Create a web interface for direct database connection, information collection, query analysis, and index management.
🔮 Long Term & Exploratory (Future Versions)
VIDEX aims to provide a database virtual engine that accurately simulates query plans without requiring real data, supporting index recommendation and join order optimization. Currently supporting Percona, VIDEX will expand to MySQL/MariaDB and PostgreSQL. The algorithm layer will evolve from current heuristic algorithms to AI-boosted solutions, extending beyond cardinality and NDV estimation to simulate all database information.
System Features
- MariaDB Adaptation (Feature request: support MariaDB #1)
Description: Collaborate with MariaDB to adapt VIDEX-optimizer for MariaDB. Requires MariaDB development and C++ expertise.
- PostgreSQL Adaptation
Description: Extend VIDEX's separated, AI-expandable architecture to PostgreSQL. This is a large task requiring PG development experience and knowledge of hypoPG and cardinality patches.
- MySQL 5.7 Plugin Mode
Description: Rewrite VIDEX-plugin to be compatible with MySQL 5.7 in plugin mode.
- Alternative VIDEX-Statistic Implementations (Self-implemented Videx-stat-server With SpringBoot #8)
Description: Develop VIDEX-Statistic-Server implementations in other languages (e.g., Java SpringBoot).
- Mock Index_read Implementation
Description: Address the unsupported index_read interface by implementing mocking, virtual data returns, or direct original database access.
- Virtual Histogram Support
Description: Implement histogram mocking to simulate the impact of MySQL 8.0 histograms on query plans.
Algorithm Features
- Dataless NDV Estimation
Description: Retrain PLM4NDV on commercially available datasets for NDV prediction without data access.
- Dataless Cardinality Estimation
Description: Implement single-table cardinality estimation without requiring data access.
- Index Cache Percentage Estimation
Description: Develop models to predict index_cached_pct values accurately.
- LLM-Native Dataless Database
Description: Generate simulated statistical information and data distributions based solely on user language descriptions of data scale, range, and correlations, even before database creation.
