Fix: Add ENUM and TIMESTAMP support to histogram generation #61
Pull Request Summary
This PR adds proper handling for `ENUM` and `TIMESTAMP` data types in the Videx histogram module, addressing issues where these types were not correctly processed during histogram bucket generation and selectivity calculation.
Related Issues
Resolves: #55
Changes Made
- Type Conversion (`convert_str_by_type`):
  - `enum` added to the string-type handling list
  - `timestamp` added to the datetime-type handling list
- Histogram Position Calculation (`find_nearest_key_pos`):
  - handles `enum` types
  - handles `timestamp` types
- SQL Value Formatting (`_format_value_by_type_in_sql`):
  - `ENUM` handling (format with quotes and escape)
  - `TIMESTAMP` handling (format with quotes)
- Bucket Boundary Generation (`_get_uniform_buckets`):
  - `ENUM` and `TIMESTAMP` added to sampling-based boundary generation
How to test ENUM and TIMESTAMP
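As a quick check that does not need a database, the quoting behaviour described for SQL value formatting can be sketched in isolation. The helper below is hypothetical: `format_value_for_sql` and its signature are assumptions made for illustration, not the actual `_format_value_by_type_in_sql` from this PR. It only shows the idea of quoting and escaping `ENUM` values and quoting `TIMESTAMP` values.

```python
from datetime import datetime


def format_value_for_sql(value, data_type):
    """Hypothetical helper: render a Python value as a quoted SQL literal.

    Not the Videx implementation; a sketch of "format with quotes and escape".
    """
    dtype = data_type.lower()
    if dtype == "enum":
        # Escape backslashes first, then double any single quotes,
        # and wrap the result in single quotes.
        escaped = str(value).replace("\\", "\\\\").replace("'", "''")
        return f"'{escaped}'"
    if dtype == "timestamp":
        # Assumed textual form: 'YYYY-MM-DD HH:MM:SS'.
        if isinstance(value, datetime):
            value = value.strftime("%Y-%m-%d %H:%M:%S")
        return f"'{value}'"
    raise ValueError(f"unsupported type in this sketch: {data_type}")


print(format_value_for_sql("it's active", "ENUM"))
print(format_value_for_sql(datetime(2024, 1, 2, 3, 4, 5), "timestamp"))
```

The type name is lower-cased before comparison so that `ENUM` and `enum` are treated alike, mirroring the mixed casing used in the change list.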
generate_bulk_data_simple.py
source_schema_simple.sql
1. Generate the test data
2. Import the data into the database
3. Start videx_server and collect data
4. Test TIMESTAMP
5. Test ENUM (not yet fixed)
However, the ENUM tests still have problems. For example, both the user and product tables contain an ENUM column named `status`; the user query returns rows correctly, but the query on product.status fails.
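When investigating ENUM behaviour, it may help to picture what sampling-based boundary generation does. The sketch below is an assumption-laden illustration, not the `_get_uniform_buckets` code from this PR: `sample_bucket_boundaries` is a hypothetical name, and it simply splits already-sorted sample values into buckets of roughly equal row count.

```python
def sample_bucket_boundaries(sorted_samples, n_buckets):
    """Hypothetical sketch: (min, max) pairs for roughly equal-count buckets.

    `sorted_samples` must already be sorted. Sorting ENUM labels as strings
    is an assumption of this sketch; MySQL itself orders ENUM values by
    their index in the column definition, not alphabetically.
    """
    if n_buckets <= 0:
        raise ValueError("n_buckets must be positive")
    n = len(sorted_samples)
    if n == 0:
        return []
    # Never create more buckets than there are sampled rows.
    n_buckets = min(n_buckets, n)
    boundaries = []
    for i in range(n_buckets):
        start = i * n // n_buckets          # first row of bucket i
        end = (i + 1) * n // n_buckets - 1  # last row of bucket i
        boundaries.append((sorted_samples[start], sorted_samples[end]))
    return boundaries


samples = sorted(["active", "inactive", "active", "banned", "inactive", "inactive"])
print(sample_bucket_boundaries(samples, 3))
```

Because boundaries come from sampled values rather than arithmetic on a numeric range, the same routine works for types such as ENUM and TIMESTAMP where evenly spaced interpolation between a minimum and maximum is not meaningful or not convenient.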
More Results
More execution results for the EXPLAIN statements are available in innodb_sql.test and videx_sql.
Contribution Guidelines (Expand for Details)
We appreciate your contribution to VIDEX! To ensure a smooth review process and maintain high code quality, please adhere to the following guidelines:
Pull Request Title Format
Your PR title should start with one of these prefixes to indicate the nature of the change:
- [Core]: Changes to core engine functionality
- [Opt]: Changes to VIDEX-Optimizer-Plugin
- [Stats]: Changes to VIDEX-Statistic-Server
- [Algo]: Implementation of new algorithms for NDV, cardinality estimation, etc.
- [Pipe]: Enhancements to the pipeline (e.g., data collection, environment setup)
- [Bug]: Corrections to existing functionality
- [CI]: Changes to build process or CI pipeline
- [Docs]: Updates or additions to documentation
- [Test]: Adding or updating tests
- [Perf]: Performance improvements
- [Misc]: For changes not covered above (use sparingly)
Note: For changes spanning multiple categories, use the most specific prefix or multiple prefixes in order of importance (e.g., [Algorithm][Stats]).
Submission Checklist
By submitting this PR, you confirm that you've read these guidelines and your changes align with the project's contribution standards.