S3 hive style writes #697
base: antalya
Depends on #700, and writing more tests is the only thing missing, I guess.
…rt table function
This is an automated comment for commit abdd84a with a description of existing statuses. It's updated for the latest CI run.
throw Exception(ErrorCodes::LOGICAL_ERROR, "Table level partition expression and query level partition expression can't be specified together, this is a bug");
}

static std::unordered_map<std::string, bool> partitioning_style_to_wildcard_acceptance =
This looks a little over-engineered. It makes sense only if we expect more variants in the future.
That's my assumption. Do you think we could keep it, or do you want it changed?
Code works, can be kept, I think.
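For readers following the thread, the lookup table under discussion could be sketched as below. The diff excerpt shows only the map's declaration, so the style names ("wildcard", "hive"), their boolean values, and the helper styleAcceptsWildcard are all assumptions made for illustration, not the PR's actual contents.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical entries: the excerpt does not show what the real map contains.
static const std::unordered_map<std::string, bool> partitioning_style_to_wildcard_acceptance = {
    {"wildcard", true},  // assumed: the path may carry a partition wildcard
    {"hive", false},     // assumed: hive-style paths are built from key=value segments
};

// Hypothetical helper: reports whether a style accepts a wildcard in the path.
// Throws std::invalid_argument for a style that is missing from the table.
bool styleAcceptsWildcard(const std::string & style)
{
    const auto it = partitioning_style_to_wildcard_acceptance.find(style);
    if (it == partitioning_style_to_wildcard_acceptance.end())
        throw std::invalid_argument("Unknown partitioning style: " + style);
    return it->second;
}
```

With only two entries a plain comparison would do the same job, which is the reviewer's point; the table pays off once further styles are added.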
configuration->partitioning_style);
}

if (configuration->withPartitionWildcard() && !partitioning_style_to_wildcard_acceptance.at(configuration->partitioning_style))
The withPartitionWildcard method does a substring search on every call, so it is better to call it once and keep the result in a local variable.
Makes sense
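A minimal sketch of the suggested change follows. The Configuration struct, the "{_partition_id}" token, and the validateWildcard helper are stand-ins assumed for this example; only the method name withPartitionWildcard comes from the diff.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the configuration object seen in the diff.
struct Configuration
{
    std::string path;

    // Assumed behaviour: a substring search over the path on every call.
    bool withPartitionWildcard() const
    {
        return path.find("{_partition_id}") != std::string::npos;
    }
};

// Hypothetical helper: runs the search once, checks it against the style's
// rule, and returns the cached flag so later code does not search again.
bool validateWildcard(const Configuration & configuration, bool style_accepts_wildcard)
{
    const bool has_partition_wildcard = configuration.withPartitionWildcard();

    if (has_partition_wildcard && !style_accepts_wildcard)
        throw std::invalid_argument("Partition wildcard is not allowed for this partitioning style");

    return has_partition_wildcard;
}
```

The saving is small for a single short path, but the local variable also makes the condition easier to read.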
/// - https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html
/// - https://cloud.ibm.com/apidocs/cos/cos-compatibility#putobject

if (str.empty() || str.size() > 1024)
Just a comment, not a change request right now, because this code already existed and was only moved into a namespace, but I don't like it. As I understand it, the key is generated inside ClickHouse code and the customer can't fully control its length. When they get this error, what's next? "OK, the key is too long, how can I fix it?"
Maybe we should add a TODO item to think about automatically shortening the key in cases like this.
Not fully generated by ClickHouse. The key here represents the path without the bucket. Part of it can be specified by the user at table creation.
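The check being discussed could look roughly like the sketch below. The function name validateKey, the use of std::invalid_argument, and the message wording are assumptions for illustration; the excerpt shows only the condition itself. The message points at the user-controlled part of the key, which is one possible answer to the "what's next?" concern raised above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// The 1024-byte limit matches the condition in the diff excerpt.
constexpr std::size_t max_key_size = 1024;

// Hypothetical validator: rejects empty keys and keys over the size limit.
// The key is the object path without the bucket name.
void validateKey(const std::string & str)
{
    if (str.empty())
        throw std::invalid_argument("Object key must not be empty");

    if (str.size() > max_key_size)
        throw std::invalid_argument(
            "Object key is " + std::to_string(str.size()) + " bytes, the limit is "
            + std::to_string(max_key_size)
            + "; consider a shorter path prefix in the table definition");
}
```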
src/Storages/PartitionStrategy.cpp
Outdated
*/
std::string formatToFileExtension(const std::string & format)
{
std::string lower_case_format;
How does this differ from just return Poco::toLower(format)?
I'll use Poco::toLower
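For reference, the simplified function might reduce to a single lowercase conversion. The sketch below uses only the standard library so that it compiles without Poco; in the actual code the body would presumably be the one-line Poco::toLower call agreed on above. That the function does nothing beyond lowercasing is an assumption drawn from the reviewer's question.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Standard-library stand-in for Poco::toLower(format): returns a lowercase copy.
std::string formatToFileExtension(const std::string & format)
{
    std::string result = format;
    // Cast through unsigned char: std::tolower is undefined for negative values.
    std::transform(result.begin(), result.end(), result.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return result;
}
```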
More info on ClickHouse#76802
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add support for hive partition style writes
Documentation entry for user-facing changes