RANGER-5310: Include Apache Tez as the process framework for ranger-hive docker #660
base: master
… docker Signed-off-by: Ramesh Mani <[email protected]>
Pull Request Overview
This PR integrates Apache Tez as the processing framework for the ranger-hive Docker setup to enable faster data processing through DAG execution and resolve issues with INSERT commands in beeline.
- Adds Tez binary distribution and configuration files for Hive integration
- Updates Hadoop YARN configuration to support Tez execution
- Creates comprehensive Tez configuration across all Hive database variants
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tez-site.xml | New Tez configuration template with memory and execution settings |
| ranger-hive-setup.sh | Adds Tez setup, YARN configuration, and HDFS directory creation |
| ranger-hadoop-setup.sh | Enhances YARN configuration and installs Tez JARs for NodeManager |
| hive-site-*.xml | Adds Tez execution engine configuration to all database variants |
| hive-site-metastore-mysql.xml | New metastore-specific configuration with Tez support |
| create-users.sh | New script for creating test users (alice, abram) |
| download-archives.sh | Adds Tez binary download support |
| docker-compose files | Updates build arguments and environment variables for Tez |
| Dockerfiles | Integrates Tez installation and user creation across containers |
| .env | Updates Hadoop version compatibility and adds Tez version |
Comments suppressed due to low confidence (1)
dev-support/ranger-docker/.env:1
- The KAFKA_VERSION line appears to be missing after the HIVE_HADOOP_VERSION change. This could break Kafka-related builds that depend on this environment variable.
BUILD_HOST_SRC=true
@@ -0,0 +1,43 @@
#!/bin/bash
@rameeshm - the following users and groups are created in the ranger-base image, using https://github.com/apache/ranger-tools/blob/main/docker/Dockerfile#L50. It might be useful to add users alice and abram in the ranger-base image itself, so that these users are available in all Ranger images.
users: ranger rangeradmin rangerusersync rangertagsync rangerkms hdfs yarn hive hbase kafka ozone knox
groups: ranger hadoop knox
With this approach, updates to many Dockerfiles in this PR can be eliminated.
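As a rough illustration of what a single, shared user-creation step could look like, here is a minimal sketch. It is not the PR's create-users.sh (which is not shown in this conversation) and not the ranger-base Dockerfile; the function name `ensure_user`, the `DRY_RUN` switch, and the `useradd -m` flag are assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical sketch: idempotently create the test users named in this PR.
# DRY_RUN defaults to 1 so the script only prints what it would do;
# run with DRY_RUN=0 as root to actually create the accounts.

ensure_user() {
  local user="$1"

  # Skip accounts that already exist so the step is safe to repeat.
  if id -u "$user" >/dev/null 2>&1; then
    echo "user ${user} already exists"
    return 0
  fi

  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "useradd -m ${user}"
    return 0
  fi

  # Assumption: a home directory is wanted; the real image may use other flags.
  useradd -m "$user"
}

for u in alice abram; do
  ensure_user "$u"
done
```

Keeping the existence check inside the function means the same step can run in a base image and again in a derived image without failing.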
Thank you @rameeshm for the patch. I believe this was tested with the Ubuntu base image; please see if it can be tested with the UBI base image as well. This change needs to be made in …
fi

# Additional verification: Check if HiveServer2 is listening on port 10000
echo "Checking if HiveServer2 is listening on port 10000..."
Health checks for HiveServer2 can be added in docker-compose.ranger-hive.yml as well; this PR adds healthchecks for all containers: #604
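For reference, a port probe of the kind a health check might run could be sketched as follows. This is only an illustration under stated assumptions: the helper names `port_is_open` and `wait_for_port` are hypothetical, the probe relies on bash's `/dev/tcp` pseudo-device, and it is not the check used in this PR or in #604.

```shell
#!/bin/bash
# Hypothetical helpers; not taken from the PR or from #604.

# Succeeds if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp; the connection closes when the subshell exits.
port_is_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Polls until the port opens, up to `retries` attempts, `delay` seconds apart.
wait_for_port() {
  local host="$1" port="$2" retries="${3:-30}" delay="${4:-2}"
  local i=1
  while [ "$i" -le "$retries" ]; do
    if port_is_open "$host" "$port"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

A call such as `wait_for_port localhost 10000` would then return 0 once HiveServer2 accepts connections. Note that an open port only shows the listener is up, not that queries succeed, and that a connect attempt to a filtered port can block, so a wrapper timeout may be worth adding.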
What changes were proposed in this pull request?
Include Apache Tez as the process framework for ranger-hive docker
How was this patch tested?
Tested in Docker by connecting to HiveServer2 with beeline and executing an "INSERT" statement to trigger DAG execution.
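The manual test described above could be scripted roughly as follows. This is a sketch under assumptions, not the author's actual test: the table name `tez_smoke`, the JDBC URL, the `hive` user, and the `HS2_URL`/`HS2_USER` variables are all hypothetical. If beeline is not on the PATH, the script only prints the statements it would run.

```shell
#!/bin/bash
# Hypothetical smoke test for the INSERT path; names and defaults are assumptions.

# Emits the statements to run: show the active engine, then exercise INSERT.
smoke_sql() {
  cat <<'SQL'
SET hive.execution.engine;
CREATE TABLE IF NOT EXISTS tez_smoke (id INT, name STRING);
INSERT INTO tez_smoke VALUES (1, 'alice');
SELECT COUNT(*) FROM tez_smoke;
SQL
}

run_smoke_test() {
  local url="${HS2_URL:-jdbc:hive2://localhost:10000}"
  local user="${HS2_USER:-hive}"
  local sql_file status

  if ! command -v beeline >/dev/null 2>&1; then
    echo "beeline not found; statements that would be run:"
    smoke_sql
    return 0
  fi

  # Write the statements to a temp file and pass it to beeline with -f.
  sql_file="$(mktemp)"
  smoke_sql > "$sql_file"
  beeline -u "$url" -n "$user" -f "$sql_file"
  status=$?
  rm -f "$sql_file"
  return "$status"
}

run_smoke_test
```

The `SET hive.execution.engine;` line prints the configured engine, which gives a quick way to confirm that the session is using Tez before the INSERT runs.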
