@rameeshm rameeshm commented Sep 6, 2025

What changes were proposed in this pull request?

Use Apache Tez as the processing framework for the ranger-hive Docker setup

  • Fixes an issue with the INSERT command in beeline
  • Data processing is much faster with Tez's DAG-based execution
  • Addressed review comments
  • Addressed an issue with starting the hadoop and hive containers individually

How was this patch tested?

Tested in Docker by running beeline against HiveServer2 and executing an "INSERT" statement, which runs as a Tez DAG.
[Screenshot: TezJob]
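
The test described above might look roughly like the following sketch. The JDBC URL, the `hive` user name, and the `tez_smoke` table are illustrative assumptions and are not taken from the patch; `hive.execution.engine` is the standard Hive property for selecting the execution engine.

```shell
# Sketch of a Tez smoke test through beeline (URL, user, and table name are assumptions).
HS2_URL="${HS2_URL:-jdbc:hive2://localhost:10000}"   # assumption: default HiveServer2 port
HS2_USER="${HS2_USER:-hive}"                          # assumption: connect as the hive user

# Select the Tez engine, then run an INSERT so that a Tez DAG is submitted.
SQL="SET hive.execution.engine=tez;
CREATE TABLE IF NOT EXISTS tez_smoke (id INT, name STRING);
INSERT INTO tez_smoke VALUES (1, 'alice');
SELECT COUNT(*) FROM tez_smoke;"

if command -v beeline >/dev/null 2>&1; then
  beeline -u "$HS2_URL" -n "$HS2_USER" -e "$SQL"
else
  echo "beeline not found on PATH; run this inside the Hive container" >&2
fi
```

If the INSERT completes and the final SELECT returns a row count, the Tez path is working end to end.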

@Copilot Copilot AI left a comment

Pull Request Overview

This PR integrates Apache Tez as the processing framework for the ranger-hive Docker setup to enable faster data processing through DAG execution and resolve issues with INSERT commands in beeline.

  • Adds Tez binary distribution and configuration files for Hive integration
  • Updates Hadoop YARN configuration to support Tez execution
  • Creates comprehensive Tez configuration across all Hive database variants

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tez-site.xml | New Tez configuration template with memory and execution settings |
| ranger-hive-setup.sh | Adds Tez setup, YARN configuration, and HDFS directory creation |
| ranger-hadoop-setup.sh | Enhances YARN configuration and installs Tez JARs for NodeManager |
| hive-site-*.xml | Adds Tez execution engine configuration to all database variants |
| hive-site-metastore-mysql.xml | New metastore-specific configuration with Tez support |
| create-users.sh | New script for creating test users (alice, abram) |
| download-archives.sh | Adds Tez binary download support |
| docker-compose files | Updates build arguments and environment variables for Tez |
| Dockerfiles | Integrates Tez installation and user creation across containers |
| .env | Updates Hadoop version compatibility and adds Tez version |

Comments suppressed due to low confidence (1)

dev-support/ranger-docker/.env:1

  • The KAFKA_VERSION line appears to be missing after the HIVE_HADOOP_VERSION change. This could break Kafka-related builds that depend on this environment variable.
BUILD_HOST_SRC=true
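
A concern like the one above can be checked mechanically before building. The following is a minimal sketch; the helper name `require_env_keys` is hypothetical and not part of the patch.

```shell
# Sketch: fail fast if expected keys are missing from an env file.
# Usage: require_env_keys <file> <key>...
# (hypothetical helper, not part of the patch)
require_env_keys() {
  file="$1"; shift
  missing=0
  for key in "$@"; do
    # A key counts as present if some line starts with "KEY=".
    if ! grep -q "^${key}=" "$file"; then
      echo "missing ${key} in ${file}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_env_keys dev-support/ranger-docker/.env KAFKA_VERSION HIVE_HADOOP_VERSION` would return non-zero and name the missing key if the KAFKA_VERSION line had been dropped.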

 docker - Addressed review comments, fixed hadoop and hive ssh issue during startup, removed unneeded configs
@rameeshm rameeshm requested a review from mneethiraj September 8, 2025 00:54
@@ -0,0 +1,43 @@
#!/bin/bash
@rameeshm - the following users and groups are created in the ranger-base image, using https://github.com/apache/ranger-tools/blob/main/docker/Dockerfile#L50. It might be useful to add users alice and abram to the ranger-base image itself, so that these users are available in all Ranger images.

users: ranger rangeradmin rangerusersync rangertagsync rangerkms hdfs yarn hive hbase kafka ozone knox
groups: ranger hadoop knox

With this approach, updates to many Dockerfiles in this PR can be eliminated.
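
Wherever the user creation ends up, it helps if it is idempotent so that it can run safely in any image. A minimal sketch follows, assuming a plain `useradd -m` is sufficient; the actual flags and group memberships used by ranger-base or create-users.sh are not shown here, and the `DRY_RUN` switch is a hypothetical addition for safe experimentation.

```shell
# Sketch: create test users idempotently.
# DRY_RUN defaults to 1 and only prints what would happen; set DRY_RUN=0 to
# actually call useradd (requires root). Flags are an assumption, not the patch's.
DRY_RUN="${DRY_RUN:-1}"

create_user() {
  user="$1"
  if id -u "$user" >/dev/null 2>&1; then
    echo "user $user already exists"
  elif [ "$DRY_RUN" = "1" ]; then
    echo "would run: useradd -m $user"
  else
    useradd -m "$user"
  fi
}

for u in alice abram; do
  create_user "$u"
done
```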

kumaab commented Sep 8, 2025

Thank you @rameeshm for the patch. I believe this was tested with the Ubuntu base image; please see if it can be tested with the UBI base image as well. This requires the following change in the .env file: RANGER_BASE_VERSION=20250712-1-ubi-8. Thanks!
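
Switching the base image for a test run could be scripted along these lines. This is a sketch only; `set_env_value` is a hypothetical helper, and it assumes the value contains no `|` character, which is used as the sed delimiter.

```shell
# Sketch: set KEY=VALUE in an env file, replacing an existing line or appending one.
# Usage: set_env_value <file> <key> <value>
# (hypothetical helper; assumes the value contains no '|')
set_env_value() {
  file="$1"; key="$2"; value="$3"
  if grep -q "^${key}=" "$file"; then
    # Rewrite through a temp file instead of sed -i, whose syntax differs
    # between GNU and BSD sed.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    echo "${key}=${value}" >> "$file"
  fi
}
```

For the case above, the call would be `set_env_value dev-support/ranger-docker/.env RANGER_BASE_VERSION 20250712-1-ubi-8`, followed by a rebuild of the images.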

fi

# Additional verification: Check if HiveServer2 is listening on port 10000
echo "Checking if HiveServer2 is listening on port 10000..."
Health checks for HiveServer2 can be added in docker-compose.ranger-hive.yml as well; this PR adds health checks for all containers: #604
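
Whether it runs from the setup script or from a compose health check, the port probe itself can be a small reusable helper. The sketch below is an assumption about how such a check might be written, not the code in this PR; it relies on bash's `/dev/tcp` pseudo-device, and the default attempt count and delay are arbitrary.

```shell
# Sketch: poll a TCP port until it accepts connections (requires bash for /dev/tcp).
# Usage: wait_for_port <host> <port> [attempts] [delay_seconds]
# (hypothetical helper; defaults are arbitrary)
wait_for_port() {
  host="$1"; port="$2"; attempts="${3:-30}"; delay="${4:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # The subshell opens the connection and closes it again on exit.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      echo "port ${port} on ${host} is open"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "port ${port} on ${host} did not open after ${attempts} attempts" >&2
  return 1
}
```

A startup script could then call `wait_for_port localhost 10000 || exit 1` to fail when HiveServer2 never comes up. Note that an open port only shows that the listener is bound, not that queries succeed.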

3 participants