Add semaphore block for ducktape tests #2037
base: async
Changes from all commits
@@ -275,6 +275,106 @@ blocks:

```yaml
      - python3 -m venv _venv && source _venv/bin/activate
      - chmod u+r+x tools/source-package-verification.sh
      - tools/source-package-verification.sh
  - name: "Ducktape Performance Tests (Linux x64)"
    dependencies: []
    task:
      agent:
        machine:
          type: s1-prod-ubuntu24-04-amd64-3
      env_vars:
        - name: OS_NAME
          value: linux
        - name: ARCH
          value: x64
        - name: BENCHMARK_BOUNDS_CONFIG
          value: tests/ducktape/benchmark_bounds.json
        - name: BENCHMARK_ENVIRONMENT
          value: ci
      prologue:
        commands:
          - '[[ -z $DOCKERHUB_APIKEY ]] || docker login --username $DOCKERHUB_USER --password $DOCKERHUB_APIKEY'
      jobs:
        - name: Build and Tests
          commands:
            # Set up Python environment
            - sem-version python 3.9
            - python3 -m venv _venv && source _venv/bin/activate
            # Install ducktape framework and additional dependencies
            - pip install ducktape psutil
            # Install existing test requirements
            - pip install -r requirements/requirements-tests.txt
            # Build and install confluent-kafka from source
            - lib_dir=dest/runtimes/$OS_NAME-$ARCH/native
            - tools/wheels/install-librdkafka.sh "${LIBRDKAFKA_VERSION#v}" dest
            - export CFLAGS="$CFLAGS -I${PWD}/dest/build/native/include"
            - export LDFLAGS="$LDFLAGS -L${PWD}/${lib_dir}"
            - export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$PWD/$lib_dir"
            - python3 -m pip install -e .
            # Store project root for reliable navigation
            - PROJECT_ROOT="${PWD}"
            # Start Kafka and Schema Registry using the dedicated ducktape compose file (KRaft mode)
            - cd "${PROJECT_ROOT}/tests/docker"
            - docker-compose -f docker-compose.ducktape.yml up -d kafka schema-registry
            # Debug: check container status and logs
            - echo "=== Container Status ==="
            - docker-compose -f docker-compose.ducktape.yml ps
            - echo "=== Kafka Logs ==="
            - docker-compose -f docker-compose.ducktape.yml logs kafka | tail -50
            # Wait for Kafka to be ready (using the PLAINTEXT listener for external access)
            - |
              timeout 1800 bash -c '
              counter=0
              until docker-compose -f docker-compose.ducktape.yml exec -T kafka kafka-topics --bootstrap-server localhost:9092 --list >/dev/null 2>&1; do
                echo "Waiting for Kafka... (attempt $((counter+1)))"
                # Show logs every 4th attempt (every 20 seconds)
                if [ $((counter % 4)) -eq 0 ] && [ $counter -gt 0 ]; then
                  echo "=== Recent Kafka Logs ==="
                  docker-compose -f docker-compose.ducktape.yml logs --tail=10 kafka
                  echo "=== Container Status ==="
                  docker-compose -f docker-compose.ducktape.yml ps kafka
                fi
                counter=$((counter+1))
                sleep 5
              done
              '
            - echo "Kafka cluster is ready!"
            # Wait for Schema Registry to be ready
            - echo "=== Waiting for Schema Registry ==="
            - |
              timeout 300 bash -c '
              counter=0
              until curl -f http://localhost:8081/subjects >/dev/null 2>&1; do
                echo "Waiting for Schema Registry... (attempt $((counter+1)))"
                # Show logs every 3rd attempt (every 15 seconds)
                if [ $((counter % 3)) -eq 0 ] && [ $counter -gt 0 ]; then
                  echo "=== Recent Schema Registry Logs ==="
                  docker-compose -f docker-compose.ducktape.yml logs --tail=10 schema-registry
                  echo "=== Schema Registry Container Status ==="
                  docker-compose -f docker-compose.ducktape.yml ps schema-registry
                fi
                counter=$((counter+1))
                sleep 5
              done
              '
            - echo "Schema Registry is ready!"
            # Run the standard ducktape tests with CI bounds
            - cd "${PROJECT_ROOT}" && PYTHONPATH="${PROJECT_ROOT}" python tests/ducktape/run_ducktape_test.py
            # Cleanup
            - cd "${PROJECT_ROOT}/tests/docker" && docker-compose -f docker-compose.ducktape.yml down -v || true
  - name: "Packaging"
    run:
      when: "tag =~ '.*'"
```

Review thread on `sem-version python 3.9`:

> I guess using an older Python version will catch more perf issues, but I worry newer versions might have behavioral differences we'd miss; e.g. some of the C bindings we use are deprecated in 3.13.
>
> I'd be OK with leaving a TODO to matrix this across a couple of Python versions down the line. What do you think?
>
> Makes sense! We could add a couple of Python versions, e.g. …
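The two readiness loops in the job above share one pattern: poll a cheap probe until it succeeds or an overall deadline expires, dumping diagnostics every few attempts. A minimal Python sketch of that pattern (the function name, timings, and fake probe below are illustrative, not part of the PR):

```python
import time

def wait_until_ready(probe, timeout_s=60.0, interval_s=5.0, log_every=4, on_log=print):
    """Poll `probe` until it returns True or `timeout_s` elapses.

    `probe` is any zero-argument callable returning a bool (e.g. a wrapper
    around `kafka-topics --list` or a Schema Registry HTTP check).
    Returns the number of attempts on success; raises TimeoutError otherwise.
    """
    deadline = time.monotonic() + timeout_s
    attempt = 0
    while time.monotonic() < deadline:
        attempt += 1
        if probe():
            return attempt
        # Emit diagnostics periodically, like the CI loop that dumps
        # container logs every 4th retry.
        if attempt % log_every == 0:
            on_log(f"still waiting after {attempt} attempts")
        time.sleep(interval_s)
    raise TimeoutError(f"service not ready after {timeout_s}s ({attempt} attempts)")

# Example: a probe that succeeds on the third call.
calls = {"n": 0}
def flaky_probe():
    calls["n"] += 1
    return calls["n"] >= 3

attempts = wait_until_ready(flaky_probe, timeout_s=5.0, interval_s=0.01)
print(attempts)  # → 3
```

The bash version in the job gets the same effect by composing `timeout` with an `until` loop, which avoids needing Python before the venv exists.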
@@ -0,0 +1,36 @@ (new file: tests/docker/docker-compose.ducktape.yml)

```yaml
services:
  kafka:
    image: confluentinc/cp-kafka:latest
    container_name: kafka-ducktape
    ports:
      - "9092:9092"
      - "29092:29092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093,PLAINTEXT_HOST://0.0.0.0:29092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092,PLAINTEXT_HOST://dockerhost:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      CLUSTER_ID: 4L6g3nShT-eMCtK--X86sw

  schema-registry:
    image: confluentinc/cp-schema-registry:latest
    container_name: schema-registry-ducktape
    depends_on:
      - kafka
    ports:
      - "8081:8081"
    extra_hosts:
      - "dockerhost:172.17.0.1"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: dockerhost:29092
      SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
      SCHEMA_REGISTRY_KAFKASTORE_TOPIC_REPLICATION_FACTOR: 1
      SCHEMA_REGISTRY_DEBUG: 'true'
```

Review thread on `CLUSTER_ID`:

> For my knowledge, is this the cluster ID dedicated to Kafka testing? https://github.com/search?q=org%3Aconfluentinc%204L6g3nShT-eMCtK--X86sw&type=code
>
> Yes. Each CI job spins up its own isolated Kafka container with this ID.
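The dual advertised listeners above are what let the host-side tests and the schema-registry container reach the same broker under different addresses: a Kafka client bootstraps, receives the advertised address for the listener it connected on, and uses that address for all subsequent connections. A toy sketch of that resolution step (pure Python; `parse_listeners` is illustrative, not how librdkafka resolves metadata internally):

```python
# Parse the KAFKA_ADVERTISED_LISTENERS value from the compose file above
# into a {listener_name: address} map, then look up per client.
ADVERTISED = "PLAINTEXT://localhost:9092,PLAINTEXT_HOST://dockerhost:29092"

def parse_listeners(spec):
    out = {}
    for entry in spec.split(","):
        name, addr = entry.split("://")
        out[name] = addr
    return out

listeners = parse_listeners(ADVERTISED)

# A host-side ducktape test connects on PLAINTEXT and is told localhost:9092;
# the schema-registry container bootstraps on PLAINTEXT_HOST and is told
# dockerhost:29092, reachable through the extra_hosts entry (172.17.0.1).
print(listeners["PLAINTEXT"])       # → localhost:9092
print(listeners["PLAINTEXT_HOST"])  # → dockerhost:29092
```

This is why `SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS` points at `dockerhost:29092` rather than `localhost:9092`: inside the schema-registry container, `localhost` is the container itself.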
@@ -1,11 +1,26 @@ (tests/ducktape/benchmark_bounds.json)

```diff
 {
-  "_comment": "Default performance bounds for benchmark tests",
-  "min_throughput_msg_per_sec": 1500.0,
-  "max_p95_latency_ms": 1500.0,
-  "max_error_rate": 0.01,
-  "min_success_rate": 0.99,
-  "max_p99_latency_ms": 2500.0,
-  "max_memory_growth_mb": 600.0,
-  "max_buffer_full_rate": 0.03,
-  "min_messages_per_poll": 15.0
+  "_comment": "Performance bounds for benchmark tests by environment",
+  "local": {
+    "_comment": "Default bounds for local development - more relaxed thresholds",
+    "min_throughput_msg_per_sec": 1000.0,
+    "max_p95_latency_ms": 2000.0,
+    "max_error_rate": 0.02,
+    "min_success_rate": 0.98,
+    "max_p99_latency_ms": 3000.0,
+    "max_memory_growth_mb": 800.0,
+    "max_buffer_full_rate": 0.05,
+    "min_messages_per_poll": 10.0
+  },
+  "ci": {
+    "_comment": "Stricter bounds for CI environment - production-like requirements",
+    "min_throughput_msg_per_sec": 1500.0,
+    "max_p95_latency_ms": 1500.0,
+    "max_error_rate": 0.01,
+    "min_success_rate": 0.99,
+    "max_p99_latency_ms": 2500.0,
+    "max_memory_growth_mb": 600.0,
+    "max_buffer_full_rate": 0.03,
+    "min_messages_per_poll": 10.0
+  },
+  "_default_environment": "local"
 }
```
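The diff does not show how `run_ducktape_test.py` consumes this file, but the intended selection appears to be: honor `BENCHMARK_ENVIRONMENT` (set to `ci` in the Semaphore job above), otherwise fall back to `_default_environment`. A hedged sketch of that lookup, using an abbreviated copy of the bounds (two of the eight metrics shown; the function name is illustrative):

```python
import os

# Abbreviated copy of tests/ducktape/benchmark_bounds.json.
CONFIG = {
    "local": {"min_throughput_msg_per_sec": 1000.0, "max_p95_latency_ms": 2000.0},
    "ci": {"min_throughput_msg_per_sec": 1500.0, "max_p95_latency_ms": 1500.0},
    "_default_environment": "local",
}

def select_bounds(config, env=None):
    """Resolve the active environment: an explicit argument or the
    BENCHMARK_ENVIRONMENT env var wins, else _default_environment."""
    env = env or os.environ.get("BENCHMARK_ENVIRONMENT", config["_default_environment"])
    return config[env]

ci = select_bounds(CONFIG, "ci")
local = select_bounds(CONFIG, "local")

# A run at 1200 msg/s with a p95 of 1700 ms passes local bounds but fails CI's.
measured = {"throughput": 1200.0, "p95_ms": 1700.0}
passes = lambda b: (measured["throughput"] >= b["min_throughput_msg_per_sec"]
                    and measured["p95_ms"] <= b["max_p95_latency_ms"])
print(passes(local), passes(ci))  # → True False
```

Splitting bounds by environment like this keeps local iteration from flaking on laptop-grade hardware while the CI machine type is held to the stricter numbers.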
Review thread on the machine type:

> Out of curiosity, how is the machine type chosen?
>
> "s1-prod-" is a production-grade machine type already used by other Semaphore jobs. IMO it's time-tested for this repository, so we should keep using it.