docs: Add k8s setup instructions. #28
base: release-0.293-clp-connector
Changes from all commits
5f0eb8c
5bae0ce
c82f6eb
ba8cdf6
80ec62d
e40f329
6cafbfc
17d7d31
8aa5cc1
c8666a4
d730404
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
# Coordinator common | ||
PRESTO_COORDINATOR_HTTPPORT="8080" | ||
PRESTO_COORDINATOR_SERVICENAME="presto-coordinator" | ||
|
||
# Coordinator clp.properties | ||
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE="mysql" | ||
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL="jdbc:mysql://localhost:6001" | ||
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_NAME="clp-db" | ||
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER="clp-user" | ||
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD="123456" | ||
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_TABLEPREFIX="clp_" | ||
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_SPLITPROVIDER="mysql" | ||
|
||
# Coordinator config.properties | ||
PRESTO_COORDINATOR_CONFIG_CONFIGPROPERTIES_QUERY_MAXMEMORY="1GB" | ||
PRESTO_COORDINATOR_CONFIG_CONFIGPROPERTIES_QUERY_MAXMEMORYPERNODE="1GB" | ||
|
||
# Coordinator jvm.config | ||
PRESTO_COORDINATOR_CONFIG_JVMCONFIG_MAXHEAPSIZE="4G" | ||
PRESTO_COORDINATOR_CONFIG_JVMCONFIG_G1HEAPREGIONSIZE="32M" | ||
|
||
# Coordinator log.properties | ||
PRESTO_COORDINATOR_CONFIG_LOGPROPERTIES_LEVEL="DEBUG" | ||
|
||
# Coordinator node.properties | ||
PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT="production" | ||
|
||
# Worker common | ||
PRESTO_WORKER_HTTPPORT="8080" | ||
|
||
# Worker node.properties | ||
PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION="worker-location" | ||
|
||
# CLP package archives | ||
CLP_PACKAGE_ARCHIVES=REPLACE_ME |
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,76 @@ | ||||||||||||||||||||||
# Setup local docker stack for presto + clp | ||||||||||||||||||||||
|
||||||||||||||||||||||
## Install docker | ||||||||||||||||||||||
|
||||||||||||||||||||||
Follow the guide here: [docker] | ||||||||||||||||||||||
|
||||||||||||||||||||||
Comment on lines +1 to +6
🧹 Nitpick (assertive)
Multiple H1 headings & missing link reference label
Markdown-lint flags
-# Setup local docker stack for presto + clp
+# Presto + CLP local Docker stack
…
-Follow the guide here: [docker]
+Follow the official [Docker installation guide](https://docs.docker.com/engine/install).
|
||||||||||||||||||||||
# Launch clp-package | ||||||||||||||||||||||
|
||||||||||||||||||||||
1. Find the clp-package for test on our official website [clp-json-v0.4.0]. Here is a sample dataset for demo testing: [postgresql dataset]. | ||||||||||||||||||||||
|
||||||||||||||||||||||
2. Untar the clp-package and the postgresql dataset. | ||||||||||||||||||||||
|
||||||||||||||||||||||
3. Replace the content of `/path/to/clp-json-package/etc/clp-config.yml` with the output of `demo-assets/init.sh generate_sample_clp_config`. | ||||||||||||||||||||||
|
||||||||||||||||||||||
4. Launch: | ||||||||||||||||||||||
|
||||||||||||||||||||||
```bash | ||||||||||||||||||||||
# You probably want to run a python 3.9 or newer virtual environment | ||||||||||||||||||||||
sbin/start-clp.sh | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
5. Compress: | ||||||||||||||||||||||
|
||||||||||||||||||||||
```bash | ||||||||||||||||||||||
# You can also use your own dataset | ||||||||||||||||||||||
sbin/compress.sh --timestamp-key 'timestamp' /path/to/postgresql.log | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
Comment on lines +17 to +27
🧹 Nitpick (assertive)
Code fences lack blank lines & language specifiers
-```bash
+# ```bash
docker compose up
-```
+```
Repeat for every fenced block.
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
22-22: Ordered list item prefix (MD029, ol-prefix)
|
||||||||||||||||||||||
|
||||||||||||||||||||||
6. Use the following command to update `.env`: | ||||||||||||||||||||||
|
||||||||||||||||||||||
```bash | ||||||||||||||||||||||
demo-assets/init.sh update_metadata_config /path/to/clp-json-package | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
# Create Docker Cluster | ||||||||||||||||||||||
|
||||||||||||||||||||||
Create a local docker stack: | ||||||||||||||||||||||
|
||||||||||||||||||||||
```bash | ||||||||||||||||||||||
docker compose up | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
To create a docker stack with more than 1 worker (e.g., 3 workers): | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
docker compose up --scale presto-worker=3 | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
Comment on lines +29 to +46
🧹 Nitpick (assertive)
Ordered-list numbering is off
The list jumps from 4 → 5 → 6 because of manual numbering. Switch to Markdown auto-numbering:
-5. Compress:
-6. Use the following command to update `.env`:
+1. Compress
+2. Update `.env`:
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
29-29: Ordered list item prefix (MD029, ol-prefix)
35-35: Multiple top-level headings in the same document (MD025, single-title, single-h1)
44-44: Fenced code blocks should be surrounded by blank lines (MD031, blanks-around-fences)
44-44: Fenced code blocks should have a language specified (MD040, fenced-code-language)
|
||||||||||||||||||||||
|
||||||||||||||||||||||
# Use cli: | ||||||||||||||||||||||
|
||||||||||||||||||||||
After all containers are in "Started" states (check by `docker ps`): | ||||||||||||||||||||||
|
||||||||||||||||||||||
```bash | ||||||||||||||||||||||
# On your host | ||||||||||||||||||||||
docker exec -it compose-presto-coordinator-1 sh | ||||||||||||||||||||||
|
||||||||||||||||||||||
# In presto-coordinator container | ||||||||||||||||||||||
/opt/presto-cli --catalog clp --schema default --server localhost:8080 | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
Example query: | ||||||||||||||||||||||
```sql | ||||||||||||||||||||||
SELECT * FROM default LIMIT 1; | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
# Delete docker Cluster | ||||||||||||||||||||||
Comment on lines +61 to +65
🧹 Nitpick (assertive)
Trailing punctuation in heading & duplicate H1
Rename
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
61-61: Fenced code blocks should be surrounded by blank lines (MD031, blanks-around-fences)
65-65: Multiple top-level headings in the same document (MD025, single-title, single-h1)
|
||||||||||||||||||||||
|
||||||||||||||||||||||
```bash | ||||||||||||||||||||||
docker compose down | ||||||||||||||||||||||
``` | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
||||||||||||||||||||||
|
||||||||||||||||||||||
[clp-json-v0.4.0]: https://github.com/y-scope/clp/releases/tag/v0.4.0 | ||||||||||||||||||||||
[docker]: https://docs.docker.com/engine/install | ||||||||||||||||||||||
[postgresql dataset]: https://zenodo.org/records/10516402 | ||||||||||||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
connector.name=clp | ||
clp.metadata-provider-type=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE} | ||
clp.metadata-db-url=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL} | ||
clp.metadata-db-name=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_NAME} | ||
clp.metadata-db-user=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER} | ||
clp.metadata-db-password=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD} | ||
clp.metadata-table-prefix=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_TABLEPREFIX} | ||
clp.split-provider-type=${PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_SPLITPROVIDER} | ||
clp.metadata-filter-config=/opt/presto-server/etc/metadata-filter.json | ||
🧹 Nitpick (assertive)
Hard-coded path may break inside k8s & Docker images
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
coordinator=true | ||
node-scheduler.include-coordinator=false | ||
http-server.http.port=${PRESTO_COORDINATOR_HTTPPORT} | ||
query.max-memory=${PRESTO_COORDINATOR_CONFIG_CONFIGPROPERTIES_QUERY_MAXMEMORY} | ||
query.max-memory-per-node=${PRESTO_COORDINATOR_CONFIG_CONFIGPROPERTIES_QUERY_MAXMEMORYPERNODE} | ||
discovery-server.enabled=true | ||
discovery.uri=http://${PRESTO_COORDINATOR_SERVICENAME}:${PRESTO_COORDINATOR_HTTPPORT} | ||
Comment on lines +3 to +7
🧹 Nitpick (assertive)
Potential port clash when multiple coordinators run on one host
|
||
optimizer.optimize-hash-generation=false | ||
regex-library=RE2J | ||
use-alternative-function-signatures=true | ||
inline-sql-functions=false | ||
nested-data-serialization-enabled=false | ||
native-execution-enabled=true | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
-server | ||
-Xmx${PRESTO_COORDINATOR_CONFIG_JVMCONFIG_MAXHEAPSIZE} | ||
-XX:+UseG1GC | ||
-XX:G1HeapRegionSize=${PRESTO_COORDINATOR_CONFIG_JVMCONFIG_G1HEAPREGIONSIZE} | ||
-XX:+UseGCOverheadLimit | ||
-XX:+ExplicitGCInvokesConcurrent | ||
-XX:+HeapDumpOnOutOfMemoryError | ||
-XX:+ExitOnOutOfMemoryError | ||
-Djdk.attach.allowAttachSelf=true | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
com.facebook.presto=${PRESTO_COORDINATOR_CONFIG_LOGPROPERTIES_LEVEL} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{ | ||
} | ||
|
||
Comment on lines +1 to +3
💡 Verification agent
🧩 Analysis chain
Empty config likely to be ignored – add a placeholder or explanatory comment
Presto’s metadata filter expects an explicit JSON structure (e.g., an array of regex filters). Committing an empty object without explanatory context makes it hard for operators to know whether the file is intentionally empty or incomplete. Consider adding at least a minimal placeholder field or inline comment to clarify intent, or remove the file until a real filter is needed.
🌐 Web query:
💡 Result: A Presto/Trino Hive connector metadata filter JSON structure typically defines which databases, schemas, tables, and columns to include or exclude when ingesting metadata, most often for use with metadata tools like DataHub or OpenMetadata. The filter configuration is usually provided as part of a larger ingestion recipe or JSON configuration object. Below is an example of such a JSON structure based on common patterns observed in Presto/Trino Hive connector and metadata tool documentation[1][2][6][7]: {
"source": {
"type": "presto",
"serviceName": "my_presto",
"host_port": "presto.example.com:8080",
"database": "hive",
"filter": {
"database_pattern": {
"allow": ["^analytics$", "^sales_.*"],
"deny": ["^test_.*"]
},
"schema_pattern": {
"allow": ["^public$", "^prod_.*"],
"deny": ["^temp_.*"]
},
"table_pattern": {
"allow": ["^.*_log$", "^user_data$"],
"deny": ["^deprecated_.*"]
},
"column_pattern": {
"deny": ["^debug_.*"]
}
}
}
} Key Fields and Usage:
Alternate/Additional Fields:
Ingestion tools like DataHub and OpenMetadata may use similar but not identical JSON, sometimes within a YAML configuration, and sometimes with field names such as Caveats:
If you are using a specific tool (like DataHub, OpenMetadata, or custom ingestion scripts), consult its documentation for the exact schema required. The structure shown here is widely compatible for such use cases. Citations:
Add a minimal placeholder to metadata‐filter.json • File: yscope-compose/coordinator/config-template/metadata-filter.json Suggested diff: - {}
+ {
+ "filter": {
+ "database_pattern": {
+ "allow": ["^.*$"], // allow all databases by default
+ "deny": [] // no databases denied
+ }
+ }
+ }
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} | ||
node.id=${PRESTO_COORDINATOR_SERVICENAME} | ||
Comment on lines +1 to +2
💡 Verification agent
🧩 Analysis chain
Fail-fast if env vars are missing
Both placeholders rely on environment variables that, if unset, will propagate literally into the running coordinator and cause a startup failure that is hard to diagnose. Add default values in the template or make the
🏁 Script executed:
#!/bin/bash
# Locate any generate-configs.sh scripts in the repository
fd generate-configs.sh -t f || exit 0
# For each one found, print its path and the first 50 lines to see how env vars are handled
for script in $(fd generate-configs.sh -t f); do
  echo "---- Inspecting $script ----"
  head -n 50 "$script"
  echo
done
Length of output: 2750
Add explicit checks for missing coordinator env vars
The
• In yscope-compose/coordinator/scripts/generate-configs.sh (before doing any substitutions), add fail-fast checks such as:
: "${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT:?Error: PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT must be set}"
: "${PRESTO_COORDINATOR_SERVICENAME:?Error: PRESTO_COORDINATOR_SERVICENAME must be set}"
• Or declare safe defaults in the template (
node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT:-production}
node.id=${PRESTO_COORDINATOR_SERVICENAME:-coordinator}
Either approach will surface a clear error or provide a sensible default rather than let Presto start with empty values.
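One possible shape for the fail-fast check suggested in this comment, as a minimal sketch rather than the PR's actual code. `require_env` is a hypothetical helper (it does not exist in the repository) that could be called near the top of `generate-configs.sh`, before the template substitution loop. It sticks to POSIX `sh` features to match the script's `#!/bin/sh` shebang.

```bash
# Hypothetical helper (an assumption, not part of the PR): fail fast when any
# of the named environment variables is unset or empty.
require_env() {
    for name in "$@"; do
        # Indirect lookup via eval; only pass trusted, literal variable names.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Error: $name must be set" >&2
            return 1
        fi
    done
}
```

In the script this might be invoked as `require_env PRESTO_COORDINATOR_SERVICENAME PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT`. Because the script already runs under `set -e`, a missing variable would then abort it before any config file is written.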
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/bin/sh | ||
|
||
# Exit on error | ||
set -e | ||
|
||
PRESTO_CONFIG_DIR="/opt/presto-server/etc" | ||
|
||
# Substitute environemnt variables in config template | ||
find /configs -type f | while read -r f; do | ||
( echo "cat <<EOF"; cat $f; echo "EOF" ) | sh > "${PRESTO_CONFIG_DIR}/$(basename "$f")" | ||
done | ||
|
||
# Setup the config directory hierarchy | ||
rm -rf ${PRESTO_CONFIG_DIR}/catalog | ||
mkdir -p ${PRESTO_CONFIG_DIR}/catalog | ||
|
||
# Copy over files | ||
mv ${PRESTO_CONFIG_DIR}/clp.properties ${PRESTO_CONFIG_DIR}/catalog | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,18 @@ | ||||||
#!/bin/sh | ||||||
|
||||||
# Exit on error | ||||||
set -e | ||||||
|
||||||
PRESTO_CONFIG_DIR="/opt/presto-server/etc" | ||||||
|
||||||
# Substitute environemnt variables in config template | ||||||
🧹 Nitpick (assertive)
Fix typo in comment.
The comment has a spelling error: "environemnt" should be "environment".
-# Substitute environemnt variables in config template
+# Substitute environment variables in config template
|
||||||
find /configs -type f | while read -r f; do | ||||||
( echo "cat <<EOF"; cat $f; echo "EOF" ) | sh > "${PRESTO_CONFIG_DIR}/$(basename "$f")" | ||||||
🛠️ Refactor suggestion
Add quotes around variable to prevent word splitting.
The variable should be quoted to prevent potential issues with filenames containing spaces or special characters.
- ( echo "cat <<EOF"; cat $f; echo "EOF" ) | sh > "${PRESTO_CONFIG_DIR}/$(basename "$f")"
+ ( echo "cat <<EOF"; cat "$f"; echo "EOF" ) | sh > "${PRESTO_CONFIG_DIR}/$(basename "$f")"
🧰 Tools
🪛 Shellcheck (0.10.0)
[info] 10-10: Double quote to prevent globbing and word splitting. (SC2086)
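To illustrate why the quoting matters, here is a minimal sketch of the same heredoc-substitution technique wrapped in a function. `render_template` and `DEMO_PORT` are hypothetical names used only for this example. Two caveats apply to the technique itself: the template is evaluated as shell input, so command substitutions inside it would run (templates must be trusted), and a variable has to be exported to be visible to the child `sh`.

```bash
# Hypothetical wrapper around the substitution used in generate-configs.sh.
render_template() {
    # $1: template path (may contain spaces), $2: output path.
    # The template is assumed to end with a newline so that the closing
    # "EOF" delimiter lands on its own line.
    ( echo "cat <<EOF"; cat "$1"; echo "EOF" ) | sh > "$2"
}
```

With the unquoted `cat $f` form, a template named `my config.properties` would be split into two arguments and `cat` would fail; quoting `"$1"` avoids that.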
|
||||||
done | ||||||
|
||||||
# Setup the config directory hierarchy | ||||||
rm -f ${PRESTO_CONFIG_DIR}/catalog/* | ||||||
🧹 Nitpick (assertive)
Consider consistency with catalog cleanup approach.
This version uses
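A sketch of one way the coordinator and worker scripts could share a single cleanup approach, assuming the intent is simply to start from an empty `catalog` directory. `reset_catalog_dir` is a hypothetical helper, not something in the PR. Removing and recreating the directory also covers the case where `catalog` does not exist yet, which `rm -f` on its contents alone would not.

```bash
# Hypothetical helper (an assumption): leave an empty catalog directory behind.
reset_catalog_dir() {
    # $1: Presto config directory, e.g. the value of PRESTO_CONFIG_DIR.
    if [ -z "$1" ]; then
        echo "Error: config directory must not be empty" >&2
        return 1
    fi
    rm -rf "$1/catalog"
    mkdir -p "$1/catalog"
}
```

The empty-argument guard is there so that an unset variable can never turn the call into `rm -rf /catalog`.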
|
||||||
|
||||||
# Copy over files | ||||||
mv ${PRESTO_CONFIG_DIR}/clp.properties ${PRESTO_CONFIG_DIR}/catalog | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
package: | ||
storage_engine: "clp-s" | ||
database: | ||
type: "mariadb" | ||
host: "${REPLACE_IP}" | ||
port: 6001 | ||
name: "clp-db" | ||
query_scheduler: | ||
host: "${REPLACE_IP}" | ||
port: 6002 | ||
jobs_poll_delay: 0.1 | ||
num_archives_to_search_per_sub_job: 16 | ||
logging_level: "INFO" | ||
queue: | ||
host: "${REPLACE_IP}" | ||
port: 6003 | ||
redis: | ||
host: "${REPLACE_IP}" | ||
port: 6004 | ||
query_backend_database: 0 | ||
compression_backend_database: 1 | ||
reducer: | ||
host: "${REPLACE_IP}" | ||
base_port: 6100 | ||
logging_level: "INFO" | ||
upsert_interval: 100 | ||
results_cache: | ||
host: "${REPLACE_IP}" | ||
port: 6005 | ||
db_name: "clp-query-results" | ||
stream_collection_name: "stream-files" | ||
webui: | ||
host: "localhost" | ||
port: 6000 | ||
logging_level: "INFO" | ||
log_viewer_webui: | ||
host: "localhost" | ||
port: 6006 | ||
Comment on lines +5 to +38
🧹 Nitpick (assertive)
Use a single source for the service host
Almost every section repeats
defaults: &host
  host: "${CLP_HOST}"
database:
  <<: *host
  port: 6001
…
queue:
  <<: *host
  port: 6003
|
||
|
||
🧹 Nitpick (assertive)
Remove extra blank line at EOF
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 39-39: too many blank lines (1 > 0) (empty-lines)
|
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,50 @@ | ||||||||||||||||||||||
#!/usr/bin/env bash | ||||||||||||||||||||||
|
||||||||||||||||||||||
SCRIPT_PATH="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" | ||||||||||||||||||||||
|
||||||||||||||||||||||
function generate_sample_clp_config { | ||||||||||||||||||||||
local ip=$(hostname -i) | ||||||||||||||||||||||
local file="${SCRIPT_PATH}/clp-config.yml" | ||||||||||||||||||||||
cp "$file" "${file}.bak" | ||||||||||||||||||||||
sed -i "s|\${REPLACE_IP}|$ip|g" "$file" | ||||||||||||||||||||||
echo "Replaced \${REPLACE_IP} with $ip in $file" | ||||||||||||||||||||||
} | ||||||||||||||||||||||
Comment on lines +5 to +11
The intention (per the docs) is to copy
-function generate_sample_clp_config {
- local ip=$(hostname -i)
- local file="${SCRIPT_PATH}/clp-config.yml"
- cp "$file" "${file}.bak"
- sed -i "s|\${REPLACE_IP}|$ip|g" "$file"
+function generate_sample_clp_config {
+ local ip
+ ip="$(hostname -i)"
+
+ local template="${SCRIPT_PATH}/clp-config.yml.bak"
+ local file="${SCRIPT_PATH}/clp-config.yml"
+
+ if [[ ! -f "$template" ]]; then
+ echo "Error: $template not found"; return 1
+ fi
+
+ cp "$template" "$file"
+ sed -i "s|\${REPLACE_IP}|$ip|g" "$file"
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 6-6: Declare and assign separately to avoid masking return values. (SC2155)
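An alternative sketch, under the assumption that the goal is to keep the template untouched and write the rendered config somewhere else. `render_clp_config` is a hypothetical function and its three-argument interface is invented for illustration; it is not the reviewer's suggestion or the PR's code. It uses `sed` with output redirection instead of `sed -i`, so the file containing `${REPLACE_IP}` is never modified and the function can be rerun safely.

```bash
# Hypothetical variant (an assumption): render the template to a separate file.
render_clp_config() {
    # $1: template containing ${REPLACE_IP}, $2: output path, $3: IP to insert.
    if [ ! -f "$1" ]; then
        echo "Error: $1 not found" >&2
        return 1
    fi
    sed "s|\${REPLACE_IP}|$3|g" "$1" > "$2"
}
```

A caller could pass `"$(hostname -i)"` as the third argument, as the original function does, and point the output path directly at the package's `etc/clp-config.yml`.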
|
||||||||||||||||||||||
|
||||||||||||||||||||||
function update_metadata_config { | ||||||||||||||||||||||
if [[ $# -ne 1 ]]; then | ||||||||||||||||||||||
echo "Usage: update_metadata_config </path/to/clp-package>" | ||||||||||||||||||||||
return 1 | ||||||||||||||||||||||
fi | ||||||||||||||||||||||
|
||||||||||||||||||||||
local clp_pkg_home=$1 | ||||||||||||||||||||||
local clp_config_path="$(readlink -f ${clp_pkg_home})/etc/clp-config.yml" | ||||||||||||||||||||||
local credential_path="$(readlink -f ${clp_pkg_home})/etc/credentials.yml" | ||||||||||||||||||||||
Comment on lines +20 to +21
🧹 Nitpick (assertive)
Quote shell variables to avoid word-splitting and globbing.
- local clp_config_path="$(readlink -f ${clp_pkg_home})/etc/clp-config.yml"
- ...
- sed -i "s|^CLP_PACKAGE_ARCHIVES=.*|CLP_PACKAGE_ARCHIVES=\"$(readlink -f ${clp_pkg_home})/var/data/archives/default\"|" "$env_path"
+ local clp_config_path="$(readlink -f "${clp_pkg_home}")/etc/clp-config.yml"
+ ...
+ sed -i "s|^CLP_PACKAGE_ARCHIVES=.*|CLP_PACKAGE_ARCHIVES=\"$(readlink -f "${clp_pkg_home}")/var/data/archives/default\"|" "$env_path"
Also applies to: 39-39
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 20-20: Declare and assign separately to avoid masking return values. (SC2155)
[info] 20-20: Double quote to prevent globbing and word splitting. (SC2086)
[warning] 21-21: Declare and assign separately to avoid masking return values. (SC2155)
[info] 21-21: Double quote to prevent globbing and word splitting. (SC2086)
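The SC2155 warning listed for these lines can be seen in isolation with a small sketch. `failing_lookup`, `masked`, and `separate` are hypothetical names for illustration only; `failing_lookup` stands in for a command such as `readlink -f` on a path that cannot be resolved.

```bash
# Stand-in for a command that fails (an assumption for the demo).
failing_lookup() { return 1; }

masked() {
    # `local` itself succeeds, so the failure of the substitution is hidden.
    local path="$(failing_lookup)"
    echo "status=$?"
}

separate() {
    # Declaring first lets the assignment report the command's own status.
    local path
    if path="$(failing_lookup)"; then
        echo "status=0"
    else
        echo "status=$?"
    fi
}
```

Here `masked` reports a zero status even though the lookup failed, while `separate` sees the failure and can react to it, which is the behavior the warning is about.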
|
||||||||||||||||||||||
host=$(python3 -c 'import sys, yaml; print(yaml.load(sys.stdin)["database"]["host"])' < "$clp_config_path") | ||||||||||||||||||||||
port=$(python3 -c 'import sys, yaml; print(yaml.load(sys.stdin)["database"]["port"])' < "$clp_config_path") | ||||||||||||||||||||||
name=$(python3 -c 'import sys, yaml; print(yaml.load(sys.stdin)["database"]["name"])' < "$clp_config_path") | ||||||||||||||||||||||
user=$(python3 -c 'import sys, yaml; print(yaml.load(sys.stdin)["database"]["user"])' < "$credential_path") | ||||||||||||||||||||||
password=$(python3 -c 'import sys, yaml; print(yaml.load(sys.stdin)["database"]["password"])' < "$credential_path") | ||||||||||||||||||||||
Comment on lines +22 to +26
Unsafe YAML parsing – switch to
-import sys, yaml; print(yaml.load(sys.stdin)["database"]["host"])
+import sys, yaml; print(yaml.safe_load(sys.stdin.read())["database"]["host"])
Apply the same change to the other four inline Python invocations.
|
||||||||||||||||||||||
echo "Metadata database host: $host" | ||||||||||||||||||||||
echo "Metadata database port: $port" | ||||||||||||||||||||||
echo "Metadata database name: $name" | ||||||||||||||||||||||
echo "Metadata database user: $user" | ||||||||||||||||||||||
echo "Metadata database password: $password" | ||||||||||||||||||||||
|
||||||||||||||||||||||
local env_path="${SCRIPT_PATH}/../.env" | ||||||||||||||||||||||
cp "$env_path" "${env_path}.bak" | ||||||||||||||||||||||
sed -i "s|^PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL=.*|PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL=\"jdbc:mysql://${host}:${port}\"|" "$env_path" | ||||||||||||||||||||||
sed -i "s/^PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_NAME=.*/PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_NAME=\"${name}\"/" "$env_path" | ||||||||||||||||||||||
sed -i "s/^PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER=.*/PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER=\"${user}\"/" "$env_path" | ||||||||||||||||||||||
sed -i "s/^PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD=.*/PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD=\"${password}\"/" "$env_path" | ||||||||||||||||||||||
sed -i "s|^CLP_PACKAGE_ARCHIVES=.*|CLP_PACKAGE_ARCHIVES=\"$(readlink -f ${clp_pkg_home})/var/data/archives/default\"|" "$env_path" | ||||||||||||||||||||||
} | ||||||||||||||||||||||
|
||||||||||||||||||||||
if declare -f "$1" > /dev/null; then | ||||||||||||||||||||||
"$@" | ||||||||||||||||||||||
else | ||||||||||||||||||||||
echo "Error: '$1' is not a valid function name." | ||||||||||||||||||||||
echo "Available functions:" | ||||||||||||||||||||||
declare -F | awk '{print $3}' | ||||||||||||||||||||||
exit 1 | ||||||||||||||||||||||
fi | ||||||||||||||||||||||
|
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,39 @@ | ||||||||||||||||||||||||||||||
version: "3.9" | ||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
services: | ||||||||||||||||||||||||||||||
presto-coordinator: | ||||||||||||||||||||||||||||||
image: ghcr.io/y-scope/presto/coordinator:dev | ||||||||||||||||||||||||||||||
entrypoint: ["/bin/bash", "-c", "/scripts/generate-configs.sh && /opt/entrypoint.sh"] | ||||||||||||||||||||||||||||||
env_file: | ||||||||||||||||||||||||||||||
Comment on lines +5 to +7
🧹 Nitpick (assertive)
Add explicit
Without a restart policy, the services stay down after a container crash or host reboot, which hurts dev UX.
|
||||||||||||||||||||||||||||||
- .env | ||||||||||||||||||||||||||||||
volumes: | ||||||||||||||||||||||||||||||
- ./coordinator/scripts:/scripts:ro | ||||||||||||||||||||||||||||||
- coordinator-config:/opt/presto-server/etc | ||||||||||||||||||||||||||||||
- ./coordinator/config-template:/configs:ro | ||||||||||||||||||||||||||||||
networks: | ||||||||||||||||||||||||||||||
- presto | ||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
presto-worker: | ||||||||||||||||||||||||||||||
image: ghcr.io/y-scope/presto/prestissimo-worker:dev | ||||||||||||||||||||||||||||||
depends_on: | ||||||||||||||||||||||||||||||
presto-coordinator: | ||||||||||||||||||||||||||||||
condition: service_started | ||||||||||||||||||||||||||||||
entrypoint: ["/bin/bash", "-c", "/scripts/generate-configs.sh && /opt/entrypoint.sh"] | ||||||||||||||||||||||||||||||
env_file: | ||||||||||||||||||||||||||||||
- .env | ||||||||||||||||||||||||||||||
volumes: | ||||||||||||||||||||||||||||||
- ./worker/scripts:/scripts:ro | ||||||||||||||||||||||||||||||
- worker-config:/opt/presto-server/etc | ||||||||||||||||||||||||||||||
- ./worker/config-template:/configs:ro | ||||||||||||||||||||||||||||||
- "${CLP_PACKAGE_ARCHIVES}:${CLP_PACKAGE_ARCHIVES}" | ||||||||||||||||||||||||||||||
networks: | ||||||||||||||||||||||||||||||
Comment on lines +24 to +29
🧹 Nitpick (assertive)
Mounting host path via env var can fail on Windows
The volume
|
||||||||||||||||||||||||||||||
- presto | ||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
volumes: | ||||||||||||||||||||||||||||||
coordinator-config: | ||||||||||||||||||||||||||||||
worker-config: | ||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||
networks: | ||||||||||||||||||||||||||||||
presto: | ||||||||||||||||||||||||||||||
driver: bridge | ||||||||||||||||||||||||||||||
Comment on lines +32 to +38
🧹 Nitpick (assertive)
One blank line too many
- driver: bridge
-
+ driver: bridge
|
||||||||||||||||||||||||||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
connector.name=clp | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
discovery.uri=http://${PRESTO_COORDINATOR_SERVICENAME}:${PRESTO_COORDINATOR_HTTPPORT} | ||
presto.version=REPLACE_ME | ||
http-server.http.port=${PRESTO_WORKER_HTTPPORT} | ||
shutdown-onset-sec=1 | ||
register-test-functions=false | ||
runtime-metrics-collection-enabled=false | ||
Comment on lines +1 to +6
💡 Verification agent
🧩 Analysis chain
Automate replacement of
Example diff for (1):
-presto.version=REPLACE_ME
+presto.version=UNKNOWN # replaced by generate-configs.sh
I’ll verify how
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the generate-configs.sh script
SCRIPT=$(fd generate-configs.sh | head -n1)
echo "Inspecting: $SCRIPT"
# Check for REPLACE_ME usage
grep -n -C 3 'REPLACE_ME' "$SCRIPT" || echo "⚠️ No occurrence of REPLACE_ME found"
# Check where presto.version is set or validated
grep -n -C 3 'presto.version' "$SCRIPT" || echo "⚠️ No direct handling of presto.version in this script"
Length of output: 632
Automate presto.version substitution and validate custom keys
The
• In yscope-compose/worker/config-template/config.properties
-presto.version=REPLACE_ME
+presto.version=UNKNOWN # replaced by generate-configs.sh
• In yscope-compose/coordinator/scripts/generate-configs.sh
• Manually confirm that custom keys
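One way to catch leftover placeholders, sketched under the assumption that any `REPLACE_ME` still present in the rendered config directory is an error. `check_no_placeholders` is a hypothetical helper, not part of the PR. Whether such a check belongs at the end of `generate-configs.sh` depends on where the placeholders are meant to be filled in (for example, by the image's own entrypoint), which the diff does not show.

```bash
# Hypothetical helper (an assumption): fail if any rendered file still
# contains the REPLACE_ME placeholder.
check_no_placeholders() {
    # $1: directory of rendered config files.
    # grep prints each offending file and line number before the error.
    if grep -rn 'REPLACE_ME' "$1"; then
        echo "Error: unresolved REPLACE_ME placeholders under $1" >&2
        return 1
    fi
}
```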
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
node.environment=${PRESTO_COORDINATOR_CONFIG_NODEPROPERTIES_ENVIRONMENT} | ||
node.internal-address=REPLACE_ME | ||
node.location=${PRESTO_WORKER_CONFIG_NODEPROPERTIES_LOCATION} | ||
node.id=REPLACE_ME | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
mutable-config=true | ||
|
Plain-text credentials committed to VCS.
PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD="123456" exposes a real (or realistic) password.
For public repos consider:
• Replacing with an obvious placeholder
• Loading sensitive values via external .env.local ignored by Git
This helps prevent accidental leaks when the configuration evolves beyond local demos.
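The override idea can be sketched in shell, assuming a committed `.env` that holds placeholders and an untracked `.env.local` that holds the real values. `load_env` and the `DEMO_*` variable names in the usage note are hypothetical; how the override file would be wired into `docker compose` or `init.sh` is left open here.

```bash
# Hypothetical helper (an assumption): source a base env file, then let an
# optional local file override its values.
load_env() {
    # $1: committed env file, $2: optional untracked override file.
    set -a              # export every variable assigned while sourcing
    . "$1"
    if [ -f "$2" ]; then
        . "$2"
    fi
    set +a
}
```

For example, with `DEMO_DB_PASSWORD="CHANGE_ME"` in the base file and a real value in the override file, the override wins because it is sourced last. Both paths should include a directory component, since `.` searches `PATH` for bare file names.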
🧰 Tools
🪛 dotenv-linter (3.3.0)
[warning] 9-9: [QuoteCharacter] The value has quote characters (', ")
[warning] 9-9: [UnorderedKey] The PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_USER key should go before the PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_PROVIDERTYPE key
[warning] 10-10: [QuoteCharacter] The value has quote characters (', ")
[warning] 10-10: [UnorderedKey] The PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_PASSWORD key should go before the PRESTO_COORDINATOR_CONFIG_CLPPROPERTIES_METADATA_DATABASE_URL key
[warning] 11-11: [QuoteCharacter] The value has quote characters (', ")