Fix bash options causing "silent" bugs in script execution #4612
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
Most bash scripts in the repository were missing critical bash options, particularly
set -o nounset, which caused undefined variables to be treated as empty strings instead of triggering errors. This led to "silent" bugs where scripts would continue executing with incorrect behavior.Specific issue identified:
make build-api-imagewas not using the "CI ACR" because theCI_CACHE_ACR_NAMEvariable was undefinedERROR: invalid reference formatwhen the cache-from parameter became malformedset +o nounsetworkarounds to handle cascade effects from other scripts lacking proper optionsSolution
Added proper bash options (
set -o errexit,set -o pipefail,set -o nounset) to 85+ bash scripts across the repository and removed the workarounds that were masking the root cause.Scripts Updated
Core Infrastructure:
devops/scripts/mgmtacr_enable_public_access.sh- ACR public access managementdevops/scripts/mgmtstorage_enable_public_access.sh- Storage account access managementdevops/scripts/kv_add_network_exception.sh- Key Vault network exceptionsdevops/scripts/set_docker_sock_permission.sh- Docker permissions (used in Makefile)Build Pipeline:
devops/scripts/terraform_wrapper.sh- Core Terraform operationsdevops/scripts/api_healthcheck.sh- API health validationcli/scripts/build.sh- CLI build processdevops/scripts/porter_build_bundle.sh- Porter bundle buildingHelper Functions:
devops/scripts/construct_tre_url.sh- TRE URL constructiondevops/scripts/convert_azure_env_to_arm_env.sh- Environment conversiondevops/scripts/bash_trap_helper.sh- Exit trap managementCore Terraform Scripts:
core/terraform/outputs.sh,core/terraform/json-to-env.shcore/terraform/compare_plans.sh,core/terraform/scripts/letsencrypt.shRemoved Workarounds:
set +o nounsetfromdevops/scripts/check_dependencies.sh,load_and_validate_env.sh,env_to_yaml_config.sh, andload_env.shthat were added specifically to handle cascade effectsTesting
Created comprehensive tests validating:
CI_CACHE_ACR_NAMEDocker cache scenario works correctlyImpact
make build-api-imageused wrong ACR configurationnounsetin core scriptsBefore:
After:
Fixes #1672.
💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.