fix: route ndt7 tests to production instead of staging#164
Merged
robertodauria merged 1 commit intomainfrom Feb 26, 2026
Merged
fix: route ndt7 tests to production instead of staging#164robertodauria merged 1 commit intomainfrom
robertodauria merged 1 commit intomainfrom
Conversation
…titution
The CI pipeline runs `sed -i 's/MLAB_PROJECT_PLACEHOLDER/mlab-oti/g'`
to set the production project. The previous strict equality check
`placeholder === 'MLAB_PROJECT_PLACEHOLDER'` became
`'mlab-oti' === 'mlab-oti'` after substitution, which is always true,
causing production to use the mlab-staging fallback.
Using `placeholder.includes('PLACEHOLDER')` instead ensures the check
correctly detects an unsubstituted placeholder while evaluating to false
after sed replaces it with 'mlab-oti'.
This bug has been silently routing all ndt7 tests from the website to
staging infrastructure since the 2026-02-03 release, causing a 99% drop
in ndt7 test data in production BigQuery.
|
Visit the preview URL for this PR (updated for commit 9a485f9): https://mlab-sandbox--pr-164-cotdw1ek.web.app (expires Thu, 05 Mar 2026 01:35:52 GMT) 🔥 via Firebase Hosting GitHub Action 🌎 Sign: 8e465d9b3e7c624aa653052fc53678329c9bfc07 |
bassosimone
added a commit
that referenced
this pull request
Feb 26, 2026
The #81 pull request introduced a nasty bug where we weren't using prod. The TL;DR of the bug is that `sed` was replacing both the initial assignment and the pattern, such that this code: ```javascript const placeholder = 'MLAB_PROJECT_PLACEHOLDER'; return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder; ``` became, at runtime: ```javascript const placeholder = 'mlab-oti'; return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder; ``` This was fixed in #164 but `sed` substitution is fragile. We need more robustness. I spent time thinking about this and this is what I concluded: 1. The biggest issue of the `sed` substitution is not the substitution per se, but a *lack of debuggability* by which the substitution only happens when deploying and cannot be tested locally. 2. Any solution that only applies when merging a pull request would be morally as bad as that one. It could fail without notice. 3. We now have `scripts/build.js` so we can take advantage of that. 4. Using `scripts/build.js` to select a committed `env.prod.js` _or_ `env.staging.js` is a *bad idea* because they might drift, thus putting us *back again* into a scenario where we only know something is wrong after merging a pull request. Based on this reasoning, with this diff I am proposing what I think is the most robust solution to the problem at hand: 1. We autogenerate `dist/js/env.js` from `scripts/build.js` 2. `scripts/build.js` *requires* a command line argument identifying the target deployment environment and this value is *parsed* and we reject invalid configurations (furthermore, it is always required to specify the argument so prod and staging cannot drift) 3. `dist/js/env.js` *only* contains a *single* variable named `mlabEnvName` and that can be *either* "prod" or "staging" (and it must always be like that, so drift is *not possible*) 4. The codebase *continues* to select the correct configuration inline depending on `mlabEnvName` values (which preserves the principle of locality and ensures the code that selects the config is *close to* the code it has effects upon) 5. Now the codebase *rejects* invalid or missing `mlabEnvName` and the "not configured case" leads to *hard failure* 6. Both deployment paths use the same pattern `npm run build -- $name` where `$name` is either "prod" or "staging" (this is arguably better than before where the staging deployment relied on a default) As a bonus, deploy `console.log` aggressively. This follows the *rule of transparency*. The operator must be able to easily and clearly see what URLs we're using. Easy inspection solves many problems and empowers to perform better manual testing. (Also, a user can now more easily report bugs.)
bassosimone
added a commit
that referenced
this pull request
Feb 26, 2026
The #81 pull request introduced a nasty bug where we weren't using prod. The TL;DR of the bug is that `sed` was replacing both the initial assignment and the pattern, such that this code: ```javascript const placeholder = 'MLAB_PROJECT_PLACEHOLDER'; return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder; ``` became, at runtime: ```javascript const placeholder = 'mlab-oti'; return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder; ``` This was fixed in #164 but `sed` substitution is fragile. We need more robustness. I spent time thinking about this and this is what I concluded: 1. The biggest issue of the `sed` substitution is not the substitution per se, but a *lack of debuggability* by which the substitution only happens when deploying and cannot be tested locally. 2. Any solution that only applies when merging a pull request would be morally as bad as that one. It could fail without notice. 3. We now have `scripts/build.js` so we can take advantage of that. 4. Using `scripts/build.js` to select a committed `env.prod.js` _or_ `env.staging.js` is a *bad idea* because they might drift, thus putting us *back again* into a scenario where we only know something is wrong after merging a pull request. Based on this reasoning, with this diff I am proposing what I think is the most robust solution to the problem at hand: 1. We autogenerate `dist/js/env.js` from `scripts/build.js` 2. `scripts/build.js` *requires* a command line argument identifying the target deployment environment and this value is *parsed* and we reject invalid configurations (furthermore, it is always required to specify the argument so prod and staging cannot drift) 3. `dist/js/env.js` *only* contains a *single* variable named `mlabEnvName` and that can be *either* "prod" or "staging" (and it must always be like that, so drift is *not possible*) 4. The codebase *continues* to select the correct configuration inline depending on `mlabEnvName` values (which preserves the principle of locality and ensures the code that selects the config is *close to* the code it has effects upon) 5. Now the codebase *rejects* invalid or missing `mlabEnvName` and the "not configured case" leads to *hard failure* 6. Both deployment paths use the same pattern `npm run build -- $name` where `$name` is either "prod" or "staging" (this is arguably better than before where the staging deployment relied on a default) As a bonus, deploy `console.log` aggressively. This follows the *rule of transparency*. The operator must be able to easily and clearly see what URLs we're using. Easy inspection solves many problems and empowers to perform better manual testing. (Also, a user can now more easily report bugs.)
bassosimone
added a commit
that referenced
this pull request
Feb 26, 2026
The #81 pull request introduced a nasty bug where we weren't using prod. The TL;DR of the bug is that `sed` was replacing both the initial assignment and the pattern, such that this code: ```javascript const placeholder = 'MLAB_PROJECT_PLACEHOLDER'; return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder; ``` became, at runtime: ```javascript const placeholder = 'mlab-oti'; return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder; ``` This was fixed in #164 but `sed` substitution is fragile. We need more robustness. I spent time thinking about this and this is what I concluded: 1. The biggest issue of the `sed` substitution is not the substitution per se, but a *lack of debuggability* by which the substitution only happens when deploying and cannot be tested locally. 2. Any solution that only applies when merging a pull request would be morally as bad as that one. It could fail without notice. 3. We now have `scripts/build.js` so we can take advantage of that. 4. Using `scripts/build.js` to select a committed `env.prod.js` _or_ `env.staging.js` is a *bad idea* because they might drift, thus putting us *back again* into a scenario where we only know something is wrong after merging a pull request. Based on this reasoning, with this diff I am proposing what I think is the most robust solution to the problem at hand: 1. We autogenerate `dist/js/env.js` from `scripts/build.js` 2. `scripts/build.js` *requires* a command line argument identifying the target deployment environment and this value is *parsed* and we reject invalid configurations (furthermore, it is always required to specify the argument so prod and staging cannot drift) 3. `dist/js/env.js` *only* contains a *single* variable named `mlabEnvName` and that can be *either* "prod" or "staging" (and it must always be like that, so drift is *not possible*) 4. The codebase *continues* to select the correct configuration inline depending on `mlabEnvName` values (which preserves the principle of locality and ensures the code that selects the config is *close to* the code it has effects upon) 5. Now the codebase *rejects* invalid or missing `mlabEnvName` and the "not configured case" leads to *hard failure* 6. Both deployment paths use the same pattern `npm run build -- $name` where `$name` is either "prod" or "staging" (this is arguably better than before where the staging deployment relied on a default) As a bonus, deploy `console.log` aggressively. This follows the *rule of transparency*. The operator must be able to easily and clearly see what URLs we're using. Easy inspection solves many problems and empowers to perform better manual testing. (Also, a user can now more easily report bugs.)
bassosimone
added a commit
that referenced
this pull request
Feb 26, 2026
The #81 pull request introduced a nasty bug where we weren't using prod. The TL;DR of the bug is that `sed` was replacing both the initial assignment and the pattern, such that this code: ```javascript const placeholder = 'MLAB_PROJECT_PLACEHOLDER'; return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder; ``` became, at runtime: ```javascript const placeholder = 'mlab-oti'; return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder; ``` This was fixed in #164 but `sed` substitution is fragile. We need more robustness. I spent time thinking about this and this is what I concluded: 1. The biggest issue of the `sed` substitution is not the substitution per se, but a *lack of debuggability* by which the substitution only happens when deploying and cannot be tested locally. 2. Any solution that only applies when merging a pull request would be morally as bad as that one. It could fail without notice. 3. We now have `scripts/build.js` so we can take advantage of that. 4. Using `scripts/build.js` to select a committed `env.prod.js` _or_ `env.staging.js` is a *bad idea* because they might drift, thus putting us *back again* into a scenario where we only know something is wrong after merging a pull request. Based on this reasoning, with this diff I am proposing what I think is the most robust solution to the problem at hand: 1. We autogenerate `dist/js/env.js` from `scripts/build.js` 2. `scripts/build.js` *requires* a command line argument identifying the target deployment environment and this value is *parsed* and we reject invalid configurations (furthermore, it is always required to specify the argument so prod and staging cannot drift) 3. `dist/js/env.js` *only* contains a *single* variable named `mlabEnvName` and that can be *either* "prod" or "staging" (and it must always be like that, so drift is *not possible*) 4. The codebase *continues* to select the correct configuration inline depending on `mlabEnvName` values (which preserves the principle of locality and ensures the code that selects the config is *close to* the code it has effects upon) 5. Now the codebase *rejects* invalid or missing `mlabEnvName` and the "not configured case" leads to *hard failure* 6. Both deployment paths use the same pattern `npm run build -- $name` where `$name` is either "prod" or "staging" (this is arguably better than before where the staging deployment relied on a default) As a bonus, deploy `console.log` aggressively. This follows the *rule of transparency*. The operator must be able to easily and clearly see what URLs we're using. Easy inspection solves many problems and empowers to perform better manual testing. (Also, a user can now more easily report bugs.)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
mlabProject()so the sed placeholder substitution in CI actually worksplaceholder === 'MLAB_PROJECT_PLACEHOLDER'was replaced by sed into'mlab-oti' === 'mlab-oti'which is always true, causing the function to always return'mlab-staging'placeholder.includes('PLACEHOLDER')which correctly evaluates tofalseafter substitutionImpact
Since the 2026-02-03 release, all ndt7 tests from the website have been routed to staging infrastructure. This caused: