Skip to content

fix: route ndt7 tests to production instead of staging#164

Merged
robertodauria merged 1 commit intomainfrom
fix/mlab-project-placeholder
Feb 26, 2026
Merged

fix: route ndt7 tests to production instead of staging#164
robertodauria merged 1 commit intomainfrom
fix/mlab-project-placeholder

Conversation

@robertodauria
Copy link
Copy Markdown
Contributor

Summary

  • Fix mlabProject() so the sed placeholder substitution in CI actually works
  • The strict equality check placeholder === 'MLAB_PROJECT_PLACEHOLDER' was replaced by sed into 'mlab-oti' === 'mlab-oti' which is always true, causing the function to always return 'mlab-staging'
  • Changed to placeholder.includes('PLACEHOLDER') which correctly evaluates to false after substitution

Impact

Since the 2026-02-03 release, all ndt7 tests from the website have been routed to staging infrastructure. This caused:

  • ndt7 test data in production BigQuery dropped from ~2,000/day to ~15/day
  • MSAK tests were unaffected (they don't use this code path)
  • Tests appeared to work correctly for users — results displayed fine in the browser — but data landed in the staging project instead of production

…titution

The CI pipeline runs `sed -i 's/MLAB_PROJECT_PLACEHOLDER/mlab-oti/g'`
to set the production project. The previous strict equality check
`placeholder === 'MLAB_PROJECT_PLACEHOLDER'` became
`'mlab-oti' === 'mlab-oti'` after substitution, which is always true,
causing production to use the mlab-staging fallback.

Using `placeholder.includes('PLACEHOLDER')` instead ensures the check
correctly detects an unsubstituted placeholder while evaluating to false
after sed replaces it with 'mlab-oti'.

This bug has been silently routing all ndt7 tests from the website to
staging infrastructure since the 2026-02-03 release, causing a 99% drop
in ndt7 test data in production BigQuery.
@github-actions
Copy link
Copy Markdown

Visit the preview URL for this PR (updated for commit 9a485f9):

https://mlab-sandbox--pr-164-cotdw1ek.web.app

(expires Thu, 05 Mar 2026 01:35:52 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 8e465d9b3e7c624aa653052fc53678329c9bfc07

@robertodauria robertodauria merged commit c583f65 into main Feb 26, 2026
3 checks passed
@robertodauria robertodauria deleted the fix/mlab-project-placeholder branch February 26, 2026 01:45
bassosimone added a commit that referenced this pull request Feb 26, 2026
The #81 pull request
introduced a nasty bug where we weren't using prod.

The TL;DR of the bug is that `sed` was replacing both the initial
assignment and the pattern, such that this code:

```javascript
    const placeholder = 'MLAB_PROJECT_PLACEHOLDER';
    return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder;
```

became, at runtime:

```javascript
    const placeholder = 'mlab-oti';
    return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder;
```

This was fixed in #164
but `sed` substitution is fragile. We need more robustness.

I spent time thinking about this and this is what I concluded:

1. The biggest issue of the `sed` substitution is not the substitution
per se, but a *lack of debuggability* by which the substitution only
happens when deploying and cannot be tested locally.

2. Any solution that only applies when merging a pull request would
be morally as bad as that one. It could fail without notice.

3. We now have `scripts/build.js` so we can take advantage of that.

4. Using `scripts/build.js` to select a committed `env.prod.js` _or_
`env.staging.js` is a *bad idea* because they might drift, thus
putting us *back again* into a scenario where we only know something
is wrong after merging a pull request.

Based on this reasoning, with this diff I am proposing what I think
is the most robust solution to the problem at hand:

1. We autogenerate `dist/js/env.js` from `scripts/build.js`

2. `scripts/build.js` *requires* a command line argument identifying
the target deployment environment and this value is *parsed* and we
reject invalid configurations (furthermore, it is always required to
specify the argument so prod and staging cannot drift)

3. `dist/js/env.js` *only* contains a *single* variable named
`mlabEnvName` and that can be *either* "prod" or "staging" (and it
must always be like that, so drift is *not possible*)

4. The codebase *continues* to select the correct configuration
inline depending on `mlabEnvName` values (which preserves the
principle of locality and ensures the code that selects the config
is *close to* the code it has effects upon)

5. Now the codebase *rejects* invalid or missing `mlabEnvName`
and the "not configured case" leads to *hard failure*

6. Both deployment paths use the same pattern `npm run build -- $name`
where `$name` is either "prod" or "staging" (this is arguably
better than before where the staging deployment relied on a default)

As a bonus, deploy `console.log` aggressively. This follows the
*rule of transparency*. The operator must be able to easily and
clearly see what URLs we're using. Easy inspection solves many
problems and empowers to perform better manual testing. (Also, a
user can now more easily report bugs.)
bassosimone added a commit that referenced this pull request Feb 26, 2026
The #81 pull request
introduced a nasty bug where we weren't using prod.

The TL;DR of the bug is that `sed` was replacing both the initial
assignment and the pattern, such that this code:

```javascript
    const placeholder = 'MLAB_PROJECT_PLACEHOLDER';
    return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder;
```

became, at runtime:

```javascript
    const placeholder = 'mlab-oti';
    return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder;
```

This was fixed in #164
but `sed` substitution is fragile. We need more robustness.

I spent time thinking about this and this is what I concluded:

1. The biggest issue of the `sed` substitution is not the substitution
per se, but a *lack of debuggability* by which the substitution only
happens when deploying and cannot be tested locally.

2. Any solution that only applies when merging a pull request would
be morally as bad as that one. It could fail without notice.

3. We now have `scripts/build.js` so we can take advantage of that.

4. Using `scripts/build.js` to select a committed `env.prod.js` _or_
`env.staging.js` is a *bad idea* because they might drift, thus
putting us *back again* into a scenario where we only know something
is wrong after merging a pull request.

Based on this reasoning, with this diff I am proposing what I think
is the most robust solution to the problem at hand:

1. We autogenerate `dist/js/env.js` from `scripts/build.js`

2. `scripts/build.js` *requires* a command line argument identifying
the target deployment environment and this value is *parsed* and we
reject invalid configurations (furthermore, it is always required to
specify the argument so prod and staging cannot drift)

3. `dist/js/env.js` *only* contains a *single* variable named
`mlabEnvName` and that can be *either* "prod" or "staging" (and it
must always be like that, so drift is *not possible*)

4. The codebase *continues* to select the correct configuration
inline depending on `mlabEnvName` values (which preserves the
principle of locality and ensures the code that selects the config
is *close to* the code it has effects upon)

5. Now the codebase *rejects* invalid or missing `mlabEnvName`
and the "not configured case" leads to *hard failure*

6. Both deployment paths use the same pattern `npm run build -- $name`
where `$name` is either "prod" or "staging" (this is arguably
better than before where the staging deployment relied on a default)

As a bonus, deploy `console.log` aggressively. This follows the
*rule of transparency*. The operator must be able to easily and
clearly see what URLs we're using. Easy inspection solves many
problems and empowers to perform better manual testing. (Also, a
user can now more easily report bugs.)
bassosimone added a commit that referenced this pull request Feb 26, 2026
The #81 pull request
introduced a nasty bug where we weren't using prod.

The TL;DR of the bug is that `sed` was replacing both the initial
assignment and the pattern, such that this code:

```javascript
    const placeholder = 'MLAB_PROJECT_PLACEHOLDER';
    return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder;
```

became, at runtime:

```javascript
    const placeholder = 'mlab-oti';
    return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder;
```

This was fixed in #164
but `sed` substitution is fragile. We need more robustness.

I spent time thinking about this and this is what I concluded:

1. The biggest issue of the `sed` substitution is not the substitution
per se, but a *lack of debuggability* by which the substitution only
happens when deploying and cannot be tested locally.

2. Any solution that only applies when merging a pull request would
be morally as bad as that one. It could fail without notice.

3. We now have `scripts/build.js` so we can take advantage of that.

4. Using `scripts/build.js` to select a committed `env.prod.js` _or_
`env.staging.js` is a *bad idea* because they might drift, thus
putting us *back again* into a scenario where we only know something
is wrong after merging a pull request.

Based on this reasoning, with this diff I am proposing what I think
is the most robust solution to the problem at hand:

1. We autogenerate `dist/js/env.js` from `scripts/build.js`

2. `scripts/build.js` *requires* a command line argument identifying
the target deployment environment and this value is *parsed* and we
reject invalid configurations (furthermore, it is always required to
specify the argument so prod and staging cannot drift)

3. `dist/js/env.js` *only* contains a *single* variable named
`mlabEnvName` and that can be *either* "prod" or "staging" (and it
must always be like that, so drift is *not possible*)

4. The codebase *continues* to select the correct configuration
inline depending on `mlabEnvName` values (which preserves the
principle of locality and ensures the code that selects the config
is *close to* the code it has effects upon)

5. Now the codebase *rejects* invalid or missing `mlabEnvName`
and the "not configured case" leads to *hard failure*

6. Both deployment paths use the same pattern `npm run build -- $name`
where `$name` is either "prod" or "staging" (this is arguably
better than before where the staging deployment relied on a default)

As a bonus, deploy `console.log` aggressively. This follows the
*rule of transparency*. The operator must be able to easily and
clearly see what URLs we're using. Easy inspection solves many
problems and empowers to perform better manual testing. (Also, a
user can now more easily report bugs.)
bassosimone added a commit that referenced this pull request Feb 26, 2026
The #81 pull request
introduced a nasty bug where we weren't using prod.

The TL;DR of the bug is that `sed` was replacing both the initial
assignment and the pattern, such that this code:

```javascript
    const placeholder = 'MLAB_PROJECT_PLACEHOLDER';
    return placeholder === 'MLAB_PROJECT_PLACEHOLDER' ? 'mlab-staging' : placeholder;
```

became, at runtime:

```javascript
    const placeholder = 'mlab-oti';
    return placeholder === 'mlab-oti' ? 'mlab-staging' : placeholder;
```

This was fixed in #164
but `sed` substitution is fragile. We need more robustness.

I spent time thinking about this and this is what I concluded:

1. The biggest issue of the `sed` substitution is not the substitution
per se, but a *lack of debuggability* by which the substitution only
happens when deploying and cannot be tested locally.

2. Any solution that only applies when merging a pull request would
be morally as bad as that one. It could fail without notice.

3. We now have `scripts/build.js` so we can take advantage of that.

4. Using `scripts/build.js` to select a committed `env.prod.js` _or_
`env.staging.js` is a *bad idea* because they might drift, thus
putting us *back again* into a scenario where we only know something
is wrong after merging a pull request.

Based on this reasoning, with this diff I am proposing what I think
is the most robust solution to the problem at hand:

1. We autogenerate `dist/js/env.js` from `scripts/build.js`

2. `scripts/build.js` *requires* a command line argument identifying
the target deployment environment and this value is *parsed* and we
reject invalid configurations (furthermore, it is always required to
specify the argument so prod and staging cannot drift)

3. `dist/js/env.js` *only* contains a *single* variable named
`mlabEnvName` and that can be *either* "prod" or "staging" (and it
must always be like that, so drift is *not possible*)

4. The codebase *continues* to select the correct configuration
inline depending on `mlabEnvName` values (which preserves the
principle of locality and ensures the code that selects the config
is *close to* the code it has effects upon)

5. Now the codebase *rejects* invalid or missing `mlabEnvName`
and the "not configured case" leads to *hard failure*

6. Both deployment paths use the same pattern `npm run build -- $name`
where `$name` is either "prod" or "staging" (this is arguably
better than before where the staging deployment relied on a default)

As a bonus, deploy `console.log` aggressively. This follows the
*rule of transparency*. The operator must be able to easily and
clearly see what URLs we're using. Easy inspection solves many
problems and empowers to perform better manual testing. (Also, a
user can now more easily report bugs.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant