Skip to content

Conversation

jcobis
Copy link
Collaborator

@jcobis jcobis commented Sep 11, 2025

Description

Implement the mongosh script generation engine that creates an executable script based on the faker schema. The script will generate and upload mock data.

Checklist

  • New tests and/or benchmarks are included
  • Documentation is changed or added
  • If this change updates the UI, screenshots/videos are added and a design review is requested
  • I have signed the MongoDB Contributor License Agreement (https://www.mongodb.com/legal/contributor-agreement)

Types of changes

  • Backport Needed
  • Patch (non-breaking change which fixes an issue)
  • Minor (non-breaking change which adds functionality)
  • Major (fix or feature that would cause existing functionality to change)

@Copilot Copilot AI review requested due to automatic review settings September 11, 2025 14:35
@jcobis jcobis requested a review from a team as a code owner September 11, 2025 14:35
@jcobis jcobis requested a review from johnjackweir September 11, 2025 14:35
@github-actions github-actions bot added the feat label Sep 11, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a mongosh script generation engine that creates executable scripts for generating and uploading mock data based on faker schemas. The engine supports complex nested structures, arrays, probabilistic fields, and configurable array lengths.

  • Core script generation functionality with support for nested objects, arrays, and multi-dimensional arrays
  • Faker.js integration with argument handling and fallback methods for unrecognized field types
  • Probabilistic field generation for optional data modeling

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
packages/compass-collection/src/components/mock-data-generator-modal/script-generation-utils.ts Implements the complete script generation engine with type definitions, path parsing, document structure building, and code generation
packages/compass-collection/src/components/mock-data-generator-modal/script-generation-utils.spec.ts Comprehensive test suite covering all functionality including edge cases, array handling, probability features, and faker argument processing

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@jcobis jcobis added the no release notes Fix or feature not for release notes label Sep 11, 2025
@jcobis jcobis requested review from kpamaran and ncarbon September 11, 2025 16:02
@mongodb-js mongodb-js deleted a comment from Copilot AI Sep 11, 2025
@mongodb-js mongodb-js deleted a comment from Copilot AI Sep 11, 2025
if (probability < 1.0) {
// Use Math.random for conditional field inclusion
rootLevelFields.push(
`${fieldIndent}...(Math.random() < ${probability} ? { ${fieldName}: ${fakerCall} } : {})`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why include the empty object? vs simply leaving rootLevelFields as is

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this isn't the most readable but it's conditional rendering using the spread operator ..., so if the probability condition is not met, nothing gets rendered (i.e no empty object). I can add a comment, or potentially refactor this to make it clearer. The advantage of this notation is that it's one line, and there may be many of these

/**
* Gets default faker method for unrecognized fields based on MongoDB type
*/
export function getDefaultFakerMethod(mongoType: string): string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to cover any more of the allowed datatypes?

some of these would be easy like null, Long, timestamp

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also changed to explicitly handle null and undefined by passing through those literal values

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcobis This doesn't actually generate code to create instances of these types, though ... for example, for Decimal128, you want something like Decimal128.fromStringWithRounding(`${faker.number.float()}`), not just faker.number.float()

It's also honestly a bit unclear here what mongoType actually is – e.g. in what situation would int64 be passed vs. in what situation long? Can we re-use an existing enum here, e.g. TypeCastTypes found elsewhere in the Compass source?

// This is equivalent to: function(faker) { return <returnExpression>; }
// The 'faker' parameter will receive the real faker.js library when we pass it in on call
// eslint-disable-next-line @typescript-eslint/no-implied-eval
const generateDocument = new Function(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never knew the Function constructor could be used this way. Very cool ✨

Copy link
Collaborator

@kpamaran kpamaran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see more codeql+copilot suggestions, but otherwise latest diffs lgtm

// The "{ ... }" part is the document structure

// Extract the return statement from the generateDocument function
const returnMatch = script.match(/return ([\s\S]*?);[\s]*\}/);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, completely optional, but this can be simplified a bit:

Suggested change
const returnMatch = script.match(/return ([\s\S]*?);[\s]*\}/);
const returnMatch = script.match(/return (.*?);\s*\}/s);

console.log(\`Successfully inserted \${documents.length} documents into ${
options.databaseName
}.${options.collectionName}\`);`;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ This still needs to be escaped, ' is a valid character in database and collection names

/**
* Gets default faker method for unrecognized fields based on MongoDB type
*/
export function getDefaultFakerMethod(mongoType: string): string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcobis This doesn't actually generate code to create instances of these types, though ... for example, for Decimal128, you want something like Decimal128.fromStringWithRounding(`${faker.number.float()}`), not just faker.number.float()

It's also honestly a bit unclear here what mongoType actually is – e.g. in what situation would int64 be passed vs. in what situation long? Can we re-use an existing enum here, e.g. TypeCastTypes found elsewhere in the Compass source?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat no release notes Fix or feature not for release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants