
Conversation

Contributor

@josh-arnold-1 commented Sep 14, 2025

What

Follow up from this issue here: #2238

Allow projects to define a custom `preparationBatchSize`. This can be incredibly useful for my large Bazel-based project.

Currently, I've only implemented a target strategy, which allows you to define a constant batch size. We can easily extend this with additional strategies in the future.

Usage

"preparationBatchingStrategy": {
  "strategy": "target",
  "batchSize": 200
}

Test plan

Tested locally in my project.

@ahoppen
Member

ahoppen commented Sep 16, 2025

Thanks for picking this up @josh-arnold-1 🙏. One high-level comment: I would really like us to design the configuration option that allows us to customize the batching strategy as I described in #2238 (comment). Do you think you could look into that?

@josh-arnold-1
Contributor Author

> Thanks for picking this up @josh-arnold-1 🙏. One high-level comment: I would really like us to design the configuration option that allows us to customize the batching strategy as I described in #2238 (comment). Do you think you could look into that?

Thanks for the review!

What if we update the schema to be an enum of configuration options like you specified, but we just supply a single option for now, which we default to a target size of 1 to maintain SourceKit-LSP's default behavior?

That way, we can easily add additional strategies in future PRs while maintaining the current configuration API.

What are your thoughts? Thanks!

      {
        "type": "object",
        "description": "Prepare a fixed number of targets in a single batch",
        "properties": {
          "strategy": {
            "const": "target"
          },
          "batchSize": {
            "type": "integer",
            "description": "Defines how many targets should be prepared in a single batch"
          }
        },
        "required": [
          "strategy",
          "batchSize"
        ]
      },

@ahoppen
Member

ahoppen commented Sep 17, 2025

Your proposal for the JSON schema sounds great to me!

@brentleyjones

Is there a way to set batchSize to inf/"batch everything at once"?

@bnbarham
Contributor

> Is there a way to set batchSize to inf/"batch everything at once"?

Setting it to a high value seems reasonable to me rather than handling that case specifically.
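For illustration, "batch everything at once" under the target strategy would just be a config with a very large batch size; the exact value is arbitrary, since any number at or above the project's total target count has the same effect:

```json
"preparationBatchingStrategy": {
  "strategy": "target",
  "batchSize": 1000000
}
```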

@josh-arnold-1
Contributor Author

Sorry for the delayed response — I was OOO recently. I’m looking into this now and realized that config.schema.json generation might not support enums with associated values (unless I’m missing something?).

If that’s the case, what would be the best way to represent the different batching strategies?

@ahoppen, any guidance here would be super helpful. Thanks!

@ahoppen
Member

ahoppen commented Oct 14, 2025

Yeah, the JSON schema generation would need to be expanded to support this. That’s what I meant in #2238 (comment).

As a side note, generating the oneOf in the schema above will likely need quite a bit of new functionality in ConfigSchemaGen. If we only stick to the target-based strategy, we should only need support for the const key in the JSON schema, which should be a lot easier to accomplish.
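As a sketch of the shape being discussed (illustrative only, not the actual ConfigSchemaGen output), the tagged union would sit under a oneOf, with const on the discriminator field so each branch only matches its own strategy:

```json
{
  "oneOf": [
    {
      "type": "object",
      "description": "Prepare a fixed number of targets in a single batch",
      "properties": {
        "strategy": { "const": "target" },
        "batchSize": { "type": "integer" }
      },
      "required": ["strategy", "batchSize"]
    }
  ]
}
```

With a single branch, the oneOf wrapper is optional, which is why supporting just the const key is enough for now.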

@josh-arnold-1 force-pushed the preparation-batch-size branch from b8a8cad to 5cd67b8 on October 16, 2025.
@josh-arnold-1
Contributor Author

Thanks @ahoppen, I just updated the code with what we discussed!

Member

@ahoppen left a comment

Thank you. I really appreciate that you put in the effort to generate the enum options in the JSON schema and Markdown document 🙏🏽. Just a few nitpicky comments.

Two other high-level comments:

  • Could you include the test from #2238 in this PR as well?
  • I would also like to see the BSP server advertising if it can handle multi-target preparation, as I mentioned in #2238 (comment), so that users can't get regressed performance for SwiftPM projects by increasing the target batch size. #2238 already implements most of this, so you should be able to just copy it. I would be happy for this to be a follow-up PR though, so we can get this one in without any further discussions on the BSP protocol extension.

if let markdownEnumDescriptions {
  try container.encode(markdownEnumDescriptions, forKey: .markdownEnumDescriptions)
}
if let oneOf = oneOf, !oneOf.isEmpty {
Member

Suggested change
if let oneOf = oneOf, !oneOf.isEmpty {
if let oneOf, !oneOf.isEmpty {

// so we only set `markdownEnumDescriptions` here.
if enumInfo.cases.contains(where: { $0.description != nil }) {
schema.markdownEnumDescriptions = enumInfo.cases.map { $0.description ?? "" }
let hasAssociatedTypes = enumInfo.cases.contains { $0.associatedProperties != nil && !$0.associatedProperties!.isEmpty }
Member

I personally think it reads nicer like this because you can just infer that nil is considered empty. But I know that some people disagree.

Suggested change
let hasAssociatedTypes = enumInfo.cases.contains { $0.associatedProperties != nil && !$0.associatedProperties!.isEmpty }
let hasAssociatedTypes = enumInfo.cases.contains { !($0.associatedProperties?.isEmpty ?? true) }

doc += "\(indent) - This is a tagged union discriminated by the `\(discriminatorFieldName)` field. Each case has the following structure:\n"

for caseInfo in schema.cases {
doc += "\(indent) - `\(discriminatorFieldName): \"\(caseInfo.name)\"`"
Member

Very nitpicky: I generally find that string literals that contain quotes read nicer when used in multi-line string literals because then you don’t need to escape the quotes.

Suggested change
doc += "\(indent) - `\(discriminatorFieldName): \"\(caseInfo.name)\"`"
doc += """
\(indent) - `\(discriminatorFieldName): "\(caseInfo.name)"`
"""

// run into this timeout, which causes somewhat expensive computations because we trigger the `buildTargetsChanged`
// chain.
// At the same time, we do want to provide functionality based on fallback settings after some time.
// 15s seems like it should strike a balance here but there is no data backing this value up.
Member

Accidentally removed this comment?

try container.encode(batchSize, forKey: .batchSize)
}
}
}
(no newline at end of file)
Member

Could you run swift-format to format the source files? That should also add a trailing newline here.

/// Prepare a fixed number of targets in a single batch.
///
/// `batchSize`: The number of targets to prepare in each batch.
case target(batchSize: Int)
Member

`target` is pretty ambiguous here; should we make this more specific and call it `fixedTargetBatchSize`? That would really clarify what this strategy does.
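For reference, decoding a tagged union like this usually means switching on the discriminator field in a custom Codable implementation. A minimal, self-contained sketch — the type and case names here are assumptions based on this discussion (including the suggested `fixedTargetBatchSize` rename), not the actual SourceKit-LSP code:

```swift
import Foundation

/// Hypothetical stand-in for the batching strategy discussed above;
/// the real SourceKit-LSP type may differ.
enum PreparationBatchingStrategy: Codable, Equatable {
  /// Prepare a fixed number of targets in a single batch.
  case fixedTargetBatchSize(batchSize: Int)

  private enum CodingKeys: String, CodingKey {
    case strategy
    case batchSize
  }

  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    // Switch on the `strategy` discriminator to pick the enum case.
    switch try container.decode(String.self, forKey: .strategy) {
    case "target":
      self = .fixedTargetBatchSize(batchSize: try container.decode(Int.self, forKey: .batchSize))
    case let unknown:
      throw DecodingError.dataCorruptedError(
        forKey: .strategy,
        in: container,
        debugDescription: "Unknown strategy: \(unknown)"
      )
    }
  }

  func encode(to encoder: Encoder) throws {
    var container = encoder.container(keyedBy: CodingKeys.self)
    switch self {
    case .fixedTargetBatchSize(let batchSize):
      try container.encode("target", forKey: .strategy)
      try container.encode(batchSize, forKey: .batchSize)
    }
  }
}

// Decode the config fragment from the PR description.
let json = Data(#"{"strategy": "target", "batchSize": 200}"#.utf8)
let decoded = try! JSONDecoder().decode(PreparationBatchingStrategy.self, from: json)
```

Keeping the string value `"target"` in CodingKeys-based decoding while renaming the Swift case would preserve the on-disk config format while making the API clearer.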
