Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C#: Extract string interpolation alignment and format. #19089

Merged

Conversation

michaelnebel
Copy link
Contributor

@michaelnebel michaelnebel commented Mar 21, 2025

This PR fixes an issue with missing extraction of alignment and format clauses from string interpolation expressions as described here.

Prior to this change, when we extracted a string interpolation expression $"Hello {name,align:format}. Welcome." (denoted E in the graphs below), it looked like this in the expressions relation

graph TD;
  id1["#quot;Hello #quot;"]
  id2["name"]
  id3["#quot;. Welcome.#quot;"]
  E--0-->id1
  E--1-->id2
  E--2-->id3
Loading

With this change we introduce a new expression for `{name:align:format}, which we call a string interpolation insert expression. The expression tree for extracting a string interpolation expression now looks like,

graph TD;
  id1["#quot;Hello #quot;"]
  id2["{name,align:format}"]
  id3["#quot;. Welcome.#quot;"]
  id4["name"]
  id5["align"]
  id6["format"]
  E--0-->id1
  E--1-->id2
  E--2-->id3
  id2--0-->id4
  id2--1-->id5
  id2--2-->id6
Loading

This change in the expression tree of string interpolation expressions makes it impossible/difficult to make a proper upgrade/downgrade script, as we introduce a new kind of expression and alter the parent/child relationship. That is, to make a proper upgrade script, we would need to introduce new expressions and alter- and add new entries in the expressions and expr_parent relations. We might be able to do it a bit better in the downgrade script, but I have chosen just to change the kind of the expressions to unknown. Conversely the downgrade script needs to remove some elements from expressions and expr_parent and also do so minor edge contraction on expr_parent.
Effectively, this change means that we break dataflow for string interpolation expressions, when the upgrade and downgrade scripts are used.
The up- and downgrade scripts have been tested manually on the database that can produced from the testcase. I tested that

  • The parent/child relationship for inserts are adjusted correctly and new expressions are added when needed.
  • The difference in the size of the expressions and expr_parent are as expected, when applying the upgrade and downgrade scripts (this is a cardinality test to give an indication that we don't add or drop to many elements from the tables).

Other relevant implementation details

  • The new expression kind should just used the standard post order control flow.
  • Taint flow for string interpolation expressions are now split in two steps, as we need to propagate taint from name to {name,align:format} in the example.

DCA looks un-eventfull.

@github-actions github-actions bot added the C# label Mar 21, 2025
@michaelnebel michaelnebel force-pushed the csharp/improvestringinterpolation branch from 04621bd to acec97d Compare March 21, 2025 12:33
@michaelnebel michaelnebel force-pushed the csharp/improvestringinterpolation branch from a783354 to d9fb137 Compare March 24, 2025 11:00
@michaelnebel michaelnebel marked this pull request as ready for review March 24, 2025 12:33
@michaelnebel michaelnebel requested a review from a team as a code owner March 24, 2025 12:33
Copy link
Contributor

@hvitved hvitved left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great work. Some small questions.

import csharp

query predicate inserts(InterpolatedStringExpr expr, Expr e) {
expr.getAnInsert() = e // and inSpecificSource(expr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the trailing comment?

}

query predicate texts(InterpolatedStringExpr expr, StringLiteral literal) {
expr.getAText() = literal // and inSpecificSource(expr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the trailing comment?

where
expressions(e, k, t) and
if k = 138 then kind = 106 else kind = k
select e, kind, t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this select the insert part of e instead of e?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered doing that as well, but it requires that we also make changes to the expr_parent relation as well (this is what I am vaguely hinting in the PR description).
Basically, I am worried that I break some DB invariant by changing the parent/child relationship as a part of the script.
Should we attempt to make the better downgrade script? It requires
(1) Removing the insert expressions from the expressions relation
(2) Adjust the expr_parent relation
(3) Consider if we need to do something with the new dangling align and format expressions.

@alexet
Copy link
Contributor

alexet commented Mar 28, 2025

to make a proper upgrade script, we would need to introduce new expressions and alter- and add new entries in the expressions and expr_parent relations

This is possible using the NewEntity builtin module (only available in upgrade and downgrades) and using a single file upgrade query. Obviously the alight and format can't be created out of nothing though.

@michaelnebel
Copy link
Contributor Author

to make a proper upgrade script, we would need to introduce new expressions and alter- and add new entries in the expressions and expr_parent relations

This is possible using the NewEntity builtin module (only available in upgrade and downgrades) and using a single file upgrade query. Obviously the alight and format can't be created out of nothing though.

Uh, I was not aware of that! Thank you!

@michaelnebel
Copy link
Contributor Author

@hvitved : Proper upgrade and downgrade scripts have been added, since your last review and you review comments have been addressed (the last three commits).

@michaelnebel michaelnebel requested a review from hvitved March 31, 2025 13:33
@michaelnebel michaelnebel merged commit f4105ee into github:main Apr 1, 2025
23 checks passed
@michaelnebel michaelnebel deleted the csharp/improvestringinterpolation branch April 1, 2025 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants