[stdlib] Make String.makeContiguousUTF8() strictly true #82851

Merged

glessard
Contributor

@glessard glessard commented Jul 7, 2025

Until now, StringProtocol.makeContiguousUTF8() only considered whether or not a String or Substring heap allocation contained contiguous UTF8, disregarding whether a stack-allocated instance was discontiguous. This is no longer tenable when these types vend UTF8Span and Span properties, so we now also consider whether the small string form contains contiguous code units. If the small form contains discontiguously-stored code units, then it is replaced by a new heap-allocated String in contiguous storage. This new code path is specific to ABI-stable, 32-bit platforms.

Addresses rdar://154331399
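
For readers unfamiliar with the API, here is a minimal sketch of how a caller might rely on the contiguity guarantee described above. The helper name `contiguousUTF8Bytes(of:)` is hypothetical and not part of this change; it uses only the public `makeContiguousUTF8()`, `isContiguousUTF8`, and `withUTF8` members.

```swift
// Hypothetical helper (not part of this PR): copy a string's UTF-8 code
// units after making sure they are stored contiguously.
func contiguousUTF8Bytes(of input: String) -> [UInt8] {
  var s = input
  // Does nothing if the string is already contiguous; otherwise the
  // contents are moved into contiguous storage.
  s.makeContiguousUTF8()
  precondition(s.isContiguousUTF8)
  // withUTF8 is a mutating method, so `s` must be a `var`.
  return s.withUTF8 { Array($0) }
}

let bytes = contiguousUTF8Bytes(of: "h\u{E9}llo")
```

On most platforms small strings are already contiguous, so the call to `makeContiguousUTF8()` is expected to be cheap; the new path described above only matters where the small form stores its code units discontiguously.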

@glessard glessard force-pushed the rdar154331399-makeReallyContiguous branch from 0999641 to 268eb09 Compare July 7, 2025 22:21
@glessard
Contributor Author

glessard commented Jul 7, 2025

@swift-ci please test linux platform

@glessard glessard force-pushed the rdar154331399-makeReallyContiguous branch from 268eb09 to d0154c2 Compare July 8, 2025 00:19
@glessard glessard changed the title [stdlib] String.makeContiguousUTF8() becomes more literal [stdlib] String.makeContiguousUTF8() is literally more true Jul 8, 2025
@glessard
Contributor Author

glessard commented Jul 8, 2025

@swift-ci please test linux platform

@glessard glessard force-pushed the rdar154331399-makeReallyContiguous branch from d0154c2 to 6c9b0b0 Compare July 8, 2025 07:56
@glessard
Contributor Author

glessard commented Jul 8, 2025

@swift-ci please test linux platform

@glessard glessard changed the title [stdlib] String.makeContiguousUTF8() is literally more true [stdlib] Make String.makeContiguousUTF8() strictly true Jul 8, 2025
@glessard
Contributor Author

glessard commented Jul 8, 2025

@swift-ci please smoke test linux platform

@glessard
Contributor Author

glessard commented Jul 8, 2025

@swift-ci please smoke test linux platform

@glessard glessard force-pushed the rdar154331399-makeReallyContiguous branch from bf4ada7 to 3dfdd87 Compare July 8, 2025 20:58
@glessard
Contributor Author

glessard commented Jul 8, 2025

@swift-ci please test

@glessard glessard marked this pull request as ready for review July 8, 2025 21:03
@glessard glessard requested a review from a team as a code owner July 8, 2025 21:03
@glessard glessard requested a review from kperryua July 8, 2025 21:10
@glessard
Contributor Author

glessard commented Jul 8, 2025

@swift-ci please test

@@ -312,9 +312,19 @@ extension String {
   @inline(never) // slow-path
   internal static func _copying(_ str: Substring) -> String {
     if _fastPath(str._wholeGuts.isFastUTF8) {
-      return unsafe str._wholeGuts.withFastUTF8(range: str._offsetRange) {
+      var new = unsafe str._wholeGuts.withFastUTF8(range: str._offsetRange) {
Contributor

In non-watchOS or 64-bit environments, will the compiler not complain about this var being unmutated?

Contributor Author

@glessard glessard Jul 8, 2025

I didn't notice it complaining when building on macOS. Apparently it knows to consider statements that are conditionally skipped: https://swift.godbolt.org/z/aT1naWfce
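
For illustration, the pattern under discussion looks roughly like the sketch below. The function name and the `arch(...)` condition are assumptions chosen for the example; the actual condition used in the PR is not shown in this thread.

```swift
// Hypothetical example: a `var` that is only mutated inside a
// conditionally compiled block.
func conditionallyExtended() -> [Int] {
  var values = [1, 2, 3]
#if arch(arm64_32) || arch(i386) || arch(arm)
  // Compiled only on the listed 32-bit architectures (an assumption
  // made for this sketch).
  values.append(4)
#endif
  return values
}

let extended = conditionallyExtended()
```

On a 64-bit target the `append` call is compiled out, which is the situation the question above is about; per the reply, the compiler was not observed to warn in that case.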

-  /// Contiguous strings always operate in O(1) time for withUTF8 and always
-  /// give a result for String.UTF8View.withContiguousStorageIfAvailable.
+  /// Contiguous strings always operate in O(1) time for withUTF8, always give
+  /// a result for String.UTF8View.withContiguousStorageIfAvailable, and always
Contributor

Was this comment actually inaccurate before? Did String.UTF8View.withContiguousStorageIfAvailable actually NOT provide a result for these exceptional small strings?

Contributor Author

It was not lying: UTF8View.withContiguousStorageIfAvailable will rearrange the bytes of a small string in order to present contiguous storage to the closure.

Contributor

OK. With isContiguousUTF8 now returning false for these strings, some might find it surprising that the former guarantees still hold for them. However, it seems we're within our rights to do so, since the documentation states an implication ("isContiguousUTF8 -> the above guarantees hold") and not an equivalence ("isContiguousUTF8 <-> the above guarantees hold").
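
The one-directional guarantee can be expressed as a small check. This is only a sketch: `guaranteeHolds(for:)` is a hypothetical name, and the check covers just the `withContiguousStorageIfAvailable` part of the documented guarantees.

```swift
// Hypothetical check of the implication
// "isContiguousUTF8 -> withContiguousStorageIfAvailable gives a result".
// Nothing is asserted when isContiguousUTF8 is false, since the
// converse is not promised.
func guaranteeHolds(for s: String) -> Bool {
  guard s.isContiguousUTF8 else {
    return true  // implication is vacuously satisfied
  }
  // The closure result is wrapped in an Optional; nil means no
  // contiguous storage was offered.
  return s.utf8.withContiguousStorageIfAvailable { _ in true } ?? false
}

var sample = "contiguity"
sample.makeContiguousUTF8()
let holds = guaranteeHolds(for: sample)
```

A string for which `isContiguousUTF8` is false may still hand contiguous storage to the closure, which is consistent with the implication reading.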

@glessard
Contributor Author

glessard commented Jul 9, 2025

@swift-ci please test linux platform

@glessard glessard merged commit 7e26972 into swiftlang:main Jul 9, 2025
5 checks passed
@glessard glessard deleted the rdar154331399-makeReallyContiguous branch July 9, 2025 15:11