Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warnings from GCC 15 -Wunterminated-string-initialization #9944

Open
billatarm opened this issue Jan 30, 2025 · 8 comments
Open

Warnings from GCC 15 -Wunterminated-string-initialization #9944

billatarm opened this issue Jan 30, 2025 · 8 comments
Labels

Comments

@billatarm
Copy link
Contributor

Summary

Some of the string literals use the size of the array to specify an array of size N where N is the length of the string and so things like sizeof() will truncate the NULL byte.

Warning message Example:

/builddir/build/BUILD/mbedtls3.6-3.6.2-build/mbedtls-3.6.2/library/ssl_tls13_keys.h:20:44: error: initializer-string for array of ‘unsigned char’ is too long [-Werror=unterminated-string-initialization]
   20 |     MBEDTLS_SSL_TLS1_3_LABEL(c_hs_traffic, "c hs traffic") \

Code Examples to reproduce:

char foo[3]="foo";
printf("%s-%zu\n", foo, sizeof(foo);
foo - 3

This waring can be enabled via:
-Wunterminated-string-initialization

I have attached the build log.

build.log

System information

Mbed TLS version (number or commit id): 3.6.2
Operating system and version: F32 with gcc-15
Configuration (if not default, please attach mbedtls_config.h):
Compiler and options (if you used a pre-built binary, please indicate how you obtained it):
Additional environment information:

Expected behavior

Builds

Actual behavior

Steps to reproduce

Additional information

@billatarm
Copy link
Contributor Author

I have patches if you want them.

@gilles-peskine-arm
Copy link
Contributor

There is no undefined behavior here. MBEDTLS_SSL_TLS1_3_LABEL deliberately creates an array containing the string literal without a null byte. Arrays of bytes (unsigned char[]) can contain any byte value, including having all bytes be nonzero. Undefined behavior would only take place if such an array was passed to a string function strxxx() from <string.h> or other functions that expect a null-terminated
string. The array constructed here is not a null-terminated string and is not passed to functions that expect a null-terminated string.

In this particular case, adding a null byte would not be compliant to the protocol specification.

This waring can be enabled via:
-Wunterminated-string-initialization

Note — only with GCC 15, which is not released yet.

GCC 15's new warning -Wunterminated-string-initialization triggers many false positives on our code base. I had a quick look and none of them look like a true positive.

We do aim to compile with no warnings at a reasonable warning level with common compilers. This will include GCC 15 with -Wall -Wextra once GCC 15 is released.

Patches to avoid triggering the warning would be welcome. But they obviously need to be correct! And in most cases, for this warning, in our code base, the warning is a false positive. So we need to find a way to let GCC know that the array is intended to be a byte array and not a null-terminated string.

@gilles-peskine-arm gilles-peskine-arm changed the title String literals use undefined behavior Warnings from GCC 15 -Wunterminated-string-initialization Jan 30, 2025
@billatarm
Copy link
Contributor Author

billatarm commented Jan 31, 2025

There is no undefined behavior here. MBEDTLS_SSL_TLS1_3_LABEL deliberately creates an array containing the string literal without a null byte.

I understand the intention of the code. This wasn't discussed in C89 (undefined bahavior), but the C99 spec did cover it:

EXAMPLE 8 The declaration
char s[] = "abc", t[3] = "abc";
defines ‘‘plain’’ char array objects s and t whose elements are initialized with character string literals.
This declaration is identical to
char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };
The contents of the arrays are modifiable. On the other hand, the declaration ...

We should file a bug to GCC if it's not filed yet.

@billatarm
Copy link
Contributor Author

I would like to point out though, the original author wasn't so sure either of the behavior on this:

 struct mbedtls_ssl_tls13_labels_struct const mbedtls_ssl_tls13_labels =
 {
    /* This seems to work in C, despite the string literal being one
     * character too long due to the 0-termination. */

@gilles-peskine-arm
Copy link
Contributor

This is well-defined behavior in all versions of the C language, even back to K&R C AFAIR. Using a string literal as an initializer adds a null byte if there is room. The array size can be left empty, in which case the size includes a terminating null byte.

The following examples illustrate array initialization (hopefully I didn't make any copypasta). All are fully defined in all versions of the C language.

    /* The following initialize 4 non-null bytes: */
    char explicit_braces_array[4] = {'a', 'b', 'c', 'd'};
    char implicit_braces_array[] = {'a', 'b', 'c', 'd'};
    char explicit_quotes_array[4] = "abcd";

    /* The following initialize 4 bytes, the last one being null: */
    char explicit_braces_array0[4] = {'a', 'b', 'c', 0};
    char explicit_braces_array_partial[4] = {'a', 'b', 'c'};
    char implicit_braces_array0[] = {'a', 'b', 'c', 0};
    char explicit_quotes_string[4] = "abc";
    char explicit_quotes_string0[4] = "abc\000";
    char implicit_quotes_string[] = "abc";

    /* The following initialize 4 bytes, the last two being null: */
    char explicit_braces_array00[4] = {'a', 'b', 0, 0};
    char explicit_braces_array_partial[4] = {'a', 'b'};
    char implicit_braces_array00[] = {'a', 'b', 0, 0};
    char explicit_quotes_string00[4] = "ab\000\000";
    char explicit_quotes_string_partial[4] = "ab";
    char implicit_quotes_string0[] = "ab\000";

The warning from gcc -Wunterminated-string-initialization doesn't say “your code has undefined behavior”. It says “your code has behavior that may be surprising”. I don't see any reason to file a bug with GCC: the warning makes sense. In this case, we intended the behavior. The only thing I might want to change in GCC is to have a well-defined mechanism to say “I intended this”, but that's a very general problem with warnings in C.

One way to avoid the warning would be to use braced initializers instead of string literals. However, here, it would mean that the strings from the TLS specification wouldn't appear in the code, which is not good. If we choose to do that, we should keep the string in a comment. But that comes with a risk of typos in the comment. (Typos in the code would be caught immediately by interoperability tests.) Fortunately we don't add or modify such code often.

@billatarm
Copy link
Contributor Author

I even pulled out my KR C book, "If its size is fixed, the number of characters in the string, not counting the terminating null character, must not exceed the size of the array". I am actually surprised by all of these being defined, fun little endeavor and I learned a new part of C.

@billatarm
Copy link
Contributor Author

The warning from gcc -Wunterminated-string-initialization doesn't say “your code has undefined behavior”. It says “your code has behavior that may be surprising”.

I thought this was undefined behavior, but K+R does cover it. #9944 (comment).

I don't see any reason to file a bug with GCC: the warning makes sense. In this case, we intended the behavior.

Yes, I don't disagree that the behavior was intentional (it was commented as such), I'm just thinking if I set the size field, why warn me? as this is an intentional use.

I am trying to deduce one of the following:

  1. Did Fedora add some flag to the build flags?
  • Doesn't seem so, but I also cannot reproduce directly using rpm --eval '%{optflags}' for C flags on Rawhide.
  1. Did GCC introduce this for GCC15?
  • Not sure, need to sleuth.
  1. Did GCC add this to a set of flags?
  • Doesn't seem that -Wall added it.

If it's 2 or 3, I would file a bug with GCC to remove it from the default set as being overly pedantic. If it's 1 then I need to find out why Fedora added it and see if folks can opt out.

@gilles-peskine-arm
Copy link
Contributor

It's (2) and (3). -Wunterminated-string-initialization is new in GCC 15 and included in -Wextra. I don't see it yet in the release notes, but GCC 14.2.0 doesn't have it.

Our CI enforces a clean build with -Werror -Wall -Wextra with a couple of different versions of GCC and Clang, and we normally aim at a clean build with other versions as well. GCC 15 is not yet in scope because it's still unreleased, but it's reasonable to expect that the warning will remain there in the official release, so it would be good for us to fix it.

I don't know what the best fix is. In test code, using braced initializers or putting a null terminator in the test data would be ok. (Although I slightly dislike having test data that happens to be a null-terminated string where a byte string is expected, because that can hide bugs if the code somehow starts using a string function and thus stops working on arbitrary byte strings.) In the TLS code, it's easier to understand the code if the strings from the protocol definition appear in a text search, and the strings are byte strings and we don't particularly want to waste memory and effort to put a null byte at the end but process only the non-null part. This makes a warning-off pragma look attractive, but I'm not sure if that's possible with the way the macros are structured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: No status
Development

No branches or pull requests

2 participants