Skip to content

CDRIVER-3228 fix memory leaks in SChannel cert loading #2009

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

kevinAlbs
Copy link
Collaborator

Summary

This PR addresses several memory-related found in certificate loading for Secure Channel:

  • Fix several memory leaks.
  • NUL-terminate the PEM file for the client certificate.

Reading the PEM file is refactored with more error checks logs including the system error.

Background

Leaks were discovered when implementing PKCS#8 certificate support for (CDRIVER-4269). This may help motivate future addition of leak detection on Windows (CDRIVER-2529).

NUL-terminating the PEM file was similarly done in #1903. NUL-terminating is intended to avoid reading past beyond the string in later calls to strstr.

The Secure Channel implementation would likely benefit from a larger rewrite, and may be done as part of CDRIVER-4463. This PR intends to address the more severe issues and I expect can be backported with low risk of unexpected behavior change.

@kevinAlbs kevinAlbs force-pushed the leaks-only.C4296 branch from 1a0bce5 to 61e39c5 Compare May 9, 2025 14:15
@kevinAlbs kevinAlbs marked this pull request as ready for review May 9, 2025 16:31
@kevinAlbs kevinAlbs requested a review from a team as a code owner May 9, 2025 16:31
@kevinAlbs kevinAlbs requested review from rcsanchez97, eramongodb and mdbmes and removed request for rcsanchez97 May 9, 2025 16:31
Copy link
Contributor

@mdbmes mdbmes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements in a pretty hairy area of code. It's always fun litigating subtleties of under-documented legacy APIs :)

I had a handful of comments but I don't think any of them should be treated as blockers. The type confusion possible from CERT_QUERY_CONTENT_FLAG_ALL might be my biggest concern but it's not new.

I think it's possible to simplify the changes to tls-secure-channel (freeing 'cert' early instead of storing it) but I haven't verified this beyond reading docs and code.

Comment on lines +81 to +90
MONGOC_ERROR ("Failed to get length of file: '%s'. File too large", filename);
goto fail;
}

if (0 != fseek (file, 0, SEEK_SET)) {
MONGOC_ERROR ("Failed to seek in file: '%s' with error: '%s'",
filename,
bson_strerror_r (errno, errmsg_buf, sizeof errmsg_buf));
goto fail;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought, if we wanted to simplify the error reporting here, some of these are definitely worth catching at runtime but maybe not worth having a unique error string for. (I'm thinking that the LONG_MAX-1 check and the fseek to zero check are needed for security, but in practice will only fail due to unusual filesystems.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also noticing that below we have a possible overflow when the strlen of an individual key in the file is cast to DWORD (see public_blob.cbData). The security implication seems acceptably minor there, but it'd be a small improvement IMO to disallow file sizes larger than INT_MAX or so and remove that bit of attack surface.


/* https://msdn.microsoft.com/en-us/library/windows/desktop/aa376573%28v=vs.85%29.aspx
*/
// CertSetCertificateContextProperty takes ownership of `provider`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is very helpful! I'm noticing that the specific doc we could reference is on the CERT_KEY_PROV_HANDLE_PROP_ID property rather than the whole of CertSetCertificateContextProperty:

CERT_KEY_PROV_HANDLE_PROP_ID

Data type of pvData: A HCRYPTPROV value.

The HCRYPTPROV handle for the certificate's private key is set. The hCryptProv member of the CERT_KEY_CONTEXT structure is updated if it exists. If it does not exist, it is created with dwKeySpec and initialized by CERT_KEY_PROV_INFO_PROP_ID. If CERT_STORE_NO_CRYPT_RELEASE_FLAG is not set, the hCryptProv value is implicitly released either when the property is set to NULL or on the final freeing of the CERT_CONTEXT structure.

return NULL;
if (!ret) {
if (cert) {
CertFreeCertificateContext (cert);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm noticing that the type of object returned by CryptQueryObject() above can vary due to the use of CERT_QUERY_CONTENT_FLAG_ALL. I would comment on that directly, but GitHub doesn't seem to let me comment on lines that weren't in the diff. Anyway, I think we probably want CERT_QUERY_CONTENT_CERT above to constrain the return type.

(There's also CERT_QUERY_CONTENT_SERIALIZED_CERT, which is not described clearly in this API reference but I'm finding a 3rd party blog that claims "serialized certificates" are a Microsoft-specific format associated with the .sst file extension, https://medium.com/tenable-techblog/code-for-reading-windows-serialized-certificates-8634d3487ec7)

Doc reference: https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptqueryobject

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed this when looking at CertFreeCertificateContext call sites in this file: mongoc_secure_channel_setup_crl is using the wrong free function, there's a separate CertFreeCRLContext API for CRLs.

(Ref: https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptqueryobject)

if (pem_length < 1) {
MONGOC_ERROR ("Couldn't determine file size of '%s'", filename);
return NULL;
pem = read_file_and_null_terminate (filename, &pem_length);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to comment on the lines above this, but unfortunately, GitHub.

The usage of encrypted_private[_len] here is confusing, since we explicitly do not support encrypted private keys. I think this means to be "encoded" rather than "encrypted".

pem = (char *) bson_malloc0 (pem_length);
fread ((void *) pem, 1, pem_length, file);
fclose (file);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to comment on the use of strlen (pem_public) below: This is redundant since we already know the location of the NUL. The length could be calculated as (pem_length - (pem_public - pem)) instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to avoid calling CryptDecodeObjectEx twice, there's CRYPT_DECODE_ALLOC_FLAG.

(Ref: https://learn.microsoft.com/en-us/windows/win32/api/wincrypt/nf-wincrypt-cryptdecodeobjectex)

@@ -932,6 +935,10 @@ mongoc_stream_tls_secure_channel_new (mongoc_stream_t *base_stream, const char *
schannel_cred.grbitEnabledProtocols = SP_PROT_TLS1_1_CLIENT | SP_PROT_TLS1_2_CLIENT;

secure_channel->cred = (mongoc_secure_channel_cred *) bson_malloc0 (sizeof (mongoc_secure_channel_cred));
if (cert) {
// Store client cert to free later.
secure_channel->cred->cert = cert;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Microsoft docs are vague about lifetimes and reference vs. copy semantics. When I'm in doubt about lifetimes and memory semantics, I like to look for Rust bindings. If my reading of schannel-rs is correct, AcquireCredentialsHandle has copy semantics and it's safe to free the certificate immediately after obtaining the credential handle without storing it separately.

https://github.com/steffengy/schannel-rs/blob/master/src/schannel_cred.rs

goto fail;
}
CryptDestroyKey (hKey);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is right but I can't find a good reference that explains why this doesn't destroy the underlying key. It would be cool to add this if you have something handy. I see the CryptDestroyKey MSDN page but there seem to be two contradicting statements there:

Many CSPs overwrite the memory where the key was held before freeing it. However, the underlying public/private key pair is not destroyed by this function. Only the handle is destroyed.

and:

Keys take up both operating system's memory space and the CSP's memory space. Some CSPs are implemented in hardware with limited memory resources. Applications must destroy all keys with the CryptDestroyKey function when they are finished with them.

Comment on lines 327 to 328
DWORD encrypted_cert_len = 0;
LPBYTE encrypted_cert = NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"encrypted" is wrong here too, I think they meant "encoded".

@@ -44,6 +44,7 @@ typedef enum {
typedef struct {
CredHandle cred_handle;
TimeStamp time_stamp;
PCCERT_CONTEXT cert; /* optional client cert. Kept to free later */
Copy link
Collaborator

@eramongodb eramongodb May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PCCERT_CONTEXT cert; /* optional client cert. Kept to free later */
PCCERT_CONTEXT cert; /* Owning. Optional client cert. */

Minor wording suggestion.

@@ -162,6 +162,9 @@ _mongoc_stream_tls_secure_channel_destroy (mongoc_stream_t *stream)
/* if the handle was not cached and the refcount is zero */
TRACE ("%s", "clear credential handle");
FreeCredentialsHandle (&secure_channel->cred->cred_handle);
if (secure_channel->cred->cert != NULL) {
CertFreeCertificateContext (secure_channel->cred->cert);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigating headers reveal the pCertContext is annotated with _In_opt_ indicating null pointers are permitted. Therefore this manual null check is likely unnecessary.

ok = true;
fail:
if (file) {
fclose (file); // Ignore error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fclose (file); // Ignore error.
(void) fclose (file); // Ignore error.

The error being ignored is the return value.

goto fail;
}

long file_len = ftell (file);
Copy link
Collaborator

@eramongodb eramongodb May 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
long file_len = ftell (file);
const long file_len = ftell (file);

Const correctness.

}

if (file_len > LONG_MAX - 1) {
MONGOC_ERROR ("Failed to get length of file: '%s'. File too large", filename);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MONGOC_ERROR ("Failed to get length of file: '%s'. File too large", filename);
MONGOC_ERROR ("Length of file '%s' is too large", filename);

Misleading: this is not failure to obtain the length of the file, but an invalid file length which may prevent reservation of the null terminating byte.

if (0 != fseek (file, 0, SEEK_SET)) {
MONGOC_ERROR ("Failed to seek in file: '%s' with error: '%s'",
filename,
bson_strerror_r (errno, errmsg_buf, sizeof errmsg_buf));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bson_strerror_r (errno, errmsg_buf, sizeof errmsg_buf));
bson_strerror_r (ferror (file), errmsg_buf, sizeof errmsg_buf));

Unlike fopen or ftell, fseek sets ferror in failure, not errno.

goto fail;
}

// Read the whole file into one nul-terminated string:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Read the whole file into one nul-terminated string:
// Read the whole file into one null-terminated string:

} else {
MONGOC_ERROR ("Failed to read file: '%s' with error: '%s'",
filename,
bson_strerror_r (errno, errmsg_buf, sizeof errmsg_buf));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bson_strerror_r (errno, errmsg_buf, sizeof errmsg_buf));
bson_strerror_r (ferror (file), errmsg_buf, sizeof errmsg_buf));

fread sets ferror on failure, not errno.

}

MONGOC_ERROR ("Can't associate private key with public key: 0x%.8X", (unsigned int) GetLastError ());
TRACE ("%s", "Successfully loaded client certificate");
ret = true;

fail:
SecureZeroMemory (pem, pem_length);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

   if (pem) {
      SecureZeroMemory (pem, pem_length);
      bson_free (pem);
   }

If read_file_and_null_terminate fails and returns NULL, this call to SecureZeroMemory is invalid (it does not accept a null pointer according to its _In_ annotation).

entering windows exception handler
exception '(access violation)' at '0x00007FF6A2010AE3', terminating
begin stack trace
test-libmongoc : RtlSecureZeroMemory
test-libmongoc : mongoc_secure_channel_setup_certificate_from_file

fclose (file); // Ignore error.
}
if (!ok) {
bson_free (contents);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bson_free (contents);
SecureZeroMemory (contents, file_len);
bson_free (contents);

Ensure any partially-read contents are securely overwritten.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants