
Conversation

Contributor

@mattjala mattjala commented Oct 21, 2025

Also includes a new workflow to test FreeBSD, which has a different qsort_r signature and was previously untested.
Resolves #5896


Important

Implement qsort_r fallback for unsupported systems and add FreeBSD CI workflow.

  • Behavior:
    • Implement fallback HDqsort_fallback() in H5system.c for systems without qsort_r/qsort_s, using thread-local storage or global variable.
    • Modify HDqsort_r macro in H5private.h to use fallback if qsort_r/qsort_s is unavailable.
  • CI:
    • Add freebsd.yml workflow to test FreeBSD versions 13.5, 14.3, 15.0.
    • Add test-qsort-fallback.yml workflow to test qsort_r fallback on Ubuntu, macOS, Windows.
  • Configuration:
    • Update ConfigureChecks.cmake to check for qsort_r and qsort_s and set H5_HAVE_QSORT_REENTRANT.
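
To illustrate the fallback idea summarized above, here is a minimal sketch of emulating a context-passing sort on top of plain qsort(). It assumes C11 `_Thread_local` is available, which is a swapped-in technique (the PR itself describes H5TS thread-local keys or a global variable). Every `demo_` name is hypothetical and does not come from HDF5.

```c
#include <stdlib.h>

/* Comparator shape used by POSIX/GNU qsort_r: context is the last argument. */
typedef int (*demo_ctx_cmp_fn)(const void *a, const void *b, void *ctx);

/* Per-thread slot holding the comparator and context of the sort in flight. */
static _Thread_local demo_ctx_cmp_fn demo_active_cmp = NULL;
static _Thread_local void *demo_active_ctx = NULL;

/* Plain qsort comparator that forwards to the stashed context comparator. */
static int demo_trampoline(const void *a, const void *b)
{
    return demo_active_cmp(a, b, demo_active_ctx);
}

/* Hypothetical stand-in for a qsort_r fallback built on plain qsort. */
void demo_qsort_fallback(void *base, size_t nel, size_t size,
                         demo_ctx_cmp_fn cmp, void *ctx)
{
    /* Save the previous slot so a comparator that itself sorts
     * does not clobber the outer sort's context. */
    demo_ctx_cmp_fn saved_cmp = demo_active_cmp;
    void *saved_ctx = demo_active_ctx;

    demo_active_cmp = cmp;
    demo_active_ctx = ctx;
    qsort(base, nel, size, demo_trampoline);

    /* Restore the slot for any enclosing sort on this thread. */
    demo_active_cmp = saved_cmp;
    demo_active_ctx = saved_ctx;
}

/* Example comparator: ctx points to an int, +1 ascending, -1 descending. */
int demo_cmp_int(const void *a, const void *b, void *ctx)
{
    int x = *(const int *)a, y = *(const int *)b;
    int dir = *(const int *)ctx;
    return dir * ((x > y) - (x < y));
}
```

Replacing `_Thread_local` with an ordinary static gives the global-variable variant, which is simpler but not safe when two threads sort at the same time.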

This description was created by Ellipsis for c7ffe3a.

hyoklee previously approved these changes Oct 22, 2025
Member

@hyoklee hyoklee left a comment


It would be better to use a matrix for the FreeBSD versions (15.0, 14.3, and 14.2):
https://github.com/vmactions/freebsd-vm

@mattjala mattjala marked this pull request as draft October 22, 2025 16:47
@mattjala mattjala marked this pull request as ready for review October 22, 2025 20:43
strategy:
  fail-fast: false
  matrix:
    freebsd-version: ['13.5', '14.3', '15.0']
Collaborator

Beginning with 14.3 will be sufficient, as 13.5 has reached EOL.

Collaborator

See my comment on #5800 (comment) though. FreeBSD flipped the argument ordering between 13 and 14, so it may not be a bad idea to test both.
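
As a rough illustration of what the flipped ordering means in practice: the legacy BSD form takes the context before the comparator and passes it as the comparator's first argument, while the POSIX-style form used by FreeBSD 14 and glibc puts the context last in both places. The sketch below adapts a POSIX-order comparator to a legacy-order sort. To keep it runnable anywhere, `demo_legacy_qsort_r` is a hypothetical stand-in (a small insertion sort) that only mimics the legacy argument order; all `demo_` names are assumptions, not HDF5 or libc APIs.

```c
#include <stddef.h>
#include <string.h>

/* POSIX-style comparator (FreeBSD 14+, glibc): context comes last. */
typedef int (*demo_posix_cmp_fn)(const void *a, const void *b, void *ctx);
/* Legacy BSD-style comparator (FreeBSD 13 and earlier, macOS): context first. */
typedef int (*demo_legacy_cmp_fn)(void *ctx, const void *a, const void *b);

/* Hypothetical stand-in with the legacy argument order (context before
 * comparator). A simple insertion sort so the sketch runs on any platform. */
static void demo_legacy_qsort_r(void *base, size_t nel, size_t size,
                                void *ctx, demo_legacy_cmp_fn cmp)
{
    unsigned char *p = base;
    unsigned char tmp[64]; /* assumption: elements are at most 64 bytes */

    if (size > sizeof tmp)
        return;
    for (size_t i = 1; i < nel; i++) {
        size_t j = i;
        memcpy(tmp, p + i * size, size);
        /* Shift larger elements one slot to the right. */
        while (j > 0 && cmp(ctx, p + (j - 1) * size, tmp) > 0) {
            memcpy(p + j * size, p + (j - 1) * size, size);
            j--;
        }
        memcpy(p + j * size, tmp, size);
    }
}

/* Bundles the caller's POSIX-order comparator with its context. */
struct demo_adapter {
    demo_posix_cmp_fn cmp;
    void *ctx;
};

/* Legacy-order trampoline: unpack the bundle and reorder the arguments. */
static int demo_swap_args(void *thunk, const void *a, const void *b)
{
    struct demo_adapter *ad = thunk;
    return ad->cmp(a, b, ad->ctx);
}

/* Callers always use the POSIX order; the adapter hides the legacy order. */
void demo_sort_posix_order(void *base, size_t nel, size_t size,
                           demo_posix_cmp_fn cmp, void *ctx)
{
    struct demo_adapter ad = {cmp, ctx};
    demo_legacy_qsort_r(base, nel, size, &ad, demo_swap_args);
}

/* Example comparator: order ints by their remainder modulo *ctx. */
int demo_cmp_mod(const void *a, const void *b, void *ctx)
{
    int m = *(const int *)ctx;
    int x = *(const int *)a % m, y = *(const int *)b % m;
    return (x > y) - (x < y);
}
```

Passing a comparator written for one ordering directly to the other compiles only with a pointer-type mismatch and reads the context as an element at run time, which is why testing both FreeBSD generations has some value.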

Collaborator

@lrknox and @hyoklee can clarify, but we have a testing policy of not supporting OSes that have reached EOL. It is up to them whether to forgo this policy in this case.

Collaborator

Yes, let's test 13.5.

Member

@hyoklee hyoklee left a comment


Both OpenBSD 7.8 and OpenBSD 7.7 are building OK with fallbacks.

obsd-7.7

obsd-7.8

#ifndef HDqsort_r
#ifdef H5_HAVE_DARWIN
#ifdef H5_HAVE_QSORT_REENTRANT
#if defined(H5_HAVE_DARWIN) || (defined(__FreeBSD__) && __FreeBSD__ < 14)
Collaborator

In the near future we should probably move this into something we try to check at configure time. I imagine there are other platforms that use this signature and checking at configure time should get rid of platform-specific #ifdefs and also avoid needing the (probably minor) overhead of a fallback function. Maybe something like https://gitlab.gnome.org/GNOME/glib/-/blob/2.30.0/configure.ac?ref_type=tags#L589-627.

Collaborator

brtnfld commented Oct 25, 2025

What the PR does:
ConfigureChecks.cmake line 22-26: Detects both qsort_r AND qsort_s, sets H5_HAVE_QSORT_REENTRANT=1 if either exists
H5private.h line 76-85: When H5_HAVE_QSORT_REENTRANT is defined:
Darwin → calls HDqsort_context() (BSD-style wrapper)
Else → calls qsort_r() directly
The Problem:
// On Windows:
// 1. CMake detects qsort_s exists → sets H5_HAVE_QSORT_REENTRANT = 1
// 2. Code checks H5_HAVE_QSORT_REENTRANT → true, enters branch
// 3. Code checks H5_HAVE_DARWIN → false (Windows is not Darwin)
// 4. Code calls qsort_r(B, N, S, C, A) → ERROR: qsort_r doesn't exist on Windows!
Windows has qsort_s, not qsort_r - completely different function.

Test gap: The test-qsort-fallback.yml includes Windows but forces the fallback by modifying the config, so it never tests the native qsort_s path.

Impact: Windows builds will fail to compile with undefined reference to qsort_r.

src/H5system.c Outdated
qsort(base, nel, size, HDqsort_fallback_wrapper);

/* Clear the thread-local storage */
H5TS_key_set_value(HDqsort_fallback_key, NULL);
Collaborator

HDF5 code always checks H5TS_key_set_value return values, for example:

src/H5TSrec_rwlock.c:
if (H5_UNLIKELY(H5TS_key_set_value(lock->rec_read_lock_count_key, (void *)count) < 0))

src/H5TSint.c:
if (H5_UNLIKELY(H5TS_key_set_value(H5TS_thrd_info_key_g, tinfo_node))) {
    // Error handling
}

But

H5TS_key_create(&HDqsort_fallback_key, NULL);        // No check
H5TS_key_set_value(HDqsort_fallback_key, &ctx);      // No check 
H5TS_key_set_value(HDqsort_fallback_key, NULL);      // No check
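
One way the checked pattern could look if the fallback reported failures to its caller is sketched below. This is not the PR's code: `demo_key_set_value` is a hypothetical stand-in for a key API that can fail (it is not the real H5TS call), and the `int` return type on the sort wrapper is an assumption, since the real fallback follows the void qsort_r shape.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a TLS key API that can fail (not real H5TS calls). */
static void *demo_slot = NULL;
static int demo_fail_next = 0; /* test hook: make the next set call fail */

static int demo_key_set_value(void *value)
{
    if (demo_fail_next) {
        demo_fail_next = 0;
        return -1;
    }
    demo_slot = value;
    return 0;
}

/* Test hook to simulate a failing key operation. */
void demo_inject_failure(void)
{
    demo_fail_next = 1;
}

/* Plain qsort comparator that reads the sort direction from the slot. */
static int demo_slot_cmp(const void *a, const void *b)
{
    int dir = *(const int *)demo_slot;
    int x = *(const int *)a, y = *(const int *)b;
    return dir * ((x > y) - (x < y));
}

/* Returns 0 on success, -1 if the context could not be stored or cleared. */
int demo_checked_sort(int *v, size_t n, int dir)
{
    /* Check the store: sorting with a stale or NULL context would be wrong. */
    if (demo_key_set_value(&dir) < 0)
        return -1; /* nothing sorted; the caller decides how to recover */

    qsort(v, n, sizeof v[0], demo_slot_cmp);

    /* Check the clear as well, so a dangling pointer is never left unnoticed. */
    if (demo_key_set_value(NULL) < 0)
        return -1;
    return 0;
}
```

The design cost is visible in the signature: once the wrapper can fail, every call site has to handle a status code that qsort_r itself never returns.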

Collaborator

@brtnfld brtnfld left a comment


See comments

@mattjala
Contributor Author

What the PR does:
ConfigureChecks.cmake line 22-26: Detects both qsort_r AND qsort_s, sets H5_HAVE_QSORT_REENTRANT=1 if either exists
H5private.h line 76-85: When H5_HAVE_QSORT_REENTRANT is defined:
Darwin → calls HDqsort_context() (BSD-style wrapper)
Else → calls qsort_r() directly
The Problem:
// On Windows:
// 1. CMake detects qsort_s exists → sets H5_HAVE_QSORT_REENTRANT = 1
// 2. Code checks H5_HAVE_QSORT_REENTRANT → true, enters branch
// 3. Code checks H5_HAVE_DARWIN → false (Windows is not Darwin)
// 4. Code calls qsort_r(B, N, S, C, A) → ERROR: qsort_r doesn't exist on Windows!
Windows has qsort_s, not qsort_r - completely different function.
Test gap: The test-qsort-fallback.yml includes Windows but forces the fallback by modifying the config, so it never tests the native qsort_s path.
Impact: Windows builds will fail to compile with undefined reference to qsort_r.

H5win32defs.h is included on line 575 and defines HDqsort_r() to the HDqsort_context() helper. As such, the #ifndef HDqsort_r() check at line 777 resolves false and the Darwin check and HDqsort_r definitions here are skipped entirely on Windows builds.
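
The guard behaviour described above can be reproduced in a few lines. This is a simplified sketch with hypothetical names (`DEMO_SORT_IMPL`, `demo_sort_impl`), not the actual HDF5 headers: a platform header defines the macro first, so the later `#ifndef` block is skipped and the generic definition never takes effect.

```c
#include <string.h>

/* Stand-in for a platform header (hypothetical): it claims the macro first. */
#define DEMO_SORT_IMPL "platform-specific wrapper"

/* Stand-in for the generic header: defines the macro only if still unset.
 * Because the platform definition above already exists, the preprocessor
 * skips this whole block, including any platform checks inside it. */
#ifndef DEMO_SORT_IMPL
#define DEMO_SORT_IMPL "generic qsort_r call"
#endif

/* Reports which definition survived preprocessing. */
const char *demo_sort_impl(void)
{
    return DEMO_SORT_IMPL;
}
```

The order of inclusion matters here: the override only works because the platform header is pulled in before the guarded block.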

@mattjala mattjala dismissed stale reviews from byrnHDF and hyoklee via c56ca00 October 27, 2025 14:48
@brtnfld brtnfld self-requested a review October 27, 2025 15:26
brtnfld previously approved these changes Oct 27, 2025
src/H5system.c Outdated
/* Ensure the TLS key is initialized */
ret = H5TS_once(&HDqsort_fallback_key_once, HDqsort_fallback_key_init);
if (H5_UNLIKELY(ret < 0)) {
    assert(false && "Failed to initialize TLS key for qsort fallback");
Collaborator

As this is in main library code, we need to do better here than asserting on a condition that could realistically occur in debug builds and simply ignoring errors in release builds. Since the signature of this function currently doesn't allow errors to be propagated, either the interface needs to be reworked to support that, or we should abandon this variant of the function, use only the global variable approach, and document the thread-safety issues until we actually need a thread-safe version.

strategy:
  fail-fast: false
  matrix:
    freebsd-version: ['14.3', '15.0']
Collaborator

Should add 13.5 back, as long as we plan to test that

hyoklee previously approved these changes Oct 27, 2025

- name: Force qsort fallback by modifying ConfigureChecks.cmake
  run: |
    sed -i.bak '/CHECK_FUNCTION_EXISTS (qsort_r _HAVE_QSORT_R_TMP)/,/^endif ()$/{
Collaborator

I agree with testing the fallback function, but it probably doesn't need a completely separate workflow run everywhere just to test something that won't be used in most places. Instead, I think this should probably be driven by testing on some platform we support or want/plan to support where we know the fallback function is going to be used.

Collaborator

@jhendersonHDF jhendersonHDF left a comment


A couple of comments; dealing with the lack of error handling in the fallback function is the most important IMO

@mattjala mattjala dismissed stale reviews from hyoklee and brtnfld via 017ae72 October 27, 2025 22:08
src/H5system.c Outdated

/* Assert that initialization succeeded - cannot propagate errors from here */
if (H5_UNLIKELY(ret < 0)) {
    assert(false && "Failed to create TLS key for qsort fallback");
Collaborator

We must avoid assertions in main library code for cases that can realistically occur, even if that means ignoring errors (though ideally we don't do that either). In this case though, it looks like ret isn't assigned anyway, so this doesn't seem to be a useful check, unless ret should have been assigned from H5TS_key_create.

Collaborator

That said, I'd be much more in favor of simply dropping the thread-safe variant unless there's a good use case, and requiring qsort_r to exist for thread-safe builds, rather than bending over backwards a bit to add in error handling.

Contributor Author

You're right, ret should have been assigned from the key creation. I've removed this check since, as the comment says, a key creation failure will be detected immediately afterwards in HDqsort_fallback().

Contributor Author

That said, I'd be much more in favor of simply dropping the thread-safe variant unless there's a good use case, and requiring qsort_r to exist for thread-safe builds, rather than bending over backwards a bit to add in error handling.

What exactly do you mean by requiring qsort_r to exist for threadsafe builds? I think that having the entire threadsafe build be impossible on systems that don't provide qsort_r directly seems too severe. As it stands, a threadsafe build should acquire the API lock on entry to the dataset routines that trigger qsort_r at tree create time anyway, so this should still be safe.

If the API lock isn't enough to handle the cases where the user wants a threadsafe build AND their system doesn't provide qsort_r, then I'd be more inclined to either fall back to not using the r-tree, or to acquire a special lock at tree creation time.

Collaborator

I think it's reasonable to require more modern features as the library progresses, but for the time being we could simply document the issues. For concurrency builds (as opposed to thread-safe builds), this may eventually become an issue.

Contributor Author

In that case, the current commit should be ready, unless there's somewhere else that this should be documented.

@mattjala mattjala requested review from brtnfld and byrnHDF October 28, 2025 20:59


Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

Compilation error on NDK/BSDs: undefined reference to qsort_r in H5RT.c

7 participants