Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow xcalloc() to return NULL when size is 0 for portability #3452

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

hasumikin
Copy link
Contributor

According to the calloc(3) man page, when nmemb or size is 0, calloc() can either return NULL or a unique pointer that can be passed to free(): https://manpages.org/calloc/3

While gcc (13.3.0) and clang (15.0.7), in my environment, return a unique pointer in this case, mruby's mrb_calloc returns NULL.
See https://github.com/mruby/mruby/blob/3.3.0/src/gc.c#L253

Since pm_constant_pool_init() is commonly called with capacity=0 during normal operation of Prism, xcalloc() needs to handle NULL returns to integrate with mruby.

Per free()'s and realloc()'s specifications, passing a NULL pointer is a valid operation, so this change should be safe.

@eregon
Copy link
Member

eregon commented Feb 2, 2025

This doesn't look correct to me, from man calloc on my Linux machine:

   calloc()
       The  calloc()  function allocates memory for an array of nmemb elements of size bytes each and returns a pointer to
       the allocated memory.  The memory is set to zero.  If nmemb or size is 0, then calloc() returns  a  unique  pointer
       value that can later be successfully passed to free().

       If  the multiplication of nmemb and size would result in integer overflow, then calloc() returns an error.  By con‐
       trast, an integer overflow would not be detected in the following call to malloc(), with the result that an  incor‐
       rectly sized block of memory would be allocated:

           malloc(nmemb * size);

...

RETURN VALUE
       The malloc(), calloc(), realloc(), and reallocarray() functions return a pointer to the allocated memory, which  is
       suitably aligned for any type that fits into the requested size or less.  On error, these functions return NULL and
       set  errno.   Attempting  to  allocate  more than PTRDIFF_MAX bytes is considered an error, as an object that large
       could cause later pointer subtraction to overflow.

...

       If your program uses a private memory allocator, it should do so by replacing malloc(), free(), calloc(), and real‐
       loc().  The replacement functions must implement the documented glibc behaviors, including  errno  handling,  size-
       zero  allocations,  and overflow checking; otherwise, other library routines may crash or operate incorrectly.  For
       example, if the replacement free() does not preserve errno, then seemingly  unrelated  library  routines  may  fail
       without  having a valid reason in errno.  Private memory allocators may also need to replace other glibc functions;
       see "Replacing malloc" in the glibc manual for details.

...

   Nonportable behavior
       The behavior of these functions when the requested size is zero is glibc specific; other implementations may return
       NULL without setting errno, and portable POSIX programs should tolerate such behavior.  See realloc(3p).

       POSIX requires memory allocators to set errno upon failure.  However, the C standard does not require this, and ap‐
       plications portable to non-POSIX platforms should not assume this.

(typical man page confusing text but)

  • on glibc calloc(sizeof(...), 0) must never return NULL, and If your program uses a private memory allocator "it should do that too"
  • But under Nonportable behavior "The behavior of these functions when the requested size is zero is glibc specific; other implementations may return NULL without setting errno, and portable POSIX programs should tolerate such behavior. See realloc(3p)."
  • calloc() returns NULL and sets errno on error (e.g. on ENOMEM), so callers MUST check for NULL.

Also, MRuby calloc can return NULL on error, if e.g. nelem <= SIZE_MAX / len: https://github.com/mruby/mruby/blob/32279e4128527bab4c961854b9cce727a060abea/src/gc.c#L245 (it doesn't set errno, that's against POSIX from the text above).

So there is at least one MRuby issue which is it should set errno if it returns NULL for an error.
And maybe but more arguable, MRuby should probably return something glibc-compatible in the zero case.

In terms of what the caller can do about this poorly-specified behavior of calloc() I think the simplest/straightforward ways are:

  • Handle capacity == 0 specially:
if (capacity) {
  list->ids = xcalloc(capacity, sizeof(pm_constant_id_t));
  if (list->ids == NULL) abort();
} else {
  list->ids = NULL;
}
  • list->ids = xcalloc(MAX(capacity, 1), sizeof(pm_constant_id_t));

@hasumikin hasumikin force-pushed the fix/pm_constant_id_list_init_capacity branch from 528bd3d to d544425 Compare February 3, 2025 02:09
@hasumikin
Copy link
Contributor Author

@eregon Thank you very much. I agree with the former option.

According to the calloc(3) man page, when nmemb or size is 0, `calloc()` can either return NULL or a unique pointer that can be passed to `free()`.
While gcc and clang typically return a unique pointer, mruby's `mrb_calloc()` returns NULL in this case.

Since `pm_constant_pool_init()` is commonly called with capacity=0 during normal operation of Prism, explicitly handle this case by setting `list->ids` to NULL when capacity is 0.
This approach is portable across different calloc implementations and avoids potential issues with mruby's allocation behavior.

This maintains compatibility with `free()` and `realloc()`, as passing NULL pointers to these functions is explicitly allowed by their specifications.
@hasumikin hasumikin force-pushed the fix/pm_constant_id_list_init_capacity branch from d544425 to 1c32252 Compare February 3, 2025 04:12
Copy link
Member

@eregon eregon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eregon
Copy link
Member

eregon commented Feb 3, 2025

I wonder if other places also need similar handling.
I guess this one was problematic for PicoRuby or MRuby, with this fix are there any other similar issues left?

I'll let @kddnewton (or another Prism maintainer) review/merge.

@hasumikin
Copy link
Contributor Author

https://github.com/ruby/prism/blob/v1.3.0/src/prism.c#L766-L767
https://github.com/ruby/prism/blob/v1.3.0/src/prism.c#L1913-L1916
As far as I understand, the places above don't call xcalloc with size=0.
So my PR is the only place where mruby fails (for now).

But I'm also wondering if it's ok that, for example, the below code allows NULL:
https://github.com/ruby/prism/blob/v1.3.0/src/util/pm_integer.c#L342-L343

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants