[BUG] Series.str.slice_from() fails when starts and stops have different integer dtypes #20435

@csadorf

Describe the bug

Series.str.slice_from() fails with a cryptic C++ error when the starts and stops parameters have compatible but different integer dtypes (e.g., int32 and int64). The error is raised in the libcudf C++ layer instead of being handled gracefully at the Python API level.

Steps/Code to reproduce bug

import cudf
import cupy as cp

# Enable pandas compatibility mode (where str.len() returns int64)
cudf.set_option("mode.pandas_compatible", True)

# Create a simple string series
s = cudf.Series(["hello", "world", "test"])

# Create starts as int32, stops comes from str.len() as int64
starts = cudf.Series(cp.zeros(len(s), dtype=cp.int32))
stops = s.str.len()  # Returns int64 in pandas mode

print(f"starts dtype: {starts.dtype}")  # int32
print(f"stops dtype: {stops.dtype}")    # int64

# This fails with dtype mismatch
result = s.str.slice_from(starts, stops)

Error:

TypeError: CUDF failure at: /tmp/conda-bld-output/bld/rattler-build_libcudf/work/cpp/src/strings/slice.cu:330: 
Parameters starts and stops must be of the same type.

Expected behavior

One of the following:

  1. Automatically cast both to a common dtype (like numpy/pandas type promotion)
  2. Raise a clear Python-level TypeError with actionable message: "starts (dtype int32) and stops (dtype int64) must have the same dtype. Consider casting both to the same type."
  3. Document the requirement clearly in the API docs
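
Option 2 could look roughly like the following at the Python level. This is a minimal sketch, not cuDF's actual implementation: `check_slice_bounds_dtype` is a hypothetical helper name, and it assumes only that both arguments expose a `.dtype` attribute. NumPy arrays stand in for the cuDF Series so the check can run without a GPU.

```python
import numpy as np


def check_slice_bounds_dtype(starts, stops):
    """Hypothetical pre-check: raise early if the two dtypes differ."""
    if starts.dtype != stops.dtype:
        raise TypeError(
            f"starts (dtype {starts.dtype}) and stops (dtype {stops.dtype}) "
            "must have the same dtype. Consider casting both to the same type."
        )


# NumPy stand-ins for the int32 starts and int64 stops from the reproducer
starts = np.zeros(3, dtype=np.int32)
stops = np.array([5, 5, 4], dtype=np.int64)

try:
    check_slice_bounds_dtype(starts, stops)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

Running such a check before dispatching to libcudf would surface the mismatch with both dtypes named, rather than a source path inside slice.cu.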

Environment overview (please complete the following information)

  • Environment location: Bare-metal
  • Method of cuDF install: conda

Environment details

  • cuDF version: 25.12 (latest)
  • Python version: 3.13
  • CUDA version: 13.0.1

Additional context

This became a breaking change after PR #20368 (merged Oct 28, 2025), which modified str.len() to return int64 in pandas-compatible mode to match pandas behavior. Code that previously worked with int32 slice indices now fails.

Workaround:

result = s.str.slice_from(starts, stops.astype(cp.int32))
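
A more general form of this workaround promotes both arguments to a shared integer dtype instead of hard-coding int32. The sketch below rests on assumptions: `align_slice_bounds` is a hypothetical helper, it uses NumPy's `np.result_type` for promotion, and it assumes both inputs carry NumPy-compatible dtypes and an `astype` method. Whether libcudf accepts the promoted type (for example int64) for both arguments has not been verified here. NumPy arrays are used as stand-ins so the example runs without a GPU.

```python
import numpy as np


def align_slice_bounds(starts, stops):
    """Hypothetical helper: cast starts and stops to one shared integer dtype."""
    for name, col in (("starts", starts), ("stops", stops)):
        if not np.issubdtype(col.dtype, np.integer):
            raise TypeError(f"{name} must have an integer dtype, got {col.dtype}")

    # NumPy promotion rules, e.g. int32 with int64 gives int64
    common = np.result_type(starts.dtype, stops.dtype)

    # Some mixes (uint64 with int64) promote to float64; reject those
    if not np.issubdtype(common, np.integer):
        raise TypeError(
            f"no common integer dtype for {starts.dtype} and {stops.dtype}"
        )
    return starts.astype(common), stops.astype(common)


# NumPy stand-ins for the int32 starts and int64 stops from the reproducer
starts = np.zeros(3, dtype=np.int32)
stops = np.array([5, 5, 4], dtype=np.int64)
starts, stops = align_slice_bounds(starts, stops)
```

With cuDF, the call would then be `s.str.slice_from(*align_slice_bounds(starts, stops))`, under the assumptions stated above.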

Labels: Python, bug, libcudf
