DOC: clarify Series alignment when assigning to DataFrame column (GH#39845) #62082

Open
wants to merge 6 commits into
base: main
28 changes: 19 additions & 9 deletions doc/source/user_guide/indexing.rst
@@ -1751,30 +1751,40 @@ Key Points:
* This behavior is consistent across df[col] = series and df.loc[:, col] = series
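
A minimal sketch of the consistency claim above, assuming the same small frame and Series used in the examples of this section. The column names `via_setitem` and `via_loc` are illustrative only and are not part of the PR. Both assignment forms are expected to align on index labels rather than on position:

```python
import pandas as pd

# Frame indexed x, y, z; Series carries the same labels in a different order.
df = pd.DataFrame({"values": [1, 2, 3]}, index=["x", "y", "z"])
s = pd.Series([10, 20, 30], index=["z", "x", "y"])

# Plain column assignment: values are matched to rows by label.
df["via_setitem"] = s

# Full-column assignment through .loc: the same label alignment applies.
df.loc[:, "via_loc"] = s

# Compare the two resulting columns value by value.
same_result = df["via_setitem"].tolist() == df["via_loc"].tolist()
```

If the two forms ever disagreed, `same_result` would be `False`, which makes this a quick check to run against a given pandas version.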

Examples:
.. ipython:: python
~~~~~~~~~

import pandas as pd
.. ipython:: python

# Create a DataFrame
df = pd.DataFrame({'values': [1, 2, 3]}, index=['x', 'y', 'z'])
df

# Series with matching indices (different order)
s1 = pd.Series([10, 20, 30], index=['z', 'x', 'y'])
df['aligned'] = s1 # Aligns by index, not position
print(df)
df

# Series with partial index match
s2 = pd.Series([100, 200], index=['x', 'z'])
df['partial'] = s2 # Missing 'y' gets NaN
print(df)
df

# Series with non-matching indices
s3 = pd.Series([1000, 2000], index=['a', 'b'])
df['nomatch'] = s3 # All values become NaN
print(df)
df

Avoiding Confusion:
~~~~~~~~~~~~~~~~~~~

#Avoiding Confusion:
#If you want positional assignment instead of index alignment:
# reset the Series index to match DataFrame index
df['s1_values'] = s1.reindex(df.index)
If you want positional assignment instead of index alignment:

.. ipython:: python

# Use .values or .to_numpy() to assign by position
df['position_based'] = s1.values[:len(df)]
df

# Or align the Series to the DataFrame's index first
df['reindexed'] = s1.reindex(df.index)
df
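
To make the difference between the two options concrete, here is a hedged sketch that reuses `df` and `s1` from the diff. The Series `s2_short` and the column name `too_short` are hypothetical names added for illustration. It also shows one caveat: slicing with `[:len(df)]` only truncates a longer array, so an array shorter than the frame still raises.

```python
import pandas as pd

df = pd.DataFrame({"values": [1, 2, 3]}, index=["x", "y", "z"])
s1 = pd.Series([10, 20, 30], index=["z", "x", "y"])

# Positional: converting to a NumPy array discards the index,
# so values land in row order x, y, z exactly as stored in s1.
df["position_based"] = s1.to_numpy()

# Label-based: reindex reorders s1 to the frame's index first.
df["reindexed"] = s1.reindex(df.index)

# Caveat: a positional array must match the frame's length.
s2_short = pd.Series([100, 200], index=["x", "z"])  # hypothetical example data
try:
    df["too_short"] = s2_short.to_numpy()
except ValueError as exc:
    length_error = str(exc)  # pandas reports the length mismatch
```

Here `position_based` holds 10, 20, 30 in row order, while `reindexed` holds 20, 30, 10 because each value follows its label. Which one is appropriate depends on whether the Series index carries meaning for the target frame.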