From b22fc9fa66b26dcbaff0712b6baf73e88a14059c Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Wed, 9 Jul 2025 23:46:12 +0530
Subject: [PATCH 01/10] DOC: Clarify broadcasting behavior when using lists in DataFrame arithmetic (GH18857)

---
 doc/source/user_guide/basics.rst  |  5 +++++
 doc/source/user_guide/dsintro.rst | 13 +++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index 8155aa0ae03fa..fbfdec6af8759 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -209,6 +209,11 @@ either match on the *index* or *columns* via the **axis** keyword:
    df.sub(column, axis="index")
    df.sub(column, axis=0)
 
+Be careful when using raw Python lists in binary operations with DataFrames.
+Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
+Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
+To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
+See also: :ref:`numpy broadcasting `
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python
diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 89981786d60b5..c635d9157f557 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -650,6 +650,19 @@ row-wise. For example:
 
    df - df.iloc[0]
 
+When using a Python list in arithmetic operations with a DataFrame, the behavior is not element-wise broadcasting.
+Instead, the list is treated as a single object and the operation is performed column-wise, resulting in unexpected output (e.g. arrays inside each cell).
+
+.. ipython:: python
+
+   df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["A", "B", "C"])
+
+   df + [1, 2, 3]  # Returns a Series of arrays, not a DataFrame
+
+   df + np.array([1, 2, 3])  # Correct broadcasting
+
+   df + pd.Series([1, 2, 3], index=["A", "B", "C"])  # Also correct
+
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.
 

From 7bcf683e63fbfb69601236e41efd5c0a495a2843 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Thu, 10 Jul 2025 00:01:03 +0530
Subject: [PATCH 02/10] DOC: Fix external link formatting in basics.rst

---
 doc/source/user_guide/basics.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index fbfdec6af8759..e25dca0da91c9 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -213,7 +213,7 @@ Be careful when using raw Python lists in binary operations with DataFrames.
 Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
 Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
 To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-See also: :ref:`numpy broadcasting `
+See also: `numpy broadcasting `_
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python

From ec00318d45485321b5fd91a4afcdf715fca30510 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Thu, 10 Jul 2025 00:03:55 +0530
Subject: [PATCH 03/10] DOC: Removed external link in basics.rst

---
 doc/source/user_guide/basics.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index e25dca0da91c9..d35cc06db9f06 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -213,7 +213,6 @@ Be careful when using raw Python lists in binary operations with DataFrames.
 Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
 Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
 To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-See also: `numpy broadcasting `_
 Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
 
 .. ipython:: python

From 0e718bc6827ee13335fdf3dc8c4fa58ba177531a Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Tue, 15 Jul 2025 13:23:57 +0530
Subject: [PATCH 04/10] Comment changes

---
 doc/source/user_guide/dsintro.rst | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index c635d9157f557..59938a66bb507 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -655,13 +655,7 @@ Instead, the list is treated as a single object and the operation is performed c
 
 .. ipython:: python
 
-   df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=["A", "B", "C"])
-
-   df + [1, 2, 3]  # Returns a Series of arrays, not a DataFrame
-
-   df + np.array([1, 2, 3])  # Correct broadcasting
-
-   df + pd.Series([1, 2, 3], index=["A", "B", "C"])  # Also correct
+   df + np.array([1, 2, 3])
 
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.

From f10c147377e1004b89b22ef5d4d4c771cafe09a8 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Mon, 21 Jul 2025 11:15:13 +0530
Subject: [PATCH 05/10] Changes as per comment

---
 doc/source/user_guide/dsintro.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 59938a66bb507..385238c12f423 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -650,12 +650,12 @@ row-wise. For example:
 
    df - df.iloc[0]
 
-When using a Python list in arithmetic operations with a DataFrame, the behavior is not element-wise broadcasting.
-Instead, the list is treated as a single object and the operation is performed column-wise, resulting in unexpected output (e.g. arrays inside each cell).
+Use .add(array, axis=0) to apply row-wise broadcasting when the array length matches the number of rows —
+this ensures element-wise operations are performed across each row, rather than mistakenly aligning with columns.
 
 .. ipython:: python
 
-   df + np.array([1, 2, 3])
+   df.add(np.array([1, 2, 3]), axis=0)
 
 For explicit control over the matching and broadcasting behavior, see the
 section on :ref:`flexible binary operations `.

From d438e8b4d8dfa9b100cfb58922d0edb6094b605e Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Mon, 21 Jul 2025 20:51:59 +0530
Subject: [PATCH 06/10] Made the changes

---
 doc/source/user_guide/basics.rst | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index d35cc06db9f06..6a496de3b8e53 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -209,11 +209,12 @@ either match on the *index* or *columns* via the **axis** keyword:
    df.sub(column, axis="index")
    df.sub(column, axis=0)
 
-Be careful when using raw Python lists in binary operations with DataFrames.
-Unlike NumPy arrays or Series, lists are not broadcast across rows or columns.
-Instead, pandas attempts to match the entire list against a single axis, which may lead to confusing results such as Series of arrays.
-To ensure proper broadcasting behavior, use a NumPy array or Series with explicit index or shape.
-Furthermore you can align a level of a MultiIndexed DataFrame with a Series.
+Use .add(array, axis=0) to broadcast values row-wise, ensuring each element in the array is
+applied to the corresponding row. This avoids accidental column alignment and preserves expected element-wise behavior.
+
+.. ipython:: python
+
+   df.add(np.array([1, 2, 3]), axis=0)
 
 .. ipython:: python
 

From 6108bcbf0f5a957f77551ece83bf923ccf6a60a2 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Wed, 23 Jul 2025 14:39:03 +0530
Subject: [PATCH 07/10] Changes removed the line

---
 doc/source/user_guide/basics.rst | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/doc/source/user_guide/basics.rst b/doc/source/user_guide/basics.rst
index 6a496de3b8e53..c6ff08e5f590d 100644
--- a/doc/source/user_guide/basics.rst
+++ b/doc/source/user_guide/basics.rst
@@ -209,9 +209,6 @@ either match on the *index* or *columns* via the **axis** keyword:
    df.sub(column, axis="index")
    df.sub(column, axis=0)
 
-Use .add(array, axis=0) to broadcast values row-wise, ensuring each element in the array is
-applied to the corresponding row. This avoids accidental column alignment and preserves expected element-wise behavior.
-
 .. ipython:: python
 
    df.add(np.array([1, 2, 3]), axis=0)

From 924e19bd79c66ddcf4cb16bb11ba362b84fb8e03 Mon Sep 17 00:00:00 2001
From: Shashwat1001
Date: Wed, 23 Jul 2025 14:43:50 +0530
Subject: [PATCH 08/10] Changes

---
 .gitignore | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/.gitignore b/.gitignore
index d951f3fb9cbad..71f9a6545904e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -141,3 +141,7 @@ doc/source/savefig/
 # Pyodide/WASM related files #
 ##############################
 /.pyodide-xbuildenv-*
+
+.venv
+venv/
+

From 5cf6dbb1968166a6da07c72d5e22dfcbc8cd807a Mon Sep 17 00:00:00 2001
From: Shashwat Shankar <46281931+Shashwat1001@users.noreply.github.com>
Date: Wed, 23 Jul 2025 14:45:26 +0530
Subject: [PATCH 09/10] Update .gitignore

---
 .gitignore | 2 --
 1 file changed, 2 deletions(-)

diff --git a/.gitignore b/.gitignore
index 71f9a6545904e..901e35658bf7f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -142,6 +142,4 @@ doc/source/savefig/
 ##############################
 /.pyodide-xbuildenv-*
 
-.venv
-venv/
 

From 2869774cfdfd20ccfc55f06e0c6cd5a852bee556 Mon Sep 17 00:00:00 2001
From: Shashwat Shankar <46281931+Shashwat1001@users.noreply.github.com>
Date: Wed, 23 Jul 2025 14:46:57 +0530
Subject: [PATCH 10/10] Update .gitignore

---
 .gitignore | 2 --
 1 file changed, 2 deletions(-)

diff --git a/.gitignore b/.gitignore
index 901e35658bf7f..d951f3fb9cbad 100644
--- a/.gitignore
+++ b/.gitignore
@@ -141,5 +141,3 @@ doc/source/savefig/
 # Pyodide/WASM related files #
 ##############################
 /.pyodide-xbuildenv-*
-
-
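
Review note: the `df.add(np.array([1, 2, 3]), axis=0)` example that the series ends up with only applies when the frame has exactly three rows, as the wording in PATCH 05 ("when the array length matches the number of rows") already implies. The sketch below is not part of the patch; it uses a hypothetical 3x3 frame (the names `df`, `values`, `by_columns`, and `by_rows` are assumptions for illustration) to contrast the default column-wise matching of a 1-D array with row-wise matching via `axis=0`.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 frame, chosen so a length-3 array fits either axis.
df = pd.DataFrame(np.arange(9).reshape(3, 3), columns=["A", "B", "C"])
values = np.array([10, 20, 30])

# Default matching: values[j] is added to every entry of column j.
by_columns = df + values

# axis=0 matching: values[i] is added to every entry of row i.
by_rows = df.add(values, axis=0)

print(by_columns)
print(by_rows)
```

On a square frame both calls succeed but give different results, which is why the `axis` keyword matters here; on a non-square frame only the call whose axis length matches the array is expected to work.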