Creating an ibis.memtable with the same name as a previously defined ibis.memtable returns the data from the first definition instead of the new data. What's particularly interesting is that this only seems to happen in a REPL environment.
How I came across this
I was in an IPython session, trying to create a table on the MSSQL backend from an ibis.memtable built from a pandas DataFrame. It was taking a while, so I cancelled it and created a new memtable with the same name from only the first 100 rows to speed things up.
What's happening
>>> df
   a  b
0  1  a
1  2  b
2  3  c
>>> t = ibis.memtable(df, name="t")
>>> t
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ a      │
│     2 │ b      │
│     3 │ c      │
└───────┴────────┘
Now I only want the first row of the pandas DataFrame. If I pass df.head(1) but keep the same memtable name, all three rows still show up:
>>> t = ibis.memtable(df.head(1), name="t")
>>> t
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ a      │
│     2 │ b      │
│     3 │ c      │
└───────┴────────┘
But if we pass a different name, only the one row shows up:
>>> t = ibis.memtable(df.head(1), name="t_1_row")
>>> t
┏━━━━━━━┳━━━━━━━━┓
┃ a     ┃ b      ┃
┡━━━━━━━╇━━━━━━━━┩
│ int64 │ string │
├───────┼────────┤
│     1 │ a      │
└───────┴────────┘
What version of ibis are you using?
9.5.0
More specifically, up to the most recent commit on main at 12a235c.
What backend(s) are you using, if any?
None
Relevant log output
No response