Which script were you running?
main.py (data generation)
Error Category
Other
Bug Description
aug_msf_v2() is not idempotent. It crashes with a duplicate column error if crsp_msf_v2.parquet already contains
the mthaskhi and mthbidlo columns it tries to add.
Steps to Reproduce
- Run the full pipeline (including download_raw_data_tables()) — succeeds
- Comment out download_raw_data_tables() and re-run — aug_msf_v2() fails:
ibis.common.exceptions.IbisInputError: Duplicate column name 'mthaskhi' in result set
Expected Behavior
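A second run of main.py with download_raw_data_tables() commented out should succeed without re-downloading the raw tables, even when crsp_msf_v2.parquet already contains mthaskhi and mthbidlo.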
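One possible way to make the step safe to re-run is to detect which of the added columns are already present and drop them before the join. The sketch below is only an illustration: the helper name stale_columns and the sample column list are hypothetical, and the commented usage assumes that msf in aug_msf_v2() is an ibis table exposing .columns and .drop(), which has not been checked against the repository.

```python
from typing import Iterable, List


def stale_columns(existing: Iterable[str], to_add: Iterable[str]) -> List[str]:
    """Return the names in `to_add` that `existing` already contains, in order."""
    present = set(existing)
    return [name for name in to_add if name in present]


# Columns that aug_msf_v2() adds, taken from the traceback in this report.
NEW_COLS = ["mthaskhi", "mthbidlo"]

# Illustrative column lists only; the real file has many more columns.
first_run = ["permno", "mthcaldt", "mthprc"]
second_run = first_run + NEW_COLS

print(stale_columns(first_run, NEW_COLS))   # []
print(stale_columns(second_run, NEW_COLS))  # ['mthaskhi', 'mthbidlo']

# Hypothetical use inside aug_msf_v2(), before the join (an assumption,
# not the project's actual code):
#   stale = stale_columns(msf.columns, NEW_COLS)
#   if stale:
#       msf = msf.drop(*stale)
```

Dropping the stale columns and recomputing them keeps the result identical on every run; skipping the step entirely when both columns exist would be an alternative if the recomputation is expensive.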
Error Output / Stack Trace
Traceback (most recent call last):
File "/gpfs/home/ffr7/jkp-data/code/main.py", line 50, in <module>
aug_msf_v2()
File "/gpfs/home/ffr7/jkp-data/code/aux_functions.py", line 774, in aug_msf_v2
).select([msf] + [m.mthaskhi, m.mthbidlo])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/gpfs/home/ffr7/jkp-data/.venv/lib/python3.11/site-packages/ibis/expr/types/joins.py", line 385, in select
values = unwrap_aliases(values)
^^^^^^^^^^^^^^^^^^^^^^
File "/gpfs/home/ffr7/jkp-data/.venv/lib/python3.11/site-packages/ibis/expr/types/relations.py", line 509, in un>
raise com.IbisInputError(
ibis.common.exceptions.IbisInputError: Duplicate column name 'mthaskhi' in result set
Operating System
Linux
Python Version
3.13.11
Polars Version
Default uv
uv Version
0.10.9
Available RAM
No response
WRDS Authentication Method
Stored credentials (keyring)
WRDS Two-Factor Authentication
Additional Context
No response
Pre-submission Checklist