@hjfeldy Thanks for the report. It looks like the utility doesn't play well with the HEADER option you've specified -- Postgres will ignore the first line from every chunk, which is where your missing rows are going.
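To make the mechanism concrete, here is a rough simulation of the behaviour described above. It is only a simplified model, not the utility's actual code: the function names, the toy data, and the batch size of 4 are all made up for illustration, and the only assumption about Postgres is that a COPY with HEADER discards the first line of whatever input it receives.

```python
# Simplified model (hypothetical names): a parallel loader splits the file
# into batches and issues one COPY per batch.

def copy_with_header(lines):
    """Mimic COPY ... WITH (HEADER): the first line of the input is discarded."""
    return lines[1:]

def batches(lines, batch_size):
    """Yield consecutive slices of at most batch_size lines."""
    for start in range(0, len(lines), batch_size):
        yield lines[start:start + batch_size]

def load_with_header_option(lines, batch_size):
    """HEADER is applied to every batch, so each batch loses its first line."""
    loaded = []
    for batch in batches(lines, batch_size):
        loaded.extend(copy_with_header(batch))
    return loaded

def load_with_skip_header(lines, batch_size):
    """The header is dropped once, before batching; no data rows are lost."""
    loaded = []
    for batch in batches(lines[1:], batch_size):
        loaded.extend(batch)
    return loaded

csv_lines = ["ts,price"] + [f"row{i}" for i in range(10)]

with_header = load_with_header_option(csv_lines, batch_size=4)
with_skip = load_with_skip_header(csv_lines, batch_size=4)

# Rows that were present in the file but never loaded.
missing = [row for row in csv_lines[1:] if row not in with_header]
print("missing with HEADER per batch:", missing)
print("rows loaded with skip-header:", len(with_skip))
```

In this toy run only the first batch actually starts with the header line; every later batch starts with a real data row, and that row is the one that is silently dropped.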
I suggest using -skip-header instead. If I switch your script from using
What type of bug is this?
Data corruption
What subsystems and features are affected?
Data ingestion
What happened?
After ingesting a large CSV of cryptocurrency data with timescaledb-parallel-copy, certain rows are missing from the table.
timescaleIssue.zip
TimescaleDB version affected
2.6.1
PostgreSQL version used
12
What operating system did you use?
Ubuntu 20.04 LTS x86_64
What installation method did you use?
Deb/Apt
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
How can we reproduce the bug?
Verify by unzipping the attached archive and running the "run" shell script. Out of caution, I included an obfuscated version of the crypto data in question; it has the same data types and dimensions as the original, unobfuscated CSV. The script first loads the CSV with a plain Postgres \COPY command and selects the row I found to be missing when I discovered the bug, which shows that \COPY does include that row. It then clears all data from the database and repeats the load, this time using timescaledb-parallel-copy. A final select shows that the row in question is no longer there, even though it was copied from the identical CSV.