Pandas outputs warning when calling dataframe.append in flux_csv_parser._prepare_data_frame #613

Open
jonashoechst opened this issue Sep 19, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@jonashoechst

Specifications

  • Client Version: 1.36.0
  • InfluxDB Version: 2.7

Code sample to reproduce problem

query_api.query_data_frame(query)

Expected behavior

The query should be executed without warnings.

Actual behavior

Pandas outputs a FutureWarning:

/usr/local/lib/python3.11/site-packages/influxdb_client/client/flux_csv_parser.py:256: FutureWarning:
The behavior of array concatenation with empty entries is deprecated. In a future version, this will no longer exclude empty items when determining the result dtype. To retain the old behavior, exclude the empty entries before the concat operation.

This is related to this code snippet:

https://github.com/influxdata/influxdb-client-python/blob/aa6e5c4aad1aaf004195c175ebe424ac6040c2e3/influxdb_client/client/flux_csv_parser.py#L243C9-L256

Additional info

In recent pandas versions, the handling of concatenation with empty DataFrames changed; see pandas-dev/pandas#39122. The client should handle this case differently to reflect the change in pandas.
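To check whether a given `pd.concat` call emits this warning in a particular environment, the call can be wrapped in a warning recorder. The following is a minimal sketch, not code from the client: the helper name `concat_future_warnings` and the sample frames are hypothetical, and whether any message is recorded depends on the installed pandas version and on the dtypes involved.

```python
import warnings

import pandas as pd


def concat_future_warnings(frames):
    """Concatenate frames and collect any FutureWarning messages raised.

    Hypothetical helper for diagnosis only; it is not part of influxdb_client.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown once in this process.
        warnings.simplefilter("always")
        result = pd.concat(frames)
    messages = [
        str(w.message) for w in caught if issubclass(w.category, FutureWarning)
    ]
    return result, messages


# Illustrative data: an empty frame followed by a frame with one row.
empty = pd.DataFrame(columns=["result", "table", "value"])
data = pd.DataFrame({"result": ["_result"], "table": [0], "value": [2.0]})

result, messages = concat_future_warnings([empty, data])
print(len(result), messages)
```

An empty `messages` list means the concat raised no FutureWarning in that environment, which may explain why the warning shows up for some users and not for others.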

@jonashoechst jonashoechst added the bug Something isn't working label Sep 19, 2023
@Sciator
Contributor

Sciator commented Oct 26, 2023

I wasn't able to replicate this issue. I tried running this with pandas 2.0.3 and 2.1.1 and got no warning.
Could you provide more information (query, data, etc.)?

@bednar
Contributor

bednar commented Oct 26, 2023 via email

@ivankudibal

Please, it would be great to have the data and the query to reproduce this before we go on with this issue.

@oskarwilliams

I am seeing this issue as well. Do you still need a query to reproduce it?

@bednar
Contributor

bednar commented Oct 11, 2024

Hi @oskarwilliams, the query and possibly an example of your data in the database would be helpful… Thanks.

@dave-cz

dave-cz commented Nov 7, 2024

Python 3.12
Pandas 2.2.3
InfluxDB 2.7.5
influxdb_client 1.47.0

Query results from InfluxDB Data Explorer
query(1).csv

Code:

query_api.query_data_frame(query, data_frame_index=['time'])

query:

from(bucket: "model")
|> range(start: 2024-10-27T23:00:00Z, stop: 2024-11-03T22:00:00.000001Z)
|> filter(fn: (r) => r["_measurement"] == "sp_el.J612")
|> aggregateWindow(every: 1h, fn: last, timeSrc: "_start", timeDst: "time", createEmpty: true)
|> pivot(rowKey: ["time"], columnKey: ["_measurement", "_field"], valueColumn: "_value")

flux_csv_parser.py l.258
df = pd.concat([self._data_frame.astype(_temp_df.dtypes), _temp_df])

>> self._data_frame

Empty DataFrame
Columns: [result, table, _start, _stop, sp_el.J612_value]
Index: []
>> _temp_df
                            result  table                    _start                            _stop  sp_el.J612_value
time                                                                                                                  
2024-10-27 23:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               NaN
2024-10-28 00:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               NaN
...
2024-10-29 22:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               NaN
2024-10-29 23:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               2.0
2024-10-30 00:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               2.0
...
2024-10-31 22:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               2.0
2024-10-31 23:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               NaN
...
2024-11-03 22:00:00+00:00  _result      0 2024-10-27 23:00:00+00:00 2024-11-03 22:00:00.000001+00:00               NaN

_temp_df.to_csv('temp_df.csv')

temp_df.csv

WARNING    warnings                  C:\my_project\venv\Lib\site-packages\influxdb_client\client\flux_csv_parser.py:258: FutureWarning:
The behavior of array concatenation with empty entries is deprecated. In a future version, this will no longer exclude empty items when determining the result dtype. To retain the old behavior, exclude the empty entries before the concat operation.

I see

>>> df.equals(_temp_df)
True

so a possible solution could look like

if self._data_frame.empty:
    df = _temp_df
else:
    df = pd.concat([self._data_frame.astype(_temp_df.dtypes), _temp_df])

but I'm not sure about the consequences across the whole codebase.
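The guard above can be tried outside the parser as a standalone function. This is only a sketch under assumptions: `append_frame` and the sample frames are hypothetical names for illustration, and the function mirrors the proposed branch rather than the client's actual implementation.

```python
import pandas as pd


def append_frame(accumulated, incoming):
    """Append incoming to accumulated, skipping the concat when accumulated is empty.

    Hypothetical stand-in for the proposed change; not the library's code.
    """
    if accumulated.empty:
        # Nothing collected yet, so the incoming frame is already the result.
        return incoming
    # Align dtypes first (as the existing line does), then concatenate.
    return pd.concat([accumulated.astype(incoming.dtypes), incoming])


# Illustrative data: the first chunk arrives while the accumulator is empty.
accumulated = pd.DataFrame(columns=["table", "value"])
first = pd.DataFrame({"table": [0, 0], "value": [2.0, None]})
second = pd.DataFrame({"table": [1], "value": [3.0]})

accumulated = append_frame(accumulated, first)
accumulated = append_frame(accumulated, second)
print(accumulated)
```

With the empty case short-circuited, `pd.concat` is only called on non-empty frames, so the empty-entry code path that the warning refers to is not exercised here. Note that the first branch returns the incoming frame itself rather than a copy, which may matter if callers mutate it later.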
