Open
Labels
bug (Something isn't working), needs triage (Awaiting triage by a dask-sql maintainer)
Description
What happened:
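The query SELECT (<number>) NOT IN (CASE <column> WHEN <number> THEN <number> END) FROM <table> returns different results depending on whether it runs on CPU or GPU: in the example below, CPU execution returns True and GPU execution returns <NA>.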
What you expected to happen:
The query should return the same result under CPU and GPU execution.
Minimal Complete Verifiable Example:
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

c = Context()

# Single-row table; 0.1 does not match the CASE branch (WHEN 1).
df0 = pd.DataFrame({
    'c0': [0.1],
})
t0 = dd.from_pandas(df0, npartitions=1)

# Register the same data twice: once for CPU and once for GPU execution.
c.create_table('t0', t0, gpu=False)
c.create_table('t0_gpu', t0, gpu=True)

print('CPU Result:')
result1 = c.sql("SELECT (0.1) NOT IN (CASE t0.c0 WHEN 1 THEN 2 END) FROM t0").compute()
print(result1)

print('GPU Result:')
result2 = c.sql("SELECT (0.1) NOT IN (CASE t0_gpu.c0 WHEN 1 THEN 2 END) FROM t0_gpu").compute()
print(result2)
Result:
INFO:numba.cuda.cudadrv.driver:init
CPU Result:
Float64(0.1) NOT IN (Map { iter: Iter([CASE t0.c0 WHEN Int64(1) THEN Int64(2) END]) })
0 True
GPU Result:
Float64(0.1) NOT IN (Map { iter: Iter([CASE t0_gpu.c0 WHEN Int64(1) THEN Int64(2) END]) })
0 <NA>
INFO:numba.cuda.cudadrv.driver:add pending dealloc: module_unload ? bytes
Anything else we need to know?:
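For comparison, here is a minimal sketch that runs the same expression in SQLite through Python's standard-library sqlite3 module. SQLite is only a stand-in reference engine here, not part of dask-sql, so this shows how one other engine treats the expression and does not settle which dask-sql result is intended. The CASE has no ELSE branch and c0 = 0.1 does not match WHEN 1, so the CASE yields NULL, and under SQL three-valued logic 0.1 NOT IN (NULL) is NULL rather than TRUE.

```python
import sqlite3

# In-memory SQLite database used purely as a reference engine (an assumption
# of this sketch; it is unrelated to dask-sql's CPU or GPU backends).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t0 (c0 REAL)")
conn.execute("INSERT INTO t0 VALUES (0.1)")

# The CASE expression on its own: no branch matches and there is no ELSE.
case_value = conn.execute(
    "SELECT CASE c0 WHEN 1 THEN 2 END FROM t0"
).fetchone()[0]

# The full expression from the report.
not_in_value = conn.execute(
    "SELECT (0.1) NOT IN (CASE c0 WHEN 1 THEN 2 END) FROM t0"
).fetchone()[0]

# sqlite3 maps SQL NULL to Python None.
print(case_value, not_in_value)  # prints: None None

conn.close()
```

If dask-sql is meant to follow the same semantics, the <NA> returned on GPU would be the matching result and the True returned on CPU would be the one that deviates.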
Environment:
- dask-sql version: 2023.6.0
- Python version: Python 3.10.11
- Operating System: Ubuntu 22.04
- Install method (conda, pip, source): Docker, using the rapidsai/rapidsai-dev image at https://hub.docker.com/layers/rapidsai/rapidsai-dev/23.06-cuda11.8-devel-ubuntu22.04-py3.10/images/sha256-cfbb61fdf7227b090a435a2e758114f3f1c31872ed8dbd96e5e564bb5fd184a7?context=explore