Open
Labels: bug, needs triage
Description
What happened:
The query "SELECT (((<column> LIKE '\뽞^' ESCAPE 'M')) IS NULL) FROM <tables>" raises an error when the tables are created with gpu=True.
The same query returns a result when the tables are created with gpu=False.
What you expected to happen:
The query should run without error on GPU and return the same result as on CPU.
Minimal Complete Verifiable Example:
import pandas as pd
import dask.dataframe as dd
from dask_sql import Context
c = Context()
df0 = pd.DataFrame({
'c0': ["TIMESTAMP '1970-08-16 10:28:23'"],
'c1': ["DATE '2002-01-29'"],
'c2': ["DATE '2021-10-05'"],
'c3': [True],
})
t0 = dd.from_pandas(df0, npartitions=1)
c.create_table('t0', t0, gpu=False)
c.create_table('t0_gpu', t0, gpu=True)
df1 = pd.DataFrame({
'c0': ['b'],
'c1': ["TIMESTAMP '1972-02-28 05:27:02'"],
'c2': [836.0000],
'c3': ['CAST((-106) AS TINYINT)'],
})
t1 = dd.from_pandas(df1, npartitions=1)
c.create_table('t1', t1, gpu=False)
c.create_table('t1_gpu', t1, gpu=True)
print('CPU Result:')
result1= c.sql("SELECT (((t1.c1 LIKE '\뽞^' ESCAPE 'M')) IS NULL) FROM t0, t1").compute()
print(result1)
print('GPU Result:')
result2= c.sql("SELECT (((t1_gpu.c1 LIKE '\뽞^' ESCAPE 'M')) IS NULL) FROM t0_gpu, t1_gpu").compute()
print(result2)
Result:
INFO:numba.cuda.cudadrv.driver:init
CPU Result:
t1.c1 LIKE Utf8("\뽞^") CHAR 'M' IS NULL
0 False
GPU Result:
Traceback (most recent call last):
File "/tmp/bug18/bug18.py", line 32, in <module>
result2= c.sql("SELECT (((t1_gpu.c1 LIKE '\뽞^' ESCAPE 'M')) IS NULL) FROM t0_gpu, t1_gpu").compute()
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/context.py", line 513, in sql
return self._compute_table_from_rel(rel, return_futures)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/context.py", line 839, in _compute_table_from_rel
dc = RelConverter.convert(rel, context=self)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rel/convert.py", line 61, in convert
df = plugin_instance.convert(rel, context=context)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rel/logical/project.py", line 57, in convert
new_columns[random_name] = RexConverter.convert(
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rex/convert.py", line 74, in convert
df = plugin_instance.convert(rel, rex, dc, context=context)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rex/core/call.py", line 1101, in convert
operands = [
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rex/core/call.py", line 1102, in <listcomp>
RexConverter.convert(rel, o, dc, context=context)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rex/convert.py", line 74, in convert
df = plugin_instance.convert(rel, rex, dc, context=context)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rex/core/call.py", line 1129, in convert
return operation(*operands, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rex/core/call.py", line 77, in __call__
return self.f(*operands, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask_sql/physical/rex/core/call.py", line 443, in regex
return test.str.match(transformed_regex, flags=flags).astype("boolean")
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/dataframe/accessor.py", line 13, in func
return self._function_map(attr, *args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/dataframe/accessor.py", line 106, in _function_map
meta = self._delegate_method(
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/dataframe/accessor.py", line 92, in _delegate_method
out = getattr(getattr(obj, accessor, obj), attr)(*args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.10/site-packages/cudf/core/column/string.py", line 4291, in match
libstrings.match_re(self._column, pat, flags)
File "/opt/conda/envs/rapids/lib/python3.10/contextlib.py", line 79, in inner
return func(*args, **kwds)
File "contains.pyx", line 87, in cudf._lib.strings.contains.match_re
RuntimeError: CUDF failure at:/rapids/cudf/cpp/src/strings/regex/regcomp.cpp:521: invalid regex pattern: bad escape character at position 2
Anything else we need to know?:
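One possible reading of the traceback, not confirmed against the dask-sql source: the LIKE pattern is translated into a regex in which a backslash ends up directly in front of 뽞. Python's re module, which pandas uses on the CPU path, treats a backslash before a non-ASCII character as a literal, while the cudf regex compiler reports "bad escape character". The sketch below uses a hypothetical helper, like_to_regex, that is not part of dask-sql. It escapes only regex metacharacters, so no backslash is ever placed in front of an ordinary character. Whether cudf accepts the resulting pattern has not been checked here.

```python
import re
from typing import Optional

# Characters that carry special meaning in a regex and must be backslash-escaped.
REGEX_META = set(".^$*+?{}[]\\|()")


def like_to_regex(pattern: str, escape: Optional[str] = None) -> str:
    """Hypothetical helper (not dask-sql API): SQL LIKE pattern -> anchored regex."""

    def literal(ch: str) -> str:
        # Escape regex metacharacters only; leave every other character as-is.
        return "\\" + ch if ch in REGEX_META else ch

    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape and i + 1 < len(pattern):
            # The character after the ESCAPE character is always a literal.
            i += 1
            out.append(literal(pattern[i]))
        elif ch == "%":
            out.append(".*")  # LIKE wildcard: any run of characters
        elif ch == "_":
            out.append(".")   # LIKE wildcard: exactly one character
        else:
            out.append(literal(ch))
        i += 1
    return "^" + "".join(out) + "$"


# With ESCAPE 'M', the backslash in the pattern from this report is an ordinary character.
regex = like_to_regex("\\뽞^", escape="M")
print(regex)
print(bool(re.match(regex, "TIMESTAMP '1972-02-28 05:27:02'")))

# Python's re accepts a backslash in front of a non-ASCII character.
re.compile("\\뽞\\^")
```

If this reading is right, the CPU and GPU paths differ only in how strictly the regex engine treats an unnecessary escape, and the fix would belong in the LIKE-to-regex translation rather than in the query.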
Environment:
- dask-sql version: 2023.6.0
- Python version: Python 3.10.11
- Operating System: Ubuntu 22.04
- Install method (conda, pip, source): Docker, using the image at https://hub.docker.com/layers/rapidsai/rapidsai-dev/23.06-cuda11.8-devel-ubuntu22.04-py3.10/images/sha256-cfbb61fdf7227b090a435a2e758114f3f1c31872ed8dbd96e5e564bb5fd184a7?context=explore