Data placement on different GPUs when using dask_cudf library #17966

Open
preorat-sion opened this issue Feb 10, 2025 · 6 comments

@preorat-sion

I am using the dask_cudf library with two GPUs (CUDA_VISIBLE_DEVICES="0,1"). When reading a CSV file with dask_cudf, the dataframe is automatically split into a number of partitions, which are then processed by various operations.
temp = dask_cudf.read_csv('./test.csv', dtype={"id1":"Int32","id2":"Int32","id4":"object","id5":"object","v2":"Float64"})
Could you please tell me whether there is any way to track which specific GPU each partition is located on, and whether it is possible to manually direct a partition to a specific GPU? Also, is it possible to track which GPU is performing operations (such as join or groupby) on a particular partition? So far, I can only see this on the GPU memory tab.

@quasiben
Member

You can track which partitions are where using the who_has/has_what functions:

In [13]: client.who_has()
Out[13]:
{('frompandas-cabff790f1b8b84828c3c5e69884b775',
  3): ('tcp://127.0.0.1:38207',),
 ('frompandas-cabff790f1b8b84828c3c5e69884b775',
  2): ('tcp://127.0.0.1:43273',),
 ('frompandas-cabff790f1b8b84828c3c5e69884b775',
  1): ('tcp://127.0.0.1:38207',),
 ('frompandas-cabff790f1b8b84828c3c5e69884b775',
  0): ('tcp://127.0.0.1:43273',)}

In [14]: client.has_what()
Out[14]:
{'tcp://127.0.0.1:38207': (('frompandas-cabff790f1b8b84828c3c5e69884b775', 3),
  ('frompandas-cabff790f1b8b84828c3c5e69884b775', 1)),
 'tcp://127.0.0.1:43273': (('frompandas-cabff790f1b8b84828c3c5e69884b775', 2),
  ('frompandas-cabff790f1b8b84828c3c5e69884b775', 0))}


Dask enables some amount of user control over the placement of data. I would suggest reading the data locality page, specifically:

However, we generally recommend letting the scheduler control distribution of data across the workers
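
To relate those worker addresses back to physical GPUs, you can ask each worker which device it owns. A minimal sketch, assuming a dask_cuda LocalCUDACluster where each worker is pinned to one GPU (the helper name gpu_of_worker is just illustrative):

import os
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster(CUDA_VISIBLE_DEVICES="0,1")  # one worker per GPU
client = Client(cluster)

def gpu_of_worker():
    # dask-cuda sets CUDA_VISIBLE_DEVICES per worker, with that worker's own GPU listed first
    return os.environ.get("CUDA_VISIBLE_DEVICES", "").split(",")[0]

worker_to_gpu = client.run(gpu_of_worker)  # {worker address: GPU id}
for key, workers in client.who_has().items():
    print(key, [worker_to_gpu[w] for w in workers])

Joining the client.run() output with who_has() like this gives a per-partition GPU mapping without watching the dashboard.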

@preorat-sion
Author

After reading the CSV and calling has_what() (or who_has()), the functions often behave strangely: they show key_count = 0 on both GPUs, or nothing is displayed at all beyond the empty header (Key | Copies | Workers), even though partitioning took place and the partitions should be listed.

@quasiben
Member

My guess is that the graph hasn't been executed and data hasn't been persisted:

ddf = dask_cudf.read_csv(...)
ddf = ddf.persist()
client.who_has()
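
One caveat (my addition, not part of the reply above): persist() returns immediately, so it helps to block until the partitions are actually materialized before inspecting placement. A sketch reusing the file, dtypes, and client from earlier in this thread:

import dask_cudf
from dask.distributed import wait

ddf = dask_cudf.read_csv("./test.csv", dtype={"id1": "Int32", "id2": "Int32", "id4": "object", "id5": "object", "v2": "Float64"})
ddf = ddf.persist()      # submits the graph and returns right away with a lazy handle
wait(ddf)                # block until every partition is in GPU memory on some worker
print(client.who_has())  # now lists a worker address for each partition key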

@preorat-sion
Author

preorat-sion commented Feb 10, 2025

Yes, you need to execute the task graph first. After that, the display works correctly. However, sometimes the partitions are distributed very unevenly across GPUs (for example, out of 10 partitions, 2 may go to one GPU and 8 to another). I believe I read that you can also distribute partitions across different GPUs using map_partitions.
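
If the spread stays lopsided after persisting, one thing to try (my own suggestion, not something confirmed in this thread) is asking the scheduler to rebalance the in-memory data; note that map_partitions only runs a function on whichever worker already holds each partition, it does not move data. A minimal sketch:

from dask.distributed import wait

ddf = ddf.persist()
wait(ddf)                 # make sure all partitions are materialized first
client.rebalance()        # move in-memory data so workers hold roughly equal shares
print(client.has_what())  # partition counts per worker should now be more even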

@preorat-sion
Author

Is it possible to partition data by a key, so that values with the same key are placed in the same partition on the same GPU? This would be especially useful for operations like groupby and join.

@quasiben
Member

Yes, you can call shuffle directly, but groupby and join will call it under the hood as well.
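
For example, here is a minimal sketch of shuffling by a key column so that equal keys land in the same partition (column names id1 and v2 are taken from the read_csv call at the top of the thread; the exact shuffle signature may differ slightly between dask versions):

ddf = ddf.shuffle(on="id1")  # rows with the same id1 end up in the same partition
result = ddf.groupby("id1").v2.mean().compute()
# Alternatively, ddf.set_index("id1") also shuffles by key (and sorts on it).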
