@@ -39,7 +39,9 @@ The accelerators natively supported by Ray Core are:
    * - Rebellions RBLN
      - RBLN
      - Experimental, supported by the community
-
+   * - METAX GPU
+     - GPU
+     - Experimental, supported by the community
 Starting Ray nodes with accelerators
 ------------------------------------

@@ -130,6 +132,15 @@ If you need to, you can :ref:`override <specify-node-resources>` this.
             For example, ``RBLN_DEVICES=1,3 ray start --head --resources='{"RBLN": 2}'``
             lets Ray only see devices 1 and 3.

+    .. tab-item:: METAX GPU
+        :sync: METAX GPU
+
+        .. tip::
+
+            You can set the ``CUDA_VISIBLE_DEVICES`` environment variable before starting a Ray node
+            to limit the METAX GPUs that are visible to Ray.
+            For example, ``CUDA_VISIBLE_DEVICES=1,3 ray start --head --num-gpus=2``
+            lets Ray only see devices 1 and 3.
 .. note::

     There's nothing preventing you from specifying a larger number of
@@ -457,6 +468,44 @@ and assign accelerators to the task or actor by setting the corresponding enviro
             (rbln_task pid=51830) RBLN IDs: [1]
             (rbln_task pid=51830) RBLN_DEVICES: 1

+    .. tab-item:: METAX GPU
+        :sync: METAX GPU
+
+        .. testcode::
+            :hide:
+
+            ray.shutdown()
+
+        .. testcode::
+
+            import os
+            import ray
+
+            ray.init(num_gpus=2)
+
+            @ray.remote(num_gpus=1)
+            class GPUActor:
+                def ping(self):
+                    print("GPU IDs: {}".format(ray.get_runtime_context().get_accelerator_ids()["GPU"]))
+                    print("CUDA_VISIBLE_DEVICES: {}".format(os.environ["CUDA_VISIBLE_DEVICES"]))
+
+            @ray.remote(num_gpus=1)
+            def gpu_task():
+                print("GPU IDs: {}".format(ray.get_runtime_context().get_accelerator_ids()["GPU"]))
+                print("CUDA_VISIBLE_DEVICES: {}".format(os.environ["CUDA_VISIBLE_DEVICES"]))
+
+            gpu_actor = GPUActor.remote()
+            ray.get(gpu_actor.ping.remote())
+            # The actor uses the first GPU so the task uses the second one.
+            ray.get(gpu_task.remote())
+
+        .. testoutput::
+            :options: +MOCK
+
+            (GPUActor pid=52420) GPU IDs: [0]
+            (GPUActor pid=52420) CUDA_VISIBLE_DEVICES: 0
+            (gpu_task pid=51830) GPU IDs: [1]
+            (gpu_task pid=51830) CUDA_VISIBLE_DEVICES: 1

 Inside a task or actor, :func:`ray.get_runtime_context().get_accelerator_ids() <ray.runtime_context.RuntimeContext.get_accelerator_ids>` returns a
 list of accelerator IDs that are available to the task or actor.
@@ -606,6 +655,27 @@ so multiple tasks and actors can share the same accelerator.

         Rebellions RBLN doesn't support fractional resources.

+    .. tab-item:: METAX GPU
+        :sync: METAX GPU
+
+        .. testcode::
+            :hide:
+
+            ray.shutdown()
+
+        .. testcode::
+
+            ray.init(num_cpus=4, num_gpus=1)
+
+            @ray.remote(num_gpus=0.25)
+            def f():
+                import time
+
+                time.sleep(1)
+
+            # The four tasks created here can execute concurrently
+            # and share the same GPU.
+            ray.get([f.remote() for _ in range(4)])

 **Note:** It is the user's responsibility to make sure that the individual tasks
 don't use more than their share of the accelerator memory.
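
The fractional example in the last hunk rests on simple arithmetic: a task that requests ``num_gpus=0.25`` takes a quarter of one GPU, so four such tasks fit on a single device. A minimal sketch of that arithmetic follows. It assumes nothing about Ray's actual scheduler, and `tasks_per_gpu` is a hypothetical helper for illustration, not a Ray API.

```python
from fractions import Fraction


def tasks_per_gpu(gpus_per_task):
    """Hypothetical helper (not a Ray API): tasks that fit on ONE GPU.

    The share is parsed through its decimal string so that values such
    as 0.1 are treated exactly rather than as binary floats.
    """
    share = Fraction(str(gpus_per_task))
    if not 0 < share <= 1:
        raise ValueError("gpus_per_task must be in the range (0, 1]")
    # Floor division: a leftover partial share cannot host another task.
    return int(1 // share)


print(tasks_per_gpu(0.25))  # the share used in the diff's example
print(tasks_per_gpu(0.3))   # a share that does not divide the GPU evenly
```

This only counts scheduling slots. As the note in the diff says, keeping each task within its share of accelerator memory remains the user's responsibility.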