Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #4151
Unanswered
RamakrishnaChaitanya asked this question in General Q&A
Replies: 1 comment · 5 replies
Reply:
Can you check whether it still occurs in the fork?
Hi, I'm trying to fine-tune the FastPitch model (a custom model from IndicTTS) on the OpenSLR dataset. However, after precomputation, training fails with the error "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!". I'm running the following command on the console:
CUDA_VISIBLE_DEVICES="7" python TTS/bin/train_tts_rk.py --config_path /data/Ramakrishna/Projects/TTS/Github_repos/en+hi/fastpitch/config_2.json --restore_path /data/Ramakrishna/Projects/TTS/Github_repos/en+hi/fastpitch/best_model.pth
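One general PyTorch/CUDA point worth noting here (not specific to this repo): with CUDA_VISIBLE_DEVICES="7", the process sees exactly one GPU and PyTorch renumbers it to cuda:0, so "cuda:0" in the error message still refers to physical GPU 7. A minimal sketch to check this:

```python
import os

# CUDA_VISIBLE_DEVICES must be set before the process makes its first
# CUDA call; exporting it on the command line (as above) is the safe way.
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "7")

import torch

# With a single visible GPU, PyTorch renumbers it to cuda:0, so seeing
# "cuda:0" does not mean the wrong GPU is being used.
print(torch.cuda.device_count())  # 1 if GPU 7 is visible and usable, else 0
```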
The error is raised inside the embedding() function defined in torch/nn/functional.py:
Traceback (most recent call last):
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/trainer/trainer.py", line 1833, in fit
    self._fit()
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/trainer/trainer.py", line 1785, in _fit
    self.train_epoch()
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/trainer/trainer.py", line 1504, in train_epoch
    outputs, _ = self.train_step(batch, batch_num_steps, cur_step, loader_start_time)
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/trainer/trainer.py", line 1360, in train_step
    outputs, loss_dict_new, step_time = self.optimize(
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/trainer/trainer.py", line 1226, in optimize
    outputs, loss_dict = self._compute_loss(
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/trainer/trainer.py", line 1157, in _compute_loss
    outputs, loss_dict = self._model_train_step(batch, model, criterion)
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/trainer/trainer.py", line 1116, in _model_train_step
    return model.train_step(*input_args)
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/TTS/tts/models/forward_tts.py", line 729, in train_step
    outputs = self.forward(
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/TTS/tts/models/forward_tts.py", line 616, in forward
    o_en, x_mask, g, x_emb = self._forward_encoder(x, x_mask, g)
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/TTS/tts/models/forward_tts.py", line 401, in _forward_encoder
    g = self.emb_g(g)  # [B, C, 1]
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 190, in forward
    return F.embedding(
  File "/data/Condaenvs/coqui-tts/lib/python3.10/site-packages/torch/nn/functional.py", line 2559, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
When I dug into each of the arguments of torch.embedding(), I saw something like:
weight is on GPU device 0, input is on GPU device -1, and padding_idx, scale_grad_by_freq, and sparse are on the CPU.
Even after explicitly setting CUDA_VISIBLE_DEVICES="7" (out of 8 GPUs), why am I seeing one tensor on GPU 0 and another on GPU -1? How do I address this multiple-devices issue? If anyone has faced a similar issue before, I'd appreciate your help.
Just FYI, I'm using the original, unmaintained repository :(
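For what it's worth, PyTorch's Tensor.get_device() returns -1 for CPU tensors, so "GPU device -1" for input most likely means the speaker-id tensor g stayed on the CPU (padding_idx, scale_grad_by_freq, and sparse are plain Python scalars, so "CPU" is expected for them). A hedged sketch of the usual fix, moving the index tensor onto the embedding weight's device before the lookup (the names emb_g/g come from the traceback; the exact code in forward_tts.py may differ):

```python
import torch
import torch.nn as nn

# Stand-ins for the objects in the traceback: emb plays the role of
# self.emb_g (on cuda:0 in the report), g is the speaker-id tensor
# that apparently stayed on the CPU.
emb = nn.Embedding(4, 8)
g = torch.tensor([1, 3])

print(g.get_device())  # -1, which is how get_device() reports a CPU tensor

# The usual fix: align the index tensor with the embedding weight's
# device before the lookup, e.g. g = g.to(self.emb_g.weight.device)
g = g.to(emb.weight.device)
out = emb(g)
print(out.shape)  # torch.Size([2, 8])
```

In the training code itself, the equivalent one-liner would go just before the `g = self.emb_g(g)` call in `_forward_encoder`, or the batch could be moved to the model's device as a whole before the forward pass.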