paddle.device.device context manager is not thread-safe in multithreaded environments

### Describe the Bug

# paddle.device.device context manager is not thread-safe in multithreaded environments

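## Symptom
When several threads use the `with paddle.device.device(device)` context manager, the device setting of one thread interferes with the others, so tensors get created on the wrong device.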

## Reproduction code
```python
import paddle
import threading
import time

def set_cpu():
    with paddle.device.device("cpu"):
        print("[CPU thread] set CPU")
        time.sleep(2)  # hold the CPU context while the GPU thread starts
        t = paddle.to_tensor([1.0])
        print(f"[CPU thread] tensor device: {t.place}")
    print("CPU thread exited")

def set_gpu():
    time.sleep(0.5)  # start slightly later than the CPU thread
    with paddle.device.device("gpu:0"):
        print("[GPU thread] set GPU")
        t = paddle.to_tensor([2.0])
        print(f"[GPU thread] tensor device: {t.place}")
        time.sleep(2)
    print("GPU thread exited")

t1 = threading.Thread(target=set_cpu)
t2 = threading.Thread(target=set_gpu)
t1.start()
t2.start()
t1.join()
t2.join()
```

### Actual output
```
E:\code\paddle模型大迁移\PlaceEnv>python small_test.py
[CPU thread] set CPU
[GPU thread] set GPU
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0214 21:23:54.467437  7056 gpu_resources.cc:116] Please NOTE: device: 0, GPU Compute Capability: 6.1, Driver API Version: 12.8, Runtime API Version: 11.8
[GPU thread] tensor device: Place(gpu:0)
[CPU thread] tensor device: Place(gpu:0)  <-- Wrong! Should be on the CPU
CPU thread exited
GPU thread exited
```
<img width="1193" height="803" alt="Image" src="https://github.com/user-attachments/assets/14b5abb0-2cfe-436f-8517-2b5e7360314b" />

### Expected output
```
[CPU thread] set CPU
[GPU thread] set GPU
[GPU thread] tensor device: Place(gpu:0)
[CPU thread] tensor device: Place(cpu)    <-- should be on the CPU
CPU thread exited
GPU thread exited
```

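### Root cause analysis
1. The `paddle.device.device` context manager is implemented on top of the global functions `set_device()` and `get_device()`.
2. `_global_expected_place_` in `framework.py` is a global variable (line 267) with no thread synchronization of any kind.
3. `_set_expected_place()` (lines 877-880) assigns to this global directly, `_global_expected_place_ = place`, without a lock or thread-local storage, so a write from one thread is immediately visible to every other thread.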
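
The mechanism can be illustrated without Paddle. The sketch below is a simplified model, not Paddle's actual code: `_global_expected_place`, `set_device`, `get_device`, and `device` are stand-ins written for this example, and `threading.Event` replaces the `time.sleep` calls from the reproduction so the interleaving is deterministic.

```python
import contextlib
import threading

# Simplified stand-in for the module-level state described above.
_global_expected_place = "cpu"


def set_device(place):
    global _global_expected_place
    _global_expected_place = place  # plain assignment, no lock, no thread-local


def get_device():
    return _global_expected_place


@contextlib.contextmanager
def device(place):
    # Save the current global value, overwrite it, and restore it on exit.
    previous = get_device()
    set_device(place)
    try:
        yield
    finally:
        set_device(previous)


def run_race():
    observed = {}
    cpu_entered = threading.Event()
    gpu_entered = threading.Event()
    cpu_done = threading.Event()

    def cpu_worker():
        with device("cpu"):
            cpu_entered.set()
            gpu_entered.wait(timeout=5)  # the other thread overwrites the global here
            observed["cpu_thread"] = get_device()
        cpu_done.set()

    def gpu_worker():
        cpu_entered.wait(timeout=5)
        with device("gpu:0"):
            observed["gpu_thread"] = get_device()
            gpu_entered.set()
            cpu_done.wait(timeout=5)  # keep the GPU context open, as in the report

    threads = [threading.Thread(target=cpu_worker),
               threading.Thread(target=gpu_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed


if __name__ == "__main__":
    print(run_race())
```

In this model the CPU worker reads `"gpu:0"` while it is still inside its own `device("cpu")` block, which matches the behavior seen in the actual output.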


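### Scope of impact
- All multithreaded programs that rely on the `paddle.device.device` context manager
- Code that uses `with paddle.cuda.device` in a multithreaded environment
- Any multithreaded application that sets the global device through `paddle.set_device()`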



### Additional Supplementary Information

### Environment
- PaddlePaddle version: 3.3.0
- Python version: 3.13.1
- Operating system: Windows 10
- GPU required: yes (the problem appears in both CPU and GPU scenarios)

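### Suggested solutions
1. **Short-term fix**: add a thread lock in `_set_expected_place` and `get_device` to make the accesses atomic
2. **Long-term improvement**: move the device state into thread-local storage, following the design pattern of the `GlobalThreadLocal` class
3. **Fundamental fix**: consider restructuring device management so that each thread has its own context instead of sharing global state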
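
The thread-local direction might look roughly like the following. This is a minimal sketch using Python's `threading.local`, not a patch against Paddle: `_state`, `_DEFAULT_PLACE`, and the functions here are hypothetical stand-ins, and the real change would have to live wherever `_global_expected_place_` is read and written.

```python
import contextlib
import threading

# Each thread gets its own attribute namespace on this object.
_state = threading.local()
_DEFAULT_PLACE = "cpu"  # assumed default for threads that never set a device


def get_device():
    return getattr(_state, "place", _DEFAULT_PLACE)


def set_device(place):
    _state.place = place  # visible only to the calling thread


@contextlib.contextmanager
def device(place):
    previous = get_device()
    set_device(place)
    try:
        yield
    finally:
        set_device(previous)


def run_isolated():
    observed = {}
    cpu_entered = threading.Event()
    gpu_entered = threading.Event()
    cpu_done = threading.Event()

    def cpu_worker():
        with device("cpu"):
            cpu_entered.set()
            gpu_entered.wait(timeout=5)  # GPU context is open in the other thread
            observed["cpu_thread"] = get_device()
        cpu_done.set()

    def gpu_worker():
        cpu_entered.wait(timeout=5)
        with device("gpu:0"):
            observed["gpu_thread"] = get_device()
            gpu_entered.set()
            cpu_done.wait(timeout=5)

    threads = [threading.Thread(target=cpu_worker),
               threading.Thread(target=gpu_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed


if __name__ == "__main__":
    print(run_isolated())
```

With the same interleaving as the reproduction, each worker now sees only its own setting. One design consequence to weigh: with plain thread-local state a new thread does not inherit the device chosen in the main thread, so a default or an explicit hand-off would be needed. Also note that a lock taken only inside the setter and getter makes each access atomic but does not stop another thread from overwriting the value in between; isolation would require holding the lock for the whole `with` block.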


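### Notes
This problem affects multithreaded model inference, data processing, and similar workloads, and it is most severe when the CPU and GPU are used together for mixed computation. Prioritizing a fix is recommended to improve the framework's reliability in multithreaded environments.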
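
Until a fix lands, user code in such workloads could serialize its device-scoped sections with a single lock held for the whole block. The sketch below is one possible workaround under stated assumptions: `locked_device` is a hypothetical helper, and `fake_device` is a stand-in for `paddle.device.device` so the example runs without Paddle.

```python
import contextlib
import threading
import time

# Stand-in for the unsynchronized global device state (simplified model).
_current_place = "cpu"


@contextlib.contextmanager
def fake_device(place):
    global _current_place
    previous, _current_place = _current_place, place
    try:
        yield
    finally:
        _current_place = previous


# One process-wide lock; an RLock lets the same thread nest device scopes.
_device_lock = threading.RLock()


@contextlib.contextmanager
def locked_device(device_factory, place):
    """Hold the lock for the entire device scope, not just the setter."""
    with _device_lock:
        with device_factory(place):
            yield


def run_serialized():
    observed = {}
    cpu_entered = threading.Event()

    def cpu_worker():
        with locked_device(fake_device, "cpu"):
            cpu_entered.set()
            time.sleep(0.2)  # give the other thread time to try to enter
            observed["cpu_thread"] = _current_place

    def gpu_worker():
        cpu_entered.wait(timeout=5)
        # Blocks here until the CPU scope has exited and released the lock.
        with locked_device(fake_device, "gpu:0"):
            observed["gpu_thread"] = _current_place

    threads = [threading.Thread(target=cpu_worker),
               threading.Thread(target=gpu_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed


if __name__ == "__main__":
    print(run_serialized())
```

With Paddle installed, the factory passed in would presumably be `paddle.device.device` itself. The trade-off is that device-scoped sections no longer run concurrently, and the protection covers only code that goes through the helper; any direct call to `paddle.set_device()` elsewhere can still change the global device.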

paddle.device.device context manager is not thread-safe in multithreaded environments #77927