Question: performance of making an indexed array slice contiguous #3

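Sorry to bother you again. I remember 小彭老师 saying that contiguous memory access is faster. But when I extract a block from my high-dimensional array, the lowest dimension still has long contiguous runs, so why is it still so slow? Do I need something like stream or prefetch to optimize this?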

```python
import numpy as np

# array0 is a voxel volume; crop it in 3D. The crop size is 64x64x64.
array2 = np.ascontiguousarray(array0[:9, x:x+64, y:y+64, z:z+64])  # 280 fps, 4.9 GByte/s
# array3 = memcopy(array, x, y, z, ....)  # C++ version of the copy, also 280 fps
```

A memory access rate of 4.9 GByte/s seems absurdly low, and I have not done any computation yet.
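To make the numbers easier to compare, here is a minimal sketch of how the crop could be timed and how long each contiguous run in the cropped view actually is. The volume shape `(9, 128, 128, 128)`, the `float32` dtype, and the helper names `crop_throughput` and `contiguous_run_bytes` are assumptions made for illustration only; they are not taken from the linked repository.

```python
import time
import numpy as np

def crop_throughput(volume, x, y, z, size=64, channels=9, repeats=20):
    """Time the contiguous copy of a cropped block; return block, fps, output GB/s."""
    start = time.perf_counter()
    for _ in range(repeats):
        block = np.ascontiguousarray(
            volume[:channels, x:x + size, y:y + size, z:z + size]
        )
    elapsed = time.perf_counter() - start
    fps = repeats / elapsed
    # Counts only the bytes written to the output block, not the bytes read.
    gbytes_per_s = fps * block.nbytes / 1e9
    return block, fps, gbytes_per_s

def contiguous_run_bytes(view):
    """Bytes in the longest memory-contiguous run along the trailing axes."""
    run = view.itemsize
    for length, stride in zip(reversed(view.shape), reversed(view.strides)):
        if stride != run:
            break  # the next axis jumps in memory, so the run ends here
        run *= length
    return run

# Assumed test volume: 9 channels, 128^3 voxels, float32.
volume = np.zeros((9, 128, 128, 128), dtype=np.float32)
x, y, z = 10, 20, 30
view = volume[:9, x:x + 64, y:y + 64, z:z + 64]
block, fps, gbps = crop_throughput(volume, x, y, z)
print(f"{fps:.0f} fps, {gbps:.2f} GB/s written")
print("contiguous run in the view:", contiguous_run_bytes(view), "bytes")
```

Under these assumptions each contiguous run in the view is only one row of 64 elements, so the copy consists of many short runs separated by strided jumps rather than one long sequential read. Whether that accounts for the measured 280 fps would need to be checked on the actual data.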

https://github.com/chenxinfeng4/hpc_issue_array_continguous
