Issue with access particle data structure on host #133

@zhangchonglin

Description

To output particle coordinates in XGCm, the particle data structure is copied to the host and accessed there with the code below. With the latest PUMIPic commit, b6678b0, this code crashes:

  • The code below worked with earlier PUMIPic versions (when PUMIPic was still using Kokkos 3.7.02).
  • It stopped working around the time PUMIPic upgraded to a Kokkos release newer than 4.0.
    auto ptcls_host =
#ifdef XGCM_PS_CAB
      static_cast<ps::CabM<PtclType, DeviceType>*>(ptcls)->template copy<Kokkos::HostSpace>();
#else
      static_cast<ps::SellCSigma<PtclType, DeviceType>*>(ptcls)->template copy<Kokkos::HostSpace>();
#endif

    auto coords = ptcls_host->template get<PTCL_COORDS>();
    auto pids = ptcls_host->template get<PTCL_IDS>();

    auto writeCoords = [&](const int elm, const int ptcl, const bool mask) {
      if (mask) {
        out << pids(ptcl) << ' ' << coords(ptcl, 0) << ' ' << coords(ptcl, 1)
            << ' ' << coords(ptcl, 2) << '\n';
      }
    };
    ps::parallel_for(ptcls_host, writeCoords, "writeCoords");
  • The error message when using the SCS (SellCSigma) particle structure:
Kokkos::abort: Requested Team Size is too large!
  • With the CabM particle structure, the code simply crashes without an error message.
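
Since the copy already lives in host memory and `out` is a single stream, one possible workaround is to walk the particles sequentially rather than through `ps::parallel_for`. The sketch below is plain C++ with no PUMIPic or Kokkos calls: `HostParticles`, `writeCoordsSerial`, and `coordsToString` are hypothetical names, and the vectors only stand in for whatever host-accessible data the copy provides. It has not been tested against PUMIPic.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for the copied particle data: one entry per
// particle slot, plus a mask marking which slots hold a live particle.
struct HostParticles {
  std::vector<int> ids;
  std::vector<std::array<double, 3>> coords;
  std::vector<bool> mask;
};

// Write "id x y z" for every active particle with a plain sequential loop,
// so no team policy (and therefore no team size) is involved.
void writeCoordsSerial(const HostParticles& p, std::ostream& out) {
  assert(p.ids.size() == p.coords.size() && p.ids.size() == p.mask.size());
  for (std::size_t i = 0; i < p.ids.size(); ++i) {
    if (!p.mask[i]) continue;  // skip padding / inactive slots
    out << p.ids[i] << ' ' << p.coords[i][0] << ' ' << p.coords[i][1] << ' '
        << p.coords[i][2] << '\n';
  }
}

// Convenience wrapper that returns the written text as a string.
std::string coordsToString(const HostParticles& p) {
  std::ostringstream out;
  writeCoordsSerial(p, out);
  return out.str();
}
```

A sequential loop also keeps the output lines in slot order, which a parallel write to a shared stream would not guarantee.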

Below is the stack trace from the XGCm crash when using SCS:

(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007f4071c8bc43 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2  0x00007f4071c3e686 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f4071c28833 in __GI_abort () at abort.c:79
#4  0x00007f4075cc32a9 in Kokkos::Impl::host_abort (message=message@entry=0xd2a4d8 "Kokkos::abort: Requested Team Size is too large!")
    at /hdd1/xgcm/xgcm_kokkos4.2.00/kokkos/core/src/impl/Kokkos_Abort.cpp:40
#5  0x00000000004b13eb in Kokkos::abort (message=0xd2a4d8 "Kokkos::abort: Requested Team Size is too large!")
    at /hdd1/xgcm/xgcm_kokkos4.2.00/install/kokkos/install/include/Kokkos_Abort.hpp:97
#6  Kokkos::Impl::TeamPolicyInternal<Kokkos::Serial, Kokkos::Serial>::TeamPolicyInternal (team_size_request=32, league_size_request=72, this=0x7ffcab74bcf0)
    at /hdd1/xgcm/xgcm_kokkos4.2.00/install/kokkos/install/include/Serial/Kokkos_Serial_Parallel_Team.hpp:126
#7  Kokkos::Impl::TeamPolicyInternal<Kokkos::Serial, Kokkos::Serial>::TeamPolicyInternal (this=0x7ffcab74bcf0, league_size_request=72, 
    team_size_request=team_size_request@entry=32, vector_length_request=<optimized out>)
    at /hdd1/xgcm/xgcm_kokkos4.2.00/install/kokkos/install/include/Serial/Kokkos_Serial_Parallel_Team.hpp:168
#8  0x00000000004db5ab in Kokkos::TeamPolicy<Kokkos::Serial>::TeamPolicy (vector_length_request=1, team_size_request=32, league_size_request=<optimized out>, 
    this=<optimized out>) at /hdd1/xgcm/xgcm_kokkos4.2.00/install/kokkos/install/include/Kokkos_ExecPolicy.hpp:566
#9  pumipic::SellCSigma<pumipic::MemberTypes<double [3], double [3], double [3], int>, Kokkos::HostSpace>::parallel_for<xgcm::saveCoords<pumipic::MemberTypes<double [3], double [3], double [3], int> >(pumipic::ParticleStructure<pumipic::MemberTypes<double [3], double [3], double [3], int>, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >*, char const*, int)::{lambda(int, int, bool)#1}>(xgcm::saveCoords<pumipic::MemberTypes<double [3], double [3], double [3], int> >(pumipic::ParticleStructure<pumipic::MemberTypes<double [3], double [3], double [3], int>, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >*, char const*, int)::{lambda(int, int, bool)#1}&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (this=this@entry=0x5cdc1190, fn=..., name="writeCoords")
    at /hdd1/xgcm/xgcm_kokkos4.2.00/install/pumi-pic/install/include/SellCSigma.h:534
#10 0x00000000004e520c in pumipic::parallel_for<xgcm::saveCoords<pumipic::MemberTypes<double [3], double [3], double [3], int> >(pumipic::ParticleStructure<pumipic::MemberTypes<double [3], double [3], double [3], int>, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >*, char const*, int)::{lambda(int, int, bool)#1}, pumipic::MemberTypes<double [3], double [3], double [3], int>, Kokkos::HostSpace>(pumipic::ParticleStructure<pumipic::MemberTypes<double [3], double [3], double [3], int>, Kokkos::HostSpace>*, xgcm::saveCoords<pumipic::MemberTypes<double [3], double [3], double [3], int> >(pumipic::ParticleStructure<pumipic::MemberTypes<double [3], double [3], double [3], int>, Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace> >*, char const*, int)::{lambda(int, int, bool)#1}&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (
    ps=ps@entry=0x5cdc1190, fn=..., s="writeCoords") at /hdd1/xgcm/xgcm_kokkos4.2.00/install/pumi-pic/install/include/ps_for.hpp:10
#11 0x0000000000515f2e in xgcm::saveCoords<pumipic::MemberTypes<double [3], double [3], double [3], int> > (ptcls=<optimized out>, prefix=prefix@entry=0xd70964 "ions", 
    iter=iter@entry=0) at /hdd1/xgcm/xgcm_kokkos4.2.00/xgcm/src/viz/xgcm_viz_ptcls.tpp:40
#12 0x00000000004b9b0a in xgcm_main_loop (ions=@0x7ffcab74cb78: 0x142b0c30, electrons=@0x7ffcab74cc30: 0x14cc7d20, mesh=..., ptcl_dist=..., sml=..., magnetic_field=..., 
    species=0x7ffcab74cd70, electric_field=..., grid=..., poisson=..., profile=..., mesh_render=true, particle_render=true)
    at /hdd1/xgcm/xgcm_kokkos4.2.00/xgcm/test/xgcm_main_loop.cpp:77
#13 0x0000000000445c0f in main (argc=<optimized out>, argv=<optimized out>) at /hdd1/xgcm/xgcm_kokkos4.2.00/xgcm/test/xgcm.cpp:308
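
Frames #6 to #8 show a `Kokkos::TeamPolicy<Kokkos::Serial>` being constructed with `team_size_request=32`, which suggests the host copy still asks for a device-sized team. The sketch below models, in plain C++, the idea of bounding the requested team size by what the backend accepts. `BackendLimits` and `chooseTeamSize` are hypothetical names, and the limit of 1 for a serial backend is an assumption inferred from the abort in frame #6, not a documented value. In Kokkos itself, passing `Kokkos::AUTO` as the team size is one way to let the backend choose; whether that fits `SellCSigma::parallel_for` is not confirmed here.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of a backend's team-size limit. A value of 1 for a serial
// host backend is an assumption based on the abort seen in the stack trace.
struct BackendLimits {
  int maxTeamSize;
};

// Return a team size the backend can accept: never above its limit and
// never below 1.
int chooseTeamSize(int requested, const BackendLimits& limits) {
  assert(limits.maxTeamSize >= 1);
  return std::max(1, std::min(requested, limits.maxTeamSize));
}
```

With these assumptions, `chooseTeamSize(32, BackendLimits{1})` yields 1 for the serial host case, while a device-like limit such as `BackendLimits{1024}` leaves the requested 32 unchanged.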
