Skip to content

Passing strings requires passing their lengths as well #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
MakisH opened this issue May 11, 2020 · 7 comments · May be fixed by #26
Open

Passing strings requires passing their lengths as well #1

MakisH opened this issue May 11, 2020 · 7 comments · May be fixed by #26
Labels
enhancement New feature or request

Comments

@MakisH
Copy link
Member

MakisH commented May 11, 2020

Only in the module variant of our Fortran bindings, we always need to pass the (exact, or sometimes large enough) size of the strings, e.g. for participant name, data name, names of actions, etc. See, for example, how we create the interface:

CALL precicef_create(participantName, config, rank, commsize, 50, 50)

The last two arguments (50, 50) refer to the maximum sizes of participantName and config.

This is not only ugly, but can also be dangerous in some cases.

Compare with the solverdummy using the preCICE-internal fortran bindings:

CALL precicef_create(participantName, config, rank, commsize)
@MakisH MakisH added the enhancement New feature or request label May 11, 2020
@MakisH
Copy link
Member Author

MakisH commented May 11, 2020

We also follow the same style in the ateles adapter (in-house):

  subroutine tem_precice_create( rank, comm_size )
    ! -------------------------------------------------------------------- !
    integer, intent(in) :: rank
    integer, intent(in) :: comm_size
    ! -------------------------------------------------------------------- !
    character(len=labelLen, kind=c_char) :: c_accessor
    character(len=labelLen, kind=c_char) :: c_precice_configFile
    integer(kind=c_int) :: c_my_rank
    integer(kind=c_int) :: c_my_size
    integer(kind=c_int) :: c_accessorNameLength
    integer(kind=c_int) :: c_configFileNameLength
    ! -------------------------------------------------------------------- !

    c_accessor = precice%accessor
    c_precice_configFile = precice%precice_configFile
    c_my_rank = int(rank, kind=c_int)
    c_my_size = int(comm_size, kind=c_int)
    c_accessorNameLength = len_trim(precice%accessor)
    c_configFileNameLength = len_trim(precice%precice_configFile)

    call precicef_create( participantName      = c_accessor,            &
      &                   configFileName       = c_precice_configFile,  &
      &                   solverProcessIndex   = c_my_rank,             &
      &                   solverProcessSize    = c_my_size,             &
      &                   lengthAccessorName   = c_accessorNameLength,  &
      &                   lengthConfigFileName = c_configFileNameLength )

  end subroutine tem_precice_create

@MakisH
Copy link
Member Author

MakisH commented May 11, 2020

In the preCICE-internal Fortran bindings, we automatically strip the length using our own method for this:

int precice::impl::strippedLength(
    const char *string,
    int         length)
{
  int i = length - 1;
  while (((string[i] == ' ') || (string[i] == 0)) && (i >= 0)) {
    i--;
  }
  return i + 1;
}

@MakisH
Copy link
Member Author

MakisH commented Nov 15, 2022

We could make these as default arguments.

@ivan-pi
Copy link
Collaborator

ivan-pi commented Feb 8, 2024

There are two modes of interoperability,

  • the standardized route using the bind(c) attribute (recommended)
  • the non-standard route which depends on compiler-specific conventions (i.e. name mangling, hidden arguments)

Non-standard route

In practice, a procedure may have an implicit interface, e.g.

       external precicef_create
       call precicef_create("myName","myConfig",idx,sz)

Here the compiler doesn't perform any interface checking for type-correctness or even the number of arguments. Perhaps the linker may complain.

There could be an explicit interface, declared either on the spot, included from a file, or imported from a module:

interface
   ! This procedure could be implemented in Fortran as an external subroutine, 
   ! or some any other language as long as it obeys the expected calling convention
   subroutine precicef_create(participantName,configFileName,solverProcessIndex,solverProcessSize)
      character(kind=c_char,len=*), intent(in) :: participantName
      character(kind=c_char,len=*), intent(in) :: configFileName
      integer(c_int), intent(in) :: solverProcessIndex
      integer(c_int), intent(in) :: solverProcessSize
   end subroutine
end interface

The argument passing conventions of Fortran are described here: https://gcc.gnu.org/onlinedocs/gcc-10.3.0/gfortran/Argument-passing-conventions.html

For Intel Fortran they are found here: https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2024-0/mixed-language-issues.html. On Intel Fortran the rules differ between Linux and Windows, and may even be influenced by compiler flags. In other words trying to support all of them is a mess.

Standard Route

If you use the standard route, you can rely on the C API

PRECICE_API void precicec_createParticipant(
    const char *participantName,
    const char *configFileName,
    int         solverProcessIndex,
    int         solverProcessSize);

This procedure will have the Fortran interface:

interface
   subroutine precicec_createParticipant( &
        participantName,configFileName,solverProcessIndex,solverProcessSize) &
        bind(c,name="precicec_createParticipant")
      use, intrinsic :: iso_c_binding, only: c_char, c_int
      character(kind=c_char), intent(in) :: participantName(*)   ! null-terminated
      character(kind=c_char), intent(in) :: configFileName(*)  ! null-terminated
      integer(c_int), value :: solverProcessIndex
      integer(c_int), value :: solverProcessSize
   end subroutine
end interface

For convenience, you could wrap this procedure in Fortran, so that users don't need to null-terminate the strings themselves. You would place this procedure in a module:

module precice
! ...
contains

subroutine precicef_createParticipant(participantName,configFileName,solverProcessIndex,solverProcessSize)
      character(len=*), intent(in) :: participantName
      character(len=*), intent(in) :: configFileName
      integer(c_int), intent(in) :: solverProcessIndex
      integer(c_int), intent(in) :: solverProcessSize

     call precicec_createParticipant( &
        trim(participantName)//c_null_char, &
        trim(configFileName)//c_null_char, &
        solverProcessIndex, solverProcessSize)
        
end subroutine

end module precice

Alternatively, since Fortran 2018, you could also create the binding in C or C++:

/* Since F2018, there is a C header that provides a "descriptor",
   which is an opaque C struct, that can be used to interoperate 
   with Fortran arrays and scalars */
   
#include <ISO_Fortran_binding.h>

extern "C"
void precicef_createParticipant(
    CFI_cdesc_t *pName, CFI_cdesc_t *pIndex,
    int* solverProcessIndex, int* solverProcessSize) {
    
    std::string_view name{pName->base, pName->elem_len};
    std::string_view index{pIndex->base, pIndex->elem_len};
    
    // ... 
}

This procedure would have the same interface as the wrapper above, with the bind(c,name="precicef_createParticipant") attribute on top. The assumed-length character scalars get passed to the C procedure as the CFI_cdesc_t * type.

@MakisH
Copy link
Member Author

MakisH commented Mar 19, 2024

So, if I understand correctly, an easy and standard way to solve this would be to almost duplicate all the precicef_foo(a, b, lengthA, lengthB) with wrapper functions precicec_foo(a, b) that automatically trim the zeros.

As most of the API now uses mesh names, data names, or similar, that would be quite some duplication, but it might be worth it.

Restricting our users to Fortran 2018 could scare them away, while I would avoid compiler-specific solutions.

(thank you for your very detailed suggestions! I am just assessing them out loud for myself)

@ivan-pi
Copy link
Collaborator

ivan-pi commented Mar 19, 2024

Yes, however it's not about trimming zeros, it's about appending the null character. C has the convention that strings are null terminated, whereas in Fortran, the implementations typically store a buffer and the length separately.

In the existing implementation you are using the de facto standard name mangling and linkage conventions (but not standard from the language point of view), that makes the procedures work as external subroutines (a (dis)service to the die-hard Fortran users which stick to old obsolescent conventions (fixed-form, implicit typing, external procedures), but that's just my opinion at the end of the day).

@ivan-pi
Copy link
Collaborator

ivan-pi commented Mar 19, 2024

I'm afraid I just made things for confusing. In summary, the options are:

  1. use the non-standard route (but at least offer interfaces for argument checking)
  2. implement the procedures in Fortran (as a thin wrapper layer of a C-compatible binding), or
  3. use Fortran 2018 enhanced interoperability (ISO_Fortran_binding.h)

In all three cases the ABI in Fortran will be compiler-specific (just in slightly different ways).

At the moment, preciceFortran.cpp is written as if you were aiming for option 1, however the Fortran module is a 50 % take on option 2.

Option 3 would impose the requirement of at least gfortran 9 (released in 2019), Intel Fortran Compiler 19.0 (released in 2019 and already deprecated as a product, replaced by the Intel LLVM compiler), and nvfortran doesn't support it at all (but is looking to replace it soon with flang-new).

@MakisH MakisH linked a pull request Mar 19, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants