Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BOLT] Lack of support for noreturn functions results in incorrect CFG generation. #115154

Open
bgergely0 opened this issue Nov 6, 2024 · 4 comments · May be fixed by #115616
Open

[BOLT] Lack of support for noreturn functions results in incorrect CFG generation. #115154

bgergely0 opened this issue Nov 6, 2024 · 4 comments · May be fixed by #115616
Assignees
Labels

Comments

@bgergely0
Copy link

BOLT does not have support for noreturn functions. Because of this, BOLT assumes that the call to the noreturn function will return, and execution will continue, and adds the following BasicBlock to the successors list.

Take this example:

#include <iostream>

__attribute__((noinline, noreturn)) void my_error() {
    std::abort();
}

int main() {

    int a = 10;

    if (a < 20) {
        my_error();
    } else {
        std::cout << a << std::endl;
    }
    return 0;
}

Compile it:

clang++ --target=aarch64-linux-gnu -Wl,--emit-relocs main.cpp -o noreturn_main

(I am compiling for aarch64, but the issue is the same regardless of the architecture).

Looking at the BasicBlocks:
BB containing the call to my_error

   00000000:   bl  _Z8my_errorv

The following BB, which is in the successors list:

    00000000:   ldr     w1, [sp, #0x8]
    00000000:   adrp    x0, __BOLT_got_zero+65536
    00000000:   ldr     x0, [x0, :lo12:__BOLT_got_zero+4032]
    00000000:   bl      _ZNSolsEi@PLT
    00000000:   adrp    x1, __BOLT_got_zero+65536
    00000000:   ldr     x1, [x1, :lo12:__BOLT_got_zero+4016]
    00000000:   bl      _ZNSolsEPFRSoS_E@PLT
    00000000:   b       .Ltmp7

This is after the call to my_error, but it should not be in the successors list.

We can also look at the disassembly of the whole main function:

0000000000000a60 <main>:
 a60:   d10083ff        sub     sp, sp, #0x20
 a64:   a9017bfd        stp     x29, x30, [sp, #16]
 a68:   910043fd        add     x29, sp, #0x10
 a6c:   b81fc3bf        stur    wzr, [x29, #-4]
 a70:   52800148        mov     w8, #0xa                        // #10
 a74:   b9000be8        str     w8, [sp, #8]
 a78:   b9400be8        ldr     w8, [sp, #8]
 a7c:   71005108        subs    w8, w8, #0x14
 a80:   5400006a        b.ge    a8c <main+0x2c>  // b.tcont.    // <---- jump to BB after the BL my_error
 a84:   14000001        b       a88 <main+0x28>
 a88:   97fffff3        bl      a54 <_Z8my_errorv>
 a8c:   b9400be1        ldr     w1, [sp, #8]
 a90:   90000080        adrp    x0, 10000 <__FRAME_END__+0xf3cc>
 a94:   f947e000        ldr     x0, [x0, #4032]
 a98:   97ffff7a        bl      880 <_ZNSolsEi@plt>
 a9c:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf3cc>
 aa0:   f947d821        ldr     x1, [x1, #4016]
 aa4:   97ffff67        bl      840 <_ZNSolsEPFRSoS_E@plt>
 aa8:   14000001        b       aac <main+0x4c>
 aac:   2a1f03e0        mov     w0, wzr
 ab0:   a9417bfd        ldp     x29, x30, [sp, #16]
 ab4:   910083ff        add     sp, sp, #0x20
 ab8:   d65f03c0        ret

We can see that the BB starting at a8c is the target of a conditional jump, and it is placed under the BL by the compiler on the assumption, that my_error does not return.

@github-actions github-actions bot added the BOLT label Nov 6, 2024
@bgergely0
Copy link
Author

@kbeyls FYI

@kbeyls
Copy link
Collaborator

kbeyls commented Nov 6, 2024

@elhewaty, I believe you're working on a fix for this?

@elhewaty
Copy link
Member

elhewaty commented Nov 6, 2024

Yes, I am working on this, Sorry for my late reply. I was thinking of using CallGraph class and CallGraphWalker class to build a pass that mark all no-return functions and cash the results in some std::unordered_map.

Here are some thoughts:
1 - Do not build the graph if the no-return functions were not called [__cxa_throw@PLT _Unwind_Resume@PLT __cxa_rethrow@PLT exit@PLT abort@PLT] for ELF at least. I didn't find an something like getBinaryFunction(string &) ] in the Interface, so I think I should iterate over all functions, so this is not useful.

2 - using the call graph and the call graph walker we can start traversing the call graph in any order(I guess), although the the traversal in the walker is topological, but it's suffecient enough for this purpose as it has a worklist that keeps track of the caller function if the callee is later marked as a no-return.

3 - As mentioned, the walker visits every node to ensure some property using the callback function. we can make the callback function a DFS traversal and go from it to see if all it's successors end with a no-return call and mark it as a no-return [I think using this will make updating the worklist unnecessary, as we will visit the leaves of the graph at the first node and cash the results of each node, so when we traverse a visited node from non-visited one the result was cached from before so the traversing stops]. if we go with this we will not need the walker -I guess-

the code will be something like this:

unordered_map<Node, uint8_t> color; // 0 --> not visited, 1 ---> still there (cycle), 2 ---> traversed
unordered_map<Node, bool> memo;
bool dfs(Node cur) {
  if (isNoReturn(cur)) return true;
  if (memo.count(cur)) return memo[cur];
  if (isLeaf(cur)) return false;
  memo[cur] = 1; // assume it's no-return
  color[cur] = 1;
  for (every successor S of cur) {
    if (!color.count(S))
       memo[cur] &= dfs(S);
  }
  return memo[cur];
}

// pass
for every node N in the CG
  dfs(N)

So what do you think?

@elhewaty elhewaty self-assigned this Nov 6, 2024
@bgergely0
Copy link
Author

bgergely0 commented Nov 7, 2024

Some thoughts:

  • fixing this is made of two different steps: working out which functions are noreturn, then modifying CFG to account for these calls.
  • because of this, I think it would be nice to store isNoReturn on the BinaryFunctions themselves (instead of a cache), so the information can be used in any other way as well later on

I didn't find an something like getBinaryFunction(string &) ] in the Interface

I have also looked at this, and found we can get the BinaryFunction that is used in a call this way:

const MCSymbol* TargetSymbol = BC.MIB->getTargetSymbol(Inst); // Inst is a call
BinaryFunction* TargetFunction = BC.SymbolToFunctionMap[TargetSymbol];

we can make the callback function a DFS traversal and go from it to see if all it's successors end with a no-return call and mark it as a no-return

So are you saying, if a function only calls noreturn functions then itself is noreturn as well?

If so, I agree. But for leaf functions we need to check that they have no returns (obviously), and I think they should end in an instruction that is a terminator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants