@jjppp jjppp commented Aug 26, 2025

According to lecture 7, Class Hierarchy Analysis (CHA for short) should resolve a virtual invocation such as base.foo() by traversing down the class hierarchy, starting from the type of the receiver variable base.
However, Tai-e currently starts from the declaring class of the referenced method, as in the following snippet:

JClass cls = methodRef.getDeclaringClass();

This can introduce imprecision when the type of the base variable differs from the declaring class of the referenced method. Below is a simplified example from the antlr benchmark program in java-benchmarks; the corresponding source code can be found here:

// antlr.PreservingFileWriter: void close()
void close() {
  Reader source = null;
  source = new BufferedReader(...);
  source.close(); // ?
}

Tir dumped by Tai-e:

void close() {
    java.io.BufferedReader $r36, ...;
    [13@L71] $r36 = new java.io.BufferedReader;
    [53@L97] invokevirtual $r36.<java.io.Reader: void close()>();
}

Note how the types of the variable source and $r36 differ between the source code and the IR.
The variable $r36 in the generated Tir (source in the source code) has type BufferedReader, whereas source is declared as Reader in the source code. Since $r36 can only point to objects whose types are subtypes of BufferedReader, we can eliminate the callees contributed by the other subclasses of Reader.

The difference in variable types arises because bytecode frontends (Soot, at the moment) usually run a precise local type inference algorithm to recover tight type information from the bytecode. Although the variable source has type Reader in the original source code, it is only ever assigned objects of type BufferedReader within the method, so the frontend infers the tighter type. As a result, for an invoke statement recv.foo(), the type of the receiver is always a subtype of the declaring class of foo, and using the type of the receiver variable is always at least as precise as (and sometimes more precise than) using the declaring class of the method reference.
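To make the precision difference concrete, here is a minimal, self-contained sketch of CHA resolution over a toy hierarchy. It is not Tai-e's implementation: the class ChaSketch and the helpers addClass, dispatch, isSubclassOf, and resolve are hypothetical names, and the hierarchy (including which classes override close()) is a simplified assumption loosely modeled on java.io.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of CHA; all names here are hypothetical and not part of Tai-e.
class ChaSketch {
    // Class name -> direct superclass name (null for the root of the toy hierarchy).
    static final Map<String, String> SUPER = new LinkedHashMap<>();
    // Classes assumed to provide a concrete close() implementation.
    static final Set<String> DECLARES_CLOSE = new HashSet<>();

    static void addClass(String name, String superName, boolean declaresClose) {
        SUPER.put(name, superName);
        if (declaresClose) {
            DECLARES_CLOSE.add(name);
        }
    }

    // Assumed hierarchy for illustration only.
    static void buildHierarchy() {
        SUPER.clear();
        DECLARES_CLOSE.clear();
        addClass("Reader", null, false); // close() treated as abstract here
        addClass("BufferedReader", "Reader", true);
        addClass("LineNumberReader", "BufferedReader", false);
        addClass("InputStreamReader", "Reader", true);
        addClass("FileReader", "InputStreamReader", false);
        addClass("StringReader", "Reader", true);
    }

    // Dispatch: walk up from cls to the nearest class with a concrete close().
    static String dispatch(String cls) {
        for (String c = cls; c != null; c = SUPER.get(c)) {
            if (DECLARES_CLOSE.contains(c)) {
                return c + ".close()";
            }
        }
        return null; // no concrete target, e.g. an abstract method
    }

    static boolean isSubclassOf(String cls, String ancestor) {
        for (String c = cls; c != null; c = SUPER.get(c)) {
            if (c.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // CHA: dispatch on the start class and on every one of its subclasses.
    static Set<String> resolve(String start) {
        Set<String> callees = new TreeSet<>();
        for (String cls : SUPER.keySet()) {
            if (isSubclassOf(cls, start)) {
                String target = dispatch(cls);
                if (target != null) {
                    callees.add(target);
                }
            }
        }
        return callees;
    }

    public static void main(String[] args) {
        buildHierarchy();
        // Starting from the declaring class of the method reference.
        System.out.println(resolve("Reader"));
        // Starting from the type of the receiver variable ($r36 in the IR above).
        System.out.println(resolve("BufferedReader"));
    }
}
```

In this toy hierarchy, starting from Reader yields three possible callees (BufferedReader.close(), InputStreamReader.close(), and StringReader.close()), while starting from BufferedReader yields only BufferedReader.close(). The second result is a subset of the first, which mirrors the claim that using the receiver type never loses precision.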

codecov bot commented Aug 26, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.71%. Comparing base (523aec2) to head (64ce5fd).
⚠️ Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...scal/taie/analysis/graph/callgraph/CHABuilder.java | 66.66% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #197      +/-   ##
============================================
+ Coverage     75.67%   75.71%   +0.03%     
- Complexity     4626     4631       +5     
============================================
  Files           480      480              
  Lines         15928    15933       +5     
  Branches       2183     2185       +2     
============================================
+ Hits          12053    12063      +10     
+ Misses         3008     3002       -6     
- Partials        867      868       +1     
| Files with missing lines | Coverage Δ |
|---|---|
| ...scal/taie/analysis/graph/callgraph/CHABuilder.java | 79.45% <66.66%> (+7.39%) ⬆️ |

... and 2 files with indirect coverage changes

@jjppp jjppp marked this pull request as ready for review August 26, 2025 16:28
@jjppp jjppp changed the title from "Use the type of the receiver variable when resolving callees in CHABuilder" to "Improve the precision of CHABuilder via resolving callees using the type of the receiver variable" Sep 3, 2025
@zhangt2333 zhangt2333 changed the title from "Improve the precision of CHABuilder via resolving callees using the type of the receiver variable" to "Improve CHABuilder precision via resolving callees using the type of the receiver variable" Sep 3, 2025
@zhangt2333 zhangt2333 changed the title from "Improve CHABuilder precision via resolving callees using the type of the receiver variable" to "Improve CHABuilder precision via resolving callees using the type of the receiver variable" Sep 3, 2025
@zhangt2333 zhangt2333 merged commit 6e58580 into pascal-lab:master Sep 3, 2025
4 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Sep 3, 2025
@jjppp jjppp deleted the chabuilder branch September 3, 2025 09:16