Why are many syntax nodes not used in java.g4? #4418

Open
AntJiuFo opened this issue Feb 18, 2025 · 3 comments

@AntJiuFo

AntJiuFo commented Feb 18, 2025

I used the g4 files in the project's java directory, which I assume are the official, complete version. I ran them against all the Java files in the examples, including AllInOne7.java, AllInOne8.java, AllInOne11.java, and AllInOne17.java. However, it is found that nearly half of the nodes in g4, such as ModuleBody and LambdaParameters, have not been visited. What is the reason?

@kaby76
Contributor

kaby76 commented Feb 18, 2025

First, let's test that statement ("[I]t is found that nearly half of the nodes in g4, such as ModuleBody and LambdaParameters, have not been visited.") using the Trash Toolkit and the java/java grammar. NB: There is no Java.g4 anywhere in grammars-v4, so I don't really know which grammar and test suite you are referring to.

$ for i in `ls ../examples/*.java | grep -v ManyStringsConcat.java`; do trparse $i | trxgrep ' //(methodBody | lambdaParameters)' | trtext -c; done
CSharp 0 ../examples/AllInOne11.java success 0.2243314
32
CSharp 0 ../examples/AllInOne17.java success 0.4238909
53
CSharp 0 ../examples/AllInOne7.java success 0.2904277
39
CSharp 0 ../examples/AllInOne8.java success 0.3027744
39
CSharp 0 ../examples/ConsecutiveSemicolons.java success 0.0323889
0
CSharp 0 ../examples/Escapes.java success 0.0483005
1
CSharp 0 ../examples/ExpressionOrder.java success 0.0796696
4
CSharp 0 ../examples/GenericConstructor.java success 0.0652121
1
CSharp 0 ../examples/LocalVariableDeclaration.java success 0.0727948
1
CSharp 0 ../examples/module-info.java success 0.0282214
0
CSharp 0 ../examples/RecordsTesting.java success 0.0680149
0
CSharp 0 ../examples/SwitchExpression.java success 0.1214815
1
CSharp 0 ../examples/TryStatements.java success 0.0942388
1
02/18-07:56:16 ~/issues/g4-current/java/java/Generated-CSharp

(NB: trxgrep has stack overflow for ManyStringsConcat.java, so I removed that from the test. kaby76/Trash#540)

So, 10 out of 13 test files contain either methodBody or lambdaParameters parse tree nodes.
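
That tally can also be checked mechanically rather than by eye. The following is a minimal sketch, not part of the original test run: it assumes the output keeps the two-line shape shown above (a `CSharp ... success` status line followed by a bare count), and the function name `tally_matches` is made up for illustration. The sample input is trimmed to three of the files above.

```shell
# Hypothetical helper: tally status-line/count pairs read from files or stdin.
tally_matches() {
  awk '
    # Status line: remember which file the next count belongs to.
    $1 == "CSharp" { file = $3; next }
    # Bare number: the count of matching nodes for that file.
    /^[0-9]+$/ {
      total++
      if ($1 > 0) { hits++ } else { print "no matches: " file }
    }
    END { printf "%d of %d files contain matches\n", hits, total }
  ' "$@"
}

# Three entries copied from the output above.
tally_matches <<'EOF'
CSharp 0 ../examples/AllInOne11.java success 0.2243314
32
CSharp 0 ../examples/module-info.java success 0.0282214
0
CSharp 0 ../examples/Escapes.java success 0.0483005
1
EOF
# prints:
#   no matches: ../examples/module-info.java
#   2 of 3 files contain matches
```

Piping the full loop output through the same function would list the three zero-count files and report the overall ratio.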

Next, I added an Antlr Listener to my driver program that tracks the count of (methodBody | lambdaParameters) nodes, and reran the test.

$ cat Class1.cs
using System;
using System.Reflection;
using Antlr4.Runtime.Misc;

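// Listener that counts every methodBody and lambdaParameters node
// entered during the parse tree walk.
public class Class1 : JavaParserBaseListener
{
    public int count;

    public Class1()
    {
    }

    // Called once for each methodBody node.
    public override void EnterMethodBody([NotNull] JavaParser.MethodBodyContext context)
    {
        this.count++;
    }

    // Called once for each lambdaParameters node.
    public override void EnterLambdaParameters([NotNull] JavaParser.LambdaParametersContext context)
    {
        this.count++;
    }
}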
02/18-08:31:36 ~/issues/g4-current/java/java/Generated-CSharp
$ for i in `ls ../examples/*.java | grep -v ManyStringsConcat.java`; do ./bin/Debug/net8.0/Test.exe $i; done
CSharp 0 ../examples/AllInOne11.java success 0.1817295
32
Total Time: 0.2665532
CSharp 0 ../examples/AllInOne17.java success 0.3518479
53
Total Time: 0.4398401
CSharp 0 ../examples/AllInOne7.java success 0.2332528
39
Total Time: 0.3209089
CSharp 0 ../examples/AllInOne8.java success 0.305218
39
Total Time: 0.3885185
CSharp 0 ../examples/ConsecutiveSemicolons.java success 0.0298602
0
Total Time: 0.1000413
CSharp 0 ../examples/Escapes.java success 0.0499986
1
Total Time: 0.1236674
CSharp 0 ../examples/ExpressionOrder.java success 0.0902961
4
Total Time: 0.1721068
CSharp 0 ../examples/GenericConstructor.java success 0.0683203
1
Total Time: 0.1419784
CSharp 0 ../examples/LocalVariableDeclaration.java success 0.0772038
1
Total Time: 0.1516898
CSharp 0 ../examples/module-info.java success 0.0284661
0
Total Time: 0.0980746
CSharp 0 ../examples/RecordsTesting.java success 0.070418
0
Total Time: 0.1450204
CSharp 0 ../examples/SwitchExpression.java success 0.0674109
1
Total Time: 0.1419133
CSharp 0 ../examples/TryStatements.java success 0.0653567
1
Total Time: 0.1396325
02/18-08:31:48 ~/issues/g4-current/java/java/Generated-CSharp

The results are the same: every file reports the same count as in the trxgrep run. I suspect a version that uses Antlr Visitors, if properly implemented, would give the same counts.
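
One way to confirm that the two runs agree, without comparing the numbers by hand, is to reduce each log to file/count pairs and compare those. This is only a sketch under assumptions: the helper name `extract_counts` is hypothetical, and the inputs are trimmed copies of the two outputs above. Timings and `Total Time:` lines are ignored because they differ from run to run.

```shell
# Hypothetical helper: reduce a log to "file count" pairs.
extract_counts() {
  awk '
    # Status line: remember the file name, skip the timing.
    $1 == "CSharp" { file = $3; next }
    # Bare number: emit the pair. "Total Time:" lines match neither rule.
    /^[0-9]+$/     { print file, $1 }
  ' "$@"
}

# Two entries from the trxgrep run.
trxgrep_counts=$(extract_counts <<'EOF'
CSharp 0 ../examples/AllInOne11.java success 0.2243314
32
CSharp 0 ../examples/Escapes.java success 0.0483005
1
EOF
)

# The same two entries from the listener run.
listener_counts=$(extract_counts <<'EOF'
CSharp 0 ../examples/AllInOne11.java success 0.1817295
32
Total Time: 0.2665532
CSharp 0 ../examples/Escapes.java success 0.0499986
1
Total Time: 0.1236674
EOF
)

if [ "$trxgrep_counts" = "$listener_counts" ]; then
  echo "counts match"
else
  echo "counts differ"
fi
# prints: counts match
```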

I also tested this using trcover and found that 99% of the nodes are used across the entire test suite. Take this file, cover.html.txt, rename it to cover.html, open it in a browser, and scroll to the end of the page.

I generally do not use Antlr4 Visitors or Listeners. They are essentially the equivalent of assembly language programming for parsing.

@AntJiuFo
Author

Thank you. I'll try that.

@HanzoDev1375

@kaby76 This may seem like a strange request, but could you create a Python code formatter for me, with a parser and a code generator, in two ports: one for C# and one for Java? I ask because I haven't mastered Java yet.
