RATIS-2251. Migrate ratis-test tests to Junit 5 - Part 3. #1227

Open
wants to merge 3 commits into
base: master

slfan1989
Contributor

@slfan1989 slfan1989 commented Feb 22, 2025

What changes were proposed in this pull request?

This PR upgrades ratis-server's JUnit 4 unit tests to JUnit 5.

What is the link to the Apache JIRA?

JIRA: RATIS-2251. Migrate ratis-test tests to Junit 5 - Part 3.

How was this patch tested?

Junit Test & mvn clean test.

Contributor

@szetszwo szetszwo left a comment

@slfan1989 , thanks a lot for working on this! The change looks good in general. Let's also clean up TestReConfigProperty; see https://issues.apache.org/jira/secure/attachment/13074931/1227_review.patch

BTW, there are test failures. Please take a look.

Comment on lines 205 to 206
Assertions.assertFalse(
exceptionCaught, "received unexpected exception");
Contributor

Let's use fail(..) inside the catch-block instead of assertFalse(..).
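
A minimal sketch of the suggested pattern, written without a JUnit dependency so that it runs on its own. The local `fail` helper is a hypothetical stand-in for JUnit 5's `Assertions.fail(String, Throwable)`, and `riskyOperation` is an assumed placeholder for the code under test; neither comes from this PR. Failing inside the catch block keeps the original exception as the cause, whereas a boolean flag checked later with `assertFalse(..)` discards it.

```java
import java.io.IOException;

final class FailInCatchSketch {
    // Hypothetical stand-in for org.junit.jupiter.api.Assertions.fail(String, Throwable).
    static <T> T fail(String message, Throwable cause) {
        throw new AssertionError(message, cause);
    }

    // Assumed placeholder for the code under test.
    static void riskyOperation(boolean shouldThrow) throws IOException {
        if (shouldThrow) {
            throw new IOException("simulated failure");
        }
    }

    // Flag style: the exception is swallowed and only a boolean survives.
    static boolean flagStyle(boolean shouldThrow) {
        boolean exceptionCaught = false;
        try {
            riskyOperation(shouldThrow);
        } catch (IOException e) {
            exceptionCaught = true;
        }
        // A test would then call assertFalse(exceptionCaught, "received unexpected exception").
        return exceptionCaught;
    }

    // Suggested style: fail immediately and keep the exception as the cause.
    static void failStyle(boolean shouldThrow) {
        try {
            riskyOperation(shouldThrow);
        } catch (IOException e) {
            fail("received unexpected exception", e);
        }
    }
}
```

With the second style, the test report shows the stack trace of the underlying exception rather than only a failed boolean check.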

Contributor Author

@szetszwo Thank you very much for reviewing this PR and providing suggestions! I will continue to improve this PR.

Contributor Author

I need some time to validate the unit tests for the issue and will provide a response as soon as possible.

@szetszwo
Contributor

Not sure why it has IllegalStateException in https://github.com/apache/ratis/actions/runs/13490213066/job/37726235717?pr=1227#step:11:886

Warning:  Could not process start event for test [JUnit Jupiter executed by VM 1]
java.lang.IllegalStateException: Overlapping ID assignment for 
'HashableTestDescriptor{testDescriptor=DefaultTestDescriptor{id=5866381821535424172, 
surefireForkId=1, name='JUnit Jupiter', className='null', parentId=null, displayName='null'}, 
testMojoExecutionId=-6165802485913919921}' (prev: '3968948658746052783').

@slfan1989
Contributor Author

Not sure why it has IllegalStateException in https://github.com/apache/ratis/actions/runs/13490213066/job/37726235717?pr=1227#step:11:886

@szetszwo Thank you for your message. I am currently conducting local tests to identify the cause of the issue.

@szetszwo
Contributor

@slfan1989 , Thanks for the update!

The subclass tests related to RaftAsyncTests and InstallSnapshotNotificationTests keep failing. How about reverting these two files for now and working on them later?

@slfan1989
Contributor Author

@szetszwo Thank you very much for your message! I will attempt to roll back these two classes and continue testing locally. I have upgraded the JUnit dependency and the version of the surefire-plugin. The previous java.lang.IllegalStateException error is no longer occurring, and I will continue testing today.

@slfan1989
Contributor Author

slfan1989 commented Mar 2, 2025

I have carefully reviewed the CI report, and these errors are likely unrelated to the JUnit 5 upgrade. The same errors existed in JUnit 4 as well and can still be reproduced locally.

Let me summarize the current situation regarding the unit test errors.

CI / unit (grpc) / unit (grpc)

Error:  After correcting the problems, you can resume the build with the command
Error:    mvn <args> -rf :ratis-test
org.apache.ratis.grpc.TestInstallSnapshotNotificationWithGrpc
org.apache.ratis.grpc.TestWatchRequestWithGrpc
org.apache.ratis.grpc.TestRaftAsyncWithGrpc
Error: Process completed with exit code 1.
    1. TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot

CI:

[INFO] Running org.apache.ratis.grpc.TestInstallSnapshotNotificationWithGrpc
Error:  Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 50.061 s <<< FAILURE! - in org.apache.ratis.grpc.TestInstallSnapshotNotificationWithGrpc
Error:  org.apache.ratis.grpc.TestInstallSnapshotNotificationWithGrpc.testAddNewFollowersNoSnapshot  Time elapsed: 5.588 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <-1> but was: <16>
	at org.apache.ratis.InstallSnapshotNotificationTests.testAddNewFollowers(InstallSnapshotNotificationTests.java:256)
	at 
....

Local:

....
MiniRaftClusterWithGrpc.java:lambda$null$2(103)) - Checking s1-group-96D8828789EC
2025-03-02 21:50:06,706 [Time-limited test] INFO  impl.MiniRaftCluster (MiniRaftClusterWithGrpc.java:lambda$null$2(103)) - Checking s2-group-96D8828789EC

java.lang.AssertionError: 
Expected :-1
Actual   :16
<Click to see difference>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)
	at org.apache.ratis.InstallSnapshotNotificationTests.testAddNewFollowers(InstallSnapshotNotificationTests.java:256)
	at org.apache.ratis.InstallSnapshotNotificationTests.lambda$testAddNewFollowersNoSnapshot$1(InstallSnapshotNotificationTests.java:182)
	at org.apache.ratis.server.impl.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:144)
	at org.apache.ratis.server.impl.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:121)
	at org.apache.ratis.InstallSnapshotNotificationTests.testAddNewFollowersNoSnapshot(InstallSnapshotNotificationTests.java:182)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.lang.Thread.run(Thread.java:840)
    2. TestWatchRequestWithGrpc#testWatchRequestAsyncChangeLeader

CI

[INFO] Running org.apache.ratis.grpc.TestWatchRequestWithGrpc
Error:  Tests run: 5, Failures: 0, Errors: 1, Skipped: 4, Time elapsed: 163.401 s <<< FAILURE! - in org.apache.ratis.grpc.TestWatchRequestWithGrpc
Error:  org.apache.ratis.grpc.TestWatchRequestWithGrpc.testWatchRequestAsyncChangeLeader  Time elapsed: 163.266 s  <<< ERROR!
java.util.concurrent.TimeoutException: testWatchRequestAsyncChangeLeader() timed out after 100 seconds
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	Suppressed: java.lang.InterruptedException: sleep interrupted

LOCAL

Junit4 Success.
Junit5 Success.

    3. TestRaftAsyncWithGrpc#

CI

[INFO] Running org.apache.ratis.grpc.TestRaftAsyncWithGrpc
Error:  Tests run: 14, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 176.887 s <<< FAILURE! - in org.apache.ratis.grpc.TestRaftAsyncWithGrpc
Error:  org.apache.ratis.grpc.TestRaftAsyncWithGrpc.testWithLoadAsync  Time elapsed: 106.223 s  <<< ERROR!
java.util.concurrent.TimeoutException: testWithLoadAsync() timed out after 100 seconds
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	Suppressed: java.lang.InterruptedException: sleep interrupted

LOCAL

2025-03-02 22:17:48,858 [grpc-default-executor-780] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onError(557)) - s2@group-065D0445974D->s1-GrpcLogAppender is already stopped
2025-03-02 22:17:48,858 [grpc-default-executor-626] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onError(557)) - s0@group-065D0445974D->s1-GrpcLogAppender is already stopped
2025-03-02 22:17:48,858 [grpc-default-executor-786] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onError(557)) - s0@group-065D0445974D->s1-GrpcLogAppender is already stopped

org.junit.runners.model.TestTimedOutException: test timed out after 100 seconds

	.....
	at app//org.apache.ratis.util.TimeDuration.sleep(TimeDuration.java:353)
	at app//org.apache.ratis.util.TimeDuration.sleep(TimeDuration.java:338)
	at app//org.apache.ratis.util.JavaUtils.attempt(JavaUtils.java:236)
	at app//org.apache.ratis.RaftTestUtil.waitForLeader(RaftTestUtil.java:118)

CI / unit (misc) / unit (misc)

Error:  
Error:  After correcting the problems, you can resume the build with the command
Error:    mvn <args> -rf :ratis-test
org.apache.ratis.netty.TestRaftAsyncWithNetty
Error: Process completed with exit code 1.
    1. TestRaftAsyncWithNetty

CI:

[INFO] Running org.apache.ratis.netty.TestRaftAsyncWithNetty
Error:  Tests run: 14, Failures: 1, Errors: 7, Skipped: 0, Time elapsed: 348.303 s <<< FAILURE! - in Error:  

# 1 TestRaftAsyncWithNetty#testStaleReadAsync
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testStaleReadAsync  Time elapsed: 6.909 s  <<< ERROR!
java.util.concurrent.CompletionException: org.apache.ratis.protocol.exceptions.NotLeaderException: Server s0@group-CC3D2906FD90 is not the leader, suggested leader is: s1|localhost:15004
	at org.apache.ratis.client.impl.RaftClientImpl.handleRaftException(RaftClientImpl.java:373)
	at org.apache.ratis.client.impl.OrderedAsync.lambda$send$3(OrderedAsync.java:175)
	at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)

# 2 TestRaftAsyncWithNetty#testStateMachineMetrics
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testStateMachineMetrics  Time elapsed: 1.417 s  <<< ERROR!
java.util.concurrent.ExecutionException: org.apache.ratis.protocol.exceptions.NotLeaderException: Server s0@group-DA17DD87FDE1 is not the leader, suggested leader is: s1|localhost:15016
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)

# 3 TestRaftAsyncWithNetty.testAppendEntriesTimeout
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testAppendEntriesTimeout  Time elapsed: 6.356 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <false> but was: <true>
	at org.apache.ratis.RaftAsyncTests.runTestAppendEntriesTimeout(RaftAsyncTests.java:433)
	at org.apache.ratis.server.impl.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:144)
	at org.apache.ratis.server.impl.MiniRaftCluster$Factory$Get.runWithNewCluster(MiniRaftCluster.java:121)
	at org.apache.ratis.RaftAsyncTests.testAppendEntriesTimeout(RaftAsyncTests.java:413)
	at java.lang.reflect.Method.invoke(Method.java:498)

# 4 TestRaftAsyncWithNetty.testWithLoadAsync
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testWithLoadAsync  Time elapsed: 100.973 s  <<< ERROR!
java.util.concurrent.TimeoutException: testWithLoadAsync() timed out after 100 seconds
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	Suppressed: java.lang.InterruptedException: sleep interrupted

# 5 TestRaftAsyncWithNetty.testRequestTimeout
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testRequestTimeout  Time elapsed: 1.395 s  <<< ERROR!
java.util.concurrent.ExecutionException: org.apache.ratis.protocol.exceptions.NotLeaderException: Server s0@group-2B35970A0B90 is not the leader, suggested leader is: s1|localhost:15088
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
	at org.apache.ratis.RaftBasicTests.testRequestTimeout(RaftBasicTests.java:438)

# 6 TestRaftAsyncWithNetty#testAsyncRequestSemaphore
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testAsyncRequestSemaphore  Time elapsed: 100.379 s  <<< ERROR!
java.util.concurrent.TimeoutException: testAsyncRequestSemaphore() timed out after 100 seconds
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at java.util.ArrayList.forEach(ArrayList.java:1259)

# 7 TestRaftAsyncWithNetty#testNoRetryWaitOnNotLeaderException
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testNoRetryWaitOnNotLeaderException  Time elapsed: 1.636 s  <<< ERROR!
java.util.concurrent.ExecutionException: org.apache.ratis.protocol.exceptions.NotLeaderException: Server s1@group-8CF8D02C8046 is not the leader, suggested leader is: s0|localhost:15120
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
	at org.apache.ratis.util.TimeDuration.apply(TimeDuration.java:313)

# 8 TestRaftAsyncWithNetty#testRequestAsyncWithRetryFailureAfterInitialMessages
Error:  org.apache.ratis.netty.TestRaftAsyncWithNetty.testRequestAsyncWithRetryFailureAfterInitialMessages  Time elapsed: 100.97 s  <<< ERROR!
java.util.concurrent.TimeoutException: testRequestAsyncWithRetryFailureAfterInitialMessages() timed out after 100 seconds
	at java.util.ArrayList.forEach(ArrayList.java:1259)

LOCAL:

(screenshot of the local test results; image not preserved)
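
Several of the Netty traces above show `NotLeaderException` wrapped in either `CompletionException` or `ExecutionException`. That difference comes from standard `CompletableFuture` behaviour: `join()` wraps the failure in `CompletionException`, while `get()` wraps it in `ExecutionException`. The sketch below illustrates unwrapping both to reach the real cause. `NotLeaderStandIn` is a hypothetical stand-in and not the Ratis class, and `unwrap` is a helper assumed only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;

final class UnwrapSketch {
    // Hypothetical stand-in for a "not leader" error; not the Ratis class.
    static final class NotLeaderStandIn extends RuntimeException {
        NotLeaderStandIn(String message) {
            super(message);
        }
    }

    // Strip the wrappers that CompletableFuture adds around the real failure.
    static Throwable unwrap(Throwable t) {
        while ((t instanceof CompletionException || t instanceof ExecutionException)
                && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    // A future that has already failed, as a request sent to a non-leader might.
    static CompletableFuture<String> failedRequest() {
        CompletableFuture<String> future = new CompletableFuture<>();
        future.completeExceptionally(new NotLeaderStandIn("s0 is not the leader"));
        return future;
    }

    // join() reports the failure wrapped in CompletionException.
    static Throwable viaJoin() {
        try {
            failedRequest().join();
            return null;
        } catch (CompletionException e) {
            return unwrap(e);
        }
    }

    // get() reports the same failure wrapped in ExecutionException.
    static Throwable viaGet() {
        try {
            failedRequest().get();
            return null;
        } catch (ExecutionException e) {
            return unwrap(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

A test that expects a specific failure can assert on the unwrapped cause, so it does not depend on whether the client code called `join()` or `get()`.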

@szetszwo
Contributor

szetszwo commented Mar 4, 2025

@slfan1989 , after RATIS-2124 is merged, we need to resolve the conflicts here. Thanks.

@slfan1989
Contributor Author

@slfan1989 , after RATIS-2124 is merged, we need to resolve the conflicts here. Thanks.

@szetszwo Thank you for your message! I will continue to follow up on this PR.

Contributor

@szetszwo szetszwo left a comment

@slfan1989 , thanks a lot for your hard work! Not sure why the unit tests keep failing here but not in other PRs such as #1228 . Let's try updating surefire to see if it makes a difference.

@@ -174,7 +174,7 @@
 <maven-pdf-plugin.version>1.6.1</maven-pdf-plugin.version>
 <maven-remote-resources-plugin.version>3.3.0</maven-remote-resources-plugin.version>
 <maven-shade-plugin.version>3.6.0</maven-shade-plugin.version>
-<maven-surefire-plugin.version>3.0.0-M4</maven-surefire-plugin.version>
+<maven-surefire-plugin.version>3.0.0-M9</maven-surefire-plugin.version>
Contributor

How about trying to update it to the latest version, 3.5.2?
https://maven.apache.org/surefire/maven-surefire-plugin/usage.html

Contributor

@adoroszlai adoroszlai Mar 7, 2025

I tried updating surefire to 3.5.2 (RATIS-2257), but had some problems:

Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.5.2:test (default-test) on project ratis-docs: groups/excludedGroups require TestNG, JUnit48+ or JUnit 5 (a specific engine required on classpath) on project test classpath
Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.5.2:test (default-test) on project ratis-proto: groups/excludedGroups require TestNG, JUnit48+ or JUnit 5 (a specific engine required on classpath) on project test classpath

These modules have skipTests=true, so I don't understand why surefire wants to run tests. Also tried with maven.test.skip=true, which should even skip test compilation, but the problem still happened.

It would be great if you can make it work.

adoroszlai@d53a586
https://github.com/adoroszlai/ratis/actions/runs/13670191390

Contributor

@szetszwo szetszwo Mar 7, 2025

@adoroszlai , thanks for checking! It succeeded when running it locally but it failed in GitHub https://github.com/szetszwo/ratis/actions/runs/13725256050

It may be a problem in our GitHub Actions configuration.

Contributor

The error says it's related to using excludedGroups:

groups/excludedGroups require TestNG, JUnit48+ or JUnit 5 (a specific engine required on classpath) on project test classpath

which is set in CI:

ratis/pom.xml

Line 1122 in d740e51

<excludedGroups>${flaky-test-groups}</excludedGroups>

This should not happen with skipTests, which we set for ratis-docs here:

ratis/ratis-docs/pom.xml

Lines 31 to 32 in d740e51

<!-- no testable code in this module -->
<skipTests>true</skipTests>
