SRE-3704 ci: adapt NLT pipeline to cover Fault Injection testing procedure. #516
Open
Fix NullPointerException when allowEmptyArchive receives null
from missing ignore_failure key
When unitTest() throws inside afterTest(), the results_map stash written
by runTest() (which lacks the ignore_failure key) is the only stash
available to unitTestPost(). Reading results['ignore_failure'] on a Map
with no such key returns null in Groovy. Passing null to
archiveArtifacts(allowEmptyArchive: null) causes a NullPointerException
in Java reflection (unboxBoolean) because ArtifactArchiver.allowEmptyArchive
is a primitive boolean.
Root cause fix (unitTest.groovy):
- Wrap afterTest() in a try/finally block so that the updated stash
containing ignore_failure is always written, even if afterTest() throws.
Defensive fix (unitTestPost.groovy):
- Replace all results['ignore_failure'] accesses with
results.get('ignore_failure', false) to guard against any future
code path where the stash is incomplete.
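
The two fixes above can be illustrated with a minimal plain-Groovy sketch. It is not the pipeline-lib code: `finishTest`, `incomplete`, and the map contents are hypothetical stand-ins, and a plain Map stands in for the results_map stash.

```groovy
// Stand-in for the stash written by runTest(): no ignore_failure key.
Map incomplete = [result_code: 0]

// Subscript access on a missing key yields null, not false.
assert incomplete['ignore_failure'] == null

// Groovy's two-argument get() returns the default when the key is absent
// (and also stores that default in the map).
boolean allowEmpty = incomplete.get('ignore_failure', false)

// Hypothetical stand-in for the end of unitTest(): the key is written in
// `finally`, so it is recorded even when the post-test hook throws.
void finishTest(Map stash, boolean ignoreFailure, Closure afterTest) {
    try {
        afterTest()
    } finally {
        stash['ignore_failure'] = ignoreFailure
    }
}

Map stash = [result_code: 0]
try {
    finishTest(stash, false) { throw new IllegalStateException('afterTest failed') }
} catch (IllegalStateException ignored) {
    // The failure still propagates; only the stash update is guaranteed.
}
```

After this runs, `allowEmpty` is a real boolean rather than null, and `stash` contains the ignore_failure key even though the hook threw.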
Both 'NLT' and 'NLT Fault injection testing' stages have their stage_info['NLT'] set to true by parseStageInfo() because both stage names contain the string 'NLT'. As a result, unitTestPost() calls recordIssues with the hardcoded id: 'VM_test' for both stages in the same build, causing:
IllegalStateException: ID VM_test is already used by another action
Fix by replacing the hardcoded 'VM_test' with sanitizedStageName() + '_VM_test', which produces a unique ID per stage (e.g. 'NLT_VM_test' and 'NLT_Fault_injection_testing_VM_test').
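
A rough sketch of the ID derivation, assuming the sanitizer replaces each run of non-alphanumeric characters with an underscore. `sanitize` and `issuesId` are hypothetical stand-ins; the real sanitizedStageName() in pipeline-lib reads the stage name from the build and may differ in detail.

```groovy
// Hypothetical stand-in for sanitizedStageName(), taking the name explicitly.
String sanitize(String stageName) {
    return stageName.replaceAll(/[^A-Za-z0-9]+/, '_')
}

// Build the per-stage recordIssues id instead of the shared 'VM_test'.
String issuesId(String stageName) {
    return sanitize(stageName) + '_VM_test'
}

List<String> ids = ['NLT', 'NLT Fault injection testing'].collect { issuesId(it) }
```

Under that assumption the two stages get distinct IDs, which is what avoids the "already used by another action" error when both run in one build.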
When a stage runs with memcheck disabled (e.g. NLT Fault injection testing uses '--memcheck no'), no *memcheck.xml files are created. The fileOperations copy step copies 0 files so the target directory is never created, and the unconditional tar command fails with:
tar: <dir>: Cannot stat: No such file or directory
Guard the tar with fileExists() so it is only executed when the memcheck directory was actually populated.
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Priority: 2
Cancel-prev-build: false
Skip-python-bandit: true
Skip-unit-test-memcheck: true
Skip-func-vm-all: true
Skip-test-el-9-rpms: true
Skip-test-leap-15-rpms: true
Skip-func-hw-test: true
Skip-build-el8-gcc: true
Skip-build-leap15-gcc: true
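
The guard can be sketched outside Jenkins as follows. This is an illustration only: java.io.File stands in for the fileExists() pipeline step, the returned command string stands in for the sh step that runs tar, and `memcheckTarCommand` plus the directory and tarball names are hypothetical.

```groovy
import java.nio.file.Files

// Return the tar command only when the memcheck directory exists;
// null means the copy step produced nothing and tar must be skipped.
String memcheckTarCommand(File memcheckDir, String tarball) {
    if (!memcheckDir.isDirectory()) {
        return null
    }
    return "tar -C ${memcheckDir.parent} -cjf ${tarball} ${memcheckDir.name}".toString()
}

File workDir = Files.createTempDirectory('memcheck-demo').toFile()
File memcheckDir = new File(workDir, 'unit_test_memcheck')

// Memcheck disabled: the directory was never created, so no command.
String skipped = memcheckTarCommand(memcheckDir, 'results.tar.bz2')

// Memcheck enabled: the directory exists, so a command is produced.
memcheckDir.mkdirs()
String command = memcheckTarCommand(memcheckDir, 'results.tar.bz2')

workDir.deleteDir()
```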
Add a new config['nlt_name'] parameter to unitTestPost() to allow callers to override the display name used for the NLT recordIssues section in the Jenkins UI. Defaults to 'Node local testing' to keep existing behaviour for the NLT stage.
The NLT Fault injection testing stage passes nlt_name: 'Fault injection issues' so its warnings section is clearly distinguished from the plain NLT stage in the Jenkins UI.
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Priority: 2
Cancel-prev-build: false
Skip-python-bandit: true
Skip-unit-test-memcheck: true
Skip-func-vm-all: true
Skip-test-el-9-rpms: true
Skip-test-leap-15-rpms: true
Skip-func-hw-test: true
Skip-build-el8-gcc: true
Skip-build-leap15-gcc: true
The memcheck tarball (${stage}_memcheck_results.tar.bz2) is only created
by unitTest.groovy when memcheck files actually exist (guarded by
fileExists(memcheck_dir)). However, unitTestPost.groovy added the tarball
to artifact_list unconditionally, and artifact_list is archived with
allowEmptyArchive tied to ignore_failure. For the NLT Fault injection
testing stage ignore_failure=false, so archiveArtifacts throws:
No artifacts found that match the file pattern
"NLT_Fault_injection_testing_memcheck_results.tar.bz2".
Configuration error?
Archive the memcheck tarball directly with allowEmptyArchive: true instead
of adding it to artifact_list, so it is silently skipped when no memcheck
files were produced.
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Priority: 2
Cancel-prev-build: false
Skip-python-bandit: true
Skip-unit-test-memcheck: true
Skip-func-vm-all: true
Skip-test-el-9-rpms: true
Skip-test-leap-15-rpms: true
Skip-func-hw-test: true
Skip-build-el8-gcc: true
Skip-build-leap15-gcc: true
parseStageInfo: detect 'NLT fault injection' stage separately and set FI=true in addition to NLT=true, leaving the regular NLT path unchanged (with valgrind enabled).
unitTest/unitTestPost: remove NLT from the valgrind check condition since fault injection runs with --memcheck no and produces no memcheck files.
unitTestPost: when FI=true, add nlt-client-leaks.json as a second tool to the recordIssues call alongside the existing vm_test/nlt-errors.json.
skipStage: add 'NLT Fault injection testing' case to match the stage name used in the Jenkinsfile.
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
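
A simplified sketch of the flag logic described above. `stageFlags` and `nltReports` are hypothetical helpers, not the pipeline-lib functions; the real parseStageInfo() derives many more fields, and the real unitTestPost() passes tool objects to recordIssues rather than plain file names.

```groovy
// Case-insensitive detection: FI is set in addition to NLT, never instead of it.
Map stageFlags(String stageName) {
    String lower = stageName.toLowerCase()
    Map info = [NLT: false, FI: false]
    if (lower.contains('nlt')) {
        info['NLT'] = true
        if (lower.contains('fault injection')) {
            info['FI'] = true
        }
    }
    return info
}

// Report files to hand to recordIssues: one for plain NLT, two when FI is set.
List<String> nltReports(Map info) {
    List<String> reports = ['vm_test/nlt-errors.json']
    if (info['FI']) {
        reports << 'nlt-client-leaks.json'
    }
    return reports
}

Map plain = stageFlags('NLT')
Map fi = stageFlags('NLT Fault injection testing')
Map legacy = stageFlags('Fault injection testing')
```

In this sketch the existing 'Fault injection testing' stage sets neither flag, so only the new stage name takes the FI path.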
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
This reverts commit df0bd59.
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Doc-only: true
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Signed-off-by: Tomasz Gromadzki <tomasz.gromadzki@hpe.com>
Contributor
daltonbohning
left a comment
It appears this goes along with daos-stack/daos#17953.
So we are effectively going to have different handling for this stage based on the release branch? Since the stage name is different, pipeline-lib will treat it differently.
This PR introduces logic that simplifies Fault Injection testing stage setup in the Jenkinsfile.
Requires:
NLT Fault Injection testing is fully implemented using the unitTest/unitTestPost Groovy procedures from pipeline-lib.
The stage is given a new name (NLT Fault Injection testing) so it is not confused with the existing Fault injection testing stage.
parseStageInfo.groovy:
- Detect the NLT Fault Injection stage by name (case-insensitive) and set the FI flag.
- When FI is set, skip valgrind configuration; NLT FI runs with --memcheck no and does not produce memcheck XML files.
- When FI is not set (plain NLT), keep the existing valgrind_pattern/with_valgrind setup unchanged.
unitTest.groovy:
- Pass environment: "VM_CPUS=20" to provisionNodes for all NLT stages; the NLT test suite requires at least 20 CPU cores to run reliably.
- Wrap afterTest() in a try/finally block so ignore_failure is always written to the stash even if afterTest() throws. Without this, unitTestPost() reads the earlier stash from runTest(), which lacks the ignore_failure key, causing allowEmptyArchive: null and a NullPointerException in ArtifactArchiver.
- Remove config['NLT'] from the Valgrind check condition; NLT FI does not run with memcheck, so the Valgrind path should only be taken when with_valgrind is set.
- Guard the memcheck tar command with a fileExists() check to avoid a shell error when the copy step produced no output directory.
unitTestPost.groovy:
- Replace results['ignore_failure'] map accesses with .get('ignore_failure', false) to prevent an NPE when the key is absent in the stash.
- Replace the single tool: with a tools: list (nltTools) so the Fault Injection stage can report both nlt-errors.json and nlt-client-leaks.json as separate issue sources under recordIssues.
- Set the recordIssues name: dynamically to 'Fault injection' or 'NLT' based on the FI flag, replacing the hard-coded 'Node local testing' label.
- Archive the memcheck tarball directly with archiveArtifacts rather than appending it to the deferred artifact list, consistent with how other binary artifacts are handled.
- Remove config['NLT'] from the condition guarding the with_valgrind block; same rationale as unitTest.
skipStage.groovy:
- Add 'NLT Fault injection testing' as a recognized case so the new stage name is handled identically to 'Fault injection testing' when evaluating skip conditions.