Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SEGVs and gtest failures on ARM Mac #11717

Closed
alanpaxton opened this issue Aug 18, 2023 · 4 comments
Closed

SEGVs and gtest failures on ARM Mac #11717

alanpaxton opened this issue Aug 18, 2023 · 4 comments
Assignees

Comments

@alanpaxton
Copy link
Contributor

alanpaxton commented Aug 18, 2023

We are bringing the cmake build for Mac up to date on ARM.

See PRs #11633 and #11543 which are the WIP cmake CI work.

We discovered that prefetch_test fails (SEGVs and test failures) consistently when run under cmake on ARM Macs, in CI and on an M1 Macbook Pro - this would appear to be a RocksDB/ARM issue.

I notice there have been some fairly recent changes to these tests by @hx235 @ajkr - that may or may not be related
in #11631

Expected behavior

Tests pass, no SEGVs.

Actual behavior

SEGVs and failed tests. ctest.log

Steps to reproduce the behavior

mkdir build
(cd build && cmake -DWITH_GFLAGS=1 ..)
(cd build && make clean)
(cd build && make V=1 -j16)

Now cut down the number of tests to run by editing CTestTestfile.cmake to just include the prefetch_test

# CMake generated Testfile for 
# Source directory: /Users/alan/swProjects/evolvedBinary/rocksdb-evolved
# Build directory: /Users/alan/swProjects/evolvedBinary/rocksdb-evolved/build
# 
# This file includes the relevant testing commands required for 
# testing this directory and lists subdirectories to be tested as well.
include("/Users/alan/swProjects/evolvedBinary/rocksdb-evolved/build/prefetch_test[1]_include.cmake")
add_test(c_test "/Users/alan/swProjects/evolvedBinary/rocksdb-evolved/build/c_test")
set_tests_properties(c_test PROPERTIES  _BACKTRACE_TRIPLES "/Users/alan/swProjects/evolvedBinary/rocksdb-evolved/CMakeLists.txt;1515;add_test;/Users/alan/swProjects/evolvedBinary/rocksdb-evolved/CMakeLists.txt;0;")
subdirs("third-party/gtest-1.8.1/fused-src/gtest")
subdirs("tools")
subdirs("db_stress_tool")

run it

ctest -j16 -I 0,,1

and re-run the failed tests to see the details:

ctest -j16 -I 0,,1 --rerun-failed --output-on-failure

This will generate the same output as ctest.log


Expected equality of these values:
  buff_prefetch_count
    Which is: 2
  3
Received signal 11 (Segmentation fault: 11)
#0   rocksdb::DBImpl::CloseHelper() (in librocksdb.8.6.0.dylib) (db_impl.cc:561)	
#1   rocksdb::DBImpl::CloseImpl() (in librocksdb.8.6.0.dylib) (db_impl.cc:718)	
#2   rocksdb::DBImpl::~DBImpl() (in librocksdb.8.6.0.dylib) (db_impl.cc:736)	
#3   rocksdb::DBImpl::~DBImpl() (in librocksdb.8.6.0.dylib) (db_impl.cc:720)	
#4   rocksdb::DBImpl::~DBImpl() (in librocksdb.8.6.0.dylib) (db_impl.cc:720)	
#5   rocksdb::DBTestBase::Close() (in prefetch_test) (db_test_util.cc:679)	
#6   rocksdb::DBTestBase::~DBTestBase() (in prefetch_test) (db_test_util.cc:108)	
#7   rocksdb::PrefetchTest1::~PrefetchTest1() (in prefetch_test) (prefetch_test.cc:94)	
#8   rocksdb::PrefetchTest1_DecreaseReadAheadIfInCache_Test::~PrefetchTest1_DecreaseReadAheadIfInCache_Test() (in prefetch_test) (prefetch_test.cc:139)	
#9   rocksdb::PrefetchTailTest_UpgradeToTailSizeInManifest_Test::~PrefetchTailTest_UpgradeToTailSizeInManifest_Test() (in prefetch_test) (prefetch_test.cc:139)	
#10  rocksdb::PrefetchTest1_SeekParallelizationTest_Test::~PrefetchTest1_SeekParallelizationTest_Test() (in prefetch_test) (prefetch_test.cc:139)	
#11  testing::Test::DeleteSelf_() (in prefetch_test) (gtest.h:20260)	
#12  void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in prefetch_test) (gtest-all.cc:3899)	
#13  void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in prefetch_test) (gtest-all.cc:3935)	
#14  testing::TestInfo::Run() (in prefetch_test) (gtest-all.cc:4154)	
#15  testing::TestCase::Run() (in prefetch_test) (gtest-all.cc:4268)	
#16  testing::internal::UnitTestImpl::RunAllTests() (in prefetch_test) (gtest-all.cc:6634)	
#17  bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (in prefetch_test) (gtest-all.cc:3899)	
#18  bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (in prefetch_test) (gtest-all.cc:3935)	
#19  testing::UnitTest::Run() (in prefetch_test) (gtest-all.cc:6242)	
#20  RUN_ALL_TESTS() (in prefetch_test) (gtest.h:22110)	
#21  main (in prefetch_test) (prefetch_test.cc:2672)
@hx235 hx235 self-assigned this Aug 18, 2023
@alanpaxton
Copy link
Contributor Author

alanpaxton commented Aug 18, 2023

On further investigation (using an ASAN build) I discovered that the SEGV is caused by the test failing; when the GTEST ASSERT fails, the test breaks and the Close() method at the end of the test function is not called. The std::unique_ptr<Env> is deallocated at the end of the function however (RAII) and when the ~DBTestBase() (superclass of test) is invoked, it calls DBTestBase::Close() which ultimately calls DBImpl::CloseHelper() which thinks it has work to do (not closed already) and accesses a now-deallocated env_ .

I've checked, and it seems there are many test susceptible to this "SEGV on failing test" issue. But maybe it is simplest to make the tests work ?

@hx235
Copy link
Contributor

hx235 commented Aug 18, 2023

While I'm yet to repro on my machines, I suspect it's due to the file number is not 3 as intended. I made a harmless-fix-anyway #11720 to that. Would you mind trying this and see if the failures are gone?

@alanpaxton
Copy link
Contributor Author

That fixes the problem on my ARM64/M1. All previously broken tests passing now, thanks.

facebook-github-bot pushed a commit that referenced this issue Aug 19, 2023
…11720)

Summary:
**Context/Summary:** as title, should be harmless. And it's a guessed fix to #11717 while no repro has obtained on my end yet.

Pull Request resolved: #11720

Test Plan: existing tests

Reviewed By: cbi42

Differential Revision: D48475661

Pulled By: hx235

fbshipit-source-id: 7c7390319f094c540e703fe2e78a8d601b7a894b
@hx235
Copy link
Contributor

hx235 commented Aug 19, 2023

f53018c is now merged. Thanks!

@hx235 hx235 closed this as completed Aug 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants