Long Read Giraffe #4323

Open
wants to merge 1,242 commits into
base: master

Conversation

adamnovak
Member

@adamnovak adamnovak commented Jul 1, 2024

Changelog Entry

To be copied to the draft changelog by merger:

  • vg giraffe now has --parameter-preset hifi and --parameter-preset r10 for mapping long reads with a new chaining-based algorithm. --parameter-preset sr uses the new algorithm for single-ended short reads; the old --parameter-preset default and --parameter-preset fast remain available and use the old non-chaining algorithm.
  • The giraffe-facts.py script now reads GAM files directly and no longer needs JSON preprocessing.

Description

This adds the much-anticipated Long Read Giraffe to mainline vg.

@adamnovak adamnovak changed the title from "Logn Read Giraffe" to "Long Read Giraffe" Jul 1, 2024
@adamnovak
Member Author

@xchang1 On my Mac, I get a failure in the randomized tests for the zip code tree:

1: ([6+1/1 1 ( 11 [3+11/3] 18  0  1)])
0: [10-4/18rev 4 ( 25 [24+3/16] 18446744073709551615  14 [24+3/16rev] 18446744073709551615  18446744073709551615  2 [22+2/2] 67  23  44  37 [22+2/2rev] 12  47  3  24  17  4) 9 ( 8 [19-0/19] 18446744073709551615  11 [19-0/19rev] 6  18446744073709551615  17 [13-2/13rev] 18446744073709551615  19  18446744073709551615  2 [13-2/13] 5  18  24  27  7 [17+0/0rev 0 17+0/7rev] 18446744073709551615  18446744073709551615  18446744073709551615  29  18446744073709551615  12 [17+0/7 0 17+0/0] 22  27  4  17  23  26  6  6) 2 14-12/15rev]
1: [( 1 [11+1/6] 21  0  1) 5 12-15/17rev 1 12+6/5]
2: [9-21/1rev]
3: [9+21/10]
4: ([1+7/12] 23 [2+0/3 0 2+0/11 25 6-1/14rev 3 4-2/4rev 2 ( 1 [7-0/8rev 0 7-0/9rev] 0  0  1)])
Assertion failed: (distance == distances[i]), function validate_snarl, file zip_code_tree.cpp, line 1427.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vg test is a Catch v2.13.8 host application.
Run with -? for options

-------------------------------------------------------------------------------
Random graphs zip tree
-------------------------------------------------------------------------------
src/unittest/zip_code_tree.cpp:2772
...............................................................................

src/unittest/zip_code_tree.cpp:2772: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

===============================================================================
test cases:     666 |     665 passed | 1 failed
assertions: 9903209 | 9903208 passed | 1 failed

━━━━━━━━━━━━━━━━━━━━
Crash report for vg v1.57.0-1079-g3a1cd38a3 "Franchini"
Caught signal 6 raised at address 0x183fdaa60; tracing with backward-cpp
Stack trace (most recent call last):
#16   Object "vg", at 0x102614043, in main + 691
#15   Object "vg", at 0x102d94f23, in vg::subcommand::Subcommand::operator()(int, char**) const + 47
#14   Object "vg", at 0x102de733f, in main_test(int, char**) + 527
#13   Object "vg", at 0x102dce44f, in Catch::Session::run() + 155
#12   Object "vg", at 0x102dcf1e3, in Catch::Session::runInternal() + 3291
#11   Object "vg", at 0x102dc8c37, in Catch::RunContext::runTest(Catch::TestCase const&) + 359
#10   Object "vg", at 0x102dc9523, in Catch::RunContext::runCurrentTest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) + 439
#9    Object "vg", at 0x102c1bc17, in vg::unittest::C_A_T_C_H_T_E_S_T_112() + 1891
#8    Object "vg", at 0x1033290d7, in vg::ZipCodeForest::validate_zip_forest(bdsg::SnarlDistanceIndex const&, std::__1::vector<vg::SnarlDistanceIndexClusterer::Seed, std::__1::allocator<vg::SnarlDistanceIndexClusterer::Seed>> const*, unsigned long) const + 327
#7    Object "vg", at 0x1033269c7, in vg::ZipCodeTree::validate_zip_tree(bdsg::SnarlDistanceIndex const&, std::__1::vector<vg::SnarlDistanceIndexClusterer::Seed, std::__1::allocator<vg::SnarlDistanceIndexClusterer::Seed>> const*, unsigned long) const + 231
#6    Object "vg", at 0x103328dbf, in vg::ZipCodeTree::validate_snarl(std::__1::__wrap_iter<vg::ZipCodeTree::tree_item_t const*>, bdsg::SnarlDistanceIndex const&, std::__1::vector<vg::SnarlDistanceIndexClusterer::Seed, std::__1::allocator<vg::SnarlDistanceIndexClusterer::Seed>> const*, unsigned long) const + 2123
#5    Object "libsystem_c.dylib", at 0x183f1ed1f, in __assert_rtn + 283
#4    Object "libsystem_c.dylib", at 0x183f1fa2f, in abort + 179
#3    Object "libsystem_pthread.dylib", at 0x184012c1f, in pthread_kill + 287
#2    Object "libsystem_platform.dylib", at 0x184043583, in _sigtramp + 55
#1    Object "vg", at 0x102ece8c3, in vg::emit_stacktrace(int, __siginfo*, void*) + 1167
#0    Object "vg", at 0x102eceecb, in backward::StackTraceImpl<backward::system_tag::darwin_tag>::load_from(void*, unsigned long, void*, void*) + 43

Library locations:
#16	main (in vg) (main.cpp:86) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#15	vg::subcommand::Subcommand::operator()(int, char**) const (in vg) (subcommand.cpp:75) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#14	main_test(int, char**) (in vg) (test_main.cpp:69) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#13	Catch::Session::run() (in vg) (catch.hpp:13516) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#12	Catch::Session::runInternal() (in vg) (catch.hpp:13560) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#11	Catch::RunContext::runTest(Catch::TestCase const&) (in vg) (catch.hpp:12761) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#10	Catch::RunContext::runCurrentTest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&) (in vg) (catch.hpp:13000) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#9	vg::unittest::C_A_T_C_H_T_E_S_T_112() (in vg) (zip_code_tree.cpp:2837) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#8	vg::ZipCodeForest::validate_zip_forest(bdsg::SnarlDistanceIndex const&, std::__1::vector<vg::SnarlDistanceIndexClusterer::Seed, std::__1::allocator<vg::SnarlDistanceIndexClusterer::Seed>> const*, unsigned long) const (in vg) (zip_code_tree.cpp:1352) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#7	vg::ZipCodeTree::validate_zip_tree(bdsg::SnarlDistanceIndex const&, std::__1::vector<vg::SnarlDistanceIndexClusterer::Seed, std::__1::allocator<vg::SnarlDistanceIndexClusterer::Seed>> const*, unsigned long) const (in vg) (zip_code_tree.cpp:1069) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#6	vg::ZipCodeTree::validate_snarl(std::__1::__wrap_iter<vg::ZipCodeTree::tree_item_t const*>, bdsg::SnarlDistanceIndex const&, std::__1::vector<vg::SnarlDistanceIndexClusterer::Seed, std::__1::allocator<vg::SnarlDistanceIndexClusterer::Seed>> const*, unsigned long) const (in vg) (zip_code_tree.cpp:0) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
#5	__assert_rtn (in libsystem_c.dylib) + 283 in /usr/lib/system/libsystem_c.dylib loaded at 0x183ea9000
#4	abort (in libsystem_c.dylib) + 179 in /usr/lib/system/libsystem_c.dylib loaded at 0x183ea9000
#3	_pthread_atfork_prepare_handlers (in libsystem_pthread.dylib) + 95 in /usr/lib/system/libsystem_pthread.dylib loaded at 0x18400c000
#2	_simple_esappend (in libsystem_platform.dylib) + 147 in /usr/lib/system/libsystem_platform.dylib loaded at 0x18403f000
#1	vg::emit_stacktrace(int, __siginfo*, void*) (in vg) (crash.cpp:379) in /Users/anovak/workspace/vg/bin/vg loaded at 0x102610000
ERROR: Signal 6 occurred. VG has crashed. Visit https://github.com/vgteam/vg/issues/new/choose to report a bug.
━━━━━━━━━━━━━━━━━━━━
Context dump:
Found 0 threads with context.
━━━━━━━━━━━━━━━━━━━━
Please include this entire error log in your bug report!
━━━━━━━━━━━━━━━━━━━━

I don't think we can get this into today's release.

@adamnovak
Member Author

I think we also need to fix #4324 before merging this, since this includes changes to vg surject that detect the bad alignments and break the vg inject tests.

@xchang1
Contributor

xchang1 commented Jul 9, 2024

I fixed my random unit tests; the problem was that they weren't checking distances in snarls properly. They work now, but I also turned off the random tests.

adamnovak and others added 30 commits October 21, 2024 17:05
Shuffle tied minimizers when sorting and keep extra minimizers to fill gaps in read coverage