Build mongo 3.2.x on ARM (armv7l/arm64/aarch64) - segmentation fault #57

Open
TomFreudenberg opened this issue Jul 6, 2016 · 20 comments

Comments

@TomFreudenberg
Member

While mongod (the database server) runs well after the build, with no issues seen so far, the mongo shell from the same build does not run, with or without JavaScript (mozjs) support enabled.

Simple build:
mkdir -p /tmp/mongo-build

cd /tmp/mongo-build

git clone --branch "r3.2.6" --depth 1 https://github.com/mongodb/mongo.git

cd mongo

scons --disable-warnings-as-errors --prefix=/tmp/mongo-build/mongo --js-engine=mozjs mongo mongod

Although this build compiles without issues on all tested platforms and architectures (Linux on armv7l, aarch64, amd64), the resulting programs behave differently at run time.

Running mongo command after build on amd64:

Run the mongo shell and try to connect to a nonexistent instance:

./mongo mongodb://localhost:5002/sample

This gives the following output:

MongoDB shell version: 3.2.6
connecting to: mongodb://localhost:5002/sample
2016-07-05T14:10:23.772+0200 W NETWORK  [thread1] Failed to connect to 127.0.0.1:5002, reason: errno:111 Connection refused
2016-07-05T14:10:23.772+0200 E QUERY    [thread1] Error: couldn't connect to server localhost:5002, connection attempt failed :
connect@src/mongo/shell/mongo.js:223:14
@(connect):1:6

exception: connect failed

Running mongo command after build on aarch64/armv7l:

On ARM, the output is instead just:

MongoDB shell version: 3.2.6
Segmentation fault (core dumped)

It seems to me that the TCP connection code might be triggering the segmentation fault, but I have not been able to debug it.

I would appreciate any help getting this to run.

@cyphernix

cyphernix commented Jul 13, 2016

Hello Tom

Could you please post the kernel log output for the segmentation fault? Run dmesg or sudo dmesg after you have tried the ./mongo mongodb://localhost:5002/sample command.

Regards
Keith

@TomFreudenberg
Member Author

TomFreudenberg commented Jul 13, 2016

Attached you will also find the core dump from running mongo:

core.zip

This was done on linaro cluster / arm64 (aarch64)
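
For anyone reading a core dump like the one attached here, a backtrace is usually the quickest way to see where the shell crashed. The following is a minimal sketch, assuming gdb is installed on the ARM machine and that the core file was written by this same mongo binary. The helper name core_backtrace is made up for illustration and is not part of MongoDB or this project.

```shell
# Hypothetical helper: print a backtrace for every thread in a core dump.
# Usage: core_backtrace <binary> <core-file>
core_backtrace() {
    cb_binary="$1"
    cb_core="$2"
    if [ ! -f "$cb_binary" ] || [ ! -f "$cb_core" ]; then
        echo "usage: core_backtrace <binary> <core-file>" >&2
        return 2
    fi
    # -batch makes gdb exit after running the -ex commands
    # instead of dropping into an interactive session.
    gdb -batch -ex "thread apply all bt" "$cb_binary" "$cb_core"
}
```

It could be called as core_backtrace ./mongo core from the directory that holds both files.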

@cyphernix

cyphernix commented Jul 13, 2016

Thanks Tom, much appreciated. Just one more thing, if you could: would you mind running the same command with trace ./mongo mongodb://localhost:5002/sample
Then post the output for me.

If it's too much to copy paste then just run it like this...
trace ./mongo mongodb://localhost:5002/sample > tracer
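
One caveat with the redirect above, assuming the tool meant is strace: strace writes its trace to standard error by default, so > tracer would capture only the program's own output. A minimal sketch that saves the trace itself, assuming strace is installed; the helper name trace_shell is made up for illustration.

```shell
# Hypothetical helper: run a command under strace and save the trace to a file.
# Usage: trace_shell <trace-file> <command> [args...]
trace_shell() {
    ts_file="$1"
    shift
    # -f follows threads and child processes;
    # -o writes the trace to a file instead of standard error.
    strace -f -o "$ts_file" "$@"
}
```

For example: trace_shell tracer ./mongo mongodb://localhost:5002/sample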

@TomFreudenberg
Member Author

Hopefully the output of strace is what you meant?

strace.txt

@TomFreudenberg
Member Author

Yes, it does, but mongo still segfaults. New trace attached.

strace2.txt

@cyphernix

cyphernix commented Jul 13, 2016

Yeah, I see it's still dying at the same place, unable to get any of those threads going. Perhaps repeat the build but pass --64bit to it, in case it is not detecting the architecture correctly, since it's an ARM64 processor.

scons --64bit --disable-warnings-as-errors --prefix=/tmp/mongo-build/mongo --js-engine=mozjs core

@TomFreudenberg
Member Author

Hi Keith @cyphernix

There is no --64 or --64bit option. I checked the SConstruct options and found some settings for third-party libraries such as google-perf (tcmalloc).

I am now doing a fresh 3.3.9 build, replacing most of the bundled third-party libraries with system ones. I will post an update when the compile is done.

@cyphernix

cyphernix commented Jul 13, 2016

Sounds like a plan. This is hard to diagnose, as I don't have any device with that architecture near me at the moment, so I'm working blind. Sorry for the back and forth, mate.

@TomFreudenberg
Member Author

TomFreudenberg commented Jul 14, 2016

Well, after running a number of builds and checks, I switched to the mongo release branch r3.3.9.

With the same build as before, this release seems to work on arm64 at least.

I will check whether this also works on armv7.

Attention: the target needs to be core for r3.3.x; the mongo and mongod targets no longer exist.

@TomFreudenberg
Member Author

Hi @benjamn

Would it be possible to change the mongo release to at least r3.3.9 instead of r3.2.6?

https://github.com/meteor/meteor/blob/release-1.4/scripts/build-dev-bundle-common.sh#L8

The r3.2.6 mongo shell does not seem to work on ARM, but r3.3.9 does.

Thanks for a short note
Tom

@TomFreudenberg
Member Author

@benjamn - sorry for the silly question. r3.3.9 is still a development release, so I am going to check r3.2.8.

@TomFreudenberg
Member Author

TomFreudenberg commented Jul 14, 2016

OMG 👎 r3.2.8 still segfaults.

I will check r3.3.0 through r3.3.9 to try to locate the fix.


Update:

r3.3.0 up to r3.3.4 still Seg Fault
r3.3.5 unsupported architecture arm64
r3.3.6 up to r3.3.8 fail with backtrace
r3.3.9 is (the only) working release

Hopefully the issue was addressed by a small changeset.
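
A sweep like the one above can be partly automated by looking at the exit status of each built shell. When a process is killed by a signal, the shell reports 128 plus the signal number, so a segmentation fault (SIGSEGV, 11) shows up as 139 and an abort (SIGABRT, 6) as 134. The following is a minimal sketch under those assumptions; classify_exit and sweep_shells are made-up names, and it assumes one already-built mongo binary per release.

```shell
# Map a shell exit status to a coarse label.
classify_exit() {
    case "$1" in
        0)   echo "ok" ;;
        134) echo "abort" ;;      # 128 + SIGABRT (6)
        139) echo "segfault" ;;   # 128 + SIGSEGV (11)
        *)   echo "error" ;;      # any other non-zero exit
    esac
}

# Hypothetical sweep: run each given binary against a URI
# and print "<binary> <label>" per line.
# Usage: sweep_shells <uri> <mongo-binary>...
sweep_shells() {
    ss_uri="$1"
    shift
    for ss_bin in "$@"; do
        "$ss_bin" "$ss_uri" >/dev/null 2>&1
        ss_status=$?
        echo "$ss_bin $(classify_exit "$ss_status")"
    done
}
```

Since the test connects to an instance that does not exist, a healthy shell is expected to end with a connection failure, which would most likely be labeled error here; the label that matters for this issue is segfault.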

@cyphernix

Hey Tom have you read this MongoDB ticket...
https://jira.mongodb.org/browse/SERVER-23126

@cyphernix

cyphernix commented Jul 14, 2016

If you run ./mongo --help, the binary works, but the connection code has trouble with a nonexistent database: it segfaults instead of failing gracefully.

The offending code is in
/home/keith-testing/mongodb-src-r3.2.8/src/mongo/client/mongo_uri_connect.cpp

    if (!_user.empty()) {
        ret->auth(_makeAuthObjFromOptions(ret->getMaxWireVersion()));
    }
    return ret;
}

@TomFreudenberg
Member Author

Hey @cyphernix, I have checked the commit hash, but all of it is already applied in r3.2.8.

It still segfaults. It seems that only r3.3.9 runs (see my previous comment); nothing up to r3.3.8 works.

@cyphernix

Ah yes, I see what you mean, so it's affected from r3.2.8 through r3.3.8. Just check that and see they have added this to it:

ret = _connectString.connect(errmsg, socketTimeout);

@TomFreudenberg
Member Author

Hey @cyphernix Keith

As you can read above, r3.3.9 is the one that runs; all earlier releases have the segfault issue.

I will do a comparison, but this may be a huge changeset :-(

@cyphernix

cyphernix commented Jul 14, 2016

Perhaps just taking the fix from 3.3.9 in mongo_uri_connect.cpp and applying it to one of the older versions would be easier? I will leave that up to you to decide. Still, I'm glad we both finally managed to get to the bottom of this issue... :-)

@TomFreudenberg
Member Author

TomFreudenberg commented Jul 14, 2016

Can you post the lines, please? Which fix do you mean?

This is the complete changeset - mongodb/mongo@r3.3.8...r3.3.9

@cyphernix

cyphernix commented Jul 14, 2016

On the server run this command.

diff /home/mongo-3.3.9/src/mongo/client/mongo_uri_connect.cpp /home/keith-testing/mongodb-src-r3.2.8/src/mongo/client/mongo_uri_connect.cpp
You will see the problem code there.
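
The same single-file comparison can also be done inside a git checkout rather than between two unpacked source trees. This is a minimal sketch, assuming both release tags are present in the local clone; diff_file_between_tags is a made-up helper name.

```shell
# Hypothetical helper: show how one file changed between two tags.
# Usage: diff_file_between_tags <repo-dir> <old-tag> <new-tag> <path>
diff_file_between_tags() {
    # -C runs git inside the given repository directory;
    # the path after "--" limits the diff to that file.
    git -C "$1" diff "$2" "$3" -- "$4"
}
```

For example: diff_file_between_tags /tmp/mongo-build/mongo r3.3.8 r3.3.9 src/mongo/client/mongo_uri_connect.cpp. A clone made with --depth 1 contains only the tag it was cloned at, so the other tag would have to be fetched first.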
