Savedata is not endianness portable #2693
Comments
Why 0.3.0? Can this not be fixed earlier?
I see, your comment says big-endian systems will break. True, but does anyone actually use toxcore on that? Is it worth delaying the fix by months/years?
save_compatibility_test was failing on big-endian systems, as it was written and tested on a little-endian system and savedata is not endianness portable[1]. [1] TokTok#2693
@iphydf well, that was before we had the discussion that it could be done without possibly breaking things and with Tox_Options flag(s). There you go, updated the milestone. Not sure if v0.2.x or v0.2.20 is more appropriate, feel free to change it if I got it wrong.
Savedata created on a little-endian system will not load on a big-endian system, and vice versa.
I have looked into `save_compatibility_test` failing on s390x and found a couple of issues that I would like to document:

Toxcore uses little-endian to host and host to little-endian conversion functions. For example, `lendian_bytes_to_host32()` in `tox_load()`:

c-toxcore/toxcore/tox.c
Lines 704 to 709 in 710eb67
They function such that if `WORDS_BIGENDIAN` is defined, they assume the host is big-endian; otherwise they assume it is little-endian:

c-toxcore/toxcore/state.c
Lines 127 to 136 in 710eb67
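To make the macro-gated behavior concrete, here is a minimal sketch of such a conversion. It is not copied from toxcore: `sketch_lendian_bytes_to_host32()` is a hypothetical name, and the body only models the behavior described above, assuming the byte swap is compiled in solely when `WORDS_BIGENDIAN` is defined.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper modeled on the description above, not toxcore's code.
 * The byte swap only exists in the binary if WORDS_BIGENDIAN is defined at
 * compile time; otherwise the function degenerates to a plain memcpy(). */
static void sketch_lendian_bytes_to_host32(uint32_t *dest, const uint8_t *lendian)
{
    uint32_t d;
    memcpy(&d, lendian, sizeof(uint32_t));
#ifdef WORDS_BIGENDIAN
    /* Host is declared big-endian: reverse the four bytes. */
    d = ((d & 0x000000FFu) << 24) | ((d & 0x0000FF00u) << 8) |
        ((d & 0x00FF0000u) >> 8)  | ((d & 0xFF000000u) >> 24);
#endif
    *dest = d;
}
```

When the macro is absent, the destination holds exactly the input bytes in their original order, regardless of what the host's byte order actually is.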
However, `WORDS_BIGENDIAN` is never defined by anything on a big-endian system, so those functions always assume the host is little-endian and just `memcpy()` the data as is, without any conversion. This results in little-endian systems storing and reading integers in little-endian order, and big-endian systems storing and reading integers in big-endian order.

Trying to load savedata created on amd64 on an s390x system results in the `tox_load()` code linked above returning -1 on line 708, as the savedata file's magic number won't match due to the wrong endianness, with `tox_new()` returning `TOX_ERR_NEW_LOAD_BAD_FORMAT` to the user, as evidenced by `save_compatibility_test` failing on s390x with:

c-toxcore/auto_tests/save_compatibility_test.c
Lines 84 to 87 in 710eb67
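The magic number mismatch can be illustrated without a big-endian machine. The sketch below is an assumption-laden model, not toxcore code: `read_le32()` and `read_be32()` are hypothetical helpers, and `0x15ed1b1f` is used only as an example magic value. `read_be32()` stands in for what an unconverted `memcpy()` yields on a big-endian host.

```c
#include <stdint.h>

/* Interpret 4 bytes as a little-endian integer. Uses shifts, so the result
 * is the same on every host. */
static uint32_t read_le32(const uint8_t *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Interpret the same 4 bytes as a big-endian integer. This models what a
 * big-endian host sees when it memcpy()s the bytes with no conversion. */
static uint32_t read_be32(const uint8_t *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Returns 1 if the decoded value equals the expected magic, else 0. */
static int magic_matches(uint32_t decoded, uint32_t magic)
{
    return decoded == magic;
}
```

With the example magic written by a little-endian host as the bytes `1f 1b ed 15`, `read_le32()` recovers `0x15ed1b1f`, while `read_be32()` yields `0x1f1bed15` and the comparison fails. Decoding with shifts, as `read_le32()` does, is one way to avoid depending on a host-detection macro at all; it is shown here as an alternative technique, not as what toxcore does.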
If we fix the previous issue by defining `WORDS_BIGENDIAN` on a big-endian system, e.g. by adding the following to `CMakeLists.txt`:

then the magic number matches and the code proceeds further in parsing the savedata; however, the s390x `save_compatibility_test` then fails with:

c-toxcore/toxcore/state.c
Lines 27 to 45 in 710eb67
The issue here is that the code does the same endianness conversion twice. On line 28 it converts little-endian to host 32-bit, which on this system converts the little-endian representation to big-endian. But then on line 39 it performs a little-endian to host conversion again, mistakenly converting those 16 bits from the host endianness to little-endian. (The host is big-endian, which is why calling `lendian_to_host16()` on the value produces a non-portable result: it is not `lendian` to begin with.) The comparison then fails because the left side is now little-endian and the right side is big-endian. (The lower 16 bits are then converted on line 45, which is also unnecessary and will produce an incorrect result.)

The saving code does the conversion twice too:

c-toxcore/toxcore/state.c
Line 77 in 710eb67

which are no-ops on little-endian but produce an unexpected result on big-endian.
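The effect of the double conversion can be modeled on any machine. This is a rough sketch under stated assumptions, not the code from `state.c`: all function names are hypothetical, `be_host_lendian_to_host16()` simulates a little-endian to host conversion on a big-endian host as an unconditional byte swap, and the cookie value `0x01ce` is only an example.

```c
#include <stdint.h>

/* First conversion: decode 4 little-endian bytes into a host integer.
 * Shift-based, so it is correct on any host. */
static uint32_t decode_le32(const uint8_t *b)
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Simulates a 16-bit little-endian to host conversion as it would behave
 * on a big-endian host: it always swaps the two bytes. */
static uint16_t be_host_lendian_to_host16(uint16_t v)
{
    return (uint16_t)(((v & 0x00FFu) << 8) | ((v & 0xFF00u) >> 8));
}

/* Cookie check with the redundant second conversion applied to a value
 * that is already in host order. Returns 1 on match, 0 otherwise. */
static int cookie_ok_double(const uint8_t *field, uint16_t expected)
{
    const uint32_t cookie_type = decode_le32(field);
    return be_host_lendian_to_host16((uint16_t)(cookie_type >> 16)) == expected;
}

/* Cookie check with a single conversion. Returns 1 on match, 0 otherwise. */
static int cookie_ok_single(const uint8_t *field, uint16_t expected)
{
    const uint32_t cookie_type = decode_le32(field);
    return (uint16_t)(cookie_type >> 16) == expected;
}
```

For a field holding the value `0x01ce0003` in little-endian byte order (`03 00 ce 01`), the single conversion extracts the cookie `0x01ce` and matches, while the double conversion turns it into `0xce01` and the check fails. On a little-endian host the second conversion is a no-op, which is why the bug stays hidden there.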
Fixing this would break the savedata format on big-endian systems, changing it in a non-backwards-compatible way. There has been talk of switching savedata to msgpack, which would also be a breaking change, so it might make sense to keep the current broken behavior for now and make all the savedata-breaking changes together.
Here is a small snippet to reproduce the issue