#16337: Segment violation on x86_64 in bbitmap
------------------------+-------------------------
Reporter: bbjimmy | Owner: nobody
Type: bug | Status: closed
Priority: normal | Milestone: Unscheduled
Component: - General | Version: R1/beta2
Resolution: invalid | Keywords:
Blocked By: | Blocking:
Platform: All |
------------------------+-------------------------
Comment (by mmlr):
I've spent a fair amount of time on this and even pointed out where
exactly the problem in this specific instance is. I understand that this
is getting somewhat low-level, but I tried to explain that and how yab is
buggy in this case and pointed out an easy way to find and fix this and
some similar issues. Just dismissing that and continuing to blame the
operating system isn't going to make this go away, only fixing the code
will.
I don't have enough personal investment in yab to fix these issues myself,
but I know that it is a used by others and consider it a nice thing to
have. I'll therefore try to explain the problem some more in the hope that
it leads to fixes in yab that benefit the broader community.
The fundamental problem is this: Having control flow reach the end of a
non-void function without a return statement results in undefined
behaviour. In such cases pretty much anything is allowed to happen,
including for the compiler to not produce a function epilogue at all. The
epilogue is the part that restores registers and the stack to where it
needs to be and actually returns the control flow to the calling function.
When it's missing, the function will quite literally not return, i.e. it
will continue to run code adjacent to its code. The layout of the source
code is not relevant, only the layout of the produced binary is and that
one can vary depending on the toolchain, compiler flags, used libraries,
etc and should be considered unpredictable.
The yab package currently available for x86_64 as well as one I built
locally (using {{{gcc-8.3.0_2019_05_24-7-x86_64.hpkg}}} for reference)
shows exactly this missing epilogue. This is what a disassembly of the
relevant part looks like (produced by {{{objdump --demangle --disassemble-
all /boot/system/lib/libyab_1.7.8.so}}}):
{{{
0000000000061ac4 <yi_BitmapLoad>:
61ac4: 55 push %rbp
61ac5: 48 89 e5 mov %rsp,%rbp
61ac8: 48 89 f8 mov %rdi,%rax
61acb: 48 89 d7 mov %rdx,%rdi
61ace: 48 89 f2 mov %rsi,%rdx
61ad1: 48 89 c6 mov %rax,%rsi
61ad4: e8 87 03 fe ff callq 41e60
<YabInterface::BitmapLoad(char const*, char const*)@plt>
61ad9: 90 nop
0000000000061ada <YabInterface::BitmapSave(char const*, char const*, char
const*)>:
61ada: 55 push %rbp
61adb: 48 89 e5 mov %rsp,%rbp
61ade: 41 57 push %r15
61ae0: 41 56 push %r14
...
}}}
It shows that no epilogue was produced, i.e. no final stack pop and no ret
instruction. Therefore the program will simply continue to run and in this
case it happens to have {{{YabInterface::BitmapSave}}} directly following
which it then executes. That function has different arguments and
therefore expects a different stack layout and register values and then
crashes.
Compare this with a similar function that isn't missing the return
statement:
{{{
0000000000061cec <yi_BitmapSave>:
61cec: 55 push %rbp
61ced: 48 89 e5 mov %rsp,%rbp
61cf0: 48 89 f8 mov %rdi,%rax
61cf3: 48 89 cf mov %rcx,%rdi
61cf6: 48 89 d1 mov %rdx,%rcx
61cf9: 48 89 f2 mov %rsi,%rdx
61cfc: 48 89 c6 mov %rax,%rsi
61cff: e8 1c d8 fd ff callq 3f520
<YabInterface::BitmapSave(char const*, char const*, char const*)@plt>
61d04: 5d pop %rbp
61d05: c3 retq
}}}
As I already said above, the compiler warns about such errors. It only
doesn't fail compilation because it cannot know for sure that undefined
behaviour will actually occur (the inner function could for example always
exit by throwing an exception or actually be meant to execute an infinite
loop). But in almost all cases they are indeed just errors and they will
lead to hard to understand bugs. So ignoring these compilation warnings
because "they are just warnings" is not an option.
Recompiling in an even slightly different environment may alter what
happens. It is possible that the layout of the binary changes so that a
function happens to follow that wouldn't crash with the given stack.
Obviously it's always problematic to run functions that weren't intended
to run. Even if it doesn't crash, it will possibly have dangerous side
effects (what if it was a function to delete some files for example?) and
it most certainly will mangle the return value so the rest of the program
will run with bogus data.
So my advice to build yab with {{{-Wall -Wextra -Werror}}} and fixing
anything it points out stands. It will fix some bugs that may have been
hidden by pure chance so far (or lead to so far unexplained strange
behaviour) and it will increase code quality. You can always disable some
of these warnings if you explicitly don't want to change the code
({{{-Wno-unused-variable}}} or {{{-Wno-unused-parameter}}} for example if
you don't want to remove them). But even the "misleading indent" ones are
sometimes right and even if they aren't, adjusting the code style doesn't
really hurt. A quick glance showed at least one instance of a misleading
indent warning that indeed looked wrong.
--
Ticket URL: <https://dev.haiku-os.org/ticket/16337#comment:9>
Haiku <https://dev.haiku-os.org>
The Haiku operating system.