Crash details:
Signal: Segmentation fault (SIGSEGV)
Faulting Instruction Address: 0x17bd9fc (mid-instruction in function Foo)
My code around the crash location:
```
(gdb) x/6i $pc-12
0x17bd9f1: mov (%rbx),%eax
0x17bd9f3: mov %rbx,%rdi
0x17bd9f6: callq *0x70(%rax)
0x17bd9f9 <_Z3Foov+345>: cmp %eax,%r12d
0x17bd9fc <_Z3Foov+348>: mov %eax,-0x80(%rbp)
0x17bd9ff <_Z3Foov+351>: jge 0x17bd97e
```
The crash happens in the middle of the instruction at `0x17bd9fc`, shortly after a virtual function call made through the pointer at offset `0x70` from the address in `%rax`.
Examining the virtual table shows it’s not corrupted, and the entry points to the expected function `Foo::Get()`.
`Foo::Get()` itself seems to be simple and well-behaved (see the disassembly below).
The return address on the stack (`$rsp-8`) points to the correct instruction after the call to `Foo::Get()`.
Disassembly of `Foo::Get()`:
```
(gdb) disas 0x2d3d7b0
0x0000000002d3d7b0 <+0>: push %rbp
0x0000000002d3d7b1 <+1>: mov 0x70(%rdi),%eax ; load the 32-bit value at offset 0x70 from %rdi into %eax
0x0000000002d3d7b4 <+4>: mov %rsp,%rbp
0x0000000002d3d7b7 <+7>: leaveq
0x0000000002d3d7b8 <+8>: retq
End of assembler dump.
```
It’s as if, during the return from `Foo::Get()`, something advances the program counter (`%rip`) by 4 bytes, so execution resumes mid-instruction in `Foo` and crashes.
Has anyone encountered anything similar? Any suggestions on how to approach debugging this further?
Tools like GDB can help you track memory accesses across the different threads, and mutexes are the synchronization mechanism to add around the critical sections of `Foo` so that access to shared data is thread-safe. How’s it going, @marveeamasi?
Yeah @ucgee, I was able to identify the issue as a data race within the `Foo` function. Multiple threads were accessing or modifying shared data concurrently, which caused the corruption and the crash.
I synchronized the threads with a semaphore around the critical sections of `Foo` that involve shared data access. I wanted to ensure that only one thread can access that data at a time, preventing race conditions.
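For anyone landing here later, this is roughly what that kind of fix can look like. It is only a minimal sketch and not the actual code from this project: `BinarySemaphore`, `g_sem`, `g_shared_counter`, `foo_critical_section` and `run_workers` are all hypothetical names, and the semaphore is built from a C++11 mutex and condition variable on the assumption that C++20's `std::binary_semaphore` may not be available.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal binary semaphore built from C++11 primitives (hypothetical helper).
class BinarySemaphore {
public:
    // Block until the semaphore is available, then take it.
    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return available_; });
        available_ = false;
    }
    // Give the semaphore back and wake one waiting thread.
    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            available_ = true;
        }
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool available_ = true;
};

// Hypothetical stand-ins for the shared state touched inside Foo.
static BinarySemaphore g_sem;
static long g_shared_counter = 0;

// Stand-in for the critical section of Foo: only one thread at a time
// may be between acquire() and release().
void foo_critical_section() {
    g_sem.acquire();
    ++g_shared_counter;
    g_sem.release();
}

// Run `threads` workers that each enter the critical section `iterations`
// times, then return the final value of the shared counter.
long run_workers(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    g_shared_counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) foo_critical_section();
        });
    }
    for (auto& th : pool) th.join();
    return g_shared_counter;
}
```

With the semaphore in place, `run_workers(4, 10000)` returns 40000 on every run. Without the `acquire()`/`release()` pair the increments race and the total can come out lower. For a plain mutual-exclusion case like this, a `std::mutex` with `std::lock_guard` would do the same job, as suggested earlier in the thread.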