Bug 800

Summary: sudo hangs and leaves the executed program as “zombie”
Product: Sudo
Reporter: sander.vrijders
Component: Sudo
Assignee: Todd C. Miller <Todd.Miller>
Status: RESOLVED FIXED
Severity: normal
CC: evangelos, tharrin
Priority: low
Version: 1.8.21
Hardware: PC
OS: Linux

Description sander.vrijders 2017-09-06 03:02:08 MDT
I have a Python script that executes 'ip' commands from a lot of different threads. Each command is run through sudo since it needs elevated privileges. The 'ip' commands are very short in execution time since they just open a netlink socket and send a message. Sometimes sudo never reaps the child process, which leaves my script hanging:

28066 28077 28066  5446 pts/2    28066 S+    1000   0:00  |   |       |       \_ python ./example4.py
28077 28156 28066  5446 pts/2    28066 S+       0   0:00  |   |       |       |   \_ sudo ip link set dev b.02 up
28156 28160 28066  5446 pts/2    28066 Z+       0   0:00  |   |       |       |       \_ [ip] <defunct>


I attached gdb to sudo and it appears to be stuck in poll():

(gdb) bt
#0  0x00007f9d26525c84 in poll () from /usr/lib/libc.so.6
#1  0x00007f9d267fd14d in ?? () from /usr/lib/sudo/libsudo_util.so.0
#2  0x00007f9d267f62af in sudo_ev_loop_v1 () from /usr/lib/sudo/libsudo_util.so.0
#3  0x000055989cc321ef in ?? ()
#4  0x000055989cc2f00b in ?? ()
#5  0x000055989cc3c40d in ?? ()
#6  0x000055989cc2d78d in ?? ()
#7  0x00007f9d2645af6a in __libc_start_main () from /usr/lib/libc.so.6
#8  0x000055989cc2dcda in ?? ()
Comment 1 sander.vrijders 2017-09-06 03:04:36 MDT
Forgot to mention that I saw the bug on 1.8.21 and 1.8.21p1. (I am on ArchLinux). These versions were not selectable in the bug report, so I just chose the latest version there.
Comment 2 Todd C. Miller 2017-09-06 06:10:31 MDT
Sounds like sudo is not seeing the SIGCHLD for some reason.  Do you have log_output or use_pty set in sudoers?  If not, does setting use_pty change the behavior?
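
For readers unfamiliar with why a missed SIGCHLD leaves a process sitting in poll(), here is a minimal sketch of the general signal-pipe ("self-pipe") technique in Python. It is not sudo's code; the function name wait_for_child_via_pipe and the timeout value are made up for illustration. The signal handler writes a byte to the write end of a pipe, and the event loop polls the read end. If that byte never arrives, the loop keeps waiting and the child is never reaped.

```python
import os
import select
import signal


def wait_for_child_via_pipe(timeout_ms=5000):
    """Fork a short-lived child and learn of its exit through a signal pipe.

    Returns (woke_up, exit_status). Must be called from the main thread.
    """
    read_fd, write_fd = os.pipe()      # (read end, write end), in that order
    os.set_blocking(write_fd, False)   # set_wakeup_fd() needs a non-blocking fd

    # A Python-level handler must be installed for the signal to be caught;
    # the notification itself is the byte written to the wakeup descriptor.
    old_handler = signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    old_wakeup = signal.set_wakeup_fd(write_fd)   # signals are WRITTEN here
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)                # child: exit at once, like a short 'ip' run

        poller = select.poll()
        poller.register(read_fd, select.POLLIN)   # the loop READS here
        woke_up = bool(poller.poll(timeout_ms))
        if woke_up:
            os.read(read_fd, 1)        # drain the notification byte

        # Reap the child; a child that is never waited for stays <defunct>.
        _, status = os.waitpid(pid, 0)
        return woke_up, os.WEXITSTATUS(status)
    finally:
        signal.set_wakeup_fd(old_wakeup)
        signal.signal(
            signal.SIGCHLD,
            old_handler if old_handler is not None else signal.SIG_DFL,
        )
        os.close(read_fd)
        os.close(write_fd)
```

Because the byte stays in the pipe until it is read, the notification is not lost even when the child exits before poll() is entered, which is the case a very short-lived command tends to hit.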
Comment 3 sander.vrijders 2017-09-06 06:25:48 MDT
I don't have use_pty set, nor log_output. Setting use_pty seems to worsen the problem.
Comment 4 Evangelos Foutras 2017-09-06 08:52:14 MDT
Bisecting points to: https://www.sudo.ws/repos/sudo/rev/d94e202b8e57

Used the following as a test and the hang is easily reproducible:

$ while true; do sudo true && echo -n .; done
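
The shell loop above blocks forever once sudo hangs. A bounded variant that reports the hang instead might look like the following sketch. The helper name first_hang, the iteration count and the timeout are assumptions chosen for illustration, not part of the original test.

```python
import subprocess


def first_hang(cmd, iterations=1000, timeout_s=5.0):
    """Run cmd repeatedly; return the index of the first run that times out.

    Returns None if every run finishes within timeout_s seconds.
    """
    for i in range(iterations):
        proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
        try:
            proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # Leave the stuck process running so it can be inspected,
            # for example by attaching gdb to it.
            return i
    return None
```

Calling first_hang(["sudo", "true"]) would mirror the loop above, assuming sudo does not prompt for a password. A run that only happens to be slow would also be counted as a hang, so the timeout should be generous.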
Comment 5 Todd C. Miller 2017-09-06 09:46:40 MDT
Thanks for tracking that down.  It takes a while to reproduce here but does eventually hang.
Comment 6 Todd C. Miller 2017-09-06 09:54:36 MDT
Even with d94e202b8e57 backed out I get an eventual hang but this points to where the race is hiding.
Comment 7 Evangelos Foutras 2017-09-06 10:43:32 MDT
My bad for not testing more thoroughly. d94e202b8e57 makes the hang easier to reproduce for me but the real issue must lie within its parent commit. [1]

[1] https://www.sudo.ws/repos/sudo/rev/841e2ca6a4a6
Comment 8 Todd C. Miller 2017-09-06 12:36:28 MDT
Found it.  The underlying problem is that the wrong ends of the signal pipe were being used.  In other words, the read side was being written to and the write side was being read from.  On systems with bidirectional pipes this was not an issue, but Linux appears to have traditional Unix pipe semantics, where each end of the pipe works in one direction only.

It took quite a while for me to figure out why I was getting EBADF when I knew the descriptor was valid.
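
The EBADF can be reproduced outside sudo with a few lines of Python. This is only an illustrative sketch, and the function name write_to_read_end is hypothetical. The descriptor is valid, but on Linux the read end of a pipe is not open for writing, so write() fails with EBADF. On a system with bidirectional pipes the same write would be expected to succeed.

```python
import errno
import os


def write_to_read_end():
    """Write to the read end of a pipe; return the errno, or None on success."""
    read_fd, write_fd = os.pipe()   # os.pipe() returns (read end, write end)
    try:
        os.write(read_fd, b"x")     # deliberately the wrong end
        return None                 # accepted: pipes are bidirectional here
    except OSError as exc:
        return exc.errno            # errno.EBADF on Linux
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

On Linux, errno.errorcode.get(write_to_read_end()) should give the symbolic name 'EBADF'.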

This is fixed by:  https://www.sudo.ws/repos/sudo/rev/7668f93e6544
Comment 9 Evangelos Foutras 2017-09-06 12:50:25 MDT
Great job tracking down the real cause! 
Comment 10 Todd C. Miller 2017-09-06 13:16:24 MDT
*** Bug 802 has been marked as a duplicate of this bug. ***
Comment 11 Todd C. Miller 2017-09-07 05:49:26 MDT
Sudo 1.8.21p2 is out now which includes the fix.