Bug 826 - I/O hangs with Sample I/O plugin.
I/O hangs with Sample I/O plugin.
Status: RESOLVED FIXED
Product: Sudo
Classification: Unclassified
Component: Sudo
1.8.22
PC Linux
: low normal
Assigned To: Todd C. Miller
: 830 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2018-03-23 05:10 MDT by Alberto González Palomo
Modified: 2018-06-28 20:56 MDT (History)
2 users (show)

See Also:


Attachments
Patch to close pty slave after command completes (1.32 KB, patch)
2018-03-25 06:15 MDT, Todd C. Miller
Details | Diff
My patch that worked for 1.8.22 (1.01 KB, patch)
2018-03-31 15:43 MDT, Stepan Broz
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Alberto González Palomo 2018-03-23 05:10:26 MDT
Using "sudo dmesg" or "sudo cat /var/log/syslog" for instance, some times sudo hangs after printing a bunch of lines, and the only way I've found to continue is to kill -9 it.
It also happens with any other moderately big text file, like ".bash_history".

To reproduce it, I repeat the command "sudo dmesg" a few times quickly until it hangs, which some times happens at first try, but more often requires a few tries to happen.

My "/etc/sudo.conf" has only one plugin configured:

Plugin sample_io sample_plugin.so

If I add the sudoers plugin, the behaviour seems to be the same:
Plugin sudoers_policy sudoers.so
Plugin sudoers_io sudoers.so
Plugin sample_io sample_plugin.so

The version of both Sudo and the plugin is 1.8.22, both installed from the same source downloaded from sudo.ws:

sudo-1.8.22$ sudo -V
Sudo version 1.8.22
Sudoers policy plugin version 1.8.22
Sudoers file grammar version 46
Sample I/O plugin version 1.8.22

I'm developing an I/O plugin for a customer. In that one I added debug print lines with "sudo_log(SUDO_CONF_INFO_MSG, ...);" at the entry and exit of io_open(), io_log_input(), io_log_output(), and io_close() and all of them exit even io_close(), then sudo hangs as described above.
Comment 1 Todd C. Miller 2018-03-23 09:30:55 MDT
Please include details of the Linux distribution you are using.
Comment 2 Alberto González Palomo 2018-03-23 12:13:59 MDT
Oh yes, of course:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 17.10
Release:	17.10
Codename:	artful

It is the desktop version, without any custom kernel or drivers.

The console application is Gnome Terminal, with bash shell.

Thank you for taking the time to look into this.
If there is something you'd like me to try, including modifying the source code, please let me know.
Comment 3 Todd C. Miller 2018-03-23 16:30:20 MDT
Can you reproduce this with smaller files or just large ones?  Sudo uses 64KB buffers so it is possible that the bug only manifests for files bigger than that.
Comment 4 Todd C. Miller 2018-03-23 17:29:40 MDT
Does this also happen with only the sudoers plugin and the line:

Defaults use_pty

in /etc/sudoers?  That will force sudo to run the command in a pty without using an I/O logging plugin.

You can get debugging info from sudo with a line like the following in sudo.conf

Debug sudo /var/log/sudo_debug all@debug

To just log info about the sudo event loop, something like the following should work: 

Debug sudo /var/log/sudo_debug exec@debug, event@debug
Comment 5 Todd C. Miller 2018-03-24 02:55:32 MDT
Another thing to check is whether the underlying command has actually exited when sudo is hanging.  It may be that sudo has missed the exit of the command somehow.
Comment 6 Alberto González Palomo 2018-03-24 03:02:31 MDT
Yes, I'm building a script now to automate testing different file sizes.

The underlying command was often just "cat" from a regular file, although given the files I was testing which were logs (the first ones that came to mind when looking for big text files) there might be appended to when the cat program was reading from them. Don't know whether this might have any effect, but just realized it would explain why it only happens at seemingly random intervals.

What I'm doing now is first build the script to run it many times with different file sizes, and if I manage to reproduce it this way I'll then try the different configurations you mentioned to see what effect they have.
Comment 7 Todd C. Miller 2018-03-24 03:14:05 MDT
I was just able to reproduce this with "sudo dmesg" with log_output enabled in sudoers.  It never happens with remote logins, only with a local terminal.  It appears that both the command and the sudo monitor process exited but the main sudo process didn't notice.  It is likely a race condition.  I have enough data to be able to track this down now.
Comment 8 Todd C. Miller 2018-03-25 06:15:37 MDT
Created attachment 505 [details]
Patch to close pty slave after command completes

Closing the pty slave after the command finishes appears to fix the race condition that caused the hang.  This commit will be in the upcoming sudo 1.8.23 release.
Comment 9 Alberto González Palomo 2018-03-25 14:38:11 MDT
Thanks!

BTW, just a comment to tell you that SUDO's documentation is great, concise and to the point, as is the example plugin.
Comment 10 Todd C. Miller 2018-03-27 16:01:22 MDT
Thanks for the kind words.  Please let me know if you find any errors or omissions in the plugin documentation.
Comment 11 Stepan Broz 2018-03-31 15:42:26 MDT
I've been solving identical issue myself - iolog enabled, sudo hanged on poll() for /dev/ptmx (pty master), sometimes leaving Zombie children behind. I was able to reproduce the issue being ssh'd in. All what was needed was using a cat within sudo. Once there was something in the writing buffer to be flushed after pty_close() was called, the write_callback() scheduled another reader event for /dev/ptmx and in the blocking cycle for flushing buffers, the poll() hanged.

The above patch solves the issue for me (so far), however, the following error appears in my debug logs now, it's new: "error reading fd 11: Input/output error"

It appears a while after del_io_events() was called as nonblocking, one successfull read_callback() happened before this one:

Mar 31 09:18:20 sudo[29536] -> read_callback @ ./exec_pty.c:546
Mar 31 09:18:20 sudo[29536] error reading fd 11: Input/output error
Mar 31 09:18:20 sudo[29536] -> safe_close @ ./exec_pty.c:1714
Mar 31 09:18:20 sudo[29536] safe_close: closing fd 11
Mar 31 09:18:20 sudo[29536] <- safe_close @ ./exec_pty.c:1724 := 0

I was just about to discuss this on the mailing list when this fix appeared in the 1.8.23b2 release. For me the issue was solved by different patch (on top of 1.8.22), I just wanted to share it. (I'll try an attachment)
Comment 12 Stepan Broz 2018-03-31 15:43:56 MDT
Created attachment 506 [details]
My patch that worked for 1.8.22
Comment 13 Todd C. Miller 2018-04-01 05:37:34 MDT
I started out with a similar fix to Stepan's but abandoned it when I noticed that all the data was not flushed at times.

The EIO is not a problem, it is just what happens if you try to read from the pty master when the slave is closed and there is nothing buffered.  It can happen without the patch too, depending on the timing of things.
Comment 14 Todd C. Miller 2018-06-28 20:56:49 MDT
*** Bug 830 has been marked as a duplicate of this bug. ***