Bug 247

Summary: AIX 5.3 + LDAP Groups + Segmentation fault
Product: Sudo Reporter: José Luis <jotones>
Component: Sudo Assignee: Todd C. Miller <Todd.Miller>
Status: RESOLVED FIXED    
Severity: high CC: jotones
Priority: high    
Version: 1.6.7   
Hardware: IBM   
OS: AIX   
Attachments: Patch to add debugging info to aix authentication

Description José Luis 2007-05-25 06:09:05 MDT
Hi, 

We have been using sudo on our AIX servers for a long time without problems.

We are now planning (if we can) to use the AIX compound module KRB5LDAP. With this authentication module we get a "Segmentation fault" error in some scenarios that involve groups of users defined in the LDAP server.

If we comment out the LDAP authentication module definition in /usr/lib/security/methods.cfg, it seems to work, so we suspect a problem with the LDAP interaction.

We ran "truss sudo -l" to investigate what is happening and saw some strange things (I am not an expert at debugging):

1.- We get 65549 close system calls, each with a decreasing file descriptor. Is this normal? We see the same output on servers where sudo works, so it does not seem to be important; it is only a question.

close(65533)                     Err#9  EBADF
.
.
.
close(3)                        Err#9  EBADF
sbrk(0x00000000)                = 0x20008FB8
vmgetinfo(0x2FF22210, 7, 16)    = 0
sbrk(0x00000000)                = 0x20008FB8

2.- We get some errors from the kioctl call. What do they mean?

kioctl(6, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY

3.- The main problem is the segmentation fault, since the two issues above also appear on other servers that work fine:

........
987346: 61.7154:        close(6)                        = 0
987346: 61.7164:        open("/etc/passwd", O_RDONLY)   = 6
987346: 61.7173:        kioctl(6, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
987346: 61.7183:        kfcntl(6, F_GETFD, 0x00000000)  = 0
987346: 61.7195:        kfcntl(6, F_SETFD, 0x00000001)  = 0
987346: 61.7206:        accessx("/etc/security/passwd", 04, 0) = 0
987346: 61.7218:        open("/etc/security/passwd", O_RDONLY) = 8
987346: 61.7229:        kioctl(8, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
987346: 61.7239:        kfcntl(8, F_GETFD, 0x00000000)  = 0
987346: 61.7250:        kfcntl(8, F_SETFD, 0x00000001)  = 0
987346: 61.7262:        accessx("/etc/passwd", 04, 0)   = 0
987346: 61.7274:        statx("/etc/passwd", 0x2FF21EC0, 76, 0) = 0
987346: 61.7286:        accessx("/etc/passwd", 04, 0)   = 0
987346: 61.7297:        statx("/etc/passwd", 0x2FF21990, 76, 0) = 0
987346: 61.7308:        accessx("/etc/passwd", 04, 0)   = 0
987346: 61.7321:        statx("/etc/passwd", 0x2FF21A00, 76, 0) = 0
987346: 61.7333:        accessx("/etc/passwd", 04, 0)   = 0
987346: 61.7345:        statx("/etc/passwd", 0x2FF21980, 76, 0) = 0
987346: 61.7356:        accessx("/etc/security/passwd", 04, 0) = 0
987346: 61.7368:        statx("/etc/security/passwd", 0x2FF219E0, 76, 0) = 0
987346: 61.7379:        accessx("/etc/passwd", 04, 0)   = 0
987346: 61.7391:        statx("/etc/passwd", 0x2FF21700, 76, 0) = 0
987346: 61.7404:        accessx("/etc/passwd", 04, 0)   = 0
987346: 61.7416:        statx("/etc/passwd", 0x2FF21770, 76, 0) = 0
987346: 61.7428:        close(6)                        = 0
987346: 61.7438:        close(8)                        = 0
987346:     Received signal #11, SIGSEGV [default]
987346: *** process killed ***


Could someone help me find where the problem might be?

I can provide any additional information you request.

Thanks in advance, 

Otto
Comment 1 José Luis 2007-05-25 06:10:59 MDT
The exact version is the IBM-compiled version: 1.6.7p5

I compiled 1.6.8p12 and got the same problem.

Thanks, 
Comment 2 Todd C. Miller 2007-06-19 19:44:43 MDT
Can you try with the 1.6.9 beta version?  You can download it from http://www.sudo.ws/sudo/beta.html

It should avoid the large number of close system calls at the very least.
Comment 3 José Luis 2007-06-20 03:55:28 MDT
I will test it and get back to you.

Any idea about the segmentation fault on AIX? How can I debug the problem in order to help you or give you more information?

Thanks
Comment 4 José Luis 2007-06-20 04:11:32 MDT
Hi Todd, 

I found a compilation error (remember, I am not an expert).

I am using:

libgcc-4.0.0-1
gcc-4.0.0-1

I ran ./configure and the config.log file shows some errors related to confdefs.h and some undefined functions:

configure: failed program was:
| /* confdefs.h.  */
...
...

Do you want me to send you this file?

In any case, I also see it in other tests I am doing. I have always used the IBM rpm package, so I am only now starting to compile the source myself.

After that I ran make and got the following error:

        /usr/bin/gcc -c -I. -I.  -O2 -D_XOPEN_EXTENDED_SOURCE -D_ALL_SOURCE -D_PATH_SUDOERS=\"/etc/sudoers\" -D_PATH_SUDOERS_TMP=\"/etc/sudoers.tmp\" -DSUDOERS_UID=0 -DSUDOERS_GID=0 -DSUDOERS_MODE=0440  closefrom.c
closefrom.c: In function 'closefrom':
closefrom.c:96: error: 'F_CLOSEM' undeclared (first use in this function)
closefrom.c:96: error: (Each undeclared identifier is reported only once
closefrom.c:96: error: for each function it appears in.)
make: 1254-004 The error code from the last command is 1.


But during the configure step the config.log shows:

ac_cv_have_decl_F_CLOSEM=yes


Any idea?

I will continue testing.

Thanks for your help
Comment 5 José Luis 2007-06-20 06:38:33 MDT
Hi all, 

About the segmentation fault, here is further output obtained with version 1.6.8p12:

truss -deaf sudo :

.. 
..
901182: kwrite(7, " 1 7", 2)                            = 2
901182: kread(7, "\0", 1)                               = 1
901182: kwrite(7, " 3\0 0\0 0\0 0\0 0\0 g p".., 17)     = 17
901182: kread(7, " 4 0 0\0 0\0 0\0 0\0 g p".., 17)      = 2
901182: kwrite(7, "\0", 1)                              = 1
901182: kread(7, " p g r p = 9 0 8 0 1\0 u".., 40)      = 40
901182: 2.6567:        close(6)                         = 0
901182: 2.6581:        open("/etc/group", O_RDONLY)     = 6
901182: 2.6590:        kioctl(6, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
901182: 2.6600:        kfcntl(6, F_GETFD, 0x00000000)   = 0
901182: 2.6611:        kfcntl(6, F_SETFD, 0x00000001)   = 0
901182: 2.6623:        accessx("/etc/group", 04, 0)     = 0
901182: 2.9614:        statx("/etc/group", 0x2FF21F00, 76, 0) = 0
901182: 2.9630:        statx("/etc/group", 0x2FF21A08, 76, 0) = 0
901182: 2.9643:        getdomainname(0x2FF22260, 256)   = 0
901182: 2.9656:        accessx("/etc/group", 04, 0)     = 0
901182: 2.9669:        statx("/etc/group", 0x2FF21F00, 76, 0) = 0
901182: 2.9683:        statx("/etc/group", 0x2FF21A08, 76, 0) = 0
901182: 2.9695:        getdomainname(0x2FF22260, 256)   = 0
901182: 2.9707:        accessx("/etc/group", 04, 0)     = 0
901182: 2.9720:        statx("/etc/group", 0x2FF21F00, 76, 0) = 0
901182: 2.9733:        statx("/etc/group", 0x2FF21A08, 76, 0) = 0
901182: 2.9746:        getdomainname(0x2FF22260, 256)   = 0
901182: 2.9757:        _getpid()                        = 901182
901182: 2.9767:        getuidx(1)                       = 0
901182: kwrite(7, " 1 9", 2)                            = 2
901182: kread(7, "\0", 1)                               = 1
901182: kwrite(7, " 3\0 0\0 0\0 0\0 0\0 g p".., 19)     = 19
901182: kread(7, " 3 0 0\0 0\0 0\0 0\0 g p".., 19)      = 2
901182: kwrite(7, "\0", 1)                              = 1
901182: kread(7, " p g r p = 9 0 8 0 2\0 g".., 30)      = 30
901182: 2.9885:        close(6)                         = 0
901182:     Received signal #11, SIGSEGV [default]
901182: *** process killed ***


The groups 90801 and 90802 are defined in the LDAP database, and it seems that sudo is loading the AIX LDAP authentication module correctly.

From the audit subsystem we can see that sudo is accessing the LDAP security method:

event           login    status      time                     command
--------------- -------- ----------- ------------------------ -------------------------------
PROC_Load       root     OK          Wed Jun 20 12:32:57 2007 sudo
        file: /usr/lib/security/LDAP
***** WATCH *****


And now it opens the socket that AIX uses to communicate with the LDAP directory in which the operating system's users and groups are stored:

event           login    status      time                     command
--------------- -------- ----------- ------------------------ -------------------------------
TCP_ksocket     root     OK          Wed Jun 20 12:32:57 2007 sudo
        fd8 AF_UNIX, SOCK_STREAM, Protocol 0
***** WATCH *****
event           login    status      time                     command
--------------- -------- ----------- ------------------------ -------------------------------
TCP_kconnect    root     OK          Wed Jun 20 12:32:57 2007 sudo
        fd8 /usr/lib/security/.s_secldapclnt




Any ideas about how to investigate?
Comment 6 Todd C. Miller 2007-06-20 07:06:00 MDT
Try adding the line:
#include <fcntl.h>

to closefrom.c right before:
#include "sudo.h"

I believe that will fix the compilation error.
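
For readers following along, here is a minimal sketch of the two strategies discussed above: a single fcntl() call with F_CLOSEM where <fcntl.h> declares it, and otherwise one close() per descriptor, which is the pattern visible in the truss output. The function name close_from_sketch and its maxfd parameter are hypothetical; this is not the code in sudo's closefrom.c.

```c
#include <fcntl.h>   /* fcntl(); on AIX this header is where F_CLOSEM is declared */
#include <unistd.h>  /* close() */

/*
 * Close descriptors numbered lowfd and above.
 * close_from_sketch and maxfd are hypothetical names used for illustration.
 * maxfd (exclusive) only bounds the fallback loop.
 */
void
close_from_sketch(int lowfd, int maxfd)
{
    int fd;

#ifdef F_CLOSEM
    /* One call asks the kernel to close lowfd and every descriptor above it. */
    if (fcntl(lowfd, F_CLOSEM, 0) != -1)
        return;
#endif
    /* Fallback: one close() per candidate; unused descriptors fail with EBADF. */
    for (fd = lowfd; fd < maxfd; fd++)
        (void) close(fd);
}
```

A caller without F_CLOSEM would typically pass the process descriptor limit as maxfd, which is why tens of thousands of failing close() calls can show up in a trace.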
Comment 7 José Luis 2007-06-20 08:12:27 MDT
Hi Todd, 

It compiles now. The close calls have disappeared, but the segmentation fault continues.

How can I debug sudo in order to find out what is happening?

Thanks, 
Comment 8 Todd C. Miller 2007-06-20 18:14:28 MDT
Created attachment 72 [details]
Patch to add debugging info to aix authentication

Please try the patch in the attachment, or if you'd rather, grab ftp://ftp.sudo.ws/pub/millert/sudo/aix_auth.c and place it in the "auth" directory of the sudo source distribution.  It may give us a little more information to help track this down.
Comment 9 José Luis 2007-06-21 05:46:06 MDT
(In reply to comment #8)
> Created an attachment (id=72) [details]
> Patch to add debugging info to aix authentication
> Please try the patch in the attachment, or if you'd rather, grab
> ftp://ftp.sudo.ws/pub/millert/sudo/aix_auth.c and place it in the "auth"
> directory of the sudo source distribution.  It may give us a little more
> information to help track this down.

Hi Todd, 

I don't know if this will help us. We hit the problem even when just running "sudo", and the error depends on the content of the sudoers file.

Internally, it seems that when we run sudo the sudoers file is read and, through some system calls, AIX uses its authentication modules (/usr/lib/security/LDAP, /usr/lib/security/KRB5, and so on) to get information about the users and groups authorized in the sudoers file. Is the aix_auth.c sudo module used in this case, or only when the user is asked for a password?

Anyway, I replaced aix_auth.c (I didn't know how or where to apply the patch) and compiled again with the -g option so that I can use the dbx debugger.

With this change, I assume that the dbx debugger will give us more information?

Thanks
Comment 10 José Luis 2007-06-21 06:14:57 MDT
Hi Todd, 

I have run dbx and I get the following:

I have seen that the first part may be common, so I am focusing on the last message:

dbx ./sudo

Type 'help' for help.
reading symbolic information ...internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found 's__LC_locale:,768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: 1283-228 expected char ',', found '__LC_locale:,768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: 1283-228 expected char ';', found '_LC_locale:,768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: unexpected value 44 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found '768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found 's_LC_locale_objhdl:,64,32;;'
internal error: 1283-228 expected char ',', found '_LC_locale_objhdl:,64,32;;'
internal error: 1283-228 expected char ';', found 'LC_locale_objhdl:,64,32;;'
internal error: unexpected value 44 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found '64,32;;'
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c

(dbx) next

Segmentation fault in strcmp.strcmp [sudo] at 0x1000776c ($t1)
0x1000776c (strcmp+0xcc) 88a30001        lbz   r5,0x1(r3)


I will continue working to learn how to interpret it.

Thanks

Comment 11 Todd C. Miller 2007-06-21 06:50:11 MDT
That's great, can you get a backtrace using the dbx "where" command?  That should give me an idea where in the code the strcmp() is being called from.
Comment 12 José Luis 2007-06-21 10:25:45 MDT
Hi Todd, I am trying to find the exact point (I need time as I am a dummy ...).

I am also using ddd (a GUI) with the gdb debugger to try to find the piece of code that is causing this problem.

Meanwhile here is the "whereis" output:

dbx ./sudo

Type 'help' for help.
reading symbolic information ...internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found 's__LC_locale:,768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: 1283-228 expected char ',', found '__LC_locale:,768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: 1283-228 expected char ';', found '_LC_locale:,768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: unexpected value 44 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found '768,32;__meth_ptr:87,800,32;__data_ptr:87,832,32;;'
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found 's_LC_locale_objhdl:,64,32;;'
internal error: 1283-228 expected char ',', found '_LC_locale_objhdl:,64,32;;'
internal error: 1283-228 expected char ';', found 'LC_locale_objhdl:,64,32;;'
internal error: unexpected value 44 at line 5040 in file stabstring.c
internal error: 1283-228 expected char ',', found '64,32;;'
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c
internal error: unexpected value 120 at line 5040 in file stabstring.c

(dbx)

(dbx) n

Segmentation fault in strcmp.strcmp [sudo] at 0x1000776c ($t1)
0x1000776c (strcmp+0xcc) 88a30001        lbz   r5,0x1(r3)
(dbx) file
sudo.c
(dbx) func
strcmp.strcmp
(dbx) whereis strcmp
strcmp.strcmp [/usr/lib/libpthreads.a]
.strcmp
strcmp.strcmp [/usr/lib/libksvc.a]
.strcmp
strcmp.strcmp [/usr/lib/libkrb5.a]
.strcmp
strcmp.strcmp [/usr/lib/libk5profile.a]
.strcmp
strcmp.strcmp [/usr/lib/libgssapi_krb5.a]
.strcmp
strcmp.strcmp [/usr/lib/libi18n.a]
.strcmp
strcmp.strcmp [/usr/lib/libiconv.a]
.strcmp
strcmp.strcmp [/usr/lib/libs.a]
.strcmp
strcmp.strcmp [sudo]
.strcmp
Comment 13 José Luis 2007-06-21 12:15:15 MDT
Hi Todd, 

I have found where the code crashes.

I attach a backtrace:

#4 0x10001340 in main () at sudo.c:287
#3 0x1000e6b0 in sudoers_lookup () at parse.c:132
#2 0x1000d0b4 in yyparse () at parse.yacc:844
#1 0x1000f458 in usergr_matches () at parse.c:512
#0 0x1000776c in strcmp ()


Looking at the code, the segmentation fault seems to occur in:

511     for (cur = grp->gr_mem; *cur; cur++) {
512         if (strcmp(*cur, user) == 0)
513             return(TRUE);
514     }

I am trying to understand the logic of the program to find where the problem that produces the segmentation fault occurs. If I evaluate the grp pointer in some of the calls I see:

grp.name=0x76707473 <Address 0x76707743 out of bounds>
grp.passwd=0x74000000 <Address 0x74000000 out of bounds>
grp.gid=0
grp.mem=0x0

Anyway, I need more time to follow the logic and find out what is happening.

I hope to get back to you soon.

Thanks, 
Comment 14 Todd C. Miller 2007-06-21 12:25:47 MDT
It looks like the results of getgrnam() are invalid at the point where you are displaying them.  However, the problem may simply be that the gr_mem member is allowed to be NULL for groups that have no members.  I've added a check for this in ftp://ftp.sudo.ws/pub/millert/sudo/parse.c

Please try building with that and let me know if the behavior changes.
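
As an illustration of the kind of check described here, the following is a minimal sketch rather than the actual change made to parse.c. The helper name user_in_gr_mem is hypothetical. Note that such a guard only helps when gr_mem is really NULL; it cannot detect a non-NULL pointer to invalid memory.

```c
#include <grp.h>     /* struct group */
#include <stddef.h>  /* NULL */
#include <string.h>  /* strcmp() */

/*
 * Return 1 if user is listed in grp's member vector, 0 otherwise.
 * user_in_gr_mem is a hypothetical helper name, not a sudo function.
 */
int
user_in_gr_mem(const struct group *grp, const char *user)
{
    char **cur;

    /* Treat a missing entry or a NULL member vector as "no members". */
    if (grp == NULL || grp->gr_mem == NULL)
        return 0;

    /* gr_mem is expected to be a NULL-terminated array of member names. */
    for (cur = grp->gr_mem; *cur != NULL; cur++) {
        if (strcmp(*cur, user) == 0)
            return 1;
    }
    return 0;
}
```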
Comment 15 José Luis 2007-06-22 07:51:52 MDT
Hi Todd

I am having some problems with the debuggers, but I think the problem could be in the result of getgrnam() when a group in LDAP has no members, even though the crash occurs in the strcmp call.

In general I am looking at the usergr_matches() function:

usergr_matches(group, user, pw)
...
...

******************************************************************************
Here getgrnam causes the group variable to hold an unintelligible group name after the call (is this normal?). Also, when the group has no members, it gets an "Address out of bounds".
******************************************************************************

    if ((grp = getgrnam(group)) == NULL)
        return(FALSE);

    /* check against user's primary (passwd file) gid */
    if (grp->gr_gid == pw_gid)
        return(TRUE);

    /*
     * If the user has a supplementary group vector, check it first.
     */

******************************************************************************
After this for loop, I don't know why the group variable is reset to 0x0. The same happens with the grp struct. Could you explain it to me?
I also found, using dbx "print n", that the n variable does not change its value in each iteration.
******************************************************************************

    for (n = user_ngroups; n != 0; n--) {
        if (grp->gr_gid == user_groups[n])
            return(TRUE);
    }

******************************************************************************
Here my problems with the debuggers start. ddd (GUI) with gdb says that grp->gr_mem is out of bounds; dbx says that it is nil. In this case I trust gdb.
******************************************************************************

    if (grp->gr_mem != NULL) {
        for (cur = grp->gr_mem; *cur; cur++) {
            if (strcmp(*cur, user) == 0)
                return(TRUE);
        }
    }

    return(FALSE);
}
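
For reference, the supplementary-group step in the function above compares the group's gid with each entry of the user's group vector. A stand-alone sketch of that idea follows, assuming a zero-based vector such as the one filled in by getgroups(). The name gid_in_vector and its parameters are hypothetical and are not taken from the sudo source.

```c
#include <stddef.h>     /* size_t */
#include <sys/types.h>  /* gid_t */

/*
 * Return 1 if gid appears in the first ngroups entries of groups, else 0.
 * gid_in_vector is a hypothetical helper name used for illustration.
 */
int
gid_in_vector(gid_t gid, const gid_t *groups, size_t ngroups)
{
    size_t i;

    /* Valid indices run from 0 to ngroups - 1. */
    for (i = 0; i < ngroups; i++) {
        if (groups[i] == gid)
            return 1;
    }
    return 0;
}
```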

Sample debug output:

#> dbx ./sudo

(dbx) file parse.c
(dbx) stop in usergr_matches
[1] stop in usergr_matches
(dbx) stop in 511
[2] stop at 511
(dbx) status
[1] stop in usergr_matches
[2] stop at 511
(dbx) run
[1] stopped in usergr_matches at line 482 in file "parse.c"
  482   {
(dbx) dump
usergr_matches(group = "%gptst", user = "root", pw = 0x20008e48), line 482 in "parse.c"
n = 0
cur = 0x20002dd4
grp = 0x20009178
pw_gid = 536882644
(dbx) c
[2] stopped in usergr_matches at line 511 in file "parse.c" ($t1)
  511       if (grp->gr_mem != NULL) {
(dbx) dump
usergr_matches(group = "\360J\317h\360J\317n", user = "root", pw = (nil)), line 511 in "parse.c"
n = 9
cur = (nil)
grp = 0xf04acf50
pw_gid = 0
(dbx) print *grp
(gr_name = "gptst", gr_passwd = "!", gr_gid = 90801, gr_mem = 0xf04acf60)
(dbx) print grp->gr_mem
(nil)
(dbx) c
[2] stopped in usergr_matches at line 511 in file "parse.c" ($t1)
  511       if (grp->gr_mem != NULL) {
(dbx)

(dbx) c
[2] stopped in usergr_matches at line 511 in file "parse.c" ($t1)
  511       if (grp->gr_mem != NULL) {
(dbx) dump
usergr_matches(group = (nil), user = "root", pw = 0x00000076), line 511 in "parse.c"
n = 9
cur = 0xf04acf64
grp = (nil)
pw_gid = 4031434596
(dbx) c
[1] stopped in usergr_matches at line 482 in file "parse.c" ($t1)
  482   {
(dbx) dump
usergr_matches(group = "%gptstsu", user = "root", pw = 0x20008e48), line 482 in "parse.c"
n = 0
cur = 0x20002dd4
grp = 0x20009258
pw_gid = 536882644
(dbx) print *grp
(gr_name = (invalid char ptr (0x25677074)), gr_passwd = (invalid char ptr (0x73747375)), gr_gid = 0, gr_mem = 0x00000040)
(dbx) c
[2] stopped in usergr_matches at line 511 in file "parse.c" ($t1)
  511       if (grp->gr_mem != NULL) {
(dbx) print *grp
(gr_name = "gptstsu", gr_passwd = "!", gr_gid = 90802, gr_mem = 0xf04acf60)
(dbx) print grp->gr_mem
(nil)
(dbx) dump
usergr_matches(group = "\360J\317`\360J\317h", user = "root", pw = 0x20009170), line 511 in "parse.c"
n = 9
cur = (nil)   <<<<<<<<<<<<<<<------------ Here gdb shows me out of bounds. Does that make sense to you?
grp = 0xf04acf50
pw_gid = 0
(dbx) c

Segmentation fault in strcmp.strcmp [sudo] at 0x1000776c ($t1)
0x1000776c (strcmp+0xcc) 88a30001        lbz   r5,0x1(r3)
(dbx) print *grp
(gr_name = "gptstsu", gr_passwd = "!", gr_gid = 90802, gr_mem = 0xf04acf60)
(dbx) print grp->gr_mem
(nil)

Comment 16 Todd C. Miller 2007-06-22 10:05:23 MDT
This appears to be a bug in AIX's LDAP support, see:
http://www-1.ibm.com/support/docview.wss?uid=isg1IY81303

Can you try the fix in IY78339 and see if that fixes the problem?
Comment 17 José Luis 2007-06-22 10:21:45 MDT
Hi Todd, 

Looking at IBM's site I have found some APARs referring to getgrnam and LDAP groups:

IY81303: GETGRNAM LDAP GR_MEM PTR INVALID

Error description 
When no user is a member of a group, getgrnam returns garbage in the gr_mem field because it fails to insert a null byte in the field.
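
One way to check whether a given system shows this behavior, independent of sudo, is a small program that calls getgrnam() and prints the raw gr_mem pointer before walking it. The sketch below assumes nothing beyond the standard <grp.h> interface; dump_group is a hypothetical helper name. On an affected system the member walk may crash or print garbage, but the pointer value is already on the screen by then.

```c
#include <grp.h>     /* getgrnam(), struct group */
#include <stdio.h>   /* printf(), fflush() */

/*
 * Look up a group by name and print what getgrnam() returned.
 * Returns the number of members, or -1 if the group was not found.
 * dump_group is a hypothetical helper name used for illustration.
 */
int
dump_group(const char *name)
{
    struct group *grp = getgrnam(name);
    char **cur;
    int count = 0;

    if (grp == NULL) {
        printf("%s: no such group\n", name);
        return -1;
    }
    /* Print the raw pointer first so it is visible even if walking it crashes. */
    printf("%s: gid=%ld gr_mem=%p\n", name, (long) grp->gr_gid,
        (void *) grp->gr_mem);
    fflush(stdout);

    if (grp->gr_mem != NULL) {
        for (cur = grp->gr_mem; *cur != NULL; cur++) {
            printf("  member[%d]=%s\n", count, *cur);
            count++;
        }
    }
    return count;
}
```

Calling dump_group("gptst") or dump_group("gptstsu") from a short main() would exercise the two LDAP groups mentioned earlier in this report.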


I will need to upgrade the OS. I will get back to you as soon as I have tested.

Thanks for your great help.
Comment 18 José Luis 2007-06-26 04:38:39 MDT
Hi Todd, 

It seems that we both found the APAR at the same time.

I have tested and it works. Problem solved.

Thanks for your help and your quick responses.