Any idea why chroot temporarily "cannot find name for group ID 0"? (Debian Linux Users forum, Debian Linux category)
Can you suggest why I'm seeing the following error when entering a debootstrap chroot on a Fedora host, and why it spontaneously disappears a few minutes later?

[root@XXXX svn]# chroot staging/www
id: cannot find name for group ID 0
id: cannot find name for group ID 1
id: cannot find name for group ID 2
id: cannot find name for group ID 3
id: cannot find name for group ID 4
id: cannot find name for group ID 6
id: cannot find name for group ID 10
I have no name!@XXXX:/#

More detail on the mystery problem is on linuxquestions.org [1], but in short: I created two chroots on a Fedora Core 4 box using debootstrap (hardy). The chroots work great, and one contains a webserver and the other contains a database. Everything is awesome except after some period of time (or some unknown activity), the www chroot temporarily "breaks" and spits out the error above when I try to enter it.

But the real mystery is it just spontaneously corrects itself a few minutes later, even though I make no changes to permissions or files. For example, here's a log of me exiting a broken chroot, and then immediately re-entering without having made any changes:

I have no name!@XXXX:/etc# exit
[root@XXXX svn]# chroot staging/www
root@XXXX:/#

Even stranger, I've experienced the 'www' chroot having this behavior (repeatably, for several minutes) even though I can chroot into a 'db' chroot just fine -- despite both being created with the exact same debootstrap command.

I'm at a total loss and posting this here because I'm not sure if it's a Debian or Fedora issue: it's happening inside a Debian chroot, but on a Fedora host.

Any suggestions? Thanks!

-david

[1] http://www.linuxquestions.org/questi...p-id-0-651928/

--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

can you give the output of cat > /etc/group

On 6/30/08, David Barrett <dbarrett@quinthar.com> wrote:
> Can you suggest why I'm seeing the following error when entering a
> debootstrap chroot on a Fedora host, and why it spontaneously disappears
> a few minutes later?

On Mon, Jun 30, 2008 at 10:42:53AM +0800, paragasu <paragasu@gmail.com> was heard to say:
> can you give the output of cat > /etc/group

Actually, you don't want to do that since it will erase your group file!

I think that the contents of /etc/group and /etc/nsswitch.conf, both when the system is working and when it's "broken", would be interesting, though.

Daniel

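One read-only way to collect what Daniel asks for, in both the working and the broken state, might look like the sketch below. The function name `snapshot_nss` and the directory layout are hypothetical, not something anyone in the thread used. The point is that every command only reads the chroot's files: `cat /etc/group` prints the file, whereas `cat > /etc/group` truncates it.

```shell
#!/bin/sh
# Hypothetical helper: copy a chroot's name-service files into a
# timestamped directory without modifying the originals.
snapshot_nss() {
    root=$1      # path to the chroot, e.g. /svn/staging/www
    outdir=$2    # directory that collects the snapshots
    stamp=$(date -u +%Y%m%dT%H%M%SZ)
    dest="$outdir/$stamp"
    mkdir -p "$dest" || return 1
    for f in group passwd nsswitch.conf; do
        if [ -r "$root/etc/$f" ]; then
            cp -p "$root/etc/$f" "$dest/$f"
            # Numeric owner/group, so the listing does not depend on name lookups.
            ls -ln "$root/etc/$f" >> "$dest/permissions.txt"
        else
            echo "missing or unreadable: $root/etc/$f" >> "$dest/permissions.txt"
        fi
    done
    # Print where the snapshot went so two runs can be compared with diff -r.
    echo "$dest"
}
```

Running it once while the chroot works and once while it is broken, then comparing the two directories with `diff -r`, would show whether the files themselves ever change.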
Ok, here it is. I apologize for its length, I want to include everything so you don't think I'm just doing some magic change without reporting it.

First, I ssh into the server and try to chroot in, and lucky us, we're experiencing it right away. I cat /etc/group as requested:

Last login: Sun Jun 29 02:31:35 2008 from 99-204-40-118.area1.spcsdns.net
[root@XXXX ~]# chroot /svn/staging/
[root@XXXX ~]# cd /svn
[root@XXXX svn]# chroot staging/www
id: cannot find name for group ID 0
id: cannot find name for group ID 1
id: cannot find name for group ID 2
id: cannot find name for group ID 3
id: cannot find name for group ID 4
id: cannot find name for group ID 6
id: cannot find name for group ID 10
I have no name!@XXXX:/# cat /etc/group
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mail:x:8:
news:x:9:
uucp:x:10:
man:x:12:
proxy:x:13:
kmem:x:15:
dialout:x:20:
fax:x:21:
voice:x:22:
cdrom:x:24:
floppy:x:25:
tape:x:26:
sudo:x:27:
audio:x:29:
dip:x:30:
www-data:x:33:
backup:x:34:
operator:x:37:
list:x:38:
irc:x:39:
src:x:40:
gnats:x:41:
shadow:x:42:
utmp:x:43:
video:x:44:
sasl:x:45:
plugdev:x:46:
staff:x:50:
games:x:60:
users:x:100:
nogroup:x:65534:
libuuid:x:101:
I have no name!@XXXX:/# date
Mon Jun 30 05:55:48 UTC 2008
I have no name!@XXXX:/# exit

After that I exit the chroot and go right back in:

[root@XXXX svn]# chroot staging/www
I have no name!@XXXX:/# date
Mon Jun 30 05:58:33 UTC 2008
I have no name!@XXXX:/# exit

Note that even though only 3 minutes have passed, now for some reason it doesn't complain about the missing group names. This is a direct session copy -- I haven't done anything on the server whatsoever. I keep logging out and back in several times to see if it fixes itself:

[root@XXXX svn]# chroot staging/www
I have no name!@XXXX:/# date
Mon Jun 30 06:02:37 UTC 2008
I have no name!@XXXX:/# exit
[root@XXXX svn]# chroot staging/www
I have no name!@XXXX:/# date
Mon Jun 30 06:06:07 UTC 2008
I have no name!@XXXX:/# exit

No luck.
10 minutes have passed and it's still busted. Just for grins, here's a look at the group permissions:

[root@XXXX svn]# chroot staging/www
I have no name!@XXXX:/# ls -latr /etc/group
-rw-r--r-- 1 0 root 461 Jun 13 00:58 /etc/group
I have no name!@XXXX:/# date
Mon Jun 30 06:11:04 UTC 2008
I have no name!@XXXX:/#

Note that the owner is still shown as 0, but the group is correctly resolved to 'root'. So, the mapping from group IDs to group names seems to be working, but the mapping from user IDs to names is still broken. Here's the passwd file: (In retrospect I should have checked the permissions of the passwd file, oops.)

I have no name!@XXXX:/etc# cat passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
proxy:x:13:13
www-data:x:33:33:www-data:/var/www:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh
list:x:38:38:Mailing List Manager:/var/list:/bin/sh
irc:x:39:39:ircd:/var/run/ircd:/bin/sh
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
I have no name!@XXXX:/etc# exit

During this time the chroot'd lighttpd webserver is hosting pages fine, but can't send emails. I try to restart it -- no luck.
The init.d script contains "chown www-data:www-data /var/run/lighttpd" and fails on that line:

[root@XXXX svn]# chroot staging/www
I have no name!@XXXX:/# /etc/init.d/lighttpd stop
chown: invalid user: `www-data:www-data'
I have no name!@XXXX:/# /etc/init.d/lighttpd start
chown: invalid user: `www-data:www-data'
I have no name!@XXXX:/# exit

I have the great idea of switching from usernames to user IDs: this solves the chown problem, but lighttpd.conf has a username. I switch to a user ID in lighttpd.conf but that doesn't work either, so I back out those changes. Note, I've made no changes to /etc/group or /etc/passwd.

I have no name!@XXXX:/# /etc/init.d/lighttpd start
 * Starting web server lighttpd
2008-06-30 06:17:39: (server.c.727) can't find username www-data
[fail]
I have no name!@XXXX:/# /etc/init.d/lighttpd start
 * Starting web server lighttpd
2008-06-30 06:18:22: (server.c.727) can't find username 33
[fail]
I have no name!@XXXX:/# exit

Also note, despite these changes, the chroot is still broken. More time passes and I try one more time:

[root@XXXX svn]# chroot staging/www
root@XXXX:/# date
Mon Jun 30 06:25:45 UTC 2008
root@XXXX:/# exit

Aha! it's fixed! 30 minutes after the first attempt it's suddenly working.

So one theory is something is changing the permissions of /etc/group and /etc/passwd. But I can't figure out what that might be. Next time I'm going to check the permissions first thing.

Even more fun is now it works fine, and will continue to work fine for some undetermined period of time. So, I'll post again when it happens again, and next time I'll do whoami, check permissions better, etc.

Any hints so far? Thanks for following along!

-david

On Mon, Jun 30, 2008 at 12:06:02AM -0700, David Barrett <dbarrett@quinthar.com> was heard to say:
> [root@XXXX svn]# chroot staging/www
> id: cannot find name for group ID 0
> id: cannot find name for group ID 1
> id: cannot find name for group ID 2
> id: cannot find name for group ID 3
> id: cannot find name for group ID 4
> id: cannot find name for group ID 6
> id: cannot find name for group ID 10

I wonder what you would get while this is happening if you run "strace id"; of course you might have to install strace in the chroot first. Also, did you check whether there's anything odd in nsswitch.conf? (I suppose probably not since you didn't mention setting anything up there, but it's worth a check.)

> Aha! it's fixed! 30 minutes after the first attempt it's suddenly working.

What cron jobs are scheduled? (system jobs as well as user jobs) Maybe one of them is causing this problem?

Do you have nscd installed in the chroots or on the main system?

> So one theory is something is changing the permissions of /etc/group and
> /etc/passwd. But I can't figure out what that might be. Next time I'm
> going to check the permissions first thing.

It seems unlikely that this is related to your problem. Your shell was unable to determine its user name, but it was running as root and root could read /etc/passwd.

Daniel

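The checks Daniel suggests could be scripted roughly as follows, to be run inside the chroot while it is broken. This is only a sketch: the function names are made up for illustration, and it assumes `getent` is available in the chroot. It compares what /etc/group says for a GID with what the name-service lookup actually returns, which is the distinction the error message hinges on.

```shell
#!/bin/sh
# Print the group name that FILE (in /etc/group format) records for GID.
group_name_in_file() {
    awk -F: -v gid="$2" '$3 == gid { print $1; exit }' "$1"
}

# Hypothetical comparison: file content versus the NSS lookup result.
# If "file" has a name but "nss" is empty, the file is fine and the
# lookup path is what fails.
compare_group_lookup() {
    gid=$1
    from_file=$(group_name_in_file /etc/group "$gid")
    from_nss=$(getent group "$gid" | cut -d: -f1)
    echo "gid=$gid file=${from_file:-<none>} nss=${from_nss:-<none>}"
}

# Example, using the GIDs from the error messages:
#   for g in 0 1 2 3 4 6 10; do compare_group_lookup "$g"; done
# The strace run could be saved to a file for posting (needs strace installed):
#   strace -f -o /tmp/id.trace id
```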
Ok, even more mystery, check out this session. I mentioned I have two nearly-identical chroots, and that sometimes one works while the other one doesn't. Here's an example of that, combined with the broken chroot fixing itself almost instantly:

Last login: Mon Jun 30 07:15:52 2008 from c-98-207-97-133.hsd1.ca.comcast.net
[root@XXXX ~]# cd /svn
[root@XXXX svn]# chroot staging/www
root@XXXX:/# exit
[root@XXXX svn]# chroot staging/db
id: cannot find name for group ID 0
id: cannot find name for group ID 1
id: cannot find name for group ID 2
id: cannot find name for group ID 3
id: cannot find name for group ID 4
id: cannot find name for group ID 6
id: cannot find name for group ID 10
I have no name!@XXXX:/# ls -latr /etc/group
-rw-r--r-- 1 0 root 461 Jun 28 23:34 /etc/group
I have no name!@XXXX:/# ls -latr /etc/passwd
-rw-r--r-- 1 root root 761 Jun 28 23:34 /etc/passwd
I have no name!@XXXX:/# ls -latr /etc/group
-rw-r--r-- 1 root root 461 Jun 28 23:34 /etc/group
I have no name!@XXXX:/# exit
[root@XXXX svn]# chroot staging/db
root@XXXX:/#

Basically, I go into staging/www, and it works fine. Then I go into staging/db, and it has the problem. I immediately check the group permissions, and note that now group IDs are being resolved to group names, but user IDs aren't getting resolved. I then check the passwd permissions, and note that both user and group names are now working. I go right back to the group file, and now group and usernames are working fine. I exit the broken DB chroot, and re-enter just fine.

All this happened in probably under a minute; that's the entire transcript, unaltered. There are no other SSH sessions on that box.

As for nsswitch.conf, here it is: I haven't changed it, but I'm not familiar with the file so I don't know if it's right or not:

-------------------------------------------------------------------
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd: compat
group: compat
shadow: compat

hosts: files dns
networks: files

protocols: db files
services: db files
ethers: db files
rpc: db files

netgroup: nis
-------------------------------------------------------------------

As for cron, I've got none inside the chroots, and none that I think would touch them. I'll need to check on that. But the timing would need to be consistently coincidental (or it could be going at very high frequency). Furthermore, I'm not sure what a cron job could do that would trigger this in the first place.

As for nscd... Aha! This is a good candidate: it turns out I *do* have this installed on the host system. I don't know anything about this; I'll need to read up on it. But looking over the config file, it looks like a very likely explanation:

-------------------------------------------------------------------
# logfile /var/log/nscd.log
# threads 6
# max-threads 128
server-user nscd
# stat-user nocpulse
debug-level 0
# reload-count 5
paranoia no
# restart-interval 3600

enable-cache passwd yes
positive-time-to-live passwd 600
negative-time-to-live passwd 20
suggested-size passwd 211
check-files passwd yes
persistent passwd yes
shared passwd yes
max-db-size passwd 33554432

enable-cache group yes
positive-time-to-live group 3600
negative-time-to-live group 60
suggested-size group 211
check-files group yes
persistent group yes
shared group yes
max-db-size group 33554432

enable-cache hosts yes
positive-time-to-live hosts 3600
negative-time-to-live hosts 20
suggested-size hosts 211
check-files hosts yes
persistent hosts yes
shared hosts yes
max-db-size hosts 33554432
-------------------------------------------------------------------

I'm putting my money on nscd for now, though why this would screw up email sending from PHP within the chroot, I don't know. (But then again, maybe that's an entirely different erratic problem.)

Thanks for all your help!

-david

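To compare the cache lifetimes in the nscd config above with how long the breakage actually lasts, the TTL values can be pulled out with a small helper. This is an illustrative sketch; `nscd_ttl` is a made-up name, and it only assumes the three-column "keyword table value" layout shown in the file above.

```shell
#!/bin/sh
# Print the positive or negative time-to-live (in seconds) that an
# nscd.conf-style file sets for one table (passwd, group, hosts).
# Commented-out lines start with "#", so they never match the keyword.
nscd_ttl() {
    conf=$1    # path to the config file
    kind=$2    # "positive" or "negative"
    table=$3   # "passwd", "group" or "hosts"
    awk -v key="$kind-time-to-live" -v table="$table" \
        '$1 == key && $2 == table { print $3; exit }' "$conf"
}

# Example:
#   nscd_ttl /etc/nscd.conf negative group
```

With the values posted above, a cached "not found" answer would expire after 20 seconds for passwd and 60 seconds for group, which is much shorter than the 30 minutes observed earlier in the thread. If nscd does turn out to be involved, its `-i TABLE` (`--invalidate`) option flushes a single table on the host, which would be one way to test the theory.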
On Mon, Jun 30, 2008 at 01:12:34PM -0700, David Barrett <dbarrett@quinthar.com> was heard to say:
> Basically, I go into staging/www, and it works fine. Then I go into
> staging/db, and it has the problem. I immediately check the group
> permissions, and note that now group IDs are being resolved to group
> names, but user IDs aren't getting resolved. I then check the passwd
> permissions, and note that both user and group names are now working. I
> go right back to the group file, and now group and usernames are working
> fine. I exit the broken DB chroot, and re-enter just fine.

Just out of curiosity, does the chroot that *was* working continue to work after this?

If you run, e.g., "id" a few times when it's broken, does it continue to be broken? And although I can't imagine why this would be the case, does the problem consistently go away for, e.g., passwords after you "ls" the password file? I notice that this happened in both of your last two examples: your problems with each file went away as soon as you listed it. Is there anything unusual about how your filesystems are configured?

> As for nsswitch.conf, here it is: I haven't changed it, but I'm not
> familiar with the file so I don't know if it's right or not:

That looks right. The main concern would be if you had done something like fetching user information from LDAP, which would be another place for bugs to hide.

> As for nscd... Aha! This is a good candidate: it turns out I *do* have
> this installed on the host system. I don't know anything about this;
> I'll need to read up on it.

nscd caches lookups of things like uid <-> name mappings. I've had various problems in the past, which I won't detail because I can't remember them in detail, and I wouldn't recommend installing it unless you need to (mostly if you're using something like NIS or LDAP). This looks like the sort of gremlin that nscd could cause.

However, I don't think I needed to ask whether you had it installed on the host system: it looks like it communicates via a Unix-domain socket in /var, so it wouldn't be able to interfere with what's happening in the chroots.

I think an strace of a failing command (e.g., "id") would be very interesting.

Daniel

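Daniel's reasoning about the socket can be checked directly: a process inside a chroot can only talk to the host's nscd if the socket is visible under the chroot's own /var. The sketch below assumes the socket lives at var/run/nscd/socket, which is the usual glibc location but is an assumption here, not something stated in the thread; the function name is hypothetical.

```shell
#!/bin/sh
# Report whether an nscd socket is visible under the given root directory.
# Pass "/" for the host or the chroot path (e.g. /svn/staging/www).
# Assumption: the socket path is <root>/var/run/nscd/socket.
nscd_socket_in() {
    sock="$1/var/run/nscd/socket"
    if [ -S "$sock" ]; then
        echo "reachable: $sock"
        return 0
    fi
    echo "not reachable: $sock"
    return 1
}

# Example:
#   nscd_socket_in /
#   nscd_socket_in /svn/staging/www
```

If the chroot reports "not reachable" (and /var/run is not bind-mounted into it), processes in the chroot cannot be using the host's cache.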
Wow, great observation: doing a "ls" of /etc/group and /etc/passwd fixes it. How incredibly strange:

[root@XXXX svn]# chroot staging/db
id: cannot find name for group ID 0
id: cannot find name for group ID 1
id: cannot find name for group ID 2
id: cannot find name for group ID 3
id: cannot find name for group ID 4
id: cannot find name for group ID 6
id: cannot find name for group ID 10
I have no name!@XXXX:/# ls /etc/group /etc/passwd
/etc/group /etc/passwd
I have no name!@XXXX:/# exit
[root@XXXX svn]# chroot staging/db
root@XXXX:/#

Anyway, it's fantastic to have a workaround. Thanks a million!

Also, here's the output of "id" from a broken chroot; I don't actually know if this fixes it -- I tried the "ls" trick and that did it, so I don't know if "id" will fix it, too.

I have no name!@XXXX:/# id
uid=0 gid=0 groups=0,1,2,3,4,6,10

Finally, fixing one chroot neither fixes nor breaks the other -- they seem entirely independent. (Great question, though.)

So, the next time I see it I'm going to see if "id" fixes it by itself. I'm also going to read up more on nscd and see if tweaking that helps. It's a little slow going as it seems to take a long time for this problem to re-appear; perhaps I can tweak nscd to force it to happen more frequently, and thus better figure out how to make it not happen at all.

Thanks for all your help!

-david

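Since the problem takes a long time to reappear, detection and the "ls" workaround could be automated along these lines. This is a sketch under assumptions: the function names are invented, it must run as root on the host, and it assumes that running `id` through `chroot DIR id` shows the same unresolved output as an interactive session, which the thread does not confirm.

```shell
#!/bin/sh
# Succeeds (exit 0) when the uid or gid in `id` output has no "(name)"
# part, as in the broken output "uid=0 gid=0 groups=0,1,2,3,4,6,10".
id_output_unresolved() {
    printf '%s\n' "$1" | grep -Eq '(uid|gid)=[0-9]+( |$)'
}

# Hypothetical wrapper: log the time and apply the "ls" workaround
# whenever the chroot fails to resolve names.
check_and_fix() {
    root=$1
    out=$(chroot "$root" id 2>/dev/null)
    if id_output_unresolved "$out"; then
        echo "$(date -u) $root: names unresolved, applying ls workaround"
        chroot "$root" ls /etc/group /etc/passwd >/dev/null
    fi
}

# Example (as root on the host):
#   check_and_fix /svn/staging/www
#   check_and_fix /svn/staging/db
```

Logging the timestamps of each failure would also show whether the breakage follows a schedule, which bears on the cron question raised earlier.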
On Wed, Jul 02, 2008 at 05:38:50PM -0700, David Barrett <dbarrett@quinthar.com> was heard to say:
> Wow, great observation: doing a "ls" of /etc/group and /etc/passwd fixes
> it. How incredibly strange:

I'd go for "jawdroppingly bizarre" myself.

The only other thing I can think of is that maybe there's something odd at the filesystem level. Is it anything but a straight ext3 filesystem? (e.g., are you using NFS, unionfs, fuse filesystems, etc.)

Daniel

Ok, so I have definitely confirmed that:

1) Both the db and www chroot periodically break
2) When broken, the www chroot can't send email from PHP
3) Each can be fixed by chrooting in with "ls /etc/group /etc/passwd"
4) Fixing one doesn't fix the other
5) Once the www chroot is fixed, PHP can send email just fine
6) I needn't restart lighttpd/php to fix it

The filesystem is ext3. One theory was nscd, but I don't think I have that running. At the very least, I don't see an nscd process:

[root@XXXX svn]# ps aux | grep nscd
root 12260 0.0 0.1 3916 684 pts/0 S+ 00:01 0:00 grep nscd
[root@XXXX svn]#

I'm thinking the real solution is to ditch this old FC4 host system (which was installed by the dedicated server provider, and thus I don't know what weird changes they made) and switch to Debian.

-david

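In the `ps aux | grep nscd` output above, the only match is the grep command itself, which is consistent with nscd not running. A check that cannot match itself could be sketched with `pgrep`, assuming the procps tools are installed on the host; the function name is hypothetical.

```shell
#!/bin/sh
# Succeeds (exit 0) if a process with exactly this name is running.
# pgrep -x compares the whole process name, so the check never
# reports its own command line the way "ps aux | grep NAME" does.
process_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Example:
#   if process_running nscd; then echo "nscd is running"; else echo "nscd is not running"; fi
```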