The Linux/NFS 16 group limit

I just spent a couple of hours wondering why I suddenly couldn't use files belonging to a particular group on my Linux system at work. For example, a directory is group-owned by a group called rivetgun, it's group-writeable, and calling groups tells me that I'm a member of that group. But I can't write to it... what's up? The permissions of the group look identical to those of other groups that are behaving as normal. Eventually, I try logging in as another member of the same group... it works fine! Okay, so it's my problem, not the group.

buckley@h6:~$ groups
users rivet hepforge hepdata professor cedar lhapdf jimmy ktjet hztool jetweb hepml
hzsteer heptex feynml pyfeyn rivetgun hepjet statpattrec rr
buckley@h6:~$ touch hepjet/foo
touch: cannot touch 'hepjet/foo': Permission denied
buckley@h6:~$ touch pyfeyn/foo

Let's count from the left: 1 is users, 2 is rivet, 3 is hepforge, ... 16 is pyfeyn, 17 is rivetgun. Aha, suspicious: pyfeyn, which works, is the 16th group, while hepjet, which fails, is the 18th. Google delivers a handy link to this blog entry:

Synopsis: UNIX users can belong to UNIX groups and for many years the maximum
number of groups in Solaris has been limited to 16. Increasing it sounds easy
and of obvious benefit. It turns out to be neither...

The same logic applies to Linux - the 2.6 kernel can handle up to 64k groups per user, but an NFS request authenticated with the AUTH_SYS mechanism still carries at most 16 of them, so everything past the 16th is silently dropped. Bah!
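For anyone wanting to repeat the count-from-the-left check without doing it by eye, here is a rough sketch. It numbers a group list and flags everything past position 16. The function name number_groups is made up for illustration, and the sketch assumes, as the counting above does, that the order printed by groups is the order that matters; that may not hold on every system.

```shell
# Number each group in a whitespace-separated list read from stdin and
# flag every entry past the limit (default 16, the AUTH_SYS maximum).
# number_groups is a hypothetical helper name, not a standard tool.
number_groups() {
    limit=${1:-16}
    awk -v limit="$limit" '{
        for (i = 1; i <= NF; i++) {
            n++
            printf "%2d %s%s\n", n, $i, (n > limit ? "  <-- past the limit" : "")
        }
    }'
}

# The group list from the session above:
printf '%s\n' 'users rivet hepforge hepdata professor cedar lhapdf jimmy ktjet hztool jetweb hepml hzsteer heptex feynml pyfeyn rivetgun hepjet statpattrec rr' | number_groups
```

With the list from my session, entries 17 to 20 (rivetgun, hepjet, statpattrec and rr) come out flagged. On a live system the equivalent would be groups | number_groups.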

Okay, so the weird error is understood and fortunately I can remove myself from some of the groups in the first 16, so the immediate problem can be worked around for now. But this is a real problem - looks like more of my non-existent admin time will have to be devoted to finding a sensible ACL system next time I upgrade the server. sigh

Update: I've also found this blog entry which if anything is even more useful. It's quite definite and up-front with the opinion that NFS users should be using ACLs and ditching AUTH_SYS for RPCSEC_GSS... so when am I going to find the time and resources to convert HepForge to such a system? Hum. At least this issue came up before I re-wrote all the management scripts!
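As a first step towards that, it helps to know which mounts are actually using AUTH_SYS. The following is a minimal sketch, assuming a kernel that reports the security flavour as a sec= option in /proc/mounts (recent ones do; I haven't checked how far back that goes). The function name nfs_sys_mounts is invented for this example.

```shell
# Read mount entries in /proc/mounts format from stdin and print the mount
# point of every NFS filesystem whose options include sec=sys (AUTH_SYS).
# nfs_sys_mounts is a hypothetical helper name, not a standard tool.
nfs_sys_mounts() {
    awk '($3 == "nfs" || $3 == "nfs4") &&
         index("," $4 ",", ",sec=sys,") > 0 { print $2 }'
}

# Check the live mount table, if this system has one.
if [ -r /proc/mounts ]; then
    nfs_sys_mounts < /proc/mounts
fi
```

Any mount point it prints is subject to the 16 group limit; a mount using RPCSEC_GSS would show something like sec=krb5 instead and is not listed.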

