Understanding and Debugging Kerberised NFSv4

This page was the result of much headbanging - and not the fun sort. Hopefully you may benefit.
Kerberos is hard to understand in actual detail, and typical implementations' error handling is poor.
One tool I can highly recommend is Wireshark, which decodes all the KRB5 and NFS packets, even going down to the horrible ASN.1 level.
Solaris snoop is also pretty useful.

Routine Debugging (newnfs not Apple nfs)

1) Errors: “SERVER_NOT_FOUND on KDC” or “Required KADM5 principal missing while initializing kadmin interface”
It may mean that your kdc server name (blackhole.dfusion.com.au) appears differently in your client and servers’ ‘hosts’ database (typically DNS or /etc/hosts). Kerberos uses the canonical FQDN name for a host (eg blackhole.dfusion.com.au) regardless of other hostname aliases.
Check the entries eg “getent hosts blackhole” (or for /etc/hosts resolution just “grep blackhole /etc/hosts”) and compare the results on server and client. I believe they should be identical.
Run klookup and also check any domain name set in /etc/resolv.conf.
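
The comparison above can be scripted. This is only a minimal sketch: the function names canonical_view and same_canonical_name are my own (hypothetical), and it relies on Python's standard socket.gethostbyname_ex, which goes through the same system resolver (DNS or /etc/hosts) that getent uses. Run it on both client and server and compare what each reports for the KDC.

```python
import socket


def canonical_view(name, resolver=socket.gethostbyname_ex):
    """Return (canonical name, aliases, addresses) as this host resolves `name`.

    `resolver` is injectable so the comparison logic can be exercised
    without touching the network.
    """
    canonical, aliases, addresses = resolver(name)
    return canonical.lower(), sorted(a.lower() for a in aliases), sorted(addresses)


def same_canonical_name(view_a, view_b):
    """True when two hosts agree on the canonical name (first field)."""
    return view_a[0] == view_b[0]
```

For example, print(canonical_view("blackhole")) on each machine shows which name that machine treats as canonical. If one box answers with the short name and the other with the FQDN, that mismatch is the first thing I would fix.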

2) ...

3) If you get really stuck use Wireshark and compare your network packets with my reference set.

Deeper Problems and Implementation Specific Challenges

Tiger and Leopard work differently under the bonnet with a default configuration which can confuse things.

Session ticket encryption

Other encryption types are generated by the TGS for the TGS/Server so the client doesn't care about enc. type.
I've researched what Leopard does. Interestingly its kerberos side can handle the full range of enc types.
One doesn't need to limit any enc types in the kerberos config (ie don't need to set default_tkt_enctypes or default_tgs_enctypes).
If, for example, one does force Leopard to get an aes256* encoded session key, it will ask for and receive one happily
(examination of the krbtgt in the Kerberos.app shows this is all fine).

However Leopard does something "sneaky" under the bonnet if it receives an nfs session ticket (result of a TGS_REQ/REP) that is encoded with anything other than des-cbc-crc(1), des-cbc-md5(3) or des-cbc-md4(2).
In this case Leopard (which daemon/kext?) makes a new TGS_REQ asking for only these three types and naturally gets its first requested type (the lowest common denominator, des-cbc-crc) in the TGS_REP.
Thus regardless of what krb5.conf is set to, you always end up with an nfs enc type of 1, 3 or 2. If you care about 3 being slightly more secure than 2 (I believe) then the only way to get it is to force ALL session tickets (AND all server tickets I think) to be des-cbc-md5 by setting both the krb5.conf defaults mentioned above. So one has the choice of either all tickets at a moderately low grade or just the nfs ticket at a very low grade.
Bear in mind any security is enough for most people, I'm just being pedagogical.

Tiger on the other hand (with newnfs' gsscl) doesn't appear to do this (although I didn't check this fully).
Tiger does support/offer many session ticket enc types, however gsscl/newnfs is NOT happy with pretty much anything except types 1, 3 (and 2?) - the basic des ones. When the Tiger combination gets an enc type it can't handle, it just barfs with Authentication error.
So in the case of Tiger one MUST set default_tgs_enctypes to one of the basic set.

Note: the Solaris Kerberos server (based on MIT) apparently has an implementation constraint requiring default_tkt_enctypes to be a subset of default_tgs_enctypes. Therefore one unfortunately must also set default_tkt_enctypes on a Tiger box, effectively to the same set as default_tgs_enctypes.
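
A quick way to sanity-check a Tiger box's krb5.conf against the two rules above (basic des types only for default_tgs_enctypes, and the ticket enctypes a subset of the tgs ones) might look like the sketch below. The helper names, the sample config and the wording of the messages are all my own assumptions. The parser is deliberately naive: it ignores section headers and only looks at the two enctype keys.

```python
# The three basic des enctypes (types 1, 2 and 3) discussed above.
BASIC_DES = {"des-cbc-crc", "des-cbc-md4", "des-cbc-md5"}
KEYS = ("default_tkt_enctypes", "default_tgs_enctypes")


def enctype_lists(conf_text):
    """Collect the two enctype lists from krb5.conf text (naive parse)."""
    found = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;" or "=" not in line:
            continue  # blank line, comment line, or section header
        key, value = (part.strip() for part in line.split("=", 1))
        if key in KEYS:
            # Lists may be separated by whitespace or commas.
            found[key] = value.replace(",", " ").split()
    return found


def check(conf_text):
    """Return a list of problems; an empty list means both rules hold."""
    lists = enctype_lists(conf_text)
    tkt = lists.get("default_tkt_enctypes", [])
    tgs = lists.get("default_tgs_enctypes", [])
    problems = []
    if not tgs:
        problems.append("default_tgs_enctypes is not set")
    elif not set(tgs) <= BASIC_DES:
        problems.append("default_tgs_enctypes contains non-des types")
    if not tkt:
        problems.append("default_tkt_enctypes is not set")
    elif not set(tkt) <= set(tgs):
        problems.append("default_tkt_enctypes is not a subset of default_tgs_enctypes")
    return problems


# Hypothetical sample in the shape described above.
SAMPLE = """
[libdefaults]
    default_tgs_enctypes = des-cbc-crc des-cbc-md5
    default_tkt_enctypes = des-cbc-crc des-cbc-md5
"""

print(check(SAMPLE))
```

Feeding it the real file is just check(open("/etc/krb5.conf").read()), assuming that is where your krb5.conf lives.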

Some encryption types provoke indigestible response from Solaris RPCGSS_INIT

Why can't other keys like AES* be handled by Tiger/Leopard as nfs service tickets?
There is a "feature" apparently related to aes* keys (maybe others too, I haven't tested fully): when such a ticket is passed to the NFS NULL proc (via RPCGSS_INIT), the server (at least on Solaris) returns a GSS token that is not GSSAPI compliant - maybe it isn't wrapped. (I hope I have the terms correct.) I've seen various notes that might relate to this but I'm still unsure whether it is a bug or a new feature.
Anyway the newnfs code checks for an initial Verification Token value of 0x60, whereas the server sends back an (unencapsulated) MIC object starting with 040405. Maybe the standards are "evolving".
(There was a Solaris bug mentioned in this general area from 2008 but I can't believe it would still be there)

NFS NULL REPLY (that newnfs/gsscl are happy with). The token starts with 0x60 and carries the KRB5 OID ahead of the krb5_blob.
        Flavor: RPCSEC_GSS (6)
        GSS Token: 00000025602306092A864886F71201020201010000FFFFFF...
            GSS Token Length: 37
            GSS-API Generic Security Service Application Program Interface
                OID: 1.2.840.113554.1.2.2 (KRB5 - Kerberos 5)
                krb5_blob: 01010000FFFFFFFF4A604A70DDC86EF623C36495D01F8111
                    krb5_tok_id: KRB5_GSS_GetMIC (0x0101)
                    krb5_sgn_alg: DES MAC MD5 (0x0000)
                    krb5_snd_seq: 4A604A70DDC86EF6
                    krb5_sgn_cksum: 23C36495D01F8111

NFS NULL REPLY (that newnfs/gsscl are NOT happy with). Note that the krb5_blob has no preceding OID, and the token id/type is a different variant of GetMIC.
gss gets the raw 0x040405... rather than the 0x60... and barfs.

        Flavor: RPCSEC_GSS (6)
        GSS Token: 0000001C040405FFFFFFFFFF0000000028E28928474CDA1B...
            GSS Token Length: 28
            GSS-API Generic Security Service Application Program Interface
                krb5_blob: 040405FFFFFFFFFF0000000028E28928474CDA1B692A9FEA...
                    krb5_tok_id: KRB_TOKEN_CFX_GetMic (0x0404)
                    krb5_cfx_flags: 0x05
                        .... .1.. = AcceptorSubkey: Set
                        .... ..0. = Sealed: Not set
                        .... ...1 = SendByAcceptor: Set
                    krb5_filler: FFFFFFFFFF
                    krb5_cfx_seq: 685934888
                    krb5_sgn_cksum: 474CDA1B692A9FEAAB616B1D
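
To tell the two reply styles apart without eyeballing hex, a small decoder like the sketch below can help. It is not what newnfs/gsscl actually does; the function name and the dictionary keys are made up for illustration. The two byte strings are rebuilt from the fields in the dumps above (minus the 4-byte length prefix). The 0x0404 field layout (flags byte, five filler bytes, 64-bit big-endian sequence number, then checksum) follows RFC 4121 as I read it.

```python
import struct

# DER encoding of OID 1.2.840.113554.1.2.2 (KRB5), as seen in the first dump.
KRB5_OID_DER = bytes.fromhex("06092A864886F712010202")

# Token bodies rebuilt from the two dumps above.
FRAMED = bytes.fromhex(
    "6023"
    "06092A864886F712010202"
    "01010000FFFFFFFF4A604A70DDC86EF623C36495D01F8111"
)
RAW = bytes.fromhex(
    "040405FFFFFFFFFF"
    "0000000028E28928"
    "474CDA1B692A9FEAAB616B1D"
)


def describe_gss_token(token):
    """Classify the verifier token from an RPCSEC_GSS NULL reply."""
    if token[:1] == b"\x60":
        # Framed style: 0x60, one length byte (short form assumed), OID, inner token.
        body = token[2:]
        has_oid = body.startswith(KRB5_OID_DER)
        inner = body[len(KRB5_OID_DER):] if has_oid else body
        return {"framed": True, "krb5_oid": has_oid, "tok_id": inner[:2].hex()}
    if token[:2] == b"\x04\x04":
        # Unframed CFX MIC token: TOK_ID, flags, filler, sequence, checksum.
        flags = token[2]
        (seq,) = struct.unpack(">Q", token[8:16])
        return {
            "framed": False,
            "tok_id": "0404",
            "sent_by_acceptor": bool(flags & 0x01),
            "sealed": bool(flags & 0x02),
            "acceptor_subkey": bool(flags & 0x04),
            "seq": seq,
            "checksum": token[16:].hex(),
        }
    return {"framed": False, "tok_id": token[:2].hex()}


print(describe_gss_token(FRAMED))
print(describe_gss_token(RAW))
```

The first token is reported as framed with tok_id 0101; the second is reported as unframed with tok_id 0404, which is the case gsscl chokes on.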

The original document is available at http://dfusion.com.au/wiki/tiki-index.php?page=Understanding+and+Debugging+Kerberised+NFSv4