Understanding and Debugging Kerberised NFSv4

This page is the result of much headbanging, and not the fun sort. Hopefully you will benefit.
Kerberos is hard to understand in detail, and the error handling in typical implementations is poor.
One tool I can highly recommend is Wireshark, which decodes all the KRB5 and NFS packets, even down to the horrible ASN.1 level.
Solaris snoop is also pretty useful.

Routine Debugging (newnfs not Apple nfs)

1) Errors: “SERVER_NOT_FOUND on KDC” or “Required KADM5 principal missing while initializing kadmin interface”
This may mean that your KDC server name appears differently in the ‘hosts’ databases of your client and your servers (typically DNS or /etc/hosts). Kerberos uses the canonical FQDN for a host, regardless of other hostname aliases.
Check the entries, e.g. “getent hosts blackhole” (or, for /etc/hosts resolution, just “grep blackhole /etc/hosts”), and compare the results on server and client. I believe they should be identical.
Run klookup and also check any domain name in /etc/resolv.conf.
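
The comparison above can be scripted. What follows is only a minimal sketch, assuming Python is available on both client and server; the helper names host_view and differences are my own inventions for illustration. It records how one machine resolves a name so that the output from each box can be compared.

```python
import socket

def host_view(name, resolver=socket.gethostbyname_ex):
    """Return (canonical_name, aliases, addresses) as this machine sees `name`.

    `resolver` defaults to the system lookup; it is a parameter only so that
    the function can be exercised without touching DNS or /etc/hosts.
    """
    canonical, aliases, addresses = resolver(name)
    return canonical.lower(), sorted(a.lower() for a in aliases), sorted(addresses)

def differences(view_a, view_b):
    """List which fields differ between two host_view() results."""
    labels = ("canonical name", "aliases", "addresses")
    return [label for label, a, b in zip(labels, view_a, view_b) if a != b]
```

Running print(host_view("blackhole")) on the client and on the server, then passing the two results to differences(), should give an empty list if both machines agree on the canonical name.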

2) ...

3) If you get really stuck use Wireshark and compare your network packets with my reference set.

Deeper Problems and Implementation Specific Challenges

With a default configuration, Tiger and Leopard work differently under the bonnet, which can confuse things.

Session ticket encryption

Other encryption types are generated by the TGS for the TGS/server, so the client doesn't care about the enc type.
I've researched what Leopard does. Interestingly, its Kerberos side can handle the full range of enc types.
One doesn't need to limit any enc types in the Kerberos config (i.e. there is no need to set default_tkt_enctypes or default_tgs_enctypes).
If, for example, one forces Leopard to get an aes256* encoded session key, it will ask for and receive one happily
(examination of the krbtgt shows this is all fine).

However, Leopard does something "sneaky" under the bonnet if it receives an nfs session ticket (the result of a TGS_REQ/TGS_REP) that is encoded with anything other than des-cbc-crc (1), des-cbc-md5 (3) or des-cbc-md4 (2).
In this case Leopard (which daemon/kext?) makes a new TGS_REQ asking for only these three types and naturally gets its first requested type (the lowest common denominator, des-cbc-crc) in the TGS_REP.
Thus, regardless of what krb5.conf is set to, you always end up with an nfs enc type of 1, 3 or 2. If you care about 3 being slightly more secure than two (I believe), then the only way to get it is to force ALL session tickets (AND all server tickets, I think) to be des-cbc-md5 by setting both the krb5.conf defaults mentioned above. So one has the choice of either all tickets at a moderately low grade or just the nfs ticket at a very low grade.
Bear in mind that any security is enough for most people; I'm just being pedagogical.

Tiger, on the other hand (with newnfs' gsscl), doesn't appear to do this (although I didn't check this fully).
Tiger does support/offer many session ticket enc types; however, gsscl/newnfs is NOT happy with pretty much anything except types 1, 3 (and 2?), the basic DES ones. When the Tiger combination gets an enc type it can't handle, it just barfs with "Authentication error".
So in the case of Tiger one MUST set default_tgs_enctypes to one of the basic set.

Note: the Solaris Kerberos server (based on MIT) apparently has an implementation constraint requiring default_tkt_enctypes to be a subset of default_tgs_enctypes. Therefore one unfortunately must also set default_tkt_enctypes on a Tiger box, effectively to the same set as default_tgs_enctypes.
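
The two constraints above (basic DES types only for default_tgs_enctypes on Tiger, and default_tkt_enctypes being a subset of default_tgs_enctypes) can be checked mechanically. This is a rough sketch only, assuming the settings live in the [libdefaults] section of krb5.conf; the function names, the SAMPLE text and the EXAMPLE.COM realm are made up for illustration and are not taken from any real setup.

```python
BASIC_DES = {"des-cbc-crc", "des-cbc-md4", "des-cbc-md5"}  # enc types 1, 2, 3

def libdefaults_enctypes(conf_text):
    """Collect default_tkt_enctypes / default_tgs_enctypes from [libdefaults]."""
    found = {}
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "libdefaults" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key in ("default_tkt_enctypes", "default_tgs_enctypes"):
                # lists may be separated by spaces or commas
                found[key] = value.replace(",", " ").split()
    return found

def check_enctypes(conf_text):
    """Return a list of warnings; an empty list means nothing was flagged."""
    found = libdefaults_enctypes(conf_text)
    tkt = found.get("default_tkt_enctypes")
    tgs = found.get("default_tgs_enctypes")
    warnings = []
    if tgs is None:
        warnings.append("default_tgs_enctypes is not set")
    else:
        extra = [e for e in tgs if e not in BASIC_DES]
        if extra:
            warnings.append("default_tgs_enctypes has non-DES types: " + " ".join(extra))
    if tkt is None:
        warnings.append("default_tkt_enctypes is not set")
    elif tgs is not None:
        outside = [e for e in tkt if e not in tgs]
        if outside:
            warnings.append("default_tkt_enctypes is not a subset of default_tgs_enctypes: "
                            + " ".join(outside))
    return warnings

# Hypothetical example of a Tiger-friendly fragment (realm name is a placeholder).
SAMPLE = """
[libdefaults]
    default_realm = EXAMPLE.COM
    default_tgs_enctypes = des-cbc-md5 des-cbc-crc
    default_tkt_enctypes = des-cbc-md5
"""
```

Calling check_enctypes(open("/etc/krb5.conf").read()) on a Tiger box would then list anything that breaks either rule; the path is an assumption and may differ on your system.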

Some encryption types provoke an indigestible response from Solaris RPCGSS_INIT

Why can't other keys like AES* be handled by Tiger/Leopard as nfs service tickets?
There is a "feature" apparently related to aes* keys (maybe others too; I haven't tested fully): when such a ticket is passed to the NFS NULL proc (via RPCGSS_INIT), the server (at least on Solaris) returns a GSS token that is not GSSAPI compliant. Maybe it isn't wrapped. (I hope I have the terms correct.) I've seen various notes that might relate to this, but I'm still unsure whether it is a bug or a new feature.
Anyway, the newnfs code checks for an initial Verification Token value of 0x60, whereas the server sends back an (unencapsulated) MIC object starting with 040405. Maybe the standards are "evolving".
(There was a Solaris bug mentioned in this general area from 2008, but I can't believe it would still be there.)

NFS NULL REPLY (that newnfs/gsscl are happy with). After its 4-byte length, the GSS token starts with 0x60 and carries the OID.
        Flavor: RPCSEC_GSS (6)
        GSS Token: 00000025602306092A864886F71201020201010000FFFFFF...
            GSS Token Length: 37
            GSS-API Generic Security Service Application Program Interface
                OID: 1.2.840.113554.1.2.2 (KRB5 - Kerberos 5)
                krb5_blob: 01010000FFFFFFFF4A604A70DDC86EF623C36495D01F8111
                    krb5_tok_id: KRB5_GSS_GetMIC (0x0101)
                    krb5_sgn_alg: DES MAC MD5 (0x0000)
                    krb5_snd_seq: 4A604A70DDC86EF6
                    krb5_sgn_cksum: 23C36495D01F8111

NFS NULL REPLY (that newnfs/gsscl choke on). Note that the krb5_blob has no preceding OID, and the token id/type is a different variant of GetMIC.
The client's gss code gets the raw 0x040405... rather than the 0x60... and barfs.

        Flavor: RPCSEC_GSS (6)
        GSS Token: 0000001C040405FFFFFFFFFF0000000028E28928474CDA1B...
            GSS Token Length: 28
            GSS-API Generic Security Service Application Program Interface
                krb5_blob: 040405FFFFFFFFFF0000000028E28928474CDA1B692A9FEA...
                    krb5_tok_id: KRB_TOKEN_CFX_GetMic (0x0404)
                    krb5_cfx_flags: 0x05
                        .... .1.. = AcceptorSubkey: Set
                        .... ..0. = Sealed: Not set
                        .... ...1 = SendByAcceptor: Set
                    krb5_filler: FFFFFFFFFF
                    krb5_cfx_seq: 685934888
                    krb5_sgn_cksum: 474CDA1B692A9FEAAB616B1D
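
To tell the two kinds of reply apart without reading the whole dissection, the leading bytes of the GSS token are enough. Below is a small sketch, assuming you paste in the hex of the "GSS Token" field as Wireshark shows it (4-byte length first; a truncated prefix is fine). The function name classify_gss_token is my own, and the byte offsets are simply read off the two dumps above rather than taken from the newnfs source.

```python
def classify_gss_token(hex_prefix):
    """Classify an RPCSEC_GSS verifier token from its leading hex digits.

    `hex_prefix` is the XDR opaque: a 4-byte big-endian length followed by
    the token bytes. Only the first bytes of the token are inspected.
    """
    data = bytes.fromhex(hex_prefix)
    info = {"length": int.from_bytes(data[:4], "big")}
    body = data[4:]
    if body[:1] == b"\x60":
        # 0x60 opens the framing that carries the mechanism OID;
        # this is the form newnfs/gsscl accept.
        info["kind"] = "framed (0x60 + OID)"
    elif body[:2] == b"\x04\x04":
        # Bare CFX GetMIC token with no OID in front of it.
        info["kind"] = "bare CFX MIC (0x0404)"
        if len(body) >= 3:
            flags = body[2]
            info["flags"] = {
                "SendByAcceptor": bool(flags & 0x01),
                "Sealed": bool(flags & 0x02),
                "AcceptorSubkey": bool(flags & 0x04),
            }
        if len(body) >= 16:
            # bytes 3..7 are filler, bytes 8..15 the sequence number
            info["seq"] = int.from_bytes(body[8:16], "big")
    else:
        info["kind"] = "unknown"
    return info

# The two token prefixes from the dumps above.
GOOD = "00000025602306092A864886F71201020201010000FFFFFF"
BAD = "0000001C040405FFFFFFFFFF0000000028E28928474CDA1B"
```

classify_gss_token(GOOD) reports the framed form, while classify_gss_token(BAD) reports the bare CFX MIC along with the same flags and sequence number that Wireshark decodes.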

