Saturday, June 9, 2007

Microsoft's implementation of Kerberos

I'm going to go on a technical rant for a moment, because none of this information seems to be collected in one spot anywhere.

You can read about how Kerberos was developed by MIT, and how it's a fine idea and so forth, at Wikipedia. What I'm ranting about is how Microsoft implemented it in Active Directory, and why that implementation has a bunch of undocumented problems that large (or anally secure) organizations will run into eventually.

The first problem an organization will hit, the tip of the God damn iceberg, is that for some reason some users in the environment will all of a sudden no longer get their home drives. Strange things will happen with DFS mappings that work for everyone else. Group policies will cease to be enforced. If you have DCs in sites on slow connections (especially over VPN or other links that only pass small packets), the DCs will get out of sync.

You'll probably start to see issues when users are members of around 130 groups. Yes, this includes default groups and nested group relationships. And distribution lists. It's not really that hard to get there in any complex environment.
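For a rough feel for how group count turns into ticket size, here is a minimal sketch of the rule of thumb Microsoft publishes in its MaxTokenSize guidance. It is not something from my own testing: about 1200 bytes of overhead, 40 bytes for each domain local group, out-of-domain universal group, or SID-history entry, and 8 bytes for each global or in-domain universal group. The function and parameter names are made up for illustration, and real tickets will differ.

```python
def estimate_token_size(wide_sids, narrow_sids):
    """Rough Kerberos token size in bytes, using the 1200 + 40*d + 8*s estimate.

    wide_sids   (d): domain local groups, universal groups from other domains,
                     and SID-history entries (hypothetical parameter name)
    narrow_sids (s): global groups and universal groups in the user's own domain
    """
    if wide_sids < 0 or narrow_sids < 0:
        raise ValueError("group counts must be non-negative")
    return 1200 + 40 * wide_sids + 8 * narrow_sids


if __name__ == "__main__":
    # The same 130 groups cost very different amounts depending on group type.
    for d, s in [(0, 130), (130, 0), (100, 250)]:
        print(f"d={d:4d} s={s:4d} -> ~{estimate_token_size(d, s)} bytes")
```

Treat the result as a ballpark only. It mostly helps with spotting which users are worth looking at first.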

When this happens, fire whoever designed your Active Directory structure. Start over. You're headed into pain-in-the-ass land. If you can't (say you have SOX audits and other bullshit to deal with), then read on.

The reason you're having issues is that Microsoft by default sends the entire fucking Kerberos authentication in one UDP packet. Jesus Christ, talk about stupid. As any network engineer (or even some non-network engineers) can tell you, UDP and fragmentation don't mix: nothing retransmits a lost fragment, and plenty of VPN and firewall setups drop fragments outright. So when your Kerberos packet gets too big, it stops working. The machine you're authenticating with will only get the info that fits into the first packet. This could mean you don't get access to an application you should, etc. I found it consistent that users with large Kerberos tickets wouldn't get group policy applied at all. Home drives also stopped working. DFS permissions behaved oddly. Sometimes you'd only see 1/3 of the directories you should.

All this can be fixed by doing this: http://support.microsoft.com/kb/244474/en-us.

For the lazy, all machines (servers and workstations) will need this:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters

MaxPacketSize=dword:00000001
MaxTokenSize=dword:0000ffff

This should be put into the group policy that applies to every machine in your environment. All domain controllers, servers, workstations, etc. It's imperative that all these machines can pass Kerberos tickets using TCP instead of UDP.
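If you'd rather hand the change to whatever pushes registry settings in your shop, here is a minimal sketch that renders the two values above as .reg file text. The helper name render_reg is hypothetical. It only builds the text; writing the file, importing it, and rebooting are left to you.

```python
# Key path and values are the ones listed above; render_reg is a made-up helper.
KERBEROS_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters"


def render_reg(key_path, dwords):
    """Return .reg file text that sets the given DWORD values under key_path."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dwords.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a DWORD: {value}")
        # .reg files write DWORDs as eight hex digits
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_reg(KERBEROS_KEY, {"MaxPacketSize": 1, "MaxTokenSize": 0xFFFF}))
```

Eyeball the output before importing it anywhere that matters.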

Not so bad, right? Hit a brick wall with authentication, so you patch or push a reg key change to every fucking machine and reboot your whole environment. Jesus, try to do it during Microsoft's monthly patch cycle or something so your management team doesn't think you're a bunch of idiots. I can't convey to you the look of disgust I got from management when I explained this problem (Microsoft won't call it a bug). A serious lack of forward planning from the Microsoft development team here.

But I digress.

Next, you'll start to run into more issues with any application that runs in IIS. This will crop up when users get into the 350 group membership range. It's variable, since the text of the group name and SID get dumped into your Kerberos ticket.

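This is a request size limit in IIS: the Kerberos ticket rides along in the HTTP request headers, and once it gets too big IIS rejects the request. Read up on http://support.microsoft.com/kb/820129 to get the gist of what is going on.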

In a nutshell, you need to add a couple of DWORD entries to the HTTP parameters key.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\HTTP\Parameters
MaxFieldLength=dword:0000fffe
MaxRequestBytes=dword:00800000

These numbers are probably higher than you need, but you're already going against IIS's security model by mucking with this, so who gives a shit at this point, right? It's all about making things work.
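One bit of arithmetic worth doing before you pick numbers: as far as I understand it, the ticket goes over HTTP as a base64 string in an "Authorization: Negotiate ..." header, and base64 inflates things by about a third. The sketch below checks whether a token of a given size would fit under MaxFieldLength. The assumptions are mine: the limit is compared against the header value only, the token sizes are example numbers, and the function names are made up.

```python
NEGOTIATE_PREFIX = "Negotiate "  # scheme name plus a space, ahead of the base64 token


def negotiate_header_length(token_bytes):
    """Length of the Authorization header value for a token of token_bytes bytes."""
    if token_bytes < 0:
        raise ValueError("token size must be non-negative")
    # base64 emits 4 characters for every 3 input bytes, rounded up
    base64_chars = 4 * ((token_bytes + 2) // 3)
    return len(NEGOTIATE_PREFIX) + base64_chars


def fits_in_field(token_bytes, max_field_length):
    """True if the header value would fit under the given MaxFieldLength."""
    return negotiate_header_length(token_bytes) <= max_field_length


if __name__ == "__main__":
    max_field_length = 0xFFFE  # the MaxFieldLength value above
    for size in (12000, 48000, 0xFFFF):
        length = negotiate_header_length(size)
        print(size, length, fits_in_field(size, max_field_length))
```

Under those assumptions a token that fills the whole 0xffff MaxTokenSize comes out to roughly 87,000 characters, which is bigger than a MaxFieldLength of 0xfffe allows. So the two settings aren't independent. If you really have users with tickets that large, the group cleanup can't wait.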

So, once you deal with that, you'll probably also want to take a look at how SQL works with Kerberos authentication. You'll want to get familiar with the "Trusted for delegation" check box in the AD properties of servers, and learn what the hell SPNs are and how to set them (SETSPN is a crappy tool; using ADSIEdit seems to work better).

Take a visit to http://support.microsoft.com/kb/319723/en-us to check out Kerberos authentication for SQL.
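If SPNs are new to you, the string itself is nothing magic: a service class, a slash, the host name, and optionally a colon and port. Here is a minimal sketch of that shape. The host names are placeholders and build_spn is a made-up helper, not part of any Microsoft tool.

```python
def build_spn(service_class, host, port=None):
    """Return an SPN string of the form class/host or class/host:port."""
    if not service_class or not host:
        raise ValueError("service class and host are both required")
    spn = f"{service_class}/{host}"
    if port is not None:
        if not 0 < port < 65536:
            raise ValueError(f"port out of range: {port}")
        spn += f":{port}"
    return spn


if __name__ == "__main__":
    # Placeholder hosts; substitute the FQDNs of your own servers.
    print(build_spn("MSSQLSvc", "sql01.example.com", 1433))
    print(build_spn("HTTP", "intranet.example.com"))
```

Whichever tool you use to register it, the SPN has to land on the account the service actually runs as.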

Check this out for more info on IIS: http://support.microsoft.com/kb/324274/en-us.

Good luck in your overly complicated, pain in the ass to administer environment. If you're dealing with these issues, your organization is run by mutants. Best of luck.

Oh yeah, you probably want to take a visit to Microsoft's KB and check out the known issues with large Kerberos ticket sizes and SharePoint. We've also had to apply the HTTP fixes to our Symantec Enterprise Vault servers, since they do their previews for archived mail using HTML and IIS.

Fun stuff.
