For the last few weeks I've been chasing a performance problem with web publishing on Forefront TMG. Specifically, we have the following environment (which I've dramatically simplified for this blog post):
Now when publishing SharePoint with TMG there are a number of decisions to be made about the configuration - will you authenticate users with Forms Login or HTTP (NTLM, Basic) Authentication? Will you bother to pre-authenticate at all? And if you do, how will you get those credentials to the SharePoint servers without making the user log in twice (delegation) - we have lots of options.
For this environment it seemed to be a straightforward decision. The customer needed Kerberos authentication at the SharePoint servers - when you configure SharePoint for Kerberos, you choose the radio button marked "Negotiate (Kerberos)". So for delegation at the TMG that's exactly what we used too - Negotiate, with the Service Principal Name (SPN) configured in the publishing rule. We ran a quick test and it looked great - until it came time for load testing.
This is a relatively large environment, with up to 150,000 registered users and potentially 40% of those users active. So we figured the TMG farm handling this particular part of the environment might need to handle 5,000 concurrent users, and we set out to prove it could.
And it couldn't. Not by a long shot (indeed the first test maxed out at just 500 users). Gosh - 500 users caused an array of FOUR TMG servers (each with 4 CPUs!) to reach maximum throughput, and they were CPU limited!
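To put the gap in perspective, here's a rough back-of-the-envelope sketch using only the figures quoted above. The variable names are mine, and treating each "CPU" as one unit of capacity is a simplifying assumption rather than anything we measured:

```python
# Figures quoted in the post; variable names are illustrative only.
registered_users = 150_000
active_percent = 40            # "potentially 40% of those users active"
target_concurrent = 5_000      # what we wanted this TMG farm to handle
observed_concurrent = 500      # where the first load test maxed out
tmg_servers = 4
cpus_per_server = 4

# Integer arithmetic avoids floating-point rounding on the percentage.
active_users = registered_users * active_percent // 100

# How far short of the target the first test fell.
shortfall_factor = target_concurrent / observed_concurrent

# Assumption: each CPU counts as one equal unit of capacity.
total_cpus = tmg_servers * cpus_per_server
users_per_cpu = observed_concurrent / total_cpus

print(f"Potentially active users: {active_users}")
print(f"Shortfall against target: {shortfall_factor:.0f}x")
print(f"Concurrent users per CPU at saturation: {users_per_cpu}")
```

In other words, the first test fell a full order of magnitude short of the target, with 16 CPUs saturated by a few dozen users each.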
We spent weeks (and so did Microsoft) troubleshooting. We fixed some issues with the load scripts, and we did some optimisation of the TMG ruleset, though it was still 30 rules for the environment. But the biggest change was one we didn't expect - authentication delegation.
Here's a graph showing the throughput achieved with each method (the SharePoint environment maxes out at around 2200 hits per second, for this test).
Check out the throughput of the Negotiate delegation - it's terrible! Kerberos and NTLM are pretty line ball, given it's an uncontrolled environment. We're still trying to understand why Negotiate is so bad - could it be environmental? Is there a limit we're hitting somehow? Or is this "just the way it is"?
The moral of the story? Fall back to NTLM if you want it to be simple. Upgrade all the way to Kerberos Constrained Delegation if you need to support multi-hop authentication and delegation. And don't use Negotiate.