(From Microsoft’s Azure Documentation) – Windows Server AppFabric is a set of integrated technologies that make it easier to build, scale and manage Web and composite applications that run on IIS.
Prior to cloud computing, there were only three options for storing your session data: InProc (in the runtime process's memory) and two out-of-process options (StateServer or SqlServer). The out-of-process options were usually used for high-volume websites running on server farms, where sessions need to persist across servers and so must be stored in an external medium such as a database.
With the advent of cloud computing, a fourth option was made available for storing session data: the AppFabric Cache. The AppFabric uses a distributed caching architecture, which makes it highly scalable and a worthwhile option to try for storing session data.
So – what could go wrong?
Well – one of the more frequent exceptions that our event log showed was:
ErrorCode<ERRCA0014>:Cache::PutAndUnlock: Object being referred to is not locked by any client.
What was going on? We kept seeing this error over and over again. It turns out that the AppFabric Session State Provider takes a lock on the session data at the start of each request and releases it only when the request completes. If the request runs too long, the lock is never released. The request then times out (based on your normal HttpRuntime setting), and the AppFabric throws an exception and retries the request. In other words, a long-running request manifests itself as an AppFabric cache exception. This, in itself, is not terrible, except that all the retries eventually end up exhausting IIS and bringing the website to a crawl. This is what we were experiencing.
Here is Microsoft’s official documentation on this session locking mechanism:
A lock is set on session-store data at the beginning of the request in the call to the GetItemExclusive method. When the request completes, the lock is released during the call to the SetAndReleaseItemExclusive method.
Workaround 1: Increase the request execution timeout. Now that we had an idea of why our sessions weren't scaling (and why we were seeing all the exceptions in our event log), we asked Microsoft for some workarounds. Their only somewhat helpful suggestion was to increase the HTTP request execution timeout (from 120 seconds to 300 seconds). This, in our opinion, was only a partial fix: it gives slow requests more time to finish before they time out, but any request that runs longer than 300 seconds would still trigger the exception.
Workaround 1: HttpRuntime – execution timeout increase
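For reference, a change like this is normally made on the httpRuntime element in web.config. The fragment below is a minimal sketch rather than our exact configuration: the 300-second value comes from the suggestion above, and the rest of the file is assumed to be whatever your application already has.

```xml
<!-- Sketch only: merge the httpRuntime element into your existing web.config. -->
<configuration>
  <system.web>
    <!-- executionTimeout is expressed in seconds. ASP.NET only enforces it
         when the application is compiled with debug="false". -->
    <httpRuntime executionTimeout="300" />
  </system.web>
</configuration>
```

If an httpRuntime element already exists in your file, add the attribute to it instead of creating a second element.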
Workaround 2: Store the session in SqlServer. Since the root of the exceptions was the session locking mechanism of the AppFabric cache, we decided to try not using it at all. Instead of storing sessions in the AppFabric cache, we would revert to the old, tried and tested SqlServer session storage (with which we had previously scaled up to 3000 simultaneous user sessions). In addition, we would isolate ourselves from any network latency issues, since SqlServer would reside inside the intranet.
The actual steps in making this happen are surprisingly straightforward.
Step 1 – Create the supporting tables (to store session data) inside SqlServer.
aspnet_regsql.exe -S SampleSqlServer -E -ssadd -sstype p
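Notes – The '-E' lets you use integrated Windows security (without it, you would need to specify a SQL Server username with -U and a password with -P). The '-ssadd' option adds session state support, and '-sstype p' selects persisted storage, so the session data lives in a regular database and survives a SQL Server restart.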
Step 2: Modify the web.config file to specify SqlServer session storage.
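A sessionState entry along the following lines is what this step usually looks like. Treat it as a sketch under assumptions: the server name is the SampleSqlServer placeholder from Step 1, integrated security mirrors the '-E' switch, and the cookieless and timeout values are simply the ASP.NET defaults, not settings taken from our application.

```xml
<!-- Sketch only: merge the sessionState element into your existing web.config. -->
<configuration>
  <system.web>
    <!-- mode="SQLServer" tells ASP.NET to keep session data in SQL Server.
         The connection string names only the server; with the persisted
         type from Step 1, ASP.NET uses the default session state database.
         timeout is the session lifetime in minutes. -->
    <sessionState mode="SQLServer"
                  sqlConnectionString="Data Source=SampleSqlServer;Integrated Security=SSPI;"
                  cookieless="false"
                  timeout="20" />
  </system.web>
</configuration>
```

Any AppFabric session state provider entry that is still present in the sessionState section would need to be removed or replaced so that this mode takes effect.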
I am happy to report that, while more load tests are still being run, the application seems to be holding up well under a 10,000 user load. This is a 5-fold increase from the original 2,000 users that were bringing the website to a crawl.
The Microsoft Cloud offering (aka Azure aka AppFabric) is something to reckon with for 2 reasons:
a) Ease of configurability (I doubt if any cloud offering can make it any easier than tweaking app.config files)
b) Scalability and Performance – While there are a few things Microsoft needs to address (such as better handling of the AppFabric cache's 'session locking'), overall, if your application doesn't have any slow, unresponsive pieces, then the AppFabric itself will enable you to scale in hitherto impossible ways.