Brief History of Web Uploads
The upload support in the Microsoft web platform has come quite a long way since the early days of IIS. Unfortunately, a number of remaining limitations have made it difficult to offer a quality upload experience, especially for high-traffic sites.
As more and more websites rely on user-generated content, and on storing data in the cloud in general, the need to support reliable and scalable web uploads has become more important than ever.
Today, after almost a year of beta testing with some of the world’s largest web sites, LeanServer is announcing a breakthrough upload solution that finally enables fast, reliable, and scalable uploads to any ASP.NET, ASP, or PHP application on the Microsoft web platform.
To help explain our motivation for building it, I wanted to revisit the history of upload support in the platform - and explain the challenges we sought to resolve.
Brief History of Uploads
In the days of Active Server Pages (ASP), web applications did not have any built-in way to extract file uploads from the POST data sent by browsers. To implement this, one had to write a custom multipart/form-data parser or use one of the several third party COM plug-ins that did the parsing.
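To give a sense of what such a custom parser involved, here is a minimal sketch in Python (not ASP or COM code) that splits a multipart/form-data body into its parts. The function name parse_multipart, the sample boundary, and the sample body are all hypothetical, and the sketch assumes the whole body is already in memory and that every part carries a Content-Disposition header with quoted parameters.

```python
import re


def parse_multipart(body: bytes, boundary: str) -> list:
    """Split a multipart/form-data body into a list of part dictionaries."""
    # Each part is introduced by CRLF + "--" + boundary; prepend a CRLF so the
    # very first delimiter is matched the same way as the others.
    delimiter = b"\r\n--" + boundary.encode("ascii")
    segments = (b"\r\n" + body).split(delimiter)

    parts = []
    for segment in segments[1:]:          # segments[0] is the (ignored) preamble
        if segment.startswith(b"--"):     # "--boundary--" closes the body
            break
        # Drop the CRLF that ends the delimiter line, then split headers from content.
        header_blob, _, content = segment[2:].partition(b"\r\n\r\n")

        headers = {}
        for line in header_blob.split(b"\r\n"):
            key, _, value = line.decode("utf-8").partition(":")
            headers[key.strip().lower()] = value.strip()

        # Pull name="..." and filename="..." out of Content-Disposition.
        disposition = headers.get("content-disposition", "")
        params = dict(re.findall(r'(\w+)="([^"]*)"', disposition))
        parts.append({
            "name": params.get("name"),
            "filename": params.get("filename"),  # None for ordinary form fields
            "content": content,
        })
    return parts


# Hypothetical request body with one form field and one uploaded file.
SAMPLE_BODY = (
    b'--XyZ\r\n'
    b'Content-Disposition: form-data; name="title"\r\n\r\n'
    b'hello\r\n'
    b'--XyZ\r\n'
    b'Content-Disposition: form-data; name="file"; filename="a.txt"\r\n'
    b'Content-Type: text/plain\r\n\r\n'
    b'line1\r\nline2\r\n'
    b'--XyZ--\r\n'
)

fields = parse_multipart(SAMPLE_BODY, "XyZ")
```

A production parser would also have to deal with streaming input, unquoted or encoded parameters, and malformed bodies, which is part of why third party components were popular.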
ASP.NET brought built-in support for accessing and saving uploaded files through the FileUpload control. ASP.NET itself synchronously preloaded the entire POST data in memory, and then parsed it to extract the files and other POST form values needed to populate the Request.Form collection/postback processing.
Unfortunately, customers complained that upload processing quickly led to OutOfMemory exceptions due to the need to store the entire file contents in memory. This severely limited both the upload size and the scalability of uploads, as the CLR heap quickly fragmented and ran out of memory – sometimes after just a few 200MB uploads.
ASP.NET 2.0 solved this problem by adding the Upload Disk Buffering feature (incidentally, I was the PM for this feature when I started at Microsoft in 2003). This feature allowed the POST data to be saved to disk after it exceeded a configured threshold (80KB by default), alleviating the need to store the entire upload in memory. After the entire POST body was preloaded to disk, ASP.NET’s form parser extracted the uploaded files and allowed them to be saved.
This helped a lot. Now it was possible both to upload bigger files without running out of memory (up to the internal ASP.NET limit of 2GB) and to handle more concurrent uploads. (Interestingly, not everyone was aware of this feature – and much of the third party content on the web still leads people to believe that modern ASP.NET applications cannot handle larger uploads without memory problems.)
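The threshold idea can be illustrated with a small Python sketch. This is an analogy only, not ASP.NET's implementation: the class BufferedBody, the function preload, and the constant DISK_THRESHOLD are hypothetical names, and the 80KB value simply mirrors the default mentioned above.

```python
import io
import tempfile

DISK_THRESHOLD = 80 * 1024  # assumption: mirrors the 80KB default described above


class BufferedBody:
    """Keep a request body in memory until it exceeds a threshold, then spill to disk."""

    def __init__(self, threshold=DISK_THRESHOLD):
        self.threshold = threshold
        self.size = 0
        self.on_disk = False
        self._store = io.BytesIO()

    def write(self, chunk):
        if not self.on_disk and self.size + len(chunk) > self.threshold:
            # Threshold exceeded: copy what is buffered so far into a temp file
            # and keep writing there instead of growing the in-memory buffer.
            spill = tempfile.TemporaryFile()
            spill.write(self._store.getvalue())
            self._store = spill
            self.on_disk = True
        self._store.write(chunk)
        self.size += len(chunk)

    def read_all(self):
        """Return the whole body (used here only to check the sketch)."""
        self._store.seek(0)
        return self._store.read()


def preload(chunks, threshold=DISK_THRESHOLD):
    """Read every chunk of the body before any parsing happens."""
    body = BufferedBody(threshold)
    for chunk in chunks:
        body.write(chunk)
    return body


small = preload([b"a" * 1024])        # 1KB stays in memory
large = preload([b"b" * 4096] * 30)   # 120KB spills to a temp file
```

Note that preload still reads the entire body before returning, which matches the behavior described above: the memory pressure goes away, but nothing is exposed to the application until the whole upload has arrived.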
Unfortunately, these improvements left a number of critical limitations:
Every upload consumed an ASP.NET worker thread. As the number of active uploads grew (uploads can take a long time), this led to thread starvation and eventually a complete deadlock for both upload processing and the rest of the ASP.NET application. Unfortunately, any third party ASP.NET module had the same problem.
Disk buffering carried a severe performance penalty. The disk subsystem is the slowest part of a server, and concurrent uploads would often bring a server to a crawl by overloading the disk. Of course, without disk buffering, you’d be back to running out of memory.
There was no way to “stream” the upload, or report upload progress to the application, because ASP.NET fully preloaded the POST body to disk before processing it. This preload happened before the data was exposed to the application through any API, including HttpRequest.Filter and HttpRequest.InputStream.
Upload size was internally limited. Uploads could not exceed 2GB in ASP.NET, or 4GB in IIS (for Content-Length uploads).
Finally, it’s important to note that all these problems did not just apply to file upload scenarios – but also to any web service or custom application handler that accepted POST data.
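For contrast with the preload model, the kind of streaming with progress reporting that the limitations above ruled out can be sketched in a few lines of Python. This is a hypothetical illustration, not an ASP.NET or IIS API: stream_upload, sink, and on_progress are invented names, and the chunks are assumed to arrive from the network one at a time.

```python
import io


def stream_upload(chunks, total_bytes, sink, on_progress):
    """Hand each chunk to the application as it arrives and report progress."""
    received = 0
    for chunk in chunks:
        sink.write(chunk)                   # the application sees data immediately
        received += len(chunk)
        on_progress(received, total_bytes)  # e.g. update a progress bar
    return received


# Hypothetical usage: collect progress callbacks while writing to an in-memory sink.
progress_log = []
destination = io.BytesIO()
total = stream_upload(
    [b"ab", b"cde"],
    total_bytes=5,
    sink=destination,
    on_progress=lambda done, expected: progress_log.append((done, expected)),
)
```

The point of the sketch is the ordering: the application callback runs per chunk, rather than once after the whole body has been buffered.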
The ASP.NET community produced a number of third party upload plugins, which “hacked” the ASP.NET worker request to enable a limited version of streaming, and added support for progress reporting.
Unfortunately, since these plugins had to be based on the same synchronous WorkerRequest.ReadEntityBody API that was responsible for thread exhaustion in the first place, they were not able to address the core problem of poor upload scalability, or the core IIS size limit.
Some third party ISAPI filters did have the capability to bypass the 4GB limit, but not the thread exhaustion and scalability limits.
Then came Windows Server 2008 / ASP.NET 3.0 SP1, and the ASP.NET Integrated Pipeline. Although we had wanted to address some of the fundamental upload problems, those fixes did not happen due to time constraints in what was already a super ambitious release. To add insult to injury, the architecture of the Integrated Pipeline also broke the streaming workaround that enabled third party features like progress reporting in previous versions of ASP.NET (although most still work in Classic mode).
Making it work
While IIS 7.0 did not solve the upload problem directly, it did for the first time provide enough flexibility to make it possible, although difficult, to address the aforementioned limitations. This flexibility, combined with better support for application frameworks such as PHP, also created an opportunity to solve this problem correctly and consistently for all applications.
We at LeanServer took advantage of IIS 7.0’s powerful architecture to create ScaleUP – a complete web upload solution that enables truly fast, reliable, and super-scalable uploads for any application on the Microsoft web platform. With ScaleUP, we aim to finally solve all of the problems that have plagued upload support until now, and turn any ASP.NET, ASP, or PHP application on the IIS 7.0 platform into an upload powerhouse.
You can read about ScaleUP in the “Enabling Fast, Reliable, and Super-Scalable Uploads to ASP.NET, ASP, and PHP Applications” blog post and at www.leanserver.com/scaleup.