There are several things you need to take into consideration when planning for how much storage you will need to allocate to SharePoint content.
There are a few basic ways to manage storage growth in SharePoint. First things first, set site quotas and alerts; we always recommend a 10 GB limit and an 8 GB alert. This lets you stay on top of which sites are growing more quickly so you can plan future structure accordingly. Next, monitor growth trends. Pay attention to how quickly your sites are growing, and don’t forget to monitor the overall content DB size. Finally, depending on growth, there’s a very good chance you’ll need to split content DBs if they get too big. Now, what is “too big”? We’ll get to that in a minute, as there are several recommendations based on your concerns… First, though, we’ll look at a couple of ways to control the growth of content DBs…
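As a rough illustration of the quota-and-alert idea, here is a minimal Python sketch that classifies sites against the 10 GB limit and 8 GB alert mentioned above. The site URLs and sizes are made up, and `classify_site` is a hypothetical helper, not a SharePoint API; in a real farm the sizes would come from SharePoint itself.

```python
GB = 1024 ** 3

QUOTA_LIMIT = 10 * GB   # recommended hard limit from the notes above
QUOTA_ALERT = 8 * GB    # recommended warning level from the notes above


def classify_site(size_bytes, limit=QUOTA_LIMIT, alert=QUOTA_ALERT):
    """Return 'over limit', 'alert', or 'ok' for a site's current size."""
    if size_bytes >= limit:
        return "over limit"
    if size_bytes >= alert:
        return "alert"
    return "ok"


# Hypothetical site sizes, for illustration only.
sites = {
    "/sites/hr": 3 * GB,
    "/sites/engineering": 9 * GB,
    "/sites/marketing": 11 * GB,
}

report = {url: classify_site(size) for url, size in sites.items()}
for url, status in sorted(report.items()):
    print(f"{url}: {status}")
```

Sites that land in the "alert" bucket are the ones to watch when planning future structure.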
Here’s a look at a very basic SharePoint architecture. We have the front end with the object model, and the SharePoint content (BLOBs and metadata) is stored in SQL.
This diagram maps out data growth over a period of time in a collaboration environment:
- Electronic data will continue to grow year over year.
- Inactive data growth outpaces the growth of active, operational data.
- However, most of the data is actually inactive or stale data, which is the area in between the two lines.
In SharePoint:
- Consider your work on a design document. Through drafts and edits, multiple versions are created. For example, with up to 32 versions of a document at 2.5 MB each, the full version history takes up 80 MB. When the document is approved, the initial 30 versions are no longer needed and can be archived away.
- Consider project sites. Project sites bring groups of people together to work on related documents, task lists, discussions, etc. Within each org, there could be hundreds of projects that get completed each year, but the entire project sites still reside on SharePoint.
As inactive data continues to grow, the resources required for current active data are saturated by inactive data:
- Users experience diminishing service levels (i.e., performance degradation).
- Additional hardware, servers, and processing power may be needed.
- As databases continue to grow, this also impacts the SLAs currently in place for backup and recovery windows.
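The version-history arithmetic in the design document example can be checked with a short sketch. It assumes, as the example does, that every version is kept as a full copy of the same size; `version_history_mb` is a hypothetical helper name.

```python
def version_history_mb(versions, size_mb_per_version):
    """Storage used when each version is kept as a full copy."""
    return versions * size_mb_per_version


# Figures from the example above: 32 versions at 2.5 MB each.
total_mb = version_history_mb(32, 2.5)

# Once the document is approved, the initial 30 versions can be archived.
archivable_mb = version_history_mb(30, 2.5)

print(f"Full version history: {total_mb} MB")      # 80.0 MB
print(f"Archivable after approval: {archivable_mb} MB")  # 75.0 MB
```

In other words, under these assumptions roughly 75 of the 80 MB held for that one document is inactive data once it is approved.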
To optimize storage, we can look at two major concepts. We discussed earlier how BLOBs don’t contribute to SQL queries, so there’s no need to keep them in the database. So the first option is to move the BLOBs out of the database, and the way to do this is to leverage the BLOB services APIs. The second option is to archive content. There are currently no native tools for this, so you’d have to look at a third party.
So if we leverage EBS (External BLOB Storage)… this is how the SharePoint architecture would change. The provider sits with the SharePoint object model and gives SharePoint tokens or stubs, so that it knows how to retrieve the content while the context of the content is maintained. The metadata is stored in SQL, and the BLOBs go to a storage location of your choosing. This is completely transparent to the end user.
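To make the token/stub idea concrete, here is a toy Python sketch. It is not the EBS interface: `ToyBlobProvider`, `metadata_db`, `upload`, and `download` are all hypothetical names, and plain dictionaries stand in for SQL and for the external storage location.

```python
import uuid


class ToyBlobProvider:
    """Hypothetical stand-in for a BLOB provider; not the real EBS interface."""

    def __init__(self):
        # token -> raw bytes; stands in for the storage location of your choosing
        self.external_store = {}

    def store(self, data: bytes) -> str:
        token = uuid.uuid4().hex
        self.external_store[token] = data
        return token

    def retrieve(self, token: str) -> bytes:
        return self.external_store[token]


provider = ToyBlobProvider()
metadata_db = {}  # stands in for SQL: holds metadata plus the token, no BLOB


def upload(name: str, data: bytes) -> None:
    """Send the BLOB to the provider and keep only metadata and token."""
    metadata_db[name] = {"size": len(data), "token": provider.store(data)}


def download(name: str) -> bytes:
    """Look up the token in the metadata and fetch the BLOB via the provider."""
    return provider.retrieve(metadata_db[name]["token"])


upload("design.docx", b"draft contents")
print(download("design.docx"))
```

The caller only ever uses `upload` and `download` by name, which is the sense in which the offloading is transparent to the end user.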
However, there are some things to note about EBS. As it is implemented by SharePoint, only one provider is allowed per SharePoint farm. There’s a chance that you could run into orphaned BLOBs, and there are also compliance concerns.
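One way to picture the orphaned BLOB problem is as a set difference: BLOBs that exist in the external store but that no metadata row points to any more. The sketch below uses made-up identifiers and a hypothetical `find_orphans` helper; it illustrates the idea only and is not how SharePoint or any particular provider detects orphans.

```python
def find_orphans(stored_blob_ids, referenced_blob_ids):
    """Return BLOB ids present in external storage but not referenced in SQL."""
    return set(stored_blob_ids) - set(referenced_blob_ids)


# Hypothetical ids: what sits in the external store vs. what metadata references.
stored = ["blob-001", "blob-002", "blob-003", "blob-004"]
referenced = ["blob-001", "blob-003"]

orphans = find_orphans(stored, referenced)
print(sorted(orphans))
```

Orphans like these keep consuming storage after the item that referenced them has been deleted, which is why they matter for both storage planning and compliance.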
So now let’s take a look at RBS (Remote BLOB Storage)… Because RBS is SQL-specific, it can be used across applications that leverage SQL, not just SharePoint, which gives you more of an enterprise-wide storage architecture than EBS does. Here’s how enabling an RBS provider would affect your SharePoint storage architecture. You can have an RBS provider per database. On the other hand, there’s no context and no ability to manage the object.
Natively with SharePoint 2010, MSFT offers an RBS provider, FILESTREAM. However, MSFT does not recommend using it with very large databases in production. To leverage this feature, you’d have to do 1, 2, 3, and then 4, so you would need admin privileges on SQL and Windows Server. STORAGE LOCATION IS FILE SYSTEM ONLY!
As with EBS, there are some things to note with RBS… One of the main benefits is the ability to manage RBS via PowerShell, which MSFT is highly encouraging the use of over STSADM, as I believe they’re eventually doing away with STSADM.
So again, looking at the two BLOB services available, which is better? With EBS we have tighter application integration, allowing for more rules and settings to determine which BLOBs are offloaded, and then you have RBS…
Which is simpler and allows for a more unified storage architecture across applications, since it’s not SharePoint-specific…
So once we leverage the BLOB services APIs to offload BLOBs out of SQL, these are the impacts we’ll see relative to our previous concerns regarding backup and recovery, performance, and storage.