13 02 2012
What The Hell Is A SharePoint Site Collection?
Here’s another introductory post, this time about two terms that I always see confuse new SharePoint users: Site Collection and Web Site. What the hell is the difference between the two?
Here’s my TL;DR answer (followed by a lengthy discussion):
SITE COLLECTIONS DO NOT STORE CONTENT
A site collection is just a container for a collection of web sites.
So why do we differentiate between a site collection and a website, and what’s so special about site collections?
A site collection groups together websites and (perhaps most importantly) the security data for these websites. Every piece of ‘content’ in a website, and even the website itself, stores information about who can and cannot access it (any developers reading this: SPWeb, SPList and SPListItem inherit from a class called SPSecurableObject, which provides the security functionality). Incidentally, each of these items also contains a reference to its ‘parent’.
A List Item must exist in a List, therefore it has a parent, and a List must exist in a site, therefore it has a parent. A Web Site also has a parent, as you can have nested websites. And here’s the interesting part.
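For anyone who thinks better in code, here is a minimal sketch of that parent chain. It is plain Python, not the SharePoint object model: the `Securable` class and its `allowed_users` field are hypothetical stand-ins for the role SPSecurableObject plays, and the site, list and item names are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Securable:
    """Hypothetical stand-in: an object with its own access data and a parent."""
    name: str
    parent: Optional["Securable"] = None
    allowed_users: set = field(default_factory=set)

    def ancestors(self):
        """Walk the parent references upwards and return the names found."""
        chain = []
        node = self.parent
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain


# A website contains a list, which contains a list item.
web = Securable("Team Site", allowed_users={"alice", "bob"})
tasks = Securable("Tasks", parent=web, allowed_users={"alice", "bob"})
item = Securable("Fix login bug", parent=tasks, allowed_users={"alice"})

print(item.ancestors())
```

Each object carries its own access information as well as a pointer to whatever contains it, which is the shape the paragraph above describes.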
To the Site Collection – all the websites within it are in a flat hierarchy (Programmers: SPSite.AllWebs). The ‘Parents’ of each website are irrelevant to the Site Collection itself – what it does care about is a single website that it denotes as its ‘Root Website’. There can only be one RootWeb within a Site Collection. This is also known as a Top-Level site in SharePoint (typically, developers will refer to it as the Root Web).
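To picture the flat view, here is a rough Python sketch. The `SiteCollection` class, its `all_webs` list and `root_web` property are hypothetical names loosely echoing SPSite.AllWebs and the Root Web; this is a toy model, not how SharePoint stores anything.

```python
class SiteCollection:
    """Toy model: one root web plus a flat list of every website URL."""

    def __init__(self, url):
        self.url = url.rstrip("/")
        # The root web lives at the site collection URL itself.
        self.all_webs = [self.url]

    @property
    def root_web(self):
        # There is exactly one root web per site collection.
        return self.url

    def add_web(self, relative_path):
        """Register a website; nesting is implied only by the URL."""
        web_url = f"{self.url}/{relative_path.strip('/')}"
        if web_url not in self.all_webs:
            self.all_webs.append(web_url)
        return web_url


sc = SiteCollection("http://server/sites/")
sc.add_web("subsite1")
sc.add_web("subsite1/subsite2")

for web_url in sc.all_webs:
    print(web_url)
```

Notice that the collection never records who is whose parent. It only holds a flat list and knows which entry is the root.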
Now, it’s possible to ‘re-parent’ a web site in SharePoint by simply changing its URL. Let’s say we have a site collection at the URL ‘http://server/sites/’. If a website ‘subsite2’ exists at URL ‘http://server/sites/subsite1/subsite2’ then its parent is ‘subsite1’. If we simply changed the URL to ‘http://server/sites/subsite2’ then its parent is the Root Website of the site collection, which is at ‘http://server/sites/’.
So if every website has a parent, what about the Root Website of a site collection? It’s a website like any other and therefore exposes a Parent, but what does it come back with? Well, URLs are everything when it comes to the hierarchy of a web site. When you ask for the parent, it simply looks for the ‘next level up’ in the URL. If it reaches the actual URL of the site collection, it will return itself.
So a Site Collection contains a flat collection of websites, whose hierarchy is defined by the websites themselves, and these websites store their own security data. What else?
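The ‘next level up’ rule can be sketched in a few lines of Python. This models the behaviour as described in this post, using the example URLs from above; `parent_url` is a hypothetical helper, not a SharePoint API, and it makes no claim about what the real object model returns.

```python
def parent_url(web_url, site_collection_url):
    """Return the parent URL of a website, following the rule described above."""
    root = site_collection_url.rstrip("/")
    url = web_url.rstrip("/")
    if url == root:
        # Per the description above, the root website comes back with itself.
        return root
    if not url.startswith(root + "/"):
        raise ValueError(f"{web_url} is not inside {site_collection_url}")
    # Otherwise drop the last URL segment to get the next level up.
    return url.rsplit("/", 1)[0]


site_collection = "http://server/sites/"

# Nested under subsite1, so subsite1 is the parent.
before = parent_url("http://server/sites/subsite1/subsite2", site_collection)

# 'Re-parented' by changing the URL, so the root website is now the parent.
after = parent_url("http://server/sites/subsite2", site_collection)

print(before)
print(after)
```

Nothing about the website itself changes in the move. Only the URL differs, and the parent falls out of it.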
Well, a Site Collection stores references to a bunch of information critical to any web site. This information includes, but is not limited to:
- Web Parts
- Users who have access to the website
- Sandboxed Solutions
- List Templates
One last point: a site collection, all its relevant data, and all the websites it contains are stored within a single content database. You can pile as many site collections as you want (up to 250,000!) into a content database, but you cannot split the websites of one site collection across content databases.
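As a rough illustration of that last constraint, here is a small Python sketch. The `ContentDatabase` class, the `database_for` helper, the database names and the site collection URLs are all made up for the example; real SharePoint resolves this internally.

```python
from dataclasses import dataclass, field


@dataclass
class ContentDatabase:
    """Toy model: a database holding whole site collections."""
    name: str
    site_collection_urls: list = field(default_factory=list)


def database_for(web_url, databases):
    """Find the database of the site collection that contains web_url."""
    url = web_url.rstrip("/")
    for db in databases:
        for sc_url in db.site_collection_urls:
            root = sc_url.rstrip("/")
            # A website lives wherever its site collection lives.
            if url == root or url.startswith(root + "/"):
                return db.name
    return None


db_a = ContentDatabase("WSS_Content_A",
                       ["http://server/sites/hr", "http://server/sites/it"])
db_b = ContentDatabase("WSS_Content_B", ["http://server/sites/sales"])

print(database_for("http://server/sites/hr/benefits/forms", [db_a, db_b]))
print(database_for("http://server/sites/sales/q1", [db_a, db_b]))
```

Because the lookup goes through the site collection, every website under it ends up in the same database, however deeply it is nested.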