Omicron Llama

Coding all day, every day.

What The Hell Is A SharePoint Site Collection?

Here’s another introductory post, this time about two terms that I always see confuse people who are new to SharePoint: Site Collection and Web site. What the hell is the difference between the two?

Here’s my TL;DR answer (followed by a lengthy discussion):

SITE COLLECTIONS DO NOT STORE CONTENT

A site collection is just a container for a collection of web sites.

So why do we have a differentiation between that and a website and what’s so special about them?

A site collection groups together websites and (perhaps most importantly) the security data for these websites. Every piece of ‘content’ in a website, and even the website itself, stores information about who can and cannot access it (any developers reading this: SPWeb, SPList and SPListItem inherit from a class called SPSecurableObject, which provides the security functionality). Incidentally, each of these items also contains a reference to its ‘parent’.

A List Item must exist in a List, therefore it has a parent, and a List must exist in a site, therefore it has a parent too. A Web Site also has a parent, as you can have nested websites. And here’s the interesting part.

To the Site Collection – all the websites within it are in a flat hierarchy (Programmers: SPSite.AllWebs). The ‘Parents’ of each website are irrelevant to the Site Collection itself – what it does care about is a single website that it denotes as its ‘Root Website’. There can only be one RootWeb within a Site Collection. This is also known as a Top-Level site in SharePoint (typically, developers will refer to it as the Root Web).

Now, it’s possible to ‘re-parent’ a web site in SharePoint by simply changing its URL. Let’s say we have a site collection at URL: ‘http://server/sites/’. If a website ‘subsite2’ exists at URL ‘http://server/sites/subsite1/subsite2’ then its parent is ‘subsite1’. If we simply changed the URL to ‘http://server/sites/subsite2’ then its parent is the Root Website of the site collection, which is at ‘http://server/sites/’.

So if every website has a parent, what about the Root Website of a site collection? It’s a website like any other and therefore exposes a Parent, but what does it come back with? Well, URLs are everything when it comes to the hierarchy of a web site. When you ask for the parent, it simply looks for the ‘next level up’ in the URL. If it reaches the actual URL of the site collection, it will return itself.
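For anyone who prefers to see this as code, here is a minimal Python sketch of the idea described above: a flat collection of websites where the parent is worked out purely from the URL. This is a toy model and not the SharePoint object model; the `Web` and `SiteCollection` classes and the `add_web`, `move_web` and `parent_of` methods are hypothetical names made up for illustration, and the behaviour for the Root Web simply follows the description in this post.

```python
from typing import Dict


class Web:
    """Toy stand-in for a website (not the real SPWeb class)."""

    def __init__(self, url: str) -> None:
        self.url = url.rstrip("/")


class SiteCollection:
    """Toy stand-in for a site collection: a flat map of URL -> Web."""

    def __init__(self, root_url: str) -> None:
        self.root_web = Web(root_url)
        # Flat storage: nothing here records which web is nested in which.
        self.all_webs: Dict[str, Web] = {self.root_web.url: self.root_web}

    def add_web(self, url: str) -> Web:
        web = Web(url)
        self.all_webs[web.url] = web
        return web

    def move_web(self, web: Web, new_url: str) -> None:
        """'Re-parent' a web simply by changing its URL."""
        del self.all_webs[web.url]
        web.url = new_url.rstrip("/")
        self.all_webs[web.url] = web

    def parent_of(self, web: Web) -> Web:
        """Look for the 'next level up' in the URL that is a known web."""
        root_url = self.root_web.url
        url = web.url
        while url != root_url and url.startswith(root_url + "/"):
            url = url.rsplit("/", 1)[0]  # drop the last URL segment
            if url in self.all_webs:
                return self.all_webs[url]
        # Assumption taken from the post: the Root Web returns itself.
        return self.root_web
```

With a collection rooted at ‘http://server/sites/’, asking for the parent of ‘http://server/sites/subsite1/subsite2’ gives ‘subsite1’, and after moving it to ‘http://server/sites/subsite2’ the same call gives the Root Web. Nothing else had to change, because the hierarchy lives only in the URLs.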

So a Site Collection contains a flat list of websites, whose hierarchy is defined by the websites themselves, and these websites store their own security data. What else?

Well, a Site Collection stores references to a bunch of information critical to any web site. This information includes, but is not limited to:

  • Web Parts
  • Users which have access to the website
  • Themes
  • Sandboxed Solutions
  • MasterPages
  • List Templates
Now each of these sets of information is made accessible via special lists that are only accessible from a site collection’s root website. These Lists do not exist in the same way that they do in any other website in a site collection; instead, the ‘list’ is dynamically generated under what’s known as a virtual folder (_catalogs) on the server. SharePoint provides a familiar List interface for interacting with and managing this critical metadata.

Exceptions to the list of information above include Users, along with Content Types and Site Columns. These objects are handled slightly differently, as they can have a web site as a parent. As these objects are tied to the site collection, they are restricted to that site collection and cannot typically be shared between site collections. You can add a user to multiple site collections, but if you add her to one site collection, you need to add her manually to another site collection if she must have access to it. A notable exception to this is the Content Type Hub in SharePoint 2010.
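The point about users being tied to a site collection can be sketched in a few lines of Python. Again, this is only a toy model under my own assumptions: the `UserScopedSiteCollection` class, its methods, the URLs and the login name are all hypothetical and have nothing to do with the real SharePoint API.

```python
class UserScopedSiteCollection:
    """Toy model: every site collection keeps its own, separate user list."""

    def __init__(self, url: str) -> None:
        self.url = url
        self._users = set()  # logins known to this site collection only

    def add_user(self, login: str) -> None:
        self._users.add(login.lower())

    def has_user(self, login: str) -> bool:
        return login.lower() in self._users


# Hypothetical site collections and login, purely for illustration.
hr = UserScopedSiteCollection("http://server/sites/hr")
finance = UserScopedSiteCollection("http://server/sites/finance")

hr.add_user("CONTOSO\\alice")
# Adding her to 'hr' does nothing for 'finance'; that needs its own call.
finance_knows_alice_before = finance.has_user("CONTOSO\\alice")
finance.add_user("CONTOSO\\alice")
finance_knows_alice_after = finance.has_user("CONTOSO\\alice")
```

The two collections share no state, which is the whole point: whatever is registered in one stays invisible to the other until it is added there as well.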

One last point: a site collection, all its relevant data, and all the websites that it contains are stored within a single content database. You can pile as many site collections as you want (up to 250,000!) into a content database, but you cannot split the websites of one site collection across content databases.
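Here is a rough Python sketch of that relationship, assuming a toy model in which a content database holds many site collections and a website is located only through the site collection that owns it. The names `ContentDatabase`, `StoredSiteCollection`, `attach` and `database_for_web` are hypothetical, and no size limit is modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StoredSiteCollection:
    """Toy site collection: a URL plus the URLs of the webs it contains."""
    url: str
    web_urls: List[str] = field(default_factory=list)


@dataclass
class ContentDatabase:
    """Toy content database: holds whole site collections, never lone webs."""
    name: str
    site_collections: Dict[str, StoredSiteCollection] = field(default_factory=dict)

    def attach(self, site_collection: StoredSiteCollection) -> None:
        self.site_collections[site_collection.url] = site_collection


def database_for_web(databases: List[ContentDatabase], web_url: str) -> ContentDatabase:
    """Find a web's database by finding the site collection that owns it."""
    for database in databases:
        for site_collection in database.site_collections.values():
            if web_url in site_collection.web_urls:
                return database
    raise LookupError(f"No site collection contains {web_url}")
```

Because a website can only be reached through its site collection, and a site collection is attached to a database as a whole, every website in the collection necessarily ends up in the same database.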

So there we have it. A Site Collection is a container for a collection of web sites, with important information that all of the websites use in some way, and a reference to one special website denoted as its Root Web.
