Omicron Llama

Coding all day, every day.

A Closer Look at a SharePoint SQL Query using the Developer Dashboard

Out of sheer curiosity and boredom, I’ve decided to have a poke at the huge query that is provided in the developer dashboard and shows up for nearly every page you load in SharePoint that has something to do with List Items.

The query in particular is the one that starts ‘DECLARE @DocParentRefForRF’, and when you open it up it appears horrendously huge and complex. The reason it’s larger than the others you’ll see is because the majority of SharePoint functions performed on the database are done using Stored Procedures (which then call custom functions) in SQL Server, which are pre-compiled for performance reasons.

This reason this query is not in a compiled, optimised stored procedure is (I can only assume) because SharePoint Lists have fields. And these fields can be added by users. These fields are not added as new fields in the database (in the database term of a field or column). Rather these fields are added as references in a table in the database that already have prepared columns set up for a number of data types for every list item. Infact, all list items in all lists are stored in a single database table in a content database. This means that this table is incredibly wide (by normal standards, but not so by SQL Server support standards. This RDBMS can support very wide tables).

Let’s first have a look at the full text, formatted by myself in a structure I hope is somewhat readable. This is pulled out when I opened a TeamSite homepage. All that’s on the page is a web part list view of the Shared Documents library. The query is getting the items from that library.

That’s a lotta code!

I’m not going to go through every single line, but to try to explain various groups, and infer what I think is going on with this query, and why it needs to be so complex.

The first line, which is what you see as the clickable link in the developer dashboard creates a SQL variable called DocParentRefForRF with a data type of uniqueidentifier, which is obviously a GUID.

There is then a query which sets this value to the GUID which identifies the particular list item we have open (in this case, default.aspx within the SitePages library). (As a bonus hint, compare this little query to when we look at this when we have a list full of unique item permissions! Also, see my other blog post on that topic).

The next query is pretty huge, and has a row limiter, shown by the TOP(@NUMROWS)part. This at first glance might make you think that this is the List View Threshold in action. It’s not. If you look at the bottom of the query, you see the parameters which have been passed into this query, in this case is 16. We’ll figure out what this means later.

Let’s have a try at figuring out what data is attempted to be brought back in this query.

A timestamp for the file (c0), the site-relative path to the current file in question (c11), the content type ID, some custom field (nvarchar4), the user that the file is checked out to, a couple more user added fields (c31c6, c32c3), the approval state of the item (tp_ModerationStatus), the publishing status of the item (Level), a reference to the permission levels for this file (ScopeId as c17), a Type (this is a flag which just says if the item is a file or a folder, c8), the filename (c1), a couple more user added fields (c31c3, c31c5), a flag indicating whether the file is checked out to the local drafts folder (c12), the person who edited the file last, the Created field (c23c7), SortBehaviour(?) (c14), another Created field (c31c7), an Item ID (c32c4), the Modified date, yet another custom added field (nvarchar3), and a couple of other Item IDs and and custom fields.

You’ll see that the table selectors for these fields within this Select statement come from 4 different tables – UserData, t1(Docs), t2 (AllUserData) and t3 (AllUserData). You’ll notice there are two references to AllUserData, and one to UserData. Basically, UserData is a View provided for backwards compatibility, and just points to AllUserData. I haven’t a clue if this view is cached for performance reasons (it probably isn’t), and I can’t confirm because my current CloudShare environmetn doesn’t have SQL Server Management Studio installed.

What’s important to note is that (All)UserData contains all the custom fields which relate to a list item. These are the fields that store anything which you can select in a view in the web interface (generally). (All)Docs stores all the metadata associated with a particular item. Stuff like Checked out status, version details, the actual URL to the item.

So now we know what kind of data is coming back from what tables. Let’s try next to decipher the joins. Note that the joins are all merged into a single inner join.

The first join binds the Docs view (AllDocs table) to the UserData view, simply matching up the  current item selected from both tables to produce a full picture of the item in question.

The next two outer loop joins act as secondary UserData (custom fields) bindings. They are almost identical except for the join clause which binds the person who last edited the file and who the file is currently checked out to (remember this is an outer join so it includes these rows in the result set if the result is null, which would be the case if the file is not checked out, making the CheckOutUserId field blank). The purpose of these joins appears to ensure that the most recent version of the item is retrieve.

The next part of the query is the Where section. This basically ensures that if you are the editor of the file, then you can see the draft version, otherwise you will only see the most recent published version. Here we can also see a couple of references to the DocParentRefForRF variable declared earlier. This makes sure you have permission to view the current publishing state of the item.

Finally we’ve got a couple of sorting operations to help things once the result set comes back to the COM object that made the call.

And that’s it. The Item Limit, as it turns out, as actually what is defined for the default view for the web part. Go into the Edit Web Part pane, edit the current view and you’ll see that the time limit is set to ’15’. I guess it throws another one in for luck, or to account for null return values or something.

Hope you found this geek fest into what a SharePoint query looks like. Might do another one in future. Maybe for a query against a list that has many fields added, just to see what it looks like.

For reference, have a look at these early schema details of the AllUserData and AllDocs tables:

AllUserData – http://msdn.microsoft.com/en-us/library/dd358229(v=prot.13).aspx
AllDocs – http://msdn.microsoft.com/en-us/library/dd304112(v=prot.13).aspx

Here’s also a glossary of terms used within the tables, for further reference:
http://msdn.microsoft.com/en-us/library/c6244ace-4199-450d-a215-d37fa591159d(v=PROT.13)

(Btw, I do like that prot.13 version number added to these files. Obviously early documentation and prototype documentation for SharePoint 2010 (at the time, version 13 before it was changed to 14 at Beta).

2 thoughts on “A Closer Look at a SharePoint SQL Query using the Developer Dashboard

  • Stu says:

    I can’t say I’ve ever had to use an OUTER LOOP JOIN, and yet SharePoint seems to use them all the time.

    In lists and sites of widely varying sizes, you’d think they’d perhaps let the specialised and dedicated SQL Server compiler figure out the best method for interpreting the query. 😐 Weird and wrong.

  • James Love says:

    Very true. Also the use of that View for alleged ‘backward compatibility’. Does a the use of a View really improve performance? Isn’t it just yet another step, as the View is really just a SELECT * FROM…

Leave a Reply

Your email address will not be published. Required fields are marked *