Channel: CSS SQL Server Engineers

What SPN do I use and how does it get there?


This month has turned into another Kerberos Month for me.  I had an email discussion regarding SPNs for SQL Server and what we can do to get them created and into a usable state.  I thought I would share my response to the questions as it will probably be helpful for someone.  Here is the comment that started the conversation.  It was a good question, and I see this kind of comment a lot regarding SPN placement.  Not necessarily the setup aspect of it, but SPNs in general.

“In prior versions of setup we used to be able to specify the port number for the default and Named Instance.  Now, (SQL 2008 & R2) it takes the defaults.  1433 and Dynamic for Named Instances.

If you want to use Kerberos with TCP, you need to know the port number to create the SPN.  For Default instances, if you’re using 1433 then you’re ok. But, Named Instances listen on a dynamic port by default, and since you can’t set the port number, any SPN you create will probably be wrong and Kerberos won’t work.  It would be great if we could ask the user if they want to change the port number during setup, like we did with SQL 2000.”

Let’s have a look at Books Online first.

Registering a Service Principal Name
http://msdn.microsoft.com/en-us/library/ms191153.aspx

This article goes through the different SPN formats that apply to SQL 2008 (they are the same for R2 as well).  It also touches on two items that are important to understand: 1. Automatic SPN Registration and 2. Client Connections.  Here is the excerpt from the above article regarding Automatic SPN Registration.

Automatic SPN Registration

When an instance of the SQL Server Database Engine starts, SQL Server tries to register the SPN for the SQL Server service. When the instance is stopped, SQL Server tries to unregister the SPN. For a TCP/IP connection the SPN is registered in the format MSSQLSvc/<FQDN>:<tcpport>. Both named instances and the default instance are registered as MSSQLSvc, relying on the <tcpport> value to differentiate the instances.

For other connections that support Kerberos the SPN is registered in the format MSSQLSvc/<FQDN>:<instancename> for a named instance. The format for registering the default instance is MSSQLSvc/<FQDN>.

Manual intervention might be required to register or unregister the SPN if the service account lacks the permissions that are required for these actions.

What does this mean?  It means that if the SQL Service account is using Local System or Network Service as the logon account, we will have the permission necessary to register the SPN against the Domain Machine Account.  By default, the machine accounts have permission to modify themselves.  If we change this over to a Domain User Account for the SQL Service account, things change a little.  By default a Domain User does not have the permission required to create the SPN.  So, when you start SQL Server with a Domain User Account, you will see an entry in your ERRORLOG similar to the following:

2010-03-05 09:39:53.20 Server      The SQL Server Network Interface library could not register the Service Principal Name (SPN) for the SQL Server service. Error: 0x2098, state: 15. Failure to register an SPN may cause integrated authentication to fall back to NTLM instead of Kerberos. This is an informational message. Further action is only required if Kerberos authentication is required by authentication policies.

This permission is called “Write servicePrincipalName” and can be altered through an MMC snap-in called ADSI Edit.  For instructions on how to modify this setting, refer to Step 3 in the following KB article.  WARNING:  I do NOT recommend you do this on a Cluster.  We have seen this cause connectivity problems due to Active Directory replication delays when more than one Domain Controller is used in your environment.

How to use Kerberos authentication in SQL Server
http://support.microsoft.com/kb/319723


So, if I enable that permission, let’s see what the SQL Service does.  I have two machines I’m going to use for this: ASKJCTP3 (running the RC build of 2008 R2) and MySQLCluster (SQL 2008 running a Named Instance called SQL2K8).

SetSPN Details:

SPNs with TCP and NP enabled on the Default Instance:

C:\>setspn -l sqlservice
Registered ServicePrincipalNames for CN=SQL Service,OU=Services,DC=dsdnet,DC=local:
        MSSQLSvc/ASKJCTP3.dsdnet.local:1433
        MSSQLSvc/ASKJCTP3.dsdnet.local

SPNs with only NP enabled on the Default Instance:

C:\>setspn -l sqlservice
Registered ServicePrincipalNames for CN=SQL Service,OU=Services,DC=dsdnet,DC=local:
        MSSQLSvc/ASKJCTP3.dsdnet.local

SPNs with TCP and NP enabled on the Clustered Named Instance:

C:\>setspn -l sqlservice
Registered ServicePrincipalNames for CN=SQL Service,OU=Services,DC=dsdnet,DC=local:
        MSSQLSvc/MYSQLCLUSTER.dsdnet.local:54675
        MSSQLSvc/MYSQLCLUSTER.dsdnet.local:SQL2K8

SPNs with only NP enabled on the Clustered Named Instance:

C:\>setspn -l sqlservice
Registered ServicePrincipalNames for CN=SQL Service,OU=Services,DC=dsdnet,DC=local:
        MSSQLSvc/MYSQLCLUSTER.dsdnet.local:SQL2K8

Let’s look at what the client will do.  When I say client, this could mean a lot of different things.  Really it means an application trying to connect to SQL Server by way of a Provider/Driver.  NOTE:  Specifying the SPN as part of the connection is specific to SQL Native Client 10 and later.  It does not apply to SqlClient or the Provider/Driver that ships with Windows.

Service Principal Name (SPN) Support in Client Connections
http://msdn.microsoft.com/en-us/library/cc280459.aspx

MSSQLSvc/fqdn

The provider-generated, default SPN for a default instance when a protocol other than TCP is used.

fqdn is a fully-qualified domain name.

MSSQLSvc/fqdn:port

The provider-generated, default SPN when TCP is used.

port is a TCP port number.

MSSQLSvc/fqdn:InstanceName

The provider-generated, default SPN for a named instance when a protocol other than TCP is used.

InstanceName is a SQL Server instance name

Based on this, if I have a straight TCP connection, the Provider/Driver will use the port for the SPN designation.  Let’s see what happens when I try to make connections using a UDL file.  For the UDL I’m going to use the SQL Native Client 10 OleDb Provider.  Starting with SNAC10, we can specify which SPN to use for the connection.  This provides us some flexibility when we control how the application is going to connect.  Note:  This is not available with the Provider/Driver that ships with Windows.  I will also show what the Kerberos request looks like in the network trace, which tells us what SPN is actually being used.  All of these connection attempts were made against ASKJCTP3, which is a Default Instance.
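For illustration, here is roughly what such a connection string looks like when the SPN is pinned from the client side (a hedged sketch: ServerSPN is the SQL Server Native Client 10.0 keyword described in the article linked above, and the server/SPN values are the ones used in this example):

Provider=SQLNCLI10;Data Source=ASKJCTP3;Integrated Security=SSPI;Initial Catalog=master;ServerSPN=MSSQLSvc/ASKJCTP3.dsdnet.local:MSSQLSERVER;

In a UDL file, the same value can typically be supplied through the provider’s Server SPN initialization property on the All tab.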

Since this is a Default Instance, I added the instance name SPN manually.

C:\>setspn -l sqlservice
Registered ServicePrincipalNames for CN=SQL Service,OU=Services,DC=dsdnet,DC=local:
        MSSQLSvc/ASKJCTP3.dsdnet.local:MSSQLSERVER
        MSSQLSvc/ASKJCTP3.dsdnet.local:1433
        MSSQLSvc/ASKJCTP3.dsdnet.local
        MSSQLSvc/MYSQLCLUSTER.dsdnet.local:54675
        MSSQLSvc/MYSQLCLUSTER.dsdnet.local:SQL2K8

Straight TCP with no SPN Specified:


58     1.796875   {TCP:7, IPv4:5}      10.0.0.3      10.0.0.1      KerberosV5    KerberosV5:TGS Request Realm: DSDNET.LOCAL Sname: MSSQLSvc/askjctp3.dsdnet.local:1433

TCP with specifying an SPN for the connection:


32     1.062500   {TCP:11, IPv4:5}     10.0.0.3      10.0.0.1      KerberosV5    KerberosV5:TGS Request Realm: DSDNET.LOCAL Sname: MSSQLSvc/ASKJCTP3.dsdnet.local:MSSQLSERVER

Forcing Named Pipes with no SPN specified:


68     1.828125   {TCP:21, IPv4:5}     10.0.0.3      10.0.0.1      KerberosV5    KerberosV5:TGS Request Realm: DSDNET.LOCAL Sname: MSSQLSvc/askjctp3.dsdnet.local

 

The way the provider/driver determines which SPN to use is based on the Protocol being used.  Of note, starting in SQL 2008 we allowed for Kerberos to be used with Named Pipes.  If you have a Named Instance and you are using the Named Pipes protocol, we will look for an SPN with the Named Instance specified.  For a Default Instance and Named Pipes, we will just look for the SPN with no port or Named Instance Name specified as shown above.

With the ability to specify the SPN from the client side, you can control which SPN is used for a given connection, or at least see how we determine which SPN will be used.

Now that we know all of the above, let’s go back to the original question.  Your company may or may not want to enable the Write permission for the Domain User Account.  If your company is not willing to open up the permission on the service account, then the only recourse is to set a static port for the Named Instance instead of letting the Named Instance use a dynamic port.  This is also my recommendation for Clusters.  In this case, you will need to know exactly which SPNs are needed and create them manually using SetSPN or the tool of your choice.
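For example, if the clustered Named Instance above were pinned to static port 54675, the manual registrations against the service account would look something like this (hedged; substitute the FQDN, port, instance name, and account from your own environment):

C:\>setspn -A MSSQLSvc/MYSQLCLUSTER.dsdnet.local:54675 sqlservice
C:\>setspn -A MSSQLSvc/MYSQLCLUSTER.dsdnet.local:SQL2K8 sqlservice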

Even though we don’t provide the ability to set your port during setup, you can still modify the port settings for the Instance through SQL Server Configuration Manager.  This will allow you to set your static SPNs as well as assist you with firewall rules.
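As a side note (not part of the original steps, just a hedged tip), once you are connected over TCP you can confirm the port an instance is actually listening on with a quick DMV query, which is handy when deciding what the static SPN should look like:

SELECT local_tcp_port FROM sys.dm_exec_connections WHERE session_id = @@SPID;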


Adam W. Saxton | Microsoft SQL Server Escalation Services

http://twitter.com/awsaxton


It helps to read the “What’s New…” once in a while


In the course of my job, I use ADPlus (a command-line tool that ships with Debugging Tools for Windows) to capture hang dumps on a regular basis.  For both low and high CPU scenarios, I generally need 2-3 hang dumps spaced out over about 10 minutes.  Historically, I have always given customers the command-line below and asked them to run it 2-3 times over 10 minutes:

adplus -hang -pn someprocess.exe -o somefolder

Most of the time, customers follow the instructions.  However, sometimes they forget my instructions or they hand my instructions off to someone else and they get garbled in the transition, etc.  The end result is that I don’t always get the series of dumps I need to properly analyze the problem.

ADPlus v7.01.002 to the rescue!!!

I was working on taking a hang dump of a process on my local machine the other day and happened to mistype the command-line.  As ADPlus is wont to do, it spit out all of the possible switches to the command-line.  Lo and behold, there was an “ADPlus Flash” section with a new switch that caught my eye:

-r <quantity> <interval in seconds>   Runs -hang multiple times


Multiple hang dumps triggered from a single command-line?  Awesome!!  Even better, some testing demonstrated that all of the hang dumps go into one folder.  This is a better experience than having parallel folders with one for each execution of -hang.
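In other words, instead of asking someone to run the command three times by hand, a single command line along these lines (the process name and output folder are placeholders) should do it:

adplus -hang -pn someprocess.exe -o somefolder -r 3 180

That requests three hang dumps, 180 seconds apart, all landing in the same output folder.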

I will be using this going forward to capture my serial hang dumps and I would recommend you do, too.

P.S.  Notice the release date for the new version of ADPlus?  2/27/2009.  That means it took me more than a year to notice this.  Sigh…

 

Evan Basalik | Senior Support Escalation Engineer | Microsoft SQL Server Escalation Services

Unable to load CLR assembly intermittently


Recently, I worked with a customer on a CLR assembly loading issue.

Intermittently, they would receive the following error.

Msg 10314, Level 16, State 11, Line 1
An error occurred in the Microsoft .NET Framework while trying to load assembly id 65537. The server may be running out of resources, or the assembly may not be trusted with PERMISSION_SET = EXTERNAL_ACCESS or UNSAFE. Run the query again, or check documentation to see how to solve the assembly trust issues. For more information about this error:
System.IO.FileLoadException: Could not load file or assembly 'helloworld, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' or one of its dependencies. Exception from HRESULT: 0x80FC80F1
System.IO.FileLoadException:
   at System.Reflection.Assembly._nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, Assembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection)
   at System.Reflection.Assembly.nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, Assembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection)
   at System.Reflection.Assembly.InternalLoad(AssemblyName assemblyRef, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection)
   at System.Reflection.Assembly.InternalLoad(String assemblyString, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection)
   at System.Reflection.Assembly.Load(String assemblyString)

 

Our initial focus was on the database that contains the CLR assembly.  Per KB http://support.microsoft.com/kb/918040, for the database that contains the CLR assembly, if the assembly has the EXTERNAL_ACCESS or UNSAFE permission set, SQL Server checks that the dbo's SID is a valid SID in sys.server_principals and matches sys.databases.

We changed the database's owner to sa, which is guaranteed to match.

But after that, the customer continued to receive the error intermittently.  Through debugging, we discovered that SQL Server also checks the dbo's SID for the database under which the query was compiled if the assembly is not loaded already.  As it turned out, the customer had many databases.  They would write queries in those databases but also reference a centralized CLR resource database.  Some of the databases had mismatched or orphaned SIDs (resulting from restores).  If the CLR assembly isn't loaded, any query referencing the CLR assembly from those databases will raise the above error as well.  Once the assembly is loaded, the check won't be done.  Most of the other databases that use the CLR objects have valid SIDs for dbo.  So if a query referencing the CLR assembly happens to run first from one of those 'good' databases, it will trigger the CLR assembly to be loaded and everything will work.

Solution:
Ensure the dbo's SID in every database matches sys.server_principals and sys.databases.  We will update the KB mentioned above as well.
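If you want to hunt for candidates yourself, a simple starting point (a hedged sketch, not the exact check SQL Server performs internally) is to look for database owner SIDs that no longer map to a server principal, then reset the owner where needed:

-- Databases whose owner SID has no matching login in sys.server_principals
SELECT d.name AS database_name, d.owner_sid
FROM sys.databases AS d
LEFT JOIN sys.server_principals AS sp
    ON d.owner_sid = sp.sid
WHERE sp.sid IS NULL;

-- Reset the owner (sa shown here, as we did in this case; the database name is a placeholder)
ALTER AUTHORIZATION ON DATABASE::[YourDatabase] TO [sa];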

A demo repro

 

1.       This repro uses a standard SQL login for ease of demonstration (a hedged T-SQL sketch of steps 2–9 appears after the error output below)

2.       Configure your SQL Server to use both Windows and SQL authentication.

3.       Create a standard login named CLRLogin and grant this login permission so that it can create databases

4.       Login as CLRLogin and Create a database named NonClrDB

5.       Backup the database NonClrDB

6.       Drop database NonClrDb and login ClrLogin

7.       Re-create CLRLogin.  This will generate a different SID

8.       Restore database NonClrDb

9.       Log on as yourself with enough permission to create another database called ClrDb and set TRUSTWORTHY on

10.   Use ClrDb and Run this script

CREATE ASSEMBLY [HelloWorld]

FROM 0x4D5A90000300000004000000FFFF0000B800000000000000400000000000000000000000000000000000000000000000000000000000000000000000800000000E1FBA0E00B409CD21B8014CCD21546869732070726F6772616D2063616E6E6F742062652072756E20696E20444F53206D6F64652E0D0D0A2400000000000000504500004C0103001DA06C4B0000000000000000E00002210B010800000800000006000000000000CE2600000020000000400000000040000020000000020000040000000000000004000000000000000080000000020000000000000300408500001000001000000000100000100000000000001000000000000000000000007826000053000000004000009003000000000000000000000000000000000000006000000C000000002600001C0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000080000000000000000000000082000004800000000000000000000002E74657874000000D4060000002000000008000000020000000000000000000000000000200000602E72737263000000900300000040000000040000000A0000000000000000000000000000400000402E72656C6F6300000C0000000060000000020000000E00000000000000000000000000004000004200000000000000000000000000000000B0260000000000004800000002000500742000008C0500000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000133002001000000001000011007201000070730F00000A0A2B00062A1E02281000000A2A42534A4201000100000000000C00000076322E302E35303732370000000005006C000000D0010000237E00003C0200007802000023537472696E677300000000B40400001C00000023555300D0040000100000002347554944000000E0040000AC00000023426C6F620000000000000002000001471402000900000000FA01330016000001000000110000000200000002000000100000000C00000001000000010000000200000000000A0001000000000006003E0037000A00660051000600930081000600AA0081000600C70081000600E60081000600FF00810006001801810006003301810006004E01810006008601670106009A0181000600C601B3013700DA01000006000902E90106002902E9010A006202470200000000010000000000010001000100100019000000050001000100502000000000960070000A0001006C200000000086187B000F00010019007B00130021007B00130029007B00130031007B00130039007B00130041007B00130049007B00130051007B00130059007B00180061007B00130069007B001D0079007B00230081007B000F0089007B000F0011007B00130009007B000F002000730028002E002B0032002E00130042002E001B0042002E00230048002E000B0032002E00330057002E003B0042002E004B0042002E005B0078002E00630081002E006B008A002D000480000001000000680E1E760000000000007000000002000000000000000000000001002E0000000000020000000000000000000000010045000000000000000000003C4D6F64756C653E0048656C6C6F576F726C642E646C6C0055736572446566696E656446756E6374696F6E73006D73636F726C69620053797374656D004F626A6563740053797374656D2E446174610053797374656D2E446174612E53716C54797065730053716C537472696E670048656C6C6F576F726C64002E63746F720053797374656D2E5265666C656374696F6E00417373656D626C795469746C6541747472696275746500417373656D626C794465736372697074696F6E41747472696275746500417373656D626C79436F6E66696775726174696F6E41747472696275746500417373656D626C79436F6D70616E7941747472696275746500417373656D626C7950726F6475637441747472696275746500417373656D626C79436F7079726967687441747472696275746500417373656D626C7954726164656D61726B41747472696275746500417373656D626C7943756C747572654174747269627574650053797374656D2E52756E74696D652E496E7465726F70536572766963657300436F6D56697369626C6541747472696275746500417373656D626C7956657273696F6E4174747269627574650053797374656D2E446961676E6F73746963730044656275676761626C6541747472696275746500446562756767696E674D6F6465730053797374656D2E52756E74696D652E436F6D70696C6572536572766963657300436F6D70696C6174696F6E52656C61786174696F6E734174747269627574650052756E74696D65436F6D7061746962696C69747941
7474726962757465004D6963726F736F66742E53716C5365727665722E5365727665720053716C46756E6374696F6E41747472696275746500000017480065006C006C006F00200057006F0072006C006400000000008895DD36598FB94681CF27B4C443A44E0008B77A5C561934E089040000110903200001042001010E04200101020520010111390420010108040100000004070111090F01000A48656C6C6F576F726C6400000501000000000E0100094D6963726F736F667400002001001B436F7079726967687420C2A9204D6963726F736F6674203230313000000801000701000000000801000800000000001E01000100540216577261704E6F6E457863657074696F6E5468726F777301000000000000001DA06C4B0000000002000000590000001C2600001C08000052534453907F1962D03D9843AD482AF957AE4FF801000000493A5C63617365735C636C722E6C6F61645C48656C6C6F576F726C645C48656C6C6F576F726C645C6F626A5C44656275675C48656C6C6F576F726C642E70646200000000A02600000000000000000000BE260000002000000000000000000000000000000000000000000000B026000000000000000000000000000000005F436F72446C6C4D61696E006D73636F7265652E646C6C0000000000FF250020400000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100100000001800008000000000000000000000000000000100010000003000008000000000000000000000000000000100000000004800000058400000380300000000000000000000380334000000560053005F00560045005200530049004F004E005F0049004E0046004F0000000000BD04EFFE00000100000001001E76680E000001001E76680E3F000000000000000400000002000000000000000000000000000000440000000100560061007200460069006C00650049006E0066006F00000000002400040000005400720061006E0073006C006100740069006F006E00000000000000B00498020000010053007400720069006E006700460069006C00650049006E0066006F00000074020000010030003000300030003000340062003000000034000A00010043006F006D00700061006E0079004E0061006D006500000000004D006900630072006F0073006F0066007400000040000B000100460069006C0065004400650073006300720069007000740069006F006E0000000000480065006C006C006F0057006F0072006C0064000000000040000F000100460069006C006500560065007200730069006F006E000000000031002E0030002E0033003600380038002E00330030003200330038000000000040000F00010049006E007400650072006E0061006C004E0061006D0065000000480065006C006C006F0057006F0072006C0064002E0064006C006C00000000005C001B0001004C006500670061006C0043006F007000790072006900670068007400000043006F0070007900720069006700680074002000A90020004D006900630072006F0073006F0066007400200032003000310030000000000048000F0001004F0072006900670069006E0061006C00460069006C0065006E0061006D0065000000480065006C006C006F0057006F0072006C0064002E0064006C006C000000000038000B000100500072006F0064007500630074004E0061006D00650000000000480065006C006C006F0057006F0072006C0064000000000044000F000100500072006F006400750063007400560065007200730069006F006E00000031002E0030002E0033003600380038002E00330030003200330038000000000048000F00010041007300730065006D0062006C0079002000560065007200730069006F006E00000031002E0030002E0033003600380038002E003300300032003300380000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000002000000C000000D03600000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

WITH PERMISSION_SET = UNSAFE

 

 

go

 

CREATE FUNCTION [HelloWorld]

 

(

)

RETURNS nvarchar(4000)

AS

EXTERNAL NAME [HelloWorld].[UserDefinedFunctions].[HelloWorld]

 

go

 

11.   Use ClrDb and run select dbo.HelloWorld() and ensure it works.

12.   Now restart your server.  Restarting is necessary because we want the assembly to be reloaded from the database; due to the step above, the CLR assembly is already loaded.

13.   Then use NonClrDb and run select clrdb.dbo.HelloWorld().  This will result in error 10314 because NonClrDb has a mismatched SID.

 

 

Msg 10314, Level 16, State 11, Line 1

An error occurred in the Microsoft .NET Framework while trying to load assembly id 65537. The server may be running out of resources, or the assembly may not be trusted with PERMISSION_SET = EXTERNAL_ACCESS or UNSAFE. Run the query again, or check documentation to see how to solve the assembly trust issues. For more information about this error:

System.IO.FileLoadException: Could not load file or assembly 'helloworld, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' or one of its dependencies. Exception from HRESULT: 0x80FC80F1

System.IO.FileLoadException:

   at System.Reflection.Assembly._nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, Assembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection)

   at System.Reflection.Assembly.nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, Assembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection)

   at System.Reflection.Assembly.InternalLoad(AssemblyName assemblyRef, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection)

   at System.Reflection.Assembly.InternalLoad(String assemblyString, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection)

   at System.Reflection.Assembly.Load(String assemblyString)
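For convenience, here is a hedged T-SQL sketch of setup steps 2–9 (passwords and file paths are illustrative; run the step 4 portion while actually logged in as CLRLogin, as the steps describe):

-- Steps 2-3: create the login and allow it to create databases
CREATE LOGIN CLRLogin WITH PASSWORD = 'Str0ngP@ssword!';
GRANT CREATE ANY DATABASE TO CLRLogin;

-- Step 4 (run as CLRLogin): create the database so CLRLogin is its owner
CREATE DATABASE NonClrDB;

-- Steps 5-8: back it up, drop the database and the login, re-create the login
-- (which generates a new SID), then restore the database
BACKUP DATABASE NonClrDB TO DISK = 'C:\temp\NonClrDB.bak';
DROP DATABASE NonClrDB;
DROP LOGIN CLRLogin;
CREATE LOGIN CLRLogin WITH PASSWORD = 'Str0ngP@ssword!';
RESTORE DATABASE NonClrDB FROM DISK = 'C:\temp\NonClrDB.bak';

-- Step 9 (run as yourself): create the CLR database and mark it trustworthy
CREATE DATABASE ClrDb;
ALTER DATABASE ClrDb SET TRUSTWORTHY ON;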

 

 

 

 

 

 

 

Jack Li | Senior Escalation Engineer | Microsoft SQL Server Support

How It Works: Bob Dorr's SQL Server I/O Presentation


I put a presentation together quite some time ago going over various SQL Server I/O behaviors and relating them to the SQL Server I/O whitepapers I authored.    I keep getting requests to post the presentation and the information is relevant from SQL 7.0 to SQL Server 2008 and beyond. 

Here are the raw slides and my speaker notes.  You can read the reference materials outlined in detail on the final slides for completeness.

The original presentation was given to a group of support engineers and later to several customers (DBAs).
The goal of the presentation was to expose the attendees to the wide range of SQL Server I/O topics so they had a better understanding of what the system requirements are and how to troubleshoot common problems.

As you can see from the list of topics, this presentation discusses a broad range of SQL Server I/O aspects.

The WAL (write-ahead logging) protocol for SQL Server requires the ability to secure/harden the log records to stable media. SQL Server uses the FILE_FLAG_WRITE_THROUGH option when opening the file (CreateFile) to accomplish this.
Hard/stable media is deemed any media that can survive a power failure.   This could be the physical storage device or a sophisticated, battery-backed caching mechanism.   As long as the I/O path returns a successful write to SQL Server, it can uphold that guarantee.

SQL Server uses the WAL protocol design to accomplish durability of transactions.

When I whiteboard this slide I talk about commit and rollback and the impact of locks and latches. For example a latch is only used to protect the physical integrity of the data page while in memory. A lock is more of a virtual concept and can protect data that is no longer or not yet in memory.

Take the following as an example “update tblxxx set col1 = GetDate()” where the table is 1 billion rows.

Simplified process to discuss.

Begin transaction

Fetch Page

Acquire Lock

Acquire Latch

Generate log record

Update page LSN to match that of the log record

Change Data (dirty the page)

Release Latch

Fetch Next Page

Acquire Lock

Commit Transaction Starting

FlushToLSN of Commit

Release locks

Commit Transaction Complete

The entire table won’t fit into memory so lazy writer will write dirty pages to disk. In doing so lazy writer will first call the routine FlushToLSN to flush all log records up to and including the LSN stored in the page header of the dirty page. Only then will lazy writer issue the write of the data page. If a power outage occurs at this point the log records can be used to reconstruct the page (rollback the changes).

Notice that the latch protects the physical access to the in-memory data page only for the small amount of time the physical change is being made. The lock protects the associated data until the commit or rollback takes place.

This allows lazy writer to remove a dirty page while the locking structures continue to maintain the transactional integrity.

The question I always ask here is if I issue a rollback instead of a commit would SQL Server fetch back in pages to undo the update? The answer is yes. The page could have been removed from buffer pool/page cache by lazy writer after the FlushToLSN took place. When the rollback occurs the pages changed by the transaction may need to be fetched into buffer pool to undo the change.

SQL Server 2000 used to use the log records to populate the INSERTED and DELETED tables in a trigger. Snapshot isolation is now used, internally, for any trigger instead of having to scan the log records. This means the version store (located in TEMPDB) is involved in INSERTED and DELETED table population and row version tracking.

For whatever reason this is a confusing subject for many people. Spend time on this slide to make sure everyone understands sync vs async well.

I like to use an example outside of the direct API calls before I dive into API behaviors. I often use the example of a package vs a phone conversation for a send-and-response paradigm to help explain the behavior a bit.

If you go to the post office and send a package, the contents of the package move to the destination in a single direction. Once it is sent you can’t really cancel the action and you have to wait for the recipient to receive the package and reply to you. It is a synchronization point, or a sync type of activity.

Instead if you are in a phone conversation the traffic is two-way. In fact, many of us interrupt each other with answers and information before the full message even arrives at the other end of the conversation. You can hang up the phone to cancel the transmission or ask a customer service representative a question, get put on hold, do other things and then get an answer. This is closer to an asynchronous operation.

The more I do this presentation, the more I think it might be better to compare sending a package vs sending an e-mail. You can send the e-mail and it goes out while you do other activities (reading other e-mails and being efficient with your time). You can later come back and check the response, send location, etc…

Others have suggested I use the idea of hand writing a note vs sending the note to the printer. When hand writing you are tied up (sync) and can’t do other things. When you send the note to the printer you can do other things (async) while the physical generation of the note takes place.

In Windows the I/O APIs allow sync and async requests. Sync requests are calls to the API such as WriteFile that will not return control to the calling code until the operation is complete. Async hands the request off to the operating system and associated drivers and returns control to the calling code. The calling code is free to execute other logic and later come back to see if/when the I/O completes.

SQL Server uses mostly Async I/O patterns. This allows SQL Server to write or read a page and then continue to use the CPU and other resources effectively. Take the example of a large sort operation. SQL Server can use its read ahead logic to post (async request) many pages and then start processing the first page returned by the request. This allows SQL Server to use the CPU resources to sort the rows on the page while the I/O subsystem is fetching (reading) in other pages at the same time. Maximizing the I/O bandwidth and using other resources such as CPU more effectively.

If you want to know more about Async processing study the Overlapped structure associated with I/O requests and HasOverlappedIOCompleted.

SQL Server also exposes the pending (async) I/O requests in the sys.dm_io_pending_io_requests DMV. I specifically point out that the column ‘io_pending’ is a key to understanding the location of the I/O request and who is responsible for it. If the value is TRUE it indicates that HasOverlappedIOCompleted returned FALSE and the operating system or driver stack has yet to complete the I/O. Looking at the io_pending_ms_ticks indicates how long the I/O has been pending. If the column reports FALSE for io_pending it indicates that the I/O has completed at the operating system level and SQL Server is now responsible for handling the remainder of the request.
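To see this in practice, a query along these lines (joining the pending request to the file it targets via the file handle) shows which files have I/Os the operating system has not yet completed and how long they have been pending:

SELECT ior.io_pending,
       ior.io_pending_ms_ticks,
       ior.io_type,
       vfs.database_id,
       vfs.file_id
FROM sys.dm_io_pending_io_requests AS ior
JOIN sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
     ON ior.io_handle = vfs.file_handle
ORDER BY ior.io_pending_ms_ticks DESC;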

Using Async I/O means that SQL Server will often exceed the recommended disk queue length depth of (2). This is by design as I just described with the read ahead logic as one example. SQL Server is more concerned with the number of I/O requests and average disk seconds per transfer than the actual queue depth.

There are a few places in SQL Server where async I/O does not make sense. For example, if you are writing to a tape drive the backup has to lay down blocks in order, so the I/O request is sync, and the thread(s)/worker(s) doing this activity are located on hidden schedulers so they don’t cause any scheduler stalls.

An advantage of async is the avoidance of forcing a SQL Server worker to stay in kernel mode, which allows it to do other user mode processing like the sort activity I describe here. Thus, it reduces the number of kernel threads in wait states and allows SQL Server to better work with the operating system and the SQLOS scheduler design.

In service pack 4 for SQL Server 6.5 scatter/gather I/O APIs started to be used. Prior to this a checkpoint would first sweep the buffer pool and locate all dirty pages for the current checkpoint generation and place them on a list in page id sorted order. The older design was attempting to write the pages in an order that was often close to on disk order.

One problem that SQL 6.x and previous builds had was elevator seeking. The drive(s) would often service the I/O requests closest to the drive head. So in some cases a single I/O could be stalled longer than expected. If this was a critical page in an index it could lead to unwanted concurrency stalls as checkpoint or lazy writer executed. Another problem was the need to have a separate list to maintain. Yet another problem was the number of I/O requests.

Scatter/gather addresses all of these issues nicely.

First we are able to remove the sweep and sort onto a list of dirty pages. Instead a new routine named WriteMultiple was added to SQL Server. Whenever WriteMultiple is called it can look in the hash table for the next ## of pages and attempt to bundle a larger I/O request. The diagram shows the pages disbursed in physical memory but located next to each other on disk. Without scatter/gather each of these data pages would require a separate I/O request to write the physically disbursed pages to disk. With WriteFileGather the pages can all be part of a single I/O request.

This design removes the need to sort. All SQL Server has to do is sweep the buffer pool from position 0 to …. N and locate a dirty page for the database. Call WriteMultiple that will gather pages that would be in physical order next to it on disk that are also dirty and issue the write. SQL Server 2005 will gather up 16 pages with page numbers greater than the initial page requested and SQL Server 2008 can gather up 32 pages before or after the requested page that will make a contiguous write request.

By doing the sweep the writes are now out of order and are no longer as prone to elevator seek issues, and they are more efficient because the size of the transfers is larger with fewer transfer requests.

ReadFileScatter is used for reading pages into the buffer pool and performs the opposite operation. There is no longer a need to have a contiguous 8, 16, 32, 64, …K chunk of memory. SQL Server merely needs to locate 8K chunks of memory and setup the read request. During the read the data is placed in disparate locations in memory. Prior to the scatter request each of these would result in a separate I/O request.

Some SQL Server versions (Enterprise for example) will do additional ramp-up reads. When SQL Server is first started each page read is expanded to 8 pages so the buffer pool is populated quickly with additional pages near the pages that are currently being asked for. This can reduce the time required to return the buffer pool to a warm/hot state. Once the commit target is reached the ramp-up behavior is disabled.

Sector size is used when writing to the log. Versions of SQL Server before SQL Server 7.0 used data page sizes for the log records and the page could be re-written. This actually violates the intention of DURABILITY. For example, you have a committed transaction that has completed its FlushToLSN and released locks but the data pages have not been written to the data file. Another transaction issues a FlushToLSN and the same log page write occurs with the additional log records. If this write fails (bad sector or hardware failure) you may have lost information about the transaction that was previously considered committed.

The SQL 6.x design can be faster than the SQL 7.0 and later design because the same location on disk may be written several times but it is unsafe. The SQL 6.x design can also pack more log records, for smaller transactions, on the same set of sectors where SQL 7.0 and later builds will use more log (.ldf) disk space.

SQL 7.0 changed the logging algorithms to sector based. Any time the physical flush takes place it occurs on a sector boundary. The FlushToLSN will attempt to pack as many active log records into sector aligned boundaries and write on the sector size. The sectors each contain a parity bit that the log can use to detect the valid sectors flushed before a crash recovery.

Hardware manufacturers typically maintain that the stable media has enough capacity to ensure a complete sector write when a power outage is encountered, so the sector size is the most stable choice.

Take the following example:

DECLARE @i int = 1
WHILE (@i <= 1000)
BEGIN
    INSERT INTO tblTest VALUES (…)
    SET @i = @i + 1
END

SQL Server 6.x would keep writing the same log page over and over. SQL Server 7.0 and later builds will flush each insert so 1000 sectors are used. Many customers have encountered this and needed to understand the new behavior.

To correct the issue you should put groups of inserts into a single transaction. For this example if you wrap the entire loop in a begin / commit a single FlushToLSN is issued and all 1000 inserts are compacted into a handful of log sectors.
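As a hedged sketch (the inserted values are elided just as on the slide), the grouped version looks like this:

BEGIN TRANSACTION

DECLARE @i int = 1
WHILE (@i <= 1000)
BEGIN
    INSERT INTO tblTest VALUES (…)
    SET @i = @i + 1
END

COMMIT TRANSACTION  -- a single FlushToLSN hardens all 1000 inserts together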

CAUTION: Wrapping transactions broadly can reduce concurrency, so control break processing and transaction grouping is usually a better design than global wrapping.

Some newer drives can have sectors larger than 512 bytes. This is not a problem for SQL Server, but restoring a database between drives with different sector sizes can be prevented by SQL Server. The reason for preventing the move is to avoid the possibility of sector rewrites.

For example the database is created in a drive with a sector size of 512 bytes and (if allowed by SQL) restored to a 4096 byte sector size drive. SQL Server’s .ldf metadata and log file initialization is tracking on 512 byte boundaries. So it would continue to flush log records on 512 byte sectors. This could result in sector rewrites during subsequent flushes to the 4096 sectors and leave the database susceptible to log record loss.

Note that some drives with large sector sizes will report 512 bytes for backward compatibility and do the re-writes without the system knowing about it. You should validate the physical sector size vs reported sector sizes when using these new drives.

Block size and alignment came up often in support before the NTFS changes in Windows 2008 that adjust the alignment to a better boundary.

The problem is often that the partition alignment falls at 63 512-byte sectors, so every fetch and write of a 64K SQL Server extent results in 2 disk block touches. You want to avoid rewrites of a block just to handle the 64th sector and prevent stable media damage to the other 63 sectors. You also want to avoid the performance impact of the 2-for-1 operations.

Review any number of KB articles related to DiskPart/DiskPar and work with the hardware manufacturer to make sure the proper block alignment is achieved. You can also look at the SQLIO.exe utility to help tune your I/O path for SQL Server.

Defragmentation is sometimes a good idea for SQL Server but generally not needed. I usually see benefit if the database shrinks and grows a lot, so it would be releasing and acquiring physical sectors frequently. If the database size is fixed the sectors are acquired one time and usually in blocks.

Whenever you defragment a volume with SQL Server files be sure to take a SQL Server backup first and make sure the defragmentation utility is transactional. The utility must acquire new space, make the copy of the data and release the space in a transactionally safe way so a power outage during defragmentation does not damage the SQL Server files.

The LATCH protects the physical access to the in-memory database page. They are used in other areas of the server for synchronization but for I/O they protect the physical integrity of the page.

For example, when the page is being read into data cache there is no way to tell how much of the page is valid for reading until the I/O is fully complete. SQL Server associates a latch with every database page held in-memory. When a read of the page from the data file takes place an exclusive (EX) latch is acquired and held until the read completes. This prevents any other access to the physical page. (PAGE_IO*_LATCH) wait types are used when reading and writing pages and are expected to be long page latches (I/O speed).

This is different from the lock because multiple row locks can apply to the same page but only a single latch is used to protect the physical integrity. A (PAGE*_LATCH) indicates a latch is held on a page that is already in memory (not in I/O) and it should be held for only the time needed to modify some physical data on the page. This is considered a short latch and allows SQL Server to maintain row level locking in conjunction with the need for the specific physical change to be synchronized (one at a time) on the page itself.

The latch allows multiple readers but a single writer. So once the page is in memory it can be read (select for example) by 100s of sessions at the same time. SH (shared) acquires don’t block each other. The latch is basically FIFO, which prevents live lock scenarios from occurring.

The latch is implemented in user mode and does not involve kernel synchronization objects. Instead it is built in conjunction with SQLOS to properly yield to other workers and maximize the overall resource usage by the SQL Server.

You can wait on yourself? Yes, it is possible to wait on yourself and that behavior was always part of the latch design but only exposed starting with SQL Server 2000 SP4. In the case of a read the worker acquires an EX latch and posts (async request) the I/O. The worker goes off and does other work and later comes back to read the data on the page that it put in motion. It will attempt to acquire an SH latch on the page and if the I/O is still pending the original EX latch mode will block it. (Blocked on an I/O request you posted yourself.) When the I/O completes the EX latch is released and processing continues. The aspect of this to be aware of is that you don’t want large wait times for PAGE_IO*_LATCH types or it indicates SQL Server is using an I/O pattern that the hardware is not responding to in a timely fashion.

Many jump to the conclusion that if you see average disk seconds per transfer > 4ms or > 10ms you have an I/O bottleneck at the hardware. This may not be true. As you recall, I discussed earlier that read ahead can post a deep number of I/Os. While you want the average disk seconds per transfer to be small, the PAGE_IO*_LATCH wait type is a good indicator of how well the sub-system is responding to the needs of SQL Server. The virtual file statistics DMV is another good place to determine how well the I/O sub-system is responding to SQL Server requests.
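For reference, the virtual file statistics DMV mentioned here can be queried along these lines to see cumulative stall time per file (a hedged example; the numbers are totals accumulated since the files were opened):

SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
ORDER BY vfs.io_stall_read_ms + vfs.io_stall_write_ms DESC;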

Sub-latch is also referred to as super latch. These are only used for data cache page latches. They are designed to reduce the latching contention on hot pages. For example if you have a lookup table that is only a few pages in size but used by 100s of queries per second that SH latch activity is aggressive to protect the page. When SQL Server detects high rates of SH latch activity for a sustained period a buffer latch is PROMOTED to a sub-latch. A sub-latch partitions a single latch into an array of latch structures per logical CPU. In doing so the worker (always assigned to a specific CPU) only needs to acquire a SH on the sub-latch assigned to the local scheduler. This avoids interlocked activity and cache line invalidations across all physical CPUs. The acquiring of an SH sub-latch resource uses less resources and scales access to hot pages better.

The downside of a sub-latch is that when an EX latch is requested the EX latch must be acquired on EVERY sub-latch. When a sub-latch detects a series of EX requests the sub-latch can be DEMOTED back to the single latch mechanisms.

I have touched on reading a page on previous slides already and described the locks vs latching mechanisms. Now walk-through a page read in detail with audit checks and such.

When a worker needs to access a page it calls routines such as BufferPool::GetPage. The GetPage routine does a hash search looking for a BUF structure that already has the page in memory. If found, the page is latched and returned to the caller. If not found, the page must be read from disk.

Here is the simplest form of reading a page. SQL Server can read pages with read ahead, ramp-up and other logic but the following is the clearest for discussion.

Step 1: A request to the memory manager for an aligned (OS Page alignment 4K or 8K) 8K page is made.

Step 2: The page is associated with a BUF structure for tracking the page

Step 3: The EX latch is acquired to protect the page

Step 4: The BUF is inserted into the hash table. In doing so all other requests for the page use the same BUF and Page and access is currently prevented by the EX latch

If the entry is already in the hash table release the memory and use what is already in the hash table at this time

Step 5: Setup the I/O request and post (async I/O request) the I/O request.

Step 6: Attempt to acquire the requested latch type asked for. (This will block until the page read completes)

Step 7: Check for any error conditions that may be present for the page and raise an error if present.

Some errors result in additional activity. For example a checksum failure will result in read-retry behavior. Exchange and SQL Server have found that in some instances issuing the same read again (up to 4 times) can return the correct image of the page. The SQL Server error log will reflect that retries were attempted and successful or failed. In either case the retry messages should be taken as a sign of possible sub-system problems and corrected.

SQLIOSim.exe ships with SQL Server 2008 or can be downloaded. It mimics SQL Server I/O behavior(s) and patterns as well as adds random I/O patterns to the test passes. We have done extensive testing with the utility and it often will reproduce the same I/O problem(s) logged in the SQL Server error log independently of the SQL Server process or database files. Use it to help narrow a reproduction on a troubled system. CAUTION: SQLIOSIM can’t be used for performance testing as it can post I/O requests at depths of 10,000 or more to make sure the sub-system and drivers don’t cause blue screens when the I/O depth is stressed. Some drivers have caused blue screens and others don’t recover well. It is expected that the I/O response time will be poor but the system should recover gracefully.

When the read completes it does not release the EX latch until audit activity takes place. (823, 824, 605 and such error condition checks).

The process of completing an I/O is a callback routine and can’t log an error so an error code (berrcode) is set in the BUF structure and the next acquire (Step 7 above) will check for the error and handle it accordingly.

•Check the number of bytes transferred

•Then the operating system error code

•Does the page in the page header match that expected from the offset (offset / 8K) - Some sub-system bugs will return the wrong offset (605 error)

•If PAGE_AUDIT is enabled check the checksum or torn bit information

•If the trace flag is enabled to perform dbcc audit a page audit is executed

Set the berrcode accordingly and release the latch. Compliant waiters of the latch are woken to continue the processing.

Revisit the PAGE_IO* vs PAGE_* latch meanings.

Writing a page is pretty much like reading a page. The page is already in memory and the BUF status is dirty (changed). To write a page SQL Server always uses the WriteMultiple routine that I discussed on an earlier slide.

Lazy Write – Clock sweeping the buffer pool to maintain the free lists. A buffer is found dirty and the time of last access shows the buffer can be aged so WriteMultiple is called on the buffer.

Checkpoint – A request to checkpoint a database is enqueued or requested. This can happen for various reasons (the number of changes in the database would exceed the recovery interval, a backup is issued, a manual checkpoint, an alteration requiring a checkpoint). A sweep from ordinal 0 to max committed is done, locating the dirty pages associated with the specified database, and WriteMultiple is called.

Eager Writes – During some operations (BCP, non-logged blob activity, …) pages are flushed during the transactional activity as they must be on disk to complete the transaction. These are deemed eager writes and WriteMultiple is used for these writes as well.

If you will recall, WriteMultiple does not just write the requested page but attempts to build up a request for those pages that are dirty and adjacent to the page to reduce the number of I/O requests and increase the I/O request size for better performance.

To write the page a latch must first be acquired. In most cases this is an EX latch to prevent further changes on the page. For example the EX latch is acquired, the checksum or torn bits are calculated, and the page is then written. The page can never change during the write or it will become corrupted. You might think an SH latch would be enough, since it would prevent an EX latch from changing the page, so why would an EX latch be required during the write, blocking readers? Take the torn PAGE_AUDIT protection as the example. The torn bit protection changes a bit on every sector. If read in this state it would appear as if the page were corrupted. So to handle torn bit protection the EX latch is acquired, the write completes, and the torn bit protection is then removed from the in-memory copy of the page so readers see the right data. In most instances the EX latch is used but SQL Server will use an SH latch when possible to allow readers during the write.

Stalled/Stuck I/O: SQL Server 2000 SP4 added a warning that the I/O was taking too long and appears to be stuck or stalled. When an I/O request is posted (async) the time is kept (sys.dm_io_pending_io_requests) with the tracking information. Lazy writer checks these lists periodically and if any I/O is still pending at the operating system level (FALSE == HasOverlappedIoCompleted) and 15 seconds has elapsed the warning is recorded. Each file will report the number of stalls at most every 5 minutes to avoid flooding the log.

Since a normal I/O request should respond in ~15ms or less, 15 seconds is way too long. For example, if the I/O request is stalled for 30 seconds and the query timeout is 30 seconds it can cause query timeouts. If the stalled I/O request is for the log it can cause unwanted blocking situations.

If you are seeing these warnings you need to double check the I/O sub-system and use SQLIOSIM.exe to help narrow the problem. It can be anything from the configured HBA queue depth, multi-path failover detection mechanism, virus scanners or other filter drivers.

The Microsoft Platforms team can use ETW tracing facilities to help track down the source of the stalled/stuck I/O request.

In some situations the stall can result in scheduler hang situations (17883) for example. The slide shows a stack from a stuck I/O request. Remember that SQL Server I/O is mostly async so the call to WriteFile should be fast, just a hand-off. However, if a filter driver gets stuck before the I/O is considered posted at the (Interrupt Request Packet (IRP)) level the kernel call will appear as if the I/O request was sync. This is bad because the worker that is posting the async I/O is stuck in kernel mode and the logical scheduler is not progressing. SQL Server will detect this and issue the 17883 warning and capture a mini-dump.

image

I have touched on reading a page on previous slides already and described the locks vs latching mechanisms. Now walk-through a page read in detail with audit checks and such.

When a worker needs to access a page is calls routines such as BufferPool::GetPage. The GetPage routine does a hash search looking for a BUF structure that already has the page in memory. If found the page is latched and returned to the caller. If not found the page must be read from disk.

Here is the simplest form of reading a page. SQL Server can read pages with read ahead, ramp-up and other logic but the following is the clearest for discussion.

Step 1: A request to the memory manager for an aligned (OS Page alignment 4K or 8K) 8K page is made.

Step 2: The page is associated with a BUF structure for tracking the page

Step 3: The EX latch is acquired to protect the page

Step 4: The BUF is inserted into the hash table. In doing so all other requests for the page use the same BUF and Page and access is currently prevented by the EX latch

If the entry is already in the hash table release the memory and use what is already in the hash table at this time

Step 5: Setup the I/O request and post (async I/O request) the I/O request.

Step 6: Attempt to acquire the requested latch type asked for. (This will block until the page read completes)

Step 7: Check for any error conditions that may be present for the page and raise an error if present.

Some errors result in additional activity. For example a checksum failure will result in read-retry behavior. Exchange and SQL Server have found that in some instances issuing the same read again (up to 4 times) can return the correct image of the page. The SQL Server error log will reflect that retries were attempted and successful or failed. In either case the retry messages should be taken as a sign of possible sub-system problems and corrected.

SQLIOSim.exe ships with SQL Server 2008 or can be downloaded. It mimics SQL Server I/O behavior(s) and patterns as well as adds random I/O patterns to the test passes. We have done extensive testing with the utility and it often will reproduce the same I/O problem(s) logged in the SQL Server error log independent from the SQL Server process or database files. Use it to help narrow a reproduction on a troubled system. CAUTION: SQLIOSIM can’t be used for performance testing as it can post I/O requests at depths of 10,000 or more to make sure the sub-system and drivers don’t cause blue screens when the I/O depth is stressed. Some drivers have caused blue screens and others don’t recover well. It is expected that the I/O response time will be poor but the system should recovery gracefully.

When the read completes it does not release the EX latch until audit activity takes place. (823, 824, 605 and such error condition checks).

The process of completing an I/O is a callback routine and can’t log an error so an error code (berrcode) is set in the BUF structure and the next acquire (Step 7 above) will check for the error and handle it accordingly.

•Check the number of bytes transferred

•Then the operating system error code

•Does the page in the page header match that expected from the offset (offset / 8K) - Some sub-system bugs will return the wrong offset (605 error)

•If PAGE_AUDIT is enabled check the checksum or torn bit information

•If the trace flag is enabled to perform dbcc audit a page audit is executed

Set the berrcode accordingly and release the latch. Compliant waiters of the latch are woken to continue the processing.

Revisit the PAGE_IO* vs PAGE_* latch meanings.

Writing a page is just pretty much like reading a page. The page is already in memory and the BUF status is dirty (changed). To write a page SQL Server always used the WriteMultiple that I discussed during an earlier slide.

Lazy Write – Clock sweeping the buffer pool to maintain the free lists. A buffer is found dirty and the time of last access shows the buffer can be aged so WriteMultiple is called on the buffer.

Checkpoint – A request to checkpoint a database is enqueued. This can happen for various reasons: the number of changes in the database would exceed the recovery interval, a backup is issued, a manual checkpoint, or an alteration requiring a checkpoint. A sweep from ordinal 0 to the max committed buffer is done, locating the dirty pages associated with the specified database, and WriteMultiple is called.

Eager Writes – During some operations (BCP, non-logged blob activity, …) pages are flushed during the transactional activity as they must be on disk to complete the transaction. These are deemed eager writes and WriteMultiple is used for these writes as well.

If you will recall, WriteMultiple does not just write the requested page but attempts to build up a request for those pages that are dirty and adjacent to the page to reduce the number of I/O requests and increase the I/O request size for better performance.

To write the page a latch must first be acquired. In most cases this is an EX latch to prevent further changes on the page. For example, the EX latch is acquired, the checksum or torn bits are calculated, and the page is then written. The page can never change during the write or it will become corrupted. You might think an SH latch would be enough, since an SH latch prevents an EX latch from changing the page, so why is an EX latch required during the write, blocking readers? Take the torn PAGE_AUDIT protection as the example. The torn bit protection changes a bit on every sector. If read in this state the page would appear to be corrupted. So to handle torn bit protection the EX latch is acquired, the write completes, and then the torn bit protection is removed from the in-memory copy of the page so readers see the right data. In most instances the EX latch is used, but SQL Server will use an SH latch when possible to allow readers during the write.

Stalled/Stuck I/O: SQL Server 2000 SP4 added a warning when an I/O is taking too long and appears to be stuck or stalled. When an I/O request is posted (async) the posting time is kept with the tracking information (exposed in sys.dm_io_pending_io_requests). The lazy writer checks these lists periodically, and if any I/O is still pending at the operating system level (HasOverlappedIoCompleted returns FALSE) and 15 seconds have elapsed, the warning is recorded. Each file reports the number of stalls at most every 5 minutes to avoid flooding the log.

Since a normal I/O request should respond in ~15ms or less, 15 seconds is far too long. For example, if an I/O request is stalled for 30 seconds and the query timeout is 30 seconds it can cause query timeouts. If the stalled I/O request is for the log it can cause unwanted blocking situations.

If you are seeing these warnings you need to double check the I/O sub-system and use SQLIOSIM.exe to help narrow the problem. It can be anything from the configured HBA queue depth, multi-path failover detection mechanism, virus scanners or other filter drivers.
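
To see what is outstanding while the warning is being reported, sys.dm_io_pending_io_requests can be joined to sys.dm_io_virtual_file_stats to map the pending request back to a database file. A minimal sketch:

SELECT vfs.database_id,
       vfs.file_id,
       pio.io_type,
       pio.io_pending,             -- 1 = still pending at the operating system level
       pio.io_pending_ms_ticks     -- how long the request has been outstanding
FROM sys.dm_io_pending_io_requests AS pio
JOIN sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
     ON pio.io_handle = vfs.file_handle
ORDER BY pio.io_pending_ms_ticks DESC;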

The Microsoft Platforms team can use ETW tracing facilities to help track down the source of the stalled/stuck I/O request.

In some situations the stall can result in scheduler hang situations (17883), for example. The slide shows a stack from a stuck I/O request. Remember that SQL Server I/O is mostly async, so the call to WriteFile should be fast, just a hand-off. However, if a filter driver gets stuck before the I/O is considered posted at the Interrupt Request Packet (IRP) level, the kernel call will appear as if the I/O request were synchronous. This is bad because the worker that is posting the async I/O is stuck in kernel mode and the logical scheduler is not progressing. SQL Server will detect this, issue the 17883 warning and capture a mini-dump.

image

This is a myth that I have worked hard at dispelling. There was some wording in Books Online that was inaccurate and led people to believe that there are special per-file I/O threads in SQL Server. This is NOT the case. When doing I/O each worker is free to post and process I/O, so there is no per-file, thread-based throttle on I/O. SQL Server will do as much I/O as necessary across any worker thread.

There are some things that change this behavior.

The first is create database. Before instant file initialization was added to SQL Server, the data files were zeroed (zeros written to every byte in the file) when the file was created. In order to do this faster a set of workers is used. SQL Server 2005 and 2008 still use the concept of workers aligned per volume. When you create a database the workers are used to create the files. The zeroing is no longer needed on data files as long as instant file initialization is enabled; if it is not, they will zero the contents of the data files. Log files are always zeroed. So for creation of the database each volume (usually by drive letter) does get a worker to create the files assigned to the volume. Once created, any worker can do I/O on the file.
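
If you want to watch the zeroing behavior on your own system, trace flags 3004 and 3605 are commonly cited as reporting the zeroing operations to the ERRORLOG; treat the flags and the exact messages as an assumption to verify on your build rather than a guarantee:

DBCC TRACEON (3004, 3605, -1);   -- assumption: 3004 reports zeroing work, 3605 routes the output to the ERRORLOG
CREATE DATABASE ZeroTest;        -- example database; watch the ERRORLOG for zeroing messages on the log (and data files if instant file initialization is off)
DBCC TRACEOFF (3004, 3605, -1);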

Also, the little used feature of I/O affinity creates special workers. The I/O affinity workers are workers assigned to specific CPUs for processing I/O requests. Whenever a standard worker would normally post an I/O request the request is intercepted and put on a list for the I/O affinity worker. The only job of the I/O affinity worker is to process the request queues (a read and a write queue) by posting the request (async I/O still applies) and processing the completion routine for the request on a dedicated CPU. I/O affinity requires an extreme amount of I/O to be flowing on the system (enough to keep a dedicated CPU busy with just posting and completion activities). This is very rare and I have only seen a couple of servers even approach the need for this.

When I see someone evaluating I/O affinity I ask them to first look at the queries producing the I/O. What I find 99% of the time is that the queries need to be tuned or indexes added: SQL Server has decided to do a large sort or parallel operation and the I/O activity is heavy when it does not need to be. Not only would I/O affinity be a poor choice, but the amount of I/O is turning over the pages in data cache and impacting overall performance negatively.

One other issue with I/O affinity is that the log writer thread is placed on a separate scheduler. The log writer is generally located on scheduler 0 or 1 (based on start-up). When a log write is triggered the log writer is signaled to handle the activity. This means the log writer shares the scheduler with other workers. Since all it is doing is posting the log I/O and handling the completion of the I/O this is lightweight, and on 99% of systems I have never seen this be an issue. Using I/O affinity can move the log writer to a dedicated scheduler, but I have not seen a system where this changed log write behavior. The log write critical path is the I/O, and the I/O is 100x slower than the CPU needed by the log writer.
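
For completeness, the feature is exposed through sp_configure. A sketch of enabling it for a single CPU (bit 0 = CPU 0); the option takes effect after a restart, and as discussed above it is rarely the right answer:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'affinity I/O mask', 1;   -- bitmask: bit 0 set dedicates CPU 0 to I/O affinity work
RECONFIGURE;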

image

The lazy writer is responsible for keeping the free lists prepared for the system, monitoring and handling the commit targets and such.

The latest builds of SQL Server have moved to Time of Last Access (TLA) algorithms to determine which buffers to eject from cache. The older algorithms were based on a reference counting scheme and TLA has been found to be more efficient.

The lazy writer works like a clock. It starts its sweep hand at buffer position zero and ticks every time it runs. A tick is usually 16 buffers. Looking over each buffer in the tick it finds those below the TLA threshold and handles removal from data cache. If the page is clean it can just put the buffer on the free list. If the page is dirty, WriteMultiple is used to flush the log up to the page's LSN (FlushToLSN) and write the page to the database file. Once the write is complete the buffer is placed on the free list for use.

There are routines to help the lazy writer (the routine is HelpLazyWriter). As each worker allocates memory, there are conditions where the free list is too low and any worker can perform a lazy writer tick. This makes SQL Server very adaptable to changes in buffer pool demand, as lazy write activity can be performed by dozens of workers when needed.

Once the clock hand has reached the committed level it is reset to the zero position and the tick behavior continues.

On a hardware-based NUMA system (with soft NUMA, memory is seen as a single pool and SQL Server uses a single lazy writer) each node has a lazy writer that maintains the buffer pool portion assigned to the node. SQL Server divides max server memory by the number of nodes and each node is treated equally. Since the goal of NUMA memory is to keep memory local and not remote, a lazy writer is used on each node to maintain that goal. Keeping the free lists populated on the local node with local buffers increases performance. This means that queries running on the node have the advantage of keeping their activity within the node as much of the time as possible and they won't flood other nodes with data cache populations. For example, if you start a dbcc checkdb you only want the local node to service the data cache requests. You don't want every node to get populated with the pages being read by the dbcc for the database, as many of the other nodes could hold pages more fundamental to servicing queries. By having a lazy writer per node the query is commonly contained within the node and the entire data cache does not become polluted (in a sense).
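
You can see how many lazy writer workers exist on a given system by looking at the background requests; on a hardware NUMA machine you should see one row per node. A sketch only:

SELECT session_id, scheduler_id, command
FROM sys.dm_exec_requests
WHERE command = 'LAZY WRITER';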

The I/O stall warnings are always checked and handled by the lazy writer on node #0. With some of the RTM builds of SQL 2005 this can cause false warnings on other nodes if the RDTSC timers are not in sync. You can read about all the RDTSC issues in my blog posts and move to later builds, which use the multi-media timer and avoid the false reports.

Since the best performance is to use local memory on the local node checkpoint has been enhanced. Checkpoint will continue to sweep over the BUF structures but it assigns any writes of dirty buffers to the lazy writers on the respective nodes. This gives checkpoint the advantage of allowing local write activity but scaling up by using the lazy writer worker(s) on each node to assist it.

I briefly touched on checkpoint and recovery interval on previous slides. Checkpoint is commonly triggered by the changes in a database exceeding the target recovery interval. To simplify the algorithm, each log record is estimated to take ## ms to recover, assuming crash recovery and a cold data cache. When a log record is created the count of records since the last checkpoint request is incremented, and when (number of log records * estimated recovery time) > recovery interval, a checkpoint request for the database is enqueued.
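
The recovery interval target itself is an sp_configure setting; a sketch of raising it to roughly 5 minutes of estimated recovery work (0, the default, lets SQL Server manage it):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'recovery interval (min)', 5;
RECONFIGURE;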

Checkpoint can also be triggered by a backup or during shutdown of the service, for example. SQL Server will do its best to have a clean database on startup (all transactions before shutdown of the service were flushed to the log and data files). This avoids the need for crash recovery processing and allows the databases to come online faster.

Checkpoint has been adjusted many times over various builds to accommodate various systems and requests from customers. The biggest change I have seen in checkpoint is the target mechanism. I was working with several customers where checkpoint would kick in and the I/O load would cause negative impacts on the overall system and concurrency. In SQL Server 2000 a change was prototyped and eventually made in the later service packs (I think it was SP3 when first introduced but I would have to check my history on this fact). Checkpoint times the I/O requests, and when latency grows larger than the latency target (~20ms) the number of checkpoint I/Os is reduced. The outstanding number of I/O (WriteMultiple) requests for checkpoint generally starts at 100. So checkpoint will attempt to keep the I/O depth for the checkpoint I/Os at 100 and the response time under 20ms.

It is pretty neat to watch the checkpoint rate, start a heavy copy to the same drive, and watch checkpoint back down the number of outstanding requests to keep the I/O timing below the threshold. Allowing the timing to get above the threshold can cause a significant page (say the topmost page in an index) to hold the EX latch and cause PAGE_IO*_LATCH waits for longer than expected. This can show up as a concurrency issue while checkpoint is running, for example.

SQL Server 2005 and later versions have exposed the target timing as an option for manual checkpoint. You can issue a checkpoint on a database and tell it that it should take 5 minutes, for example. The pace of the checkpoint is then altered to try to maintain the target for the full checkpoint of the database. In the RTM release of SQL 2005 it did not throttle if ahead of the pace, as long as the I/O response was acceptable. Later service packs have altered the behavior and may cause checkpoint to throttle (sleep) to meet the target.

Why would I ever use the throttle mechanism? Continuous checkpoint is the answer. If you disable the recovery interval or set it to a very high value the automatic checkpoints are few and far between. If you establish a startup procedure it can loop, issue checkpoint with the target, and never stop. This would allow you to control the database checkpointing on a continuous basis at a pace that is conducive to your needs. NOTE: When you disable the automatic checkpointing this is for ALL databases, so you must have a way to accommodate this with a set of startup stored procedures or some job design. This is not recommended for general use or practice but it is an option.
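
A sketch of what such a startup procedure could look like; the procedure and database names are hypothetical, the procedure has to live in master, and CHECKPOINT <seconds> is the manual target syntax described above:

USE master;
GO
CREATE PROCEDURE dbo.usp_ContinuousCheckpoint    -- hypothetical name; must be created in master for sp_procoption
AS
BEGIN
    SET NOCOUNT ON;
    WHILE 1 = 1
    BEGIN
        -- Pace a checkpoint of the user database over roughly 5 minutes, then start again.
        EXEC (N'USE MyUserDatabase; CHECKPOINT 300;');   -- MyUserDatabase is a placeholder
    END;
END;
GO
EXEC sp_procoption @ProcName = N'dbo.usp_ContinuousCheckpoint',
                   @OptionName = 'startup',
                   @OptionValue = 'on';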

Older builds need a fix and trace flag to make lazy writer and checkpoint work better when both are running so the I/O sub-system is not flooded with requests.

Notice that the slide points out the checkpoint queue. The automatic checkpoint process works from a queue, so only one database is executing an automatic checkpoint at any given point in time. There is also a high-level per-database latch that serializes checkpoints for the same database. If an automatic checkpoint is executing it holds the EX latch for the database checkpoint process, and a manual checkpoint would wait to acquire the latch.

image

The PAE and AWE issue seems to confuse folks, and on top of that, why would it apply to an I/O conversation?

First of all you need to distinguish the PAE behavior of the operating system from the AWE API set. The boot.ini /PAE is what tells the 32 bit operating system to boot the extended kernel that handles 36 bit addressing of memory and allows access to more than 4GB of RAM. This is independent of AWE usage.

Notice that I mention independent of AWE because the AWE APIs can be used on a system that does not have /PAE enabled or on a 64 bit operating system. AWE is the ability to allocate physical RAM, using the AWE API set, that is not managed by Windows working set, page file and other memory operations. For us old timers, you can think of it like the extended memory we used to have in DOS. The application is fully responsible for the AWE allocations. The operating system cannot trim them, page them or otherwise touch them. As you will find from various sources, these allocations do not even show up in common Task Manager output.

The key for AWE for SQL Server is that the only allocations that can be mapped and unmapped into the extended AWE address space are data pages. So using AWE on 32 bit extends only the data cache. It does not extend the procedure cache or other allocations for the SQL Server. Thus it can reduce the physical I/O requirement once data cache is populated.
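
On 32-bit, the behavior is turned on with the 'awe enabled' sp_configure option (a restart is required, and the service account needs the Lock pages in memory privilege); a minimal sketch:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;
RECONFIGURE;
-- Restart the instance; the service account must also hold the "Lock pages in memory" privilege.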

To access a page that has been mapped out of the common virtual address range (2GB or 3GB), the AWE APIs are used to map/unmap the area and access to the data is granted again.

Why would SQL Server allow AWE on 64-bit? Well, it really doesn't. The sp_configure value is a no-op and should not be present. Instead, if all the appropriate options are enabled for lock pages, the AWE API is used by the buffer pool to do the allocations. This is because AWE allocates physical pages for SQL Server that can't be touched by the Windows memory manager for paging, working set trim and similar operations. So you can think of it as under-the-covers use of the API to allow locked page behavior.

Why not use VirtualLock? If you read the VirtualLock documentation carefully, it is only a hint to the operating system, and if a working set trim is needed it is an all-or-nothing activity; we don't want SQL Server to get a 100% working set trim operation.

Note that on systems that are enabled for Hot Add Memory the /PAE behavior may be automatically enabled by the operating system.

NOTE: Windows 2000 and early Windows 2003 builds had several issues with /PAE behavior, and the QFEs are needed to prevent corruption of data pages and termination of the SQL Server service.

image

I have touched on read ahead behavior and how it drives the disk queue length > 2 and how ReadFileScatter is used to reduce the number of I/O requests and allow larger I/O transfers. I want to reiterate these facts, how the plan drives the decision and the power of async I/O.

Specifically point out that the read ahead depth for Enterprise SKUs is 1024 pages instead of 128 pages.

Mention that ramp-up is an 8-for-1 read during the initial growth of the buffer pool (until the first time the commit target is reached) to help populate the buffer pool from a cold to a warm state faster.

Read-Over-Write: This is a behavior that many hardware manufacturers did not expect but is used by SQL Server and the operating system page file routines. All the manufacturers that I am aware of now test for the behavior to prevent stale reads.

While doing a read-ahead SQL Server wants to minimize the number of I/O requests. Let's say we need to read in pages 1 thru 8 but we find page 5 is already hashed and in the data cache. We don't want to issue a read for pages 1, 2, 3 and 4 and another for pages 6, 7, and 8. Instead we issue a read for pages 1 – 8 and we ignore page 5. The in-memory buffer for page 5 is only used during the physical read but it is not hashed (there is already a different buffer supporting page 5). When the read completes the buffer is put directly onto the free list and ignored.

The read-over-write can occur if a read-ahead is taking place at the same time the hashed buffer (5 in our example) is being written to the data file. At this point the data returned for page 5 from the read could be part old and part new. SQL Server knows this and is going to ignore the page 5 read anyway, so it does not matter.

The problem we saw several years ago is that the hardware cache was not expecting it and would not properly invalidate the hardware cache with the new data. In some cases none of the old sectors for the page were removed from the hardware cache, and in others only some sectors were.

When only some sectors are stale we can detect checksum and torn bit failures. When the sectors all remain intact, the checksum or torn bit protection is valid for the previous version of the page. So extend the example for page 5 and assume that the write was a lazy write, so the page is removed from data cache. The next read will pull in the previous version of the page. More specifically, the $500 I just deposited in my bank account got lost (lost write/stale read), as the next read never shows the transaction that was successfully committed.

When you extend this to replication or log shipping and restore you can get strange errors stating that the LSN on the page is not what was expected, because the last change the log knows about is not seen on the page (lost write/stale read).

The first test added to SQLIOStress (later SQLIOSim.exe) was for stale read/lost write scenarios.

image

The introduction of database snapshots is nice for reporting as well as DBCC activity.

When a snapshot database is created (internally for DBCC or externally with CREATE DATABASE ... AS SNAPSHOT) copy-on-write behavior is established in the buffer pool. As I discussed earlier, to dirty a page the EX latch has to be acquired. At this central location the copy-on-write decision can also be made. If the page is going to be dirtied a check is made to see if the page has been copied to the snapshot(s). If not, a write to the snapshot takes place before the change to the page can be made.

This changes the behavior of a page modification. Instead of just the EX latch and the modification, an I/O must first complete, the first time a page is dirtied after snapshot creation, for any page that is being changed. For this simple reason the snapshot's I/O behavior needs to match that of a high-speed SQL Server database file. Don't place the snapshot on a lower-performing drive, as it has a direct impact on production throughput.
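
Creating an external snapshot is a CREATE DATABASE statement with one sparse file per data file of the source; the names and path below are only an example, and per the point above the .ss files belong on storage as fast as the source database:

CREATE DATABASE Sales_Snapshot_0600
ON ( NAME = Sales_Data,                                   -- logical name of the source data file
     FILENAME = 'F:\FastDisk\Sales_Snapshot_0600.ss' )    -- example path for the sparse snapshot file
AS SNAPSHOT OF Sales;                                     -- example source database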

In order to handle snapshots, the internal file control blocks (FCBs) of SQL Server are chained to the parent database. This allows a copy-on-write to know which file(s) are associated with one or more snapshots set up on the database. It also allows queries on the snapshot to know the parent file.

When a select is executed on the snapshot the pages are retrieved from the snapshot files. If the page has not been copied to the snapshot (never dirtied on the parent database) the request is sent to the parent file control block to retrieve the page from the parent database. This allows the snapshot to remain sparse as it only has to maintain pages that have been dirtied on the parent database files.

Since the snapshot is a point-in-time operation the file sizes are fixed but marked with the sparse attribute so the physical space required is only for those pages dirtied and copied.
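
The physical space the sparse files are actually consuming can be watched with sys.dm_io_virtual_file_stats; a sketch using the example snapshot name from above:

SELECT DB_NAME(database_id) AS snapshot_db,
       file_id,
       size_on_disk_bytes      -- actual on-disk size of the sparse file, not the fixed logical size
FROM sys.dm_io_virtual_file_stats(DB_ID('Sales_Snapshot_0600'), NULL);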

We have found a couple of NTFS bugs related to sparse files, so refer to the PSSSQL blog for more sparse file details and the latest patching information. There are also NTFS limitations on how large a sparse file can be, which may require you to size the database files to accommodate the sparse file activity.

New page allocations in the parent are first copied to the snapshot. Consider a scenario where the page was allocated when the snapshot was created and the table is later truncated: the page is not moved to the snapshot, because the page itself was not changed, just the allocation information. So when the page is going to be reused for a new allocation in the parent database it must first be copied to the snapshot. So while you might not expect new allocations to cause physical usage in the snapshot, they will.

DBCC used to reverse engineer the log records to fix up the fact table information that changed during the dbcc scan. It now uses an internal snapshot (a secondary stream on the data file). This allows DBCC to have a stable, point-in-time view of the database to build the fact tables from. This means you need as much space on the volume containing the database file as the number of pages that can be dirtied for the duration of the snapshot. If there is not enough space the dbcc must be run WITH TABLOCK to block activity while the fact tables are being built. The internal snapshot is removed when the DBCC completes. Crash recovery also removes any internal snapshots in the event that DBCC was active when a crash was encountered.

image

When you work with Microsoft CSS on corruption issues the term scribbler is often used. The idea is like a child coloring outside the lines of the picture. In terms of data cache and SQL Server memory it indicates that a component is making changes to memory that it does not own. For example:

BYTE * pData = new BYTE[10];

delete [] pData;              // <-- Memory can be reused for something else; we no longer own it

memcpy(pData, pBuffer, 10);   // <-- Scribbler: just wrote to memory it does not own. Sometimes this raises an exception (usually an AV); other times the damage is not seen until the real owner tries to use the memory.

DANGER: If this memory is a database page in cache it could be flushed to disk with the damage, and permanent corruption is encountered.

SQL Server uses the checksum behavior to help prevent such a problem. When the checksum PAGE_AUDIT option is enabled, constant page checks are also enabled. The lazy writer will check pages in memory (as it handles clock ticks) and recalculate the checksum for pages that have not been dirtied. If the checksum is not correct an error is logged and the page is removed from data cache, signaling that a scribble has taken place.
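
CHECKSUM is the PAGE_VERIFY setting that enables this protection (it is the default for databases created on SQL Server 2005 and later); a minimal sketch of turning it on and verifying it:

ALTER DATABASE Sales SET PAGE_VERIFY CHECKSUM;            -- example database name

SELECT name, page_verify_option_desc
FROM sys.databases;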

If you suspect scribbling, first check all 3rd party modules loaded in the address space. Many times a DLL or COM object is not thread safe and will be the source of the issue.

Tracking scribblers down can be difficult. This is where the latch enforcement trace flag comes into play. With the trace flag enabled SQL Server will keep the page with a VirtualProtect of READ_ONLY and only toggle to the READ_WRITE state when a modification latch is required. When the modification latch is released the protection is set back to READ_ONLY. Then if a scribbling code line attempts to write to the page it encounters an exception that is captured by the SQL Server stack dump logic, and a mini-dump is generated showing the source of the issue. This works well for database pages but not stolen memory, as stolen memory is always in a READ_WRITE state.

More frequent dbcc checks and the page audit trace flag can help track down data page scribblers. At times working from a backup and replaying the log backups can reproduce the issue and help track down the problem as well. SQLIOSIM.exe should always be used when corruption is at play to help rule out the I/O sub-system basic behaviors.

Stale Read: This has been a common problem with firmware and hardware caches that did not expect read-over-write behavior. As I discussed in the read-ahead section this can lead to all kinds of bad behavior. Starting with SQL Server 2000 SP4 we added a hash table that checks the LSN on the page against that in the write hash table when a page is read in and hashed into the buffer pool. This hash table is limited to the last ## of writes but is designed to catch when a write took place and the next read of that same page returns the wrong LSN.

SQL Server 2008 has also extended protections to the sort buffers in tempdb to better catch scribbles and stale read behaviors using similar design concepts.

Bit Flip: You may also hear the term bit-flip used to describe the type of corruption found. This is when a value is expected to be ## but when you look at the binary representation it is only off by one bit. A bit has been flipped from 1 to 0 or 0 to 1. This is often a scribbler scenario but we have also seen hardware issues. For example, in one dump the ESP register was set to an ODD offset. This is not correct, so we knew there was a CPU problem on the machine. We set the affinity mask for the SQL Server schedulers to see where the error kept occurring, which helped identify the faulty CPU.

Bit flips are also common with bad reference counting. When you do an AddRef or Release it usually leads to an InterlockedIncrement or InterlockedDecrement activity (changing the counter by 1), and these can look like bit flips for a stale object pointer.

New additions to extend checksum to the log as well as backup media help protect your SQL Server against corruption.

image

image

image

image

image

image

image

image

Other Blog Content

• SQL Server Urban Legends Discussed
http://blogs.msdn.com/psssql/archive/2007/02/21/sql-server-urban-legends-discussed.aspx

• How to use the SQLIOSim utility to simulate SQL Server activity on a disk subsystem
http://support.microsoft.com/kb/231619
• How It Works: SQLIOSim - Running Average, Target Duration, Discarded Buffers ...
http://blogs.msdn.com/psssql/archive/2008/11/12/how-it-works-sqliosim-running-average-target-duration-discarded-buffers.aspx
• How It Works: SQLIOSim [Audit Users] and .INI Control File Sections with User Count Options
http://blogs.msdn.com/psssql/archive/2008/08/19/how-it-works-sqliosim-audit-users-and-ini-control-file-sections-with-user-count-options.aspx

Additional Learning Resources

• Inside SQL Server 7.0 and Inside SQL Server 2000, …

  Written by Kalen Delaney

• The Guru’s Guide to SQL Server Architecture and Internals – ISBN 0-201-70047-6

  Written by Ken after he joined Microsoft SQL Server Support

  Many chapters reviewed by developers and folks like myself

• SQL Server 2005 Practical Troubleshooting – ISBN 0-321-44774-3 – Ken Henderson

  Authors of this book were key developers or support team members

  Cesar – QP developer and leader of the QP RedZone with Keithelm and Jackli

  Sameert – Developer of UMS and SQLOS Scheduler

  Santeriv – Developer of the lock manager

  Slavao – Developer of the SOS memory managers and engine architect

  Wei Xiao – Engine developer

  Bart Duncan – long time SQL EE and now developer of the Microsoft Data Warehouse – performance focused

  Bob Ward – SQL Server Support Senior EE

• Advanced Windows Debugging – ISBN 0-321-37446

  Written by Microsoft developers – excellent resource

• Applications for Windows – Jeffrey Richter

  Great details about Windows basics

 

 

 

Bob Dorr – Principal SQL Server Escalation Engineer

How to troubleshoot database corruption errors and System Center…


What do these two subjects have in common? The SQL Server product team is currently developing an update to the SQL Server Management Pack specifically designed for System Center Operations Manager 2007 SP1. There are several enhancements and fixes in this update to the SQL Management Pack and it is due to release to the web in Q3 of this calendar year 2010.

This team approached me and others in CSS and asked us to review which “database corruption” errors were being alerted on as part of the management pack and whether the advice given matched that of CSS. What we found was, let’s say, “room for improvement”. A colleague of mine at CSS and a contributor to this blog, Suresh Kandoth, and I reviewed which errors were being evaluated and the recommendations for them. We cleaned these up to match what the code looks like for SQL Server 2008 and SQL Server 2008 R2. We also decided that some of the recommendations and internals about some of these corruption errors needed a refresh.

Therefore, in anticipation of this release, we published today a set of KB articles that provides a concise view of how CSS believes customers should troubleshoot these errors with some interesting facts and internals about them. I’ve listed here a link to all of the articles. These links actually will come up as recommendations for more information if you encounter an Operations Manager alert for the error when the new SQL Management Pack is available. However, we wrote these articles so they could be used by anyone who may encounter these errors whether as seen in the Event Log, ERRORLOG, or by an application. We believe we can expand even further on these and provide more information on each of them, but this is our first pass at providing our recommendations:

How to troubleshoot a Msg 823 error in SQL Server

How to troubleshoot Msg 824 in SQL Server

How to troubleshoot Msg 825 (read retry) in SQL Server

How to troubleshoot Msg 832 (constant page has changed) in SQL Server

How to troubleshoot database consistency errors reported by DBCC CHECKDB

How to troubleshoot Error 3414 and a failed database recovery with SQL Server

How to troubleshoot Error 17204 and 17207 in SQL Server

How to troubleshoot Error 9004 in SQL Server

How to troubleshoot Msg 7105 in SQL Server

How to troubleshoot Msg 5180 in SQL Server

How to troubleshoot Msg 605 with SQL Server

We feel these errors represent the most common scenarios seen by customers for database corruption and database recovery. If you have a particular error or scenario you think is common let us know and we can investigate covering it. Look for the “How to troubleshoot…” title as a common way of finding these. I’m also investigating in the future how perhaps we can add the right wording to the articles so they can be seen as a group when searching on the web.

We would love to hear your feedback if you believe there is anything incorrect or something that needs clarification. Please send it to psssql@microsoft.com

 

Bob Ward
Microsoft

How It Works: Soft NUMA, I/O Completion Thread, Lazy Writer Workers and Memory Nodes


There seems to be some semantic confusion about the Books Online description of Soft NUMA. The area of confusion comes from the SQL Server 2008 Books Online section shown below.

Soft-NUMA

SQL Server allows you to group CPUs into nodes referred to as soft-NUMA. You usually configure soft-NUMA when you have many CPUs and do not have hardware NUMA, but you can also use soft-NUMA to subdivide hardware NUMA nodes into smaller groups. Only the SQL Server scheduler and SQL Server Network Interface (SNI) are soft-NUMA aware. Memory nodes are created based on hardware NUMA and therefore not impacted by soft-NUMA. So, for example, if you have an SMP computer with eight CPUs and you create four soft-NUMA nodes with two CPUs each, you will only have one memory node serving all four NUMA nodes. Soft-NUMA does not provide memory to CPU affinity.

The benefits of soft-NUMA include reducing I/O and lazy writer bottlenecks on computers with many CPUs and no hardware NUMA. There is a single I/O thread and a single lazy writer thread for each NUMA node. Depending on the usage of the database, these single threads may be a significant performance bottleneck. Configuring four soft-NUMA nodes provides four I/O threads and four lazy writer threads, which could increase performance.

You cannot create a soft-NUMA that includes CPUs from different hardware NUMA nodes. For example, if your hardware has eight CPUs (0..7) and you have two hardware NUMA nodes (0-3 and 4-7), you can create soft-NUMA by combining CPU(0,1) and CPU(2,3). You cannot create soft-NUMA using CPU (1, 5), but you can use CPU affinity to affinitize an instance of SQL Server to CPUs from different NUMA nodes. So in the previous example, if SQL Server uses CPUs 0-3, you will have one I/O thread and one lazy writer thread. If, in the previous example SQL Server uses CPUs 1, 2, 5, and 6, you will access two NUMA nodes and have two I/O threads and two lazy writer threads.  

The confusion is centered on the use of the term memory node in a bit of a generic way, combined with a general I/O comment and some beta/CTP behaviors. Instead this section should make a clear statement about what the buffer pool does with the memory and what the SQL OS considers a memory node for tracking.

A SQL OS memory node is aligned with each physical NUMA node presented on the system. This is regardless of soft NUMA usage. As stated in the Books Online documentation, a soft NUMA configuration is not allowed to cross physical NUMA (SQL OS memory node) boundaries. Soft NUMA allows you to divide a physical node into logical nodes, but you are not allowed to combine physical nodes into a logical node.

There were some beta/CTP builds where enabling soft NUMA caused the buffer pool to treat all memory as a flat memory model (single node). Instead of tracking and handling per-physical-node information, the buffer pool would treat all memory as if only a single physical node existed. Now the only way to tell the buffer pool to treat all memory as a single node (all flat access) is to enable trace flag 839 or 8015.

The I/O comment refers directly to the I/O completion port and thread that are created per logical node. So you can configure soft NUMA to allow advanced TCP/IP bindings, and each logical node receives a specific I/O completion port and managing thread. Logical nodes do NOT receive additional lazy writer threads, but physical nodes do.

NOTE: The I/O completion threads in SQL Server 2005 and 2008 are designed to handle connection requests and TDS traffic. They do NOT handle database data and log file I/O operations.

Lazy writer thread creation is tied to the SQL OS view of the physical NUMA memory nodes. So whatever the hardware presents as physical NUMA nodes equates to the number of lazy writer threads that are created. Trace flag 8015 tells SQL OS to ignore physical NUMA detection. As with any trace flag this should be used with care, as it reverts behavior to a pre-SQL 2005 logical state and is not recommended for production use.
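
A quick way to see the node layout SQL OS is actually using, and to count the lazy writer workers, is to look at the nodes DMV and the background requests; a sketch only:

SELECT node_id, memory_node_id, node_state_desc
FROM sys.dm_os_nodes;                      -- soft-NUMA logical nodes map back to physical memory nodes

SELECT session_id, command
FROM sys.dm_exec_requests
WHERE command = 'LAZY WRITER';             -- expect one per physical (SQL OS) memory node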

The following table outlines the expected behavior of SQL Server 2008.   The behavior can be slightly different on SQL Server 2005 installations. 

Physical Nodes | Logical Nodes (SOFT NUMA) | Buffer Pool Memory Nodes | SQL OS Memory Nodes | Lazy Writer Workers | I/O Completion Threads
1              | Not Enabled               | 1                        | 1                   | 1                   | 1
2              | Not Enabled               | 2                        | 2                   | 2                   | 2
1              | 2                         | 1                        | 1                   | 1                   | 2
2              | 4                         | 2                        | 2                   | 2                   | 4
4              | 2 (Invalid - Ignored)     | 4                        | 4                   | 4                   | 4

Trace Flag 8015 Enabled
1              | Not Enabled               | 1                        | 1                   | 1                   | 1
2              | Not Enabled               | 1                        | 1                   | 1                   | 1
1              | 2                         | 1                        | 1                   | 1                   | 2
2              | 4                         | 1                        | 1                   | 1                   | 4
               |                           |                          | 1                   | 1                   | 2

Bob Dorr - Principal SQL Server Escalation Engineer
George Reynya; Fabricio Voznika - SQL Server SQL OS Developers

The case of the additional indexes


I was assisting with a SQL Server performance issue the other day.  The issue was transactional replication was unable to keep up while trying to replicate data from a transactional database to a reporting database.  This was causing the customer to miss their data latency SLAs.  The oddest part of the problem was that replication to a test reporting database was perfectly able to keep up.  Since the CPU, I/O, and memory capabilities of the two servers were similar, we began to suspect that there were differences in the schemas of the two databases (test and production) even though they were ostensibly supposed to be the same.

Unfortunately, detecting schema differences between two supposedly identical databases can be fairly difficult.  You can go through the databases by hand looking for differences or you can script out the entire schema and then compare them.  However, both of these approaches are subject to error since a human being has to actually identify the differences.

The good news is that there is a version of Visual Studio that can help solve the problem – Visual Studio Database Edition.  This edition of Visual Studio has a really neat feature called Schema Compare.  Let me walk you through the steps involved in making a comparison:

1)  Open up Visual Studio and then go to File->New->Project

2)  From there, browse to Database Projects and then select the appropriate variant of SQL Server and then Database Project

image

3)  Give your project a name and a location

4)  Once the project has been created, go to Data->Schema Compare->New Schema Comparison

image

5)  At this point, you are able to select both a source database and a target database.  In this case, I am going to select the development database (BlackAdept) and the production database (DSDB)

image

6)  Click OK and the two databases will be compared

Here’s a snippet of the differences:

image

As you can see above, I apparently have a stored procedure in my production database that differs in definition from my development database (see the red highlight above).  I guess I need to go back and see why they are different before I have problems.  :)

Looping back around to the original problem, we used the Schema Compare capability and found 8 (!!!) additional indexes in the reporting database.  The overhead of keeping these indexes updated was enough to keep the transactional replication process far enough behind that the data latency SLAs were being missed.  Removing these indexes allowed transactional replication to keep up, thus allowing the customer to meet their SLAs.
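
If you do not have Visual Studio Database Edition handy, a rough first pass at spotting index differences can be done with the catalog views, running the same query in each database and diffing the output; a sketch only:

SELECT OBJECT_SCHEMA_NAME(i.object_id) AS schema_name,
       OBJECT_NAME(i.object_id)        AS table_name,
       i.name                          AS index_name
FROM sys.indexes AS i
JOIN sys.objects AS o ON o.object_id = i.object_id
WHERE o.is_ms_shipped = 0
  AND i.name IS NOT NULL
ORDER BY schema_name, table_name, index_name;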

Evan Basalik | Senior Support Escalation Engineer | Microsoft SQL Server Escalation Services

An important change for the Microsoft Lifecycle Support Policy….


Over the past few months I’ve filed some posts on our blog regarding our support lifecycle policies because I know sometimes this topic can get very confusing:

http://blogs.msdn.com/psssql/archive/2010/02/17/mainstream-vs-extended-support-and-sql-server-2005-sp4-can-someone-explain-all-of-this.aspx

http://blogs.msdn.com/psssql/archive/2010/01/08/important-sql-server-and-windows-end-of-support-dates-you-should-know-about.aspx

One of the key points in these blogs is that when a service pack for a product hits its “end of life date” you must upgrade to a newer service pack to get technical support from Microsoft unless you purchase a customer support agreement. Today, the Microsoft lifecycle team is announcing a change to this policy and has blogged about it at this post: http://blogs.technet.com/lifecycle/archive/2010/04/13/end-of-support-for-windows-vista-rtm-and-recent-service-pack-support-policy-updates.aspx. You can read the official page at: http://support.microsoft.com/gp/newsplifecycle

Let me summarize these changes and explain what it means by using an example:

Today, April 13, 2010, marks the end of support for SQL Server 2008 RTM (remember, this is just for the RTM version; full support exists for SQL Server 2008 SP1). This includes any cumulative updates to the RTM version. Prior to this lifecycle change, this meant that if you called Microsoft Technical Support we would not be able to help you unless you had purchased a customer support agreement.

With this change, we will “now take your call” to help you but provide limited troubleshooting. What is limited? Here is how the lifecycle team explains it:

  • Break/fix support incidents will be provided through Microsoft Customer Service and Support; and through Microsoft’s managed support offerings (such as Premier Support).
  • There will be no option to engage Microsoft’s product development resources, and technical workarounds may be limited or not available.
  • If the support incident requires escalation to development for further guidance, requires a hotfix, or requires a security update, customers will be asked to upgrade to a supported service pack.

So you can expect us to help you find a solution or answer but that answer won’t involve engaging the product development team, a deep dive into root cause of your issue, new hotfix requests, or extensive troubleshooting techniques that could last for weeks. We haven’t established fixed time limits but as you can see our ability to find exactly the answer you want may be limited. There will be situations when we have to tell you that the only solution is to upgrade to a supported service pack level.

This does not affect customers who purchase a custom support agreement. These customers do not have the same limitations as described here so they are getting the extra benefit of full support.

I will continue to monitor any future updates to any of our support polices on our blog. Look for the tag called Support Policy.

 

Bob Ward
Microsoft


How It Works: Orphan DTC Transaction (Session/SPID = -2)


This looked like a good opportunity for a post to help clarify that -2 does NOT mean ORPHANED.

_____________________________________________________________________________________

CURRENT EXCHANGE
_____________________________________________________________________________________

From: Robert Dorr
Sent: Tuesday, April 20, 2010 8:47 AM
Subject: RE: ONSITE:Orphaned Distributed Transactions

 

Let me clarify the term Orphaned.  A -2 is not Orphaned; it means there are NO ENLISTED SESSIONS on the SQL Server but the transaction is still active.   Let me give you an example.

 

Begin DTC Transaction with DTC Transaction Manager

Connect To SQL and enlist SPID 50     -     Transaction imported to SQL Server and communications established with the DTC Manager and session enlisted

T-SQL work done on SPID

Disconnect Session – Transaction still tracked by SQL to hold locks and such but no session enlisted so reporting now shows (-2)

               

What you have is an application that has done work in a DTC transaction against SQL and not committed or aborted it until a later point in time.

 

Sent: Tuesday, April 20, 2010 8:38 AM
Subject: ONSITE:Orphaned Distributed Transactions

 

The customer is seeing a request_session_id of -2 in sys.dm_tran_locks and req_spid in sys.syslockinfo.

After the application of CU9 of SP2 they saw reduced occurrences of the orphaned tx. Now they are seeing transient -2’s that seem to clear themselves away.

Is this expected behavior? All documentation I could find seemed to indicate a -2 was an orphaned distributed transaction. Has the behavior changed to automatically ‘sweep’ the -2 away?

_____________________________________________________________________________________

PREVIOUS EXCHANGE
_____________________________________________________________________________________

From: Robert Dorr
Sent: Monday, June 09, 2008 2:35 PM
Subject: RE: SPID = -2

 

No, just bad wording.    SQL allows the DTC transaction to remain active as long as the DTC manager has the transaction active but it does not require a session.

 

Sent: Monday, June 09, 2008 2:28 PM
Subject: RE: SPID = -2

 

 

Thanks for the correction. In the BOL topic for the KILL command (ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/071cf260-c794-4b45-adc0-0e64097938c0.htm) it says:

 

Use KILL UOW to terminate orphaned distributed transactions. These transactions are not associated with any real session ID, but instead are associated artificially with session ID = '-2'. This session ID makes it easier to identify orphaned transactions by querying the session ID column in sys.dm_tran_locks, sys.dm_exec_sessions, or sys.dm_exec_requests dynamic management views.

 

Would this be just an instance of unclear/incomplete documentation? Do you have a better reference on this topic?

 

From: Robert Dorr
Sent: Monday, June 09, 2008 3:20 PM
Subject: RE: SPID = -2

 

Be careful: (-2) does not mean orphaned.  It means you have an open DTC transaction managed by an external ITransaction interface but no sessions currently using it.

 

Sent: Monday, June 09, 2008 2:13 PM
Subject: RE: SPID = -2

 

Yes, -2 is the SPID for orphaned MSDTC sessions. You may be able to get additional information about the process from the following DMVs:

sys.dm_tran_locks

sys.dm_exec_requests

sys.dm_os_waiting_tasks

sys.dm_tran_active_transactions

 

Another troubleshooting step is to enable MSDTC tracing as described in http://support.microsoft.com/kb/899115/en-us.

 

Sent: Monday, June 09, 2008 2:01 PM
Subject: SPID = -2

 

 

We have a query in a BizTalk process that is being blocked by a process with SPID = -2.

 

There is no process listed with a negative SPID.

 

What does SPID -2 mean and how can we find more information about the blocking process?

 

I have found comments about SPID -2 being related to DTC transactions. Is that the only case where SPID -2 is used?  What are the best practices recommended for dealing with this type of issue?


Bob Dorr - Principal SQL Server Escalation Engineer

AWE Allocated Values Reported Incorrectly (Large or Negative Value)


I ran into an issue today that is documented but you have to know where to find it so I wanted to point it out.

In the middle of a lengthy KB article, #907877 (http://support.microsoft.com/kb/907877), is the following comment:

"In a NUMA-enabled system, this value can be incorrect or negative. However, the overall AWE Allocated value in the Memory Manager section is a correct value."

I received this output from a customer today.   When I investigated it I found a known issue that was not corrected in SQL Server 2005 that allows a calculation to roll over a LONG value but print out a LONGLONG value.  So depending on the calculation it may show a negative or an unexpectedly large value in some instances.

SQL 2008: I did find a source check-in for SQL Server 2008 that corrects the behavior.

WORKAROUND: You can look at the 'Memory Manager' AWE Allocated to get the correct AWE Allocated value instead.
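
The same numbers are also visible through sys.dm_os_memory_clerks; summing awe_allocated_kb across the clerks is another way to sanity check the value, keeping in mind that the per-node rows can be the ones reporting the bogus number:

SELECT type,
       SUM(awe_allocated_kb) AS awe_allocated_kb
FROM sys.dm_os_memory_clerks
GROUP BY type
HAVING SUM(awe_allocated_kb) > 0;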

WS 2003 Enterprise x86

Total Physical Memory  36,350.82 MB

SQL Server 2005 x86 Enterprise

MEMORYCLERK_SQLBUFFERPOOL (Total)                                 KB                

---------------------------------------------------------------- --------------------

VM Reserved                                                                  1506552

VM Committed                                                                  155612

AWE Allocated                                                             4309540864

SM Reserved                                                                        0

SM Committed                                                                        0

SinglePage Allocator                                                               0

MultiPage Allocator                                                             7752

 

My math might be wrong here but AWE Allocated 4309540864 = ~4 Terabytes!??

 

Here is the BPOOL breakdown for each node.  Node 0 looks *more correct* but not really sure if I can trust it.

 

(7 rows affected)

MEMORYCLERK_SQLBUFFERPOOL (node 0)                                KB                

---------------------------------------------------------------- --------------------

VM Reserved                                                                  1506552

VM Committed                                                                   86704

AWE Allocated                                                               17399808                = 16 GB

SM Reserved                                                                        0

SM Commited                                                                        0

SinglePage Allocator                                                               0

MultiPage Allocator                                                             7752

 

(7 rows affected)

MEMORYCLERK_SQLBUFFERPOOL (node 1)                                KB                

---------------------------------------------------------------- --------------------

VM Reserved                                                                        0

VM Committed                                                                   68908

AWE Allocated                                                             4292141056 = ~4 TB

SM Reserved                                                                        0

SM Commited                                                                        0

SinglePage Allocator                                                               0

MultiPage Allocator                                                                0

Bob Dorr - Principal SQL Server Escalation Engineer

Europe PASS, a volcano, Live Meeting, and SQL 2008 R2 BPA….


What do these have in common? Sounds like a question on Jeopardy. There is some bad and good news as part of this.

Bad

Adam Saxton and I were not able to travel to Europe this past week to speak at Europe PASS.  A little volcano in Iceland got in the way. We tried our best, even securing a flight to Barcelona to hopefully get there by train, but those flights eventually got canceled as well.

Good

  • We were able to use the wonders of Live meeting and a webcam to successfully present our pre-conference seminars yesterday (or was it the day before? Right now I’m kind of on European time). So far the feedback is that it was still successful despite the fact that we could not be there. Adam gave another talk this morning US time and I’m due to give my main con talk tomorrow.
  • Both of us as part of our preconference talks announced the upcoming availability of a new Best Practices Analyzer (BPA) for SQL Server 2008 R2

The new SQL Server 2008 R2 BPA is based on the new Microsoft Baseline Configuration Analyzer (MBCA) v2.0. We hope to release the free downloadable package to the web in early Summer of 2010. Here are a few facts about the new BPA and a sneak peek at some screenshots (the details are subject to change):

  • Supports both SQL Server 2008 and SQL Server 2008 R2
  • Free download from the web.
  • Runs on various operating systems such as Vista, XP, Windows Server 2003, Windows Server 2008, and Windows Server 2008 R2
  • Supports GUI and command line execution
  • Supports remote execution through remote Powershell 2.0
  • SQL Server 2005 rules converted where appropriate
  • ~150 rules in the first release spanning engine, security, replication, SSAS, SSRS, Servicing, and SSIS

Here are a few screenshots (the overall look and feel will be like this but the details as shown here are subject to change before we release):

image

image

image

image

Look for more posts on this blog with the tag SQLBPA in the coming months as we move toward ship.

 

Bob Ward
Microsoft

Error 18056 can be unwanted noise in certain scenarios


I saw a lot of hits on the web when I searched for the Error message 18056 with State 29. I even saw two Microsoft Connect items for this issue filed for SQL Server 2008 instances:

http://connect.microsoft.com/SQL/feedback/ViewFeedback.aspx?FeedbackID=468478

http://connect.microsoft.com/SQLServer/feedback/details/540092/sql-server-2008-sp1-cu6-periodically-does-not-accept-connections

So, I thought it was high time that we pen a blog post on when this message can be safely ignored and when it should raise alarm bells. Before I get into the nitty-gritty details, let me explain under what condition 18056 is raised with state = 29.

Most applications today make use of connection pooling to reduce the number of times a new connection needs to be opened to the backend database server. When the client application reuses the connection pool to send a new request to the server, SQL Server performs certain operations to facilitate the connection reuse. During this process (we shall call it Redo Login for this discussion) if any exception occurs, we report an 18056 error. The state numbers, as with the famous 18456 (Login Failed) error message, give us more insight into why the Redo Login task fails. State 29 occurs when an Attention is received from the client while the Redo Login code is being executed. This is when you would see the message below, which has plagued many a mind on SQL Server 2008 instances:

2009-02-19 04:40:03.41 spid58 Error: 18056, Severity: 20, State: 29.

2009-02-19 04:40:03.41 spid58 The client was unable to reuse a session with SPID 58, which had been reset for connection pooling. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

Is this a harmful message?

The answer that always brings a smile to my face: It depends! Whether this error message is just plain noise or something that should send all the admins in the environment running helter-skelter can be summarized in one line.

If the above error message (note that the state number should reflect 29) is the only message in the SQL Server Errorlog along with no other errors noticed in the environment (connectivity failures to the SQL instance in question, degraded performance, high CPU usage, Out of Memory errors), then this message can be treated as benign and safely ignored.

Why is this message there?

Well our intentions here were noble and we didn’t put the error message out there to create confusion. This error message is just reporting that a client is reusing a pooled connection and when the connection was reset, the server received an attention (in this case, a client disconnect) during the connection reset processing on the server side. This could be due to either a performance bottleneck on the server/environment or a plain application disconnect. The error message is aimed at helping in troubleshooting the first category of problems. If you do see some other issues at the same time though, these errors may be an indicator of what is going on at the engine side.

What should you do when you see your Errorlog bloating with these error messages?

a.       The foremost task would be to scan the SQL Errorlog and determine if this error message is accompanied before/after by some other error message or warning like Non-yielding messages, Out of Memory (OOM) error message (Error 701, Failed Allocate Pages etc.).

b.      The next action item would be to determine if there is high CPU usage on the server or any other resource bottleneck on the Windows Server. Windows Performance Monitor (Perfmon) would be your best friend here.

c.       Lastly, check if the Network between the Client and Server is facing any latency issues or if network packets drops are occurring frequently. A Netmon trace should help you here.
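
On SQL Server 2008 the connectivity ring buffer can also help correlate these errors with disconnects and login activity; a sketch of pulling the raw records (each record is XML you can shred further):

SELECT record
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = N'RING_BUFFER_CONNECTIVITY';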

 

Tejas Shah

Escalation Engineer - Microsoft

Follow-up to Questions from Europe PASS 2010….


I thought I would post some answers to questions I received during my pre-conference seminar at Europe PASS 2010:

Q: You said that trace flag 2528 disables parallelism for DBCC CHECKDB. Is there any way to force CHECKDB to use parallel threads?

A: No. If the trace flag 2528 is not enabled and ‘max degree of parallelism’ is 0 or > 1, then the engine can decide to use parallel threads to scan the information required for DBCC CHECKDB. How do you know? Well, one way is to see if you see CXPACKET waits for the request. Another is to see if more than one task in sys.dm_os_tasks shows up for your session running CHECKDB. But there is no trace flag to force a CHECKDB to run in parallel (like you force a MAXDOP query hint for a query).
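If you want to check this for yourself, here is a rough sketch using the DMVs mentioned above. Session_id 55 is just a placeholder for the session running CHECKDB on your system:

-- More than one task for the session suggests CHECKDB picked a parallel plan.
SELECT session_id, exec_context_id, scheduler_id, task_state
FROM sys.dm_os_tasks
WHERE session_id = 55;

-- CXPACKET waits for the same session are another hint that parallel threads are in use.
SELECT session_id, wait_type, wait_duration_ms
FROM sys.dm_os_waiting_tasks
WHERE session_id = 55
  AND wait_type = N'CXPACKET';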

Q: You mentioned that for a deferred transaction a session might get blocked by a session_id = -3. What other “negative” session ids exist?

A: There are only two others:

-2 = Active DTC transaction with no enlisted sessions. Often called an “orphaned” DTC transaction. This term may not be the best for this situation. See Robert Dorr’s recent blog on this at http://blogs.msdn.com/psssql/archive/2010/04/20/how-it-works-orphan-dtc-transaction-session-spid-2.aspx

-4 = In the code this means “latch in transition”. You should only see this in the “blocking session” such as the blocking_session_id column of sys.dm_exec_requests. So what the heck does this mean? Well, when we are trying to show who owns a latch you are waiting on, we use an internal mechanism called a spinlock to find the owner of the latch. If you wait on a spinlock for a long time you can chew up CPU, so we use the spinlock here in a “minimal” mode so as not to use up much CPU. This means that we may not be able to acquire the spinlock to tell you the true session_id that owns the latch. When this happens we simply mark it as -4. These should be very rare situations and I don’t actually expect you will ever see -4, but if you do, the wait_type should be some type of latch.

After reading this you may be asking why there isn’t a -1 SPID value. Well, there used to be. Prior to SQL 7.0, SPID = -1 was reserved for a special case designating an orphaned lock. It only showed up in defects found with the product such as this article: http://support.microsoft.com/kb/216370
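If you want to spot these negative values while investigating blocking, a simple sketch against sys.dm_exec_requests will do; the columns are standard, only the interpretation of the negative values comes from the list above:

SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id < 0;   -- -2 (orphaned DTC), -3 (deferred transaction), -4 (latch in transition)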

Q: You said in the talk that -m<app name> would limit SQL Server to single-user mode and only allow a single connection from the application name listed in the parameter. Does this syntax allow for application names with embedded spaces?

A: Yes it does. Let’s say you want the server to start in single-user mode and only allow the Query Window of SSMS to connect. You would start the server like this (in my example I have a named instance called sql2008):

net start mssql$sql2008 /m"Microsoft SQL Server Management Studio - Query"

Any connection attempt by an application other than the Database Engine Query feature of SSMS will be denied.
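One thing to keep in mind is that the string passed to /m has to match the application name the client reports. If you are not sure what name a given tool reports, a quick sketch is to run the following from that tool before restarting the server in single-user mode:

-- Shows the application name the current connection reported to the server.
SELECT session_id, program_name
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;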

Q: How do I drop the statistics or indexes that seem to get left around if Database Tuning Advisor has a problem?

A: We actually have this one documented in our Books Online. Read up on how to find these objects at:

http://msdn.microsoft.com/en-us/library/ms190172.aspx

There are all types of examples in various blogs on the web to write a script to automatically detect and delete these. If DTA is closed properly this should never be a problem but if you close the application unexpectedly it is possible for some of these to be left around.
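If you prefer to find the leftovers yourself before writing any cleanup script, here is a hedged sketch. The _dta_stat naming pattern is a convention DTA uses rather than a guarantee, so review the results before dropping anything:

-- Hypothetical indexes created by DTA
SELECT OBJECT_NAME(object_id) AS table_name, name AS index_name
FROM sys.indexes
WHERE is_hypothetical = 1;

-- Statistics created by DTA (names typically start with _dta_stat)
SELECT OBJECT_NAME(object_id) AS table_name, name AS stats_name
FROM sys.stats
WHERE name LIKE N'[_]dta[_]stat%';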

Bob Ward
Microsoft

RS Content Types and SharePoint 2010


When creating a new document library within SharePoint you have a few options.  You can just create a new Document Library, or you can go to More Options and choose Report Library.

 

[Screenshots: creating a new Document Library versus choosing Report Library under More Options]

 

When you choose Report Library, by default, it will allow you to create a report, but still will not have all of the Reporting Services Content Types.  If you go to the Library Settings, you will then have an option to add the Content Types.

[Screenshots: Library Settings for a Report Library, showing the option to add Content Types]

However, if you choose a regular Document Library and go to Library Settings, you’ll notice that you don’t have an option for Content Types.  To get the Content Types area to appear in Library Settings, you’ll need to go to Advanced Settings within Library Settings and allow for management of content types.  This should be the first setting within Advanced Settings.

[Screenshots: Advanced Settings within Library Settings, with management of content types enabled]

Once that is enabled, you will be able to add the Reporting Services Content Types to a regular Document Library.

Of Note, with SharePoint 2010 and the 2008 R2 Add-In, if you create a site with the Template of “Business Intelligence Center” through the Central Admin, we will default the Reporting Services Content Types to on for the “Site Collection Documents” Library.  Any new Libraries will still need to go through the steps above to get the Reporting Services Content Types enabled.

[Screenshot: Site Collection Documents Library in a Business Intelligence Center site with the Reporting Services Content Types enabled]


Adam W. Saxton | Microsoft SQL Server Escalation Services

http://twitter.com/awsaxton

Going dark next week…..


Don’t touch that schema!!!


You know how every product that has an underlying database has documentation that says not to modify the schema?  Do you always pay attention to that warning?

If your product is Reporting Services, I just ran into a case today which I hope convinces you to keep your hands off!!!

The problem was that the customer could not edit any of his subscriptions.  They would run, but he could not modify any of their properties.  Every time he would attempt to modify the subscription, he would get an error about being unable to cast a GUID to a string:

System.Web.Services.Protocols.SoapException: Server was unable to process request. ---> System.InvalidCastException: Unable to cast object of type 'System.Guid' to type 'System.String'.   at System.Data.SqlClient.SqlBuffer.get_String()   at Microsoft.ReportingServices.Library.InstrumentedSqlDataReader.<>c__DisplayClass3d.<GetString>b__3c()   at Microsoft.ReportingServices.Library.SqlBoundaryWithReturn`1.Invoke(Method m)   at Microsoft.ReportingServices.Library.SubscriptionImpl..ctor(IDataRecord record, Int32 indexStart)   at Microsoft.ReportingServices.Library.SubscriptionDB.GetSubscription(Guid id)   at Microsoft.ReportingServices.Library.SubscriptionManager.DeleteSubscription(Guid id)   at Microsoft.ReportingServices.Library.DeleteSubscriptionAction.PerformActionNow()   at Microsoft.ReportingServices.Library.RSService.ExecuteBatch(Guid batchId)   at Microsoft.ReportingServices.WebServer.ReportingService2005Impl.ExecuteBatch()   at Microsoft.ReportingServices.WebServer.ReportingService2005.ExecuteBatch()

Based on the source code, I found the stored procedure being run and ran it against a restored backup of the customer’s database.  The stored procedure ran fine and the data looked normal and valid.  So, I continued looking at the code trying to identify which one of the returned fields returned a GUID and then following the source to see where it was being assigned to a string value (which is impossible and will always fail).  However, I couldn’t find any place in the code where that could happen.

I did notice, though, that SSRS always attempts to find a field by index number and not by name. While at first glance this seems like a more error-prone approach, it performs better than looking up a field by name. When I saw that, I realized that having the stored procedure out of synch with the code could cause a problem like this. Therefore, I checked the database version of the customer’s database against the known database version for the build of SSRS they were running. They matched, so it wasn’t a failed upgrade-type scenario.

The next thing I checked was the actual syntax of the stored procedure.  I exported both the official definition and the one from the customer’s database.  Guess what?  They didn’t match!!  The customer’s stored procedure had an extra field being returned.  It was even more obvious when I looked at the definition and noticed that the additional field was in the structure of “alias.field”.  SSRS always uses “alias.[Field]”.
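If you ever need to do a similar comparison, a rough sketch is to pull the definition straight out of the catalog and diff it against a known-good copy. This assumes the default ReportServer database name, and GetSubscriptionProperties is only an illustrative procedure name; substitute the one you care about:

USE ReportServer;
GO
-- Returns the full T-SQL definition of the procedure so it can be diffed against a clean install.
SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.GetSubscriptionProperties')) AS proc_definition;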

The moral of the story?  Not only is modifying your SSRS database unsupported and a potential source of unintended performance impacts, it can also break your installation!!!

Evan Basalik | Senior Support Escalation Engineer | Microsoft SQL Server Escalation Services

Reporting Services, Scale Out and Clusters…


Every once in a while, I get asked the question about deploying Reporting Services on a Cluster. Usually it is tied to a scale out deployment, sometimes it is not. I just was asked the question again by an engineer in our group. So, I figured I should put this out there to have some reference.

For starters, in regards to Scale Out Deployment with Reporting Services, there are two references that I recommend you read. One is in Books Online and the other is a Scale Out Best Practices doc from the SQL CAT team.

Configuring Reporting Services for Scale-Out Deployment (BOL)

Reporting Services Scale-Out Deployment Best Practices (SQL CAT)

However, neither really touches on installing RS alongside a SQL Cluster. There are references to an NLB Cluster, but that is something different.

The scenario is that you want to install a SQL Cluster (we’ll say two-node for the purposes of this example), and you install Reporting Services along with SQL for that instance. The question that usually comes to me is something of this nature: Can I install Reporting Services as part of my cluster, and what are the recommendations for setting something like that up?

Let me start with the recommendation part as I think that is the more important piece of this. Please remember that this is my personal recommendation based on experience and dealing with these types of issues. I have also heard the same from other sources, but in general these are my thoughts.

Recommendation:

Do I recommend that you install Reporting Services as part of a SQL Cluster install? No, I do not.

Think about the scenario. Why are you setting up a cluster to begin with? It is probably because you want to make sure that your SQL Database Server is up and running reliably. My general thought on that is: if that is your goal, why are you introducing another service that will now be competing for resources on the same box? Who gets precedence?

When I think about setting up a SQL Cluster, I don’t want anything else running on that box. I realize that for some deployments, that may not be a reality because of resource constraints (money, hardware, licenses, etc.), which leads into the technical piece down below. So, if you have to go this route, be aware of the competing-resources aspect of this and plan accordingly. Know how you are going to deal with issues when SQL decides to fail over, or if one node crashes and a single machine is now hosting all SQL traffic and RS traffic.

Technical:

From a technical aspect, we will let you configure an RS instance as part of the same instance that the SQL Cluster belongs to. In my case, I set this up with an instance name of SQLALL. This included SQL, RS and AS. SQL and AS were clustered.

The biggest thing to realize when you do this is that RS itself is not cluster aware. RS at that point is just a standalone instance even though it resides on a cluster node. From a failover cluster perspective, the cluster has no knowledge of Reporting Services.

[Screenshot: Failover Cluster Manager showing the AS, SQL and SQL Agent resources; Reporting Services is not listed]

Notice that from a cluster perspective, we can see that we have AS, SQL and SQL Agent. Reporting Services is not listed.

From Reporting Services Configuration Manager, we can see that it is there and part of the same instance:

[Screenshot: Reporting Services Configuration Manager showing RS installed as part of the same instance]

We can also see that when you install Reporting Services in this type of setup, Reporting Services will be running on both cluster nodes, even though SQL is only active on one node. This is because SQL is controlled by the Failover Clustering feature of Windows.

[Screenshot: Reporting Services running on both cluster nodes while SQL is active on only one]

When a failover occurs, SQL will move from Node 1 to Node 2, but RS will not change its status at all. In this case, the load for Reporting Services should be balanced by the Load Balancer.


[Diagram: SQL failing over from Node 1 to Node 2 while RS continues running on both nodes behind the Load Balancer]

However, if Node 1 goes down altogether, all Reporting Services traffic will be pushed to Node 2 along with the SQL traffic. This could cause a resource issue on Node 2, as you may start receiving unexpected load on the one machine. The Load Balancer will redirect all traffic to Node 2 as it is the only node available.

[Diagram: Node 1 down, with all Reporting Services and SQL traffic directed to Node 2]

Summary:

The big thing to take into account when you go to install Reporting Services on clustered nodes is to think about your load and available resources. Understand what your goals are and plan accordingly, taking all scenarios into account so you know how you will react. It is technically possible to install Reporting Services onto a cluster node, but it will not be a clustered service and will compete for resources with SQL Server.

Adam W. Saxton | Microsoft SQL Server Escalation Services
http://twitter.com/awsaxton

How It Works: The SQLAgent Log File


I am still working to resolve the customer’s problem, but during my investigation I briefly looked at the SQLAgent logging format and thought you all might like some of these details.

From: Robert Dorr
Sent: Monday, May 24, 2010 9:47 AM
Subject: RE: SPID in SLEEP state indefinitely

 

The error itself is from SQLAgent while calculating the next scheduled execution time.

 

IDS_POSSIBLE_DATE_CALC_SPIN                      "Warning [%ld]: Possible date calculation spin for Schedule %ld"

 

We are trying to find the next date and time that the schedule will run.   In this case it is schedule #15 on your system.   As I look at this, it would be nice if it spelled out the Job in question, so I will file a DCR to add more details to the message as well.

 

We will spin in this code calculating the date if:

                Schedule is set to run again
                AND (next run date < current date
                     OR (next run date = current date AND run time < current time))

 

So it would help to see the SQLAgent log entries, comparing the logged date and time against the current setup for execution schedule #15.
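One way to see the current setup for schedule #15 is to query msdb directly. A minimal sketch (the schedule_id value comes from the warning text, so adjust it for your own message):

SELECT s.schedule_id,
       s.name  AS schedule_name,
       j.name  AS job_name,
       js.next_run_date,
       js.next_run_time
FROM msdb.dbo.sysschedules    AS s
JOIN msdb.dbo.sysjobschedules AS js ON js.schedule_id = s.schedule_id
JOIN msdb.dbo.sysjobs         AS j  ON j.job_id       = js.job_id
WHERE s.schedule_id = 15;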

 

Reference Information

=============================================

Common SQLAgent Log format

 

Date time - (+|?|!)  [Resource Id] <<Message Text>>

 

        INFORMATION            '?'

        WARNING                '+'

        ERROR                  '!'

 

I opened up the sqlagent.rll as a resource DLL and you can see the resource id and format string that matches the inquiry.

[Screenshot: sqlagent.rll opened as a resource DLL, showing the resource id and format string that match the inquiry]

OR

Date time - (+|?|!) <<Message Text>>

=========================================

Sent: Saturday, May 22, 2010 11:42 AM
Subject: RE: SPID in SLEEP state indefinitely

 

I am seeing a warning message in the SQL Agent’s logs, “[191] Warning [2]: Possible date calculation spin for Schedule 15”, but it is said to be fixed in SQL 2000 SP4 (http://support.microsoft.com/kb/295378)

 

Any idea of it occurring on SQL 2008 SP1 and how to fix it? Is there any relation of this issue to forever waiting SPIDs?

Bob Dorr - Principal SQL Server Escalation Engineer

SQL Server 2008 R2 New Non-Yield Ring Buffer Information


In 2002 the SQLOS team added specific checks for non-yielding scheduler issues.   You may be familiar with the 178** series of errors, like the 17883 scheduler non-yield.   Since 2002 the test matrix for SQL Server has flagged these errors and corrected them.   With the evolution of SQL Server 2005, 2008 and now 2008 R2, the number of self-inflicted 178** errors since 2007 is fewer than I can count on one hand.

We are finding that the vast majority of the 178** error conditions are caused by external factors and processes.  So in SQL Server 2008 R2 additional ring buffer entries were added to show system wide information about processes and threads, helping to pinpoint the cause of the problem.

There are now non-yield system information ring buffers that will show things like the following performance values. The data is self-explanatory and is collected with a combination of the performance monitor APIs and Toolhelp.

 

The data is only output by one scheduler monitor. It does not have to be the Node 0 scheduler monitor, but the output is protected so only a single scheduler monitor will generate the record and secondary monitors will properly avoid producing duplicates.

 

The frequency of the record collection can increase when a non-yield situation is occurring and each ring buffer currently allows up to 1024 records.

 

Process information on the system showing common memory usage statistics.  If another process is causing excessive memory pressure the information might be revealed by the ring buffer capture.

<NonYieldProcessTable>
  <ProcessID>%d</ProcessID>
  <ProcessName><![CDATA[%ls]]></ProcessName>    ------- Controlled with TRCFLG_RETAIL_COLLECT_PROCESS_NAME (-T1264) to include the EXE file name
  <PageFaultCount>%d</PageFaultCount>
  <WorkingSetSize>%ld</WorkingSetSize>
  <PrivateUsage>%ld</PrivateUsage>
</NonYieldProcessTable>

Thread level information on the system showing common processing time data points.    
<NonYieldThreadTable>
  <ProcessID>%d</ProcessID>
  <ThreadID>%d</ThreadID>
  <UserTimeStart>%I64d</UserTimeStart>       ----- user time at the start of the capture
  <UserTimeEnd>%I64d</UserTimeEnd>           ----- user time at the end of the capture (DIFF them to see activity during the reporting period)
  <KernelTimeStart>%I64d</KernelTimeStart>
  <KernelTimeEnd>%I64d</KernelTimeEnd>
</NonYieldThreadTable>
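These records should surface through sys.dm_os_ring_buffers like the other SQLOS ring buffers do. A rough sketch for pulling them back out; the exact ring_buffer_type name to filter on is an assumption on my part, so list the distinct types on your build first and adjust the filter:

-- See which ring buffer types this build exposes.
SELECT DISTINCT ring_buffer_type
FROM sys.dm_os_ring_buffers;

-- Pull the non-yield related entries as XML for easier reading.
SELECT ring_buffer_type,
       timestamp,
       CAST(record AS xml) AS record_xml
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type LIKE N'%NONYIELD%';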

If this still does not reveal the cause, the mini-dump taken may contain additional information that the Microsoft SQL Server Support team can extract. For example, the following data may be available from the time of the non-yield condition:

\PhysicalDisk(_Total)\Avg. Disk Queue Length
\PhysicalDisk(_Total)\% Disk Time
\Memory\Pages/sec
\Memory\Pages Output/sec
\Memory\Pages Input/sec
\Memory\Available Bytes
\Paging File(_Total)\% Usage

Bob Dorr - Principal SQL Server Escalation Engineer

