SQL Server MS Blogs
The Case of Anti-Virus filter drive interference with File Stream Restore
"Denzil and I were working on this issue for a customer and Denzil has been gracious enough to write-up a blog for all of us." – Bob Dorr
From Denzil:
I recently worked with a customer on a Database restore issue where the database being restored had 2TB of File stream data. The restore in this case would just not complete successfully and would fail with the error below.
10 percent processed.
20 percent processed.
30 percent processed.
40 percent processed.
Msg 3634, Level 16, State 1, Line 1
The operating system returned the error '32(The process cannot access the file because it is being used by another process.)' while attempting 'OpenFile' on 'F:\SQLData11\DataFiles\535cc368-de43-4f03-9a64-f5506a3f532e\547fc3ed-da9f-44e0-9044-12babdb7cde8\00013562-0006edbb-0037'.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
Subsequent restore attempts would fail with the same error though on "different" files and at a different point in the restore cycle.
Given that this was "not" the same file or the same point of the restore on various attempts my thoughts immediately went to some filter driver under the covers wreaking some havoc. I ran an a command to see what filter drivers were loaded (trimmed output below.)
C:\>fltmc instances
Filter Volume Name Altitude Instance Name
-------------------- ----------------------- ------------ ---------------------- -----
BHDrvx64 F:\SQLData11 365100 BHDrvx64 0
eeCtrl F:\SQLData11 329010 eeCtrl 0
SRTSP F:\SQLData11 329000 SRTSP 0
SymEFA F:\SQLData11 260600 SymEFA 0
RsFx0105 \Device\Mup 41001.05 RsFx0105 MiniFilter Instance 0
SymEFA = Symantec extended file attributes driver
SRTSP = Symantec Endpoint protection
RsFx0105 = SQL Server File Stream filter driver.
In discussing this with the customer, Anti-virus exclusions were controlled by GPO so he had put in a request to exclude the respective folders, yet the issue still continued.
In order to do my due diligence, the other question was whether we "released" the file handle after we created it, and whether someone else grabbed it? So we (Venu, Bob and I) did take a look at the code and this can be the case. On SQL Server 2008 R2 when we call the CreateFile API and we hardcode the shareAccess parameter to 0 which is exclusive access while we have it open to prevent secondary access.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx
If this parameter is zero and CreateFile succeeds, the file or device cannot be shared and cannot be opened again until the handle to the file or device is closed. For more information, see the Remarks section.
Once the file is created, we release the EX latch and can close the file handle, on the file, but sqlservr.exe continues to hold the lock on the file itself during the restore process. Once the restore operation is completed, we no longer hold an exclusive lock to the file.
We can reopen file handles during the Recovery process so the other thought was perhaps it was a transaction affected by recovery and GC and potentially some race condition but in this case we know that the restore was failing prior to that as it didn't reach 100% so that could be ruled out as well.
Getting a dump at the failure time showed me the same Restore Stack but different dumps showed multiple different files in question so it wasn't a particular Log record sequence per say causing this.
sqlservr!ex_raise
sqlservr!HandleOSError
sqlservr!FileHandleCache::OpenFile
sqlservr!FileHandleCache::ProbeForFileHandle
sqlservr!FileHandleCache::GetFileHandle
sqlservr!RestoreCopyContext::RestoreFilesystemData
BackupIoRequest::StartDatabaseScatteredWrite
Given now that it was unlikely it was SQL Server, I concentrated more on the Filter driver theory. I tried to capture Process monitor, but given the time it took and amount of files touched, Process monitor was not all that useful. I couldn't filter on a specific folder as it failed on different folders and there were 10 + mount points involved.
However from Process monitor while the restore was going on, I looked at the stack for some I/O operations (not ones that failed by any means) and I still saw fltmgr.sys sitting there for an OpenFile Call on a file in the filestream directory
fltmgr.sys + 0x2765
0xfffffa6001009765
C:\Windows\system32\drivers\fltmgr.sys
fltmgr.sys + 0x424c
0xfffffa600100b24c
C:\Windows\system32\drivers\fltmgr.sys
fltmgr.sys + 0x1f256
0xfffffa6001026256
C:\Windows\system32\drivers\fltmgr.sys
ntoskrnl.exe + 0x2c8949
0xfffff80002918949
C:\Windows\system32\ntoskrnl.exe
ntoskrnl.exe + 0x2c0e42
0xfffff80002910e42
C:\Windows\system32\ntoskrnl.exe
ntoskrnl.exe + 0x2c19d5
0xfffff800029119d5
C:\Windows\system32\ntoskrnl.exe
ntoskrnl.exe + 0x2c6fb7
0xfffff80002916fb7
C:\Windows\system32\ntoskrnl.exe
ntoskrnl.exe + 0x2b61a8
0xfffff800029061a8
C:\Windows\system32\ntoskrnl.exe
ntoskrnl.exe + 0x57573
0xfffff800026a7573
C:\Windows\system32\ntoskrnl.exe
ntdll.dll + 0x471aa
0x77b371aa
C:\Windows\System32\ntdll.dll ZwOpenFile
kernel32.dll + 0x10d48
0x779d0d48
C:\Windows\system32\kernel32.dll
kernel32.dll + 0x10a7c
0x779d0a7c
GetVolumeNameForRoot
_____SQL______Process______Available + 0x695c7e
0x1a080fe
GetVolumeDeviceNameAndMountPoint
_____SQL______Process______Available + 0x6d6898
0x1a48d18
- ParseContainerPath
_____SQL______Process______Available + 0x6d714a
0x1a495ca
sqlservr!CFsaShareFilter::RsFxControlContainerOwnership
Also looking at some other Symantec related issues, I found an article not necessarily to do with any SQL restores but the fact that this was a possibility – again this has to do with a specific issue on a specific build, but am illustrating that Filter drivers can cause some unexpected behaviors.
As far as Anti-virus exclusions go, we actually have guidance in the article below: http://support.microsoft.com/kb/309422
And also in our File stream best practices article: http://msdn.microsoft.com/en-us/library/dd206979(v=SQL.105).aspx
When you set up FILESTREAM storage volumes, consider the following guidelines:
•Turn off short file names on FILESTREAM computer systems. Short file names take significantly longer to create. To disable short file names, use the Windows fsutil utility.
•Regularly defragment FILESTREAM computer systems.
•Use 64-KB NTFS clusters. Compressed volumes must be set to 4-KB NTFS clusters.
•Disable indexing on FILESTREAM volumes and set disablelastaccess to set disablelastaccess, use the Windows fsutil utility.
•Disable antivirus scanning of FILESTREAM volumes when it is not unnecessary. If antivirus scanning is necessary, avoid setting policies that will automatically delete offending files.
•Set up and tune the RAID level for fault tolerance and the performance that is required by an application.
Looking at another run of "fltmc instances" command output and still saw the Anti-virus components on the list for those mount points. Given we "thought" we had put an exclusion in for the whole drive, and it was showing up, it was time to look at this closer
- Excluded the drives where the data was being stored – Restore still failed
- Stopped the AV Services - Restore still failed
- Uninstalled Anti-virus – Restore now succeeded
Voila once we uninstalled AV on this machine, the restore succeeded. The customer is broaching this this with the AV vendor to figure out more of the root cause.
Denzil Ribeiro – Senior PFE
But what can I *do* with Windows Azure? Backups
If you want to know more about Windows Azure, how it works, the components, or more about the entire platform, I’ve written about that here: http://blogs.msdn.com/b/buckwoody/archive/2012/06/13/windows-azure-write-run-or-use-software.aspx
But….
Maybe you just want to cut to the chase. Windows Azure. What do I *do* with it? Let’s talk about that. One of the quickest, easiest ways to use Azure is in the storage feature, as a backup target. Can Windows Azure backup data, servers, workstations or databases? Yes. Yes it can. Windows Azure storage is replicated three times in one datacenter (on different fault-domains) and then those three are replicated to another geographically separate (but still in the same country region) location, you get six copies of the data automatically. Your data stays in the datacenter you choose, and is replicated within a geo-politically same region. So it’s actually a great target for backups.
First, you need a storage account, a container underneath that, and a Blob object to put the backups on. Here’s how you do that (for free):
- Set up an account: https://www.windowsazure.com/en-us/pricing/free-trial/
- Create a Container: http://www.windowsazure.com/en-us/develop/net/how-to-guides/blob-storage/#create-account (Steps 1-7 are all you need)
- Get the Account String: Open the Portal (as above), click on Storage, select the account you want, and click Manage Keys at the bottom of the screen. Copy that string to a secure place.
OK, now that you have all that, you’re all set. In fact, you’re all set for things like Web Sites, VM’s, Code Deployment and lots of other things, but let’s focus on backups first. What are your options?
Mount a Drive, Use as Backup TargetThe easiest way to send files to Windows Azure is to mount the storage as if it is a local drive. You can use that as regular storage (I’ll talk more about this in my next post) but you can also use that as a drive letter where you can send backups. While that’s simple to implement, it isn’t always the most efficient – you’re going through a layer of storage abstraction. Still and all, it’s a good choice and quick and easy to implement. Here are some options:
- Gladinet: http://blogs.msdn.com/b/avkashchauhan/archive/2012/01/10/accessing-windows-azure-blob-storage-as-network-drive.aspx, and http://www.gladinet.com/
- Cloudberry: http://www.cloudberrylab.com/virtual-drive-amazon-s3-azure.aspx
- CloudDrive: http://www.clouddrive.com.au/default.html
- Open Source: http://coderead.wordpress.com/2011/09/02/mount-azure-blob-storage-as-a-windows-drive/ which uses: http://dokan-dev.net/en/
In addition to (and including) the providers mentioned above, some also skip the step of having to mount a drive to use as a backup target, and simply allow you to mount an agent or tool that just backs up straight to Azure.
- Cloudberry: http://techinch.com/blog/backup-your-files-to-windows-azure-with-cloudberry
- Veeam is an interesting product which focuses on backing up Virtual Machines: http://blogs.technet.com/b/haroldwong/archive/2013/02/14/modern-data-protection-for-virtual-environments-using-veeam-and-windows-azure.aspx
- Commvault (Simpana) Lots of options here: http://news.commvault.com/articles/011089_Microsoft_CommVault_Offer_Simpana_Data_Management_With_Windows_Azure_.asp
- Evault is a backup service that use Windows Azure: http://www.evault.com/products/cloud-storage-services/laptop-backup/endpoint-protection.html
- StorSimple – a hardware appliance that can act as storage or backups, with encryption, de-duplication, compression and a Hierarchical Storage Management concept: http://www.storsimple.com/total-storage/
- Riverbed Whitewater – hardware, provides de-duplication and encryption onsite prior to backup to Windows Azure: http://filipv.net/2012/08/11/using-windows-azure-cloud-storage-with-riverbed-whitewater/
- Nasuni: http://www.nasuni.com/
Data Protection Manager is a feature that is part of the System Center suite. We’ve updated that in the latest versions that will allow you to incrementally back up Servers and even Workstations and Laptops straight to Windows Azure. The beauty of this feature is that if the user is in a remote office or traveling the data will flow up to Windows Azure from wherever they are.
- More on Windows Azure Backup Services here: http://www.windowsazure.com/en-us/home/features/online-backup/
- And more on how to use DPM with Azure here: http://technet.microsoft.com/en-us/library/jj728752.aspx
SQL Server can use the mounted-drive approach described above, and you can back up your databases
- Mount Drive (first option described above)
- SQL Server 2012 backup to Windows Azure: http://msdn.microsoft.com/en-us/library/jj919148.aspx
- And, as part of HADR, this: http://blogs.msdn.com/b/buckwoody/archive/2013/01/08/microsoft-windows-azure-disaster-recovery-options-for-on-premises-sql-server.aspx
PowerPivot Table Import Wizard cannot find provider
The data source provider list in PowerPivot can often be a source of confusion for users since they equate the fact that a provider appears in the list as the provider being installed and available. Unfortunately, the list of providers is actually a static list of supported data sources for PowerPivot, so the user is still required to install the desired provider to successfully import data into PowerPivot. Thus, the most common fix for a "provider is not installed" error in the import wizard is to ensure you have the proper data provider installed and that the installed provider matches the platform architecture (32-bit or 64-bit) of PowerPivot and Excel.
If you are certain that the selected provider is installed on your client machine and are able to import data directly into Excel using the desired provider via the Data tab, then you may be encountering another issue which was recently discovered.
In this new scenario data import in PowerPivot will fail for any provider selected. The exact error seen varies depending on the provider selected but examples include:
Text File: "Details: Failed to connect to the server. Reason: Provider information is missing from the connection string"
Excel: "Cannot connect to the data source because the Excel provider is not installed."
SQL Server: "Cannot connect to the data source because the SQLServer provider is not installed."
The problem is actually due to a problem with the .NET machine configuration. PowerPivot attempts to instantiate providers by using the .NET DbProviderFactory class. If an error is encountered while instantiating the DbProviderFactory class, the error for the DbProviderFactory is not returned, instead the message returned is that the selected provider is not installed. If you are encountering this scenario it is very likely that there is a problem instantiating the .NET DBProviderFactory class.
The DbProviderFactory class configuration is read from the Machine.Config.xml file, which depending on whether you are running the 32-bit or 64-bit version of Excel and PowerPivot is located at:
c:\Windows\Microsoft.NET\Framework\v4.0.30319\Config
or
c:\Windows\Microsoft.NET\Frameworkx64\v4.0.30319\Config
Checking the Machine.Config.xml file you will find the <DBProviderFactories> element under <system.data>. The <DBProviderFactories> element should only appear once, but problematic machines may have more than one XML tag for DbProviderFactories.
Example of bad element list:
<system.data>
<DbProviderFactories>
<add name="Microsoft SQL Server Compact Data Provider"invariant="System.Data.SqlServerCe.3.5"description=".NET Framework Data Provider for Microsoft SQL Server Compact" type="System.Data.SqlServerCe.SqlCeProviderFactory, System.Data.SqlServerCe, Version=3.5.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"/>
</DbProviderFactories>
<DbProviderFactories/>
</system.data>
NOTE: The begin and end tag around the add for the SQLServerCE provider, followed by the empty element tag.
Correct Example:
<system.data>
<DbProviderFactories>
<add name="Microsoft SQL Server Compact Data Provider"invariant="System.Data.SqlServerCe.3.5"description=".NET Framework Data Provider for Microsoft SQL Server Compact" type="System.Data.SqlServerCe.SqlCeProviderFactory, System.Data.SqlServerCe, Version=3.5.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91"/>
</DbProviderFactories>
</system.data>
NOTE: The add element(s) between the open <DbProviderFactories> and close </DbProviderFactories> tags will vary depending on what providers are installed on your machine.
If you find that you have something similar to the bad example above, please use the following steps to resolve the issue:
- Make a backup copy of existing machine.config.xml file in the event you need to restore it for any reason.
- Open the machine.config.xml file in notepad or another editor of your choice.
- Delete the empty element tag <DbProviderFactories/> from the file.
- Save the updated file.
- Retry the import from PowerPivot
Wayne Robertson - Sr. Escalation Engineer
System Center Advisor is now free
It has been well over a year when I wrote a series of blog posts about a product called System Center Advisor. You can read these posts at this link
http://blogs.msdn.com/b/psssql/archive/tags/system+center+advisor/
When Advisor was first released, this cloud service was free for a 60 day trial period but required a Software Assurance contract to use past that.
Well we have decided that this should be free for everyone. Read more about this announcement at this link:
http://blogs.technet.com/b/momteam/archive/2013/03/06/system-center-advisor.aspx
For SQL Server, we now have over 100+ rules baked into this service representing the collective knowledge of CSS SQL Engineers worldwide on common customer issues. Have you ever wanted to know what the CSS teams knows based on common issues reported by customers? That is what SCA is all about. Providing you that knowledge in the form of a cloud-based service.
Give this a try on your SQL Server and look at the advice that is presented. We specifically baked in rules (called alerts) with the intention of helping you prevent problems before they happen.
Take a look through my previous blog posts above on this topic for some examples. What is incredibly powerful about this service is:
- Once you install, it just “runs”
- You view your alerts through a web portal so can do this anywhere
- As part of the service we capture configuration change history (i.e. a problem started but what changed?)
- We keep the rules “fresh” by updating the service each month but you don’t have to do anything. The service automatically pulls in these new rules for you.
I look forward to any comments you post to this blog regarding your experiences with it.
Bob Ward
Microsoft
Cumulative Update #10 for SQL Server 2008 Service Pack 3
Cumulative Update # 3 for SQL Server 2012 Service Pack 1
SQL Server 2012 partitioned table statistics update behavior change when rebuilding index
In this blog, I will talk about a couple of things related to statistics update when rebuilding index on a partitioned table.
In past versions, when you rebuild an index, you will get statistics update equivalent to FULLSCAN for free. This is true regardless if the table is partitioned table or not.
But SQL Server 2012 changed the behavior for partitioned table. If a table is partitioned, ALTER INDEX REBUILD will only update statistics for that index with default sampling rate. In other words, it is no longer a FULLSCAN. This is documented in http://technet.microsoft.com/en-us/library/ms188388.aspx. But lots of users do not realized that. If you want fullscan, you will need to run UPDATE STATISTCS WITH FULLSCAN. This change was made because we started to support large number of partitions up to 15000 by default. Previous versions did support 15000 partitions. But it’s not on by default. Supporting large number of partitions will cause high memory consumption if we track the stats with old behavior (FULLSCAN). With partitioned table, ALTER INDEX REBUILD actually first rebuilds index and then do a sample scan to update stats in order to reduce memory consumption.
Another behavior change is actually a bug. In SQL 2012, ALTER INDEX REBUILD doesn’t preserve norecompute property for partitioned tables. In other words, if you specify norecompute on an index, it will be gone after you run ALTER INDEX REBUILD for SQL 2012. We have corrected this issue in a newly released CU 3 of SQL Server 2012 SP1. Here is the KB: http://support.microsoft.com/kb/2814780
Jack Li | Senior Escalation Engineer | Microsoft SQL Server Support
DevOps for Windows Azure
"DevOps" (Short for Developer Operations) is one of a group of new terms such as "Cloud", "Big Data" and "Data Scientist" - words that are somewhere between marketing and tasks we've actually had around in other forms for years.However, working in a Distributed Environment (Both on and off premises) like Windows Azure does bring a new set of tasks to the operations we currently perform in Information Technology.
Before I offer some guidance here, I need to carefully define the term "DevOps" as I use it.There are other definitions that involve Application Lifecycle Management (ALM) and standard operations policies, and you're free to use those as well, but this is the definition I'll use for this post: By DevOps I mean those tasks involved with deploying, managing and monitoring a Windows Azure (or hybrid) project.
Another caveat: This is a non-authoritative, non-comprehensive post. I'll include only an outline of the major tasks, not a complete manual on the topic. There's enough knowledge needed on this topic for at least a whitepaper or two, and perhaps even a book, but for the moment I wanted to get some information out to ensure you have something to work from until those come along.This is primarily a list of resources for a DevOps team.
With all of those caveats in mind, we'll start the discussion after the project is conceived and architected. In most cases the DevOps team (whether that is a dedicated team or simply part of what the current IT Ops team does) is also involved in the design, at least from an information point of view. There's a great overview of the entire process available in poster form here: http://www.microsoft.com/en-us/download/details.aspx?id=36837 And you should also read this complete manual in preparation here: http://msdn.microsoft.com/en-us/library/hh871440.aspx
DeploymentThe first task after the design of the project is deployment. The deployment method depends on the type of solution; Windows Azure has the ability to run VM's, software code, or provide services that are already created (such as Active Directory).
IaaS Deploying Virtual Machines:Manually from the Portal: http://go.microsoft.com/fwlink/?linkid=254427&clcid=0x409
Through Scripting: https://www.windowsazure.com/en-us/downloads/?fb=en-us, http://msdn.microsoft.com/en-us/library/ee460812.aspx
Copying your own VM's to Windows Azure: http://msdn.microsoft.com/en-us/library/windowsazure/gg465385.aspx
Using System Center: http://www.techrepublic.com/blog/datacenter/deploy-an-on-premise-vm-to-windows-azure-with-app-controller/5919
Virtual Networking: http://msdn.microsoft.com/en-us/library/windowsazure/jj156075.aspx, http://channel9.msdn.com/Shows/Cloud+Cover/Episode-88-Tips-and-Tricks-for-Windows-Azure-Virtual-Machines-and-Virtual-Networks,
PaaSThrough Visual Studio: http://www.microsoft.com/BizSpark/Azure/HowToDeployAzureApp.aspx
Using CSPack: http://msdn.microsoft.com/en-us/library/windowsazure/gg432988.aspx
Through Scripting: https://www.windowsazure.com/en-us/downloads/?fb=en-us, http://msdn.microsoft.com/en-us/library/ee460812.aspx
SaaSManually from the Portal:https://datamarket.azure.com/
Through Scripting: https://www.windowsazure.com/en-us/downloads/?fb=en-us
MonitoringMonitoring the system after deployment involves watching the availability and uptime of the system, along with security intrusions and tracking access through code.
HealthUsing MetricsHub: http://channel9.msdn.com/Shows/Cloud+Cover/Episode-102-Using-MetricsHub-to-Monitor-Your-Windows-Azure-Applications
Uptime and Availability through the Portal: http://www.windowsazure.com/en-us/support/service-dashboard/
Uptime and Availability through Third Party Vendors: http://www.paraleap.com/AzureWatch, http://blogs.msdn.com/b/buckwoody/archive/2012/07/03/management-and-monitoring-tools-for-windows-azure.aspx
Automatic Notification: http://www.codeproject.com/Articles/375892/Adding-SMS-notifications-to-your-Windows-Azure-pro
PerformancePerformance Counters: http://msdn.microsoft.com/en-us/library/windowsazure/hh411520.aspx
Logging Diagnostics PaaS: http://msdn.microsoft.com/en-us/library/windowsazure/gg433048.aspx
Internal Instrumentation for PaaS: http://msdn.microsoft.com/en-us/library/windowsazure/hh674491%28v=vs.103%29.aspx
Third Party Performance Testing: http://www.neustar.biz/enterprise/web-performance, http://blogs.msdn.com/b/buckwoody/archive/2012/07/03/management-and-monitoring-tools-for-windows-azure.aspx
CostsUnderstanding Costs: http://msdn.microsoft.com/en-us/library/ff803372.aspx, http://technet.microsoft.com/en-us/magazine/gg213848.aspx
Subscription Management: http://msdn.microsoft.com/en-us/library/windowsazure/gg465713.aspx
System Center: http://technet.microsoft.com/en-us/library/hh221354.aspx
Third-Party Tools: http://blogs.msdn.com/b/buckwoody/archive/2012/07/03/management-and-monitoring-tools-for-windows-azure.aspx
Example of listing your deployments: http://msdn.microsoft.com/en-us/library/gg651127.aspx
ManagementManaging the deployment involves Security, Upgrades, Troubleshooting, and High-Availability/Disaster Recovery.
Windows Azure Management Portal: http://www.windowsazure.com/en-us/
Management API's: https://www.windowsazure.com/en-us/downloads/?fb=en-us and http://msdn.microsoft.com/en-us/library/ee460812.aspx, http://www.packtpub.com/sites/default/files/2220-chapter-7-managing-hosted-services-with-the-service-management-api.pdf?utm_source=packtpub&utm_medium=free&utm_campaign=pdf
SecuritySecurity Trust Center: http://www.windowsazure.com/en-us/support/trust-center/
Working with Windows Azure Active Directory: http://blogs.msdn.com/b/windowsazure/archive/2012/11/28/windows-azure-now-supports-federation-with-windows-server-active-directory.aspx
Windows Azure Authentication: http:/www.asp.net/vnext/overview/fall-2012-update/windows-azure-authentication" target="_blank">http://www.asp.net/vnext/overview/fall-2012-update/windows-azure-authentication
Deploying a secure ASP.NET MVC application with OAuth: http://blogs.msdn.com/b/webdev/archive/2013/03/12/deploy-a-secure-asp-net-mvc-application-with-oauth-membership-and-sql-database.aspx
UpgradesALM Process for PaaS: http://sqlblog.com/blogs/buck_woody/archive/2011/01/25/windows-azure-use-case-agility.aspx
TroubleshootingWindows Azure Support: http://www.windowsazure.com/en-us/support/contact/
Upgrade and Fault Domains: http://blog.toddysm.com/2010/04/upgrade-domains-and-fault-domains-in-windows-azure.html
HADRLoad-Balancing Endpoints for IaaS: http://www.windowsazure.com/en-us/manage/windows/common-tasks/how-to-load-balance-virtual-machines/
Extending SQL Server HADR to Windows Azure: http://blogs.msdn.com/b/buckwoody/archive/2013/01/08/microsoft-windows-azure-disaster-recovery-options-for-on-premises-sql-server.aspx
HADR for IaaS: http://www.visionsolutions.com/, http://blogs.technet.com/b/windowsserver/archive/2012/03/28/microsoft-online-backup-service.aspx
Multiple Instances for PaaS: http://msdn.microsoft.com/en-us/library/windowsazure/ee871996.aspx
Business Continuity for Windows Azure: http://msdn.microsoft.com/en-us/library/windowsazure/hh873027.aspx, http://blogs.msdn.com/b/avkashchauhan/archive/2011/10/14/windows-azure-vm-downtime-due-to-host-and-guest-os-update-and-how-to-manage-it-in-multi-instance-windows-azure-application.aspx
DispositionWhen the project is complete, you'll need to remove the VM's in IaaS, or data and code from PaaS and shut down the deployment. Prior to doing that, you should:
- Copy all data from the deployment to a local repository
- Document the process
- Notify Microsoft of your intent to stop the project to work with your representative on billing matters
The primary tool for disposal is the Windows Azure Portal.
switchoffset built-in function can cause incorrect cardinality estimate
Recently, we received a call from a customer reported that a query was slow. Upon further investigation, his query has a predicate that look like this:
select * from t o where c1 >switchoffset (Convert(datetimeoffset, GETDATE()), '-04:00')
Upon further investigation, we discovered that it was a cardinality issue. The data this customer had is such that there was no date beyond today. All the dates are in the past (as it for most scenarios).
SQL Server has many built-in/intrinsic functions. During query compilation, optimizer can actually ‘peek��� the value by ‘executing’ the function to provide better estimate. For example, if you use getdate() like (“select * from t where c1 > getdate()”), optimizer will be able actually get the value of getdate() and then use histogram to obtain accurate estimate.
DateAdd is another intrinsic function that optimizer can do the same trick.
But switchoffset is not one of those intrinsic functions and optimizer can’t ‘peek’ the value and utilize histogram.
Just to compare the difference, query “select * from t o where c1 >switchoffset (Convert(datetimeoffset, GETDATE()), '-04:00')” shows incorrect estimate (74397 rows).
But “select * from t o where c1 > convert (datetimeoffset, dateadd (dd, 0, getdate()))” shows correct estimate. Note that the two queries are identical. But I used them to illustrate the difference in terms of cardinality estimate.
Solution
When you use switchoffset together with getdate(), it’s best when you ‘precompute’ the value and then plug it in your query. Here is an example:
declare @dt datetimeoffset = switchoffset (Convert(datetimeoffset, GETDATE()), '-04:00')
select * from t where c1 > @dt option (recompile)
Complete demo script
if object_id ('t') is not null
drop table t
go
create table t (c1 datetimeoffset)
go
declare @dt datetime, @now datetime
set @dt = '1900-01-01'
set @now = SYSDATETIMEOFFSET()
set nocount on
begin tran
while @dt < @now
begin
insert into t values (@dt)
insert into t values (@dt)
insert into t values (@dt)
insert into t values (@dt)
insert into t values (@dt)
insert into t values (@dt)
set @dt = dateadd (dd, 1, @dt)
end
commit tran
go
create index ix on t (c1)
go
set statistics profile on
go
--inaccurate estimate
select * from t where c1 >switchoffset (Convert(datetimeoffset, GETDATE()), '-04:00')
--accurate estimate
select * from t where c1 > convert (datetimeoffset, dateadd (dd, 0, getdate()))
--accurate estimate
declare @dt datetimeoffset = switchoffset (Convert(datetimeoffset, GETDATE()), '-04:00')
select * from t where c1 > @dt option (recompile)
go
set statistics profile off
Jack Li | Senior Escalation Engineer |Microsoft SQL Server Support
Temp table caching improvement for table valued parameters in SQL Server 2012
I wanted to point out a nice performance improvement related to table valued parameters (TVP) in SQL Server 2012. It’s not currently documented in our online documentation. But we have had customers who inquired about this.
When you use TVP, SQL Server internally uses temp table to store the data. Starting SQL Server 2005, temp tables can be cached for re-used. Caching reduces contentions such as page latch contentions on system tables which can occur as temp tables are created and dropped at a high rate.
If you use TVP with a stored procedure, temp table for the TVP will be cached since SQL Server 2008 when TVP was introduced.
But if you use TVP together with parameterized queries, temp tables for TVP won’t be cached in SQL 2008 or 2008 R2. This leads to page latch contentions on system tables mentioned earlier.
Starting SQL Server 2012, table tables for TVP are cached even for parameterized queries.
Below are two perfmon results for a sample application that uses TVP in a parameterized query. Figure 1 shows that SQL 2008 R2 had sustained a high “temp table creation rate” until the test is complete. Figure 2 shows that SQL 2012 had just a very quick spike for “temp table creation rate” but then it went to zero while running rest of the test.
Figure 1: SQL Server 2008 R2’s Temp table Creation Rate
Figure 2: SQL Server 2012’s Temp Table Creation Rate
Just on a side note, a parameterized query uses sp_executesql from SQL Server perspective. From application perspective, the following ADO.NET pseudo-code will generate parameterized query:
SqlCommand cmd = ….;
cmd.CommandText = "SELECT Value FROM @TVP"
cmd.CommandType = System.Data.CommandType.Text;
DataTable tvp = new DataTable();
//adding rows to the datatable
SqlParameter tvpParam = cmd.Parameters.AddWithValue("@MyParameter", tvp);
tvpParam.SqlDbType = SqlDbType.Structured;
Jack Li | Senior Escalation Engineer | Microsoft SQL Server Support
Unable to register .NET framework assembly not in the supported list
SQL Server has supported CLR usage since version 2005. But support of .NET framework assemblies within SQL Server is limited per our support policy in KB http://support.microsoft.com/kb/922672.
Some users chose to use .NET framework assemblies outside the list in KB http://support.microsoft.com/kb/922672. This can cause various issues. Lately we have had a few reports of the following error following upgrade to SQL Server 2012.
Msg 6544, Level 16, State 1, Line 2
CREATE ASSEMBLY for assembly '<assembly name>' failed because assembly ‘<assembly name>’ is malformed or not a pure .NET assembly. Unverifiable PE Header/native stub.
A little background. When you develop your user assembly, you can reference .NET framework assemblies. If the referenced .NET framework assemblies are all from the supported list, you only need to register your own user assembly by using CREATE ASSEMBLY statement. When you use a .NET framework assembly that is not in the supported list, the following happens:
- You are required to mark your assembly to be unsafe.
- You are required to use CREATE ASSEMBLY statement to register .NET framework assembly and referenced assemblies (not in the supported list) within SQL Server database. In other words, the .NET framework assembly has to physically reside in a SQL Server database just the same as your own assembly.
- When you do this, you are presented with a warning: “Warning: The Microsoft .Net frameworks assembly 'AssemblyName' you are registering is not fully tested in SQL Server hosted environment.”
There are two types of .NET assemblies. Pure .NET assemblies only contain MSIL instructions. Mixed assemblies contain both unmanaged machine instructions and MSIL instructions. Mixed assemblies in general are compiled by C++ compiler with /clr switch but contain machine instructions resulting from native C++ code.
Regardless which version of SQL Server, CREATE ASSEMBLY only allows pure .NET assemblies to be registered. SQL Server has always required that an assembly to be loaded into SQL Server database with CREATE ASSEMBLY contains only MSIL instructions (pure assembly). CREATE ASSEMBLY will raise the above error if an assembly to be registered is mixed assembly.
Why are we seeing this issue now more often than before?
SQL Server 2005, 2008 and 2008 R2 use CLR 2.0. In SQL Server 2012, we upgraded CLR to use 4.0. As a result, all the .NET framework assemblies will need to be in version 4.0. If you have used a .NET framework assembly that is not in the supported list, you must re-register the 4.0 version using CREATE ASSEMBLY statement following upgrade. Some .NET framework assembly such as WCF started referencing mixed mode assembly in 4.0. Therefore you started to experience the issue in SQL 2012 instead of early versions.
A couple of clarifications
- The above error can occur in any version of .NET framework if the assembly you are trying to register (with CREATE ASSEMBLY) is not a pure .NET assembly. A .NET framework assembly is not guaranteed to be a pure .NET assembly in very version. Additionally, a newer version assembly may reference non-pure .NET assembly. In such situations, upgrade will fail with the above error.
- The issue occurs only if you use unsupported .NET framework assemblies which result in validation because of CREATE ASSEMBLY is involved. If your user assembly references the assemblies in the list documented in KB http://support.microsoft.com/kb/922672 (which will be updated to reflect the issue documented in this blog), we ensure it will work.
Jack Li | Senior Escalation Engineer | Microsoft SQL Server Support
Link List: Becoming a Data Professional
Whenever I present at a conference, I try and make sure to include references to the topics I discuss in the session. That means you either need a lot of handouts, or I need to wait for you to take lots of notes. While note-taking is essential, writing out web links (especially long ones) is not a good use of your time. So I post the references here on my blog, with the tag “Link Lists” and you can simply write down one small URL to get to them all.
This topic deals with the skills needed to become a data professional. I’ll include references here on the role of a data professional, and also some places where you can drill in further for the skills that you need to fill those roles. I’ll try and keep this list updated, and if you have some information on any of these topics, feel free to leave that as a comment below. This list isn’t meant to be an exhaustive web search of all the technologies and concepts I mentioned, but it does cover the references I cited in the talk.
The Data Professional Roles
http://www.informit.com/guides/content.aspx?g=sqlserver&seqNum=299 – The Data Professional
http://www.informit.com/guides/content.aspx?g=sqlserver&seqNum=247 - Becoming a DBA
http://www.informit.com/guides/content.aspx?g=sqlserver&seqNum=132 – DBA Levels
http://www.informit.com/guides/content.aspx?g=sqlserver&seqNum=131 – Certification
http://www.informit.com/guides/content.aspx?g=sqlserver&seqNum=356 – SQL Server Development Plan
http://www.simple-talk.com/opinion/opinion-pieces/learn-where-you-are/ - Learn where you are
Groups and Associations:
http://www.simple-talk.com/opinion/opinion-pieces/the-value-of-professional-associations/ - The value of professional associations
http://www.sqlpass.org/ - The Professional Organization for SQL Server
http://www.sigkdd.org/ - Association for Computing Machinery (ACM) SIG for Knowledge Discovery from data
http://www.sigmod.org/ - Association for Computing Machinery (ACM) SIG for Data Management
Data Technology Futures
http://www.simple-talk.com/sql/database-administration/big-data-is-just-a-fad/ - “Big Data” is just a fad
http://www.faithpopcorn.com/ - Demographic trends and predictions (Click “Tredbank” when you open the link)
http://blogs.the451group.com/information_management/2013/02/04/updated-database-lanscape-map-february-2013/ - Database Landscape Map
http://pragprog.com/book/rwdata/seven-databases-in-seven-weeks - Seven Databases in Seven Weeks
Data Science
http://radar.oreilly.com/2010/06/what-is-data-science.html - What is Data Science?
http://www.ocdqblog.com/home/demystifying-data-science.html - Demystifying Data Science
http://www.simple-talk.com/sql/database-administration/setting-up-a-data-science-laboratory/ - Setting up a data science laboratory system
http://www.simple-talk.com/sql/database-administration/mission-critical-database-design/ - Data Design Checklist
Reading Lists
http://www.goodreads.com/group/bookshelf/81991-data-science - Goodreads Data Science list
Special Interest Groups
http://www.linkedin.com/groups/Data-Science-Central-4298680 - LinkedIn Data Science Group
http://amstat.org/ - American Statistical Society
http://www.rss.org.uk/site/cms/contentChapterView.asp?chapter=1 – The Royal Statistical Society
RS: Database Engine does not meet edition requirements
I've ran across the following error a few times and thought I would post this out there for people to understand what is happening.
ERROR: Throwing Microsoft.ReportingServices.Diagnostics.Utilities.OperationNotSupportedException: , Microsoft.ReportingServices.Diagnostics.Utilities.OperationNotSupportedException: The feature: "The Database Engine instance you selected is not valid for this edition of Reporting Services. The Database Engine does not meet edition requirements for report data sources or the report server database. " is not supported in this edition of Reporting Services.;
You may also see a similar message in your Event Logs.
This error is a result of mismatched SKU's between the Reporting Server and the Database Engine as the message mentions. You can also look at two entries in the RS Server logs to see what it thought it hit when it performed the SKU check.
resourceutilities!WindowsService_0!e44!02/19/2013-08:58:28:: i INFO: Reporting Services starting SKU: Enterprise
library!WindowsService_0!e54!02/19/2013-08:58:34:: i INFO: Catalog SQL Server Edition = Enterprise
Where this has caused confusion is when the Catalog SQL Server Edition SKU shows the following:
resourceutilities!WindowsService_0!e44!02/19/2013-08:58:28:: i INFO: Reporting Services starting SKU: Enterprise
library!WindowsService_0!e54!02/19/2013-08:58:34:: i INFO: Catalog SQL Server Edition = Developer
This will cause the above error. From a usability perspective, Developer edition is essentially the Enterprise product that you can use for testing. However, Reporting Services has a specific SKU check to prevent running an Enterprise version of a Report Server against a Developer or Eval version of the Storage Engine.
It looks something like this:
case Standard:
case Enterprise:
case EnterpriseCore:
case DataCenter:
case BusinessIntelligence:
restricted.Add(Developer);
restricted.Add(Evaluation);
Of note, Developer edition cannot use Eval and Eval edition cannot use Developer.
So, how do we correct this if we run into this situation? You will need to uninstall and reinstall Reporting Services with the correct Product Key that matches the SKU you are looking for. If you are wanting Developer Edition, you will need to run setup with the Product Key for Developer. The Edition SKU is part of the Setup process and is stamped for that instance.
For Reporting Services 2012 in SharePoint Integrated mode, it is considered a Shared Feature, but still is reliant on the SKU that you used to run SQL Setup with. The catch here is that it is not an Instance per se. You still need to make sure that you are installing it with the proper Product Key for the edition you are using.
The product will then make a WMI call to get the "EditionName" property to determine what is appropriate.
Adam W. Saxton | Microsoft Escalation Services
http://twitter.com/awsaxton
Cumulative Update #5 for SQL Server 2008 R2 Service Pack 2
Cumulative Update #11 for SQL Server 2008 R2 Service Pack 1
Cumulative Update #6 for SQL Server 2012
Recommended SQL Server 2012 SP1 Hotfix available for msiexec issue (KB2793634)
Breaking Down 18065
We have had two blog posts on this blog regarding the 18056 error. Two from Bob Dorr (and part 2) and another from Tejas Shah. However, we still see a lot of questions about this error message. This error message can show up for different reasons. After those two blog posts were made, we released the following:
FIX: Errors when a client application sends an attention signal to SQL Server 2008 or SQL Server 2008 R2
http://support.microsoft.com/kb/2543687
This fix was specific to the following message and having to do with Attentions:
Error: 18056, Severity: 20, State: 29.
The client was unable to reuse a session with <SPID>, which had been reset for connection pooling. The failure ID is 29. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.
Since this was released, there has still continued to be confusion over this error. The intent of the fix above was to limit the amount of noise in the ERRORLOG. And, this was specific to receiving the State 29 with 18056 when an Attention was received. The Attention is the important part here. If an Attention occurred during a reset of a connection, we would normally log that to the ERRORLOG under the State 29. However, with this fix applied, if the Attention occurs during the reset of a connection, you should no longer see the error within the ERRORLOG. This does NOT mean that you will no longer see a State 29.
I will use this post to explain further how we handle these errors to give you a better understanding. To do that, I will expand on Bob Dorr's blog post that I linked above which lists out the states.
StatesDefault = 1,
GetLogin1, 2
UnprotectMem1, 3
UnprotectMem2, 4
GetLogin2, 5
LoginType, 6
LoginDisabled, 7
PasswordNotMatch, 8
BadPassword, 9
BadResult, 10
FCheckSrvAccess1, 11
FCheckSrvAccess2, 12
LoginSrvPaused, 13
LoginType, 14
LoginSwitchDb, 15
LoginSessDb, 16
LoginSessLang, 17
LoginChangePwd, 18
LoginUnprotectMem, 19
RedoLoginTrace, 20
RedoLoginPause, 21
RedoLoginInitSec, 22
RedoLoginAccessCheck, 23
RedoLoginSwitchDb, 24
RedoLoginUserInst, 25
RedoLoginAttachDb, 26
RedoLoginSessDb, 27
RedoLoginSessLang, 28
RedoLoginException, 29 (Kind of generic but you can use dm_os_ring_buffers to help track down the source and perhaps –y. Think E_FAIL or General Network Error)
ReauthLoginTrace, 30
ReauthLoginPause, 31
ReauthLoginInitSec, 32
ReauthLoginAccessCheck, 33
ReauthLoginSwitchDb, 34
ReauthLoginException, 35
**** Login assignments from master ****
LoginSessDb_GetDbNameAndSetItemDomain, 36
LoginSessDb_IsNonShareLoginAllowed, 37
LoginSessDb_UseDbExplicit, 38
LoginSessDb_GetDbNameFromPath, 39
LoginSessDb_UseDbImplicit, 40 (We can cause this by changing the default database for the login at the server)
LoginSessDb_StoreDbColl, 41
LoginSessDb_SameDbColl, 42
LoginSessDb_SendLogShippingEnvChange, 43
**** Connection String Values ****
RedoLoginSessDb_GetDbNameAndSetItemDomain, 44
RedoLoginSessDb_IsNonShareLoginAllowed, 45
RedoLoginSessDb_UseDbExplicit, 46 (Data specified in the connection string Database=XYX no longer exists)
RedoLoginSessDb_GetDbNameFromPath, 47
RedoLoginSessDb_UseDbImplicit, 48
RedoLoginSessDb_StoreDbColl, 49
RedoLoginSessDb_SameDbColl, 50
RedoLoginSessDb_SendLogShippingEnvChange, 51
**** Common Windows API Calls ****
ImpersonateClient, 52
RevertToSelf, 53
GetTokenInfo, 54
DuplicateToken, 55
RetryProcessToken, 56
LoginChangePwdErr, 57
WinAuthOnlyErr, 58
**** New with SQL 2012 ****
DbAuthGetLogin1, 59
DbAuthUnprotectMem1, 60
DbAuthUnprotectMem2, 61
DbAuthGetLogin2, 62
DbAuthLoginType, 63
DbAuthLoginDisabled, 64
DbAuthPasswordNotMatch, 65
DbAuthBadPassword, 66
DbAuthBadResult, 67
DbAuthFCheckSrvAccess1, 68
DbAuthFCheckSrvAccess2, 69
OldHash, 70
LoginSessDb_ObtainRoutingEnvChange, 71
DbAcceptsGatewayConnOnly, 72
An 18056 error can only occur when we are trying to reset a pooled connection. Most applications I see these days are setup to use pooled connections. For example, a .NET application will use connection pooling by default. The reason for using pooled connections are to avoid some of the overhead of creating a physical hard connection.
With a pooled connection, when you close the connection in your application, the physical hard connection will stick around. When the application then goes to open a connection, using the same connection string as before, it will grab an existing connection from the pool and then reset the connection.
When a connection is reset, you will not see sp_reset_connection over the wire. You will only see the "reset connection" bit set in the TDS Packet Header.
Frame: Number = 175, Captured Frame Length = 116, MediaType = ETHERNET
+ Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-15-5D-4C-B9-60],SourceAddress:[00-15-5D-4C-B9-52]
+ Ipv4: Src = 10.0.0.11, Dest = 10.0.0.130, Next Protocol = TCP, Packet ID = 18133, Total IP Length = 102
+ Tcp: [Bad CheckSum]Flags=...AP..., SrcPort=59854, DstPort=1433, PayloadLen=62, Seq=4058275796 - 4058275858, Ack=1214473613, Win=509 (scale factor 0x8) = 130304
- Tds: SQLBatch, Version = 7.3 (0x730b0003), SPID = 0, PacketID = 1, Flags=...AP..., SrcPort=59854, DstPort=1433, PayloadLen=62, Seq=4058275796 - 4058275858, Ack=1214473613, Win=130304
- PacketHeader: SPID = 0, Size = 62, PacketID = 1, Window = 0
PacketType: SQLBatch, 1(0x01)
Status: End of message true, ignore event false, reset connection true, reset connection skip tran false
Length: 62 (0x3E)
SPID: 0 (0x0)
PacketID: 1 (0x1)
Window: 0 (0x0)
- TDSSqlBatchData:
+ AllHeadersData: Head Type = MARS Header
SQLText: select @@version
In the above example, we are issuing a SQL Batch on a pooled connection. Because it was a pooled connection, we have to signal that we need to reset the connection before the Batch is executed. This is done via the "reset connection" bit.
After the above SQLBatch is issued, the app could then turn around and issue an Attention to cancel the request. This is what resulted in the 18056 with State 29 in the past under the condition of an attention.
Frame: Number = 176, Captured Frame Length = 62, MediaType = ETHERNET
+ Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-15-5D-4C-B9-60],SourceAddress:[00-15-5D-4C-B9-52]
+ Ipv4: Src = 10.0.0.11, Dest = 10.0.0.130, Next Protocol = TCP, Packet ID = 18143, Total IP Length = 48
+ Tcp: [Bad CheckSum]Flags=...AP..., SrcPort=59854, DstPort=1433, PayloadLen=8, Seq=4058275858 - 4058275866, Ack=1214473613, Win=509 (scale factor 0x8) = 130304
- Tds: Attention, Version = 7.3 (0x730b0003), SPID = 0, PacketID = 1, Flags=...AP..., SrcPort=59854, DstPort=1433, PayloadLen=8, Seq=4058275858 - 4058275866, Ack=1214473613, Win=130304
- PacketHeader: SPID = 0, Size = 8, PacketID = 1, Window = 0
PacketType: Attention, 6(0x06)
Status: End of message true, ignore event false, reset connection false, reset connection skip tran false
Length: 8 (0x8)
SPID: 0 (0x0)
PacketID: 1 (0x1)
Window: 0 (0x0)
In this case, we would still be in the process of doing the connection reset which would be a problem. Bob Dorr's Part 2 blog that is linked above goes into good detail for how this actually occurs.
So, no more State 29?The thing to realize about State 29 is that it is a generic state just indicating that an exception has occurred while trying to redo a login (Pooled Connection). This exception was not accounted for in any other logic to produce a different state that is listed above. Something similar to like an E_FAIL or General Network Error.
Going forward, assuming you the above fix applied, or are running on SQL 2012 which has it as well, if you get a State 29, it will not be because of an Attention because we are not logging the 18056 any longer for the Attention, however, if you look at dm_os_ring_buffers, you will still see the actual Attention (Error 3617). We just don't log the 18056 any longer to avoid noise.
<Record id= "3707218" type="RING_BUFFER_EXCEPTION" time="267850787"><Exception><Task address="0x52BDDC8"></Task><Error>3617</Error><Severity>25</Severity><State>23</State><UserDefined>0</UserDefined></Exception><Stack
There are things that occur in the course of resetting a login that could trigger a State 29. One example that we have seen is a Lock Timeout (1222).
In the Lock Timeout scenario, the only thing logged to the ERRORLOG was the 18056. We had to review the dm_os_ring_buffers DMV to see the Lock Timeout.
<Record id= "3707217" type="RING_BUFFER_EXCEPTION" time="267850784"><Exception><Task address="0x4676A42C8"></Task><Error>1222</Error><Severity>16</Severity><State>55</State><UserDefined>0</UserDefined></Exception><Stack
The Lock Timeout was a result of statements issuing "SET LOCK_TIMEOUT 0" which affects the connection itself. When the connection is "reset", the SET statements are carried forward. Then based on timing, and whether an exclusive lock is taken based on what the Login logic is looking for, it could end up affecting Logins off of a Pooled Connection when that connection is reused. The default lock timeout for a connection is -1.
Now what?If you receive a State 29, you should follow that up by looking in the dm_os_ring_buffers. You will want to look at the RING_BUFFER_EXCEPTION buffer type.
select
cast(record
as
XML) as
recordXML
from
sys.dm_os_ring_buffers
where
ring_buffer_type =
'RING_BUFFER_EXCEPTION'
The error that you find should help explain the condition, and/or allow you to troubleshoot the problem further. If you see 3617, then you will want to look at applying the hotfix above to prevent those messages from being logged. If you see a different error, then you may want to collect additional data (Profiler Trace, Network Trace, etc…) to assist with determining what could have led to that error.
Adam W. Saxton | Microsoft Escalation Services
http://twitter.com/awsaxton
How Does the Cloud Change a Developer's Job?
I've recently posted a blog on how cloud computing would change the Systems Architect’s role in an organization, another on how the cloud changes a Database Administrator's job, and the last post dealt with the Systems Administrator. In this post I'll cover the changes facing the Software Developer when using the cloud.
The software developer role was the earliest adopter of cloud computing. This makes perfect sense, because the software developer has always used computing "as a service" - they (most often) don't buy and configure servers, platforms and the like, they write code that runs on those platforms. And there's probably not a simpler definition of a software developer to be found, but as with all simple statements, you lose fidelity and detail. I'll offer a more complete list in a moment.
Because the software developer's process involves designing, testing and writing code locally and then migrating it to a production environment, all of the paradigms in cloud computing - from IaaS to PaaS to SaaS - come naturally.
The Software Developer's RoleThe software developer has evolved since the earliest days of programming.The software developer not only "writes code" - there are far more tasks involved in modern systems development:
- Assisting the Business Role(s) in developing software specifications
- Planning software system components and modules
- Designing system components
- Working in teams writing classes, modules, interfaces and software endpoints
- Designing data layouts, architectures, access and other data controls
- Designing and implementing security, either programmatic, declarative, or referential
- Mixing and matching various languages, scripting and other constructs within the system
- Designing and implementing user and account security rights and restrictions
- Designing various software code tests - unit, functional, fuzz, integration, regression, performance and others
- Deploying systems
- Managing and maintaining code updates and changes
Like most of the previous roles, those tasks also unpacks into a larger set of tasks, and no single developer has exactly that same list. And like the DBA, the role is often more, or less of that list based on where the developer works. Smaller companies may include the development platform in the duties so that a developer is also a systems administrator. In larger organizations I've seen developers that specialized on User Interfaces, Engine Components, Data Controls or other specific areas.
How the Cloud Changes ThingsThe software developer role obviously has the same concerns and impacts of "the cloud" as the Systems Architect. They need to educate themselves on the options within this new option (Knowledge), try a few test solutions out (Experience) and of course work with others on various parts of the implementation (Coordination).
The big changes for a developer include three major areas: Hybrid Software Design, Security, and Distributed Computing.
Hybrid Software DesignAfter the PC revolution, software developers designed systems that ran primarily on a single computer. From there the industry moved to "client/server", where most of the code still lived on the user's workstation, and various levels of state (such as the data layer) moved to a server over fast connected lines. After than followed the Internet phase, which had less to do with HTML coding than it did with state-less architectures. While no architecture is truly stateless, there are ways of allowing the client to be in a different state than the server of the application at any one time - this is the way the Web works.
Even so, the developer often simply moved one the primary layers (such as Model, View or Controller) to the server, using the User Interface merely as the View or Presentation layer. While technically stateless, this doesn't require a great deal of architecture change - there are various software modules that run on a server, and perhaps that connects to a remote data server. In the end, it's still a single paradigm.
We now have the ability to run IaaS (hardware abstraction), PaaS (hardware, operating system and runtime abstraction) and SaaS (everything abstracted, API calls only) in a single environment such as Windows Azure. A single application might have a Web-based Interface Server with federated processes (using a PaaS set of roles), a database service (using a SaaS provider such as Windows Azure SQL Database), a specialized process in Linux (using an IaaS role in Windows Azure) and a translator API (from the Windows Azure Marketplace). This example involves only one vendor - Microsoft. I've seen applications that use multiple vendors in this same way.
Thinking this way opens up a great deal of flexibility - and complexity. Complexity isn't evil; it's how complicated things get done many times. The modern developer needs to understand how to build hybrid software architectures.
Resources: Hybrid Architectures with step-by-step instructions and examples: http://msdn.microsoft.com/en-us/library/hh871440.aspx and Windows Azure Hybrid Systems: http://msdn.microsoft.com/en-us/library/hh871440.aspx?AnnouncementFeed
SecurityHaving a single security boundary, such as "everyone who works in my company", is a relatively simple problem to solve. Normally the System Administrators configure and control a security provider, such as Active Directory, and developers can access that security layer programmatically. That allows for good separation of duties and role-based control.
In modern applications, clients, managers, and users both internal and external need various levels of access to the same objects, code and data. A client should be able to enter an order, a store should be able to accept the order, the credit-card company should be able to check the order and authorize payment, and the managers should be able to report on the order or change it if needed. Using role-based security across multiple domains would be impossible to maintain.
Enter "claims-based" authentication. In this paradigm, the user logs in with whatever security they use - corporate or other Active Directory, Facebook, Google, whatever. The application (using Windows Identity Foundation or WIF) can accept a "claim" from that provider, and the developer can match whatever parts of that claim they wish to the objects, code and data. And example might be useful.
Buck logs in to his corporate Active Directory (AD), and attempts to use a program based in Windows Azure. Windows Azure rejects the login silently, and is configured to check with Buck's AD. Buck's AD says "yes, I know Buck, and he has been granted the following claims: "partner", "manager", "approver". The developer does not need to know about Buck's AD, Buck, his login, or anything else. She simply codes the proper data access to allow "approver" to approve a sale.
This allows a lot of control, at a very fine level, without having to get into the details of each security provider. .
Resources: Overview of using claims-based Azure Security: http://adnanboz.wordpress.com/2011/02/06/claims-based-access-and-windows-azure/
Distributed ComputingIs there a difference between stateless computing, or even the hybrid programming I mentioned earlier, and "Distributed Computing"? Yes - the primary difference is latency. Even stateless code can have too small a tolerance for latency.
Dealing with slow connectivity, or breaks in connections has many impacts. One method of dealing with this is to locate data and computing of that data as closely as possible, even if this means relaxing consistency or duplicating data. Another method is to go back to a great paradigm from the past that is possible underused today is a Service Oriented Architecture. The Windows Azure Service Bus is possibly one of the fastest and easiest way to adopt cloud computing without completely rearchitecting your application.
References: Great breakdown of the thought process around a distributed architecture: http://msdn.microsoft.com/en-us/magazine/jj553517.aspx and using a Windows Azure Relay Service: http://www.windowsazure.com/en-us/develop/net/how-to-guides/service-bus-relay/
Under rare conditions, using IN clause can cause unexpected SQL behavior
I want to make you aware of a latest SQL Server 2008 hotfix documented in http://support.microsoft.com/kb/2791745. Using large number of constants in IN clause can result in SQL Server termination unexpectedly. When this happens, you won’t see anything in errorlog or any dumps generated by SQL Dumper.
The condition to trigger this is not that common. Therefore, you may never experience this type of issue. In order to hit this condition, you must have mismatched numeric data type in the IN clause.
Let’s assume that you have a table defined as “create table t (c1 numeric(3, 0))”. But in the IN clause, you have something like t.c1 in ( 6887 , 18663 , 9213 , 526 , 30178 , 17358 , 0.268170 , 25638000000000.000000 ). Note that precision and scale of the constants exceed the column c1’s precision and scale.
If your have queries like these, then you may experience this unexpected behavior depending on the final query plan. This usually happens when you allow your user to do ad hoc queries and add random number of constant values which may exceed the column’s precision and scale.
Solution
The solution is to apply http://support.microsoft.com/kb/2791745. Note that the issue doesn’t happen on SQL 2012 and we are working on a fix on SQL Server 2008 R2 as well.
