SQL Server MS Blogs

Analysis Services Thread Pool Changes in SQL Server 2012

SQL Server SQL CSS - Tue, 01/31/2012 - 14:40

As part of the SQL Server 2012 release several changes were made to Analysis Services that should alleviate some issues previously seen on large NUMA node machines.

Separation of IO and Processing Jobs

To better understand how the changes that were made work, it is helpful to have some background information about what types of issues were seen in earlier releases.

During testing, it was found that a 2 NUMA node 24 core server was handling roughly 100 queries per second during stress testing.  The same tests were then run on a 4 NUMA node server with 32 cores and the queries answered per second actually decreased. Investigation indicated the cause as cross NUMA node access to the system file cache. For example if the first job that read a file was scheduled on a CPU assigned to NUMA node 1 and a subsequent job, which needed to read the same pages from the file, was scheduled on a CPU assigned to NUMA node 3, a significant performance penalty was seen reading the cached file page.  Deeper investigation determined that the bottleneck appeared to be in accessing the standby list of file cache pages.  The performance impact appeared to become more pronounced the higher the number of CPUs and NUMA nodes on the machine.  For machines with less than 4 nodes the impact was not found to be significant.  However, on machines with 4 or more nodes the impact was such that performance could begin to regress.

One possible work around is to open files in Random mode since this changes the file cache access behavior. To accommodate this SSAS 2012 now allows Random file access to be configured through the msmdsrv.ini file by setting the RandomFileAccessMode property to a value of 1.  This server property does not require a service restart to take effect, but a restart is recommended.  If the server is not restarted then Analysis Services will not release open files or change the way it accesses open files, but the setting will take effect for newly opened or created files.

While changing the file access mode may provide some relief, eliminating cross NUMA node file access whenever possible is a better long term solution. Towards this goal, SQL Server 2012 Analysis Services now has separate Process and IOProcess thread pools which are NUMA node aware.  The new IOProcess thread pool will handle read jobs while the Process thread pool will continue to handle ROLAP and processing related jobs, including writing new files. On machines with 4 or more NUMA nodes, the IOProcess thread pool is not a single thread pool but instead a collection of thread pools with each NUMA node having its own pool of IOProcess threads.  Assigning each NUMA node its own IOProcess pool doesn’t by itself help with cross NUMA node file access unless file IO operations are consistently assigned to the same IOProcess thead pool, so an algorithm for assigning partition file read operations to specific IO thread pools was also added. At RTM, the algorithm spreads partition reads across IO thread pools based on the ordinal position of the partition in the partitions collection for a measure group.  The algorithm is subject to change without notice in future builds so design decisions should not be made based on this behavior.

An attentive reader will notice that the discussion has highlighted partition reads to this point, what about dimension operations? Dimension read jobs are always assigned to the IOProcess thread pool for NUMA node 0.  While one could argue that this scheme could result in NUMA node 0 being assigned a larger percentage of the work, it is expected that most dimension operations will be operating on cached data in memory and won’t have a noticeable impact. 

Because IOProcess threads are expected to perform short duration read operations on files and as such do not register themselves as cancellable objects.  This means that even if a query which requested an I/O operation was cancelled, the in-process I/O jobs could continue to run for a short period of time after the query was cancelled, however new I/O jobs should not be created for the canceled query.

As with other SSAS features, the default behavior of the per NUMA node IOProcess thread pools is intended to cover the most common scenarios.  However, in some situations it may make sense to override the 4 NUMA node threshold for the NUMA node affinitized IOProcess thread pools.  To allow administrators to revert back to a single IOProcess thread pool or force per NUMA node IOProcess thread pools for machines with less than 4 NUMA nodes a new entry has been added to the Analysis Services configuration file (msmdsrv.ini).  The PerNumaNode setting under ThreadPool\IOProcess has a default value of -1 which tells the server to use the automatic 4 NUMA node threshold.  Changing this value to 0 disables the per NUMA node thread pool behavior, while setting it to 1 will turn on this behavior (even if there are less than 4 nodes).

The splitting of Process and I/O jobs into separate thread pools and the assignment of I/O jobs to consistent NUMA nodes should alleviate some of the performance impacts of cross NUMA node operations, significantly increasing the performance of SSAS workloads on higher end servers.


Greater than 64 CPU support and Thread Affinitization 

In adition to giving IO operations their own thread pools, Analysis Services now supports more than 64 CPUs, something the relational engine has had for a while. 

In order to add support for more than 64 CPUs, Windows uses the concept of processor groups.  A processor group in Windows can contain a maximum of 64 CPUs and systems with more than 64 CPUs will contain multiple processor groups.  For more details on processor groups and support for more than 64 CPUs read the following:
http://msdn.microsoft.com/en-us/library/dd405503%28VS.85%29.aspx
http://blogs.msdn.com/b/saponsqlserver/archive/2010/09/28/windows-2008-r2-groups-processors-sockets-cores-threads-numa-nodes-what-is-all-this.aspx

To support multiple processor groups and thus more than 64 CPUs, SSAS 2012 was updated to set the process affinity mask for the msmdsrv.exe process to span multiple processor groups.  Along with this capability a new configuration property named GroupAffinity was added for each thread pool in the server.  This property allows an SSAS administrator to have fine grain control over which CPUs on a machine are used for each thread pool.  The GroupAffinity setting is a bitmask that is used to determine which CPUs in a processor group can be used for the thread pool in which the GroupAffinity mask is defined.  For example if the following entry:
<GroupAffinity>0xFFFF,0xFFFF</GroupAffinity>
were to appear under <ThreadPool> <Process> in the msmdsrv.ini file, it would affinitize threads to 16 logical processors in the first two processor groups on the server. Where the following entry:
<GroupAffinity>0x00F0,0xFFFFFFFF</GroupAffinity>
would affinitize threads to the CPUs 4-7 in the first processor group, and the first 32 CPUs in the second processor group.  The GroupAffinity property can have as many comma separated hex values as there are defined CPU groups on a server.  If the mask contains less bits than the number of CPUs for the processor group then it is assumed that non-specified bits are zeros. If no GroupAffinity value is specified for a thread pool (default) then that thread pool is allowed to spread work across processor groups and CPUs on the box.

For diagnostic purposes the msmdsrv.log file now contains entries at service start that reflect the size of each of the five thread pools (Query, ParsingShort, ParsingLong, Processing, and IOProcessing) and their settings, including affinity.

Example:
(12/21/2011 4:16:04 PM) Message: The Query thread pool now has 1 minimum threads, 10 maximum threads, and a concurrency of 4.  Its thread pool affinity mask is 0x0000000000000003. (Source: \\?\C:\Program Files\Microsoft SQL Server\MSAS11.DENALIRC0\OLAP\Log\msmdsrv.log, Type: 1, Category: 289, Event ID: 0x4121000A)

Note: Although VertiPaq can use more than 64 CPUs, GroupAffinitization is not currently supported for the VertiPaq thread pool, even though an entry exists in the msmdsrv.ini file.

While the GroupAffinity setting was added as part of the work to support more than 64 CPUs, this property can also be used on servers with less than 64 CPUs to control which CPUs are used for specific operations.  Through use of the GroupAffinity mask administrators can push I/O, processing, query, or parsing threads to specific CPUs to obtain optimal resource usage, or better enable resource sharing across multiple processes on the same server.

Wayne Robertson, Sr. Escalation Engineer Analysis Services

Categories: SQL Server MS Blogs

Management Studio tricks you may or may not know about

SQL Server SQL CSS - Tue, 01/31/2012 - 10:33

I was sitting through SQL Server 2012 training, and Ajay Jagannathan was showing us Management Studio.  Eric Burgess had worked on this topic.  He started covering some neat things, that have apparently been there for a while, but I just never played around with it enough to see what all you could do.  I’m sure a lot of others are not aware of these either, so I thought I would highlight some of them.  These may help to make you more productive, or not - depending on how long you spend playing with these. The biggest change for Management Studio within SQL Server 2012 is that we are using the Visual Studio 2010 Shell.

These items aren’t exactly new, but they were new to me. 

1. Keyboard shortcuts

There are a ton of Keyboard shortcuts that you can use within Visual Studio.  The default settings are based on Visual Studio 2010.  Here is a list of those shortcuts. SQL Server Management Studio Keyboard Shortcuts

You can get to these through Tools -> Options -> Environment/Keyboard

 

2. Block Selection

Sometimes you may only want to select and copy a column of text as opposed to the normal text selection done by holding down the Shift Key.  To do Block selection, you can do SHIFT+ALT and drag your mouse to only select certain areas of your text in column fashion.

 

3. Status Bar

Most people should be familiar with the Status Bar at the bottom of a query window.

Under Tools -> Options -> Text Editor/Editor Tab and Status Bar, there are a bunch of options that you can play with.

One is the Status Bar Location.  By default, this is set to Bottom, but you can move it to the top if you choose.  This was a little weird for me though - although I’m just not use to it.  It does put it more front and center.

From a color perspective, you’ll notice Group Connections and Single Server Connections.  I’ll talk about Group connections below, but just be aware that you can change the coloring for when it is a group connection.

 

4. Cycle through Query Windows

I’ve known about ALT+TAB to cycle through programs (Windows).  I’ve also known about CTRL+TAB to cycle through components within a given application.  For example in Excel you could use CTRL+TAB to move between worksheets.  I’ve never tried it in Management Studio, but it allows you to cycle through the Query Windows.

 

 

Also, CTRL + F6 will cycle through the actual tabs without the graphic switching display.

5. Group Connections

You can go to View and select to show Registered Servers.  Within Registered Servers you can create a group of SQL Servers.  This then allows you to start a query that will be run against all of the servers within the group.  This is where the Group Connection Color for the status bar comes into play.

This could be really handy if you need to execute items across multiple servers.  The color of the status bar is there to help you realize that that query is a group query as opposed to a single server connection.

 

Hopefully these are useful to you.

 

Adam W. Saxton | Microsoft Escalation Services
http://twitter.com/awsaxton

Categories: SQL Server MS Blogs

Book Review (Book 8) - The Elements of Persuasion

Bucky Woody - Mon, 01/30/2012 - 08:57

This is a continuation of the books I challenged myself to read to help my career - one a month, for year. You can read my first book review here. The book I chose for January 2012 was: The Elements of Persuasion by Richard Maxwell and Robert Dickman.

Why I chose this Book:

As I mentioned in a previous review, I think good storytelling is an essential part of any career. Communication is basic in not only our professional but personal lives, and everyone I’ve met responds well to stories, from children to executive audiences. Not only that, learning to tell a story helps you formulate concepts about the topic, which is yet another way of learning.

I heard about this book from a couple of folks, and it landed within search of “storytelling” and “business”. Whenever I just search for “Storytelling” I either end up with lists of stories (which is fine) or lists of children’s books on storytelling (which is also fine) but neither of these are quite what I’m looking for.

A quick search on Amazon and I located the book, and then a quick check of my various e-library offerings and I downloaded it to my laptop for reading.

What I learned:

This is a “selling” book, but not like you might think. It’s not a book of a quick sale like at a car-lot or a “quick-sale” environment. It’s more along the lines at the executive level and longer-term sales - those involve stories as well.

Sadly, this is another “business book” - the kind I normally don’t like much. There are typical case-study layouts with lots of examples, but in my mind not enough didactic information to actually help you develop a good story-telling mantra. 

Even so, I learned some interesting things about the process these authors use. Some of the case studies are interesting, and I did pull out that a story should work towards a single, defining sentence. This isn’t unique to this book, but it is a reinforcement of what I’ve learned elsewhere. Although nothing to do with storytelling, I did like the reference to Lockheed’s “14 Rules”, which I hadn’t read before. They also break down the storytelling process into five elements, which is actually covered better (in my mind) in a book called “20 Master Plots” , which may actually be the storytelling book I’ve been searching for.

Or perhaps I should just write the one I’m looking for.  

At any rate, not sure I would recommend this book to others - perhaps as a check-out, but not a purchase, at least if this is for the same reason I looked it up. 

Raw Notes:

As I read, I take notes - it’s called “reading with a pencil”. These are the notes I made to myself, in no particular order and with no context other than the book itself:

  • Stories are interesting to us all.
  • Describes five elements in a story, but in fact this is for only one type of story. Other books describes more story types.
  • Very standard business book, but there are good tips in some of the chapters.
  • Explained how to connect with the audience, good points Spends a lot of time referring to other books The book of five rings Work towards a single, memorable sentence.
  • Changes partway through into stories about stories. This is better.
  • A mix of storytelling and sales, although this was touted for sales, feels much more like selling than storytelling, advertisements.
  • Interesting story about memory championships, where contestants memorize cards. They use unusual stories.
  • Look up Lockheed and the 14 rules
Categories: SQL Server MS Blogs

The case of the incorrect page numbers

SQL Server SQL CSS - Sun, 01/29/2012 - 16:46

As you may or may not know, SSRS 2008 R2 added the ability to automatically create page breaks on group changes.  Historically, people attempted to use custom code to accomplish this and, while it worked, the pagination logic we use in SSRS 2008 R2 breaks the standard implementation of this for several reasons.  You can see the typical implementation of this at http://blogs.msdn.com/b/chrishays/archive/2006/01/05/resetpagenumberongroup.aspx.

Unfortunately, in SSRS 2008 R2 the pagination engine does multiple passes in some scenarios thus breaking this logic.  Also, the logic referenced above makes the assumption that the report is rendered from page 1 to page N and this assumption is not necessarily true either.  We recognized that losing that ability to do the pagination at a group level is a key feature so we actually rolled it completely into the product.  Robert Bruckner (one of the SSRS developers) gives a really good explanation of all of the ins and outs of the feature at http://blogs.msdn.com/b/robertbruckner/archive/2010/04/25/report-design-reset-page-number-on-group.aspx.

I wish I could say that this feature has been completely without pain, but that is not true.  If any of you have been working with SSRS for any length of time, you know that there are all kinds of quirks to the pagination process.  Given the correct information, it is always fairly easy to figure out why the pagination engine resulted in a certain pagination, but understanding all the pertinent details is….a bit difficult.  In a lot of ways, I view pagination sort of like the query optimizer in the SQL Server engine.  It seems like total black magic if you don’t know the rules it follows, but once you understand the rules it generally makes good sense.  However, again like the query optimizer, the pagination logic is very complex and is something that we are constantly refining.  As you can imagine, people regularly find edge cases where the pagination either doesn’t function as they want or flat out makes a mistake.  In addition, layering group level page breaks on top of normal pagination makes the scenario even more complicated.  Thus, I have to make the admission that we have had several fixes released in this specific area.  Sigh…

The good news is that I have also generally found that a good chunk of my page numbering problems go away by using the latest CU.  The bad news is that a simple fix like upgrading to the most recent CU didn’t fix my most recent problem.

So, here’s the situation.  My customer wanted to reset the page number whenever the group changed.  No big deal in theory.  As documented both in the MSDN documentation and Robert’s blog, all they had to do was to add a couple of properties to the target group:

No big deal, right?  They completely expected to see the pagination look something like this:

and , etc.

This is what my customer saw instead:

What the heck????!?!!?!?!

Needless to say, this is when they decided to open a case with Microsoft Support.  The problem got even weirder when I discovered that under no circumstances could I duplicate the behavior!!!  I could reproduce a whole host of problems (2 of 1, 9 of 6), but by applying some workarounds and the latest CU’s I could always get back to normal page numbering resets.  At this point, I was completely stumped.  I even engaged the SSRS Product Group to ask if there was some sort of race condition that could lead to this behavior.  Unfortunately, the answer I got back was that there was no identifiable race condition scenario where 1 of 337 could happen.

Then, as always seems to be the case, I put the problem aside for a couple of days to let it stew.  And then, while driving into work the next Sunday, it hit me.  Target server version!

You see, it had occurred to me that one of the good things about BIDS 2008 R2 is that it allows you to publish against both a 2008 and a 2008 R2 instance of SSRS.  *However*, the thing I had never checked before was how we dealt with a 2008 R2 feature when deploying against 2008.  It turns out we strip the feature out during the publish process.  When I change my project properties to target a 2008 instance even though my instance is actually 2008 R2, this is what I saw:

Building ‘FailingScenario.rdl' for SQL Server 2008 Reporting Services.

A PageBreak property was removed from the data region, rectangle, or group ‘GroupOnWhichToIterate’. SQL Server 2008 Reporting Services does not support the PageBreak property ResetPageNumber.

The PageName property specified on the data region, rectangle, or group ‘GroupOnWhichToIterate’was removed from the report. SQL Server 2008 Reporting Services does not support the PageName property.

Build complete -- 0 errors, 2 warnings

This led me to ask my customer what version of SSRS their instance of BIDS was targeting as guess what?  It was SSRS 2008!

The good news is that getting the desired group level page breaks was as easy as setting the target version to Reporting Services 2008 R2 and deploying the report.

Categories: SQL Server MS Blogs

The official release of System Center Advisor…

SQL Server SQL CSS - Thu, 01/26/2012 - 22:49

If you have followed this blog, you have probably seen a series of posts documenting the life of a project I’ve been working on, Atlanta, to a product called System Center Advisor. Today marks the official release of that product. This has been a particularly rewarding journey for me to see an idea about giving our knowledge in CSS to customers turn into a full proactive assessment service powered by System Center and the cloud.

In my past blog posts, I’ve shown you examples of the knowledge this product brings in the form of alerts and configuration history. With the official release comes new features such as a new dashboard about the assessment of your servers, a revised and simplified setup program, voting buttons to tell us if the alerts are useful, and the ability to provision other users to view or administer your console.

While I love to see new features like these that make this product compelling, the power of this product is in the knowledge. So while System Center Advisor has been ramping up towards this release while being a release candidate, our CSS teams have been adding knowledge in the form of alerts each month based on actual customer experiences.

Here is an example. Several months ago, one of our engineers in CSS pointed out to me that we have seen a trend where our customers would contact us and report the following error in Management Studio when trying to expand the Databases Node:

After some investigation, we found out the customer had disabled the ‘guest’ user in the msdb database. As it turns out, Management Studio requires the login connecting to have “CONNECT” access to msdb. If the login is not mapped directly to a user in msdb (usually not the case), it needs to use guest. If guest is disabled, then any access to msdb (Ex. use msdb) would result in this error.

We looked into this further and found out this was not a one time occurrence. Several customers had reported this problem to us and from other sources (Microsoft Connect, …). Why would so many people hit this? We asked one customer and they said they were simply following guidelines established in our documentation as outlined at:

http://msdn.microsoft.com/en-us/library/ff848752.aspx

In this part of the docs, the tip for the guest account says:

While the server itself doesn’t allow you to disable guest in master or tempdb (You get the error Msg 15182 Cannot disable access to the guest user in master or tempdb. if you try), for msdb we unfortunately allow this. While the server will run in this scenario, the use of tools like Management Studio will have problems. So customers were just doing what we told them.

While changing the product to prevent this is the right thing to do, what do we do in the meantime to warn customers this could be a problem? We can certainly publish articles and documentation for them to discover but why not automate the check? When I heard about this issue, I thought What a perfect rule for System Center Advisor

So as part of our monthly update to rules for SQL Server we introduced this check on your system. Should you disable guest for msdb you will get an alert which will direct you to the following technical article written by CSS talking about the problem and how to solve it:

http://support.microsoft.com/kb/2539091

This is the power of what Advisor can bring to you. We continue to update our rules each month as we discover trends about common customer problems or customer problems that may be difficult to detect or find by just searching the web.

This journey is far from over. While we to continue to create new rules for SQL Server, both the System Center Advisor team and CSS will continue to enhance this product including:

  • Rules for other server products (there are already several rules for the core Windows OS, Active Directory, and Hyper-V)
  • Provide support for SQL Server 2012 (and new rules unique to that version)
  • Expand features and capabilities of the System Center Advisor software and portal

System Center Advisor is available in 26 countries and comes at no charge to customers that have a Software Assurance agreement. Don’t have Software Assurance but want to try it out for free? Install it from the System Center Advisor site and choose the 60 day free trial option. Don’t know if your company has Software Assurance? Talk to the team in your company that handles volume licensing for Microsoft purchases (they typically have access to the Volume Licensing Service Center site).

If you try System Center Advisor and want to provide feedback, there is a Feedback link at the top of the Advisor console. However, I’m always personally interested to hear directly from customer experiences good and bad on how this product works for you. So feel free to comment on this post on your experiences.

Bob Ward
Microsoft

 

Categories: SQL Server MS Blogs

How It Works: sys.dm_tran_session_transactions

SQL Server SQL CSS - Tue, 01/24/2012 - 09:47

For some reason I have been looking at DMV output closely the last couple of weeks.    I just blogged about the pending I/O requests and now I have a behavior to outline for dm_tran_session_transactions for todays blog.

select * from sys.dm_tran_session_transactions

The scenario I was looking at is as follows.

Server A   Server B Broker calls activation procedure     Begin Tran (local)     Linked Server Query Transaction Promoted to DTC Transaction Imported from Server A xp_cmdshell Looped back and blocked
Separate Transaction Blocking
  • The process is based on SQL Service Broker. 
  • The service broker session on Server A is 20s
  • It starts a local transaction and eventually performs a linked server query.  
  • This causes the transaction to be promoted to a DTC transaction.  

At this point in time the *active transaction* DMVs on both Server A and B show the enlisted UOW of the DTC transaction.  That is, except the sys.dm_tran_session_transactions on server A.

After some digging I uncovered that the dm_tran_session_transactions DMV only outputs rows for sessions (s) that are NOT system level sessions.   Since the broker activity is handled on a system session the DMV will not materialize a row for Session 20s on Server A in this example.    Instead you have to use the additional  *active transaction* tables to track the UOW across this system.

Note: I am able to use any transaction (local or DTC) as part of SQL Service Broker activation which will not show rows in the session transactions DMV because it is considered a system session.

Bob Dorr - Principal SQL Server Escalation Engineer

Categories: SQL Server MS Blogs

Team Foundation Server (TFS) in the Cloud - My Experience So Far

Bucky Woody - Tue, 01/24/2012 - 06:45

I recently joined a software development project that involves not only myself and other internal Microsoft employees, but a partner and a customer as well. We are building a hybrid solution that uses assets on premises as well as Windows Azure for processing. When we put the team together we picked a methodology (Agile) for the project (we use multiple methodologies at Microsoft - whatever the project needs) and then we started talking about Source Control.

We’re all comfortable with various tools for check-in-check-out, branching, and so on. We have all used GIT, SVN, and TFS. Some of us have even used Source Safe in past, but that’s another post. Each company has a full set of Source Control systems in place. But using each other’s systems requires logins, firewalls and the like - so we decided to use the TFS Service Preview to run the entire project from “the cloud”. Here are my experiences with that.

The process was really simple. In fact, we talked about using the cloud TFS in the first SCRUM, and the team was working from the Work Items list that afternoon. The original account login provides a web interface to allow people to join the team. Each of us happened to have a Live.Com address, so we just invited those addresses to join and they got a link, like this:

projectname.tfspreview.com

I’m using Visual Studio, and it’s a requirement for TFS preview to have SP1 installed, and this patch: KB2581206

From there, I opened Visual Studio and navigated from the main menu to Team and then Connect to Team Foundation Server. I’m given this menu:

Selecting port 443 and HTTPS (for security) and then ensuring the lower link has the “tfs” appended as the location, I opened the project.

(This VSTS screenshot is of a project I did in my University of Washington class I teach - I never show client code or names in a blog post)

From there it’s a normal set of operations. Right now the preview doesn’t have some things I’d really like, such as an automated build or some of the testing tools, but you can read this blog entry to learn more about the entire sign-up process, and what the team has planned.

Each day I log in to the project, and I’m given this new sign-in option:

I click the option, and I open the environment, hit My Work Items query, and get to work. All in all, a seamless - although basic - experience. The speed at which we could set up and work on a project was really sweet. It’s remarkable how un-remarkable this is - I just do my work each day, everything is running and backed up in the cloud. I think that’s the point.

Categories: SQL Server MS Blogs

How It Works: sys.dm_io_pending_io_requests

SQL Server SQL CSS - Mon, 01/23/2012 - 13:28

I have been working an issue where the DMV was returning io_pending_ms_ticks incorrectly.  The following output is an example of ~164 hour delay. Unlikely to occur without any other side effects noted.

select * from sys.dm_io_pending_io_requests












In the process of this investigation I uncovered several details that I found helpful to share.

Full Text and File Streams
There is a small set of I/Os that could show a large io_pending_ms_ticks value but io_pending should = 1.    There are administrator interfaces for both the full text and file stream features.   Think if them a bit like a new TDS connection to the server.   When full text or a file stream request arrives the request needs to be processed.   These requests are simply waiting on the arrival of a new request from the respective features.    They seldom show up on this list and the file handle will not map to any of the handles exposed in the DMF - virtual_file_stats.

io_pending is the key
The io_pending column indicates 1 if the I/O is still pending within the kernel.   A value of 0 indicates the I/O only needs to be processed by the local SOS scheduler.   In this case we are not getting any I/O delay warnings, performance monitor is not showing I/O issues and there are no SOS scheduler issues.  

Dirty Read
After some more digging the issue is a DIRTY READ.  SQL Server maintains a set of I/O request structures (request dispenser).  When the I/O completes the request structure is returned to the dispenser and can be reused for another I/O.    The DMV needs to materialize the list without causing scheduler issues on the system so it is designed to perform (NO LOCK) dirty reads of the I/O list. This is where your system may return you incorrect io_pending_ms_ticks value when the io_pending flag = 0.

Take the diagram below as an example.  It is possible that the DMV is being materialized on one scheduler but the I/O is being completed on the I/Os owning scheduler.  (Each scheduler has its own I/O list).

The DMV query can get access to the I/O request structure but it does not hold any locks on the structure.  If it did it could lead to unwanted blocking.   Take for instance that the application fetching the DMV output stalls.  If a lock is held on the I/O list the lock can't be released until the entire list is properly traversed.  Because the client is not fetching results it could lead to the I/O being stalled on the scheduler for a long period of time.

To avoid this the list is run in a dirty fashion.   However, this means the IO request could finish I/O completion in parallel with the reading of the structure data.   The SQL Server protects the DMV query but does not indicate to the DMV user that no lock data movement has occurred.    Instead it is possible that the output can become skewed, as shown in previous output.   The IO data structure can be re-assigned between the time the DMV query starts to read the information and the time all columns are produced, leading to unexpected output.

SQL Server 2012 (Denali) updates this behavior by adding a signature to the I/O request.  This allows SQL to maintain the dirty read capability while also identifing I/O requests that fall into this category.

When you see large pending_io_ms_ticks consider the io_pending flag and additional scheduler warning information (178** messages) in the error log.

Bob Dorr - Principal SQL Server Escalation Engineer

Categories: SQL Server MS Blogs

Assigning SQL Server, SQL Agent to a Processor Group (OOM, Hang, Performance Counters Always Zero for Buffer Pool, …)

SQL Server SQL CSS - Mon, 01/23/2012 - 13:00

Suresh brought to my attention that we have been getting questions as to why SQL Server starts on group 1 and then group 2 and it is not predictable?  Then Tejas brought up another issues and since I worked on this way back before we released SQL 2008 R2 I went back to my notes to pull up some details that I thought might be helpful.

The answer is that you need a SQL Server that is Group Aware to use more than one group worth of CPUs in Windows.  However, if you have an older version or a SKU that does not support enough CPUs to span groups the default is for Windows to start the service on any group.  

SQL Server 2008 R2 uses the group aware so it will use new APIs and establish proper use of the entire system

Legacy XPROC / COM Object / Linked Server using a CPU based scheme.  

Some designs of legacy components may not be group aware or safe.  In the following example, if originally loaded on Group 2 the initialization would see 60 CPUs and create 60 partitions for a local memory manager which might work perfectly with proper synchronization.  However, if the original initialization of the partitions occurs on Group 1 it will only create 40 partitions and access from CPUs 41 ... 60 on Group 2 may fail as they don't exist.   (SQL Server does not have any of these components.)


Great Legacy Reference: http://msdn.microsoft.com/en-us/windows/hardware/gg463349.aspx 

Windows allows you to configure node to group assignments

By configuring Windows to see the same number of CPUs and Nodes within each processor group the SQL Server, SQL Agent, … services can start on any processor group and they will see the same amount of resources.


Note: There is a hotfix for Windows 2008 R2 that automatically balances the processor assignments are corrects the issue for most systems: http://support.microsoft.com/kb/2510206

 

The following knowledge base article outlines how to assign nodes to the processor groups so you have control over the group assignments. How to manually configure K-Group assignment on multiprocessor machines: http://support.microsoft.com/kb/2506384

You can use Windows Task Manager to see and set the affinity for the process

Start with the Windows Task Manager | Process Tab.

Select a process, Right Mouse | Set Affinity -- the following dialog is presented showing you the Processor Groups (K-Group), Nodes and CPUs on the machine.

 

The Problem

 

Even after all this work there is an issue using older versions of SQL Server (SQL 2008 pre-SP3 and SQL 2008 R2 pre-SP2) which requires SQL Server to startup in the group that holds physical NUMA Node = 0 and at least 1 CPU assigned from physical node 0 to the instance, or lazywriter activity is not started property.   This can lead to buffer pool sizing, performance counters not getting updated and other stall and hang like behaviors (OOM, Latch Timeouts, …) of the SQL Server process.

 

As a general rule is best to start them in group 0 (usually contains physical NUMA node = 0 but you must confirm this.)

 

You can enforce this on the system by setting the ImageFileExecutionOptions - NodeOptions value and by making sure the SQL Server affinity mask contains at least 1 CPU in physical NUMA node 0.   However, this forces all instances of SQL Server to start on the same node and all instances have to share physical NUMA node = 0.

 

Excerpt from the Windows 2008 R2 Release Notes

"Allocation of child processes among the ideal nodes of Non-Uniform Memory Access (NUMA) nodes is not efficient, which results in performance degradation, increased latency, and cache misses, depending on the affected processes.

To correct this, edit the registry to change the inheritance of ideal NUMA nodes so that generated child processes are assigned the same IdealNode as their parent process. This setting is not system wide, but per-process.

To change the NUMA inheritance

  1. Open Registry Editor (Regedit.exe).

  2. Add the following registry key:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\<process_name>.exe

    where process_name is the name of the process you want to change the inheritance behavior for.

    Assign a DWORD value to this key with the name NodeOptions and data 0x00001000.

    Close Registry Editor and restart the computer."

Windows does not currently provide a per service, instance option for starting the service in a specific group.   To avoid starting multiple instances of SQL Server on the same processor group use SQL Server 2008 R2 or newer version and establish the proper processor affinity.

 

SQL Server 2005
SQL Server 2008 Requires SQL Server Affinity Mask to use a CPU in physical NUMA Node = 0 as presented by the operating system. Use NodeOptions to start SQL Server instance in proper group that presents CPUs from physical NUMA Node = 0

- or -

Apply Windows SMC.exe QFE to assign startup group to a specific group containing physical NUMA node = 0 SQL Server 2008 SP3 No longer requires use of CPU in physical NUMA Node = 0 Requires Windows SMC.exe QFE to assign startup to a specific group or you can allow random group startup behavior. SQL Server 2008 R2 No longer requires CPU in physical NUMA Node = 0 Is group aware so it does not need the NodeOptions or SCM.exe QFE.

 


Other Helpful References

Bob Dorr - Principal SQL Server Escalation Engineer

Categories: SQL Server MS Blogs

Bug-Out Bags and Cloud Architecture Considerations

Bucky Woody - Fri, 01/20/2012 - 11:00

I served in the U.S. Military for a while, and as part of my training we had to maintain a “Bug-Out Bag”, which was a large duffle-bag full of certain items that we could live on/fight with in an emergency. I’ve carried the spirit of that idea forward with me into civilian life, in Florida and especially here in the Pacific Northwest.

In Florida we dealt with the threat of hurricanes - I went through four of those in one year that hit my area. You’re without power, it floods quickly, and it gets wicked hot. You roof might be gone, whatever. Here in the Pacific Northwest, I live near one of the largest volcano's in the world, we have flooding, and recently we were hit with an ice-storm. Now I’ve lived all over the world, from Alaska to North Dakota and even near the Kamchatka Peninsula in Russia, and I can handle the snow. But ice - that’s a toughie no matter where you live. We had so much that it split my little pine tree in front of the house in half.

 

We lost power - although I think the folks at Puget Sound Energy did an amazing job at getting us back up in less than 24 hours, but we weren’t worried anyway. That bug-out bag mentality carried forward to a “second pantry” we keep in the garage.

We have a large plastic box (that will fit in the back of the Subaru) with dried goods like pasta, and canned goods and even a little cook stove. We have 25 gallons of clean water in Jerry-Cans. We have batteries, candles and matches. And we have flashlights around every door. We use supplies from the “pantry” to fill our house pantry, and then refill the emergency one from the grocery store. That way everything is fresh, rotated, and we can “bug-out” here at home or on the road.

So what does this have to do with Distributed Computing Architectures?

It’s the thought process. In both the military and civilian life, I’ve done a few things:

  1. Sat down and thought carefully about exactly what I need. Did I include a can-opener? A small shovel to dig out of whatever I got stuck in? Then I weed out what I *really* don’t need.
  2. Put those things into a small, manageable container.
  3. Tried them - even when (especially when) I didn’t have an emergency
  4. Tweaked the process to see what I could do better.

Have you done this when you moved an app to the “cloud”? Each of these has a computing parallel - do you know what you would do if you couldn’t access the Distributed Computing Environment?

I’ve found these thoughts are actually a great place to start - keeps the process simplified from the start, and gives you a sense of assurance when you’re asked if you can recover from an emergency.

Categories: SQL Server MS Blogs

Cloud Computing In Action: How I work with Live Mesh, SkyDrive, and Office Live Workspace

Bucky Woody - Mon, 01/16/2012 - 08:27

Recently I had an tweetversation with a couple of friends on some confusion around two of our products: SkyDrive and Live Mesh. Like most of our software, there’s no single way to do things. That can be a strength or it can cause that confusion. They asked if I would blog how I work with these two products, and what advantages there are to this way of working.

Before I start - this is specific to these two Microsoft products. If you’re a fanboi of another product, that’s great. Awesome. Go for it. You don’t have to use these. There’s no law about it or anything. It’s all good. I use the products you see below because I evaluated lots of them, and these work the best for me - not because I work at Microsoft. But do what makes you happy.

Let’s start with what each of these products do. Live Mesh synchronizes files to various locations. You can create a file on one PC, save it, and then when you fire up another PC that file will be copied from the original location. It’s a mirror of the file, and it exists in both places. You can change the file on the second location, and it will be copied back to the other system, stepping right on top of it.

SkyDrive is a storage system. You can store lots of data in there - larger than most of the other free offerings.

Office Live Workspaces allows you to integrate SkyDrive into your local copy of Microsoft Office, so that you can create, save and edit a document and it will be stored in SkyDrive, and not only that, it will keep a local, synced copy so that you can work offline. But it also has a web-based subset of Microsoft Office. You can create, edit and work with Microsoft Office documents with no software installed at all. From Linux, Mac, a cell phone, whatever has a browser. In fact, we’ve released one of my favorite products, OneNote, in iPhone and iPad flavors, which also buffer down the file as if you had a PC and Microsoft Office.

I rely on these each of these products every day. Here’s how I use them.

I use Live Mesh to copy my entire “Data” directory - files, music, everything - from my home “server” to my work and other systems. Since SkyDrive has a limit, I only send certain files to SkyDrive using Mesh. Just the ones I need access to from non Microsoft-OS devices. Of course, this means I have to leave my home server turned on - which I do anyway since it’s my media server, web server, TV, etc. But everything else I sync to about four computers running Windows.

 

For my OneNote files - quickly becoming the center of my universe - and anything else I want to access from anywhere, all the time, I use SkyDrive and Live Office. Here’s how that works.

If it’s an MP3, Visual Studio Code, a training video or whatever my customer needs, I save it in SkyDrive, mark it public, and send them the link. Done. Any device that can render these can access the file over the web. Since I play in a group on Sunday, I even put my music there (I use MuseScore) and then I can pop the music up on my netbook right at the pulpit and leave the paper at home.

For OneNote or other Microsoft Office documents, I create the document first in Office Live. Once the file is open, and before I even type in it, I click the button marked “Open in OneNote” (or Word, or Excel, or whatever) and from them on I have that file linked in the local system, and a shadow copy for working offline. I can also work with that document from the web using my Linux or Apple OS’s if needed. I recently attended a very Microsoft-hostile environment, so everything from the presentation to the code review for Windows Azure I did from Live Office and my SkyDrive, all from my Linux Laptop.

 

As I’ve always said - use what works. This arrangement gives me the ultimate flexibility. I have my data from Live Mesh synchronized on multiple systems. More than once I’ve deleted something I needed, or changed something. I simply boot up the other device without being connected to the web, copy the old version off, and then let it connect and sync. I also back up my home server once a week to a set of local drives, so I have offsite and onsite backups. I can work from anywhere I have a browser, or someone that will let me borrow a device. I have all my presentations ready to present from any system, even if mine breaks.

Hopefully this helps - and hopefully it inspires you to write a blog entry on how you use your favorite cloud products. There are always multiple ways to do things, and I love to learn.

Categories: SQL Server MS Blogs

Stand-Up Cloud Computing

Bucky Woody - Mon, 01/09/2012 - 07:33

When I was very young, I asked my uncle for career advice. He went silent, thinking for a bit, and then said: “People who work sitting down make more than people who work standing up.” I’m not certain how true that really is, but my career as a technology professional has led me to work in a seated position for most of my life.

Turns out, that’s a bad thing. Although I consider myself pretty fit, eating right, sleeping well and working out several times a week in addition to a morning walk with the family each day, I always look for low-barrier ways to stay healthy. When I first moved to the Pacific Northwest to work for Microsoft, I noticed several folks working at tall desks with no chairs. Some even had treadmills. I chalked it up to the ethos here; and certainly not something I would do.

But this year that changed. I noticed that my back was a little stiffer when I got done with my 12-13 hour days of work. For the last couple of years, I’ve worked from home, so I don’t attend meetings (at least in person) as often or have to walk very far to do almost any part of my job. I start work around 6 in the morning, and sometimes get so focused that I don’t moved for many hours. I read an article on how bad sitting really is, and after further investigation thought I might give one of those stand-up desks a try.

The research led me to believe that you don’t actually have to use a stand-up desk per-se, you can also use an alternate chair or just get up every so often. But I wanted to try this out, and figured that I would be more likely to take a break and sit every hour than I would to remember to stand every hour.

The Before

My office desk is fairly typical, but I do have a decent office chair. That’s after going through probably six or seven chairs in the last few years. I have good lighting, a speakerphone, a web cam and two monitors. I also have the typical flotsam and jetsam of desk clutter, although I’m neater than some.

 

This arrangement has suited me well since I’ve been working at home. I had something similar in an office environment, although I didn’t always have the option of a decent chair. I didn’t go through the trouble of bringing one of my own in; I just put up with whatever I got, or could “appropriate” from an empty office or conference room.

The Build

My criteria were fairly simple: the experiment had to cost less than 100.00, and be at the proper height and size to hold my keyboard, trackball, phone and monitors so that I could type with my elbows staying at a 90 degree angle.

After researching standing desks, 100.00 was going to be impossible, not even for a used one. I visited several thrift shops in the area (I do that a lot anyway to donate and to buy) and didn’t find anything that worked. Of course, when you’re faced with finding cheap furniture, you naturally turn to the most amazing store on the planet.

Before I left, I measured the top area and height of my desk, and wrote down acceptable measurements based on how I high I stood, the stuff I needed the top to hold, and the distance I needed for my typing to be done at the right height. Measurements in hand, I headed to the store.

I found a coffee table - a really cheap one (19.00) called (oddly) LACK and brought it home to begin the surreal process of assembling something bought at Ikea.

 

Happily, this was REALLY simple. Four lag bolts hold the legs on, and eight screws punched into the wood (or at least wood-like) to attach the shelf. My original thought was that I would move the shelf up higher than the Ikea instructions, and then use the pull-out tray from my desk to put the keyboard and trackball on. However, that was not to be. On investigation I found that the tray was not hung underneath the desk, but attached at the sides. That meant I had to either buy another tray, or place the keyboard on top, necessitating standing two inches higher.

The After

Researching trays, I found they were terribly expensive. These things used to be everywhere, so I was surprised that they aren’t as easy to get as they once were. Off to the thrift store to see what they had. I found an older tray, but it looked flimsy. I then found a child’s plastic picnic table. The plastic was strong, and 1.5 inches thick. I figured I needed some padding to stand on anyway, so I bought the table, pulled off the legs, and wrapped some padding in an older rug. This brought the total cost of the build to 25.00 and 2.00 for an espresso and a cinnamon bun at Ikea (I would be burning these calories off with my new desk, after all).

 

I re-routed all my cables, and everything fit correctly.

  

My Early Conclusions

The first day was easy. I thought - well I should have done this a LONG time ago! My back wasn’t that sore, and I didn’t feel that tired.

Then I woke up the next morning. My feet were sore, although not terribly. The second day, I had to sit down each hour. Not just wanted to sit down - needed to. I play in a group at Church on Sunday, so I put my guitar in the office and spent 5 minutes each hour (roughly - sometimes I have calls that are longer than that) and practice a little sitting down. That helped a lot.

I’ve now been at the desk for four days, and I don’t need the breaks as often. I also find I need to shift around a lot, which of course burns even more calories and is better for me. I honestly think the treadmill desk might be easier than a standing one. We’ll see if I go that far someday.

The verdict so far? Glad I’ve done this. If it doesn’t work out, I’ll just re-purpose the coffee table and go back to sitting - although I’m pretty stubborn and will probably stick with this for a while. I’ll let you know if I change back, and why.

My executive assistant hasn’t changed her office arrangement at all. She still keeps her (Ikea) chair just like she’s had it since she started working with me, and dutifully stays at her workstation for the entire 12-13 hours each day. We do, however, take our lunchtime walk still. She burns her calories that way, and thinks it’s better than just standing around all the time.

Categories: SQL Server MS Blogs

Valentine’s Day and Your Career

Bucky Woody - Tue, 01/03/2012 - 07:53

The new year has begun. It’s traditional to make “Resolutions” at this time, but as I’ve mentioned before, I don’t do that. I make goals instead. I like things to be measurable, and I hold myself accountable to those goals - some of which deal with my professional life.

But you might not buy into all that. Perhaps you’re the kind of person who doesn’t buy Valentine’s day cards, or take your significant other out for dinner on Valentine’s day. After all, it’s a manufactured, made up holiday from the greeting-card companies, right? Somebody just decided to come up with a day to make you do something you don’t normally do.

Here’s a tip: do it anyway. Buy the flowers. Jump into the hype. Yes, it’s a made-up holiday. Yes, they’re making money off of you. But take that person out for the nicest dinner you can find. Treating someone you love in a special way on a periodic date is shown to increase the bond in a relationship, simply because it’s a ritual date that others keep. The ritual is the magic.

What does this have to do with New Year’s, or your career? Everything. Not to burst a bubble here, but the universe is not aware of human timekeeping mechanisms. The New Year is just as artificial as Valentine’s day. In fact, many other cultures don’t even count the 1st of January as the New Year. But it’s OK - just like Valentine’s day, you can use the “start of the new year” as a time to focus on something you need to do.

It’s pretty simple to do this - but of course simple != easy. Goals need to be realistic - so sit down sometime this week, and follow this process:

  1. Write down where you want to be in a year in your career. Make it specific. An award, a position, a company, a raise. Write it down.
  2. Write down a few books you want to read that will help you get there. Blog about these books.
  3. Write down the people you need to talk to, inside your company and out. Send an invite out to these people to chat. Do that this week.
  4. Write down the things you need to accomplish for that goal in your job.
  5. Tell others you are doing these things, and what you expect.
  6. Implement your plan.
  7. Review your plan and adjust as needed each month.

Yes, the “new year” is artificial, like Valentine’s day. So what. Use it to get where you need to go.

Happy New Career.  

Categories: SQL Server MS Blogs

Book Review (Book 7) - Think Stats

Bucky Woody - Fri, 12/30/2011 - 07:11

This is a continuation of the books I challenged myself to read to help my career - one a month, for year. You can read my first book review here. The book I chose for December 2011 was: Think Stats, by Allen B. Downey

Why I chose this Book:

I originally chose another book for this month, but changed to this one after a difference in focus (sort of) in my technical career. That brings up a couple of interesting points right away. The first is that it’s OK to change a list - remember that the purpose of reading these books is to gain information that gets you closer to your professional goals. When you develop your list, you have a certain amount of knowledge, and as you read more, experience more, and are exposed to more, you get different information. When that happens, adapt.

The second point is that your goal itself may change. I am focusing on “Big Data” this year and with the changes we’ve made in Windows and SQL Azure at Microsoft, this fits neatly with my professional goals personally and the company I work for. Actually, my goals in technology haven’t changed in the 27+ years I’ve worked in IT, in roles from electronics, programming, consulting, management, architect and in my current technical role here at Microsoft. I think that it has always been about data - everything in IT is an interface to data. And I have always wanted to be at the center of that. Data Science involves not just the sourcing, administration and movement of data, but in applying scientific (with an emphasis on mathematical) disciplines to get at the meaning the situation needs.

So that brings me to this choice. My friend Jeremiah Peschka found this resource for a role I am VERY interested in - the “Data Scientist”. It’s a combination of high-end mathematics, Data Analysis and Big Data. The resource is a series of books from O’Reilly for that very title. You can find that here.

Personally, I find the grouping of books a little cobbled together. They are all fine books, but I’m not certain how they lead you through the series of knowledge required for the topic, but that’s a post for another day. Within that series of books is the one I’m reviewing today. I started (since there is no implied order in the books) with the “Data Analysis” book, but it seemed to start in the middle of some topics I needed to research, so I switched to reading this one, and chose it as my December book.

Another note here - December is a tough month. Since so many people take vacation time during this month, most of my clients try to get as much work in before the Holidays as possible. Since they are all doing that at once, it makes for a lot of overtime. Also, I travel to see family, which of course puts me out of pocket for a while myself. So staying on track with the books - especially one that makes heavy use of computing, math and focus is hard. So it’s tough to maintain your goals all of the time - but keeping in mind why you do this is the important thing. It will keep you on track.

What I learned:

This book focuses more on what the title says - it’s more about being mindful of the way you use statistics than the statistics themselves. It’s assumed you know not only the basics of statistics (I used these free lessons as a refresher, along with some of my old stats books) but how they are used.The author doesn’t stop to explain a lot of stats he uses, but periodically he does show why a given formula works the way it does. This is very useful, and helps with understanding the point of using one method over another. He also does a great job of using statistics to verify other statistics.

Although it should be obvious, the meaning of the data is essential. We think about this when we deal with the result of data processing, but not necessarily when we work with the sources. For instance - as the author explained some central tendency, smoothing and so on using statistical methods, he introduced some numbers and asks you to guess the central number from the set. Dutifully you work out the answer, but in time he reveals that it’s a series of numbers on a die - which of course can only be whole numbers. The point is that you’re so focused on getting the right answer, you don’t define what the real problem is first.

Another great tool  - and a fascinating study that I need to look into further - is the fact that you can often make at least educated inferences into data you might not imagine. For instance, he talks about the example of a series of train cars, numbered sequentially. You see a train car numbered “60” - can you guess with any certainty how many train cars the company has? Fascinating stuff.

He includes a glossary at the end of each chapter. I found this a great approach for summarizing the information in one place, and really helpful in making sure I understood everything before moving on. I didn’t always, so I had to re-read parts of the book and freshen up my stats knowledge along the way as well.

He uses Python as the language of choice - which I found a bit unusual. Most of the stats profession uses something more like the R language, which I’ve also started learning, and one of the other books in this series includes R as a primary subject. Because the author uses Python, he includes references to a series of libraries you add into it to work through the examples. Python certainly is a Data Scientist’s tool, just normally not for statistics. The author uses great examples and assignments, but doesn’t really follow up on those. I guess I’d rather see those introduced earlier in the chapter and explained better. He tends to jump around a bit, and his references are to Wikipedia, which isn’t always as reliable or thorough as it can be. But these are small quibbles. It’s a good book, and a I learned a lot reading it. In fact, I have lots of concepts to unpack based on what I read.

Categories: SQL Server MS Blogs

A faster CHECKDB – Part I

SQL Server SQL CSS - Tue, 12/20/2011 - 08:00

Earlier this year I travelled to Japan having the opportunity to visit some of our customers and prospective customers. One feedback I received loud and clear was poor performance when using DBCC CHECKDB as customers have moved into the TB range of databases on a regular basis. I had certainly heard this feedback before from other CSS engineers but not the extent of complaints I heard while in Japan. I had never really investigated these claims of performance issues because I suppose I fell under the same feelings I had and seen for years of “CHECKDB takes as long as it takes”.

On that trip I met Cameron Gardiner from the SQL Customer Advisory Team (SQL CAT). Cameron is an expert with SAP systems and focuses his efforts at Microsoft on SAP running on SQL Server. Cameron spent tine to explain what a problem the performance of CHECKDB had become for his largest SAP accounts using SQL Server.

So I went back to the US determined to investigate the cause and possible solutions for this. The result of this work is now available in the latest cumulative update for SQL Server 2008 R2 customers. You can read the basics of how to apply these two trace flags with this update in this article. A SQL Server 2008 version of these changes will also be available early next year.

I’m on vacation for the rest of the year, but when I come back, I’ll create another blog post (Part II) with the details behind these changes, how we did this, and how this enhancement might help you when using DBCC CHECKDB.

Bob Ward
Microsoft

 

 

 

 

Categories: SQL Server MS Blogs

Windows Azure Storage (WAS) Internals - Achieving Consistency

Bucky Woody - Tue, 12/13/2011 - 10:05

Windows Azure Storage has three primary components - a Queue, a Binary Large Object (BLOB) store (two types of these), and Table Storage.

Storage of data on-premises is fairly well understood - but there components of it that you may not consider. When you move to a distributed architecture, certain factors should be taken into account, such as consistency. Consistency means that when you store a datum it should be available in the same bit format across the calling mechanism. In other words, if you store a picture with a certain name, whenever you call that name that particular picture should show up. That might sound obvious - but when you begin to scale horizontally, it’s a big consideration. Systems are spread out over multiple physical racks, which are further separated into separate “fault domains” each with its own power, networking and so on, and in Windows Azure, the storage is replicated to ensure high-availability.

Some “cloud” systems relax the consistency target to allow for the highest speed throughput. This might allow inconsistent reads, meaning that the datum recorded in the naming system would be available yet, or that it might allow an older version of the datum to be read. In Windows Azure, we took the position that the consistency is of the highest importance. We achieved this through constructs such as the Location Service (LS), Stream, Partition and Front-End layers, and separate replication engines. Of key importance in a system that allows high consistency is in the naming and object access protocols - in fact, these turn out to be some of the most pivotal.

Windows Azure Storage has a complex arrangement to ensure this high consistency. You can read some very deep internals here.  And a video of the talk held at an ACM conference is here.

Categories: SQL Server MS Blogs

Exporting via HTML instead of MHTML

SQL Server SQL CSS - Fri, 12/09/2011 - 06:22

There was a question on Twitter about how to display a report in HTML instead of MHTML due to some browser issues.  Based on the fact that it was MHTML, I’m assuming we are talking about exporting a report as the default report view is HTML.  First off, if we look at our export options for a report, we see the following:

HTML isn’t an option.  One of the reasons or this is because if you export with HTML, and have items in your report such as images, they wouldn’t be included with the export.  With MHTML, the binary of the image can be included and will be displayed properly.  So, this is the default export option for HTML to make sure we get everything, and the report looks consistent with what is displayed for an on demand report.

That being said, you can change it.  This is done within the rsreportserver.config file.  Within this file we include the different renderers that Reporting Services will use.

<Render>
    <Extension Name="XML" Type="Microsoft.ReportingServices.Rendering.DataRenderer.XmlDataReport,Microsoft.ReportingServices.DataRendering"/>
    <Extension Name="NULL" Type="Microsoft.ReportingServices.Rendering.NullRenderer.NullReport,Microsoft.ReportingServices.NullRendering" Visible="false"/>
    <Extension Name="CSV" Type="Microsoft.ReportingServices.Rendering.DataRenderer.CsvReport,Microsoft.ReportingServices.DataRendering"/>
    <Extension Name="ATOM" Type="Microsoft.ReportingServices.Rendering.DataRenderer.AtomDataReport,Microsoft.ReportingServices.DataRendering" Visible="false"/>
    <Extension Name="PDF" Type="Microsoft.ReportingServices.Rendering.ImageRenderer.PDFRenderer,Microsoft.ReportingServices.ImageRendering"/>
    <Extension Name="RGDI" Type="Microsoft.ReportingServices.Rendering.ImageRenderer.RGDIRenderer,Microsoft.ReportingServices.ImageRendering" Visible="false"/>
    <Extension Name="HTML4.0" Type="Microsoft.ReportingServices.Rendering.HtmlRenderer.Html40RenderingExtension,Microsoft.ReportingServices.HtmlRendering" Visible="false"/>
    <Extension Name="MHTML" Type="Microsoft.ReportingServices.Rendering.HtmlRenderer.MHtmlRenderingExtension,Microsoft.ReportingServices.HtmlRendering"/>
    <Extension Name="EXCEL" Type="Microsoft.ReportingServices.Rendering.ExcelRenderer.ExcelRenderer,Microsoft.ReportingServices.ExcelRendering"/>
    <Extension Name="RPL" Type="Microsoft.ReportingServices.Rendering.RPLRendering.RPLRenderer,Microsoft.ReportingServices.RPLRendering" Visible="false" LogAllExecutionRequests="false"/>
    <Extension Name="IMAGE" Type="Microsoft.ReportingServices.Rendering.ImageRenderer.ImageRenderer,Microsoft.ReportingServices.ImageRendering"/>
    <Extension Name="WORD" Type="Microsoft.ReportingServices.Rendering.WordRenderer.WordDocumentRenderer,Microsoft.ReportingServices.WordRendering"/>
</Render>

One of the properties that you can see is the Visible property.  Some of these are set to false by default.  This actually lets Reporting Services know if this should be a visible export option.  One of the items is for the HTML4.0 renderer.

<Extension Name="HTML4.0" Type="Microsoft.ReportingServices.Rendering.HtmlRenderer.Html40RenderingExtension,Microsoft.ReportingServices.HtmlRendering" Visible="false"/>

By setting visible to true for this item, it then appears as an export option.

This should work good for a subscription operation, but if you just hit export from the browser, it will just pop it up in a new tab as HTML is an accepted format.  So, you won’t get the prompt to download. From a browser perspective, you can also get a clean HTML version of the page via URL access:

http://localhost/ReportServer?%2fHelloWorld&rs:Command=Render&rs:Format=HTML4.0&rc:toolbar=false

In this example, you don’t even need the Format parameter as HTML4.0 is the default renderer.  You can find more information about the available URL access parameters at this page:

Using URL Access Parameters
http://technet.microsoft.com/en-us/library/ms152835.aspx

 

Adam W. Saxton | Microsoft Escalation Services
http://twitter.com/awsaxton

Categories: SQL Server MS Blogs

How Microsoft helps you NOT break your Windows Azure Application: Storage Services Versioning

Bucky Woody - Tue, 12/06/2011 - 08:42

One of the advantages of using Windows Azure to run your code is that you don’t have to constantly manage upgrades on your platform. While that’s a big advantage indeed, it immediately brings up the question - how do the upgrades happen? Microsoft upgrades the Azure platform in periodic increments, and the components that are affected are documented.

This brings up another question - upgrades mean change, and change can sometimes alter the way you might implement a feature. What if you have taken a dependency on some feature in your code that has been altered by an upgrade? Windows Azure does have an Application Lifecycle Management (ALM) Process, which I’ll reference at the end of this post. But beyond that, there are some features we’ve put into place that will help you manage many of these changes. One of those is being able to set the version of storage features you would like your code to use.

Windows Azure is made up of three main component areas: Computing, Storage and a group of features called the Application Fabric. You can use these components together or separately, depending on what you would like your application to do. In this post I’ll deal with the version control in the storage subsystem - in other posts I’ll explain how to track and in some cases control the versions of the other components you work with.

When you send a request to a Windows Azure resource, you’re actually using a REST call. That’s a three-part call to the system that has a request (called a URI), a header, and a body of code you want to send. So a typical call, such as to a table, might look like this example, which changes the properties of a Blob:

URI:
PUT http://myaccount.table.core.windows.net/?restype=service&comp=properties HTTP/1.1

Header:
x-ms-version: 2011-08-18
x-ms-date: Tue, 30 Aug 2011 04:28:19 GMT
Authorization: SharedKey
myaccount:Z1lTLDwtq5o1UYQluucdsXk6/iB7YxEu0m6VofAEkUE=
Host: myaccount.table.core.windows.net

Body:
<?xml version="1.0" encoding="utf-8"?>
<StorageServiceProperties>
    <Logging>
        <Version>1.0</Version>
        <Delete>true</Delete>
        <Read>false</Read>
        <Write>true</Write>
        <RetentionPolicy>
            <Enabled>true</Enabled>
            <Days>7</Days>
        </RetentionPolicy>
    </Logging>
    <Metrics>
        <Version>1.0</Version>
        <Enabled>true</Enabled>
        <IncludeAPIs>false</IncludeAPIs>
        <RetentionPolicy>
            <Enabled>true</Enabled>
            <Days>7</Days>
        </RetentionPolicy>
    </Metrics>
</StorageServiceProperties>

(Source of this code)

You can see that I’ve highlighted a portion of the header block - that’s where you set the version of the Storage Services you would like to use. You can find a list of the features introduced in each version here. It’s not a requirement of adding that element to the header, but it’s best practices to do so.

You don’t have to use REST calls directly, however. It’s more common to use the API in the Software Development Kit to just change the property in your IDE environment - the setting you’re looking for there is the Set Storage Service Properties call.

Interestingly, rather than a breaking change you might run into an unexpected behavior if you are not aware of these parameters. In some code I recently reviewed a newer feature from the storage system failed when it was called. On inspection I found that the developer had used an older codeblock from a previous version of the storage system - he was not aware you can set the version of storage in the call. We changed the header to the latest version, and everything worked as expected.

References:

The Storage Services Versioning and the changes for each version:

http://msdn.microsoft.com/en-us/library/windowsazure/dd894041.aspx

Windows Azure Application Lifecycle Management:

http://msdn.microsoft.com/en-us/library/ff803362.aspx

http://channel9.msdn.com/posts/Windows-Azure-Jump-Start-03-Windows-Azure-Lifecycle-Part-1

http://channel9.msdn.com/Events/TechEd/Australia/Tech-Ed-Australia-2011/COS201 

 

Categories: SQL Server MS Blogs

Windows Azure Best Practices: Affinity Groups

Bucky Woody - Mon, 11/28/2011 - 22:43

When you create a Windows Azure application, you’ll pick a subscription to put it under. This is a billing container - underneath that, you’ll deploy a Hosted Service. That holds the Web and Worker Roles that you’ll deploy for your applications. Along side that, you use the Storage Account to create storage for the application. (In some cases, you might choose to use only storage or Roles - the info here applies anyway)

As you are setting up your environment, you’re asked to pick a “region” where your application will run.

If you choose a Region, you’ll be asked where to put the Roles. You’re given choices like Asia, North America and so on. This is where the hardware that physically runs your code lives. We have lots of fault domains, power considerations and so on to keep that set of datacenters running, but keep in mind that this is where the application lives.

You also get this selection for Storage Accounts. When you make new storage, it’s a best practice to put it where your computing is. This makes the shortest path from the code to the data, and then back out to the user.

One of the selections for the location is “Anywhere U.S.”. This selection might be interpreted to mean that we will bias towards keeping the data and the code together, but that may not be the case. There is a specific abstraction we created for just that purpose: Affinity Groups.

An Affinity Group is simply a name you can use to tie together resources. You can do this in two places - when you’re creating the Hosted Service (shown above) and on it’s own tree item on the left, called “Affinity Groups”. When you select either of those actions, You’re presented with a dialog box that allows you to specify a name, and then the Region that  names ties the resources to. Choose a specific region - not one of the "Anywhere" choices.

Now you can select that Affinity Group just as if it were a Region, and your code and data will stay together. That helps with keeping the performance high.

Official Documentation: http://msdn.microsoft.com/en-us/library/windowsazure/hh531560.aspx

Categories: SQL Server MS Blogs

Book review (Book 6) - Wikinomics

Bucky Woody - Tue, 11/22/2011 - 08:35

This is a continuation of the books I challenged myself to read to help my career - one a month, for year. You can read my first book review here. The book I chose for November 2011 was: Wikinomics: How Mass Collaboration Changes Everything, by Don Tapscott  

Why I chose this Book:

I’ve heard a lot about this book - was one of the “must read” kind of business books (many of which are very “fluffy”) and supposedly deals with collaborating using technology - so I want to see what it says about collaborative efforts and how I can leverage them.

What I learned:

I really disliked this book. I’ve never been a fan of the latest “business book”, and sadly that’s what this felt like to me. A “business book” is what I call a work that has a fairly simple concept to get across, and then proceeds to use various made-up terms, analogies and other mechanisms to fill hundreds of pages doing it.

This perception is at my own – the book is pretty old, and these things go stale quickly. The author’s general point (at least what I took away from it) was: Open Source is good, proprietary is bad. Collaboration is the hallmark of successful companies. In my mind, you can save yourself the trouble of reading this work if you get these two concepts down.

Don’t get me wrong – open source is awesome, and collaboration is a good thing, especially in places where it fits. But it’s not a panacea as the author seems to indicate. For instance, he continuously uses the example of MySpace to show a “2.0” company, which I think means that you can enter text as well as read it on a web page. All well and good. But we all know what happened to MySpace, and of course he missed the point entirely about this new web environment: low barriers to entry often mean low barriers to exit.

And the open, collaborative company being the best model – well, I think we all know a certain computer company famous for phones and music that is arguably quite successful, and is probably one of the most closed, non-collaborative (at least with its customers) on the planet. So that sort of takes away that argument.

The reality of business is far more complicated. Collaboration is an amazing tool, and should be leveraged heavily. However, at the end of the day, after you do your research you need to pick a strategy and stick with it. Asking thousands of people to assist you in building your product probably will not work well.

Open Source is great – but some proprietary products are quite functional as well, have a long track record, are well supported, and will probably be upgraded.

Everything has its place, so use what works where it is needed. There is no single answer, sadly.

So did I waste my time reading the book? Did I make a bad choice? Not at all! Reading the opinions and thoughts of others is almost always useful, and it’s important to consider opinions other than your own. If nothing else, thinking through the process either convinces you that you are wrong, or helps you understand better why you are right.

Categories: SQL Server MS Blogs