SQL 2014 Storage Solutions – Azure Blob Storage

SQL Server 2014 supports database files both on-prem and in Azure Blob Storage.  This opens up a series of new possibilities, which will assuredly spark debate.  Here we will dig into a few of the new scenarios and potential use cases. 

SQL Server running in Azure – Files in Azure Blob Storage

The scenario where you have your SQL Server running in Azure is likely the most common, or at least the first scenario that comes to mind for this feature.  Running your database with files in Azure Blob Storage provides you with the kind of dynamic disk allocation that suits agile development and unknown growth patterns.  Handing storage maintenance concerns off to Azure frees IT and development to focus on their applications, which is where most developers would choose to spend their time. 

SQL Server running On-Prem – Files in Azure Blob Storage

Running a database on-prem with files in Azure Blob Storage has the same benefits of managed storage, but also enables additional HA capabilities.  For example, if your server running your database happens to crash, the database files in Azure are accessible from anywhere (potentially) and the database can be attached to a new instance – on-prem or in Azure.  Additionally, keeping all of the data and log files in Azure Blob Storage enables Always-On Availability Groups to transition between on-prem and Azure. 
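
For example, once the container and credential setup described later in this article is in place on the new instance, re-attaching a database whose files live in Blob Storage is a short statement.  This is only a sketch – the storage account, container, and file names match the placeholder examples used later in this article:

CREATE DATABASE [Test_CloudFiles]
ON (FILENAME = N'https://blogdemostorage.blob.core.windows.net/data/Test_CloudFiles.mdf')
FOR ATTACH;
GO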

SQL Server running On-Prem – Hybrid (Data in Azure, Log On-Prem)

This scenario seems to be about tiered data storage.  The setup here is an on-prem database, with an on-prem log, and data in Azure Blob Storage.  The tiered data storage piece comes into play here because there would more than likely also be data on-prem.  The primary reason that there are files in Azure is that they are accessed infrequently and can handle being somewhat latent.  An example of this is a partitioned table, whereby old partitions that are not written to and rarely read are moved into a filegroup in Azure.  The active partition and log remains on-prem to maintain performance of reads and writes. 

One problem with this scenario is that loss of connectivity to the files in Azure will leave the database In Recovery. 
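
As a rough sketch of the tiered setup (again assuming the credential described later in this article is in place, and reusing the placeholder storage account and container from the examples below), the cold partitions would live in a filegroup whose file is a URL:

ALTER DATABASE [Test_HybridFiles] ADD FILEGROUP [ARCHIVE];
GO
ALTER DATABASE [Test_HybridFiles]
ADD FILE
(
	NAME = N'Test_HybridFiles_archive',
	FILENAME = N'https://blogdemostorage.blob.core.windows.net/data/Test_HybridFiles_archive.ndf',
	SIZE = 4096KB,
	FILEGROWTH = 1024KB
)
TO FILEGROUP [ARCHIVE];
GO

Old partitions can then be switched or rebuilt onto the [ARCHIVE] filegroup, while the active partition and the log stay on local disk.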

Creating Database Files in Azure Blob Storage

The setup here is the same as steps 1 – 3 of backing up a SQL Server database to Azure Blob Storage (covered in the backup section later in this article). 

Setting up your storage

Step 1: Create account

Creating an Azure account is quick and easy, and there is currently a 1 month trial available at http://www.windowsazure.com.  Once you request your account, it may take 24 hours for it to be provisioned.  Keep this in mind as you will not be able to begin testing immediately.

Step 2: Add storage

Now that you have your account, you will need to add your storage.  This is quick and easy, with the only steps being to name your storage, pick a location, and choose whether your storage should be geo-redundant or locally redundant.  While backup files can be geo-redundant, data and log files do not currently support this.

[Screenshot: Create Storage Account]

Step 3: Create container

In order to save any files into your storage, you will need to add a container.  To do this, click on the name of your newly created storage.  From here, you can monitor and change the configuration of your storage.  Click on the navigation for “Containers”.  In the “New Container” dialog, pick a name that reflects what files will be stored in the container.  When we set up our container for backups previously, we called our container “sqlbackups”.  For our purposes here, we can just call this “data”.  You also have the choice of access types – generally speaking you will leave this at the default of “Private”. 

Now that you have a container created, you will see that the container has a URL.  This is the URL that you will use to point your database files to this container.  The format of this URL is [storage name].blob.core.windows.net/[container].  While the URL shows http://, https:// will also work. 

[Screenshot: Storage Container - Data]

Creating the database with files in the cloud

Step 1:  Creating a credential

As with backing up a database to Azure Blob Storage, a credential needs to be in place for accessing your cloud resources.  However, this credential uses different inputs, which was not very intuitive to me.  Notes on this credential:

1. The name of the credential must be the URL of the container (you would need 2 credentials if you had a data container and a log container)

2. The IDENTITY must be set to “SHARED ACCESS SIGNATURE”

3. The SECRET is not one of the 2 automatically generated access keys in the storage account, as it would be for creating a backup credential.  Rather, it is generated outside of the Azure portal using the Azure API.  For this, you will either need to code up some C# to generate a secret, or use one of the Storage Explorers to generate a secret.  If you want to write the code, head over here to see how.  Otherwise, here is how to use a storage explorer:

First, you need to get your hands on a storage explorer.  From your storage account quick start guide, click on Storage Explorers.  In this example I am using the Azure Storage Explorer, which is out on CodePlex.

Add your account:

[Screenshot: Storage Explorer - Add Account]

Click on the container you will put your files in (“data” in this case), and then click “Security”.  On the “Shared Access Signature” tab, we will generate a secret to use in the credential.

[Screenshot: Generate Shared Access Signature]

Make sure that you check the box for “write”, otherwise your credential will not have access to create the files in your container.  When you click “Generate Signature”, you will get a string that looks like this:

https://blogdemostorage.blob.core.windows.net/data?st=2014-01-13T03%3A16%3A31Z&se=2020-01-13T04%3A16%3A31Z&sr=c&sp=rw&sig=H6c7exHNUpVMY63bBh4nE5Epve5g6oP1xOkW0ybjQTQ%3D

The portion of this query string from “sr=” to the end is what we will use for the SECRET in the credential.  You can use everything from “st=” to the end instead, but that will put a time-based restriction on the access.

Now that we have generated a new SECRET, let’s create that credential!

USE MASTER
GO
CREATE CREDENTIAL [https://blogdemostorage.blob.core.windows.net/data]
WITH
	IDENTITY = 'SHARED ACCESS SIGNATURE',
	SECRET = 'sr=c&sp=rw&sig=H6c7exHNUpVMY63bBh4nE5Epve5g6oP1xOkW0ybjQTQ%3D'
GO

Step 2:  Create the database with files in Azure Blob Storage

All Cloud files

CREATE DATABASE [Test_CloudFiles]
	CONTAINMENT = NONE
ON PRIMARY
(
	NAME = N'Test_CloudFiles',
	FILENAME = N'https://blogdemostorage.blob.core.windows.net/data/Test_CloudFiles.mdf',
	SIZE = 4096KB,
	FILEGROWTH = 1024KB
)
LOG ON
(
	NAME = N'Test_CloudFiles_log',
	FILENAME = N'https://blogdemostorage.blob.core.windows.net/data/Test_CloudFiles_log.ldf',
	SIZE = 1024KB,
	FILEGROWTH = 10%
)
GO

On-Prem log, cloud data

CREATE DATABASE [Test_HybridFiles]
	CONTAINMENT = NONE
ON PRIMARY
(
	NAME = N'Test_HybridFiles',
	FILENAME = N'https://blogdemostorage.blob.core.windows.net/data/Test_HybridFiles.mdf',
	SIZE = 4096KB,
	FILEGROWTH = 1024KB
)
LOG ON
(
	NAME = N'Test_HybridFiles_log',
	FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\Test_HybridFiles_log.ldf',
	SIZE = 1024KB,
	FILEGROWTH = 10%
)
GO

Testing the performance of the databases

For a quick performance test, I threw together a script that sequentially inserts 250 integers into a table. 

	create table dbo.test
		(
		id int primary key clustered
		)

	declare @x int = 0

	while (@x < 250)
	begin
		insert into dbo.test(id)
		values(@x)

		set @x = @x + 1
	end

I then ran this script against each database.

On-Prem Data / On-Prem Log  – 0 seconds

Cloud Data / On-Prem Log – 0 seconds

Cloud Data / Cloud Log – 18 seconds

Even on my internet connection, I expected this to be pretty snappy in each scenario.  When working with Azure Blob Storage, it does take some usage to get the disks hot.  It appears that a local log brings the performance back in line, even if your only data file is in Azure Blob Storage.  Presumably this is because each of the 250 single-row inserts commits on its own, and every commit waits for the log buffer to flush to the transaction log – a flush that happens locally when the log is on-prem, but has to travel to Blob Storage when the log lives there.  When I have a bit more time, perhaps I will research whether log writer (WRITELOG) waits are occurring when the log is in Blob Storage. 
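
If you want to check for those waits yourself, the wait statistics DMV is enough for a rough look – a quick sketch, comparing the numbers before and after running the insert loop:

SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'WRITELOG';

WRITELOG is the wait type associated with log buffer flushes, so a large jump during the cloud-log test would support the theory above.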

Changing the default data location when SQL Server won’t start

So I suppose default data locations are important?  Updates were applied to a server recently, and someone included a SQL Server patch that upgraded master and msdb.  SQL Server wouldn’t start after a reboot, and it was a mystery as to what had happened.  In Event Viewer, we were presented with the following:

Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

Script level upgrade for database ‘master’ failed because upgrade step ‘sqlagent100_msdb_upgrade.sql’ encountered error 598, state 1, severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.

CREATE FILE encountered operating system error 5(Access is denied.) while attempting to open or create the physical file ‘G:\Logs\temp_MS_AgentSigningCertificate_database_log.LDF’.

CREATE FILE encountered operating system error 5(Access is denied.) while attempting to open or create the physical file ‘G:\Data\temp_MS_AgentSigningCertificate_database.mdf’.

So, G:\Data and G:\Logs were mount points that were set up as default locations for databases.  In order to resolve this issue, we needed to change the default data location to be the location where master was, which was still in the default installation location.  For me, that was:

G:\MSSQL10_50.<InstanceName>\MSSQL\DATA

To do this, we have to dive into the registry (regedit). 

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.<Instance Name>\MSSQLServer

Key: DefaultData – needs to be changed from G:\Data to G:\MSSQL10_50.<Instance Name>\MSSQL\DATA

Key: DefaultLog – needs to be changed from G:\Logs to G:\MSSQL10_50.<Instance Name>\MSSQL\DATA

The “SQLPath” key under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.<Instance Name>\Setup is also set to the location where the installation has taken place: G:\MSSQL10_50.<Instance Name>\MSSQL

After changing these keys, I was able to start the SQL Server service.  It takes a few seconds for the upgrade scripts to then complete.  Then, you have your instance back.  Now to run a query to see which instances have had their default locations for data and logs changed…
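
For the curious, here is one way to do that check.  This is only a sketch – xp_instance_regread is an undocumented procedure, and the SERVERPROPERTY values shown are only available on newer versions (SQL Server 2012 and later):

-- SQL Server 2012+: default locations exposed as server properties
SELECT
	SERVERPROPERTY('InstanceDefaultDataPath') AS DefaultData,
	SERVERPROPERTY('InstanceDefaultLogPath') AS DefaultLog;

-- Older versions: read the same registry keys the instance uses
EXEC master.dbo.xp_instance_regread
	N'HKEY_LOCAL_MACHINE',
	N'Software\Microsoft\MSSQLServer\MSSQLServer',
	N'DefaultData';
EXEC master.dbo.xp_instance_regread
	N'HKEY_LOCAL_MACHINE',
	N'Software\Microsoft\MSSQLServer\MSSQLServer',
	N'DefaultLog';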

Using Azure Blob Storage for SQL Server Backups

With the release of SQL Server 2012 SP1 CU4, backing up databases directly to Azure Blob Storage is presented as an option for managing backups.  This provides a great opportunity to leverage the cloud for off-site, geo-redundant data storage for disaster recovery that is available from any location.  This article is an overview of how to get started using the cloud as a part of your DR plan. 

Setting up your storage

Step 1: Create account

Creating an Azure account is quick and easy, and there is currently a 1 month trial available at http://www.windowsazure.com.  Once you request your account, it may take 24 hours for it to be provisioned.  Keep this in mind as you will not be able to begin testing immediately.

Step 2: Add storage

Now that you have your account, you will need to add your storage.  This is quick and easy, with the only steps being to name your storage, pick a location, and choose whether your storage should be geo-redundant or locally redundant.

[Screenshot: Create Storage Account]

Step 3: Create container

In order to save any files into your storage, you will need to add a container.  To do this, click on the name of your newly created storage.  From here, you can monitor and change the configuration of your storage.  Click on the navigation for “Containers”.  In the “New Container” dialog, pick a name that reflects what files will be stored in the container.  For our purposes here, we can just call this “sqlbackups”.  You also have the choice of access types – generally speaking you will leave this at the default of “Private”. 

Now that you have a container created, you will see that the container has a URL.  This is the URL that you will use to point your backups to this container.  The format of this URL is [storage name].blob.core.windows.net/[container].  While the URL shows http://, https:// will also work. 

[Screenshot: Storage Container]

Step 4: Manage access keys – created by default

When you are in your storage account on the dashboard, you will notice “Manage Access Keys” at the bottom of the screen.  Clicking this will take you to a screen with 2 access keys pre-generated for your storage account.  You will use these to create a credential in the next step.

[Screenshot: Manage Access Keys]

Creating your backup

Step 1: Create credential

In order to perform a backup to the cloud storage, you will need to create a credential on the SQL Server instance that you wish to take backups from.  The identity of the credential is the storage name from Azure, and the secret is one of the access keys from step 4 (Manage access keys) above.

use master
go

CREATE CREDENTIAL BackupCredential WITH IDENTITY = 'blogdemostorage'
,SECRET = 'i/NCk4dWxUce0U+w5kPFT9m9p2UkT5ogWFYdyKEOqRNjLt32DFuu8NO7dMa6oNMqi/EZe2k+F/SuK9EbSkY1AA==' ;

go

Step 2: Backup the db

With your storage account and credential created, all that is left is to issue a backup command.  The biggest difference here is that instead of backing up TO DISK, you are backing up TO URL.  The URL specified in the command is the URL that is associated with the container that you created, with the name of the database backup file appended to it.  Authentication is handled by the credential.  Since you are transmitting this file to the cloud, it is recommended that you perform your backup with compression. 

BACKUP DATABASE AdventureWorks2012
TO URL = 'https://blogdemostorage.blob.core.windows.net/sqlbackups/AdventureWorks2012.bak'
      WITH CREDENTIAL = 'BackupCredential'
     ,COMPRESSION

After a refresh of your container in Azure, your backup file will be there.  Success!
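
If you want to sanity-check the backup without downloading it, you should be able to read the header straight from the container, reusing the same credential – a quick sketch:

RESTORE HEADERONLY
FROM URL = 'https://blogdemostorage.blob.core.windows.net/sqlbackups/AdventureWorks2012.bak'
WITH CREDENTIAL = 'BackupCredential';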

You may choose to perform a backup using the SSMS GUI.  While backup TO URL exists in SQL Server 2012 SP1 CU4 and beyond, the ability to perform a backup in this manner through the GUI does not exist prior to SQL Server 2014.  After choosing URL instead of DISK as the “Back up to:” destination, you are prompted for the file name, credential, and container.  This will automatically generate the URL for backing up the database. 

[Screenshot: Backup to URL GUI]

Restoring a backup from Azure Blob Storage

Restoring a backup from Azure Blob Storage to your SQL Server instance (on-prem or Azure) is just as easy as taking the backup.  Your options are to download the file, or restore from URL.  To download the file, simply navigate in the Azure web portal to the container with your backups, and click “Download”.  You can then restore the file using commands exactly as you would with any other on-prem file.  If you choose to restore from URL, the modifications are the same as the changes when backing up – create the same credential as you did for taking the backup, and in your restore statement specify the URL and the credential.

RESTORE DATABASE AdventureWorks2012
FROM URL = 'https://blogdemostorage.blob.core.windows.net/sqlbackups/AdventureWorks2012.bak'
WITH
	CREDENTIAL = 'BackupCredential';

As with restoring a backup from disk, you can use MOVE to put the files in a different location.
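
For example, a restore that relocates the files might look like this (the logical file names and local paths below are placeholders – confirm the logical names first with RESTORE FILELISTONLY FROM URL):

RESTORE DATABASE AdventureWorks2012
FROM URL = 'https://blogdemostorage.blob.core.windows.net/sqlbackups/AdventureWorks2012.bak'
WITH
	CREDENTIAL = 'BackupCredential',
	MOVE 'AdventureWorks2012_Data' TO 'D:\Data\AdventureWorks2012.mdf',
	MOVE 'AdventureWorks2012_Log' TO 'D:\Logs\AdventureWorks2012_log.ldf';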

Adopting a cloud backup strategy:  Compatibility

If you choose to use Azure Blob Storage for your backup strategy, you may have an environment with instances that are not on SQL Server 2012 SP1 CU4 or beyond.  This matter is resolved using the SQL Server Backup to Azure Tool.  Simply install this tool, point it to a folder, and it manages the transmission of files in that folder to Azure Blob Storage.  This tool also allows you to compress and encrypt backups for transmission, where the version of SQL Server that you are running may not.

SQL Server 2014 – Delayed Transaction Durability

SQL Server 2014 has some great features, with my personal favorite being In-Memory OLTP (Hekaton).  However, in CTP2, a new feature was brought to light which has not received as much attention – Delayed Transaction Durability.  A common misconception is that this feature is only available when using In-Memory OLTP.  In reality, this is just another feature that can be put to use to reduce IO contention – in this case, contention on transaction log writes.  Additionally, this feature may be helpful when your application is experiencing waits due to locks: since delaying log durability results in faster commit times, locks are released faster.  The downside is that you may lose data if the log is not flushed to disk for any reason (you can manually execute a flush if needed, as shown below).
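
That manual flush is exposed as a system stored procedure in SQL Server 2014 – a one-liner, for reference:

-- Harden any delayed-durability log records still sitting in the log buffer
EXEC sys.sp_flush_log;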

Deploying Delayed Durability

Step 1:  The Database

The database first needs to be configured for delayed durability.  It can be disabled for the entire database, forced for the entire database, or left configurable at the transaction level.  This is done with an ALTER DATABASE statement.

ALTER DATABASE [DATABASE NAME] SET DELAYED_DURABILITY = {Disabled | Allowed | Forced}

These settings are fairly self-explanatory, but to spell them out (a concrete example follows the list):

    1. Disabled – this means it can’t be used
    2. Forced – every transaction falls under delayed durability where it can (exceptions include things like distributed transactions, which will always be fully durable)
    3. Allowed – this means that each transaction can control whether it is fully durable or delayed durable
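
For example, to allow transaction-level control on one of the demo databases from earlier (the database name is just a placeholder) and confirm the setting:

ALTER DATABASE [Test_CloudFiles] SET DELAYED_DURABILITY = ALLOWED;
GO
SELECT name, delayed_durability_desc
FROM sys.databases
WHERE name = N'Test_CloudFiles';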

Step 2:  The Transaction

If the database allows transaction durability to be configured (delayed_durability = allowed on the database), we have 2 options.  The first is to configure this in a natively compiled stored procedure (In-Memory OLTP), and the second is in a standard TSQL commit.

In-Memory OLTP (the Books Online example – I will dig into this further in a subsequent blog post)

CREATE PROCEDURE <procedureName> …
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS BEGIN ATOMIC WITH
(
	DELAYED_DURABILITY = ON,
	TRANSACTION ISOLATION LEVEL = SNAPSHOT,
	LANGUAGE = N'English'
)
END

The piece that most people seem to be missing is that you can do this today without using a natively compiled stored procedure or an in-memory table.  For any transaction (in a database that is marked delayed_durability = allowed), the commit statement can control the durability of the transaction.

BEGIN TRANSACTION

COMMIT WITH (DELAYED_DURABILITY = ON)

There are a few caveats to this.  For example, a distributed transaction will override this setting and proceed with delayed_durability = off.  For more on this, check out Books Online.

Summary

  1. Easy to deploy and use
  2. Use only if you can tolerate some data loss
  3. Use if you are commonly waiting on transaction log writes
  4. Use if you experience heavy contention and locks

My experience working with SQL Server 2014 on IO optimization to this point has been fantastic.  Between In-Memory OLTP and delayed durability, many of the typical IO bottlenecks can be alleviated.  Of course, there are trade-offs, but each project can now be dissected and analyzed as to whether these new features can be used to optimize the workload.  Whether it is a web application, an ETL process, or something else, I have already pinpointed a number of processes and applications that are prime candidates for these features.  Give them a try – I am sure you will find them useful as well!

SQL 2012 Copy/Paste to Excel Splits Rows

So you just moved to SQL 2012, excellent news! The very next day, someone is running an ad hoc query and copy/paste from the results grid is adding rows when they paste into Excel! The data hasn’t changed, so what is the cause? The answer is most likely that your data has carriage returns in it, and you just didn’t notice before. There is a behavioral difference in SQL Server 2012 that when you copy data from the grid and paste into Excel, the carriage returns are taken into account and data is split into new rows. In previous versions of SQL Server, the carriage returns are not copied as such. The problem appears to stem from moving SSMS into the Visual Studio shell, which took place in SQL Server 2012.

Example:

declare @t table(col1 nvarchar(20))

insert into @t(col1)
values('test' + char(13) + char(13))

insert into @t(col1)
values('test2' + ' ' + char(13) + char(13) + 'test3')

select *
from @t

When you copy these results from SSMS in SQL 2008, and paste into Excel 2010 or Excel 2013, you will see 2 rows.

[Screenshot: Excel results showing 2 rows]

However, if you do the same using SQL 2012, you will have 6 rows. As you can see, the data has been split by the carriage returns.

[Screenshot: Excel results showing 6 rows]

How do we resolve this issue? No…you don’t have to go back to SQL 2008. For starters, you can clean this data in your query by simply replacing the carriage returns as such:

declare @t table(col1 nvarchar(20))

insert into @t(col1)
values('test' + char(13) + char(13))

insert into @t(col1)
values('test2' + ' ' + char(13) + char(13) + 'test3')

select 
	replace(col1, char(13), ''),
	replace(replace(col1, char(13), ''), char(10), '') --get rid of line feeds too
from @t

This solution produces the following:

[Screenshot: Excel results with carriage returns removed]

The great news here is that your data is actually cleaner than it was in SQL 2008.  Notice that the extra spaces between “test2” and “test3” are now removed.  The carriage returns in SQL 2008 displayed as blank spaces, thus the quality of your data was impacted by them even if they did not split your data.  Remember, this was only cleansing for a single character that you did not want.  At the very least, if you are experiencing this, you should probably check for a line feed next (char(10)) and replace that as well.  You could also make a general-purpose UDF for stripping out all low-ASCII characters, as sketched below.
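
Here is a minimal sketch of such a UDF (the function name is just an example) – it walks the string and keeps only characters at or above ASCII 32:

create function dbo.fn_StripLowAscii (@input nvarchar(max))
returns nvarchar(max)
as
begin
	declare @output nvarchar(max) = ''
	declare @i int = 1

	while (@i <= len(@input))
	begin
		-- keep the character only if it is not a control character (carriage return, line feed, tab, etc.)
		if unicode(substring(@input, @i, 1)) >= 32
			set @output = @output + substring(@input, @i, 1)

		set @i = @i + 1
	end

	return @output
end

You could then select dbo.fn_StripLowAscii(col1) in place of the nested REPLACE() calls above.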

This brings us to our next point – Data Quality. Now that you have your workaround, the next step is to get a project together to clean up your old data, and also to clean your data inputs.

Other options here involve some analysis around the process you are using. Should you be running a query and copying the data into Excel? In some cases, other options may be available. For example, this issue does not exist when you load data into PowerPivot. Perhaps a Business Intelligence Semantic Model or PowerPivot Model would be a better solution moving forward.

Finally, feel free to vote up the suggestion on Connect related to this topic. Having too many rows is a much larger deal for most users than some extra whitespace, thus the manner in which this was handled prior to SQL 2012 seems to be the desired output.

Combining rows into comma delimited string

A frequent task in reporting is to combine rows together into a comma delimited string.  While this can be done in a loop or a cursor, there are other ways to perform this task.  If you are only dealing with a single result, there is a quick way to accomplish this with some shorthand:

/*
set up some data
*/

	declare @t table (id int identity(1,1), data varchar(50))
		insert into @t(data)
		values
			('a'),('b'),('c')

/* use a variable to roll up the data */

	declare @mystring varchar(100) = ''

	select
		@mystring = @mystring + ',' + data
	from @t

	select substring(@mystring, 2, len(@mystring)) --get rid of the first comma using SUBSTRING()
	select stuff(@mystring, 1, 1, '') --get rid of the first comma using STUFF()

The output in this case is “a,b,c”, combining the data from the 3 rows into one comma delimited string. Instead of writing the code to loop through the rows and append to the string, we simply have to remove the initial comma before returning the results. This example shows 2 ways to do that, using SUBSTRING() and using STUFF(). While this is a nice shortcut, this doesn’t handle all of the cases that we will be presented with over time. Specifically, we will again need to employ a loop if we need to group these strings in some manner. As it turns out, there is a method by which you can do this without the loop – use XML and STUFF().

To start, let’s perform the same task as above, this time using XML and STUFF().

	declare @t table (id int identity(1,1), data varchar(50))
		insert into @t(data)
		values
			('a'),('b'),('c')

		select
			stuff
			(
				(
				select
					',' + data
				from @t
				for xml path('')
				)
			, 1, 1, '') as mystring

Our output in this case is the same: “a,b,c”.  Breaking this down, we can see that we are selecting all of the rows from the table variable.  The next step is to append a comma to the beginning of each piece of data.  Then, take this data and roll it up into XML, which will bring the data onto a single row.  The role that STUFF() plays here is the same as in the first example – removing the initial comma. We used STUFF() in this case instead of SUBSTRING() because the data is XML. In order to use SUBSTRING(), we would have first had to convert the XML back to VARCHAR.

Now, we can look at an example where we have multiple members that will each have their own comma delimited string.  For this example, we will call that our “grouping key”.

	declare @t2 table (id int identity(1,1), grouping_key int, data varchar(50))
		insert into @t2(grouping_key, data)
		values
			(1, 'a'),(1, 'b'),(1, 'c'), (2, 'd'), (2, 'e')

	select
		grouping_key,
		stuff
		(
			(
			select
				',' + data
			from @t2
			where
				grouping_key = unique_groups.grouping_key
			for xml path('')
			)
		, 1, 1, '') as mystring
	from
		(
		select
			grouping_key
		from @t2
		) unique_groups
	group by
		unique_groups.grouping_key

What we have done here is to use the XML and STUFF() as an inline function to manipulate the data specific to the row we are returning. As such, the output is not a single value. Rather, it is a table of data with our comma delimited strings broken up into 2 rows by the “grouping key”:

grouping_key mystring
1 a,b,c
2 d,e

One quick note on using XML in this manner – if your data has ampersands in it, you will need to do some replacing to resolve those conflicts (changing “&amp;” back to “&” after the XML conversion), or use the TYPE directive as shown below.
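
If you would rather not hand-replace the entities, the TYPE directive plus the value() method will decode them for you (“&amp;” comes back out as “&”).  A sketch against the same @t2 table variable (run it in the same batch as the example above):

	select
		grouping_key,
		stuff
		(
			(
			select
				',' + data
			from @t2
			where
				grouping_key = unique_groups.grouping_key
			for xml path(''), type
			).value('.', 'varchar(max)')
		, 1, 1, '') as mystring
	from
		(
		select distinct
			grouping_key
		from @t2
		) unique_groups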

Enjoy!

SQL Server 2012 SP1 – MsiInstaller is going crazy!

Woah! I just installed SQL Server 2012 SP1 and MsiInstaller is attempting to reconfigure management studio constantly!

Event Viewer: Windows Installer reconfigured the product. Product Name: SQL Server 2012 Management Studio. Product Version: 11.1.3000.0. Product Language: 1033. Manufacturer: Microsoft Corporation. Reconfiguration success or error status: 0.

If this is happening to you, the good news is that there is a hotfix. I have had this happen on one of my SQL 2012 SP1 VMs, but I haven’t been able to exactly replicate how the issue came about. The issue has more documentation on Connect, and is resolved by requesting SQL Server 2012 SP1 CU2.