SQL 2012 Copy/Paste to Excel Splits Rows

So you just moved to SQL 2012, excellent news! The very next day, someone is running an ad hoc query and copy/paste from the results grid is adding rows when they paste into Excel! The data hasn’t changed, so what is the cause? The answer is most likely that your data has carriage returns in it, and you just didn’t notice before. There is a behavioral difference in SQL Server 2012 that when you copy data from the grid and paste into Excel, the carriage returns are taken into account and data is split into new rows. In previous versions of SQL Server, the carriage returns are not copied as such. The problem appears to stem from moving SSMS into the Visual Studio shell, which took place in SQL Server 2012.

Example:

declare @t table(col1 nvarchar(20))

insert into @t(col1)
values('test' + char(13) + char(13))

insert into @t(col1)
values('test2' + ' ' + char(13) + char(13) + 'test3')

select *
from @t

When you copy these results from SSMS in SQL 2008, and paste into Excel 2010 or Excel 2013, you will see 2 rows.

pic1

However, if you do the same using SQL 2012, you will have 6 rows. As you can see, the data has been split by the carriage returns.

pic2

How do we resolve this issue? No…you don’t have to go back to SQL 2008. For starters, you can clean this data in your query by simply replacing the carriage returns as such:

declare @t table(col1 nvarchar(20))

insert into @t(col1)
values('test' + char(13) + char(13))

insert into @t(col1)
values('test2' + ' ' + char(13) + char(13) + 'test3')

select 
	replace(col1, char(13), ''),
	replace(replace(col1, char(13), ''), char(10), '') --get rid of line feeds too
from @t

This solution produces the following:

pic3

The great news here is that your data is actually cleaner than it was in SQL 2008. Notice that the extra spaces between “test2” and “test3” are now removed. The carriage returns in SQL 2008 displayed as blank spaces, thus the quality of your data was impacted by these even if they did not split your data. Remember, this was only cleansing for a single character that you did not want. At the very least if you are experiencing this, you should probably check for a line feed next (char(10)) and replace that as well. You could also make a general purpose UDF for stripping out all low ascii characters.

This brings us to our next point – Data Quality. Now that you have your workaround, the next step is to get a project together to clean up your old data, and also to clean your data inputs.

Other options here involve some analysis around the process you are using. Should you be running a query and copying the data into Excel? In some cases, other options may be available. For example, this issue does not exist when you load data into PowerPivot. Perhaps a Business Intelligence Semantic Model or PowerPivot Model would be a better solution moving forward.

Finally, feel free to vote up the suggestion on Connect related to this topic. Having too many rows is a much larger deal for most users than some extra whitespace, thus the manner in which this was handled prior to SQL 2012 seems to be the desired output.

Advertisements

SQL Server String Comparison with Trailing White Space

AKA: When “=” does not mean “=”

Sounds a bit silly, but it causes a lot of confusion for developers.  Take the following example in SQL Server:

declare @t nvarchar(20) = 'aaaa   '
declare @t2 nvarchar(20) = 'aaaa'

if(@t=@t2)
	print 'yes'
else
	print 'no'

Are you expecting the output to be “no”?  The output will be “yes”, contrary to what most developers expect.  Often, someone will perform a comparison to remove whitespace and perform a check such as:

if(data!=rtrim(data))

This will not produce the expected results, however, since SQL Server is Width Insensitive by default.  The whitespace is essentially trimmed off during this comparison, event when rtrim() is not used. 

The problem here, is that there is no SQL Server collation name that IS width sensitive.  In order to resolve this problem, you need to perform a length comparison…but pick the right one.

declare @t nvarchar(20) = 'aaaa   '
declare @t2 nvarchar(20) = 'aaaa'

select len(@t)
select len(@t2)

select datalength(@t)
select datalength(@t2)

As books online indicates LEN() “Returns the number of characters of the specified string expression, excluding trailing blanks. “.  DATALENGTH(), however, will keep the trailing spaces and take them into account when returning the number of bytes in the expression.  One thing to note here is that if you are dealing with CHAR instead of VARCHAR data, your DATALENGTH() will be fixed for both values to the same length (20 in this example, 40 if NCHAR).

The majority of cases dealing with strings may not have to deal with trailing white space.  But, if you happen upon a scenario where you need to check for equality inclusive of white space, this is an important distinction to have knowledge of.