A blog about designing and developing Business Intelligence solutions using Microsoft SQL Server
Header

Recently I had a requirement to set the colour of a couple of non-calculated measures in Analysis Services to differentiate between an actual and inferred value. It was another one of those things that I’d remembered being as simple as selecting a value from the properties screen… Unfortunately my memory deceives me and I was greeted with a properties window totally devoid of any colour settings!

Conveniently it’s easy (enough) to set the colours using MDX in the cube calculation script. SSAS defines colour values using a somewhat strange system, there’s a great post from Boyan Penev explaining this – Colour Codes in SSAS.

To set the colour of the Amount measure in the Adventure Works cube to a lovely shade of pink, you’d add the following line to the cube calculation script…

1
FORE_COLOR ( [Measures].[Amount] ) = 16711935;

With the resulting err… lovely pink measure when browsed from Management Studio.

Pink Coloured Non-Calculated Measure

I saw a thread on MSDN Forums recently asking for T-SQL to get a list of the top 10 most frequent words in a given string. While this could be done using cursors, I have a general dislike of them! It got me interested in whether or not this could be achieved using a common table expression…

So I had a think and put together some T-SQL, with some success (it didn’t have a syntax error or run forever). After a few changes I had broken the string up into a table of words, with a bit to indicate whether or not each record is a valid word. In this case, I just defined a word as a string of characters containing only a-z or A-Z, followed by any other character (or no character). This could easily be adjusted to include other characters though, for example ‘ to allow we’re, haven’t, etc.

Essentially the logic behind the CTE is…

  1. You have two integers a and b – one indicating the start of the current word (a), the other the possible length of the word (b).
  2. If the character at a isn’t an alphabetic character, then this isn’t the start of a word. So we need to move a to the next character.
  3. If the character at a + b (one character after the end of the possible current word) isn’t an alphabetic character then we’ve found a word. So we should mark the string starting at a of length b as a word. In this case we set a = a + b + 1, and reset b = 1. If a + b is an alphabetic character, then we haven’t reached the end of the word yet and we need to increment b by 1.

Below is the T-SQL I used, applied over this post.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
DECLARE @Text var char(max) = ''; -- remove space from var char

-- A - Z = 65 = 90
-- a - z = 97 = 122

WITH
Chars AS (

SELECT
   1 AS a_in
,  1 AS b_in
,  CASE
      WHEN
         ASCII(SUBSTRING(@Text, 1, 1)) NOT BETWEEN 65 AND 90
      AND   ASCII(SUBSTRING(@Text, 1, 1)) NOT BETWEEN 97 AND 122
      THEN
         2
      ELSE
         1
   END AS a
,  1 AS b
,  SUBSTRING(@Text, 1, 0) AS Word
,  0 AS IsWord

UNION ALL

SELECT
   a AS a_in
,  b AS b_in
,  CASE
      WHEN
         ASCII(SUBSTRING(@Text, a, 1)) NOT BETWEEN 65 AND 90
      AND   ASCII(SUBSTRING(@Text, a, 1)) NOT BETWEEN 97 AND 122
      THEN
         a + 1
      WHEN
         ASCII(SUBSTRING(@Text, a + b, 1)) BETWEEN 65 AND 90
      OR ASCII(SUBSTRING(@Text, a + b, 1)) BETWEEN 97 AND 122
      THEN
         a
      ELSE
         a + 1 + b
   END AS a
,  CASE
      WHEN
         ASCII(SUBSTRING(@Text, a, 1)) NOT BETWEEN 65 AND 90
      AND   ASCII(SUBSTRING(@Text, a, 1)) NOT BETWEEN 97 AND 122
      THEN
         b
      WHEN
         ASCII(SUBSTRING(@Text, a + b, 1)) BETWEEN 65 AND 90
      OR ASCII(SUBSTRING(@Text, a + b, 1)) BETWEEN 97 AND 122
      THEN
         b + 1
      ELSE
         1
   END AS b
,  SUBSTRING(@Text, a, b) AS Word
,  CASE
      WHEN
         ASCII(SUBSTRING(@Text, a + b, 1)) BETWEEN 65 AND 90
      OR ASCII(SUBSTRING(@Text, a + b, 1)) BETWEEN 97 AND 122
      OR (
            ASCII(SUBSTRING(@Text, a, 1)) NOT BETWEEN 65 AND 90
         AND   ASCII(SUBSTRING(@Text, a, 1)) NOT BETWEEN 97 AND 122
         )
      THEN
         0
      ELSE
         1
   END AS IsWord
FROM
   Chars
WHERE
   a < LEN(@Text)

)

SELECT TOP 10
   Word
,  COUNT(*) AS NumberOfTimesUsed
FROM
   Chars
WHERE
   IsWord = 1
GROUP BY
   Word
ORDER BY
   2 DESC
OPTION
   (MAXRECURSION 0)
…and a screen shot of the ten most frequently used words. Apparently I like short words…
Top 10 Most Frequent Words

Recently, a couple of people I’m working with came across a strange issue using the new version of the Dimension Merge SCD component. During the first load all rows would be output as new records (as you’d expect), but when running the same load again all rows would be output as both unchanged and new (not as you’d expect!). The behavior seemed very strange, only a couple of packages exhibited the issue.

Initially I had little joy trying to replicate the issues with Adventure Works, but found the source of the problem after chatting to Bhavik Merchant about it. The problem arose due to two main reasons…

  1. Effective and expiry dates were being managed externally of the Dimension Merge SCD component.
  2. The default for the last records expiry date in the Dimension Merge SCD is the SQL Server datetime maximum value, while the warehouse was using 2199-12-31.

So, in the initial load the records where being inserted with an expiry date of 2199-12-31 which from the Dimension Merge SCD components point of view means the records aren’t current until the end of time. For the second load the Dimension Merge SCD would assign an expiry date of 9999-12-31 to the same source records, making them more recent than those inserted in the first load. So the second load would result in all the records being output again with the different active to date. Since the effective and expiry dates were then managed externally this resulted in two identical records being inserted into the destination table.

The “fix” was simple. Either change the default for the last records expiry date in the data warehouse to 9999-12-31, or set the last records expiry date in the Dimension Merge SCD to the same value being used in the data warehouse. The screenshot below shows where to change this value in the component.

Dimension Merge SCD Effective/Expiry Settings

I can across an annoying error when using another client computer to connect to an Analysis Services cube via Management Studio. The error or received was – “The query could not be processed: XML for Analysis parser: The LocaleIdentifier property is not overwritable and cannot be assigned a new value.”

SSAS LocaleIdentifier Issue

To resolve this you can simply select your language from the language drop down in the cube browse window.

SSAS LocaleIdentifier Issue Resolution

Recently I was working on a little side project, where I needed to allow users to connect to a variety of data sources using several connection types – SQL Server, OLE DB and ODBC. Unless I wanted users to manually type in the OLE DB provider (errr… I had no intention of even doing this myself…) then I needed to populate a list of installed OLE DB providers on the current machine.

Having had very little experience in .NET programming I searched around a bit and found a few less the elegant solutions, but was surprised just how simple it was to do in the end. In fact there’s a class to do it… OleDbEnumerator

There’s a couple of methods of interest in the class – GetElements() which returns a data table containing information about the installed OLE DB providers, and GetRootEnumerator() which returns a data reader with information about the installed OLE DB providers.

In a windows forms application, populating a data grid view is as simple as one line:

1
2
3
' Set the data source of the data grid view to the
' data table returned by GetElements()
DataGridView1.DataSource = New OleDbEnumerator().GetElements()

The resulting data grid view is shown below.

OLE DB Provider List