http://reddymsbitools.blogspot.com

Wednesday, 22 December 2010

Useful SQL Server System Stored Procedures You Should Know

System Stored Procedures are useful in performing administrative and informational activities in SQL Server. Here’s a bunch of System Stored Procedures that are used on a frequent basis (in no particular order):

System Stored Procedure

Description

sp_help Reports information about a database object, a user-defined data type, or a data type
sp_helpdb Reports information about a specified database or all databases
sp_helptext Displays the definition of a user-defined rule, default, unencrypted Transact-SQL stored procedure, user-defined Transact-SQL function, trigger, computed column, CHECK constraint, view, or system object such as a system stored procedure
sp_helpfile Returns the physical names and attributes of files associated with the current database. Use this stored procedure to determine the names of files to attach to or detach from the server
sp_spaceused Displays the number of rows, disk space reserved, and disk space used by a table, indexed view, or Service Broker queue in the current database, or displays the disk space reserved and used by the whole database
sp_who Provides information about current users, sessions, and processes in an instance of the Microsoft SQL Server Database Engine. The information can be filtered to return only those processes that are not idle, that belong to a specific user, or that belong to a specific session
sp_lock Reports information about locks. This stored procedure will be removed in a future version of Microsoft SQL Server. Use the sys.dm_tran_locks dynamic management view instead.
sp_configure Displays or changes global configuration settings for the current server
sp_tables Returns a list of objects that can be queried in the current environment. This means any object that can appear in a FROM clause, except synonym objects.
sp_columns Returns column information for the specified tables or views that can be queried in the current environment
sp_depends Displays information about database object dependencies, such as the views and procedures that depend on a table or view, and the tables and views that are depended on by the view or procedure. References to objects outside the current database are not reported
These were some System Stored Procedures in SQL Server that come in very handy. If you have been using any other system stored procedures (not listed here) on a frequent basis, feel free to share them via the comments section.

Find Number of Weeks in a Month using SQL Serv

I have been asked this question plenty of times – How do I calculate the number of weeks in a month. The answer to this query depends on how do you define a week. For simplicity purposes, I will take a week from Sunday to Saturday as depicted in the calendar



So November has 5 weeks!

Let us write the query to calculate the number of weeks in each month of this year. This query was originally written by Michael Jones and I have modified it to suit the requirement

DECLARE @Yr SMALLINT
SET @Yr = 2009

;WITH NumWeeks
AS
(
SELECT Number + 1 as mth,
DATEDIFF(day,-1,DATEADD(month,((@Yr-1900)*12)+ Number,0))/7 AS fst,
DATEDIFF(day,-1,DATEADD(month,((@Yr-1900)*12)+ Number,30))/7 AS lst
FROM master..spt_values
WHERE Type = 'P' and Number < 12
)
SELECT DateName(mm,DATEADD(mm,mth,-1)) as [MonthName],
lst - fst + 1 AS [NumberOfWeeks]
FROM NumWeeks;
OUTPUT

SQL Server 2008: Merge Statement

SQL Server 2008 introduced the Merge statement which is commonly also called as Upsert (Update / Insert). It can be used as Insert if the row does not exists and update, if it exists. It can be used as ETL process while fetching data. This feature is useful for developers. It is not required to give any particular join (inner, left outer etc).

Let’s see some advantages for using Merge statement.

With Merge statement the ability to insert a row if it does not exist or to update if it exists, is provided with a single statement. With previous version of SQL it was required to create separate statements for insert, update or delete to be done. With Merge statement, a source table (or a query) is allowed to join with target table (or updatable view) based on a criteria. Depending upon the match of the criteria specified update, insert or delete can be used. There are 3 options in the condition WHEN MATCHED, WHEN TARGET NOT MATCHED, WHEN SOURCE NOT MATCHED.

The syntax of Merge statement is as follows :
MERGE source
USING
(SELECT * FROM TARGET)
ON criteria
WHEN MATCHED
THEN UPDATE/DELETE
WHEN TARGET NOT MATCHED
THEN INSERT
WHEN SOURCE NOT MATCHED
THEN UPDATE/DELETE

Let’s see some actual code. It is created using Feb CTP of SQL Server 2008 (CTP 6).

CREATE TABLE source
(ID int NOT NULL,
FullName nvarchar(35))

CREATE TABLE target
(ID int NOT NULL,
FullName nvarchar(35))

INSERT INTO Source Values(1,'Madhuri Dixit')
INSERT INTO Source VALUES(2,'Aishwarya Roy')
INSERT INTO Source VALUES(3,'Bipasha Baasu')

INSERT INTO [target] VALUES(1,'Madhuri Nene')
INSERT INTO [target] VALUES(2,'Aishwarya Bachchan')
INSERT INTO [target] VALUES(4,'Kareena Kapoor')

With following statement only the update is done if there is match :

MERGE Source src
using (SELECT * FROM [target]) as t
ON src.Id = T.Id
WHEN MATCHED
THEN UPDATE SET src.FullName=t.FullName;

SELECT * FROM source
SELECT * FROM [target]

The following will be the actual values in both the tables




Now let’s try insert as well as update with Marge statement. Let’s delete previous records from the tables and continue with fresh data :

DELETE source
DELETE [target]

INSERT INTO Source Values(1,'Madhuri Dixit')
INSERT INTO Source VALUES(2,'Aishwarya Roy')
INSERT INTO Source VALUES(3,'Bipasha Baasu')

INSERT INTO [target] VALUES(1,'Madhuri Nene')
INSERT INTO [target] VALUES(4,'Kareena Kapoor')
INSERT INTO [target] VALUES(5,'Priyanka Chopra')

SELECT * FROM source
SELECT * FROM [target]

The select query result will look as follows :




After using Merge with Insert and Delete

MERGE Source src
using (SELECT * FROM [target]) as t
ON src.Id = T.Id
WHEN MATCHED
THEN UPDATE SET src.FullName=t.FullName
WHEN TARGET NOT MATCHED
THEN INSERT VALUES (t.ID, t.FullName);

SELECT * FROM source
SELECT * FROM [target]

The result of the Merge statement will give following output :




Now let’s try all three conditions and see the result. Let’s start fresh with the records again :

DELETE source
DELETE [target]

INSERT INTO Source Values(1,'Madhuri Dixit')
INSERT INTO Source VALUES(2,'Aishwarya Roy')
INSERT INTO Source VALUES(3,'Bipasha Baasu')

INSERT INTO [target] VALUES(1,'Madhuri Nene')
INSERT INTO [target] VALUES(4,'Kareena Kapoor')
INSERT INTO [target] VALUES(5,'Priyanka Chopra')

SELECT * FROM source
SELECT * FROM [target]

The result will be as follows :





Let’s use Merge and see the result





We no longer see record for ID =3 as it gets deleted with this merge statement.

Summary
Do not forget to give ; (semi colon) at the end of Merge statement. You will end up in parser error otherwise. If multiple rows match in target for criteria following error is given.


Msg 8672, Level 16, State 1, Line 1 The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

This is advantage of Merge over Update with join. Merge is deterministic whereas Update with join will update any row without any error.

Find First and Last Day of the Current Quarter in SQL Server

I was recently working on a requirement where the user wanted a report with data from the First day to the Last Day of the current Quarter

Here's a simple way to find the Find First and Last Day of the current quarter in SQL Server 2005/2008


SELECT DATEADD(qq,DATEDIFF(qq,0,GETDATE()),0) as FirstDayOfQuarter

SELECT DATEADD(qq,DATEDIFF(qq,-1,GETDATE()),-1) as LastDayOfQuarter



OUTPUT


FirstDayOfQuarter LastDayOfQuarter

2009-04-01 00:00:00.000 2009-06-30 00:00:00

A Simple Family Tree Query Using Recursive CTE’s in SQL Server 2005/2008

I was recently working with one of my colleagues Steve on a Family Tree query. His software collected information about Relations of a person and then displayed it in a TreeView. We started discussing about Recursive Common Table Expressions(CTE) and how well CTE’s fit into such requirements. I will give you an example of how we used Recursive CTE’s to get the Parent and Generation Level of each person in the Family Tree. One of the tables in his software looked similar to the following -

Note: A Family Tree usually contains many more columns than the one shown here. I have reduced the columns of the original table for understanding purposes. We are also considering only one of the Parents (Father or Mother) of each person in this query

DECLARE @TT TABLE
(
ID int,
Relation varchar(25),
Name varchar(25),
ParentID int
)

INSERT @TT
SELECT 1,' Great GrandFather' , 'Thomas Bishop', null UNION ALL
SELECT 2,'Grand Mom', 'Elian Thomas Wilson' , 1 UNION ALL
SELECT 3, 'Dad', 'James Wilson',2 UNION ALL
SELECT 4, 'Uncle', 'Michael Wilson', 2 UNION ALL
SELECT 5, 'Aunt', 'Nancy Manor', 2 UNION ALL
SELECT 6, 'Grand Uncle', 'Michael Bishop', 1 UNION ALL
SELECT 7, 'Brother', 'David James Wilson',3 UNION ALL
SELECT 8, 'Sister', 'Michelle Clark', 3 UNION ALL
SELECT 9, 'Brother', 'Robert James Wilson', 3 UNION ALL
SELECT 10, 'Me', 'Steve James Wilson', 3


Here’s the query to find the Parent and Generation Level using Recursive CTE’s



;WITH FamilyTree
AS
(
SELECT *, CAST(NULL AS VARCHAR(25)) AS ParentName, 0 AS Generation
FROM @TT
WHERE ParentID IS NULL

UNION ALL

SELECT Fam.*,FamilyTree.Name AS ParentName, Generation + 1
FROM @TT AS Fam
INNER JOIN FamilyTree
ON Fam.ParentID = FamilyTree.ID
)

SELECT * FROM FamilyTree

Rename a Column in a Table in a SQL Server Database

This is a very simple script I am sharing today. A sqlservercurry.com user mailed me and asked me the simplest way to rename a column. Well here’s the query I have been using all along

USE Social;
GO
EXEC sp_rename 'dbo.Person.PersonName', 'PersonNm', 'COLUMN';
GO
This query renames the Column ‘PersonName’ in the Person table to ‘PersonNm’. Be careful while you do so, as you may affect Stored Procedures that use the column. Infact, when you run the query shown above, a caution gets displayed in your SQL Server Management Studio warning you of the change.

Caution: Changing any part of an object name could break scripts and stored procedures.

Generate Insert Statements For a SQL Server Table

Microsoft SQL Server Database Publishing Wizard is a great tool to generate script for both schema and data for a database. However what if you were to write a script that could do that for you.

Here’s a real cool script written by johnnycrash that generates Insert Statements for a SQL Server Table.

For demonstration purposes, I am using the Culture table of the AdventureWorks database.

USE AdventureWorks
GO
DECLARE @Columns VARCHAR(max);
SET @Columns = '[CultureID], [Name], [ModifiedDate]'
DECLARE @Table VARCHAR(max);
SET @Table = 'Production.Culture'

DECLARE @SQL VARCHAR(max)
SET @SQL = 'DECLARE @S VARCHAR(MAX)
SELECT @S = ISNULL(@S + '' UNION '',
''INSERT INTO ' + @Table + '(' + @Columns + ')'')
+ CHAR(13) + CHAR(10)
+ ''SELECT '' + ' + REPLACE(REPLACE(REPLACE(@Columns, ',', ' + '', '' + '),
'[', ''''''''' + CAST('),']',' AS VARCHAR(max)) + ''''''''')
+' FROM ' + @Table
+ ' PRINT @S'

EXEC (@SQL)

Site to follow

http://www.sqlservercurry.com/search/label/Analysis%20Services%20SSAS?updated-max=2009-08-15T04:55:00-07:00&max-results=20----SSAS
http://www.sqlservercurry.com/search/label/Transact%20SQL%20T-SQL?updated-max=2010-01-08T04:55:00-08:00&max-results=20---SQL Server2008
http://www.sqlservercurry.com/search/label/Reporting%20Services%20SSRS----SSRS

Different ways for executing SQL Server Integration Package

With Business Intelligence Development Studio (BIDs) we get a template for creating SQL Server Integration Services (SSIS) Project, which in turn creates individual packages. Each package has an extension .dtsx (Data Transformation Services with XML) which is a unit of deployment and execution. There are several ways in which the package can be executed. Here are some of those

1. Run the package from BIDs itself.

With this option, we have the advantage of working with debug mode for control flow tasks and Data Viewer for Data Flow Tasks.
The disadvantage is that we will not have Visual Studio on deployment server.
2. Run the package with Command line utility DTEXECUI

This means Data Transfer EXecution User Interface
User Interface is provided with the help of which certain properties like setting command line parameters, specifying configuration files, setting values for variables at run time etc
3. Run the package wit Command line utility DTEXEC

As the name suggests, this option does not have an user interface but certain switches that can be used with the command
Depending upon where the package is stored, we can use different switches for file deployment or SQL server deployment
Users can specify properties values with /Set
4. Run the package after deploying it on the server either in a File System or on a SQL server database (MSDB)

This option will provide a similar interface like DTEXECUI
5. Run the package as a SQL Server Agent job

Create a step for SQL server agent job to execute a package and depending upon the schedule, the package will be executed automatically
6. Write a code with Visual Studio with C# or VB.NET which will execute package from the application. In this option, we will need to first set references to System.SQLServer.ManagedDTS so as to load existing package in the application by using Application.LoadPackage() method and then execute it by using the Package.Execute() method.

C#

using Microsoft.SqlServer.Dts.Runtime;

Package pkg = new Package();
Application app = new Application();
//load and execute package
pkg = app.LoadPackage(@"/.dtsx”, null);
pkg.Execute();
VB.NET

Imports Microsoft.SqlServer.Dts.Runtime

Package pkg= new Packagr()
Application app= new Application()
'load and package
pkg = app.LoadPackage(“/.dtsx”,null)
pkg.Execute()

Find Primary Key of a SQL Server Table

Here’s a quick way to find the primary key of a SQL Server 2005/2008 Table using TABLE_CONSTRAINTS

USE Northwind
go
SELECT ISKC.TABLE_SCHEMA, ISKC.TABLE_NAME, ISKC.COLUMN_NAME
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS ISTC
INNER JOIN
INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS ISKC
ON ISTC.CONSTRAINT_TYPE = 'PRIMARY KEY' AND
ISTC.CONSTRAINT_NAME = ISKC.CONSTRAINT_NAME
WHERE ISKC.TABLE_NAME = 'Employees'
ORDER BY ISKC.TABLE_NAME, ISKC.ORDINAL_POSITION

Tuesday, 21 December 2010

Auto Generate AlphaNumeric ID’s in a SQL Server Table

I was recently working on a query to auto generate a Column ID based on an IDENTITY column. I used computed columns to achieve this. If you too have a similar requirement, then here’s the query.

DECLARE @TT TABLE (
CandID as 'C-' + RIGHT('000' + CONVERT(varchar, CourseID),5),
CourseID int IDENTITY(1,1),
ReferrerName varchar(10)
)

INSERT INTO @TT VALUES ('Dhiraj')
INSERT INTO @TT VALUES ('Jack')
INSERT INTO @TT VALUES ('Kelly')
INSERT INTO @TT VALUES ('Kelly')
INSERT INTO @TT VALUES ('Jack')
INSERT INTO @TT VALUES ('Dhiraj')

SELECT * FROM @TT
OUTPUT



Observe how the values of the CandID column are autogenerated based on the values if the CourseID column.

List all the Weekends of the Current Year using SQL Server

Sometime back, I had posted to Find the WeekEnds in this Quarter or Year using SQL Server 2005/2008. An anonymous user wrote back asking if there was a way to determine the first and last day of the current year and rewrite the same query. Well here it is:

DECLARE @StrtDate datetime
DECLARE @EndDate datetime
SELECT @StrtDate = DATEADD(yy, DATEDIFF(yy,0,getdate()), 0)
SELECT @EndDate = DATEADD(ms,-3,DATEADD(yy, DATEDIFF(yy,0,getdate() )+1, 0))

;WITH CTE (weekends)
AS
(
SELECT @StrtDate
UNION ALL
SELECT DATEADD(d,1,weekends)
FROM CTE
WHERE weekends < @EndDate
)
SELECT weekends from CTE
WHERE DATENAME(dw,weekends)
In ('Saturday', 'Sunday')

Find Maximum Value in each Row – SQL Server

Here’s a simple and efficient way to find the maximum value in each row using SQL Server UNPIVOT

DECLARE @t table (id int PRIMARY KEY, col1 int, col2 int, col3 int)

-- SAMPLE DATA
INSERT INTO @t SELECT 1, 45, 2, 14
INSERT INTO @t SELECT 2, 8, 1, 12
INSERT INTO @t SELECT 3, 21, 20, 8
INSERT INTO @t SELECT 4, 8, 5, 2
INSERT INTO @t SELECT 5, 23, 49, 7
INSERT INTO @t SELECT 6, 19, 7, 5
INSERT INTO @t SELECT 7, 7, 7, 2
INSERT INTO @t SELECT 8, 14, 17, 2

-- QUERY
SELECT id, MAX(col) AS maxValue
FROM
(
SELECT id, col FROM @t
UNPIVOT
(col FOR ListofColumns IN (col1,col2,col3))
AS unpivott) AS p
GROUP BY id

NVARCHAR(MAX) VS NTEXT in SQL Server

Here are some differences between nvarchar(max)/nvarchar and ntext in SQL Server


Nvarchar(max)/Nvarchar()



Ntext


Nvarchar(max) is supported only from version 2005 onwards Ntext is available in prior versions too
You can specify the length in advance for nvarchar datatype i.e. nvarchar(100), nvarchar(4000) You can't specify the length for Ntext datatype
All types of string functions such as left, right, len, etc. can be used for Nvarchar data type Only limited functions like substring, datalength are used for Ntext datatype
Nvarchar datatype can be used in local variables Ntext can't be used in local variables
Nvarchar datatype is available from version 2005 onwards and will be supported for further releases Ntext will be removed from future release of SQL Server and will be substituted by nvarchar(max) datatype


In one of the upcoming posts, I will tell you how to convert an Ntext column to Nvarchar(max) and some points to consider while doing so.

Temp Table VS Table Variable in SQL Server

Here are some differences between Temp Table and Table Variable in SQL Server

Temp Table Table Variable
Temp table is valid for a session.

For eg: when you run the following code

create table #temp(i int)
insert into #temp select 345
Go
create table #temp(i int)
insert into #temp select 345
Go

you will get an error Table variable has a statement-level scope. ie as soon as you execute the statement the scope is lost

For eg: when you run the following code

declare @t table(i int)
insert into @t select 45
GO
declare @t table(i int)
insert into @t select 45
GO

you will not get an error
It is possible to alter the temp table to add columns, idexes,etc It is not possible to alter a table variable
It is possible to truncate a temp table It is not possible to truncate a table variable
SELECT INTO method can be used for temp table


SELECT * INTO #temp from your_table
SELECT INTO method cannot be used for table variable. You get error for the following

SELECT * INTO @t from your_table
Temp table can be useful when you have a large amount of data For small set of data, table variables can be useful

When to use STUFF instead of REPLACE – SQL Server

The STUFF function in SQL Server ‘inserts’ a string into another string. It deletes a specified length of characters in the first string at the start position and then inserts the second string into the first string at the start position.

REPLACE on the other hand replaces all occurrences of a specified string value with another string value.

The STUFF function is useful when you are not sure of the characters to replace, but you do know the position of the character. I saw an excellent implementation of this function over here by GMastros and Even. Here’s the code with a practical usage of STUFF:

CREATE FUNCTION [dbo].[fn_StripCharacters]
(
@String NVARCHAR(MAX),
@MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
SET @MatchExpression = '%['+@MatchExpression+']%'

WHILE PatIndex(@MatchExpression, @String) > 0
SET @String = STUFF(@String, PatIndex(@MatchExpression,
@String), 1, '')

RETURN @String

END

SELECT dbo.fn_StripCharacters('a1$%s#22d&39#f4$', '^a-z')
as OnlyAlphabets

OUTPUT

Make a T-SQL Query Sleep for a certain amount of time

At times, in test scenarios, you may want to emulate that a transaction or a T-SQL query takes ‘x’ amount of time. Well that is possible (although not with precision) in SQL Server (2005/2008) using the WAITFOR command.

Here’s an example of delaying query execution till a specified time interval has elapsed – let us say 10 seconds

USE Northwind
WAITFOR DELAY '00:00:10';
SELECT LASTNAME, FIRSTNAME FROM Employees;
As you can see, we used the WAITFOR statement with a DELAY option to delay code execution for 10 seconds.

There are a couple of things you should note about the WAITFOR statement

- WAITFOR does not accept date values, only time values are allowed

- In some scenarios, the delay specified by you may not exactly the same as the delay caused by WAITFOR

Duplicate a Column in SQL Server

Here’s a very simple but useful post that shows how to duplicate an existing column in SQL Server.

Assuming we have the following table structure with data:

CREATE TABLE #TmpTable(ColA varchar(10))
INSERT INTO #TmpTable SELECT 'ValueOne'
INSERT INTO #TmpTable SELECT 'ValueTwo'
INSERT INTO #TmpTable SELECT 'ValueThree'
INSERT INTO #TmpTable SELECT 'ValueFour'
INSERT INTO #TmpTable SELECT 'ValueFive'
INSERT INTO #TmpTable SELECT 'ValueSix'
Now to duplicate ColA, use the following code:

ALTER TABLE #TmpTable
ADD ColB varchar(10)
GO

UPDATE #TmpTable
SET ColB = ColA
GO

SELECT * FROM #TmpTable
As you can see, we have added a new column to the Table and used a simple Update statement to duplicate ColA. If the column already exists, you can avoid the ALTER TABLE.

OUTPUT

Tuesday, 14 December 2010

How do I setup the breakpoints in SSIS?

In the BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option. This will display the screen shot listed next section.

What are the conditions that I can set for a breakpoint in SSIS?

Breakpoints have 10 unique conditions when they can be invoked. The breakpoints can also be used in combination. One typical example is using both the OnPreExecute and OnPostExecute events to determine the status of the variables as the process begins and ends.

What other breakpoint parameters can be configured?

For each of the breakpoints, the Hit Count Type and Hit Count can be configured.

The Hit Count Type values are:

* Always
* Hit count greater than or equal to
* Hit count multiple
* Hit count equals

The Hit Count value is an integer greater than 1.

The image below shows an example of these options, but is only an example not a probable configuration.

Next Steps

Sunday, 28 November 2010

Deleting the duplicate records in sql sevrer by using CTE Expression

As I have mentioned in one of my previous tips that using ROW_NUMBER() we can get the serial number which is here.
 
Using the same ROW_NUMBER() function you can also easily delete the Duplicate Records present inside a table.

 
Lets say I have a Users Table which is having duplicate records. I need to delete the duplicate records which are having both the FirstName and LastName same.

 

 
Table: UsersList

 
FirstName       LastName       PhoneNumber       Address
------------    ------------     ----------------    -------------
Rashmita        Devi                987554437           NULL
Rashmita        Devi                446576578           NULL
Adwin            Ratzz              222222222           NULL
Charle            Hardwick         343534545           NULL
Adwin            Ratzz              576767688           NULL
Charle            Hardwick         877778777           NULL

 

 
So, with the help of ROW_NUMBER() all the duplicate records present in the above table can be removed easily.
The query will be composed this way:-

 
WITH TempUsers (FirstName,LastName, duplicateRecordCount)
AS
(
SELECT FirstName,LastName,
ROW_NUMBER()OVER(PARTITIONBY FirstName, LastName ORDERBY FirstName) AS duplicateRecordCount
FROM dbo.Users
)
DELETE
FROM TempUsers
WHERE duplicateRecordCount > 1
GO

 
 
Instead of TempUsers you can give any name. Because this is used only for Temporary purpose.
 
After the execution of the above query, Users table will have the following records.

 
FirstName          LastName        PhoneNumber         Address
------------       ------------      ----------------      -------------
Rashmita           Devi                987554437              NULL
Adwin               Ratzz              222222222             NULL
Charle               Hardwick         877778777             NULL
 
 

 
Note: Row_Number() function is present in SQL Server 2005 and in later version(s).

Thursday, 18 November 2010

Incremental Loading & Batch Processing?

Incremental Loading:Its also called retrospective data load.In this data being loaded on a regular inerval and only new data is moved.
eg in ur data warehouse ur loading "Transactional Data" on daily basis and its happning only for new Transactions because old Data is already in the Data warehouse.
Batch Processing:
Its processing of Jobs/Task together.
eg if u have a staging area u may have valid reason to get data for different Departments there first and then move all that in one shot to production.

Difference between ODS and Staging

---Scenario 1#
An operational data store (ODS) is a type of database often 
used as an interim area for a data warehouse. Unlike a data 
warehouse which contains static data the contents of the 
ODS are updated through the course of business operations. 
An ODS is designed to quickly perform relatively simple 
queries on small amounts of data (such as finding the 
status of a customer order) rather than the complex 
queries on large amounts of data typical of the data 
warehouse. An ODS is similar to your short term memory in 
that it stores only very recent information; in comparison 
the data warehouse is more like long term memory in that it 
stores relatively permanent information.

But in staging we are storing current as well as historic 
data. This data might be a raw and then need cleansing and 
transform before load into datawarehouse.
--Scenario 2#
ODS keeps the transactional data from where direct reporting is possible. The data in ODS can be partly denormalized.
Staging area is the exact replica of the transactional database which acts as a source for datawarehouse processing.

Types Of Facts?

# Additive: Additive facts are facts that can be summed up through all of the dimensions in the fact table.
# Semi-Additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table but not the others.
# Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.

Fact vs Dimension Table

A fact table captures the data that measures the organization's business operations. A fact table might contain business sales events such as cash register transactions or the contributions and expenditures of a nonprofit organization. Fact tables usually contain large numbers of rows sometimes in the hundreds of millions of records when they contain one or more years of history for a large organization.

Dimension tables contain attributes that describe fact records in the fact table. Some of these attributes provide descriptive information; others are used to specify how fact table data should be summarized to provide useful information to the analyst.

Explain degenerated dimension in detail.

 Degenerate dimension: A column of the key section of the fact table that does not have the associated dimension table but used for reporting and analysis such column is called degenerate dimension or line item dimension.For ex we have a fact table with customer_id product_id branch_id employee_id bill_no date in key section and price quantity amount in measure section. In this fact table bill_no from key section is a single value it has no associated dimension table. Instead of cteating a seperate dimension table for that single value we can include it in fact table to improve performance.SO here the column bill_no is a degenerate dimension or line item dimension
-----Scenario 2
Degenerated Dimension is achieved through a gradual modeling approach following Dimensional Modeling standards. Let's take example of a Star Schema representing Sales Invoices. The FACT would have the "Invoiced Amount" as primary measure. Now when we look at the source of the Invoice, it is the body if the Paper Invoice that gives us the following particulars about each Invoice:
Invoice Date
Customer ID
Products within the Invoice
Reference to Order Number(s)
Invoice Number
Invoice Line Numbers (which are multiple lines in single Invoice)
Invoice Line Amount
Invoice Total Amount
When we model the above following Dimensional Modeling standards, we get following distinct Dimensions:
Calendar Dimension - representing the Invoice Date
Customer Dimension - representing Customer ID
Product Dimension - representing Products within the Invoice
Order Dimension - representing Orders
Invoice Dimension – representing Invoice Number & Invoice Line Numbers
Question comes - what attributes would be left to be part of the INVOICE DIMENSION, if at all we decide to have one! Only candidate attributes are Invoice Number and Invoice Line Numbers. But, this is at the granularity of the FACT, which stores references to all above said Dimensions as well as the measures i.e. Invoice Line Amount, Invoice Total Amount (Derived by aggregation).
It is at this situation, we may decide to degenerate the attributes Invoice Number & Invoice Line Number into the Fact and avoid having a distinct entity to represent Invoice Number / Line Numbers as a Dimension. What we achieve by this:
1. avoiding a huge join as both Fact and this Dimension would have the same granularity,
2. still able to query with Invoice Number as the entry point
So, when such a scenario appears, we make the left out attributes (i.e. Invoice Number & Invoice Line Number in our case) part of the Fact and part of the Primary Key in the Fact. This is why and how we model Degenerated Dimensions.

Sql server Database Backup's

OverviewThere are only two commands for backup, the primary is BACKUP DATABASE.  This allows you to do a complete backup of your database as well as differential, file, etc. backups depending on the options that you use.
ExplanationThe BACKUP DATABASE command gives you many options for creating backups.  Following are different examples.
Create a full backup to diskThe command is BACKUP DATABASE databaseName.  The "TO DISK" option specifies that the backup should be written to disk and the location and filename to create the backup is specified.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK'
GO
Create a differential backupThis command adds the "WITH DIFFERENTIAL" option.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK' 
WITH DIFFERENTIAL 
GO
Create a file level backupThis command uses the "WITH FILE" option to specify a file backup.  You need to specify the logical filename within the database which can be obtained by using the command sp_helpdb 'databaseName', specifying the name of your database.
BACKUP DATABASE TestBackup FILE = 'TestBackup' 
TO DISK = 'C:\TestBackup_TestBackup.FIL'
GO
Create a filegroup backupThis command uses the "WITH FILEGROUP" option to specify a filegroup backup.  You need to specify the filegroup name from the database which can be obtained by using the command sp_helpdb 'databaseName', specifying the name of your database.
BACKUP DATABASE TestBackup FILEGROUP = 'ReadOnly' 
TO DISK = 'C:\TestBackup_ReadOnly.FLG'
GO
Create a full backup to multiple disk filesThis command uses the "DISK" option multiple times to write the backup to three equally sized smaller files instead of one large file.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks_1.BAK',
DISK = 'D:\AdventureWorks_2.BAK',
DISK = 'E:\AdventureWorks_3.BAK'
GO
Create a full backup with a passwordThis command creates a backup with a password that will need to be supplied when restoring the database.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK'
WITH PASSWORD = 'Q!W@E#R$'
GO
Create a full backup with progress statsThis command creates a full backup and also displays the progress of the backup.  The default is to show progress after every 10%.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK'
WITH STATS
GO
Here is another option showing stats after every 1%.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK'
WITH STATS = 1
GO
Create a backup and give it a descriptionThis command uses the description option to give the backup a name.  This can later be used with some of the restore commands to see what is contained with the backup.  The maximum size is 255 characters.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK'
WITH DESCRIPTION = 'Full backup for AdventureWorks'
GO
Create a mirrored backupThis option allows you to create multiple copies of the backups, preferably to different locations.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK'
MIRROR TO DISK =  'D:\AdventureWorks_mirror.BAK'
WITH FORMAT
GO
Specifying multiple optionsThis next example shows how you can use multiple options at the same time.
BACKUP DATABASE AdventureWorks 
TO DISK = 'C:\AdventureWorks.BAK'
MIRROR TO DISK =  'D:\AdventureWorks_mirror.BAK'
WITH FORMAT, STATS, PASSWORD = 'Q!W@E#R$'
GO

Tuesday, 16 November 2010

Google News Blog: Add Google News to Your Site

Google News Blog: Add Google News to Your Site

Differences between SQL Server temporary tables and table variables

Temporary Tables

Temporary tables are created in tempdb. The name "temporary" is slightly misleading, for even though the tables are instantiated in tempdb, they are backed by physical disk and are even logged into the transaction log. They act like regular tables in that you can query their data via SELECT queries and modify their data via UPDATE, INSERT, and DELETE statements. If created inside a stored procedure they are destroyed upon completion of the stored procedure. Furthermore, the scope of any particular temporary table is the session in which it is created; meaning it is only visible to the current user. Multiple users could create a temp table named #TableX and any queries run simultaneously would not affect one another - they would remain autonomous transactions and the tables would remain autonomous objects. You may notice that my sample temporary table name started with a "#" sign. This is the identifier for SQL Server that it is dealing with a temporary table.

The syntax for creating a temporary table is identical to creating a physical table in Microsoft SQL Server with the exception of the aforementioned pound sign (#):

CREATE TABLE #TempTable

(

ID INT NOT NULL,

Name VARCHAR(10),

DOB DATETIME

)


Table Variables

The syntax for creating table variables is quite similar to creating either regular or temporary tables. The only differences involve a naming convention unique to variables in general, and the need to declare the table variable as you would any other local variable in Transact SQL:

DECLARE @TableVariable TABLE (

ID INT NOT NULL,

Name VARCHAR(10),

DOB DATETIME

)

As you can see the syntax bridges local variable declaration (DECLARE @variable_name variable_data_type) and table creation (column_name, data_type, nullability). As with any other local variable in T-SQL, the table variable must be prefixed with an "@" sign. Unlike temporary or regular table objects, table variables have certain clear limitations.

* Table variables can not have Non-Clustered Indexes
* You can not create constraints in table variables
* You can not create default values on table variable columns
* Statistics can not be created against table variables



Similarities with temporary tables include:

* Instantiated in tempdb
* Clustered indexes can be created on table variables and temporary tables
* Both are logged in the transaction log
* Just as with temp and regular tables, users can perform all Data Modification Language (DML) queries against a table variable: SELECT, INSERT, UPDATE, and DELETE.


Differences between SQL Server temporary tables and table variables

There are three major theoretical differences between temporary tables And table variables

* The first difference is that transaction logs are not recorded for the table variables. Hence, they are out of scope of the transaction mechanism

After declaring our temporary table #T and our table-variable @T, we assign each one with the same "old value" string. Then, we begin a transaction that updates their contents. At this point, both will now contain the same "new value" string. But when we rollback the transaction, as you can see, the table-variable @T retained its value instead of reverting back to the "old value" string. This happened because, even though the table-variable was updated within the transaction, it is not a part of the transaction itself.

* The second major difference is that any procedure with a temporary table cannot be pre-compiled, while an execution plan of procedures with table variables can be statically compiled in advance. Pre-compiling a script gives a major advantage to its speed of execution. This advantage can be dramatic for long procedures, where recompilation can be too pricy.

* Finally, table variables exist only in the same scope as variables. Contrary to the temporary tables, they are not visible in inner stored procedures and in exec(string) statements. Also, they cannot be used in an insert/exec statement.

Difference between a "where" clause and a "having" clause.

Having clause is used only with group functions whereas Where is not used with.

SQL Server consists of fourdatabases by default.

Master It contains system catalogs that keep information about disk space, file allocations, usage, system wide configuration settings, login accounts, the existence of other database, and the existence of other SQL Servers (for distributed operations).
Model It is a simply a template database. Every time you create a new database, SQL Server makes a copy of model to form the basis of the new database.
Tempdb Temporary database, tempdb, is a workspace. SQL Server’s tempdb database is unique among all other databases because it is recreated not recovered every time SQL Server is started.
Msdb This database is used by the SQL Server Agent Service, which performs scheduled activities such as backups and replication tasks.

Database Files: A database file is nothing more than an operating system file. SQL Server 2000 allows the following three types of database files:

·Primary data files (.mdf)
·Secondary data files (.ndf)
·Log files (.ldf)

When we create a new user database, SQL Server copies the model database (includes 19 system tables and 2 system views (for backward compatibility)). A new user database must be 1MB or greater in size.

We can create a new database using the following command: CREATE DATABASE newdb

Types of Backups:

* Full Backup
* Differential Backup
* Log Backup

Types of User Defined Functions

User defined Functions in SQL

User Defined Functions are compact pieces of Transact SQL code, which can accept parameters, and return either a value, or a table. Data transformation and reference value retrieval are common uses for functions. User Defined Functions enable the developer or DBA to create functions of their own, and save them inside SQL Server.

Advantages of User Defined Functions

Before SQL 2000, User Defined Functions (UDFs), were not available. Stored Procedures were often used in their place.

One of the advantages of User Defined Functions over Stored Procedures, is the fact that a UDF can be used in a Select, Where, or Case statement. They also can be used to create joins. In addition, User Defined Functions are simpler to invoke than Stored Procedures from inside another SQL statement.

Disadvantages of User Defined Functions

User Defined Functions cannot be used to modify base table information. The DML statements INSERT, UPDATE, and DELETE cannot be used on base tables.

Another disadvantage is that SQL functions that return non-deterministic values are not allowed to be called from inside User Defined Functions.

GETDATE is an example of a non-deterministic function. Every time the function is called, a different value is returned. Therefore, GETDATE cannot be called from inside a UDF you create.

Types of User Defined Functions

There are three different types of User Defined Functions. Each type refers to the data being returned by the function.

1. Scalar functions return a single value.
2. In Line Table functions return a single table variable that was created by a select statement.
3. The final UDF is a Multi-statement Table Function. This function returns a table variable whose structure was created by hand, similar to a Create Table statement. It is useful when complex data manipulation inside the function is required.


Example for Scalar-valued Function

CREATE FUNCTION dbo.DateOnly
(
@DateTime datetime
)
RETURNS varchar(10)
AS
BEGIN
DECLARE @Output varchar(10)
SET @Output = CONVERT(varchar(10),@DateTime,101)
RETURN @Output
END

To call the function, execute : SELECT dbo.DateOnly(GETDATE())


Example for Inline Table-valued Function

CREATE FUNCTION dbo.FindNamesBy
(
@Name varchar(10)
)
RETURNS TABLE
AS
BEGIN
RETURN SELECT * FROM Employee
WHERE EmpName LIKE '%' + @Name
END

To use the above function, execute : SELECT * FROM dbo.FindNamesBy('Cherukuri')


Example for Multi statement Table-valued Function

CREATE FUNCTION dbo.MultiLineFunction
(
@Name varchar(10)
)
RETURNS @Result TABLE
(
Empname varchar(20),
HireDate datetime,
OnProbation char(1)
)
AS
BEGIN
INSERT INTO @Result (EmpName, HireDate)
SELECT Empname, HireDate FROM Employee
WHERE EmpName LIKE '%' + @Name

UPDATE @Result SET OnProbation = 'N'

UPDATE @Result SET OnProbation = 'Y'

WHERE HireDate < '11/16/2002'

RETURN
END

To use the above function, execute :
SELECT EmpName,HireDate,OnProbation FROM dbo.MultiLineFunction('Ch')

Wednesday, 3 November 2010

How to Use Merge Statement

We often need to perform operation that include

1. INSERT Record if it does not exits in Target Table

2. UPDATE or DELETE Record if it already exists in Target Table

There are different ways to perform above operation like using IF, CASE construct etc but none method is better than MERGE statement. In this article, we are going to learn how to use MERGE statement.

Scenario Detail

1. Update Record if Found in Target table

2. Insert Record if Not Found in Target Table

3. Mark Record as Deleted if Deleted at Source



Database Objects and Data used in example

Source Table : HumanResource.Department

CREATE TABLE [HumanResources].[Department]

(

[DepartmentID] [smallint] IDENTITY(1,1) NOT NULL,

[Name] [dbo].[Name] NOT NULL,

[GroupName] [dbo].[Name] NOT NULL,

[ModifiedDate] [datetime] NOT NULL,

CONSTRAINT [PK_Department_DepartmentID] PRIMARY KEY CLUSTERED

(

[DepartmentID] ASC

)

) ON [PRIMARY]

Target Table : Department_Copy (without IDENTITY property but with Additional column bit type named Deleted)

Key Column Name: Name



Data in Target Table:

Name GroupName ModifiedDate Deleted
Engineering Research and Development 6/1/1998 0
Tool Design Research and Development 6/1/1998 0
Sales Sales and Marketing 6/1/1998 0
Marketing Sales and Marketing 6/1/1998 0
Purchasing Inventory Management 6/1/1998 0
Research and Development Research and Development 6/1/1998 0
Production Manufacturing 6/1/1998 0
Production Control Manufacturing 6/1/1998 0
Human Resources Executive General and Administration 6/1/1998 0
Finance Executive General and Administration 6/1/1998 0

How to use MERGE

Scenario #1

Lets update GroupName for Finance department from "Executive General and Administration" to "Accounts and Finance" using following query:

update HumanResources.Department

set GroupName = 'Accounts and Finance'

where Name = 'Finance'

Now Lets Sync-up Department_Copy table with HumanResources.Department:

MERGE Department_Copy AS deptCopy

USING (SELECT * FROM HumanResources.Department)AS dept

ON (deptCopy.Name = dept.Name)

WHEN MATCHED then

update

set

deptCopy.GroupName = dept.GroupName

,ModifiedDate = getDate()

; -- A MERGE statement must be terminated by a semi-colon (;)

After you execute above statement, you will notice that Department_Copy table has updated value for Department "Finance" with modification date:

Name GroupName ModifiedDate Deleted
Finance Accounts and Finance 4/26/2010 0

Scenario #2

Lets insert new record in HumanResource.Department

insert into HumanResources.Department (Name, GroupName)

values ('Document Control','Quality Assurance')

Now Lets Sync-up Department_Copy table with HumanResources.Department if any new record entered then add in Department_Copy without removing flexibility of changing data if any updated:

MERGE Department_Copy AS deptCopy

USING (SELECT * FROM HumanResources.Department)AS dept

ON (deptCopy.Name = dept.Name)

WHEN MATCHED then

update

set

deptCopy.GroupName = dept.GroupName

,ModifiedDate = getDate()

--following line will add new record if not found in Department_Copy table

WHEN NOT MATCHED BY TARGET THEN

insert

(

DepartmentID

,Name

,GroupName

,ModifiedDate

)

values

(

DepartmentID

,Name

,GroupName

,ModifiedDate

)

;

Scenario #3

Lets delete record in HumanResource.Department but do not delete record in Department_Copy table while synchronizing but need to identity if record was deleted in HumanResoource.Department table

In order to achieve this, I have added an extra column in Department_Copy table named Deleted using bit data type. Now lets examine by performing followings:

delete HumanResources.Department

where Name = 'Finance'

Now, lets synchronize Department_Copy with HumanResource.Department which can flag deleted record along with providing solution for 1st two scenario tested above:

MERGE Department_Copy AS deptCopy

USING (SELECT * FROM HumanResources.Department)AS dept

ON (deptCopy.Name = dept.Name)

WHEN MATCHED then

update

set

deptCopy.GroupName = dept.GroupName

,ModifiedDate = getDate()

--following line will add new record if not found in Department_Copy table

WHEN NOT MATCHED BY TARGET THEN

insert

(

Name

,GroupName

,ModifiedDate

)

values

(

Name

,GroupName

,ModifiedDate

)

--following line will update Deleted column with value 1 if Record(s) is/are deleted in HumanResource.Department table

WHEN NOT MATCHED BY SOURCE THEN

update

set Deleted = 1

;

Result of Scenario # 3 will changed valued for Deleted column for Finance Department as shown below:

Name GroupName ModifiedDate Deleted
Finance Accounts and Finance 4/26/2010 1



Conclusion
MERGE Statement can be used to perform INSERT, UPDATE and DELETE all in single statement

Table Partitioning

SQL Server Database Table Partitioning technique is fully available in SQL Server 2005 and SQL Server 2008 but logically it was available in SQL Server 2000 as a "Partitioned View". Partitioned View had several limitations. Lets looks at few facts of Partitioned View implementation:

Partitioning a table horizontally by replacing original table with several smaller member tables.
Data separation loaded into table was based on CHECK Constraint created on one of their column
In case of distributed partitioned view, each member table is an individual member server
You can not exceed 256 member tables
One member table can not have more than one range.
If you are implementing distributed partitioned view then add a linked server
Any linked server cannot be a loopback linked server, that is, a linked server that points to the same instance of SQL Server
Data type mapping limitation in Distributed partitioned view for example smallmoney columns in remote tables is mapped as money.
Data placement has to be planned properly as it is not controlled automatically
Partitioned views are transparent to the application
Limitations of data types and column types.
Having views on every table in your database could be a maintenance nightmare
SQL Server 2005 and 2008 extended horizontal partitioning concept that 2000 was using by allowing partition placement on the same table within a single database. Hence, planning, implementation and maintenance is very easy and manageable. It also improves the performance, reduce contention and increase availability of data.

How to implement Partitioning in SQL Server 2005/2008?

Lets assume you have Sales database that has a table named SalesLog. Your goal is to to keep 3 years of data at any given point and purge old data. You also know that your mostly report requires data for 6 month to analyze sales detail. There is no update or delete operation on table. Table structure is:

SaleDate datetime,

.....................
.....................



Below is step by step to implement partitioning on SalesLog table to meet requirement and also to gain performance.

Step 1

As we identified, 3 years is life cycle of data, mostly report requires 6 month of data and table has only insert and select operation. So, it would be ideal to have 7 partitions with separate file groups. Main Reasons:

Performance
Data Backup Strategy. Since, data is static/read only so we can implement File group backup instead of performing full backup. Number of filegroups and files per database could be up to 32767 in both 32 bit and 64 bit SQL Server
Lets Alter database to add filegroups using command below:

Alter Database Sales Add FileGroup [PartitionFG1]
Alter Database Sales Add FileGroup [PartitionFG2]
Alter Database Sales Add FileGroup [PartitionFG3]
Alter Database Sales Add FileGroup [PartitionFG4]
Alter Database Sales Add FileGroup [PartitionFG5]
Alter Database Sales Add FileGroup [PartitionFG6]
Alter Database Sales Add FileGroup [PartitionFG7]

Step 2

Add files to filegroups:

Alter Database Sales Add File
(
Name = 'Partition_Data1',
FileName = 'F:\MSSQL\Data\Partition_Data1.ndf',
Size = 10240 MB
)
To FileGroup [PartitionFG1]

.
.
.
.
.
.
.

Alter Database Sales Add File
(
Name = 'Partition_Data7',
FileName = 'F:\MSSQL\Data\Partition_Data7.ndf',
Size = 10240 MB
)
To FileGroup [PartitionFG7]



Step 3

Create Partition Function to define data range. It will map row of SalesLog table into partitions based on the values of SaleDate column.

use Sales
GO
Create Partition Function SalesLogPFN (datetime)
as
range RIGHT for values
(
'2008-01-01',
'2008-07-01',
'2009-01-01',
'2009-07-01',
'2010-01-01',
'2010-07-01'
)

Data Type specified in Partition Function must be the same as Partition Column in Table that you are going to partition. Partition function as such is not directly related to any table. Possible value for range is RIGHT and LEFT. In our case, 1st Partition will contain rows that has SaleDate less than 2008-01-01, 2nd Partition will have row starting from 2008-01-01 up to 2008-06-30 23:59:59:900 and similarly 3rd and others.

Step 4

Create Partition Scheme to link partition function to proper file groups that we have created as part of Step 2. So, in our case, we created 7 file groups which is correct as we have 6 value for partition function.

use Sales
GO

Create Partition Scheme SalesLogScheme
AS
PARTITION SalesLogPFN
TO ([PartitionFG1], [PartitionFG2], [PartitionFG3], [PartitionFG4], [PartitionFG5], [PartitionFG6], [PartitionFG7])

Step 5

Now, we need to create clustered Index on SaleDate column that will place data to corresponding file groups automatically based on definition of SalesLogScheme

CREATE CLUSTERED INDEX [IDX_SalesLog_SaleDate] ON [dbo].[SalesLog]([SaleDate]) ON SalesLogScheme ([SaleDate])

It is important to create clustered index aligned which is by creating index on column that is partitioned. If a clustered index is not aligned then any merging of partition will require us to drop the clustered index along with the non-clustered indexes and rebuild them.

Now your are all set.

Conclusion

Partitioning a table divides the table and its indexes into smaller chunks so that it can be easily maintained as maintenance operation can be applied partition by partition basis and also improves query performance as optimizer can use proper queries to appropriate partitions instead of querying entire table.

Change Data Capture

It is new feature in SQL Server 2008 that provides feature to track all data changes without writing triggers. CDC process works using transaction log. When table is enabled for CDC immediately after any changes made to table is available for tracking. Meta data change is also written along with the actual change.



Before we start looking into implementation, lets look at pre-requisition:

SQL Server 2008 Enterprise Edition

Sysadmin permission required at database level to enable CDC

dbo permission required to enable CDC at table level

Primary Key required in table only when net change is required

Now, lets start how to implement it:

I have not used any particular database name or table name while writing step by step implementation guide. Hence please follow syntax pattern with actual name for value indicated within <> brackets

Step 1: Enable CDC at database Level



Step 2: Enable CDC at Table Level







You are all set to start using changed data tracked.



You may visit http://technet.microsoft.com/en-us/library/bb510744.aspx for list of functions available for CDC

Implementation Overview





Conclusion

CDC is an excellent feature in SQL Server 2008 but it is very important to use it with proper planning. A lot many number of records will be added to the tracked table so ensure you have enough space to store changed data.




It is new feature in SQL Server 2008 that provides feature to track all data changes without writing triggers. CDC process works using transaction log. When table is enabled for CDC immediately after any changes made to table is available for tracking. Meta data change is also written along with the actual change.



Before we start looking into implementation, lets look at pre-requisition:

SQL Server 2008 Enterprise Edition

Sysadmin permission required at database level to enable CDC

dbo permission required to enable CDC at table level

Primary Key required in table only when net change is required

Now, lets start how to implement it:

I have not used any particular database name or table name while writing step by step implementation guide. Hence please follow syntax pattern with actual name for value indicated within <> brackets

Step 1: Enable CDC at database Level



Step 2: Enable CDC at Table Level







You are all set to start using changed data tracked.



You may visit http://technet.microsoft.com/en-us/library/bb510744.aspx for list of functions available for CDC

Implementation Overview





Conclusion

CDC is an excellent feature in SQL Server 2008 but it is very important to use it with proper planning. A lot many number of records will be added to the tracked table so ensure you have enough space to store changed data.

What is difference between Clustered Index and Non Clustered Index?

- A Clustered Index consists of index as well as data pages. Clustered Index is not just an index but also contains the table data. A clustered index is organized as a B-tree where the non-leaf nodes are index pages and the leaf nodes are data pages.

- A Non-clustered index is organized as a B-tree but it consists of only index pages. The leaf nodes in a non-clustered index are not data pages, but contains pointer for individual rows in a data pages.

Thursday, 28 October 2010

SSIS Incremental Loads

SSIS Design Pattern - Incremental Loads
Introduction

Loading data from a data source to SQL Server is a common task. It's used in Data Warehousing, but increasingly data is being staged in SQL Server for non-Business-Intelligence purposes.

Maintaining data integrity is key when loading data into any database. A common way of accomplishing this is to truncate the destination and reload from the source. While this method ensures data integrity, it also loads a lot of data that was just deleted.

Incremental loads are a faster and use less server resources. Only new or updated data is touched in an incremental load.

When To Use Incremental Loads

Use incremental loads whenever you need to load data from a data source to SQL Server.

Incremental loads are the same regardless of which database platform or ETL tool you use. You need to detect new and updated rows - and separate these from the unchanged rows.

Incremental Loads in Transact-SQL

I will start by demonstrating this with T-SQL:

0. (Optional, but recommended) Create two databases: a source and destination database for this demonstration:


CREATE DATABASE [SSISIncrementalLoad_Source]

CREATE DATABASE [SSISIncrementalLoad_Dest]
1. Create a source named tblSource with the columns ColID, ColA, ColB, and ColC; make ColID is a primary unique key:

USE SSISIncrementalLoad_Source
GO
CREATE TABLE dbo.tblSource
(ColID int NOT NULL
,ColA varchar(10) NULL
,ColB datetime NULL constraint df_ColB default (getDate())
,ColC int NULL
,constraint PK_tblSource primary key clustered (ColID))

2. Create a Destination table named tblDest with the columns ColID, ColA, ColB, ColC:

USE SSISIncrementalLoad_Dest
GO
CREATE TABLE dbo.tblDest
(ColID int NOT NULL
,ColA varchar(10) NULL
,ColB datetime NULL
,ColC int NULL)

3. Let's load some test data into both tables for demonstration purposes:

USE SSISIncrementalLoad_Source
GO

-- insert an "unchanged" row
INSERT INTO dbo.tblSource
(ColID,ColA,ColB,ColC)
VALUES(0, 'A', '1/1/2007 12:01 AM', -1)

-- insert a "changed" row
INSERT INTO dbo.tblSource
(ColID,ColA,ColB,ColC)
VALUES(1, 'B', '1/1/2007 12:02 AM', -2)

-- insert a "new" row
INSERT INTO dbo.tblSource
(ColID,ColA,ColB,ColC)
VALUES(2, 'N', '1/1/2007 12:03 AM', -3)

USE SSISIncrementalLoad_Dest
GO

-- insert an "unchanged" row
INSERT INTO dbo.tblDest
(ColID,ColA,ColB,ColC)
VALUES(0, 'A', '1/1/2007 12:01 AM', -1)

-- insert a "changed" row
INSERT INTO dbo.tblDest
(ColID,ColA,ColB,ColC)
VALUES(1, 'C', '1/1/2007 12:02 AM', -2)

4. You can view new rows with the following query:

SELECT s.ColID, s.ColA, s.ColB, s.ColC
FROM SSISIncrementalLoad_Source.dbo.tblSource s
LEFT JOIN SSISIncrementalLoad_Dest.dbo.tblDest d ON d.ColID = s.ColID
WHERE d.ColID IS NULL

This should return the "new" row - the one loaded earlier with ColID = 2 and ColA = 'N'. Why? The LEFT JOIN and WHERE clauses are the key. Left Joins return all rows on the left side of the join clause (SSISIncrementalLoad_Source.dbo.tblSource in this case) whether there's a match on the right side of the join clause (SSISIncrementalLoad_Dest.dbo.tblDest in this case) or not. If there is no match on the right side, NULLs are returned. This is why the WHERE clause works: it goes after rows where the destination ColID is NULL. These rows have no match in the LEFT JOIN, therefore they must be new.

This is only an example. You occasionally find database schemas that are this easy to load. Occasionally. Most of the time you have to include several columns in the JOIN ON clause to isolate truly new rows. Sometimes you have to add conditions in the WHERE clause to refine the definition of truly new rows.

Incrementally load the row ("rows" in practice) with the following T-SQL statement:

INSERT INTO SSISIncrementalLoad_Dest.dbo.tblDest
(ColID, ColA, ColB, ColC)
SELECT s.ColID, s.ColA, s.ColB, s.ColC
FROM SSISIncrementalLoad_Source.dbo.tblSource s
LEFT JOIN SSISIncrementalLoad_Dest.dbo.tblDest d ON d.ColID = s.ColID
WHERE d.ColID IS NULL

5. There are many ways by which people try to isolate changed rows. The only sure-fire way to accomplish it is to compare each field. View changed rows with the following T-SQL statement:

SELECT d.ColID, d.ColA, d.ColB, d.ColC
FROM SSISIncrementalLoad_Dest.dbo.tblDest d
INNER JOIN SSISIncrementalLoad_Source.dbo.tblSource s ON s.ColID = d.ColID
WHERE (
(d.ColA != s.ColA)
OR (d.ColB != s.ColB)
OR (d.ColC != s.ColC)
)

This should return the "changed" row we loaded earlier with ColID = 1 and ColA = 'C'. Why? The INNER JOIN and WHERE clauses are to blame - again. The INNER JOIN goes after rows with matching ColID's because of the JOIN ON clause. The WHERE clause refines the resultset, returning only rows where the ColA's, ColB's, or ColC's don't match and the ColID's match. This is important. If there's a difference in any or some or all the rows (except ColID), we want to update it.

Extract-Transform-Load (ETL) theory has a lot to say about when and how to update changed data. You will want to pick up a good book on the topic to learn more about the variations.
To update the data in our destination, use the following T-SQL:

UPDATE d
SET
d.ColA = s.ColA
,d.ColB = s.ColB
,d.ColC = s.ColC
FROM SSISIncrementalLoad_Dest.dbo.tblDest d
INNER JOIN SSISIncrementalLoad_Source.dbo.tblSource s ON s.ColID = d.ColID
WHERE (
(d.ColA != s.ColA)
OR (d.ColB != s.ColB)
OR (d.ColC != s.ColC)
)

Incremental Loads in SSIS

Let's take a look at how you can accomplish this in SSIS using the Lookup Transformation (for the join functionality) combined with the Conditional Split (for the WHERE clause conditions) transformations.

Before we begin, let's reset our database tables to their original state using the following query:

USE SSISIncrementalLoad_Source
GO

TRUNCATE TABLE dbo.tblSource

-- insert an "unchanged" row
INSERT INTO dbo.tblSource
(ColID,ColA,ColB,ColC)
VALUES(0, 'A', '1/1/2007 12:01 AM', -1)

-- insert a "changed" row
INSERT INTO dbo.tblSource
(ColID,ColA,ColB,ColC)
VALUES(1, 'B', '1/1/2007 12:02 AM', -2)

-- insert a "new" row
INSERT INTO dbo.tblSource
(ColID,ColA,ColB,ColC)
VALUES(2, 'N', '1/1/2007 12:03 AM', -3)

USE SSISIncrementalLoad_Dest
GO

TRUNCATE TABLE dbo.tblDest

-- insert an "unchanged" row
INSERT INTO dbo.tblDest
(ColID,ColA,ColB,ColC)
VALUES(0, 'A', '1/1/2007 12:01 AM', -1)

-- insert a "changed" row
INSERT INTO dbo.tblDest
(ColID,ColA,ColB,ColC)
VALUES(1, 'C', '1/1/2007 12:02 AM', -2)

Next, create a new project using Business Intelligence Development Studio (BIDS). Name the project SSISIncrementalLoad:

Once the project loads, open Solution Explorer and rename Package1.dtsx to SSISIncrementalLoad.dtsx:

When prompted to rename the package object, click the Yes button. From the toolbox, drag a Data Flow onto the Control Flow canvas:



Double-click the Data Flow task to edit it. From the toolbox, drag and drop an OLE DB Source onto the Data Flow canvas:



Double-click the OLE DB Source connection adapter to edit it:



Click the New button beside the OLE DB Connection Manager dropdown:

Click the New button here to create a new Data Connection:

Enter or select your server name. Connect to the SSISIncrementalLoad_Source database you created earlier. Click the OK button to return to the Connection Manager configuration dialog. Click the OK button to accept your newly created Data Connection as the Connection Manager you wish to define. Select "dbo.tblSource" from the Table dropdown:



Click the OK button to complete defining the OLE DB Source Adapter.

Drag and drop a Lookup Transformation from the toolbox onto the Data Flow canvas. Connect the OLE DB connection adapter to the Lookup transformation by clicking on the OLE DB Source and dragging the green arrow over the Lookup and dropping it. Right-click the Lookup transformation and click Edit (or double-click the Lookup transformation) to edit:



When the editor opens, click the New button beside the OLE DB Connection Manager dropdown (as you did earlier for the OLE DB Source Adapter). Define a new Data Connection - this time to the SSISIncrementalLoad_Dest database. After setting up the new Data Connection and Connection Manager, configure the Lookup transformation to connect to "dbo.tblDest":



Click the Columns tab. On the left side are the columns currently in the SSIS data flow pipeline (from SSISIncrementalLoad_Source.dbo.tblSource). On the right side are columns available from the Lookup destination you just configured (from SSISIncrementalLoad_Dest.dbo.tblDest). Follow the following steps:

1. We'll need all the rows returned from the destination table, so check all the checkboxes beside the rows in the destination. We need these rows for our WHERE clauses and for our JOIN ON clauses.

2. We do not want to map all the rows between the source and destination - we only want to map the columns named ColID between the database tables. The Mappings drawn between the Available Input Columns and Available Lookup Columns define the JOIN ON clause. Multi-select the Mappings between ColA, ColB, and ColC by clicking on them while holding the Ctrl key. Right-click any of them and click "Delete Selected Mappings" to delete these columns from our JOIN ON clause.

3. Add the text "Dest_" to each column's Output Alias. These rows are being appended to the data flow pipeline. This is so we can distinguish between Source and Destination rows farther down the pipeline:

Next we need to modify our Lookup transformation behavior. By default, the Lookup operates as an INNER JOIN - but we need a LEFT (OUTER) JOIN. Click the "Configure Error Output" button to open the "Configure Error Output" screen. On the "Lookup Output" row, change the Error column from "Fail component" to "Ignore failure". This tells the Lookup transformation "If you don't find an INNER JOIN match in the destination table for the Source table's ColID value, don't fail." - which also effectively tells the Lookup "Don't act like an INNER JOIN, behave like a LEFT JOIN":

Click OK to complete the Lookup transformation configuration.

From the toolbox, drag and drop a Conditional Split Transformation onto the Data Flow canvas. Connect the Lookup to the Conditional Split as shown. Right-click the Conditional Split and click Edit to open the Conditional Split Editor:



Expand the NULL Functions folder in the upper right of the Conditional Split Transformation Editor. Expand the Columns folder in the upper left side of the Conditional Split Transformation Editor. Click in the "Output Name" column and enter "New Rows" as the name of the first output. From the NULL Functions folder, drag and drop the "ISNULL( <> )" function to the Condition column of the New Rows condition:

Next, drag Dest_ColID from the columns folder and drop it onto the "<>" text in the Condition column. "New Rows" should now be defined by the condition "ISNULL( [Dest_ColID] )". This defines the WHERE clause for new rows - setting it to "WHERE Dest_ColID Is NULL".

Type "Changed Rows" into a second Output Name column. Add the expression "(ColA != Dest_ColA) || (ColB != Dest_ColB) || (ColC != Dest_ColC)" to the Condition column for the Changed Rows output. This defines our WHERE clause for detecting changed rows - setting it to "WHERE ((Dest_ColA != ColA) OR (Dest_ColB != ColB) OR (Dest_ColC != ColC))". Note "||" is used to convey "OR" in SSIS Expressions:



Change the "Default output name" from "Conditional Split Default Output" to "Unchanged Rows":

Click the OK button to complete configuration of the Conditional Split transformation.

Drag and drop an OLE DB Destination connection adapter and an OLE DB Command transformation onto the Data Flow canvas. Click on the Conditional Split and connect it to the OLE DB Destination. A dialog will display prompting you to select a Conditional Split Output (those outputs you defined in the last step). Select the New Rows output:

Next connect the OLE DB Command transformation to the Conditional Split's "Changed Rows" output:



Your Data Flow canvas should appear similar to the following:

Configure the OLE DB Destination by aiming at the SSISIncrementalLoad_Dest.dbo.tblDest table:



Click the Mappings item in the list to the left. Make sure the ColID, ColA, ColB, and ColC source columns are mapped to their matching destination columns (aren't you glad we prepended "Dest_" to the destination columns?):



Click the OK button to complete configuring the OLE DB Destination connection adapter.

Double-click the OLE DB Command to open the "Advanced Editor for OLE DB Command" dialog. Set the Connection Manager column to your SSISIncrementalLoad_Dest connection manager:



Click on the "Component Properties" tab. Click the elipsis (button with "...") beside the SQLCommand property:

The String Value Editor displays. Enter the following parameterized T-SQL statement into the String Value textbox:

UPDATE dbo.tblDest
SET
ColA = ?
,ColB = ?
,ColC = ?
WHERE ColID = ?



The question marks in the previous parameterized T-SQL statement map by ordinal to columns named "Param_0" through "Param_3". Map them as shown below - effectively altering the UPDATE statement for each row to read:

UPDATE SSISIncrementalLoad_Dest.dbo.tblDest
SET
ColA = SSISIncrementalLoad_Source.dbo.ColA
,ColB = SSISIncrementalLoad_Source.dbo.ColB
,ColC = SSISIncrementalLoad_Source.dbo.ColC
WHERE ColID = SSISIncrementalLoad_Source.dbo.ColID

Note the query is executed on a row-by-row basis. For performance with large amounts of data, you will want to employ set-based updates instead.

Click the OK button when mapping is completed.

Your Data Flow canvas should look like that pictured below:



If you execute the package with debugging (press F5), the package should succeed and appear as shown here:



Note one row takes the "New Rows" output from the Conditional Split, and one row takes the "Changed Rows" output from the Conditional Split transformation. Although not visible, our third source row doesn't change, and would be sent to the "Unchanged Rows" output - which is simply the default Conditional Split output renamed. Any row that doesn't meet any of the predefined conditions in the Conditional Split is sent to the default output.

SQL Server Interview Questions 4

61. Can you link only other SQL Servers or any database servers such as Oracle?
We can link any server provided we have the OLE-DB provider from Microsoft to allow a link. For Oracle we have a OLE-DB provider for oracle that microsoft provides to add it as a linked server to the sql server group.
62. Which stored procedure will you be running to add a linked server?
sp_addlinkedserver, sp_addlinkedsrvlogin
63. What are the OS services that the SQL Server installation adds?
MS SQL SERVER SERVICE, SQL AGENT SERVICE, DTC (Distribution transac co-ordinator)
64. Can you explain the role of each service?
SQL SERVER - is for running the databases SQL AGENT - is for automation such as Jobs, DB Maintanance, Backups DTC - Is for linking and connecting to other SQL Servers
65. How do you troubleshoot SQL Server if its running very slow?
First check the processor and memory usage to see that processor is not above 80% utilization and memory not above 40-45% utilization then check the disk utilization using Performance Monitor, Secondly, use SQL Profiler to check for the users and current SQL activities and jobs running which might be a problem. Third would be to run UPDATE_STATISTICS command to update the indexes
66. Due to N/W or Security issues client is not able to connect to server or vice versa. How do you troubleshoot?
First I will look to ensure that port settings are proper on server and client Network utility for connections. ODBC is properly configured at client end for connection; Makepipe & readpipe are utilities to check for connection. Makepipe is run on Server and readpipe on client to check for any connection issues.
67. What are the authentication modes in SQL Server?
Windows mode and mixed mode (SQL & Windows).
68. Where do you think the users names and passwords will be stored in sql server?
They get stored in master db in the sysxlogins table.
69. What is log shipping? Can we do logshipping with SQL Server 7.0?
Logshipping is a new feature of SQL Server 2000. We should have two SQL Server - Enterprise Editions. From Enterprise Manager we can configure the logshipping. In logshipping the transactional log file from one server is automatically updated into the backup database on the other server. If one server fails, the other server will have the same db and we can use this as the DR (disaster recovery) plan.
70. SQL Server crashed and you are rebuilding the databases including the master database what procedure to you follow?
For restoring the master db we have to stop the SQL Server first and then from command line we can type SQLSERVER –m which will basically bring it into the maintenance mode after which we can restore the master db.
71. What is BCP? When do we use it?
BulkCopy is a tool used to copy huge amount of data from tables and views. But it won’t copy the structures of the same.
72. What should we do to copy the tables, schema and views from one SQL Server to another?
We have to write some DTS packages for it.

SQL Server Interview Questions 3

41. What are triggers?
Triggers are special kind of stored procedures that get executed automatically when an INSERT, UPDATE or DELETE operation takes place on a table.
42. How many triggers you can have on a table?
In SQL Server 6.5 you could define only 3 triggers per table, one for INSERT, one for UPDATE and one for DELETE. From SQL Server 7.0 onwards, this restriction is gone, and you could create multiple triggers per each action.
43. How to invoke a trigger on demand?
In SQL Server 7.0 there's no way to control the order in which the triggers fire. In SQL Server 2000 you could specify which trigger fires first or fires last using sp_settriggerorder Triggers can't be invoked on demand. They get triggered only when an associated action (INSERT, UPDATE, DELETE) happens on the table on which they are defined.
44. What are the uses of Triggers?
Triggers are generally used to implement business rules, auditing. Triggers can also be used to extend the referential integrity checks, but wherever possible, use constraints for this purpose, instead of triggers, as constraints are much faster. Till SQL Server 7.0, triggers fire only after the data modification operation happens. So in a way, they are called post triggers. But in SQL Server 2000 you could create pre triggers also. Search SQL Server 2000 books online for INSTEAD OF triggers.
45. What is a self join? Explain it with an example.
Self join is just like any other join, except that two instances of the same table will be joined in the query. Here is an example: Employees table which contains rows for normal employees as well as managers. So, to find out the managers of all the employees, you need a self join.
46. What is normalization?
The Process of organizing relational data into tables is actually referred to as normalization.
47. What is a Stored Procedure?
Its nothing but a set of T-SQL statements combined to perform a single task of several tasks. Its basically like a Macro so when you invoke the Stored procedure, you actually run a set of statements.
48. Can you give an example of Stored Procedure?
sp_helpdb , sp_who2, sp_renamedb are a set of system defined stored procedures. We can also have user defined stored procedures which can be called in similar way.
49. What is the basic difference between clustered and a non-clustered index?
The difference is that, Clustered index is unique for any given table and we can have only one clustered index on a table. The leaf level of a clustered index is the actual data and the data is resorted in case of clustered index. Whereas in case of non-clustered index the leaf level is actually a pointer to the data in rows so we can have as many non-clustered indexes as we can on the db.
50. When do we use the UPDATE_STATISTICS command?
This command is basically used when we do a large processing of data. If we do a large amount of deletions any modification or Bulk Copy into the tables, we need to basically update the indexes to take these changes into account. UPDATE_STATISTICS updates the indexes on these tables accordingly.
51. Which TCP/IP port does SQL Server run on?
SQL Server runs on port 1433 but we can also change it for better security.
52. From where can you change the default port?
From the Network Utility TCP/IP properties –> Port number.both on client and the server.
53. Can we use Truncate command on a table which is referenced by FOREIGN KEY?
No. We cannot use Truncate command on a table with Foreign Key because of referential integrity.
54. What is the use of DBCC commands?
DBCC stands for database consistency checker. We use these commands to check the consistency of the databases, i.e., maintenance, validation task and status checks.
55. What command do we use to rename a db?
sp_renamedb ‘oldname’ , ‘newname’
56. Sometimes sp_reanmedb may not work you know because if some one is using the db it will not accept this command so what do you think you can do in such cases?
In such cases we can first bring to db to single user using sp_dboptions and then we can rename that db and then we can rerun the sp_dboptions command to remove the single user mode.
57. What is the difference between a HAVING CLAUSE and a WHERE CLAUSE?
Having Clause is basically used only with the GROUP BY function in a query. WHERE Clause is applied to each row before they are part of the GROUP BY function in a query.
58. What do you mean by COLLATION?
Collation is basically the sort order. There are three types of sort order Dictionary case sensitive, Dictonary - case insensitive and Binary.
59. When do you use SQL Profiler?
SQL Profiler utility allows us to basically track connections to the SQL Server and also determine activities such as which SQL Scripts are running, failed jobs etc..
60. What is a Linked Server?
Linked Servers is a concept in SQL Server by which we can add other SQL Server to a Group and query both the SQL Server dbs using T-SQL Statements.

SQL Server Interview Questions 2

21. What is blocking and how would you troubleshoot it?
Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first.
22. How to restart SQL Server in single user mode?
SQL Server can be started from command line, using the SQLSERVR.EXE. This EXE has some very important parameters with which a DBA should be familiar with. -m is used for starting SQL Server in single user mode and -f is used to start the SQL Server in minimal confuguration mode.
23. what are the DBCC commands that you commonly use for database maintenance?
DBCC CHECKDB - Ensures that tables in the db and the indexes are correctly linked.and DBCC CHECKALLOC - To check that all pages in a db are correctly allocated. DBCC SQLPERF - It gives report on current usage of transaction log in percentage. DBCC CHECKFILEGROUP - Checks all tables file group for any damage.
24. What are statistics?
Statistics determine the selectivity of the indexes. If an indexed column has unique values then the selectivity of that index is more, as opposed to an index with non-unique values. Query optimizer uses these indexes in determining whether to choose an index or not while executing a query.
25. Under what circumstances they go out of date, how do you update statistics?
Some situations under which you should update statistics:
1) If there is significant change in the key values in the index
2) If a large amount of data in an indexed column has been added, changed, or removed (that is, if the distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated
3) Database is upgraded from a previous version Look up SQL Server books online for the following commands: UPDATE STATISTICS, STATS_DATE, DBCC SHOW_STATISTICS, CREATE STATISTICS, DROP STATISTICS, sp_autostats, sp_createstats, sp_updatestats
26. What are the different ways of moving data/databases between servers and databases in SQL Server?
BACKUP/RESTORE, dettaching and attaching databases, replication, DTS, BCP, logshipping, INSERT...SELECT, SELECT...INTO, creating INSERT scripts to generate data.
27. Explian different types of BACKUPs avaialabe in SQL Server?
Full database backup, differential database backup, transaction log backup, filegroup backup.
28. What is database replicaion?
Replication is the process of copying/moving data between databases on the same or different servers.
29. What are the different types of replication you can set up in SQL Server?
SQL Server supports the following types of replication scenarios:
* Snapshot replication
* Transactional replication (with immediate updating subscribers, with queued updating subscribers)
* Merge replication
30. How to determine the service pack currently installed on SQL Server?
The global variable @@Version stores the build number of the sqlservr.exe, which is used to determine the service pack installed.
31. What are cursors?
Cursors allow row-by-row prcessing of the resultsets.
32. Explain different types of cursors.
Types of cursors:
Static, Dynamic, Forward-only, Keyset-driven.
33. What are the disadvantages of cursors?
Disadvantages of cursors:
Each time you fetch a row from the cursor, it results in a network roundtrip, where as a normal SELECT query makes only one rowundtrip, however large the resultset is. Cursors are also costly because they require more resources and temporary storage (results in more IO operations). Furthere, there are restrictions on the SELECT statements that can be used with some types of cursors.
34. How can you avoid cursors?
Most of the times, set based operations can be used instead of cursors. Another situation in which developers tend to use cursors: You need to call a stored procedure when a column in a particular row meets certain condition. You don't have to use cursors for this. This can be achieved using WHILE loop, as long as there is a unique key to identify each row.
35. What is a join?
Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table.
36. Explain different types of joins.
Types of joins:
INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.
37. Can you have a nested transaction?
Yes, very much.
38. What is an extended stored procedure?
An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored procedures using the EXEC statement.
39. Can you instantiate a COM object by using T-SQL?
Yes, you can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using sp_OACreate stored procedure.
40. What is the system function to get the current user's user id?
USER_ID(). Some more system functions: USER_NAME(), SYSTEM_USER, SESSION_USER, CURRENT_USER, USER, SUSER_SID(), HOST_NAME().