Tuesday, December 27, 2011

Sql Server Moving System Databases


Today I am going to discuss how we can move the system database files (data file .mdf & log file .ldf) to a different location within the same machine. There are some exceptions, which are discussed at the bottom.

Different scenarios where we need to move the files:

1) Planned relocation - ex: you installed on the C drive & realized those files should not be on the OS drive :)
2) Relocation for scheduled disk maintenance
3) Failure recovery - hardware recovery
   ... and there could be many more.

Steps :

1) Identify the files used by the database that you want to move:
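
For example, the files belonging to tempdb can be listed from sys.master_files (a minimal sketch; substitute the database you are moving):

use master
go
select name, physical_name, type_desc
from sys.master_files
where database_id = DB_ID(N'tempdb')
go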

2) Point SQL Server at the new location by running ALTER DATABASE ... MODIFY FILE for each file (this only updates the metadata; the files themselves are moved in step 4):

use master
go
alter database tempdb modify file (name = 'tempdev', filename = 'c:\tempDB.MDF')
go

3) Stop the SQL Server engine, either from SQL Server Management Studio (or) from SQL Server Configuration Manager (sqlservermanager10.msc in the Run command for SQL Server 2008).

4) Copy the files from the old location to the new location.

5) Restart SQL Server. From the restart onwards, SQL Server will use the files from the new location.

6) To verify that SQL Server is using the files from the new location, run this query:
    SELECT name, physical_name AS CurrentLocation, state_desc
FROM sys.master_files
WHERE database_id = DB_ID(N'tempdb');

Exceptions:
1) For tempdb, we need not copy the files to the new location, as tempdb files are reinitialized on every SQL Server restart. Even if we don't copy the files, SQL Server will create new files in the new location.
2) The Resource database can't be moved.
3) For the master database, we have to change the startup parameters to point to the new location:
    SQL Server Configuration Manager --> SQL Server Services --> right click --> Properties --> Advanced --> Startup Parameters
This has to be done before we restart the service.
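
The startup parameters line looks something like the one below (a sketch with assumed paths; -d points at the master data file, -l at the master log file & -e at the error log):

-dD:\NewLocation\master.mdf;-eC:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Log\ERRORLOG;-lD:\NewLocation\mastlog.ldf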

4) For the msdb database, if Database Mail is configured, make sure Service Broker is enabled, since Database Mail uses the msdb database:

SELECT is_broker_enabled
FROM sys.databases
WHERE name = N'msdb';
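
If the query returns 0, Service Broker can be turned back on (a sketch; this may need exclusive access to msdb):

alter database msdb set enable_broker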

If there is a hardware failure, then moving the files follows a separate procedure:
1) Stop the SQL Server instance.
2) Start the instance in master-only recovery mode by typing the command from the command prompt:

   NET START MSSQLSERVER /f /T3608         (for the default instance)
   NET START MSSQL$instancename /f /T3608  (for a named instance)

3) For each file, run the ALTER DATABASE command as mentioned above.
4) Stop the instance (NET STOP MSSQLSERVER).
5) Move the files to the new location.
6) Start the instance (NET START MSSQLSERVER).
7) Check whether the service is using the new location, as mentioned above.
Happy Reading.. Hope this helps.. 

Sunday, December 25, 2011

Sql Server Isolation Levels

Isolation is one of the ACID properties. SQL Server implements its different isolation levels using locks.
The term isolation means isolating a transaction so that it doesn't impact other transactions happening in different user sessions.
SQL Server 2008 defines different isolation levels to cater to needs ranging from highly concurrent execution to accuracy of the data.

What is the default isolation level of SQL Server?
Well .. it is Read Committed. So by default, if you do not specify any isolation level for a query, it will use the default isolation level of Read Committed.

To find the current isolation level, execute the query:
dbcc useroptions

Before going into the isolation levels, let us see some of the locks that SQL Server uses during transactions:

Row Level Locks:

Shared locks (S), Update locks (U), Exclusive locks (X)

By default, reads acquire shared locks; these locks allow more concurrency.
Once a row needs to be updated, an update lock is applied & then an exclusive lock, so that only that transaction has exclusive access to the row.

Updates can additionally take exclusive locks at three granularities:
1) table level exclusive lock
2) page level exclusive lock
3) row level exclusive lock

Update Locks:

1) Update locks provide high concurrency.
2) They isolate rows before the update happens.
3) They do not block SELECT statements, which means readers can still read the data.
4) They upgrade to an exclusive lock to update a row.

Page & table level locks will be discussed in a separate post.

Let us go into the isolation levels now:

1) Read Uncommitted:
    In this isolation level, transaction T1 can have simultaneous access to data that is being modified by another transaction T2 running in a different session. It means T1 will have dirty reads, as T2 can either commit or roll back the data. Even if T2 rolls back the changes, they are still visible to T1. However, this isolation level provides more concurrency.
  Pros: No read locks required, no read/write blocking
  Cons: High inconsistency in data reads, as other transactions can roll back their changes
  Usage: Used in transactions where data consistency is not a high priority & stale data is acceptable.

NOTE: Read Uncommitted & the NOLOCK hint both do the same thing.

ex:
Create the table & insert a few records into it:
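
A minimal sketch of the kind of setup the screenshots showed (the table name, columns & values are hypothetical & reused in the examples below):

create table EmpTech
(
    id int primary key,
    name varchar(50),
    technology varchar(50)
)
go
insert into EmpTech values (1, 'John', 'sql server')
insert into EmpTech values (2, 'Mike', 'oracle')
insert into EmpTech values (3, 'Sara', 'java')
go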

Begin the transaction & update the records; do not commit (or) roll back:
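
For example, in the first session (continuing the hypothetical EmpTech table):

begin transaction
update EmpTech set technology = 'mysql' where id = 1
-- no commit / rollback yet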

Query the records in a separate window in read uncommitted mode. It returns the result set, as it doesn't hold read locks & reads the dirty data:
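
In the second session, something like this (either the SET statement or the nolock hint works):

set transaction isolation level read uncommitted
select * from EmpTech
-- equivalent: select * from EmpTech with (nolock)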

T2 can roll back the updates, which means the data shown in transaction T1 was dirty.

2) Read Committed:
    In this mode, transaction T1 can read only committed data. Each time a row is accessed, a shared lock is applied & released once the row is read. If another transaction T2 modifies data, then T1 can't read the data modified by T2 until it is committed or rolled back.

ex: If T2 modifies row2 & the T1 query needs to return row1 & row2, then T1 reads row1 but waits for row2 to be released by T2. This creates some blocking.

Pros: High consistency of data
Cons: Can cause blocking until data is committed or rolled back
Usage: This is the SQL Server default isolation level.

In the above example, if updates are done & not committed or rolled back, then the select query will not return a result set, as it will be blocked.

ex :
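
A sketch of the blocked reader, while the first session still holds its uncommitted update:

set transaction isolation level read committed
select * from EmpTech
-- blocks until the first session commits or rolls back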

The data is not returned until it is either committed or rolled back, causing some blocking, but the data will be highly consistent.

If you take the spid of the above transaction & query the locks for that spid, you will find that a shared lock has been requested & the session is waiting on it.
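
The locks can be inspected through sys.dm_tran_locks (a sketch; 53 stands in for the blocked spid):

select request_session_id, resource_type, request_mode, request_status
from sys.dm_tran_locks
where request_session_id = 53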

3) Repeatable Read:
    In this isolation level, transaction T1 can only access committed data, with the additional guarantee that the data it has already read cannot be modified until the transaction completes. This is achieved by holding shared locks on the rows that were read. However, it doesn't protect against new inserts.

 Pros: High data consistency
 Cons: Locking & blocking; rows stay locked until the transaction completes. However, it doesn't protect against phantom rows.
Usage: Rare
  The below example illustrates the situation (running the reader under the default read committed level):
 Create the table & insert some records into it.
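
The same hypothetical EmpTech table from the read uncommitted example works here:

create table EmpTech
(
    id int primary key,
    name varchar(50),
    technology varchar(50)
)
go
insert into EmpTech values (1, 'John', 'sql server')
insert into EmpTech values (2, 'Mike', 'oracle')
insert into EmpTech values (3, 'Sara', 'java')
go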

Begin a transaction & update the 3rd row. Don't commit:
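
In the first session (a sketch):

begin transaction
update EmpTech set technology = 'c#' where id = 3
-- not committed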

As the transaction is not yet committed, the select statement gets blocked on the 3rd row:
it has already read the 1st & 2nd rows & is waiting on the 3rd row.
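
The reader in the second session looks something like this (running under the default read committed level):

select * from EmpTech where technology = 'sql server'
-- reads rows 1 & 2, then waits on row 3's exclusive lock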

The above query is looking for the condition technology = 'sql server'; it has already read row 1, which satisfied the condition. Now update the rows, making the 2nd row 'sql server' instead of row 1, & commit the data:
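
A sketch of the swap, issued from the first session's open transaction:

update EmpTech set technology = 'oracle' where id = 1
update EmpTech set technology = 'sql server' where id = 2
commit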

The commit releases the locks & the select query is able to read row 3. However, the data retrieved says row 1 is the 'sql server' row, whereas the record that actually matches now is row 2.
This inconsistency is caused because the select query had already read rows 1 & 2 & found row 1 satisfying the condition.
It was blocked at row 3 because a different transaction had updated row 3 & not committed it.
When rows 1 & 2 are updated & everything is committed, the lock is released & row 3 can be read.
However, the query will not look at rows 1 & 2 again to keep the data consistent.

To avoid this inconsistency of data during a transaction, repeatable read can be used.

Repeatable read locks each row that it traverses & doesn't release the locks until the complete transaction is done. In the above example, if we use repeatable read isolation, then the first update locks the 3rd row. The select locks the 1st & 2nd rows, waiting for the 3rd row. The second update then waits for the 1st & 2nd rows, which are locked by the select statement, resulting in a deadlock.
 A deadlock is acceptable in some cases, where data consistency is a must until the transaction completes.

Let us try a different scenario:

Drop the table, recreate it & insert some records into it.
Set the isolation level to repeatable read & run the select query with a begin transaction.
The transaction applies a shared lock on the row & will not release the lock until the transaction is completed.
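
Something like this in the first session (a sketch):

set transaction isolation level repeatable read
begin transaction
select * from EmpTech where id = 1
-- transaction left open; the shared lock on row 1 is held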

Run an update query in another window. The update will be blocked, as the row is already locked by the above select statement:
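
A sketch of the blocked writer in the second session:

update EmpTech set technology = 'mysql' where id = 1
-- blocks until the repeatable read transaction ends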

Now go back to the select window & commit the transaction; it releases the lock & the update statement can acquire it.

The update statement then gets executed.

However, repeatable read doesn't stop phantom reads. Phantom reads are reads that pick up newly inserted data.
Since repeatable read only locks the rows it has scanned through, it only blocks changes to the existing rows. It does not lock the table against inserting new rows, so new rows can be inserted & will be read the next time.

ex: Run the select query, without committing:
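
In the first session (a sketch):

set transaction isolation level repeatable read
begin transaction
select * from EmpTech where technology = 'sql server'
-- not committed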

Now the select query has locked the first row; however, it doesn't block any new inserts from happening.
Now run an insert in a separate window & it inserts the record:
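
A sketch of the insert in the second session:

insert into EmpTech values (4, 'Anna', 'sql server')
-- succeeds; a phantom row for the open transaction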

4) Serializable: This isolation level provides the highest data consistency, including protection against phantom rows, by applying range level locks (or) table level locks, resulting in low concurrency. Serializable isolation makes transactions behave as if they ran one at a time, in a serial fashion. Let us say 4 transactions run, viz. T1, T2, T3, T4:
they behave as if each one runs only after the previous transaction has completed.
 
 Pros: High data consistency
Cons: Severe blocking, low concurrency
Usage: Used in environments that demand high data consistency.

Ex: Use the same example from repeatable read above.
Create the table, insert records & run the select query under serializable isolation.
The select query applies a range lock, blocking any other transaction from accessing the range.
Insert statements will be blocked, as the sketches below show.
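
In the first session (a sketch):

set transaction isolation level serializable
begin transaction
select * from EmpTech where technology = 'sql server'
-- the range lock is held until commit / rollback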

Now the insert statement will be blocked, as the select statement has applied a range level lock.
The insert statement will wait until the select statement commits or rolls back:
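
A sketch of the blocked insert in the second session:

insert into EmpTech values (5, 'Ravi', 'sql server')
-- blocks until the serializable transaction ends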

5) Snapshot Isolation:
  This isolation level provides data consistency comparable to read committed & performance close to read uncommitted. This isolation level doesn't apply locks on the rows.
However, it doesn't read modified data until it is committed, meaning no dirty reads. It reads the data from a snapshot (row versions).
Let us take the same sample example.
 Update a row in the table; it locks the row:
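
In the first session (a sketch):

begin transaction
update EmpTech set technology = 'mysql' where id = 1
-- not committed; row 1 is exclusively locked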

Now set the isolation level to snapshot & run the select query.

You may see an error saying that snapshot isolation is not allowed in the database.

Alter the database to allow the snapshot isolation level:
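
A sketch (TestDB stands in for your database name):

alter database TestDB set allow_snapshot_isolation on
go
set transaction isolation level snapshot
select * from EmpTech
-- returns the last committed version of row 1, without blocking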

Now run the select statement under snapshot isolation; it will not apply locks on the rows & returns only the committed data.

Sunday, December 18, 2011

Full Text Search Internals - part 3

Overview :

Full text search provides the ability to search character-based data:
1) character data
2) varbinary data --> when html documents are used for full text search, the ms-locale meta tag defines the language for the full text search
3) xml data --> the xml:lang attribute is used to define the language for the search
4) filestream data

Full text search functionality includes (a few query sketches follow this list):
1) simple search using FREETEXT
2) prefix search using CONTAINS
3) inflectional forms of the same word (like run, ran etc.) using CONTAINS with FORMSOF(INFLECTIONAL, ...)
4) words near other words using CONTAINS with NEAR
5) ranking & weighted values using CONTAINSTABLE with ISABOUT
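
Sketches of each style, assuming the AdventureWorks Production.ProductReview table with a full text index on Comments (as set up in the basics post below):

select Comments from Production.ProductReview
where freetext(Comments, 'learning curve')                -- simple search

select Comments from Production.ProductReview
where contains(Comments, '"mount*"')                      -- prefix search

select Comments from Production.ProductReview
where contains(Comments, 'FORMSOF(INFLECTIONAL, ride)')   -- inflectional forms

select Comments from Production.ProductReview
where contains(Comments, 'mountain NEAR ride')            -- words near other words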

Components:

1) Supports close to 50 languages.
   SELECT lcid, name FROM sys.fulltext_languages
2) For each language there is a word breaker & a stemmer.
   Word breakers break the document sentences into words & decide what constitutes a word, whereas stemmers look for the inflectional forms of words. Third party word breakers can be purchased & are supported.
   Stemmers are not invoked when the full text index is populated; they are invoked when a full text search is run.
3) A per-instance thesaurus file. These are xml files on the hard disk located at
   C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\FTData

They are used for expansion sets & replacement sets (if the search term is misspelled, the thesaurus looks up the right words from the xml files, replaces them & returns the search results for the right words).
4) Stop lists (the so-called noise words) were moved to internal tables in SQL Server 2008. We can assign a stop list per full text index.

IFilters:

Filters are provided per document type: one for .doc, .docx, .pdf etc. Full text search looks in the registry & loads newly added IFilters. Filters implement the IFilter interface; third party filters are supported.
IFilters are used to parse the documents during full text index population, not during querying of the data. We can provide the type of document for each row in the table & the relevant IFilter is loaded based on that type, so the same table can support multiple document types.

New IFilters (or) third party IFilters can be loaded using the command:

exec sp_fulltext_service 'load_os_resources', 1

Service:

SQL Server 2005 used an external windows service for the word breakers & stemmers. In SQL Server 2008 those services were brought into SQL Server. The full text filter daemon launcher (FDHost) service should be up & running to support full text search; the search itself runs within the SQL Server engine.

Programming:

Full text search provides 4 programming constructs:
1) FREETEXT 2) CONTAINS are predicates
3) FREETEXTTABLE 4) CONTAINSTABLE are table valued functions.

1) FREETEXT searches whole sentences: it does the word breaking, stemming (inflectional words), thesaurus lookups & stop words, & finds the documents.
2) CONTAINS - none of the above is done automatically; you need to explicitly ask for thesaurus expansion, FORMSOF(INFLECTIONAL, ...) etc., based on what you are looking for.

---------

We can find the words that are indexed in the full text index for a table with the command:
select * from sys.dm_fts_index_keywords(DB_ID(), OBJECT_ID('HumanResources.Employee'))

The fts parser (sys.dm_fts_parser) gives the exact matches, stop words & numbers for a word given in the search.
Numbers will have "nn" in them.

sys.internal_tables gives more information on the full text internal tables.

DOCID is the unique id which refers to the rows in the table. When the table has a unique int primary key, that key acts as the DOC ID.

Thesaurus files can be modified & then reloaded.
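
In SQL Server 2008 R2 & later, the thesaurus file for a language can be reloaded from T-SQL (a sketch; 1033 is the English lcid):

exec sys.sp_fulltext_load_thesaurus_file 1033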

SQL Server Denali Improvements:

1) Indexing is made multi threaded; in 2008 it is single threaded.
2) Predicate performance is improved. ex: a search for many things like contains(c, 'tree') and contains(c, 'road') is rewritten as contains(c, 'tree and road').
3) Faster response times using streamed table valued functions, meaning we don't have to wait until all rows are returned; the rows are streamed.
4) More granular locks & minimal locking time.
5) Property based search for documents, like document author name, date created etc.
   Properties are per index & whenever the properties are updated, the full text index must be repopulated.
   IFilters must support the extraction of the properties.

-----------

Thursday, December 15, 2011

Full Text Search Basics

Sql server full text search is a feature used to scan or search large documents stored in a table: documents, html, varbinary etc. We could do a like '%...%' search, but full text searches are much more efficient & optimised.
It can also rank the results based on the proximity of the words you define, the parts of the content you search & inflectional words. We can tweak the index performance, exclude stop list words from the search etc.

By default, databases are enabled for full text search; if a database was created without it (for example through ssms), we can enable full text indexing by right clicking the database -> Properties -> check "Use full-text indexing".

To create a full text index, a catalog has to be created first to store one or more full text indexes; the full text indexes are then created on it.

The AdventureWorks database is used for the demo.

1) Creating the catalog:
Create FullText Catalog MyCatalog
This creates a folder in the directory specified during installation, to store the full text indexes on the hard drive.
The catalog stores all index files & configuration files. Full text indexes store each word from each indexed document & map it back to the table row for reference.

2) There should be a primary key on the table which is unique & non-nullable. Create a primary key if one doesn't exist.

3) Create the full text index on the table:

create fulltext index on production.productreview (comments)
key index PK_ProductReview_ProductReviewID on MyCatalog
with change_tracking auto

comments is the column we want to full text index.

PK_ProductReview_ProductReviewID is the unique primary key on the table.

MyCatalog is the catalog we want the full text index to live on.
change_tracking -> the newly created full text index needs to be populated with data & there are different population modes:
1) Full population is done for the first time after the full text index is created.
2) Incremental population can be done either based on change tracking or on an incremental timestamp.

Full text index population is a very IO intensive operation, so it must be scheduled & maintained during down time.


Creating one more, advanced, full text index on the table production.document:

create fulltext index on production.document
(
document
type column FileExtension
language 0x0
)
key index PK_Document_DocumentID on MyCatalog
with change_tracking auto

The above query says: create a full text index on the column document in the table production.document, where the document column holds data of the type specified in the FileExtension column. The file extension is used to load the appropriate filter. Language 0x0 is the neutral language, but we can specify any language to be used for parsing the text, so that the appropriate language dictionary is used to parse the text from the document.

sys.fulltext_document_types gives the full list of document types that full text indexing can support.

FreeText() is a predicate; the query engine splits the sentence into words (word breaking),
generates inflectional forms of the words (stemming) &
identifies a list of matching words based on the thesaurus.

Now querying:
A basic query:
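
A sketch of a simple query, assuming the full text index on comments created above:

select ProductReviewID, Comments
from Production.ProductReview
where freetext(Comments, 'bike')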

We can use different search conditions with contains:
1) Wild cards:  where contains(comments, '"mount*"')  matches mountain, mount, mounting etc.

2) Binary operations like AND, OR, NOT.
   ex: contains(comments, '"mountain" AND "flag" AND NOT "flagging"')

3) We can use NEAR, which lists all the documents that have both words & ranks higher the ones where the words are closer to each other.
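
A sketch using containstable so the rank is visible (the search terms are assumptions):

select kt.[KEY], kt.RANK
from containstable(Production.ProductReview, Comments, 'mountain NEAR ride') as kt
order by kt.RANK desc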

4) We can use ISABOUT to give different weights to different words; the results are ranked on the weights given. The words with more weight show up first & the rest follow in the order of the weights given.
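
A sketch of the kind of query the screenshot showed; the exact terms & weights here are assumptions, & it presumes a full text index on Person.Address(AddressLine1):

select a.AddressID, a.AddressLine1, kt.RANK
from containstable(Person.Address, AddressLine1,
     'ISABOUT ("bay*" WEIGHT(0.9), "street" WEIGHT(0.8), "view" WEIGHT(0.2))') as kt
join Person.Address as a on a.AddressID = kt.[KEY]
order by kt.RANK desc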

The above query uses containstable, which creates a temporary result table, taking bay as a wild card with different weights given to the different words. Since street has more weight than view, the bay wild card with street shows up first, then comes the rest. We join the temporary table with the actual table on address id (the pk) & then order by rank desc.

5) We can use inflectional forms, which match the grammatical forms of a word, like foot returning documents with foot, feet etc.
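
A sketch:

select Comments
from Production.ProductReview
where contains(Comments, 'FORMSOF(INFLECTIONAL, foot)')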

Anything related to logging for full text indexing can be found at
C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Log

with file names starting with SQLFT...

The full text catalogs in a database can be found using the catalog view sys.fulltext_catalogs.

The list of languages that full text search supports can be found using sys.fulltext_languages.
The lcid can be specified when we create a full text index, to search in the desired language.

A stop list is a list of words which will not be used by full text search; they are the most commonly used words in day to day language. By using a stop list, we can narrow down the words to search for.
We can add a stop list to a full text index by:


CREATE FULLTEXT STOPLIST stList
FROM SYSTEM STOPLIST;

We can query the stop list words as below:
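
A sketch using the stop word catalog views:

select stoplist_id, stopword, language
from sys.fulltext_stopwords                 -- words in user defined stop lists

select stopword, language
from sys.fulltext_system_stopwords          -- words in the system stop list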

To add or remove words from the stop list:

ALTER FULLTEXT STOPLIST stList
ADD 'pen' LANGUAGE 1033;

ALTER FULLTEXT STOPLIST stList
DROP 'pen' LANGUAGE 1033;

The language code 1033 is for English (which is nothing but the lcid discussed above).

Misc: Only one full text index is allowed per table & it can cover up to 1024 columns.