Wednesday, January 31, 2018

SQL Server Frequent mirror failovers, think beyond DBA skills

As you are a DBA you might have observed the primary databases configured in mirror gets failed over frequently / unexpectedly which causes downtime to the users.   So how do you start troubleshoot.... probably you could see the database status and once you find this was failed over to secondary, then how do you verify as you have all connectivity in place from secondary otherwise bring it back to the primary server to make sure the application connectivity remains to the users. 

But how do you troubleshoot to fix that unexpected errors, so that it will not happen again.

The first step is to start analyzing the SQL Server logs ....  you would get lot of information from it ;

SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\MSDBLog.ldf] in database [msdb] (4). The OS file handle is 0x000000000000

 
The mirroring connection to "TCP://xxxx_mir.xxx.isyntax.net:5022" has timed out for database "xxxxx" after 10 seconds without a response. Check the service and network connections.

And check in Windows Event logs as well where you can find information for hardware & OS related.

So, one of these scenarios how do you troubleshoot further and fix the issue. 

1) You could change the timeout increased to a higher value, so that if there is no response from the IO or network you can avoid frequent failovers.
2) Analyze the IO errors working with storage admin then take up the action accordingly.
3) If you observe much load happening it requires system to be tuned.
4) See any disk related errors locally and sufficient disk space available.

In my case I have observed the server is a shared database node in VMware cluster and the SAN which is hosted the storage allocated to the database node is used for the other nodes as well.    So it looks like recently there were few more nodes added which kept more load on it, hence the IOPs is not happening with the desired rate.   

Hence SQL server is getting timed out as there is no response when it writes the data on the data file.   I worked with storage admin and move one of the other node to different VMcluster and that is connected to an another SAN together.  So reduced the IO load.

Hope this info helps you troubleshooting the mirror failover issues.
 

Tuesday, January 23, 2018

SQL Code/Procedure development suggestions

Most of you familiar writing stored procedures in SQL Server, but if you follow the development standards (in my experience)  which would help debugging the code (or) in case code modifications required it is easy to point out. 

Hope the following SQL coding rules help you while writing the SQL procedures & functions.

  • Header part of the script should be updated reflecting the current changes to the script with the version details.  Suppose it is a new procedure you could mention as initial draft/release.  Later it can be added with all subsequent changes.
  • Alignment of the script content should be proper.  Proper alignment shall help programmer to read the code easily.
  • SNAPShot isolation helps reduce the locks, If the DB isolation is Snapshot and the SNAPSHOT isolation is supposed to be used in the scripts then verify that 'SET SNAPSHOT ISOLATION' is mentioned in the script. If in doubt confirm with the team which has sent the script for review and if 'SNAPSHOT ISOLATION' is mentioned, verify that SNAPSHOT ISOLATION is enabled at the database level too
  • 'NOLOCK' can be used in the code when the diry reads are allowed which is less lock operation.
  • 'SET NOCOUNT ON' statement should be mentioned in the beginning of each procedure. If the scripts require the counts and do not make use of SET NOCOUNT ON, make sure that it uses @@ROWCOUNT server variables instead, for fetching the counts.
  • It is good to assign input parameter variables to the local variables then use it in procedure code.
  • Fixed length columns in the table should use CHAR datatype all the times.
  • TOP Statement might be required/might be present in the scripts. But verify if the TOP statement mentioned in the script is intentional/required.
  • Any/All temp tables created in the procedure should be dropped at the end of the procedure
  • SELECT statements using temp tables do not require NOLOCK as locks for the temp table maintained internally by SQL Server.
  • If any other database references are present in the script, verify that the database is present, always use if exists clause to validate these cases.
  • If LINKED Server references are present in the script, verify that the correct LINKED server are referenced and the LINKED Server is present in the appropriate environment.  Better to use If exists clause.
  • Instead of multiple SET statements, a single SELECT statement can be used to assign value to variables that avoid confusion.
  • Avoid using 'SELECT *' instead, use specific column name(s) in the SELECT statement.
  •  Avoid using table variable if data stored in the table is too high.
  • Avoid using CTE's if data stored in the corresponding CTE is too high.
  • Avoid using co-related queries. See if the same can be resolved using subqueries instead.
  • Avoid using CURSOR's if possible, try while instead.  If CURSOR is needed & used, check the CURSOR is CLOSED & DEALLOCATED before end of the procedure.
  • Trigger should not be created on any table. If it is absolutely necessary see if an alternative approach can be used to eliminate the trigger.
  • Check for Unused variable declarations in the procedures, this can be removed to have only required code.  Dead code / commented code should not be there in the procedure.
  • Proper error handling should be done in the code, explaining the error handling in comments would help review the code.  
  • Execution Plan of the procedure could help validating the logic and query optimization if needed.
  • Validating the performance of the procedure can help figure out the performance bottlenecks.  Execute the procedure using parameters supplied in test environment atleast.

Wednesday, May 27, 2015

Drop database doesn't work when database configured for replication

I incurred this situation when one of the my test server configured for Transnational replication some time before and now I wanted to restore a fresh database copy on top of it, it failed with error saying 'Database is in use'.

Then I tried working to drop the database.  Since it is involved on replication it says 'database cannot be dropped unless you remove the replication'

Then tried removing the replication it says  'the object used for the replication owner is not DBO or you don't have access to remove the replication', though I am trying with sysadmin privileges.


Finally I used one of the system proc 'sp removedbreplication' and passed the Subscriber and Publisher database names as parameters then it got dropped both the subscriber and publisher under Replication Folder.


sp_removedbreplication @publisher
sp_removedbreplication @subscriber

Hope this information helps to remove the replication when it gives you an error !!!!

Thursday, March 5, 2015

SQL Server database corrupted and in SUSPECT mode, how to fix it

As a routine DBA issue, sometimes the Database goes to suspect mode and not accessible to any of  the users.  When we verfiy the SQL Server logs it says the database consistency issue  which needs to be repaired. So how do one fix the database issue......

The possibilities are the database pages might have corrupted due to inconsistency where
the particular table resides.  We can identify these tables by using various methods;

Refer MSDB database and suspect_pages table for the list of the corrupted pages.
use MSDB
Go
select * from suspect_pages
Go

Alternatively run the below command for identify the corrupted object details

DBCC CHECKDB (DB_NAME) WITH ALL_ERRORMSGS, NO_INFOMSGS;
Error :  Table error : Object ID 0, Index ID -1, Partition ID 0 ..........
CheckDB found 0 allocation errors and 1 consistency errors not associated with any single
object.
Repair_Allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB
(db_name)

Repair Method :
You can follow the below method for getting the database online by removing the corrupted
portion, but make sure you have proper approvals before doing so.  Because it removes the
corrupted pages completely from the databases.
use MASTER
GO
alter database db_name set single_user ;
Go
DBCC CHECKDB (db_name,'repair_allow_data_loss')
Go

this will give the results something like below, once it is repaired then you can make
the database into multi_user to give it to the users with proper backups being taken.

GO
alter database db_name set multi_user ;

DBCC results :
DBCC results for 'TEST'.
Service Broker Msg 9675, State 1: Message Types analyzed: 14.
Service Broker Msg 9676, State 1: Service Contracts analyzed: 6.
Service Broker Msg 9667, State 1: Services analyzed: 3.
Service Broker Msg 9668, State 1: Service Queues analyzed: 3.
Service Broker Msg 9669, State 1: Conversation Endpoints analyzed: 0.
Service Broker Msg 9674, State 1: Conversation Groups analyzed: 0.
Service Broker Msg 9670, State 1: Remote Service Bindings analyzed: 0.
Service Broker Msg 9605, State 1: Conversation Priorities analyzed: 0.
DBCC results for 'sys.sysrscols'.
There are 870 rows in 12 pages for object "sys.sysrscols".
DBCC results for 'sys.sysrowsets'.
There are 124 rows in 1 pages for object "sys.sysrowsets".
DBCC results for 'sys.sysclones'.
There are 0 rows in 0 pages for object "sys.sysclones".
DBCC results for 'sys.sysallocunits'.
There are 138 rows in 2 pages for object "sys.sysallocunits".
DBCC results for 'sys.sysfiles1'.
There are 2 rows in 1 pages for object "sys.sysfiles1".
DBCC results for 'sys.sysseobjvalues'.
There are 0 rows in 0 pages for object "sys.sysseobjvalues".
DBCC results for 'sys.syspriorities'.
There are 0 rows in 0 pages for object "sys.syspriorities".
DBCC results for 'sys.sysdbfrag'.
There are 0 rows in 0 pages for object "sys.sysdbfrag".
DBCC results for 'sys.sysfgfrag'.
There are 0 rows in 0 pages for object "sys.sysfgfrag".
DBCC results for 'sys.sysdbfiles'.
There are 2 rows in 1 pages for object "sys.sysdbfiles".
DBCC results for 'sys.syspru'.
There are 0 rows in 0 pages for object "sys.syspru".
DBCC results for 'sys.sysbrickfiles'.
There are 0 rows in 0 pages for object "sys.sysbrickfiles".
DBCC results for 'sys.sysphfg'.
There are 1 rows in 1 pages for object "sys.sysphfg".
DBCC results for 'sys.sysprufiles'.
There are 2 rows in 1 pages for object "sys.sysprufiles".
DBCC results for 'sys.sysftinds'.
There are 0 rows in 0 pages for object "sys.sysftinds".
DBCC results for 'sys.sysowners'.
There are 14 rows in 1 pages for object "sys.sysowners".
DBCC results for 'sys.sysdbreg'.
There are 0 rows in 0 pages for object "sys.sysdbreg".
DBCC results for 'sys.sysprivs'.
There are 136 rows in 1 pages for object "sys.sysprivs".
DBCC results for 'sys.sysschobjs'.
There are 2180 rows in 29 pages for object "sys.sysschobjs".
DBCC results for 'sys.syscolpars'.
There are 694 rows in 11 pages for object "sys.syscolpars".
DBCC results for 'sys.sysxlgns'.
There are 0 rows in 0 pages for object "sys.sysxlgns".
DBCC results for 'sys.sysxsrvs'.
There are 0 rows in 0 pages for object "sys.sysxsrvs".
DBCC results for 'sys  ...................   So on

Wednesday, December 3, 2014

How to read SQL Server Error log using Query Analyzer

Reading SQL Error log typically we do by going through the Management  then SQL Server error log and open respective one .  But it is little difficult to trace it out only the particular information out of the bunch of records if you activated all kind of log info.   So alternatively it could be done using SSMS by executing extended system procedures.  

xp_readerrorlog         returns the latest error log
xp_readerrorlog 1      returns the latest archived error log
xp_readerrorlog 2      returns the previous archived error log
 ..........  and so on.

You can try inserting the log records into temp table then filter only the information that you might required to analyze the issue.  I hope this would help and save the time instead of checking complete error log manually.
 
 
 
 
 
 

Tuesday, March 26, 2013

Understand the index usage details in Sql server

Understand the index usage from DMVs  in Sql server :  The below will retrieve the indexes performed look ups, scans, seeks by the user with the utilized dates.  If we see any unused indexes, could be deleted to improve the performance for Insert,update,delete statements.

select object_name(a.object_id) TABLE_NAME, a.index_id,b.name IndexName,
b.type_desc,a.last_user_seek,a.last_user_scan,a.last_user_lookup,a.last_user_update
from sys.dm_db_index_usage_stats a inner join sys.indexes b
on a.object_id = b.object_id and a.index_id = b.index_id
where a.database_id =6 and a.object_id in (object_id('xxx'), object_id ('xxx'))
order by A.OBJECT_ID,a.index_id

Monday, November 19, 2012

Find Orphan users from all databases exists in the SQL server

It is a typical requirement to find and fix the Orphan users in restored database to access the database with its underlying privileges.  In order to achieve this I have prepared a small script to find the Orphan users in all the databses exists in server.  Hope this helps to all, who are looking for this info in MS Sql server.

Script :

Declare @dbs table (id int identity(1,1), db varchar(100))

insert into @dbs (db)
select name from master.sys.databases where database_id > 4 order by name

--select * from @dbs


Declare @findOrphans table (db varchar(100), Orpuser varchar (100),USID varchar(200))
Declare @fixorphans table (db varchar(100), Script varchar (500))


Declare @a varchar(1000)
Declare @b tinyint
Declare @c tinyint
Declare @d varchar(100)

set @b = 1
select @c = MAX (id) from @dbs

while @b < @c
Begin
select @d = db from @dbs where id=@b
select @a = db+'.dbo.sp_change_users_login report ' from @dbs where id = @b
insert into @findorphans (Orpuser, UsID) exec (@a)
Update @findOrphans set db = @d where db is null
set @b = @b+1
end

select * from @findorphans