SQLServer通过DMV实现低影响的自动监控和历史场景追溯

SQLServer通过DMV实现低影响的自动监控和历史场景追溯,第1张

概述一. 背景:      我们管理和维护的数据库系统基本都是7*24小时运转的,运转时会出现什么情况谁都无法估计,往往一个平时运行很正常的系统,某天晚上突然就抽风了,而此刻并没有系统负责人在旁边,当值班同事的电话打到正在酣睡的DBA手机上时,DBA不得不朦朦胧胧的,条件反射要上数据库看看;要在家里连到公司的内外一般都需要DBA通知运维人员(也可能是个在做梦的哥们)先给自己开个VPN(某些管控严格的公

一. 背景:

     我们管理和维护的数据库系统基本都是7*24小时运转的,运转时会出现什么情况谁都无法估计,往往一个平时运行很正常的系统,某天晚上突然就抽风了,而此刻并没有系统负责人在旁边,当值班同事的电话打到正在酣睡的DBA手机上时,DBA不得不朦朦胧胧的,条件反射要上数据库看看;要在家里连到公司的内外一般都需要DBA通知运维人员(也可能是个在做梦的哥们)先给自己开个VPN(某些管控严格的公司还需要DBA打车去公司才能处理),然后通过一大堆的验证才通过慢如蜗牛(往往也和此刻的心态有关)的网络,远程连接到了公司内网出问题的数据库上,此刻DBA才开始真正开始排查起问题来;如果说等待DBA登录到问题数据库上,问题依然在出现,那还算好,可以很快找到问题并处理,但是往往这种抽风的现象是短暂的,可能持续几分钟后,系统又恢复了正常,等到DBA"跑山涉水,翻山越岭" 的好不容易登录到数据库上检查时,数据库里的进程、锁、日志一起都正常,系统运行的很Hai;于是第二天上班时老大们问起昨天晚上事故的原因时,DBA只能凭空猜测,可能是网络、数据库阻塞、抑或是程序方面的问题吧......,此刻不同部门的人都猜是其他部门管理的东西出了问题,于是大家都把手头上的数据、监控图拿出来,证明自己这里是没有问题的;如果监控网络和系统的图和APP的Log都没用出现啥问题(说实话监控也不一定准确的),那就基本要把问题推到数据库身上了(DBA往往成为炮灰)。

      数据库有没有出问题,如果出了问题又是啥问题呢?windows Log和DB Log都没异常,如果我们没有把Profile持续的开启(基于性能的考虑,一般都不会持续开启),就很难说清楚在系统出问题时数据库究竟有没有出问题,出了问题又是因为什么原因引起的;如果说此类问题出现了一次,就消失了,那还算好,成为一个无头案,悲剧的是这类问题无规律,反复的出现,如果DBA不能找出问题的原因,也不能证明数据库当时是正常的话,估计在公司里面就没用立足之地了。

 

二. 对策:

    其实对这种灵异的系统抽风事件,有两种比较好的解决方案:

    1. 开启Trace 跟踪,就是数据库的Profile功能 ,这个还可以结合windows的性能计数器一同使用,能够直观的了解到特殊时间点上运行了什么语句,资源消耗情况是怎样的;不过这个方法比较消耗系统的资源,对访问压力比较大的数据库需要慎重;

    2. 收集数据库DMV当时的情况,使得DBA在故障时间过后,还能通过这些数据了解到事故发生时,数据库里面运行的语句,以及锁和资源的情况;这种方法只是访问系统性能视图,对数据库其他业务影响比较小,也是我接下来要介绍的方案。

 

三. 实施方案:

   思路:根据数据库中某些动态指标,触发收集DMV的过程,必要时发送报警邮件

1. 创建存储系统性能视图的表:

--创建四个记录表
USE[master]
GO

/****** Object:  table [dbo].[dc_block_info]    Script Date: 05/23/2011 11:35:48 ******/
SETANSI_NulLS ON
GO

SETQUOTED_IDENTIFIER ON
GO

SETANSI_padding ON
GO

CREATEtable[dbo].[dc_block_info](
    [spID][smallint]NulL,
    [status][nchar](30) NulL,
    [sqlBuffer][nvarchar](max) NulL,
    [hostname][nchar](128) NulL,
    [BlkBy][varchar](10) NulL,
    [BlockedsqlBuffer][nvarchar](max) NulL,
    [Loginname][varchar](100) NulL,
    [DBname][varchar](50) NulL,
    [cpuTime][int]NulL,
    [diskIO][int]NulL,
    [LastBatch][datetime]NulL,
    [program_name][nchar](128) NulL,
    [Command][varchar](100) NulL,
    [batch_ID][int]NulL
) ON[PRIMARY]

GO

SETANSI_padding OFF
GO

USE[master]
GO

/****** Object:  table [dbo].[dc_Blocked_Resource_Info]    Script Date: 05/23/2011 11:35:48 ******/
SETANSI_NulLS ON
GO

SETQUOTED_IDENTIFIER ON
GO

CREATEtable[dbo].[dc_Blocked_Resource_Info](
    [spID][smallint]NulL,
    [dbID][smallint]NOTNulL,
    [ObjID][int]NOTNulL,
    [IndID][smallint]NOTNulL,
    [Type][nvarchar](4) NulL,
    [Resource][nvarchar](32) NulL,
    [Mode][nvarchar](8) NulL,
    [Status][nvarchar](5) NulL,
    [batch_ID][int]NulL
) ON[PRIMARY]

GO

USE[master]
GO

/****** Object:  table [dbo].[dc_info_BlockedInfo]    Script Date: 05/23/2011 11:35:48 ******/
SETANSI_NulLS ON
GO

SETQUOTED_IDENTIFIER ON
GO

SETANSI_padding ON
GO

CREATEtable[dbo].[dc_info_BlockedInfo](
    [batch_ID][int]NulL,
    [lock_type][varchar](100) NulL,
    [database_name][varchar](20) NulL,
    [blk_object][varchar](100) NulL,
    [lock_req][varchar](100) NulL,
    [waiter_sID][int]NulL,
    [wait_time][int]NulL,
    [waiter_batch][varchar](max) NulL,
    [waiter_stmt][varchar](max) NulL,
    [blocker_sID][int]NulL,
    [blocker_stmt][varchar](max) NulL,
    [create_date][datetime]NulL
) ON[PRIMARY]

GO

SETANSI_padding OFF
GO

USE[master]
GO

/****** Object:  table [dbo].[dc_info_SessionConn]    Script Date: 05/23/2011 11:35:48 ******/
SETANSI_NulLS ON
GO

SETQUOTED_IDENTIFIER ON
GO

SETANSI_padding ON
GO

CREATEtable[dbo].[dc_info_SessionConn](
    [batch_ID][int]NulL,
    [session_ID][int]NulL,
    [blocking_session_ID][int]NulL,
    [textdata][varchar](max) NulL,
    [login_name][varchar](30) NulL,
    [host_name][varchar](100) NulL,
    [database_name][varchar](30) NulL,
    [program_name][varchar](200) NulL,
    [command][varchar](100) NulL,
    [status][varchar](20) NulL,
    [cpu_time][int]NulL,
    [memory_usage_kb][int]NulL,
    [reads][int]NulL,
    [writes][int]NulL,
    [transaction_isolation_level][int]NulL,
    [connect_time][datetime]NulL,
    [last_read][datetime]NulL,
    [last_write][datetime]NulL,
    [net_transport][varchar](20) NulL,
    [clIEnt_net_address][varchar](30) NulL,
    [clIEnt_tcp_port][int]NulL,
    [local_tcp_port][int]NulL,
    [start_time][datetime]NulL,
    [wait_type][varchar](100) NulL,
    [last_wait_type][varchar](100) NulL,
    [wait_resource][varchar](1000) NulL,
    [open_transaction_count][int]NulL,
    [create_date][datetime]NulL
) ON[PRIMARY]

GO

SETANSI_padding OFF
GO

2. 定义性能收集的存储过程:

USE[master]
GO

--数据库的阻塞和锁信息,该SP可以记录session中显示不出的信息  
createprocedure[dbo].[usp_blocker_info](@batch_IDint)   
                                                                                                                                                                                                                   
as   
                                                                                                                                                                                                                                                               
begin   
                                                                                                                                                                                                                                                            
 setnocount on   
                                                                                                                                                                                                                                                 
 declare@spIDsmallint,@blockedsmallint   
                                                                                                                                                                                                                                          
 declare@c_sqlnvarchar(4000),@b_sqlnvarchar(4000)   
                                                                                                                                                                                                                                   
 declare@IDint   
                                                                                                                                                                                                                                                 
 --保存Sysprocesses 的内容 
                                                                                                                                                                                                                                           
 createtable#Temp(spID smallint,status nchar(30),hostname nchar(128),program_name nchar(128)   
                                                                                                                                                                   
  ,cmd nchar(16),cpu int,physical_io int,blocked smallint,dbID smallint   
                                                                                                                                                                                         
  ,loginame nchar(128),last_batch datetime   
                                                                                                                                                                                                                      
  ,sqlBuffer nvarchar(4000),BlockedsqlBuffer nvarchar(4000))   

      
 --保存DBCC inputBuffer 的结果 
                                                                                                                               
 createtable#Temp1(ID intIDentity(1,1),eventtype varchar(20),parameters int,eventinfo nvarchar(4000))   
  
 createtable#Temp_b(ID intIDentity(1,eventinfo nvarchar(4000))  


 select*into#Temp2   
                                                                                                                                                                                                                                            
  frommaster..sysprocesses (nolock)   
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
 
 --保存被阻塞的进程信息 
                                                                                                                                                                                                                                                    
 insertinto#Temp(spID,status,hostname,program_name,cmd,cpu,physical_io,blocked,dbID    
                                                                                                                                                                   
  ,loginame,last_batch )   
                                                                                                                                                                                                                                       
  SELECTspID,dbID   
                                                                                                                                                                              
    ,convert(sysname,rtrim(loginame)),last_batch   
                                                                                                                                                                                                              
   from#Temp2   
                                                                                                                                                                                                                                                   
   whereblocked >0   

 --保存阻塞的源头 
                                                                                                                                                                                                                                                       
 insertinto#Temp(spID,dbID    
                                                                                                                                                                   
  ,last_batch )   
                                                                                                                                                                                                                                       
  SELECTspID,dbID   
                                                                                                                                                                              
    ,last_batch   
                                                                                                                                                                                                              
   from#Temp2   
                                                                                                                                                                                                                                                   
   wherespID in(selectblocked from#Temp)   
                                                                                                                                                                                                                     
    andspID notin(selectspID from#Temp)   
                                                                                                                                                                                                                                                     
 
 select@spID=min(spID) from#Temp   
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
 
 while@spidisnotnull   
                                                                                              
 begin   
                                                                                                                                                                                                                                                           
  set@c_sql='dbcc inputbuffer('+convert(varchar(5),@spID) +')'   
   
  select@blocked=isnull(blocked,0) from#TempwherespID=@spID 
   
  if(@blocked<>0) 
   begin 
    set@b_sql='dbcc inputbuffer('+convert(varchar(5),@blocked) +')'   
    insertinto#Temp_b 
        exec(@b_sql)   
         
   select@ID=@@IDentity   
    
   update#Temp   
                                                                                                                                                                                                                                                  
   setBlockedsqlBuffer =#Temp_b.eventinfo   
                                                                                                                                                                                                                             
   from#Temp,#Temp_b   
                                                                                            
   where#Temp_b.ID =@ID   
                                                                                                                                                                                                                                         
    and#Temp.blocked =@blocked   
   end 
                                                                                                                                                                                                                                                       
 
  insertinto#Temp1   
                                                                                                                                                                                                                                             
   exec(@c_sql)   
                                                                                                                                                                                                                                                 

  select@ID=@@IDentity   
                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                       
   update#Temp   
                                                                                                                                                                                                                                                  
    setsqlBuffer =#Temp1.eventinfo   
                                                                                                                                                                                                                             
   from#Temp,#Temp1   
                                                                                                                                                                                                                                             
   where#Temp1.ID =@ID   
                                                                                                                                                                                                                                         
    and#Temp.spID =@spID   
                                                                                                                                                                                                                                                     
 
  select@spID=min(spID) from#TempwherespID >@spID   
                                                                                                                                                                                                         
 end   
                                                                                                                                                                                      
    
 insertintodc_block_info(batch_ID,spID,sqlBuffer,BlkBy,BlockedsqlBuffer,Loginname,DBname, 
 cpuTime,diskIO,LastBatch,Command)                                                                                                                                                                                                               
                                               
 SELECT@batch_ID,convert(char(5),spID) SPID,CASElower(status) When'sleePing'Thenlower(status) Elseupper(status) ENDStatus   
                                                                                                                                           
  ,sqlBuffer   
                                                                                                                                                                                                                                                    
  ,CASEhostname WhenNull  Then'  .'When' 'Then'  .'Elsehostname ENDHostname   
                                                                                                                                                                          
  ,CASEisnull(convert(char(5),blocked),'0') When'0'Then'  .'   
                                                                                                                                                                                                
                       Elseisnull(convert(char(5),'0') ENDBlkBy,BlockedsqlBuffer   
                                                                                                                                                                                       
  ,loginame Login   
                                                                                                                                                                                                                                               
  ,db_name(dbID) DBname,convert(varchar,cpu) cpuTime   
                                                                                                                                                                                                            
  ,physical_io) diskIO,Last_Batch LastBatch   
                                                                                                                                                                                                     
  ,program_name Programname,cmd Command   
                                                                                                                                                                                                                        
  from#Temp   
                                                                                                                                                                                                                                                     
  orderbyBlkBy,spID   
                                                                                                                                                                                                                                                        
 
 setnocount off   
                                                                                                                                                                                                                                                 
end   

Go

USE[master]
GO   
 
--预警SP
CREATEproc[dbo].[RecodeAnDalertInfo]   
as   
begin   
declare@banch_IDint   
select@banch_ID=isnull(MAX(batch_ID),0)+1fromdc_info_SessionConn  

--记录当前所有会话信息
insertintodc_info_SessionConn   
SELECT@banch_ID,S.session_ID,R.blocking_session_ID,   
 current_execute_sql =SUBSTRING(T.text,   
    R.statement_start_offset /2+1,   
    CASE   
     WHENstatement_end_offset =-1THENLEN(T.text)   
     ELSE(R.statement_end_offset -statement_start_offset) /2+1   
    END),   
 S.login_name,S.host_name,databasename=DB_name(R.database_ID),S.program_name,R.command,   
 S.status,S.cpu_time,memory_usage_kb =S.memory_usage *8,S.reads,S.writes,   
 S.transaction_isolation_level,C.connect_time,C.last_read,C.last_write,   
 C.net_transport,C.clIEnt_net_address,C.clIEnt_tcp_port,C.local_tcp_port,   
 R.start_time,R.wait_time,R.wait_type,R.last_wait_type,R.wait_resource,   
 R.open_transaction_count,GETDATE()    
FROMsys.dm_exec_sessions S   
 leftJOINsys.dm_exec_connections C   
  ONS.session_ID =C.session_ID   
 leftJOINsys.dm_exec_requests R   
  ONS.session_ID =R.session_ID   
   ANDC.connection_ID =R.connection_ID   
 OUteraPPLY sys.dm_exec_sql_text(R.sql_handle) T   
WHERE  S.is_user_process =1  -- 如果不限制此条件,则查询所有进程(系统和用户进程) 
ANDcommand isnotnull   
       
   
--记录当前阻塞信息 
insertintodc_info_BlockedInfo   
select@banch_ID,t1.resource_type as[lock type],db_name(resource_database_ID) as[database]   
,t1.resource_associated_entity_ID as[blk object]   
,t1.request_mode as[lock req]                          -- lock requested   
,t1.request_session_ID as[waiter sID]                      -- spID of waiter   
,t2.wait_duration_ms as[wait time]         
,(selecttextfromsys.dm_exec_requests asr                  --- get sql for waiter   
crossapply sys.dm_exec_sql_text(r.sql_handle)    
wherer.session_ID =t1.request_session_ID) aswaiter_batch   
,(selectsubstring(qt.text,r.statement_start_offset/2,    
(casewhenr.statement_end_offset =-1thenlen(convert(nvarchar(max),qt.text)) *2    
elser.statement_end_offset end-r.statement_start_offset)/2)    
fromsys.dm_exec_requests asr   
crossapply sys.dm_exec_sql_text(r.sql_handle) asqt   
wherer.session_ID =t1.request_session_ID) aswaiter_stmt    --- statement executing Now   
,t2.blocking_session_ID as[blocker sID]                --- spID of blocker   
,(selecttextfromsys.sysprocesses asp                       --- get sql for blocker   
crossapply sys.dm_exec_sql_text(p.sql_handle)    
wherep.spID =t2.blocking_session_ID) asblocker_stmt,getdate()   
fromsys.dm_tran_locks ast1,sys.dm_os_waiting_tasks ast2   
wheret1.lock_owner_address =t2.resource_address   
     

--记录资源信息
insertintodc_Blocked_Resource_Info 
select  convert(smallint,req_spID) AsspID,   
  rsc_dbID AsdbID,   
  rsc_objID AsObjID,   
  rsc_indID AsIndID,   
  substring(v.name,1,4) AsType,   
  substring(rsc_text,32) asResource,   
  substring(u.name,8) AsMode,   
  substring(x.name,5) Asstatus,@banch_ID  
 from  master.dbo.syslockinfo,   
  master.dbo.spt_values v,   
  master.dbo.spt_values x,   
  master.dbo.spt_values u     
 where   master.dbo.syslockinfo.rsc_type =v.number   
   andv.type ='LR'   
   andmaster.dbo.syslockinfo.req_status =x.number   
   andx.type ='LS'   
   andmaster.dbo.syslockinfo.req_mode +1=u.number   
   andu.type ='L'   
 andsubstring(x.name,5) ='WAIT' 
 orderbyspID    
 
 execusp_blocker_info @banch_ID
   
end

GO

3. 创建信息收集和邮件报警的SP:

    说明:

    SP定义了一个邮件发送的过程,需要先配置好数据库的邮件发送(Google一下很多的);

    SP需要做到JOB里面,一分钟运行一次收集信息;

    SP收集信息时的条件(不同的系统触发条件不一样):

       a. 用户链接数大于550,活动链接数大于40;

       b. 阻塞进程比率大于10%;

USE[master]
GO   
--创建触发监控条件,并发邮件
CREATEproc[dbo].[ConAlert]         
as         
begin         
 DECLARE@connfloat,@activeconnfloat,@blockedcountfloat,@spIDcountfloat,@spIDblockedfloat,@countint      
 DECLARE@bodyNVARCHAR(MAX),@subject  nvarchar(200)     
 
 -- User Connections        
  Select@conn=ISNulL(cntr_value,0) fromsys.dm_os_performance_counters with(nolock)
  wherecounter_name='User Connections'
     
  if@conn>=550--根据用户链接数来收集信息和触发报警   
   begin     
     --active requests    
     Select@activeconn=isnull(SUM(cntr_value),0) fromsys.dm_os_performance_counters with(nolock)
       wherecounter_name ='Active requests'
       groupbycounter_name

    set@count=1

    if@activeconn>=40  --根据活动链接数来判断     
     begin       
       while1=1  --循环收集信息
        begin
            execRecodeAnDalertInfo  

           if(@count%5=0) --连续次就发邮件
             begin
               set@subject='Server:'+@@SERVERname+' Connections Alert'
               SET@body='Server:'+@@SERVERname+CHAR(13)+'; UserConnections:'+cast(@connasvarchar)+CHAR(13)+'; ActiveRequests:'+cast(@activeconnasvarchar)              
               EXECMSdb.dbo.sp_send_dbmail           
                @recipIEnts=N'Ken@xxx.com;itmon@xxx.com',                       
                @body=@body,           
                @body_format='TEXT',           
                @subject=@subject,           
                @profile_name='dba_profile'--需要配置该模板 
             end

             Select@activeconn=isnull(SUM(cntr_value),0) fromsys.dm_os_performance_counters with(nolock)
               wherecounter_name ='Active requests'
               groupbycounter_name

             if@activeconn>=40
                 waitfordelay '00:00:05'
             else
                  break;

             if@count>50
                   break;

           set@count=@count+1;
       end   
   end        
   end     
 else--根据Blocked 百分比来收集信息和触发报警  
   begin     
     select@spIDcount=count(0) fromsys.sysprocesses with(nolock) wherespID>50       
     select@blockedcount=count(0) fromsys.sysprocesses with(nolock) wherespID>50andblocked<>0 
     --计算百分比
     set@spIDblocked=@blockedcount/@spIDcount
     set@count=1

     if(@spIDblocked>=0.1) --10%时,记录信息 
      begin       
       while1=1  --循环收集信息
        begin
            execRecodeAnDalertInfo  

           if(@count%5=0or@spIDblocked>=0.15) --连续5次阻塞比率大于10%或者阻塞比率大于等于15%时,记录信息并发生报警邮件   
             begin
                set@subject='Server:'+@@SERVERname+' Blocked Alert'             
                SET@body='Server:'+@@SERVERname+CHAR(13)+'; BlockedCount:'+cast(@blockedcountasvarchar)+CHAR(13)+'; SPIDCount:'+cast(@spIDcountasvarchar)              
                EXECMSdb.dbo.sp_send_dbmail           
                @recipIEnts=N'Ken@xxx.com',                      
                @body=@body,            
                @profile_name='dba_profile'   
             end

             select@spIDcount=count(0) fromsys.sysprocesses with(nolock) wherespID>50       
             select@blockedcount=count(0) fromsys.sysprocesses with(nolock) wherespID>50andblocked<>0 
             set@spIDblocked=@blockedcount/@spIDcount

             if@spIDblocked>=0.1
                 waitfordelay '00:00:05'
             else
                  break;

             if@count>50
                   break;

           set@count=@count+1;
       end
      end 
   end 
end

4. 查看信息:

USE[master]
GO
--通过以下查询来分析信息
declare@batch_IDint
select@batch_ID=isnull(MAX(@batch_ID),1) fromdc_info_SessionConn

select  *from  dc_info_SessionConn wherebatch_ID=@batch_ID
select  *from  dc_info_BlockedInfo wherebatch_ID=@batch_ID
select  *from  dc_block_info       wherebatch_ID=@batch_ID
select  *from  dc_Blocked_Resource_Info where=@batch_ID

 

    有了这些阻塞、进程和资源的信息,我们就可以了解系统出问题时,是否出现异常,以及数据库异常时都有些什么语句在系统中运行,它们是否造成了大量的阻塞或消耗了大量的资源;这样DBA就方便定位问题了。


 

 作者“飞洋过海”  http://www.dedecms.com/knowledge/data-base/sql-server/2012/0821/11548.html

总结

以上是内存溢出为你收集整理的SQLServer通过DMV实现低影响的自动监控和历史场景追溯全部内容,希望文章能够帮你解决SQLServer通过DMV实现低影响的自动监控和历史场景追溯所遇到的程序开发问题。

如果觉得内存溢出网站内容还不错,欢迎将内存溢出网站推荐给程序员好友。

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/sjk/1177679.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-06-02
下一篇 2022-06-02

发表评论

登录后才能评论

评论列表(0条)

保存