Why is SQL Server CPU usage so high (CPU at 100%)?
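High CPU usage is not necessarily related to disk I/O. When the database is I/O-bound, CPU utilization actually tends to stay low, because the CPU sits in I/O wait.
High database CPU usually has two main causes: the first is in-memory latch activity, and the second is hard parsing (query compilation). We can therefore investigate from these two angles; optimizing I/O is the wrong direction.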
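One way to get a first impression of the hard-parsing side is to look at how the plan cache is being reused. The following is a minimal sketch, not the author's script: it only assumes that a large share of single-use ad hoc plans hints at frequent compilation, which is a rule of thumb rather than a hard threshold.

```sql
-- Summarize the plan cache by object type.
-- Many plans with usecounts = 1 suggest statements are compiled again and again.
SELECT objtype,
       COUNT(*) AS plan_count,
       SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) AS single_use_plans,
       SUM(CAST(size_in_bytes AS bigint)) / 1024 / 1024 AS size_mb
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY plan_count DESC;
```

If the Adhoc row dominates and most of its plans are single-use, parameterizing the offending statements is probably worth a look.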
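Symptoms
Windows Server 2008 R2; the database is SQL Server 2008 R2, 64-bit
High CPU usage only occurs during a certain period each day
Memory usage is not especially high, only about 30 GB
CPU usage is at 100%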


Troubleshooting direction
Troubleshooting usually starts with the script below, which relies on three views: sys.sysprocesses, sys.dm_exec_sessions and sys.dm_exec_requests
USE master
GO
-- To restrict the result to one database, uncomment the filter
SELECT * FROM sys.[sysprocesses] WHERE [spid]>50 --AND DB_NAME([dbid])='gposdb'
SELECT COUNT(*) FROM [sys].[dm_exec_sessions] WHERE [session_id]>50
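This shows how many user connections the database currently has
Then use the statement below to check whether the various indicators look normal and whether there is any blocking. It selects the 10 requests that have consumed the most CPU time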
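A raw count does not say where the connections come from. As a rough sketch, the sessions can be grouped by login, host and program; it uses the is_user_process column of sys.dm_exec_sessions instead of the session_id > 50 convention, which is a substitution on my part and not part of the original script.

```sql
-- Count user sessions per login, client host and application.
SELECT login_name,
       host_name,
       program_name,
       COUNT(*) AS session_count
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY login_name, host_name, program_name
ORDER BY session_count DESC;
```

A single application or host holding an unusually large number of sessions is a natural place to start looking.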
SELECT TOP 10
    [session_id],
    [request_id],
    [start_time] AS 'start time',
    [status] AS 'status',
    [command] AS 'command',
    dest.[text] AS 'sql text',
    DB_NAME([database_id]) AS 'database name',
    [blocking_session_id] AS 'blocking session id',
    [wait_type] AS 'wait type',
    [wait_time] AS 'wait time',
    [wait_resource] AS 'wait resource',
    [reads] AS 'physical reads',
    [writes] AS 'writes',
    [logical_reads] AS 'logical reads',
    [row_count] AS 'rows returned'
FROM sys.[dm_exec_requests] AS der
CROSS APPLY sys.[dm_exec_sql_text](der.[sql_handle]) AS dest
WHERE [session_id]>50 AND DB_NAME(der.[database_id])='gposdb'
ORDER BY [cpu_time] DESC
To see the full SQL text, run the statement below; remember to choose "Results to Text" in SSMS
-- In SSMS, choose "Results to Text"
SELECT TOP 10
    dest.[text] AS 'sql text'
FROM sys.[dm_exec_requests] AS der
CROSS APPLY sys.[dm_exec_sql_text](der.[sql_handle]) AS dest
WHERE [session_id]>50
ORDER BY [cpu_time] DESC

Some CPU-intensive activity was simulated for this test



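Checking the number of CPUs, the number of user schedulers and the maximum number of worker threads, and whether the workers have been used up, also helps when investigating CPU usage
-- Number of CPUs and number of user schedulers
SELECT cpu_count,scheduler_count FROM sys.dm_os_sys_info
-- Maximum number of worker threads
SELECT max_workers_count FROM sys.dm_os_sys_info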
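To judge whether max_workers_count is still at its automatic default, the value can be compared with what the documented formula gives. The sketch below assumes a 64-bit instance of this SQL Server generation with no more than 64 logical CPUs and "max worker threads" left at 0; the column name expected_default_64bit is made up for illustration.

```sql
-- Compare the effective worker limit with the automatic 64-bit default:
-- 512 for up to 4 CPUs, plus 16 for every additional CPU.
SELECT cpu_count,
       max_workers_count AS configured_max_workers,
       CASE WHEN cpu_count <= 4 THEN 512
            ELSE 512 + (cpu_count - 4) * 16
       END AS expected_default_64bit
FROM sys.dm_os_sys_info;
```

For example, 8 CPUs give 512 + 4 * 16 = 576 and 16 CPUs give 512 + 12 * 16 = 704.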
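View all schedulers on the machine, both user and system
The statement below shows whether the workers have been used up; once the maximum number of threads is reached, it is time to check for blocking
Compare against the following table
Maximum worker threads configured automatically for various combinations of CPU count and SQL Server architecture
Number of CPUs    32-bit computer    64-bit computer
<= 4              256                512
8                 288                576
16                352                704
32                480                960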
SELECT
    scheduler_address,
    scheduler_id,
    cpu_id,
    status,
    current_tasks_count,
    current_workers_count,
    active_workers_count
FROM sys.dm_os_schedulers
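Reading the per-scheduler rows by eye is tedious, so the totals can be set against the worker limit in one row. This is a minimal sketch under the assumption that only schedulers with status 'VISIBLE ONLINE' serve user requests; the output column aliases are my own.

```sql
-- Total workers in use versus the configured maximum.
-- A non-zero queued_tasks value means tasks are waiting for a free worker.
SELECT (SELECT max_workers_count FROM sys.dm_os_sys_info) AS max_workers_count,
       SUM(current_workers_count) AS current_workers,
       SUM(active_workers_count) AS active_workers,
       SUM(runnable_tasks_count) AS runnable_tasks,
       SUM(work_queue_count) AS queued_tasks
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
```

When current_workers approaches max_workers_count, blocking chains are the usual suspect; a high runnable_tasks total instead points at CPU pressure.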
If SQL Server has requests that are waiting on a resource, the statement below lists them
It joins to the [sys].[dm_os_wait_stats] view, which reports the cumulative number of waits recorded for each wait type. If no request is currently waiting, the statement returns no rows
SELECT TOP 10
    [session_id],
    [request_id],
    [start_time] AS 'start time',
    [status] AS 'status',
    [command] AS 'command',
    dest.[text] AS 'sql text',
    DB_NAME([database_id]) AS 'database name',
    [blocking_session_id] AS 'blocking session id',
    der.[wait_type] AS 'wait type',
    [wait_time] AS 'wait time',
    [wait_resource] AS 'wait resource',
    [dows].[waiting_tasks_count] AS 'cumulative waits of this type',
    [reads] AS 'physical reads',
    [writes] AS 'writes',
    [logical_reads] AS 'logical reads',
    [row_count] AS 'rows returned'
FROM sys.[dm_exec_requests] AS der
INNER JOIN [sys].[dm_os_wait_stats] AS dows ON der.[wait_type]=[dows].[wait_type]
CROSS APPLY sys.[dm_exec_sql_text](der.[sql_handle]) AS dest
WHERE [session_id]>50
ORDER BY [cpu_time] DESC
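For example, I ran a query against the SalesOrderDetail_test table 100 times. Because the table holds a great deal of data, SSMS fetches the results from SQL Server slowly, which causes ASYNC_NETWORK_IO waits
USE [AdventureWorks]
GO
SELECT * FROM dbo.[SalesOrderDetail_test]
GO 100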
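While such a test is running, the waits can also be observed per task rather than per request. The following sketch reads sys.dm_os_waiting_tasks; restricting it to user sessions through is_user_process is my own choice, not something the original script does.

```sql
-- Tasks that are waiting right now, longest wait first.
SELECT wt.session_id,
       wt.wait_type,
       wt.wait_duration_ms,
       wt.blocking_session_id,
       wt.resource_description
FROM sys.dm_os_waiting_tasks AS wt
INNER JOIN sys.dm_exec_sessions AS es
        ON es.session_id = wt.session_id
WHERE es.is_user_process = 1
ORDER BY wt.wait_duration_ms DESC;
```

In the scenario above, the session running the SELECT would be expected to show up with wait_type ASYNC_NETWORK_IO.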

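Root cause
After troubleshooting and several days of observation, the cause was confirmed to be missing indexes on certain tables. Indexes have now been added to these tables and the problem is solved
select * from t_AccessControl      -- access control table
select * from t_GroupAccess        -- user group access table
select * from t_GroupAccessType    -- user group access type table
select * from t_ObjectAccess       -- object access table
select * from t_ObjectAccessType   -- object access type table
select * from t_ObjectType         -- object type table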
Query for the statements with the highest CPU usage
-- total_worker_time is reported in microseconds
SELECT TOP 10
    total_worker_time/execution_count AS avg_cpu_cost,
    plan_handle,
    execution_count,
    (SELECT SUBSTRING(text, statement_start_offset/2 + 1,
        (CASE WHEN statement_end_offset = -1
              THEN LEN(CONVERT(nvarchar(max), text)) * 2
              ELSE statement_end_offset
         END - statement_start_offset)/2)
     FROM sys.dm_exec_sql_text(sql_handle)) AS query_text
FROM sys.dm_exec_query_stats
ORDER BY [avg_cpu_cost] DESC

Query for missing indexes
SELECT
    DatabaseName = DB_NAME(database_id),
    [Number Indexes Missing] = count(*)
FROM sys.dm_db_missing_index_details
GROUP BY DB_NAME(database_id)
ORDER BY 2 DESC;
SELECT TOP 10
    [Total Cost] = ROUND(avg_total_user_cost * avg_user_impact * (user_seeks + user_scans),0),
    avg_user_impact,
    TableName = statement,
    [EqualityUsage] = equality_columns,
    [InequalityUsage] = inequality_columns,
    [Include Columns] = included_columns
FROM sys.dm_db_missing_index_groups g
INNER JOIN sys.dm_db_missing_index_group_stats s ON s.group_handle = g.index_group_handle
INNER JOIN sys.dm_db_missing_index_details d ON d.index_handle = g.index_handle
ORDER BY [Total Cost] DESC;
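The missing-index columns map fairly directly onto a CREATE INDEX statement, so a draft can be generated from the same views. This is only a sketch: the IX_suggested_ name prefix is hypothetical, and the generated text is a starting point to review by hand, since the DMVs do not consider column order, overlap with existing indexes or write overhead.

```sql
-- Draft CREATE INDEX statements from the missing-index DMVs (review before use).
SELECT TOP 10
    'CREATE NONCLUSTERED INDEX IX_suggested_' + CONVERT(varchar(20), d.index_handle)
    + ' ON ' + d.statement
    + ' (' + ISNULL(d.equality_columns, '')
    + CASE WHEN d.equality_columns IS NOT NULL
                AND d.inequality_columns IS NOT NULL THEN ', ' ELSE '' END
    + ISNULL(d.inequality_columns, '') + ')'
    + ISNULL(' INCLUDE (' + d.included_columns + ')', '') AS create_index_draft
FROM sys.dm_db_missing_index_groups g
INNER JOIN sys.dm_db_missing_index_group_stats s ON s.group_handle = g.index_group_handle
INNER JOIN sys.dm_db_missing_index_details d ON d.index_handle = g.index_handle
ORDER BY s.avg_total_user_cost * s.avg_user_impact * (s.user_seeks + s.user_scans) DESC;
```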


After locating the problem, create a nonclustered index
CREATE NONCLUSTERED INDEX IX_t_AccessControl_F4 ON dbo.t_AccessControl
(
    FObjectType
)
INCLUDE ([FUserID], [FAccessType], [FAccessMask])
WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]
GO
-- Rollback only: running this removes the index again
-- drop index IX_t_AccessControl_F4 on t_AccessControl
CPU usage returned to normal
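Whether a new index is actually being picked up can be checked from the index usage statistics. The sketch below assumes it is run in the database that owns dbo.t_AccessControl and that some workload has run since the index was created, because the counters reset when the instance restarts.

```sql
-- Seeks, scans, lookups and updates recorded for each index on the table.
SELECT i.name AS index_name,
       s.user_seeks,
       s.user_scans,
       s.user_lookups,
       s.user_updates,
       s.last_user_seek
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id = i.object_id
      AND s.index_id = i.index_id
      AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID('dbo.t_AccessControl');
```

A growing user_seeks count on the new index, together with fewer scans on the clustered index or heap, would be consistent with the fix working.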

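Download of the trace template and trace file (please use the SQL Server 2008 R2 version): files.cnblogs.com/lyhabc/跟踪模板和trace.rar
Summary
Based on repeated past experience, if CPU load stays high while memory and I/O both look fine, the first thing to suspect is an indexing problem; nine times out of ten that is the cause.
Note the chart of the customer machine's load posted at the start of the article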

Replies (2)
簽約乄芐﹃站\/ka
2016/12/23 Nice article. We have run into this situation too.
winnyrain
2017/5/11 Nice, this solved my problem. Thanks for sharing.