SQL Server FullText解决Like字句性能问题

    xiaoxiao2025-09-21  90

    场景引入

    这天老鸟火急火燎的冲到菜鸟座位:“还记得你在云栖社区发表的一篇名为‘SQL Server利用HashKey计算列解决宽字段查询的性能问题’的文章吗?被人踢馆啦,人家觉得你这个限制条件太苛刻,只能解决完全等于的问题条件下的性能问题,没有太大的现实意义。”菜鸟烧脑的调动大脑的每一个细胞:“哦,你说的是这篇文章啊?”。

    SQL Server利用HashKey计算列解决宽字段查询的性能问题

    问题分析

    菜鸟反思着,的确,需要完全匹配这个条件限制太严格了,SQL Server有没有一种方法来代替LIKE字句的功能而又可以大大提高查询效率的呢?因为,我们知道,LIKE左模糊匹配是可以使用到索引,而右模糊和完全模糊匹配是完全无法使用到索引的。G哥告诉菜鸟有解决方法,用FullText搜索啊。据说阿里云RDS SQL Server 2008和ECS 版RDS SQL 2012都支持SQL Server的FullText哦,对于使用阿里云RDS SQL Server的用户真是个好消息。

    解决方法

    来看我们的一个简单测试。

    创建测试对象

    USE master GO IF DB_ID('test') IS NULL CREATE DATABASE Test; GO use test go --in order to check fulltext enable or not. SELECT is_fulltext_enabled,* FROM sys.databases WHERE name = DB_NAME() ; IF OBJECT_ID('dbo.Item','u') IS NOT NULL BEGIN TRUNCATE TABLE dbo.Item DROP TABLE dbo.Item END GO CREATE TABLE dbo.Item ( id int identity(1,1) primary key ,item_number int not null ,item_desc varchar(4000) null ); GO --fulltext unique index creating CREATE UNIQUE INDEX uix_item_number ON dbo.Item(item_number); --fulltext catelog creating IF NOT EXISTS( SELECT * FROM sys.fulltext_catalogs WITH(NOLOCK) WHERE name = N'ftxt_Item') BEGIN CREATE FULLTEXT CATALOG ftxt_Item AS DEFAULT; END; --create fulltext index CREATE FULLTEXT INDEX ON dbo.Item(item_desc) KEY INDEX uix_item_number ON ftxt_Item; GO

    初始化数据

    --testing table data init. DECLARE @do int ,@loop int ,@item_desc varchar(4000) ; SELECT @do = 1 ,@loop = 100000 ,@item_desc = '' ; SET NOCOUNT ON WHILE @do <= @loop BEGIN set @item_desc = REPLICATE(newid(),112) ; INSERT INTO dbo.Item SELECT CHECKSUM(@item_desc),@item_desc; SET @do = @do + 1; END; GO

    创建索引失败

    想要在超过900byte宽度的字段上创建索引,门都没有,SQL Server直接报错。

    --try to create index on item_desc:error occurs when data inited CREATE INDEX ix_item_desc ON dbo.Item(item_desc);

    错误信息

    Warning! The maximum key length is 900 bytes. The index 'ix_item_desc' has maximum length of 4000 bytes. For some combination of large values, the insert/update operation will fail. Msg 1946, Level 16, State 3, Line 1 Operation failed. The index entry of length 4000 bytes for the index 'ix_item_desc' exceeds the maximum length of 900 bytes. The statement has been terminated.

    LIKE查询和Fulltext查询

    --===testing performance bewteen like & fulltext declare @item_desc varchar(10) ; SELECT TOP 1 @item_desc = LEFT(item_desc, 10) FROM dbo.Item WITH(NOLOCK) WHERE id = 2012; SET STATISTICS PROFILE ON SET STATISTICS TIME ON SET STATISTICS IO ON SELECT * FROM dbo.Item WITH(NOLOCK) WHERE item_desc like @item_desc+'%' SELECT * FROM dbo.Item WITH(NOLOCK) WHERE item_desc like '%'+@item_desc+'%' SELECT * FROM dbo.Item WITH(NOLOCK) WHERE CONTAINS(item_desc,@item_desc) SET STATISTICS TIME OFF SET STATISTICS IO OFF SET STATISTICS PROFILE OFF GO

    执行计划对比

    非常有意思了,执行计划对比来看,Like左匹配和完全模糊匹配均走是Clustered Index Scan,相当于Table Scan;而EstimateRows显示满足条件约为12440行,也暴露出SQL Server执行计划对Like字句的评估并不准确,实际满足条件只有1行而已,所以导致最后的查询效率不高。但是,FullText的执行计划中EstimateRows显示只有1行,与实际情况相符合。

    执行效率对比

    对比完执行计划,让我们来看看最后执行效率对比:

    Like左模糊匹配执行效率: Logical Reads 50185,说明IO读非常高,这是Clustered Index Scan导致的结果;CPU消耗47 ms,总的执行时间51ms。Like完全模糊匹配执行效率:和Like左模糊匹配一样,Logical Reads 50185,也是IO读非常高; CPU 更加离谱达到了4789ms,执行时间4919ms。FullText查询执行效率:Logical Reads仅仅只有5,所以最后CPU消耗为0ms,执行时间50ms。

    从这个对比结果来看,FullText完胜LIKE字句,IO读取大大降低,CPU消耗减少,效率得到了很大的提升。

    #### Like左模糊匹配执行效率 Table 'Item'. Scan count 1, logical reads 50185, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 47 ms, elapsed time = 51 ms. #### Like完全模糊匹配执行效率 Table 'Item'. Scan count 1, logical reads 50185, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 4789 ms, elapsed time = 4919 ms. #### FullText查询执行效率 Informational: The full-text search condition contained noise word(s). Table 'Item'. Scan count 0, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 50 ms.

    根据上面的数字,做出一个炫酷无比的图表来,直观的对比

    写在最后

    完成上面的测试,菜鸟有了对老鸟和“踢馆者”合理的交代,那就是使用FullText来解决Like语句的性能问题吧。

    最新回复(0)