Full-text search is too slow

  • I am trying to implement full-text search on some of our tables. This table has 26k records and only 2 columns: prosp_id which is primary key and notes.

    If I run a query with specific full-text syntax:

    select *

    from prospect_notes

    where contains(notes, 'property')

    it takes 45 sec which is unacceptably slow.

    However if I run a traditional SQL query for the same thing:

    select *

    from prospect_notes

    where notes like '%property%'

    it takes only 300 ms and returns exactly the same number of records as in the query above.

    In the former query the execution plan uses Remote scan on full-text search index (6%), Clustered index seek on prosp_id (94%) and then joins them with Nested Loops Join (0%). For the latter query it performs Clustered index scan and that is all.

    To ensute that full-text index really exists I ran the following query:

    select table_name = t.name,

    column_name = c.name

    from sys.fulltext_index_columns f

    join sys.tables t on f.object_id = t.object_id

    join sys.columns c on f.column_id = c.column_id

    and f.object_id = c.object_id

    and it returned me prospect_notes notes which is indeed correct.

    So, it looks like Full-text search is slower than just regular SQL capability, or am I doing something wrong ?

  • that's odd.

    a CONTAINS clause should be at least as fast as a LIKE clause for a small table such as that, and faster for a very large table.

    a) the FullText service runs independantly from the SQL service, so there's a chance it's starved for memory.

    b) how large is this notes column? if it's very large and the word "property" is very common, of course that will slow things down... but still not like you're describing.

    c) try running a CONTAINSTABLE query instead and compare the performance.

    ie: [font="Courier New"]select * from CONTAINSTABLE(prospect_notes, notes, 'property')[/font]

    It appears that you've got a clustered index on prosp_id, so the index seek it's doing shouldn't be a problem.

    I've got a small table (about 10k records) with about a half dozen columns in an FT index. Running a containstable query against it takes 15ms to return 17 records. A CONTAINS query takes a hair longer at 30-80ms, both a far cry from your 45 seconds!

    -Chris

  • Might be a cahcing issue. Try running the item to completion once and if takes 45 seconds rerun to see if it returns quicker the second time. Then wait about an hour or so and see if is slow again.

    Also take a look here might help you optimize a bit.

    http://msdn2.microsoft.com/en-us/library/aa175787(sql.80).aspx

  • Thanks for your inputs.

    This is really odd. I tried to run both CONTAINS and CONTAINSTABLE, it took the same 45 sec. But after I cleaned my buffer cache with DBCC DROPCLEANBUFFERS, either CONTAINS and CONTAINSTABLE surprisingly ran for about 80 ms ! This is really much faster than 300 ms with LIKE '%...%'.

    So it looks like this is caching poblems ? But this table is not so large, it's only about 26000 records and the data along with its index occupy only about 9MB.

    Now with this success I can move to larger tables that have millions records.

    But another question arises here. DBCC looks like resolved the problem, but what I will do in production in such situation? Cleaning buffer cache pages is very performance degrading event. The only one thing I have is to hope that it will never happen in prod. Our production sever has 16 GB memory, while development is only 3 GB.

    Also thanks for the link, this article looks interesting, I will read it later.

  • Pinged someone at MS and got this:

    Steve – here’s what I found out: You’re experiencing a 45 seconds delay due component signature verification taking place by FTS. Because the server has not internet connectivity, FTS times-out after ~45 seconds waiting for the information to verify if the signature has been revoked or not for a given wordbreaker used in that query.

    FTS is checking if the component is trusted/signed AND checking if the signature has been revoked (for what we need internet connectivity). If you cannot connect the machine to the net, then you can turn off signature verification in FTS:

    EXEC sp_fulltext_service 'verify_signature’, 0

    GO

    This should solve the problem but note that you need to make sure no one is loading untrusted binaries (WBs, ifilters, etc.) The next release of FTS in 2008 will not present this behavior.

    Hope it helps.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply