Benefits of Full Text Indexing

  • I have to rewrite a web site that currently uses a component called DTSearch to do Full Text indexing and I am thinking of using SQL 2000 Full Text Indexing for the upgrade.

    However the problem we have is that the index needs populating throughout the day so if a client pays to place a job its not searchable until the next update and they want it straight away.

    I was just wondering what the differences, performance wise is between using CONTAINS(jobtitle, '"ASP"') and jobtitle like '%ASP%' or PATINDEX etc. I know that if you have a preceeding % you loose the benefit of using an index. But if the job was searchable instantly then it would be great news.

    I know you can do a lot of great things such as scoring which is handy but I just wanted to know the differences from people who have used it live.

    Thank you

  • There are options to enable almost instant indexing of new content in SQL2000 FT. Search for 'Background update' and 'Change tracking' in books online.

  • The change tracking updates can be delayed if the server is busy since the update waits until cpu processing is below a certain level.  So if the server is very busy, it might take a bit of time.

    On a side note, I'd take a careful look at how things work in FTI and compare it to your existing system - some aspects of it can be of limited use - e.g. ranking just produces a relative ranking within a single CONTAINS result set, and has no other meaning so you can't really combine different CONTAINS queries or use ranking other than to sort that one set.

    Jon

     

  • Rob,

    Both Mark and Jon are correct. "Change Tracking" with "Update Index in Background" enabled is what you're looking for a method to keep the FT Catalogs current with your FT-enabled table's column data. For more information when & how to use CT with UIiB releative to massive updates (>50% table changes), see SQL 2000 BOL title "Maintaining Full-Text Indexes". CT with UIiB *should* update your FT Catalogs at least once per second, depending upon the percentage of change. However, I have tested this with a single row insert and it can take as much as 5 to 7 seconds for one insert to show up in the FT Catalog and be searchable via CONTAINS or FREETEXT.

    Additonally, it is CONTAINSTABLE and FREETEXTTABLE that provide the ranking results (vs. CONTAINS or FREETEXT) via the keyword RANK. See SQL 2000 BOL title "Using the CONTAINSTABLE and FREETEXTTABLE Rowset-valued Functions" for more details.

    Furthermore, you may want to review this blog entry "dtSearch announces a New Terabyte Indexer..." at http://spaces.msn.com/members/jtkane/Blog/cns!1pWDBCiDX1uvH5ATJmNCVLPQ!449.entry and the "Integrating Query of Relational and Textual Data in Clinical Databases: A Case Study" [PubMed AbstractJAMIA Abstract, full text (pdf file)] for a case study comparison between dtSearch (assuming this is what you're using now).

    Finally, there are a number of improvments (performance & functionality) in SQL Server 2005 (Yukon) related to Full Text Search (FTS) that you may also want to review!

    Thanks,

    John

    SQL Full Text Search Blog

    http://spaces.msn.com/members/jtkane/


    John T. Kane

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply