WTH does Full-Text Search do 4 me ?

  • Hello...

    Can someone give me a simple (SIMPLE) reason why I'd use Full-Text searching?

    I have a table with an "AccountName" column that is used often for searches. I have it indexed and use a basic WHERE AccountName LIKE 'SomeName%' search to accomplish the search.

    Does Full-Text searching help me here? The BOL describes details of indexing methods etc.., but no overview of WHY I'd use it and under what circumstances.

    Any help ?

  • Well, a practice example from me.

    I do receive subscription mails from this site.

    I have a table in SQL Server with a column named MsgNoteText. A text column which holds the content of the message body. This is full-text enabled, indexed, populated.....

    DATALENGTH of this column shows me a top value of 80425. Far too much for a varchar.

    Now when I need to reference to some other previous thread or simply remember that there, I am able to fire a simple query like

    SELECT MsgSubject from my_mails WHERE CONTAINS(MsgNoteText,'blabla').

    or more complex queries. BOL has quite some examples on full-text searches.

    I don't think full text will give you any advantage compared to LIKE '...%' in your case, for AccountName does not seem to me like containing more than say 50 characters.

    HTH

    Apart from this, using full-text is something different from doing performance calculations on mutual funds all the time, IIRC

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Thanks F...

    I wonder why the different syntax between LIKE and CONTAINS. I guess LIKE is older and may eventually be replaced by CONTAINS in all searches - Full Text or not ?

    Regardless - thanks - answered my question to a T!

    - B

  • I don't know the origins of full text search, but I guess they were introduced when more and more complex and 'bigger' data needed to be stored in databases like documents of all kind. Sounds interesting, maybe someone else (Hello, Jonathan?) knows a little bit more?

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Well for me - I'd like my end users to learn how to use a * rather than a % for starters - they're used to a * being a wildcard. Plus what appear to be neato more complex search modes are appealing.

  • quote:


    Thanks F...

    I wonder why the different syntax between LIKE and CONTAINS. I guess LIKE is older and may eventually be replaced by CONTAINS in all searches - Full Text or not ?

    - B


    LIKE is the SQL standard. CONTAINS is from the Microsoft Search service, which is an extension to SQL Server and has nothing to do with the SQL language. http://msdn.microsoft.com/library/en-us/architec/8_ar_cs_5tid.asp

    There are reasons to choose each. As Frank pointed out, if your text searches are all in the form of LIKE 'SomeName%', then a SQL Server index will work well and obviates the need for implementing Full-Text searching (which can be complicated, particularly with dynamic data). If, on the other hand, your searches are LIKE '%SomeName%', then SQL Server indexes won't help but a full-text index will. If your data includes multiple words in a column that could be in any order, and there are more than a few thousand rows, then full-text is the answer if you need to search for words with any useful performance.

    --Jonathan



    --Jonathan

  • quote:


    Well for me - I'd like my end users to learn how to use a * rather than a % for starters - they're used to a * being a wildcard. Plus what appear to be neato more complex search modes are appealing.


    You can always use REPLACE on their search string, e.g.:

    
    
    CREATE PROC SearchByName (@SearchFor varchar(20)) AS
    SET @SearchFor = REPLACE(REPLACE(@SearchFor,'*'.%'),'?','_')
    ...

    --Jonathan



    --Jonathan

  • Another use is for searching the contents of MS Office documents (and pdf with free plugin) that are stored in the database.

  • One of the advantages of using FT search is that you can rank the search results, and, therefore, display the results in order of relevance. The FT catalog is indexed to perform this operation very efficiently. FT searching also has the ability to do "stemming", which means that if I do a search for mouse, it will also look for mice.

  • Just a note on using like. If the syntax is always "Somename%", ie., begins with, using

    the following is a much more efficient use of the index:

    where name between "Somename" and "SomenameZ"

    I'm just moving to SQL Server, but in Sybase where I come from a "like" query on an indexed column scans the whole index rather than limiting to valid pages.

  • Hrmmm - interesting....

    In a nutshell, To make name searching easy for the users I always append a "%" to the query. So, to find "Bill", they can type "B", or "Bi", or "Bil", etc...

    So yes, I am always appending the "%".

    Now, why is "Between" a better use of the Clustered index ? Any reason ?

    - B

  • quote:


    Just a note on using like. If the syntax is always "Somename%", ie., begins with, using

    the following is a much more efficient use of the index:

    where name between "Somename" and "SomenameZ"

    I'm just moving to SQL Server, but in Sybase where I come from a "like" query on an indexed column scans the whole index rather than limiting to valid pages.


    There may be a tiny (<1%) difference in performance when using BETWEEN, but that method requires recoding from the straightforward LIKE syntax and may cause other problems with character sets (just using "Z" with straight CI ASCII excludes several characters).

    LIKE 'Somename%' is "sargable" in Sybase, and so, as in SQL Server, will indeed make use of an index.

    --Jonathan



    --Jonathan

  • Just to add

    
    
    %SomeName%

    will not make use of an index, IIRC

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • quote:


    Just to add

    
    
    %SomeName%

    will not make use of an index, IIRC

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de


    quote:


    If, on the other hand, your searches are LIKE '%SomeName%', then SQL Server indexes won't help but a full-text index will.


    --Jonathan



    --Jonathan

  • uhoh...don't wanted to complicate this

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply