How to read HTML files

  • Hi,

    I just want to know how to search text/record inside the Html files. I had one column in my database named ‘Path’ I saved the html file names in this column and the physical files in a folder. Full Text search is enabled on this column. Whenever I try to search it returns nothing.

    Just need one small example to know how it works, what functions used to search inside the files and how.

    Thanks

    Navi

  • The full text search work only on data stored in database, not outside of it. It won't search for anything in external files.

    Piotr

    ...and your only reply is slàinte mhath

  • Check out this link:

    http://www.simple-talk.com/sql/t-sql-programming/reading-and-writing-files-in-sql-server-using-t-sql/

    Read in the file line by line into a table like it says and then you can search the table.

  • Thanks for the reply. This give me some hope.

    I read the article and will give it a try tonight. Just want to know couple of things for instance how far i understand is that we cannot search the files as is, where is, means we have to fetch the lines or data from the files into our table columns then we can make a search on it. Am I right?

    If so then need to know few things:

    1. Like currently my table have four columns, ID, Description, Path(path of the file i.e. ABC.html), Extension(.html). so do i need the fifth column where I will fetch the data from html files?

    2. The datatype of Path column is nvarchar(MAX) do I need to change it varbinary(MAX)?

    Thanks for any further help.

  • Here is how you can do it:

    1. Create temporary table:

    create table #temp(mydata varchar(max))

    2. Download you HTML file into that table:

    bulk insert #temp

    from 'c:\my_path\my_file.html'

    3. Use a cursor to read each row at a time.

    declare cursor

    for select mydata from #temp

    4. Within iteration of the cursor, search for a specific string by using PATINDEX function.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply