reading html

  • I have just been given a task and I'm scratching my head on how to start.....

    The following is are the sample what the contents of a column in a table (named documentbody) should be...

    <body><p><a code1="111111" code2="222222"</a></p>

    </body>

    However, we have there are many instances where we have the following...

    a. <body><p><a code1="111111"</a></p>

    </body>

    or

    b.<body><p><a code2="222222"</a></p>

    </body>

    My task is to

    1. Find the occurences of a, and add code2

    2. Find the occurences of b and add code1

    Note that there can be numerous tags in a document, but the codes shoud be with each .

    I hope my questions is not too confusuing 🙂

    Thanks for tips/pointers

  • Would a rather standard REPLACE help?

    SELECT

    REPLACE (

    '<body><p><a code1="111111"</a></p></body>' ,

    '<a code1="111111"</a>' ,

    '<a code1="111111" code2="222222"</a>'

    )



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (2/15/2012)


    Would a rather standard REPLACE help?

    Heh... there you go again! Making stuff simple. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I wished a simple replace was the solution. I was far too simplistic in my initial post.

    I've figured out a few things to get me started.

    From this

    <body><p><a code1="111111" code2="222222"</a></p>

    </body>

    I got the postion of the beginning of code1 like this

    set @poscode1 = (select patindex('%code1="%', @myvar))

    I'm trying to figure out the postion of the the last " in code1="111111"

    Thanks

  • If your initla post was too simplistic, please provide a more descriptive version of what you're looking for. Your latest reply still indicate that REPLACE would be the way to go...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • rabisco (2/16/2012)


    I wished a simple replace was the solution. I was far too simplistic in my initial post.

    I've figured out a few things to get me started.

    From this

    <body><p><a code1="111111" code2="222222"</a></p>

    </body>

    I got the postion of the beginning of code1 like this

    set @poscode1 = (select patindex('%code1="%', @myvar))

    I'm trying to figure out the postion of the the last " in code1="111111"

    Thanks

    Your example only has 2 of those in it. It would be helpful if you could post something closer to your real problem. If it's big, you could always just attach it as a text file.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thanks.

    I'll try to describe the problem in detail.

    Users are able to enter a product code (code1 and code2 as a pair) into a document. There may be more than one entry in a document. Each entry shows up in a block.

    The business problem is that some users only enter code1 or code2 and not both.

    The task is thus to look for each code1 entry, in each paragraph, search for the corresponding code2 value (in another table) and replace (as you rightly pointed out) the original entry i.e. code1 with code1 + code2.

    I hope this describles the problem better. So I understand that replace with do the job ultimately, I guess the bit before replace is what I need help with.

    This is what I have so far....

    Given the column value - <body><p><a name="productLink" class="productLink code1="1595501" href="" uid="_1329497039580846">Processor</a></p></body>

    declare @myvar varchar(max)

    declare @code1sku varchar(50)

    declare @code2sku varchar(50)

    declare @prereplace varchar(50)

    declare @postreplace varchar(50)

    declare @poscode1 int

    set @myvar = ''

    --gets the string containing code1

    select @myvar = bodytext from documentbody

    set @poscode1 = (select patindex('%code1="%', @myvar))

    set @code1sku = (select substring(@myvar, @poscode1 + 13,7))

    --search for code2 sku where you have code1 sku

    set @code2sku = (select code2sku from SKUtable where code1sku = @code1sku)

    set @prereplace = 'code1="'+@code1sku+'"'

    set @postreplace = @prereplace+' code2="'+@code2sku+'"'

    select @postreplace

    select replace(@myvar, @prereplace,@postreplace)

    select @myvar

    This statement select replace(@myvar, @prereplace,@postreplace) returns the result I need i.e. the code2 values is added to the string, but select @myvar shows the string before the replace.

    Thanks.

  • error

  • I'm trying to update the bodytext column of the documentbody table for rows where this string is found.

    <body><p><aname="productLink" class="productLink" code1="1463513" >ProductLink</a></p></body>

    with

    <body><p><aname="productLink" class="productLink" code1="1463513" ode2="4567891">ProductLink</a></p></body>

    the update statement below, updates every row....

    declare @myvar varchar(max)

    declare @code1sku varchar(50)

    declare @code2sku varchar(50)

    declare @prereplace varchar(50)

    declare @postreplace varchar(50)

    declare @poscode1 int

    set @myvar = ''

    declare c1 cursor read_only

    for

    select bodytext from documentbody

    open c1

    fetch next from c1 into @myvar

    while @@fetch_status = 0

    begin

    if charindex('code1=',@myvar) > 0 and charindex(code2=',@myvar) = 0

    begin

    set @poscode1 = (select patindex('%code1="%', @myvar))

    set @code1sku = (select substring(@myvar, @poscode1 + 13,7))

    set @code2sku = (select code2_sku from Linktable where code1_sku = @code1sku)

    set @prereplace = 'code1="'+@code1sku+'"'

    set @postreplace = @prereplace+' code2="'+@code2sku+'"'

    set @myvar = replace(@myvar, @prereplace,@postreplace)

    update documentbody

    set bodytext = @myvar

    where bodytext is not null

    end

    fetch next from c1 into @myvar

    end

    close c1

    deallocate c1

    please help.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply