Node.value and insert the result to an existing table

  • I do not want the operation to be terminated because a row in the middle was not been able to be inserted. but that row goes to a log table. I have a unique constraint on email and we can not disable it.

    insert into SP.UserTrial

    select * from

    (SELECT TempXML.Node.value('(email)[1]','nvarchar(50)') as Email,' ' as Pass,

    TempXML.Node.value('(name/n/given)[1]', 'nVARCHAR(50)') as FirstName,

    TempXML.Node.value('(name/n/family)[1]', 'VARCHAR(50)') as LastName,

    TempXML.Node.value('(name/fn)[1]','nvarchar(50)') as DisplayName,' ' as [Profile],

    TempXML.Node.value('(email)[1]','nvarchar(50)') as DisplayEmail,' ' as CellPhone, 1 as UpdatedBy,GETDATE() as UpdateDate, 0 as deleted

    FROM @PersonXML.nodes('/enterprise/person') TempXML (Node)) as a

    How can I do that

    Thanks

  • Store the shredded xml data into a relational (temp) staging table.

    Use this table and check for existing emails within your insert statement.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Can you explain more how to shredded down. Performance is very important because the xml files are of size 1g

    Thanks

  • Would you please provide some sample data to play with?

    Basically, a sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • sohairzaki (5/8/2010)


    Can you explain more how to shredded down. Performance is very important because the xml files are of size 1g

    Thanks

    heh... as a side bar, if performance were really important, you wouldn't be accepting XML data. 😀 Instead, you'd hit the vendor up for a nice fast tab delimited file. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • <?xml version="1.0" encoding="UTF-8"?>

    <enterprise>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>John Jack</fn>

    <n>

    <family>John</family>

    <given>Jack</given>

    </n>

    </name>

    <email>JohnJack@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Mary Luiz</fn>

    <n>

    <family>Luiz</family>

    <given>Mary</given>

    </n>

    </name>

    <email>MayLuiz@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Hani Safwat</fn>

    <n>

    <family>Safwat</family>

    <given>Hani</given>

    </n>

    </name>

    <email>HaniSafwat@xyz.com</email>

    </person>

    Thanks

    I have a duplicate email intentially.

  • sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.

  • Thanks

    here is my target table structure

    UserID int Unchecked

    Email nvarchar(50) Unchecked

    Pass nvarchar(20) Unchecked

    FirstName nvarchar(20) Checked

    LastName nvarchar(40) Checked

    DisplayName nvarchar(50) Checked

    Profile nvarchar(MAX) Checked

    DisplayEmail nvarchar(50) Checked

    CellPhone nvarchar(20) Checked

    UpdatedBy int Checked

    UpdateDate datetime Unchecked

    Deleted bit Unchecked

    where email is unique

    Thanks again

  • Ok, here's how I'd do it:

    DECLARE @tbl TABLE

    (

    UserID INT ,

    Email NVARCHAR(50) ,

    Pass NVARCHAR(20) ,

    FirstName NVARCHAR(20) ,

    LastName NVARCHAR(40) ,

    DisplayName NVARCHAR(50) ,

    PROFILE NVARCHAR(MAX) ,

    DisplayEmail NVARCHAR(50) ,

    CellPhone NVARCHAR(20) ,

    UpdatedBy INT ,

    UpdateDate DATETIME ,

    Deleted BIT

    )

    DECLARE @xml XML

    SET @xml ='<?xml version="1.0" encoding="UTF-8"?>

    <enterprise>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>John Jack</fn>

    <n>

    <family>John</family>

    <given>Jack</given>

    </n>

    </name>

    <email>JohnJack@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Lee Terence</fn>

    <n>

    <family>Lee</family>

    <given>Terence</given>

    </n>

    </name>

    <email>tlee11@xyz.com</email>

    </person>

    <person>

    <name>

    <fn>Mary Luiz</fn>

    <n>

    <family>Luiz</family>

    <given>Mary</given>

    </n>

    </name>

    <email>MayLuiz@xyz.com</email>

    </person>

    </enterprise>

    '

    SELECT

    c.value('email[1]','varchar(50)') AS email,

    v.value('fn[1]','varchar(20)') AS DisplayName,

    x.value('given[1]','varchar(40)') AS FirstName,

    x.value('family[1]','varchar(50)') AS LastName

    INTO #mails

    FROM @xml.nodes('enterprise/person') T(c)

    CROSS APPLY

    T.c.nodes('name') U(v)

    CROSS APPLY

    U.v.nodes('n') W(x)

    CREATE CLUSTERED INDEX cx_#mails ON #mails(email)

    INSERT INTO @tbl (Email, FirstName, LastName, DisplayName)

    SELECT email, FirstName, LastName, DisplayName

    FROM #mails

    WHERE NOT EXISTS (SELECT 1 FROM @tbl t WHERE t.Email = #mails.email)

    GROUP BY email, DisplayName, FirstName, LastName

    SELECT *

    FROM @tbl

    DROP TABLE #mails

    A few other issues you might want to check:

    If the email column does really allow only 50 chars you might want to expand it.

    If DisplayName usually is FirstName + LastName you should change the columns so they will add up (20+40=60).

    If the content of PROFILE cannot exceed 4000 char, you should prefer NVARCHAR(4000) over (MAX).

    And finally, you need to cover the following scenarios:

    a) different data in the xml file for a given email address (e.g. differen CellPhone)

    b) How to deal with emails, that are stored in the final table but marked as deleted a.s.o.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply