May 7, 2010 at 8:20 pm
I do not want the operation to be terminated because a row in the middle was not been able to be inserted. but that row goes to a log table. I have a unique constraint on email and we can not disable it.
insert into SP.UserTrial
select * from
(SELECT TempXML.Node.value('(email)[1]','nvarchar(50)') as Email,' ' as Pass,
TempXML.Node.value('(name/n/given)[1]', 'nVARCHAR(50)') as FirstName,
TempXML.Node.value('(name/n/family)[1]', 'VARCHAR(50)') as LastName,
TempXML.Node.value('(name/fn)[1]','nvarchar(50)') as DisplayName,' ' as [Profile],
TempXML.Node.value('(email)[1]','nvarchar(50)') as DisplayEmail,' ' as CellPhone, 1 as UpdatedBy,GETDATE() as UpdateDate, 0 as deleted
FROM @PersonXML.nodes('/enterprise/person') TempXML (Node)) as a
How can I do that
Thanks
May 8, 2010 at 5:12 am
Store the shredded xml data into a relational (temp) staging table.
Use this table and check for existing emails within your insert statement.
May 8, 2010 at 11:50 am
Can you explain more how to shredded down. Performance is very important because the xml files are of size 1g
Thanks
May 8, 2010 at 12:58 pm
Would you please provide some sample data to play with?
Basically, a sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.
May 8, 2010 at 5:19 pm
sohairzaki (5/8/2010)
Can you explain more how to shredded down. Performance is very important because the xml files are of size 1gThanks
heh... as a side bar, if performance were really important, you wouldn't be accepting XML data. 😀 Instead, you'd hit the vendor up for a nice fast tab delimited file. 😉
--Jeff Moden
May 9, 2010 at 7:55 pm
<?xml version="1.0" encoding="UTF-8"?>
<enterprise>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>John Jack</fn>
<n>
<family>John</family>
<given>Jack</given>
</n>
</name>
<email>JohnJack@xyz.com</email>
</person>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>Mary Luiz</fn>
<n>
<family>Luiz</family>
<given>Mary</given>
</n>
</name>
<email>MayLuiz@xyz.com</email>
</person>
<person>
<name>
<fn>Hani Safwat</fn>
<n>
<family>Safwat</family>
<given>Hani</given>
</n>
</name>
<email>HaniSafwat@xyz.com</email>
</person>
Thanks
I have a duplicate email intentially.
May 10, 2010 at 12:50 am
sample xml file with the structure you're faced with holding FAKE data and your target table with a few, again FAKE, data together with your expected result.
May 10, 2010 at 10:58 am
Thanks
here is my target table structure
UserID int Unchecked
Email nvarchar(50) Unchecked
Pass nvarchar(20) Unchecked
FirstName nvarchar(20) Checked
LastName nvarchar(40) Checked
DisplayName nvarchar(50) Checked
Profile nvarchar(MAX) Checked
DisplayEmail nvarchar(50) Checked
CellPhone nvarchar(20) Checked
UpdatedBy int Checked
UpdateDate datetime Unchecked
Deleted bit Unchecked
where email is unique
Thanks again
May 10, 2010 at 1:19 pm
Ok, here's how I'd do it:
DECLARE @tbl TABLE
(
UserID INT ,
Email NVARCHAR(50) ,
Pass NVARCHAR(20) ,
FirstName NVARCHAR(20) ,
LastName NVARCHAR(40) ,
DisplayName NVARCHAR(50) ,
PROFILE NVARCHAR(MAX) ,
DisplayEmail NVARCHAR(50) ,
CellPhone NVARCHAR(20) ,
UpdatedBy INT ,
UpdateDate DATETIME ,
Deleted BIT
)
DECLARE @xml XML
SET @xml ='<?xml version="1.0" encoding="UTF-8"?>
<enterprise>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>John Jack</fn>
<n>
<family>John</family>
<given>Jack</given>
</n>
</name>
<email>JohnJack@xyz.com</email>
</person>
<person>
<name>
<fn>Lee Terence</fn>
<n>
<family>Lee</family>
<given>Terence</given>
</n>
</name>
<email>tlee11@xyz.com</email>
</person>
<person>
<name>
<fn>Mary Luiz</fn>
<n>
<family>Luiz</family>
<given>Mary</given>
</n>
</name>
<email>MayLuiz@xyz.com</email>
</person>
</enterprise>
'
SELECT
c.value('email[1]','varchar(50)') AS email,
v.value('fn[1]','varchar(20)') AS DisplayName,
x.value('given[1]','varchar(40)') AS FirstName,
x.value('family[1]','varchar(50)') AS LastName
INTO #mails
FROM @xml.nodes('enterprise/person') T(c)
CROSS APPLY
T.c.nodes('name') U(v)
CROSS APPLY
U.v.nodes('n') W(x)
CREATE CLUSTERED INDEX cx_#mails ON #mails(email)
INSERT INTO @tbl (Email, FirstName, LastName, DisplayName)
SELECT email, FirstName, LastName, DisplayName
FROM #mails
WHERE NOT EXISTS (SELECT 1 FROM @tbl t WHERE t.Email = #mails.email)
GROUP BY email, DisplayName, FirstName, LastName
SELECT *
FROM @tbl
DROP TABLE #mails
A few other issues you might want to check:
If the email column does really allow only 50 chars you might want to expand it.
If DisplayName usually is FirstName + LastName you should change the columns so they will add up (20+40=60).
If the content of PROFILE cannot exceed 4000 char, you should prefer NVARCHAR(4000) over (MAX).
And finally, you need to cover the following scenarios:
a) different data in the xml file for a given email address (e.g. differen CellPhone)
b) How to deal with emails, that are stored in the final table but marked as deleted a.s.o.
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply