String splitter to table weird result

Question

Post reply

String splitter to table weird result

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Digs SSCertifiable Points: 7333 More actions · Answer 1

Digs

SSCertifiable

Points: 7333

May 2, 2010 at 4:36 pm

#1161039

.."As we've seen on dozens of posts, an XML splitter would not have performed so well."..

Jeff you have seen the yahoo .csv link I posted in this thread.

I too investigated XML transfer of OHLCV data.

So you Say that plain TEXT splitting is fast than XML splitting into a DateOHLCV table.

Interesting..

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 2

Digs (5/2/2010)

.."As we've seen on dozens of posts, an XML splitter would not have performed so well."..

Jeff you have seen the yahoo .csv link I posted in this thread.

I too investigated XML transfer of OHLCV data.

So you Say that plain TEXT splitting is fast than XML splitting into a DateOHLCV table.

Interesting..

If I understand you correctly, you're asking if I think that plain text splitting is faster than shredding properly formed XML. My answer is... I don't know... I've not had to worry about XML downloads.

What I was saying is that a Tally table splitter is usually faster than using FOR XML to do the split on CSV text for you.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

steve-893342 SSCarpal Tunnel Points: 4439 More actions · Answer 3

You can also tackle this problem using OPENROWSET BULK.

The data is first referenced using OPENROWSET BULK via the CTE cteFileData. This uses the unstructured format file DataSampleBulk.fmt which has a single field called BulkColumn defined as SQLCHAR 8000 and terminated with #

The second CTE cteData then splits the file vertically using CROSS APPLY/Tally split method and in the same query reassembles the data into the 8 column components using PIVOT. The final SELECT on cteData resolves each column into the correct data type

;



WITH cteFileData AS

(

SELECT BulkColumn FROM OPENROWSET (BULK 'C:\DataSample.txt', FORMATFILE = 'C:\DataSampleBulk.fmt') AS X

)

,

cteData AS

(

SELECT [1],[2],[3],[4],[5],[6],[7],[8] FROM cteFileData

CROSS APPLY

(

SELECT [1],[2],[3],[4],[5],[6],[7],[8] FROM

(

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

SUBSTRING(BulkColumn + ',', N, CHARINDEX(',', BulkColumn + ',', N) - N) AS Value

FROM master.dbo.Tally

WHERE N < LEN(BulkColumn) + 2 AND SUBSTRING(',' + BulkColumn + ',', N, 1) = ','

) AS Z

PIVOT

(

MAX(Value) FOR ROW IN

(

[1],[2],[3],[4],[5],[6],[7],[8]

)

)

AS pvt

)

AS Y

)

SELECT

CONVERT(VARCHAR(20), [1]) AS COL1,

CONVERT(DATETIME, [2]) AS COL2,

CONVERT(DECIMAL(9,2), [3]) AS COL3,

CONVERT(DECIMAL(9,2), [4]) AS COL4,

CONVERT(DECIMAL(9,2), [5]) AS COL5,

CONVERT(DECIMAL(9,2), [6]) AS COL6,

CONVERT(BIGINT, [7]) AS COL7,

CONVERT(DECIMAL(9,2), [8]) AS COL8

FROM cteData

tosscrosby SSCoach Points: 18045 More actions · Answer 4

tosscrosby

SSCoach

Points: 18045

May 4, 2010 at 8:14 am

#1161829

WayneS (5/2/2010)

Jeff must be slipping :-D... usually when he gets involved, minutes go down to milliseconds! :w00t:

And then the $$$ to secure his services goes way up when that happens. Maybe he's simply trying to draw him into his lair before pouncing??? :w00t:

-- You can't be late until you show up.

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 5

Digs (5/2/2010)

.."As we've seen on dozens of posts, an XML splitter would not have performed so well."..

Jeff you have seen the yahoo .csv link I posted in this thread.

I too investigated XML transfer of OHLCV data.

So you Say that plain TEXT splitting is fast than XML splitting into a DateOHLCV table.

Interesting..

Sorry for resurrecting a post that's more than a year old... I can finally answer this with some good bit of authority... please see the following article... particularly the comparison charts.

http://qa.sqlservercentral.com/articles/Tally+Table/72993/

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions