Help Needed in Processing Data

  • Please find the attached files for the reference and please let me know if i need to provide any additional info.

    thank you for your precious time.

  • KGJ-Dev (8/15/2015)


    Please find the attached file for the reference and please let me know if i need to provide any additional info.

    thank you for your precious time.

    Heh... that's also generated data and not from the original file. If I needed the generated data, I could easily generate a million rows to test with.

    Let's hope the original file has the same end of line delimiter as what you've generated.

    Shifting gears, it looks like everything you did is right. Does it work for you?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Jeff,

    Thank you

    yes, the end of character will be CRLF (carriage return line feed). also i tried with second option which was mentioned in my samplescripts attachment in my previous email and am getting the error as

    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 5000, column 6 (PhoneNumber).

    Because the last line doesn't have the CRLF.

    Any help please....

  • KGJ-Dev (8/15/2015)


    Hi Jeff,

    Thank you

    yes, the end of character will be CRLF (carriage return line feed). also i tried with second option which was mentioned in my samplescripts attachment in my previous email and am getting the error as

    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 5000, column 6 (PhoneNumber).

    Because the last line doesn't have the CRLF.

    Any help please....

    Ah. Common problem with no straight forward solution. Does the last line in a real file that you have contain the final CRLF?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • There's something else going on here. Here's the error that I get.

    Msg 4866, Level 16, State 1, Line 1

    The bulk load failed. [font="Arial Black"]The column is too long in the data file for row 1, column 6.[/font] Verify that the field terminator and row terminator are specified correctly.

    Msg 7399, Level 16, State 1, Line 1

    The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.

    Msg 7330, Level 16, State 2, Line 1

    Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Jeff,

    thanks for your time on this and what i did was i just added line feed for the last row and am able to achieve without any errors. I checked my actual files and all the files are having carriage return line feed for the last row. so i should not be getting any issue with my actual files.

    Do you have any questions for me?

  • Heh... ya just gotta love the nuances of some of these things along with the lack of documentation on them. For example...

    1. It's documented that if you use BCP interactively, the \ r will automatically be added to \ n. For non interactive use, it must be specified.

    2. Apparently, BULK INSERT is considered to be an "interactive" use, although it's not documented as such and has no info specifically stating that \ r will be automatically pre-pended to \ n. My short coming on that was the fact that I usually use xp_CmdShell and BCP for my imports due to the "extended reach" that xp_CmdShell has for cross-server work and I rarely have to specify the row terminator when I do use BULK INSERT.

    What that means is, as non-obvious as it is, your first BULK INSERT code does the intended job. I've also verified that the text file that you've provided does, in fact, use CrLf as the row terminator and does, in fact, have such a terminator on the last line of the file.

    Still, the PhoneNumber column in the table is defined as an INT and while it will certainly handle the test phone number of 1111111111 (ten 1's), it will not handle anything over a binary 2 billion (2147483647) and must be changed, probably to BIGINT so as to continue the automatic checking for numeric values although more validation to reject negative and short numbers should be included as a column constrain.

    Also, there's not reason for AGE to be an INT. No one can be a negative number of years old and, most certainly, I don't know of anyone that lives for more than 255 years. The AGE column should be changed to TINYINT, which will automatically reject negative numbers.

    With all of that in mind, the following works as expected against the Users.txt file that you provided.

    CREATE TABLE dbo.Users

    (

    ID BIGINT IDENTITY(1,1)

    ,Name VARCHAR(60)

    ,Age TINYINT --Changed to TINYINT

    ,DOB DATE

    ,Salary MONEY

    ,[Address] VARCHAR(500)

    ,PhoneNumber BIGINT --Changed to BIGINT

    ,Date_Created DATETIME DEFAULT GETDATE()

    ,HasProcessed BIT DEFAULT 0

    CONSTRAINT PK_Users PRIMARY KEY CLUSTERED (ID)

    )

    ;

    GO

    CREATE VIEW dbo.vw_Users

    AS

    SELECT Name,Age,DOB,Salary,[Address],PhoneNumber FROM dbo.Users

    ;

    GO

    BULK INSERT dbo.vw_Users

    FROM 'E:\Test\Users.txt'

    WITH (

    FIRSTROW = 1

    ,FIELDTERMINATOR = '|'

    ,ROWTERMINATOR = '|'

    )

    ;

    As a bit of a sidebar, I never pluralize the name of a table because I also never use just "ID" as a column name. I always follow the naming convention of tablenameID when using such a column. UsersID and, certainly, something like CompaniesID just doesn't float my boat for naming.

    I also ALWAYS name the PK of a table using the PK_tablename convention for permanent tables because I may have to someday disable the constraint programmatically and don't want to have to tolerate a seemingly random machine-assigned name. It also takes the guesswork out of what I should look for when it comes to what the PK constraint is called.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • since this is research i am doing it, next i am going to try with dumping the data as it it with Bulk insert and will be using DelimtSplit8k function (obviously your asset :-)) to split the data and insert into actual table. I have one quick question on that. when we pass the row to function, it return as table with two columns (number and item). basically i need the item from the fuction which i need to insert into my actual table.how to assign the rows am getting from the function output to column values in my actual table?

    Because i am not going to loop through. i would be using cross apply. any suggestions / samples please

  • Hi Jeff,

    All your concerns are very valid and Regarding table name, age datatype, ID are just sample structure to play with data on this thread. As i said in my earlier post, this is not my real table. This is i created to do testing.I Apologize for that if those naming conventions and sample datatypes confused you.

  • ahh...

    I could figured out the logic around. Below is the try for my another question and it works fine

    select

    MAX(Case when ItemNumber = 1 then Item else '' end) as Name,

    MAX(Case when ItemNumber = 2 then Item else '' end) as Age,

    MAX(Case when ItemNumber = 3 then Item else '' end) as Salary,

    MAX(Case when ItemNumber = 4 then Item else '' end) as Address,

    MAX(Case when ItemNumber = 5 then Item else '' end) as phonenumber

    FROM EDEN_Nutrition.dbo.DelimitedSplit8K('User1|24|2015-07-31|10.00|Address1|1111111111|','|')

    Thanks for your awesome function. and once again thank you so much for replying to my post and mentoring me on this weekend.

    you are a great mentor!!!!

  • KGJ-Dev (8/16/2015)


    Hi Jeff,

    All your concerns are very valid and Regarding table name, age datatype, ID are just sample structure to play with data on this thread. As i said in my earlier post, this is not my real table. This is i created to do testing.I Apologize for that if those naming conventions and sample datatypes confused you.

    It wasn't confusion on my part. They were suggestions on my part. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • KGJ-Dev (8/16/2015)


    ahh...

    I could figured out the logic around. Below is the try for my another question and it works fine

    select

    MAX(Case when ItemNumber = 1 then Item else '' end) as Name,

    MAX(Case when ItemNumber = 2 then Item else '' end) as Age,

    MAX(Case when ItemNumber = 3 then Item else '' end) as Salary,

    MAX(Case when ItemNumber = 4 then Item else '' end) as Address,

    MAX(Case when ItemNumber = 5 then Item else '' end) as phonenumber

    FROM EDEN_Nutrition.dbo.DelimitedSplit8K('User1|24|2015-07-31|10.00|Address1|1111111111|','|')

    Thanks for your awesome function. and once again thank you so much for replying to my post and mentoring me on this weekend.

    you are a great mentor!!!!

    Nah... you're a good student.

    As you've probably found out, the idea of importing the data as "whole row" data and then splitting it is quite a bit slower than importing based on delimited fields to columns. And ,yes, I do understand that it was just an experiment. Glad to see you got this running for a single input. Take a look at the header for the function to see the usage example for using it against many rows.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • thanks Jeff and that is my next action item. Right now am closing this thread. If something goes wrong, i will open it again for the discussion.

Viewing 13 posts - 16 through 27 (of 27 total)

You must be logged in to reply to this topic. Login to reply