Flat File Normalization

  • Hi, can anyone help?

    Well, I've got a flat file (in {CR}{LF} format) that needs to be imported into SQL Server 2000.

    Now I need to know how I can create a package using DTS, or something similar, which breaks the data down and inserts it into the appropriate tables from the flat file. I need to find a way to carry this out in one process as the data is imported through the pipe, since I have multiple tables which would eventually contain the data from the flat file. I'll be getting the data every two weeks, and it will need to be imported into SQL Server 2000, so is there any way I can automate the process?

    regards

    There are an awful lot of variables that you have not covered in your request. This task should be relatively simple for DTS to handle. Open a connection to the flat file and process the data, pushing it to the correct tables. That is what DTS is designed to do (move data from one place to another and transform it along the way).

    Other things you might want to consider:

    1. Will this be a completely automated process (i.e. run on a scheduler)?

    2. The possibility of getting a duplicate file by mistake.

    3. Is there a need to "undo" a run (i.e. unload all the data from one run because of bad data, a duplicate file, etc.)?

    4. Do you need to be notified if you don't receive a file?

    5. etc.

    All of the above require additional thought and programming to handle, but if this is just a down-and-dirty run every two weeks, then you could simply run the DTS package manually each time you receive the file.

  • Hi there,

    Thanks for the clarification. I have tried getting around it with DTS: I can import stuff from the flat file into a single table, which again would be a flat table, but is there any way I could import the data into different tables, in a normalized way, even at the expense of running the package manually every two weeks?

    regards

  • Don't do a bulk import.  Use a SQL Connection object, a Flat File Source object, and then multiple transformation objects (one for each table you will transform into).  Each transformation object should access only the columns of the flat file you need for that specific table.  You should be able to tweak it to execute in a specific order, I think (use workflow).

    James.

    PS. Personally, I find it easier to use DTS to do the bulk load into a "staging" table and then a stored procedure for the fine work of transforming the data into multiple tables (but I like writing code).
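
    In case it helps, here is a minimal sketch of that staging approach. All of the names (Staging_Import, Customers, Orders, and the columns) are hypothetical placeholders; your file layout will differ:

    -- DTS bulk-loads the raw file into dbo.Staging_Import first;
    -- this procedure then does the set-based "fine work".
    CREATE PROCEDURE dbo.usp_NormalizeImport
    AS
    BEGIN
        -- Parent rows first, skipping codes we already have.
        INSERT INTO dbo.Customers (CustomerCode, CustomerName)
        SELECT DISTINCT s.CustomerCode, s.CustomerName
        FROM dbo.Staging_Import s
        WHERE NOT EXISTS (SELECT 1 FROM dbo.Customers c
                          WHERE c.CustomerCode = s.CustomerCode)

        -- Child rows next, joining back to pick up the surrogate key.
        INSERT INTO dbo.Orders (CustomerID, OrderDate, Amount)
        SELECT c.CustomerID, s.OrderDate, s.Amount
        FROM dbo.Staging_Import s
        JOIN dbo.Customers c ON c.CustomerCode = s.CustomerCode

        -- Empty the staging table for the next fortnightly load.
        TRUNCATE TABLE dbo.Staging_Import
    END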

  • Ali,

    When you say "flat file", do you mean all of the rows have a "fixed field length"? 

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Yep, the rows have a fixed length.

    Thanks for the help. Actually, I have managed it: as I told Mark, I created the stored procedures, which contain cursors and are hooked up with insert stored procedures, and that in essence does the job. But ideally I want to automate the process by using jobs and trigger it from the web, so that someone would upload the flat file every two weeks and all the relevant steps within the job would fire and the files would be loaded in one process.

    So any help would be much appreciated.
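
    One thought I had (assuming the whole load is wrapped in a hypothetical SQL Server Agent job named 'Fortnightly Flat File Import'): could the upload page simply kick the job off on demand with something like this?

    -- Start the Agent job on demand, e.g. after the web upload completes.
    -- The job name is a hypothetical placeholder.
    EXEC msdb.dbo.sp_start_job @job_name = 'Fortnightly Flat File Import'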


  • My recommendation is to listen to JLK. Bring the data into a flat table and then fire off a sproc to break it up. You could probably do the same thing strictly within DTS by reading some columns into one table, then other columns into another, and so on. But when moving data from one place to another, generally the best way is connect, shove the data through, disconnect. Then, more or less at your leisure, process the data.

    That way, if some of the data is corrupted or just doesn't fit your working parameters or any of a long list of things that could go wrong, you aren't risking breaking the connection and having to restart once you figure it out. Bring the data in, process what can be processed, flag what couldn't, check out any problems later.
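
    For example (reusing the hypothetical staging and target names from James's sketch above, assuming the staging columns hold raw character data and the staging table carries an assumed ErrorFlag column):

    -- Flag rows that fail basic validation instead of aborting the run.
    UPDATE dbo.Staging_Import
    SET ErrorFlag = 1
    WHERE ISDATE(OrderDate) = 0
       OR ISNUMERIC(Amount) = 0

    -- Process only the clean rows; flagged rows wait for review.
    INSERT INTO dbo.Orders (CustomerID, OrderDate, Amount)
    SELECT c.CustomerID, s.OrderDate, s.Amount
    FROM dbo.Staging_Import s
    JOIN dbo.Customers c ON c.CustomerCode = s.CustomerCode
    WHERE s.ErrorFlag = 0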

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

  • Y-e-e-e-o-o-ow!  Cursors, Ali?  You don't need cursors for this... James and Tomm are correct... import the whole file into a very wide staging table and then have a process for each record type handle the records by identifier... it will be much quicker and a heck of a lot easier to maintain.
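
    Something like this (the record type codes and columns are made-up placeholders; in a fixed-width multi-record file the identifier is usually the first column of the staging table):

    -- One set-based pass per record type... no RBAR required.
    INSERT INTO dbo.Customers (CustomerCode, CustomerName)
    SELECT s.Col2, s.Col3
    FROM dbo.Staging_Import s
    WHERE s.RecordType = '01'   -- customer records

    INSERT INTO dbo.Orders (CustomerID, OrderDate, Amount)
    SELECT c.CustomerID, s.Col4, s.Col5
    FROM dbo.Staging_Import s
    JOIN dbo.Customers c ON c.CustomerCode = s.Col2
    WHERE s.RecordType = '02'   -- order/detail records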

    --Jeff Moden



  • How do you know when to run the import from the flat file?  Is there a certain time that it will be there, so you can schedule the processing, or are you alerted in some other way that it can be processed?

    If there is no scheduled time that the file exists, save the DTS package created to process it into the staging table, create a stored procedure to distribute the data into the appropriate tables, and manually run the DTS package and the stored procedure when needed.  If it can be scheduled, all the better.
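
    If you do schedule it, here is a rough sketch of the Agent setup (every name is a hypothetical placeholder, and the package is assumed to be saved on the server as FlatFileToStaging):

    -- Create the job.
    EXEC msdb.dbo.sp_add_job @job_name = 'Fortnightly Flat File Import'

    -- Step 1: run the saved DTS package with dtsrun.
    EXEC msdb.dbo.sp_add_jobstep
        @job_name  = 'Fortnightly Flat File Import',
        @step_name = 'Load staging table',
        @subsystem = 'CmdExec',
        @command   = 'dtsrun /S MyServer /E /N FlatFileToStaging',
        @on_success_action = 3   -- go to the next step

    -- Step 2: distribute the staged data with the stored procedure.
    EXEC msdb.dbo.sp_add_jobstep
        @job_name  = 'Fortnightly Flat File Import',
        @step_name = 'Distribute to tables',
        @subsystem = 'TSQL',
        @command   = 'EXEC dbo.usp_NormalizeImport'

    -- Run every two weeks, Mondays at 06:00.
    EXEC msdb.dbo.sp_add_jobschedule
        @job_name = 'Fortnightly Flat File Import',
        @name     = 'Every two weeks',
        @freq_type = 8,                 -- weekly
        @freq_interval = 2,             -- Monday
        @freq_recurrence_factor = 2,    -- every 2 weeks
        @active_start_time = 60000      -- 06:00:00

    -- Attach the job to the local server so Agent will run it.
    EXEC msdb.dbo.sp_add_jobserver @job_name = 'Fortnightly Flat File Import'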

    Steve
