SQL 2005 Database - Designing Mass Load Process

  • Hi All,

    The process requires reading files that is being dumped into a certain directory called the "Pending Directory", the "unknown module" must be able to read from the pending directory from time to time if there are other files being dumped.

    The file that is being dumped into the pending directory is called a "Batch File" which is a text file with a comma separated data inside it, format inside a batch file is similar to this:

    FORMAT:

    BATCH_ID,TABLE_ID,ROWID,MODE,COLUMN_ENTRY,COLUMN_ENTRY,COLUMN_ENTRY,.....

    ----------------------------------------------------------------------------------

    BATCH_ID = There is one(1) batch_id per batch file, and there are many tables per batch file

    TABLE_ID = The ID of the table on where to add, modify, or delete the data.

    ROWID = Typically the row identification per data

    MODE = This is the action that will take place for a certain data, its either Add(A),Modify(M), or Delete(D).

    COLUMN_ENTRY = certain data for a specific column in a table.

    e.g.

    1,22,1,A,Column 1 Entry,Column 2 Entry,....

    1,22,2,D,Column 1 Entry,Column 2 Entry,....

    1,22,3,M,Column 1 Entry,Column 2 Entry,....

    1,22,4,M,Column 1 Entry,Column 2 Entry,....

    1,22,5,A,Column 1 Entry,Column 2 Entry,....

    1,25,1,A,Column 1 Entry,Column 2 Entry,....

    1,25,2,A,Column 1 Entry,Column 2 Entry,....

    1,25,3,M,Column 1 Entry,Column 2 Entry,....

    1,25,4,M,Column 1 Entry,Column 2 Entry,....

    1,25,5,M,Column 1 Entry,Column 2 Entry,....

    I need to design a process with the following requirements:

    1.) A module or something that will timely read Batch files from the Pending directory

    2.) Convert the data into an XML format and validate it using XML schema

    3.) Log if there are erroneous data based from the validation above.

    3.) Insert the data into a Maintenance Database

    4.) After inserting the data, check the constraint, log if error or success.

    5.) All success data shall be inserted to Final Database.

    Right now, I am looking into the following services and technology (SSIS,DTS, and Bulk Insert) to see if it fits with the following requirements, I just don't know if I am on the right direction. If someone can tell me a concrete strategy or design on how to deal with this kind of requirements, then that would be great.

    FYI. I am using SQL Server 2005 Express Edition.

    Thank so much for your time.

  • in my opinion, SSIS would be better approach to follow. I/O of files and playing with XML data makes life more easier in SSIS. Also, there is extensive inbuilt logging functionality avaliable in SSIS which helps user to take corresponding actions. i will recommend to draft the process with your avaiable options, objects which you can use for this process (as sometimes, use of a specific object than other makes life easy).

  • anam (12/7/2008)


    in my opinion, SSIS would be better approach to follow. I/O of files and playing with XML data makes life more easier in SSIS. Also, there is extensive inbuilt logging functionality avaliable in SSIS which helps user to take corresponding actions. i will recommend to draft the process with your avaiable options, objects which you can use for this process (as sometimes, use of a specific object than other makes life easy).

    If I were to use SSIS, would it work using MS SQL Server 2005 Express Edition? and if not, can DTS be an alternative to use based from the requirement I posted previously?

  • SSIS sounds like the right tool for this task. It will definitely work with Express Editions.

    One issue you will have is that express edition does not have Agent and you will need to create a program to run the package on a schedule or on demand.

    The biggest positive is it is very easy to log the progress of a package and create various recovery points when there are issues. The other advantage is to run portions in parallel to speed processing.

    Alan

  • Alan (12/8/2008)


    SSIS sounds like the right tool for this task. It will definitely work with Express Editions.

    One issue you will have is that express edition does not have Agent and you will need to create a program to run the package on a schedule or on demand.

    The biggest positive is it is very easy to log the progress of a package and create various recovery points when there are issues. The other advantage is to run portions in parallel to speed processing.

    Alan

    Wow, that sound's good and a relief because I was researching over the net for any good article about SSIS running with Express Editions. What I found was a forum I've read a couple of days ago and was telling me that SSIS won't work on Express and I can't find any supporting documents/articles for this one. If you can suggest any link that will provide me information about SSIS or SSIS running on Express Editions then that would really be GREAT.

  • If you can suggest any link that will provide me information about SSIS or SSIS running on Express Editions then that would really be GREAT.

    i found a link where it tells how you can programatically execute SSIS packages.

    http://blogs.msdn.com/michen/archive/2007/03/22/running-ssis-package-programmatically.aspx

    create a small SSIS package and try to execute. give it a shot and see if you can execute SSIS package with Express Edition. Unfortunately i dont have this edition installed so can't test it.

  • anam (12/8/2008)


    If you can suggest any link that will provide me information about SSIS or SSIS running on Express Editions then that would really be GREAT.

    i found a link where it tells how you can programatically execute SSIS packages.

    http://blogs.msdn.com/michen/archive/2007/03/22/running-ssis-package-programmatically.aspx

    create a small SSIS package and try to execute. give it a shot and see if you can execute SSIS package with Express Edition. Unfortunately i dont have this edition installed so can't test it.

    Okay I will try to execute SSIS package with Express Edition. I was a little bit concerned on how will I deal with the situation on my original post using SSIS. Can someone please give me a brief architecture or design based from the original post using SSIS while I'll study and test the link you gave me. In this way, this will save me a lot of time as possible.

    Thank you guys for helping. I really appreciate it.

  • I ran into this Microsoft forum and it seems interesting...it seems SQLEXPRESS doesn't have SSIS:

    http://forums.microsoft.com/msdn/ShowPost.aspx?PostID=238335&SiteID=1

    (see post from cajinamaster and reply from Phil Brammer)

    and also this: https://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=3618597&SiteID=1

    So I can't use SSIS in MSSQL Express Edition then...have to try it though.

  • Unfortunately it seems you need dtexec.exe to run the package. I don't know what the distribution rights are for the executable but that is how all of our 400 jobs call SSIS packages. I can give you the batch file format for this but I don't know if you can use the visual studio copy for production use. You would have to run all from file but I think that is most common.

    I am using a Standard edition maintenance server to load several express edition databases but I use its' copy of dtexec.exe to perform these tasks.

  • You cannot use SSIS with SQL Express. The SSIS components themself are not included with this edition.

    Having said that, you have 2 options:

    1. Deploy better equipped version of SQL 2005 (or 2008).

    2. Implement the entire solution, without using SSIS at all.

    I don't recommend using DTS because MS started phasing-out the support for SQL 2000. Going forward you may encounter issues, which MS will not fix unless you pay them dearly.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • Thank you guys for the inputs.

    Alan,

    SSIS sounds like the right tool for this task. It will definitely work with Express Editions.

    I'm just a little bit confused why you said it will definitely work with Express Editions where in fact as CozyRoc have mentioned, "you cannot use SSIS with SQL Express. The SSIS components themself are not included with this edition.".

    CozyRoc,

    What features does SSIS have that will basically fit the objectives or requirements that I need to achieve as to my original post? Since I can still fight for SSIS and convince my manager to have license for the appropriate MS SQL version/release.

    If not, using Express Edition, what other technology would you suggest we can use without using SSIS or DTS?

    I'm sorry but I don't have enough knowledge for this one, and I am just making sure that I am on the right direction.

    Thank you for your time.

  • I'm just a little bit confused why you said it will definitely work with Express Editions where in fact as CozyRoc have mentioned, "you cannot use SSIS with SQL Express. The SSIS components themself are not included with this edition.".

    CozyRoc,

    What features does SSIS have that will basically fit the objectives or requirements that I need to achieve as to my original post? Since I can still fight for SSIS and convince my manager to have license for the appropriate MS SQL version/release.

    If not, using Express Edition, what other technology would you suggest we can use without using SSIS or DTS?

    I'm sorry but I don't enough knowledge for this one, and I just making sure that I am on the right direction.

    Thank you for your time.

    [/quote]

    I guess I should describe our layout. We have a central office with Full editions (Standard and Enterprise) of SQL server on several servers. In remote offices and customer locations we use SSIS to upload data to the SQL Server Express editions on their servers to keep them reasonably current. We also use a central SQL server running standard edition to backup and maintain all of the express editions from vendor software.

    The big advantage of SSIS is not having to code the data type matching and validation as you will have to do if you write your own loader. Additionally, any custom transformations, we have to convert from vendor date to SQL date all the time, can be easily reused and logged and bypassed if they fail to transform.

    As far as recommendations there are 3 I can think of quickly:

    1. Create a link server on the documents and transform and load them. I haven't done this but I pretty sure it is possible.

    2. Create a collection of loader scripts and run SQLCMDexe MS BOL

    3. My option as a programmer would be to create either a windows service or application to do the loading, validation, and error checking.

  • I think this is the misunderstanding.

    You can use SQL Express edition as a data source within SSIS however you will need SQL Satndard edition to implement SSIS itself. That would be the ideal solution for you but the cost my be prohibitive.

    The alternative would be to develop a suite of stored procs to manage this process something I have had to do many moons ago even before the advent of DTS!! This is more than doable and depending on your T-SQL skills can be very powerful and flexible.

    The only outstanding issue will then be the scheduling. I'm not sure what is the best option here. If sqlcmd is compatible with Express then you could leverage the windows scheduler to run a cmd file.

    All in all this would be a fairly bitsy solution but if that's all you have to work with then it might be you best bet.

    Rob.

  • Check the features included in SQL 2005 here.

    For SQL 2008 check here.

    It looks like you need at least SQL Server Standard edition.

    The other approach would be to implement in C# using the .NET framework what you have in mind. If you have good software engineers, this approach is not going to take too long. The .NET framework itself comes out-of-the-box with most of the plumbing you need.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • CozyRoc (12/9/2008)


    You cannot use SSIS with SQL Express. The SSIS components themself are not included with this edition.

    Having said that, you have 2 options:

    1. Deploy better equipped version of SQL 2005 (or 2008).

    2. Implement the entire solution, without using SSIS at all.

    I don't recommend using DTS because MS started phasing-out the support for SQL 2000. Going forward you may encounter issues, which MS will not fix unless you pay them dearly.

    Thank you guys for the knowledgeable inputs posted.

    CozyRoc,

    Do you have any good articles which states MS started phasing-out the support for SQL 2000 or 2005?

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply