DTS or BCP or ???

  • Hi, I'd like some help deciding between DTS and bcp and anything else I haven't heard of yet. Here's the scenario:

    I need to automate the import of a monthly fixed-width text file into a SQL Server 7.0 table. The data cannot be imported directly: two fields have to be transformed, and the data gets split into two tables (based on a 'type' code). Each monthly file is named differently by year/month.

    I created a DTS package that imports a single one of these files just fine, but I don't know how to make the DTS package choose the new file each month automatically.

    I suppose I could also use bcp to a holding table, and then use a stored procedure to do the transformation and move the data to the live table.

    In either case, I'd like this to be wrapped up in some mechanism in SQL Server that periodically checks for the new file and imports it.

    Other info: the table is typically around 20 MB in size / 150,000 records. The fields to transform are 1) an odd date field and 2) a numeric field that has a text byte at the end to signify the +/- sign and a decimal value.

    Your help and suggestions are greatly appreciated!

  • In my work, I have many, many text files to import. I created a system that periodically scans a directory for a particular file name I store in a table, imports the contents of the file through bcp, scrubs the data with whatever transformations are needed using a queue-based scrubbing process that works through various scrub states, and finally updates or inserts the results into a production table. It handles multiple formats within a single file (i.e. header, detail, trailer, or type1, type3, type5, type7, etc.), logs the process for later auditing, balances the import processing across an unlimited number of job processors, and is quite well commented throughout the code.

    It averages around 14,000,000 records in 7 1/2 minutes on my hardware, from currently 58 different files, handling EBCDIC-to-ASCII translations, pseudo-Julian date conversions, etc. One of the conversions I do on some of the files is the decimal field with text to indicate +/-: a common EBCDIC-to-ASCII overstrike (overpunch) character conversion.
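
    (A minimal sketch of that kind of sign-byte conversion, for anyone following along; it assumes the standard overpunch alphabet and a value with two implied decimals, and the names are made up for illustration, not Cathan's actual code:)

    -- Overpunch last byte: '{','A'..'I' = +0..+9; '}','J'..'R' = -0..-9
    DECLARE @raw VARCHAR(10), @digit INT, @sign INT
    SET @raw   = '0012345J'   -- hypothetical input; should yield -1234.51
    SET @digit = (CHARINDEX(RIGHT(@raw, 1), '{ABCDEFGHI}JKLMNOPQR') - 1) % 10
    SET @sign  = CASE WHEN CHARINDEX(RIGHT(@raw, 1), '}JKLMNOPQR') > 0 THEN -1 ELSE 1 END
    -- Replace the sign byte with its digit, then apply the sign and implied decimals
    SELECT @sign * CAST(LEFT(@raw, LEN(@raw) - 1) + CAST(@digit AS CHAR(1)) AS INT) / 100.0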

    If you decide to go the bcp route, I would be happy to share my script library with you.


  • Thanks for the quick reply and, yes, I would be interested to see the script library.

  • If you want a T-SQL solution, use BULK INSERT: your SP can be passed a filename and then run BULK INSERT into a table, which you then manipulate.

    FYI

    BULK INSERT is the fastest way of importing data.
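
    (A minimal sketch of that shape, with placeholder table and file names:)

    -- Load the raw file into a staging table, then transform from there
    BULK INSERT dbo.tmpStaging                     -- hypothetical staging table
    FROM 'C:\Imports\monthly.dat'                  -- placeholder path
    WITH (FORMATFILE = 'C:\Imports\monthly.fmt')   -- format file: see below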

    Simon Sabin
    SQL Server MVP
    Co-author of SQL Server 2000 XML Distilled
    http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
    http://sqlblogcasts.com/blogs/simons

  • I don't think we can automate the process using BULK INSERT, since we need to pass the filename as a variable to the stored procedure.

  • How would using Bulk Insert prevent that? I don't see it.

    Anyway, Greg, I sent an e-mail to the address you posted with some initial scripts that you could quickly chop up and adapt to your needs.


  • Thanks, scorpion_66! I'll be trying the bcp method next.

    As I'm a relative SQL newbie, I'm doing this same project several ways and comparing results. I just did the first phase of the BULK INSERT method... man, is that fast! But is there a way to get the data into separate fields DURING the BULK INSERT? The source data is in a fixed-width file with no delimiters. Right now I'm using SUBSTRING (and various string functions for the two fields that need transformation) to parse each line into another temp table after the bulk insert, as sketched below.
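
    (For later readers, the parse step described above looks roughly like this; the table names, offsets, and widths are made up for illustration:)

    -- Each raw line was bulk-inserted into a one-column staging table
    INSERT INTO dbo.tmpParsed (AcctNo, TypeCode, RawDate, RawAmount)
    SELECT SUBSTRING(RawLine,  1, 10),
           SUBSTRING(RawLine, 11,  2),
           SUBSTRING(RawLine, 13,  8),
           SUBSTRING(RawLine, 21,  9)
    FROM dbo.tmpRawLines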

    Thanks again for all your help.

  • You need to use a format file to import fixed-width data into different columns; see the sample below.
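
    (For reference, a SQL Server 7.0 format file for a fixed-width file looks something like this; the widths and column names below are placeholders. The first line is the bcp version, the second the number of fields; each row then gives host field order, host data type, prefix length, width, terminator, target column order, and target column name.)

    7.0
    4
    1   SQLCHAR   0   10   ""       1   AcctNo
    2   SQLCHAR   0   2    ""       2   TypeCode
    3   SQLCHAR   0   8    ""       3   RawDate
    4   SQLCHAR   0   9    "\r\n"   4   RawAmount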

    Simon Sabin
    SQL Server MVP
    Co-author of SQL Server 2000 XML Distilled
    http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
    http://sqlblogcasts.com/blogs/simons

  • Thanks, Simon. It was a bit of a pain to create the file, but it works! It imports 108,000 rows (10 MB) in under 7 seconds. I didn't expect the transformations to go so quickly either... OTOH, not so bad: under 20 seconds including the transformations!

    Any thoughts on how to launch this stored procedure (including how to find and pass a filename) from inside SQL Server? I know how to do it from VB, but it'd be nice to have it work from inside the database.


  • You can launch the proc regularly by scheduling it as a job. In Enterprise Manager: Management --> SQL Server Agent --> Jobs --> New Job... Or you can script the job in T-SQL, as sketched below.
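
    (The same job can be created in T-SQL through the msdb procedures; a sketch, with a made-up job name and a hard-coded path. For the year/month-stamped files, the job step could instead call a wrapper proc that builds the file name from GETDATE().)

    EXEC msdb.dbo.sp_add_job @job_name = 'Monthly ALDB import'
    EXEC msdb.dbo.sp_add_jobstep @job_name = 'Monthly ALDB import',
         @step_name = 'Run import proc', @subsystem = 'TSQL',
         @database_name = 'CUSP001',
         @command = 'EXEC cspALDBImport2 ''C:\MSSQL7\BCPFiles\aldb.dat'''
    EXEC msdb.dbo.sp_add_jobschedule @job_name = 'Monthly ALDB import',
         @name = 'Daily check', @freq_type = 4, @freq_interval = 1,
         @active_start_time = 20000   -- 02:00:00 (HHMMSS)
    EXEC msdb.dbo.sp_add_jobserver @job_name = 'Monthly ALDB import'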

    -Dan

  • I can't seem to substitute an input variable for the string in the BULK INSERT code below. Any ideas? (The error just states "Error 170: Line 12: Incorrect syntax near '@ImportFileSpec'" when I check syntax.)

    Here's my SQL...

    Alter Procedure "cspALDBImport2"
    (
        @ImportFileSpec VarChar(255),
        @FormatFileSpec VarChar(255) = 'C:\MSSQL7\BCPFiles\aldb.fmt'
    )
    As
    set nocount on

    TRUNCATE TABLE CUSP001.dbo.tmpALDB_BI2

    BULK INSERT CUSP001.dbo.tmpALDB_BI2
    FROM @ImportFileSpec --THIS LINE IS THE PROBLEM
    --FROM 'C:\test.dat' --This was what I used for tests, it works.
    WITH (FORMATFILE = @FormatFileSpec)

    <snip>

    Thanks,

    Greg.


  • As you are using a variable rather than a static value, you need to put the BULK INSERT statement inside an EXEC statement.

    EG:

    EXEC ('BULK INSERT CUSP001.dbo.tmpALDB_BI2 FROM ''' + @ImportFileSpec + ''' WITH (FORMATFILE = ''' + @FormatFileSpec + ''')')
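
    (Called with, for example, a year/month-stamped file name; the name below is hypothetical:)

    EXEC cspALDBImport2 @ImportFileSpec = 'C:\MSSQL7\BCPFiles\aldb200211.dat'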

    Thanks

    Phill Carter

    --------------------
    Colt 45 - the original point and click interface

  • Thanks for the quick reply Phill. Works like a charm.

  • The scripts I sent you have examples of using EXEC, but with bcp instead. And in my example, I read the file specs from a table.

    Also, if you have huge files, and lots of them, you might consider one scheduled job scanning for the file and importing it, and another to come along for the transformations (there's a sketch of this hand-off after this post). In my case, I had to let go of the file as soon as possible in order to receive, or be able to receive, another like it, so I hand the conversions off to another process. This also separates the file handling from the conversion. Eventually you'll run into a conversion failure, and you should decide whether it should fail the full import, etc.

    The last part of my import renames the file into a subdirectory created daily. So the FTP process updates a file if it exists, puts it there if it doesn't, and the import process imports it and moves it to archive. The only problems I experience are when I receive two files back to back before I can finish processing the first one. Some of my files are 3-4 GB in size and can take an hour to load. They took over three when I was still doing conversions on the import.

    May not be applicable to your project, but something to think about, anyway.
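
    (A sketch of the hand-off pattern described above, not Cathan's actual code; the table and proc names are invented. The import job records each loaded file with a scrub state, and the separate transformation job polls for work:)

    CREATE TABLE dbo.ImportQueue (
        BatchId    INT IDENTITY PRIMARY KEY,
        FileName   VARCHAR(255) NOT NULL,
        ScrubState TINYINT NOT NULL DEFAULT 0,   -- 0=loaded, 1=scrubbing, 2=done, 9=failed
        LoadedAt   DATETIME NOT NULL DEFAULT GETDATE()
    )

    -- Transformation job: claim the oldest loaded batch, then scrub it
    DECLARE @BatchId INT
    SELECT @BatchId = MIN(BatchId) FROM dbo.ImportQueue WHERE ScrubState = 0
    IF @BatchId IS NOT NULL
    BEGIN
        UPDATE dbo.ImportQueue SET ScrubState = 1 WHERE BatchId = @BatchId
        -- EXEC dbo.cspScrubBatch @BatchId   -- hypothetical transformation proc
    END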

  • Cathan (Scorpion_66), I started down the trail of using your exec examples, but it didn't seem to work for me to build the SQL statement and store it in a variable from within the SP. Or more specifically, this command:

    SET @Foo = 'BULK INSERT CUSP001.dbo.tmpALDB_BI2 FROM ''' + @ImportFileSpec
        + ''' WITH (FORMATFILE = @FormatFileSpec)'
    EXEC (@foo)

    didn't work. I always got an error message about using the variable. It seems that Phill's example is the only way I can get it to run from right within the SP. I'd really like to understand how yours works, because it makes more sense to me (coming from a programming background) to build the string first, then execute it. You use SELECT in ways that leave me scratching my head; perhaps that's what I'm missing in my SQL? ...SELECT @Foo before the EXEC?

    Thanks again for all your help!

    Greg.
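
    (A note for anyone hitting the same wall: EXEC() runs its string in a separate batch, so a variable named inside the string, like @FormatFileSpec above, is undeclared when the string runs. Building the string first does work if both values are concatenated in; a sketch:)

    DECLARE @Foo VARCHAR(1000)
    SET @Foo = 'BULK INSERT CUSP001.dbo.tmpALDB_BI2 FROM ''' + @ImportFileSpec
             + ''' WITH (FORMATFILE = ''' + @FormatFileSpec + ''')'
    PRINT @Foo   -- inspect the finished statement while debugging
    EXEC (@Foo)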
