DTS question

  • I need help to design the DTS (SQL 2000)as in such a way that it will weed out unwanted data including duplicate data. I am not SQL developer, so i am stuck how i can achieve this. Do i have to write any scripts or SP not sure.Any help will be appreciated & thanking in advance.

  • Can you give me an example?

  • If you want to cleanse the data then the data needs to be loaded as is in to staging table(s) and cleansing happens on the staging table(s). Once the data is cleansed(Deduped...) then move it to main table.

  • So between two pump tasks, you'll have an "execute sql" task that will contain the dedupe clean-up script. But we'd need examples if you want more help.

  • First of all thanks every body your feedback & i appreciate that. i am going to get pipe delimiter txt file with most of dupllicate rows with same customer name. Duplicate rows are in numbers from 5-10 for each customers. but my requirement is to weed out/eliminate those & take only required rows for that Customer & then inserted in Customer Table. my second step will be to check if customer exists with the same NAME & Data then exclude that row/customer then move on to another............

    I understand this may be confusing, but this what info i have so far. any help will be appreciated.

     

  • Actually I would write a procedure instead of DTS package.

    Step 1 - truncate table Staging Table

    Step 2 - EXEC MASTER..xp_cmdshell 'bcp stagingtable IN text file...'

    Step 3 - check if staging table is empty - error

    Step 4 - Delete record in staging table if record already exists in real table or check any requirement you have.

    Step 5 - insert into real table from staging table.

  • thanks for your reply. but i need to get data out first from TXT file before i try to put into Staging table.

  • "First of all thanks every body your feedback & i appreciate that. i am going to get pipe delimiter txt file with most of dupllicate rows with same customer name. Duplicate rows are in numbers from 5-10 for each customers. but my requirement is to weed out/eliminate those & take only required rows for that Customer & then inserted in Customer Table. my second step will be to check if customer exists with the same NAME & Data then exclude that row/customer then move on to another............

    I understand this may be confusing, but this what info i have so far. any help will be appreciated"

    Yes, this is very confusing. First of all you say that customer name will be duplicated but the question is if anithing else in the row is duplicated or not. If not how can you tell which data should be improted (from which row) and wich shouldn't?

    If the whole row is duplicated then is simple: import the whole txt file into a table and then just use SELECT DISTINCT to filter out the duplicates. Of course you might need more intermediate tables but that's the ideea.

    Don't just give the hungry man a fish, teach him how to catch it as well.

    the sqlist

  •  need to get data out first from TXT file before i try to put into Staging table.

    I don't understand what you meant by get the data out before putting them into staging table.   If you did not plan to put them in the staging table, where were you going to put the data ? The bcp command is used to extact the data out from text file and put it into the database.

     

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply