Export same select into different files

  • Example:

    declare @test-2 table

    (

    col2 varchar(15),col3 numeric(12,2)

    )

    insert into @test-2

    select 'File1', 15 union all

    select 'File1', 16 union all

    select 'File1', 17 union all

    select 'File2', 18 union all

    select 'File2', 19 union all

    select 'File2', 10 union all

    select 'File3', 11 union all

    select 'File3', 12 union all

    select 'File3', 13 union all

    select 'File3', 14 union all

    select 'File4', 17

    select * from @test-2

    given this example (in reality it will end up being around 30k files), is there a simple (not ridiculously complicated) way to direct this single select into several files?

    My first thought was cascading foreach loops, but that would take a crap load of time running, and as I'm new to the SSIS world, I'm kinda clueless on this.

    Any thought appreciated

    --
    Thiago Dantas
    @DantHimself

  • Strange requirement!

    Can you do it just once and then do a file copy to the others - rather than running the data transfer over and over again?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • If you're doing this in the Data Flow (which you should), then you can use the Multicast component to make "copies" of the data flow - one in, two (or more) out - and send those to independent destinations.

    Todd McDermid - SQL Server MVP, MCTS (SQL 08 BI), MCSD.Net
    My Blog - Dimension Merge SCD Component for SSIS - SSIS Community Tasks and Components

  • tmcdermid (2/24/2011)


    If you're doing this in the Data Flow (which you should), then you can use the Multicast component to make "copies" of the data flow - one in, two (or more) out - and send those to independent destinations.

    sounds good, except I will need *many* destinations!

    the way I see this, and please correct me if I'm wrong, is hardcoding the destinations, which is not possible.

    @Phil

    you mean, extracting the dataset to a single file and iterate it to generate other files? Not a bad idea really.

    Any more ideas? Anyone?

    --
    Thiago Dantas
    @DantHimself

  • Pretty much what I meant.

    Where are you getting the destinations from?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • I apologize, I thought it was clear. The destinations come from the query.

    take a look at the example

    what i need (with this example) is

    filename: file1.csv

    contents:

    File1 15.00

    File1 16.00

    File1 17.00

    filename: file2.csv

    contents:

    File2 18.00

    File2 19.00

    File2 10.00

    filename: file3.csv

    contents:

    File3 11.00

    File3 12.00

    File3 13.00

    File3 14.00

    filename: file4.csv

    contents:

    File4 17.00

    --
    Thiago Dantas
    @DantHimself

  • Ah, in that case you'll need a conditional split to split out the data flow to the right destinations.

    However, if you have a lot of destinations, it can be a PITA to configure it all.

    So maybe you can create some script task in .NET to handle it dynamically?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Hmmm, wasn't clear to me! But now it is and my suggestion was not valid.

    My first idea would be to use an Execute SQL task to populate an object variable with the results of

    Select distinct filename

    from ...

    Now you have something to work with.

    Next task is a foreach loop which iterates around ('shreds') the object variable.

    Inside the foreach loop is a dataflow task which creates the files one at a time.

    Your source query within the dataflow will be an expression

    select * from ... where filename = object variable (can't remember the syntax, sorry)

    and the destination path will also be an expression using the object variable.

    This is quite advanced for an SSIS newbie 🙂

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Phil Parkin (2/24/2011)


    Hmmm, wasn't clear to me! But now it is and my suggestion was not valid.

    My first idea would be to use an Execute SQL task to populate an object variable with the results of

    Select distinct filename

    from ...

    Now you have something to work with.

    Next task is a foreach loop which iterates around ('shreds') the object variable.

    Inside the foreach loop is a dataflow task which creates the files one at a time.

    Your source query within the dataflow will be an expression

    select * from ... where filename = object variable (can't remember the syntax, sorry)

    and the destination path will also be an expression using the object variable.

    This is quite advanced for an SSIS newbie 🙂

    that was kinda how I thought of it at first, but isn't a foreach like this pretty slow?

    Speed is not really required here, but I don't want it running for hours either

    --
    Thiago Dantas
    @DantHimself

  • One way or another, you've got to create all of those files. Given that you need the destination to be dynamic, I can't think of a easy way to make it run in parallel - I guess you could break the files into ranges and have multiple foreach loops - one for each range - to introduce some parallelism. Not nice though.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • the way im doing it right now is a foreach loop only to loop through the 5 databases i need to query.

    the query also returns parts of the destination folder, to ease up my work

    ADO.NET source with the source query and a Script Component as destination

    inside the script component, on ProcessInputRow method i define a new IO.StreamWriter with the destination assembled, build the row with a Text.StringBuilder and call writeline on the streamwriter, and close it.

    anything else i can do to speed this up? (im even newer to this kind of programming, and the company devs in Java :-P)

    --
    Thiago Dantas
    @DantHimself

  • Just got error 0xC0047020(unable to process the data) on the ado.net source then a following 0xC02090F5(something with PrimeOutput method in the message) error

    I have no idea what this means and google hasn't been really useful =|

    Anyone knows what these are?

    --
    Thiago Dantas
    @DantHimself

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply