How do I dynamically select a source folder and file in the Foreach Loop Enumerator

  • Please help...

    I'm building a package to import multiple logs(flat files) from a dynamic location. The location for the source folder will be stored in a database table. I've hit a stopping point on the Enumerator Configuration of the Foreach Loop container. There are two problems:

    1. The location of the source folder for the files will be dynamic and stored in a database table. I need to query the database and use the location to the source folder in the connection.

    2. A program will be writing to one of the files at the time the SSIS package is ran. I need to select all files except the current file, which may still have data added to it. The files have a timestamp as part of the name. So if I could add an expression to exclude a file with a name that is within 15 minutes of the current time, then that would solve the problem...hope this makes sense.

    So far I have an Execute SQL Task in the Control Flow to get the source folder path. Using the Foreach File Enumerator it looks like I have to hard code a folder path if I want the Foreach Loop container to enumerate on a flat file. Also, in the Enumerator Configuration, the selection criteria for the file does not seem to be dynamic. Ie. I have to select files with a predefined criteria such as *.log.

    Has anyone done anything like this that can offer some pointers? Maybe I'm using the wrong enumerator??

    PS. I found this video that from Brian Knight that shows the basics of what I am doing. All I need are the other two pieces. http://www.sqlshare.com/looping-through-and-loading-files-with-ssis_34.aspx?searchid=61837

    Thanks in advance..

  • On the Collection tab of the Foreach Loop Editor and on the right side, not the left, expand Expressions (Expressions occurs twice on the screen). Look for CollectionEnumerator. Set this equal to the variable that contains the source folder path.

    The first step inside the container could read the name of the file and check the timestamp. It could either accept or reject the file.

  • Hi kbatta,

    Thank you very much for the response. here are my results:

    On the Collection tab of the Foreach Loop Editor and on the right side, not the left, expand Expressions (Expressions occurs twice on the screen). Look for CollectionEnumerator. Set this equal to the variable that contains the source folder path.

    ...This worked beautifully. Originally I wasn't able to get into the expressions because of an application error, but after installing SP3, I was able to implement what you recommended.

    The first step inside the container could read the name of the file and check the timestamp. It could either accept or reject the file.

    ...I'm not sure how to do this part. Should I use the Execute SQL Task to do this or is there a Control Flow Item for this specific purpose.

    Thanks again!

  • One approach is to use a script task. This is the most straightforward approach but requires knowing the script task language.

    Another possibility is to use a data flow task. The trick is to get it to run just once since you do not want to read the selected file yet. Perhaps the source could be a select top 1 * from some table. You could then use derive column tasks to parse the file name and extract the timestamp.

    With either approach you will need to get a value into a variable indicating whether to accept or reject the file.

  • I ended up using the script task for this:

    Public Sub Main()

    Dim DateModified As Date = DateTime.Now.AddMinutes(-15)

    Dim SourceFile As String = Dts.Variables("SourceFile").Value.ToString()

    If File.GetLastWriteTime(SourceFile) <= DateModified Then

    Dts.Variables("MoveFlag").Value = True

    End If

    Dts.TaskResult = ScriptResults.Success

    End Sub

    ...then modified the precedence constraint from the script task to include the expression @MoveFlag = true

    Sorry it took so long to post the result but there was a lot more going into this than what was posted here and that had to be worked out.

    Thanks!

  • Hi ,

    I have a similar problem.

    I have a set of files with dates and time, in the form of YYYYMMDD_HHMM.

    I have to pick the latest file and load it.

    I have used a for-each loop Container, but how do I compare the 2 consecutive Files ?

    like, Do I have to store the first file and then compare and then load.

    There might be more than 10 files in the folder.

    Thanks.

  • Set up a variable to hold the latest file so far. In your script task inside the for each loop container, compare the current file to the file in the "latest file variable". If the new file is later, update the "latest file variable".

    When the loop is finished, you will have the file name you need.

  • Thanks Kbatta. I will try it out.

  • Hi kbatta,

    I implemented your idea and I am happy that it works.

    I am stuck at another point now.

    Is there any way that I can find out that the for each loop is running for the last time so that I can run the dataflow task with the latest file so that the latest data is there in the Database.

    Thanks in advance

  • Hi,

    I did the same with the help of 2 script tasks, one inside the foreach loop task and the other one, outside.

    Thanks

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply