Altering time values in text file

  • Hi all,

    Firstly, this is a home project, so no rush on this, more of an interesting way for me to learn SSIS while having fun simultaneously.

    I have a txt file used for a film subtitles. The titles and speech are about 1 second ouf of sync, ie the titles appear about 1 second after the character speaks the equivalent dialogue. Here's an example of the data, explanations in brackets:

    793 (number of the subtitle)

    00:49:08,017 --> 00:49:10,130 (Start/Finish time of subtitle appearance)

    Can you go over to the Ministry now? (Screen text)

    794 (number of the subtitle)

    00:49:10,130 --> 00:49:12,243 (Start/Finish time of subtitle appearance)

    Now? What about the game? (Screen text)

    Now, the film buffs amongst you may be thinking, "Ahhh, but many media file players allow you to advance or delay the subtitles as you wish!". True, as I found out myself in VLC after starting this post, but from a learning point of view I'd still like to know how to pull this off in BIDS/VBScript etc.

    So, I'd need a way to have the package recognise the strings immediately surrounding the --> as times, and then to decrement them by 1 second, so 00:49:10,130 --> 00:49:12,243 becomes

    00:49:09:130 --> 00:49:11,243

    Thanks in advance all,

    J

  • Not sure SSIS is the right tool for this, but here goes:

    * Read the file with a flat file source. Set it up as one column only.

    * Add a derived column. If the substring "-->" is found, take the substrings before and after the arrow.

    * Convert these substring to datetime and then use the DATEADD function to substract one second.

    * Convert the datetimes back to strings.

    * Write the changed data to a flat file.

    Note: this will be one hell of an expression in the derived column 😀

    Maybe it is easier to do in a script task with a real programming language.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thanks mate, that looks like a good starting plan. Would T-SQL be a better language for the conversion?

  • T-SQL is much easier to handle conversions than the derived column component, but, depending on the specific situation, the derived column can process rows faster in a data flow (And I say it again: it depends).

    If your dataset isn't that large, you can read the dataset, write it to a staging table and process it with T-SQL.

    One last word of advice: try to avoid loops/cursors in T-SQL.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • It's just a diddy size, 1200 rows, nothing that'll put SQL in danger of breaking a sweat.

    😀

    I'll look into the T-SQL of it but this is already a good education in deciding to use what tool for which job.

    Thanks for your help!!

  • No problem, glad to help 🙂

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply