Trouble with Excel Data Source

  • I am working on a project that based on the date I need to find a specified Excel file and import it into SQL. I think I can or have most of it figured out for dynamically creating the connection. However, I am running into a problem with being able to choose a tab from the file in the Excel Data Source. I have changed the connetion to just point to a file for now and that won't work. No matter what version of Excel I chose I get an error stating: "External table is not in the expected format." The Excel file is very large with multiple tabs but I don't think that should make a difference. I can change the Connection Maanger to other files and everything seems to work just fine.

    Anyone run into this themselves or have some insight i may try? Very frustrating and it just doesn't seem to make sense but maybe I'm too close at it to see the forest through the trees.

  • Dynamically importing Excel files (or flat files) will only work if the structure and worksheet names remain the same for all the different Excel files.

    When you create an Excel Source, you initiate a metadata contract between SSIS and the Excel file. If you just change the connection string to another Excel file with different metadata, you are violating that contract.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • The file structures are all the same. It also doesn't seem to matter on these particular series of Excel files because I tried it by linking directly to the file and not dynamically and still no luck. I can, however point to a different Excel file and it will work.

  • Discovered something interesting this afternoon. The files are saved as 97-2003 Excel files and for these particluar files it doesn't matter what I do, I get the errror indicated in the original post. If I save the files in 2007 format then there aren't any problems. The only issue is I don't know if I will be able to get the owner of the files to change to this format.

    When I saved it to the 2007 format it indicated that a VB Project cannot be saved in macro-free workbooks. I saved it as macro-free. I then went back into the 97-2003 format and there were a couple of old macros so I removed those and saved in that format and it still didn't work. Not sure what else to try in order to get the original files to work.

  • How did you configure the Excel connection manager? Are you sure you selected 97-2003 file format there?

    The provider for .xls is the JET OLE DB provider, while the provider for .xlsx is the ACE OLE DB provider, so you must make sure you are using the correct one.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Yes, I did select the correct one. In fact, when it didn't work on the one it was supposed to I tried all of the Excel versions but got the same results with each.

  • How did you configure the dynamic connection?

    How did you configure the Excel Source component in the dataflow?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I created an Excel connection and then set the ExcelFilePath Expression to be what I wanted for the file.

    For the Excel source I then chose the connection but when you try and pick the Name of the Excel sheet that is when I get the error.

    I am able to do all of this with no problem for 2007 Excel files but unfortunately I have some that are earlier versions and it just can't get past the error. I can choose other Excel files of that version and they work fine.

    I even tried saving the 2007 files as 97-2003 files just to test them but for those particular files when I did that I then got the error.

  • Can you post the expression for the ExcelFilePath?

    If possible, also the entire connectionstring.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Here is the expression:

    @[User::DataLoadDirBegin] + (DT_STR,4,1252)YEAR( DATEADD( "dd", -1, GETDATE())) + @[User::DataLoadDirEnd] + RIGHT("0" + (DT_STR,4,1252)MONTH( DATEADD( "dd", -1, getdate())),2) + "-" + SUBSTRING((DT_STR,4,1252)YEAR( DATEADD( "dd", -1, getdate() )), 3, 2) + " Win Report Worth.xlsx"

    The connection string is:

    Provider=Microsoft.ACE.OLEDB.12.0;Data Source=F:\Financials\Worth's WIN Reports\2011 Win Reports\03-11 Win Report Worth.xlsx;Extended Properties="EXCEL 12.0;HDR=YES";

    This is for the one that works, so the 2007 version.

    Another question that just surfaced is it appears I have a warning on the source because someone has the file open and so the package fails. Is there a way around that?

  • It's not completely clear what you're trying to do. Your original post said you wanted to "choose a tab from the file in the Excel Data Source". That can be tricky but is possible with Excel 97 files.

    As Koen has said, for the "Excel Source", the worksheet name and column headers must remain the same regardless of the file name. However, there is a way to make the worksheet/tab name dynamic. You may create an OLE DB Source and specify that it use the Microsoft.Jet.OLEDB.4.0 provider. You'll need to enter the "Extended Properties" of "Excel 9.0;HDR=YES" at the top of the "All" page of the connection editor. Then, because the generic OLE DB source editor is more flexible than that for the Excel Source, you may specify your filename expression as the ServerName in the Expressions for the connection manager and the worksheet name as a "Table or View...." variable in the data source editor. Remember to put a dollar sign on the end of the worksheet name in that variable.

    Other comments on what's been posted here so far:

    The expression you posted is OK and does evaluate to the full filename (I inferred the values of DataLoadDirBegin and DataLoadDirEnd and tried it in SSIS myself).

    You posted your connection string for Excel 2007, but as you're trying to get this to work with an Excel 97 file, let's see your connection string for that version. It should look pretty much like this:

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=F:\Financials\Worth's WIN Reports\2011 Win Reports\03-11 Win Report Worth.xls;Extended Properties="Excel 8.0;HDR=YES";

  • Is it possible for you to save the Excel files as TAB separated textfiles instead?

    I generally try to avoid importing native Excel files because of the problems you

    others report.

    I put a build in macro in the Excelfile which will save a textfile copy when the

    workbook is closed. Then I use bulkinsert to import the textfile.

    Gosta M

  • john.arnott (3/10/2011)


    However, there is a way to make the worksheet/tab name dynamic. You may create an OLE DB Source and specify that it use the Microsoft.Jet.OLEDB.4.0 provider. You'll need to enter the "Extended Properties" of "Excel 9.0;HDR=YES" at the top of the "All" page of the connection editor. Then, because the generic OLE DB source editor is more flexible than that for the Excel Source, you may specify your filename expression as the ServerName in the Expressions for the connection manager and the worksheet name as a "Table or View...." variable in the data source editor. Remember to put a dollar sign on the end of the worksheet name in that variable.

    Other comments on what's been posted here so far:

    The expression you posted is OK and does evaluate to the full filename (I inferred the values of DataLoadDirBegin and DataLoadDirEnd and tried it in SSIS myself).

    You posted your connection string for Excel 2007, but as you're trying to get this to work with an Excel 97 file, let's see your connection string for that version. It should look pretty much like this:

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=F:\Financials\Worth's WIN Reports\2011 Win Reports\03-11 Win Report Worth.xls;Extended Properties="Excel 8.0;HDR=YES";

    It is perfectly possible to do all the things you described (an OLE DB Query to the Excel file) in an Excel Source. When you're in the editor, just click Build Query and you can start writing your OLE DB Query. I do it all the time.

    One thing I'd like to add is that it is adviced to put quotes around the filename, as it contains spaces. So the connectionstring becomes:

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source="F:\Financials\Worth's WIN Reports\2011 Win Reports\03-11 Win Report Worth.xls";Extended Properties="Excel 8.0;HDR=YES";

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen,

    Thank you. I must have been brain-cramping to not realize that by using a SQL query, of course one may specify the "table" from which the data is to be selected. Oh, well. At least I had a little fun exploring the OLE DB connection editor.

  • john.arnott (3/11/2011)


    Koen,

    Thank you. I must have been brain-cramping to not realize that by using a SQL query, of course one may specify the "table" from which the data is to be selected. Oh, well. At least I had a little fun exploring the OLE DB connection editor.

    I, on the other hand, just very recently discovered you could load Excel files with an OLE DB Source.

    The world is upside down 🙂

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply