Having problems bulk inserting into a view

  • I am having a problem BULK INSERTing data into a view. The problem occurs infrequently, seemingly at random. To date I have not been able to consistently cause this problem to recur, it just pops up every now and then.

    SQL Server 2005 Developer edition, sp2.

    A rough summary of the project: load 20 files into 20 tables. Each column in each table is represented in each file. The order of the columns in the files can be different from load to load -- but all columns will be present. The files are regular ASCII text & format, data is pipe-delimited, the first row is header data listing the columns. A sample six-column file:

    [font="Courier New"]G|P|E|N|V|D

    1|0|abc|10|5|3

    2|0|def|10|5|4

    3|0|ghi|20|2|6

    4|0|jkl|20|17|3[/font]

    As mentioned, the columns could be in any order (G|P|N|V|D|B, P|E|N|V|G|D, and so on). The table is defined with columns {G, P, E, N, V, D}. No data transformation of any kind are required, we just need to do a series of simple T-SQL BULK INSERTs. Some columns are nullable, some not.

    To account for this “flexible file format”, during ETL we first create a view on the table with the columns ordered as per the file to be loaded. For example, assuming format N|P|G|D|E|V, the view would be

    [font="Courier New"]CREATE VIEW Loading.vMyTable (N, P, G, D, E, V)

    as select N, P, G, D, E, V

    from Loading.MyTable[/font]

    And the build insert is a straightforward

    [font="Courier New"]BULK INSERT Loading.vMyTable

    from "C:\ETL\MyTable.csv"

    with (firstrow = 2, fieldterminator = '|', rowterminator = '', tablock, maxerrors = 0)[/font]

    Most of the time this works. Every now and then, it doesn’t – the columns get scrambled somehow. This latest time, where the table is G, P, E, N, V, D, and the data is select N|P|G|D|E|V, the data ends up being loaded in order P, G, E, N, V, D (the first two columns are switched). This is only one sample, I’ve had several with more columns scrambled than just the first two exchanged. I as yet can see no pattern in how they get mixed up.

    All indexes are dropped from the tables and they are truncated before loading. No identity or default columns, no check or foreign key constraints. A possible kicker: the table is partitioned (SQL 2005 table partitioning). It has N partitions, the data being loaded will fit in precisely one, and the table is truncated before we load anything. The table is not in the dbo schema, but the loading utility has sysadmin rights (the top-level routine is a SQL Agent job).

    Sometimes it runs without a hitch. Sometimes I get one problem load. This last time I got at least three, there may have been more that I haven’t stumbled across yet. (Curiously, I never get errors resulting from trying to load a “mixed up” column into one of a different datatype – it *appears* to only scramble integer-type columns.) I’ve found nothing on the web to explain this problem, but then the above situation does not allow for a simple five-word Google search. Any ideas or suggestions would be greatly appreciated!

  • Turns out this was due to a table partitioning-related bug that crops up somewhere/how when a new partition is added to a table. As per PSS, this bug was fixed somewhere between sp2 and cumulative upgrade 8.

    I don't much trust cumulative upgrades, and found the following workaround to be effective:

    - Drop the indexes

    - Add the new partition

    - Recreate the indexes (clustered primary key, in this case) on the empty table

    - Drop the indexes again (the table is empty, so this takes an average of .0 seconds)

    - Proceed as normal.

    With this done, the problem went away.

    Philip

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply