Errors when inserting rows from a linked server

  • I have a script that inserts rows from a linked server. It basically looks like this:

    use mydb;
    go

    insert into my.table (col1, col2, ...)
    select col1, col2, ...
    from LinkedServer.db.my.table;

    The DDL for both tables is identical: 550 columns (de-normalized from a data warehouse), of which 548 are varchar(max); the remaining two are varchar(255). (Not my design, but unchangeable at the moment!)

    Running my query raises an error:

    Cannot create a row of size 9948 which is greater than the allowable maximum row size of 8060.

    The statement has been terminated.

    Now, inserting rows into the table on the linked server has never been an issue. (Since most columns are varchar(max), SQL Server has the option to store the data off row.) The data is there, intact and whole. However, when pulling it into my target server through the linked server (four-part naming), I get the error.

    Three pairs of eyes have confirmed that the table definitions are identical on both source and target.

    So... any ideas why I'm getting this error message?

    Gerald Britton, Pluralsight courses

  • Me thinks it has something to do with how SQL Server is trying to store data during the insert. From Books Online:

    Large Row Support

    Rows cannot span pages, however portions of the row may be moved off the row's page so that the row can actually be very large. The maximum amount of data and overhead that is contained in a single row on a page is 8,060 bytes (8 KB). However, this does not include the data stored in the Text/Image page type. This restriction is relaxed for tables that contain varchar, nvarchar, varbinary, or sql_variant columns. When the total row size of all fixed and variable columns in a table exceeds the 8,060 byte limitation, SQL Server dynamically moves one or more variable length columns to pages in the ROW_OVERFLOW_DATA allocation unit, starting with the column with the largest width. This is done whenever an insert or update operation increases the total size of the row beyond the 8060 byte limit. When a column is moved to a page in the ROW_OVERFLOW_DATA allocation unit, a 24-byte pointer on the original page in the IN_ROW_DATA allocation unit is maintained. If a subsequent operation reduces the row size, SQL Server dynamically moves the columns back to the original data page. For more information, see Row-Overflow Data Exceeding 8 KB.

    Link:

    https://technet.microsoft.com/en-us/library/ms190969(v=sql.105).aspx
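
    To see where the data in a given table actually lands, a query along these lines (a minimal sketch; 'my.table' is the placeholder name from the first post) reports page counts per storage type. Running it on both the source and the target would show whether the two servers really lay the rows out the same way:

    -- Sketch: page counts per allocation-unit type for one table.
    -- N'my.table' is a placeholder from the original post.
    SELECT  ps.partition_number,
            ps.in_row_data_page_count,        -- IN_ROW_DATA pages
            ps.row_overflow_used_page_count,  -- ROW_OVERFLOW_DATA pages
            ps.lob_used_page_count            -- LOB_DATA pages (the varchar(max) values)
    FROM    sys.dm_db_partition_stats AS ps
    WHERE   ps.object_id = OBJECT_ID(N'my.table');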

  • Yes, well the quote from BOL is of course correct. It states in essence that you *can* insert rows where the aggregate data length exceeds 8060 if some of that data can be stored off row, as in text, ntext, varchar(max), nvarchar(max), image etc. No surprises there!

    Note however that in the case I'm talking about, the data has already been successfully inserted into the source table. In fact, on the source server I can easily copy the data to another table with a simple insert into...select.

    My problem occurs when the insert into is run on a different (the target) server and the select portion references a linked server (the source).

    So, I think the problem has something to do with the mechanics of insertion when the source data is on a linked server, but I cannot see what the problem may be nor how to work around it.
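
    One avenue to explore, though (a sketch only; the names and column lists are the placeholders from my first post), is pushing the whole SELECT to the remote side with OPENQUERY, so the local server only consumes a finished result set:

    -- Sketch: let the source server run the query and return rows.
    -- LinkedServer, db, my.table and the column lists are placeholders.
    insert into my.table (col1, col2, ...)
    select col1, col2, ...
    from openquery(LinkedServer, 'select col1, col2, ... from db.my.table');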

    Gerald Britton, Pluralsight courses

  • g.britton (8/14/2015)


    Yes, well the quote from BOL is of course correct. It states in essence that you *can* insert rows where the aggregate data length exceeds 8060 if some of that data can be stored off row, as in text, ntext, varchar(max), nvarchar(max), image etc. No surprises there!

    Note however that in the case I'm talking about, the data has already been successfully inserted into the source table. In fact, on the source server I can easily copy the data to another table with a simple insert into...select.

    My problem occurs when the insert into is run on a different (the target) server and the select portion references a linked server (the source).

    So, I think the problem has something to do with the mechanics of insertion when the source data is on a linked server, but I cannot see what the problem may be nor how to work around it.

    Or it is trying to store the data in the destination table differently than it is stored in the source table. Take the 548 varchar(max) columns: at 24 bytes per pointer, SQL Server would need 13,152 bytes of in-row space just for the pointers if all 548 columns were stored off row.

    Obviously it isn't trying to do that as the row size reported in the error is 9948.

    With the information you have provided, I have to assume that SQL Server is trying to store the data differently in the destination system than the source system.

    The target table has the identical definition and the data is being stored in the same way. No manipulations or changes.

    Gerald Britton, Pluralsight courses

    You've inadvertently given me something to check, though. What if the column order differs between the source and target? I suppose that, in such a case, when storing data at the target, SQL Server might decide to store some data in row and some off, and the different order could cause an issue.

    Something to look into...

    Gerald Britton, Pluralsight courses

  • g.britton (8/14/2015)


    The target table has the identical definition and the data is being stored in the same way. No manipulations or changes.

    Not relevant to what I am talking about here. The order of the data stored between the source and destination tables may be different. The order in which the columns are defined has no real impact on how the data is stored in the row. The way the data is transferred over the network may be affecting how the data is being inserted into the table.

    Which columns' data is stored in row and which is stored on overflow pages may not be the same between your source and destination tables. Moving data between tables in the same database may not change how the data is stored, but moving it across a network may have an effect in this area.

  • On topic but slightly different subject, what is the purpose for having 548 MAX columns? Not a challenge, Gerald... just curious.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • On topic but slightly different subject, what is the purpose for having 548 MAX columns? Not a challenge, Gerald... just curious.

    wish I knew! I've been asked to migrate this table to a new server as is. It predates my arrival by a fair bit and I've been warned that the schema must be preserved.

    I'd be happy just to split it in half but even that is not an option.

    Gerald Britton, Pluralsight courses

  • Lynn Pettis (8/14/2015)


    g.britton (8/14/2015)


    The target table has the identical definition and the data is being stored in the same way. No manipulations or changes.

    Not relevant to what I am talking about here. The order of the data stored between the source and destination tables may be different. The order in which the columns are defined has no real impact on how the data is stored in the row. The way the data is transferred over the network may be affecting how the data is being inserted into the table.

    Which columns' data is stored in row and which is stored on overflow pages may not be the same between your source and destination tables. Moving data between tables in the same database may not change how the data is stored, but moving it across a network may have an effect in this area.

    Yeah, I just realized that possibility (my post just after the one you quoted). I'll be checking that out ASAP.

    Gerald Britton, Pluralsight courses

  • g.britton (8/15/2015)


    On topic but slightly different subject, what is the purpose for having 548 MAX columns? Not a challenge, Gerald... just curious.

    wish I knew! I've been asked to migrate this table to a new server as is. It predates my arrival by a fair bit and I've been warned that the schema must be preserved.

    I'd be happy just to split it in half but even that is not an option.

    It would be interesting to do a MAX(LEN()) on all of those columns to see what the truth is, because it sounds a lot like the 5 monkeys experiment. Someone like Gail Shaw would be able to explain it much better than I, but I'm not sure how SQL Server is able to handle things as they are, because even a 16-byte (fixed length) pointer for each of 548 columns exceeds 8,060 bytes.
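
    Something like this would do the MAX(LEN()) check without typing 550 column names (just a sketch; N'my.table' is a placeholder, and it assumes FOR XML PATH is available, i.e. SQL 2005 or later):

    -- Sketch: build one SELECT returning MAX(LEN()) for every column.
    DECLARE @sql nvarchar(max);

    SELECT @sql = N'SELECT' + STUFF((
            SELECT N', MAX(LEN(' + QUOTENAME(c.name) + N')) AS ' + QUOTENAME(c.name)
            FROM sys.columns AS c
            WHERE c.object_id = OBJECT_ID(N'my.table')
            ORDER BY c.column_id
            FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, N'')
        + N' FROM my.table;';

    EXEC sys.sp_executesql @sql;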

    As to moving the table, would a restore be the way for now? Seems like the linked server just isn't going to handle it. If that's not an option, perhaps a "native" BCP export/import would do the trick.

    --Jeff Moden



  • So, here's what I've found:

    1. All attributes of the columns of the source and target tables are identical, including the position in the DDL and the column_id from sys.columns (see the comparison sketch after this list).

    2. The maximum row length, max(len(col1) + len(col2) + ...), is 11,522. Across all columns, the largest max(len()) is 2,006, the median is 11, and the sum of the per-column maximums is 24,324 (the current worst case).

    2a. Note that the source table contains this data at present, so it was successfully inserted at some point.

    3. The INSERT INTO column list and the SELECT FROM (linked server) column list are in the same order as the DDL.

    4. For reasons I do not understand at all,

    INSERT INTO targetschema.targettable
    SELECT * FROM sourceserver.sourcedb.sourceschema.sourcetable;

    works, whereas the INSERT query specifying the columns by name does not. IIRC, SELECT * returns columns in the order in which they were defined, and ditto INSERT INTO with no column list.
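
    For completeness, the check in point 1 can be scripted; a comparison along these lines (a sketch; the server, database, schema, and table names are placeholders) returns a row for any column whose name, position, type, or length differs, so an empty result means the column sets match on those attributes:

    -- Sketch: target columns that have no exact match on the source.
    SELECT c.name, c.column_id, c.system_type_id, c.max_length
    FROM sys.columns AS c
    WHERE c.object_id = OBJECT_ID(N'targetschema.targettable')
    EXCEPT
    SELECT rc.name, rc.column_id, rc.system_type_id, rc.max_length
    FROM sourceserver.sourcedb.sys.columns AS rc
    JOIN sourceserver.sourcedb.sys.tables AS rt
        ON rt.object_id = rc.object_id
    JOIN sourceserver.sourcedb.sys.schemas AS rs
        ON rs.schema_id = rt.schema_id
    WHERE rt.name = N'sourcetable'
      AND rs.name = N'sourceschema';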

    Gerald Britton, Pluralsight courses
