It's definitely a SQL Server bug (table with 80 million rows)

  • You can see this yourself.

    Create a new database and set the recovery model to full.

    Add a single table with one INTEGER field.

    Insert a record into your table.

    Run: DBCC LOG('<your database name>', 3)

    You should see that each record in the transaction log contains information such as the LSN (which the backups rely on), the operation, the transaction ID, record length information, allocation information, page information, lock counts, and so on. All of this is in addition to the actual data being logged. So the log entry for a transaction includes not only the data but also a great deal of descriptive information.

    Now, if your table is tiny - like just an integer field - it is pretty easy for the descriptive information to take up more space than the data involved in the transaction. A minimal script to try this yourself is sketched below.
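
    A minimal sketch of the steps above, assuming a throwaway database called LogDemo (any name works); DBCC LOG is undocumented, so its output can vary between versions:

    create database LogDemo
    go
    alter database LogDemo set recovery full
    go
    use LogDemo
    go
    create table dbo.t (a int)   -- single INTEGER column
    go
    insert dbo.t values (1)      -- one tiny row: 4 bytes of actual data
    go
    DBCC LOG('LogDemo', 3)       -- each log record is far larger than the 4-byte payload
    go
    use master
    go
    drop database LogDemo
    go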

  • You are missing that the OP is inserting 10M rows in a single transaction. I do agree that if you did a series of single-row inserts, the overhead would be significant.

    create database test
    go
    use test
    go
    DBCC LOG('test', 3)   --71 rows here now, just for the empty database
    go
    create table test (a int)
    go
    DBCC LOG('test', 3)   --115 rows now, so 44 more for the create table action
    go
    insert test
    select id from syscolumns   --420 rows inserted
    go
    DBCC LOG('test', 3)
    --566 rows in the log now, 420 of which are the actual row inserts,
    --leaving only 31 rows of 'overhead' associated with the insert statement itself.
    --Thus, as the number of rows inserted goes up, the ratio of transaction 'overhead'
    --to rows inserted approaches zero ((31 + allocation logging) / 10M rows of actual inserts).
    go
    use master
    go
    drop database test
    go

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Yes, the number of rows is not the same - you get some additional log entries in the transaction log to go along with the inserts. But that is not the issue here.

    Pretend the transaction log is a table for a second. Your table has a single field in it (an integer). The transaction log "table" has a variable length field in it to hold your data plus 10 other integer fields to describe what is happening (it's not 10 and they are not integers, but that makes the math easy). Your table requires 4 bytes per row. The log table would require your 4 bytes per row plus the other 40 bytes.

    It is not a perfect ratio like this, but one of the things written to the log with each piece of data is a transaction identifier that ties your data together. In the case of a table with a single integer field, the transaction identifier alone doubles the amount of data to be logged. You can see the per-record sizes for yourself with the sketch below.
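
    A quick sketch of how you can see those per-record sizes, assuming SQL Server 2005 or later where the undocumented sys.fn_dblog function exists (column names are not guaranteed and may differ between versions); LogDemo and dbo.t are just the placeholder names from the sketch above:

    use LogDemo
    go
    -- compare the size of each insert's log record to the 4 bytes of actual row data
    select [Current LSN],
           Operation,
           [Transaction ID],
           [Log Record Length]   -- typically several times larger than the 4-byte row
    from   sys.fn_dblog(null, null)
    where  Operation = 'LOP_INSERT_ROWS'
    go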

  • Treading lightly in the company of so many gurus....

    Have you considered doing a bulk insert and/or checking what the fill factors are? If pages are set up without a lot of free space, you could have rippling effects from page splits, etc. Insert enough records to split a page... then insert more, which causes both of those to split again... and so on. (A hedged sketch of a bulk-style load is below.)

    Toni

    *** corrected grammar **
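
    Purely as an illustration of the bulk-insert suggestion: a sketch with a hypothetical file path and table name. Whether the load is actually minimally logged depends on the recovery model, the TABLOCK hint, the SQL Server version, and whether the target is a heap or has indexes:

    alter database LogDemo set recovery bulk_logged   -- full recovery fully logs every row regardless
    go
    create table dbo.big_target (a int)               -- heap target, like the OP's table
    go
    -- option 1: BULK INSERT from a flat file (the path is a placeholder)
    bulk insert dbo.big_target
    from 'C:\temp\rows.txt'
    with (tablock)
    go
    -- option 2: INSERT ... SELECT with TABLOCK, which can be minimally logged
    -- into a heap under the bulk_logged or simple recovery model (version dependent)
    insert into dbo.big_target with (tablock)
    select object_id from sys.objects
    go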

  • never tread lightly....speak your mind.

  • Michael Earl (5/2/2008)


    never tread lightly....speak your mind.

    Definitely agree with this!!

    Didn't the OP state it was a HEAP table? In that case fill factor isn't applicable. It would be interesting to see how much extra logging a clustered index with a fill factor of 1 would take due to page/extent splits and allocations. It certainly would create a bloated database tho! 🙂 A rough way to measure it is sketched after the signature below.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
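
    A rough sketch of how one might measure that, again using the undocumented sys.fn_dblog function and hypothetical names; note that a fill factor only takes effect when the index is built or rebuilt, so ordinary inserts still fill pages completely until a rebuild spreads the rows out:

    use LogDemo
    go
    create table dbo.ci_test (a int not null)
    go
    create clustered index cix_a on dbo.ci_test (a) with (fillfactor = 1)
    go
    declare @before int = (select count(*) from sys.fn_dblog(null, null));

    insert dbo.ci_test
    select object_id from sys.objects;   -- stand-in for the OP's much larger insert

    select (select count(*) from sys.fn_dblog(null, null)) - @before
           as log_records_for_insert;    -- compare against the same insert into a heap
    go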

  • While facing a similar problem earlier, I had given up on it because I did not have much time to spend on it. Now I think I have got the answer to this problem.

    Thanks to all the gurus here.
