Mark as Deleted. IsDeleted Column. Index Question

  • Hi all,

    in our database we have a lot of tables where we do not "physically" delete the rows. We just mark them as deleted (bit column).

    In order so save some lines of code and improve readibility, we use views instead of selecting the data directly from the table.

    CREATE TABLE [Table1] (

     [Col_1] [int] IDENTITY (1, 1) NOT NULL ,

     [Col_2] [varchar](50),

     .

     .

     [Col_n] [varchar](50),

     [IsDeleted] [bit]

    )

    CREATE VIEW dbo.uv_Table1

    AS

     SELECT Col_1, Col_2,.. Col_n

     FROM dbo.Table1

     WHERE IsDeleted = 0

    How should the index for table1 should look like, so that the view runs faster.

    Is it possible to create an index with a bit column? (I do not know how)

    If the previous question is true, should I set the IsDeleted column at the begginnnig or at the end of Index (selectivity)?

    Thanks in advance,

    Nito:

     

  • Usually, it is not advisable to index a bit column, the reason being that you want to index a highly selective column so that when an index is used, most of the data is filtered out, which is obviously not the case with bit columns.

    However, there are cases where you could index a bit column and use it to your advantage.  More info about it here:

     

    http://sqlserver2000.databases.aspfaq.com/can-i-create-an-index-on-a-bit-column.html

  • Thanks a lot for the interesting article.

  • Without reading that article, I have a situation here where a bit column is very indexable since this flag is on for about 25 to 125 rows out of 30 000+.  So an index seek + bookmark lookup on that column can be way more beneficial and any other access plan.

  • Use bit for IsDeleted in some cases can cause problem.

    In most cases, besides the artificial key, a table has columns must be unique in business point of view, lets say col_2 and col_3. If more than 1 row with the same unique key values (col_2,col_3) are deleted, the unique constaint will fail.

    e.g.: one record has "c2" and "c3"  for unique key columns and deleted later;

    then someone else defines a record with "c2" and "c3"  for unique key columns and deleted again.

    A better solution is to use int for IsDeleted and take the identity column value (col_1) as the deleted flag.

     

  • Hi peterhe,

    I understand what you meant with the unique columns (col_2 and col_3).

    But what do you mean with:

    >>A better solution is to use int for IsDeleted and take the identity column value (col_1) as the deleted flag. << ?

    I cannot control the Identity control value.  How can I now if a row is deleted using the identity (int) column?

  • Update dbo.TableName SET IsDelete = IdentColumn WHERE IdentColumn in ([Deleted ids go here])

  • what i meant is that if a row is deleted, copy the value of col_1 (identity)  to IsDeleted column. To check a row is deleted or not, check IsDeleted>0.

    IsDeleted can be part of the unique key.

     

  • I get it now. Good idea!

    That should also solve the issue about unique contraints on col_2 and col3 as mentioned above.

    I can create a unique contraint on col_2 + col_3 and IsDeleted right?

     

  • I believe that using a "IsDeleted" column is not usually a good solution to the issue of keeping "deleted" records.  It clutters up the "active" table, and usually requires too many workarounds.

    A better recommendation is to have a separate table for deleted rows.  Use a delete trigger to write the rows to the deleted table.

    Since most queries will not want the deleted records, those queries will run better without having table scans to check the flag or having inefficient 2-value indexes.

    To make queries for all rows easier, create a view that does union all for both tables, and name the view similar to table name like "TableA_WithDeletes".  Application code will be better documented using a separate table, rather that an obscure expression buried in the where clause.

    Hope this helps



    Mark

  • Hi Mark, you are right, but that way I need to create a Mirror "Deleted" Table for all tables where I want to have this logic.

    That's why I prefered to write a view for all the tables with this logic. And in the code I would only use the view and not the table.

    e.g: 

    SELECT *

    FROM    uv_Table1 t1

    INNER JOIN uv_Table2 t2

    ON  t1.col_x = t2.col_x

    CREATE VIEW dbo.uv_Table1   --<-- Only the view has the "fuzzy" logic

    AS

    SELECT Col_1, Col_2,.. Col_n

    FROM dbo.Table1

    WHERE IsDeleted = 0

    I wanted to avoid creating new tables for the deleted records.

    Now I am looking for a way to index the tables so that my views run optimal.

     

  • Just to complete the delete part.

     

    I would just create an instead of delete trigger. 

    In the trigger do a simple :

    Insert into table_deleted (KeyCol, col1, col2, coln) Select KeyCol, col1, col2, coln FROM Deleted

    DELETE T FROM table T inner join Deleted D on T.keyCol = D.KeyCol

     

    I totally agree that the deleted rows are often better off deleted or someplace else .

  • I don't think you'll get much performance out of this.  Either way, the view will have to do a scan.  But that scan will always be faster if it doesn't require to check for the 2nd argument and more rows.

  • I usually have three scenarios to keep deleted rows in table with IsDeleted flag:

    1) Asny deletetion. The deleting table is heavily referenced by many tables. A syn deletion cannot be done in timely fashon. By keeping the flag, a job can clean them up later together with all the records in the referencing table.

    2) For tracing/logging. some sensitive information, we don't delete them at all.

    3) As recycle bin to recover in case required

    Because the table is, in most cases, referenced by other tables (FK relation), you cannot relocate the deleted records to other tables.

  • I wanted to avoid creating new tables for the deleted records.

    But you are trying to save yourself a little work now (not creating these tables) and causing all your queries to work harder.  Your view is required for all queries, where the view for combining active and deleted tables are only used when needed.  

    It is more efficient to move the deleted rows away from the active table, even if it causes you some extra work to set up.



    Mark

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply