Cascading Deletes on a login table: FK integrity or audit accuracy?

  • I have a non-technical design question - something more along the lines of "what's everybody out there doing".

    I am in the process of building a database for a website. This morning I sat down to write the WebLogin table, to store which account logged-in to the site when.

    The table will look something like this (not my actual code - lots of details missing, just pseudo-code so that you get the idea):

    WebLogin Table

    WebLoginID INT (PK)

    WebAccountID INT (FK)

    WebLoginDate DATETIME

    My question is in regards to the WebAccountID field.

    Obviously this field will have a Foreign Key constraint, pointing the child record to its parent in the WebAccount table.

    I was thinking about whether or not I wanted to invoke the ON DELETE CASCADE clause for the foreign key.

    If I do, and I delete a certain WebAccount record, all of the associated logs for that WebAccount will be deleted as well.

    The advantage of this is that there are no records in my WebLogin table that don't have a valid WebAccountID, but I'm also skewing the value of what the WebLogin table is supposed to be doing - dutifully recording all of the logins to the site. In the situation where a WebAccountID record is responsible for a lot of traffic to the site, and its record and associated logs are deleted, my site metrics are distorted.

    One of my alternatives is to invoke the ON DELETE SET NULL clause for the WebAccountID Foreign Key constraint.

    This has the advantage of preserving all of my data in the WebLogin table, but it leaves me with incomplete records.

    I'm interested to hear what others have done... choices made that you either regret or are happy with.

    Many thanks,

    -Simon

  • Simon D (10/20/2010)


    I have a non-technical design question - something more along the lines of "what's everybody out there doing".

    I am in the process of building a database for a website. This morning I sat down to write the WebLogin table, to store which account logged-in to the site when.

    The table will look something like this (not my actual code - lots of details missing, just pseudo-code so that you get the idea):

    WebLogin Table

    WebLoginID INT (PK)

    WebAccountID INT (FK)

    WebLoginDate DATETIME

    My question is in regards to the WebAccountID field.

    Obviously this field will have a Foreign Key constraint, pointing the child record to its parent in the WebAccount table.

    I was thinking about whether or not I wanted to invoke the ON DELETE CASCADE clause for the foreign key.

    If I do, and I delete a certain WebAccount record, all of the associated logs for that WebAccount will be deleted as well.

    The advantage of this is that there are no records in my WebLogin table that don't have a valid WebAccountID, but I'm also skewing the value of what the WebLogin table is supposed to be doing - dutifully recording all of the logins to the site. In the situation where a WebAccountID record is responsible for a lot of traffic to the site, and its record and associated logs are deleted, my site metrics are distorted.

    One of my alternatives is to invoke the ON DELETE SET NULL clause for the WebAccountID Foreign Key constraint.

    This has the advantage of preserving all of my data in the WebLogin table, but it leaves me with incomplete records.

    I'm interested to hear what others have done... choices made that you either regret or are happy with.

    Let me start with a question about WebLoginID and WebAccountID... are those different entities? like having 1-n WebLoginID's per WebAccountID?

    In regards to your specific question, it all depends on the utilization of WebLogin table - if you delete rows from here reports showing WebLogin activity will not be accurate.

    A workaround would be to "disable" instead of "delete" WebAccountID - that would preserve RI as well as accurate traffic reports coming from WebLogin.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Thanks for your reply, Paul.

    WebLoginID and WebAccountID... are those different entities? like having 1-n WebLoginID's per WebAccountID?

    To answer your first question: yes, the entities are as you described. One WebAccountID can have many (n) WebLoginIDs.

    In other words, a WebAccountID record is a user visiting the site, and a WebLoginID is a counter of each visit.

    it all depends on the utilization of WebLogin table - if you delete rows from here reports showing WebLogin activity will not be accurate.

    Your second sentence describes my dilemma perfectly: reporting accuracy vs. referential integrity. Who wins?!

    A workaround would be to "disable" instead of "delete" WebAccountID

    I'm very intrigued by this option. How would I disable the WebAccountID in the WebLogin table? Can you provide any details?

    Many thanks,

    -Simon

  • Simon D (10/20/2010)


    A workaround would be to "disable" instead of "delete" WebAccountID

    I'm very intrigued by this option. How would I disable the WebAccountID in the WebLogin table? Can you provide any details?

    Actually I would disable at WebAccount table level by adding a "status" column to WebAccount table.

    If WebAccount is active then status="A"

    When you want to get rid of a WebAccount row instead of deleting set status='"D" (Disabled). I would probably add a status_date column, when WebAccount gets created status_date will be date of creation, when WebAccount gets disable status_date will be "disabled" date.

    To deal with WebAccount.status you have two basic roads...

    1- Queries should take into consideration status column, or

    2- Expose WebAccount table as a view of WebAccount table where status='A', that way you have to do little or nothing to your existing code.

    Over time, you may want to purge accounts that have been disabled lets say ... for one year? then you can cascade-delete on traffic table.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Paul:

    That's a terrific idea.

    It adds a bit of complexity, but it's also a layer of insulation against accidental deletes with cascading effects.

    Thanks very much for your input - I'm very grateful.

    -Simon

  • Simon D (10/20/2010)


    That's a terrific idea.

    It adds a bit of complexity, but it's also a layer of insulation against accidental deletes with cascading effects.

    Thanks very much for your input - I'm very grateful.

    Glad you liked it Simon - just my point of view 🙂

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Paul:

    Exactly what I was looking for - expert points of view.

    -Simon

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply