Design Dilemma

  • I am currently mapping out a new project to create an online market research site. Thus far, most of the projects I've created have used MySQL or Access, due to the small number of transactions involved. My needs now call for far more data and interactivity. I will be using a Flash interface, ColdFusion as my communicating language, and SQL Server as my backend. I read the article on reference tables, and that will save me a lot of tables, but my question now is about log tables. I know I can use a trigger to record information in a log table once a change is made to a record (anyone have an example of this?), but can I have a generic log table that will track changes to any table? I would like to know who made the change, the time, the status, etc. The more information the better.

    The main point of the site is to keep tabs on business activity, so there will be millions of records pretty quickly. My current idea is to create activity tables based on the type of business, since the data collected will be different for each type of business, and just keep it all sorted by date. But that would require a lot of tables, which means a lot more ColdFusion queries, and more than a few columns. Is there a better way to design a data storage table? Is a table with fewer columns and more records faster to query than one with more columns and fewer records? I guess I'm just trying to figure out a way to have fewer tables to query, and to set up a log table and a data activity table with the same idea as the lookup article here (http://qa.sqlservercentral.com/columnists/dsumlin/lookupstrategy.asp).

    I'm just confused whether I should separate out each of my log and data tables, so that every table has its own log of changes to reference back to, to see what was going on in that period of time (hence market research). To clarify, an example of a data table would be something like revenue and cover information for full-service restaurants; the data collected for limited-service restaurants would be different. Some types of data would be the same, but many would be different. I think I could just use tables like the lookup strategy idea, because all the data is numeric. So I'm thinking I could add a column that flags the value as $, percentage, inches, acres, square feet, etc. I have a hard time explaining things, so if this doesn't make sense, sorry!
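    The "one narrow table plus a unit column" idea above could be sketched roughly like this. All table and column names here are illustrative assumptions for discussion, not anything from the thread:

    ```sql
    -- Hypothetical sketch of a single generic activity table following the
    -- lookup-table strategy the poster mentions.  All names are illustrative.
    create table [dbo].[TBL_UnitType] (
        [UnitTypeID] [int]          not null primary key   -- e.g. 1 = dollars
        , [UnitName] [varchar] (32) not null               -- '$', '%', 'SQFT', 'acres', ...
    )

    create table [dbo].[TBL_Activity] (
        [ActivityID]   [int] identity  not null primary key
        , [BusinessID] [int]           not null  -- FK to the business inventory table
        , [MeasureID]  [int]           not null  -- FK to a lookup of measure names (revenue, covers, ...)
        , [UnitTypeID] [int]           not null references [dbo].[TBL_UnitType]
        , [Value]      [decimal](18,4) not null  -- all the collected data is numeric
        , [PeriodDate] [datetime]      not null  -- keeps the data sorted by date
    )
    ```

    This trades many wide per-business-type tables for one narrow table with more rows: each row records which business, which measure, which unit, and when, so new business types need new lookup rows rather than new tables.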

    Any help or suggestions would be greatly appreciated!

    Thanks!

  • I worked in a situation where knowing in detail what was changed in a database was critical. I used a single log table to capture all inserts and updates to any field that needed auditing. The table structure was:

     

    create table [dbo].[TBL_Log] (
        [Action]      [char] (1)      not null
        , [FieldName] [varchar] (32)  not null
        , [LastMod]   [datetime]      not null
        , [NewValue]  [varchar] (255) null
        , [OldValue]  [varchar] (255) null
        , [RecID]     [int]           not null
        , [TableName] [varchar] (32)  not null
        , [UserID]    [int]           not null
    )

    I then had an insert / update trigger for each table containing the fields to be audited that added a record to this table as the data was changed. This worked very effectively as far as auditing goes. HTH

     

  • Wow... exactly what I had in mind. My only question on this: would it make sense to use a foreign key and show a relationship between the log table and the lookup table, so that the different types of information that changed are stored in the log table as integers where applicable, to save space? Example: I have an inventory of businesses, where some of them are subscribers, participants, or not really part of the inventory at all but exist nonetheless. As a research tool, I must provide historic information, like when businesses close for reconstruction or relocation, or even go out of business; I must keep tabs on that, along with the period of time in which the activity occurred. So, if I had a lookup table with statuses such as "out of business", "under construction", etc., instead of storing the whole phrase, which would be redundant and very space-consuming, is there a way in the trigger to first look into the lookup table to see if the values match the values being changed? This is all a web-based application, so a lot of the input will come from drop-down lists sending in numeric values. Can you give me an example of a trigger that you used to do this auditing? I am very new to triggers; I know what they do, but something like this seems easily confusing to me. Do you think it would be easier to maintain multiple log tables? On the application side that's what you really want to avoid, as it's more programming, but on the backend, how do I reference changes that occur when using that lookup table?
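    One common answer to the question above (my illustration, not from the thread; the status lookup table and its columns are hypothetical) is to let the trigger log only the integer code, and translate it back through the lookup table at reporting time rather than inside the trigger:

    ```sql
    -- Hypothetical: the trigger stores only the integer status code in
    -- [OldValue]/[NewValue]; a reporting query joins back to the lookup
    -- table to turn the codes into readable descriptions.
    select  l.[LastMod]
          , l.[UserID]
          , old_s.[Description] as [OldStatus]
          , new_s.[Description] as [NewStatus]
    from    [dbo].[TBL_Log] l
    join    [dbo].[TBL_Status] old_s on old_s.[StatusID] = cast(l.[OldValue] as int)
    join    [dbo].[TBL_Status] new_s on new_s.[StatusID] = cast(l.[NewValue] as int)
    where   l.[TableName] = 'UT_Business'
    and     l.[FieldName] = 'StatusID'
    ```

    Keeping the trigger lookup-free keeps it fast on every write; the join cost is paid only when someone actually reads the history.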

    Also, I didn't mention it before, but I noticed on a top database model website that a sample membership ERM showed the address should be separate from employee or customer, etc., but the country and state related both to each other and to the address. That confuses me, because I set up an address table that only contained the street location and ZIP. The ZIP would reference (relate to) the region; region to state; state to country; country to continent; etc. So why would I need anything more than a ZIP in an address table?
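    The hierarchy described above might look like this as tables (a sketch only; the names are illustrative, and it assumes every ZIP maps to exactly one city and state, which is not always true of real postal data):

    ```sql
    -- Hypothetical normalized address hierarchy: an address stores only
    -- street + ZIP, and everything above it hangs off the ZIP.
    create table [dbo].[TBL_State] (
        [StateID]     [int]          not null primary key
        , [StateName] [varchar] (64) not null
        , [CountryID] [int]          not null   -- FK to a country table
    )

    create table [dbo].[TBL_Zip] (
        [Zip]       [char] (5)     not null primary key
        , [City]    [varchar] (64) not null
        , [StateID] [int]          not null references [dbo].[TBL_State]
    )

    create table [dbo].[TBL_Address] (
        [AddressID] [int] identity   not null primary key
        , [Street]  [varchar] (128)  not null
        , [Zip]     [char] (5)       not null references [dbo].[TBL_Zip]
    )
    ```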

    Thanks so much for replying and any other help you can offer is greatly appreciated!!!

    Thanks, 

    David 

  • http://www.datamodel.org/

     

    That link has a sample of the membership database with the address issue I was talking about. Is there a way to plug into a program or interface API to keep ZIP codes up to date?

     

    Thanks again,

    David

  • I didn't include any foreign keys on this table. I did, of course, use foreign keys in the user tables, but for this log table I wanted exactly what was changed in each field being audited. Below is a sample of one table's trigger. As you can see, since this works at the field level, these triggers can get quite long. I developed a VB app that created the trigger scripts from the "Create Table" scripts used to build the database. A more elegant method might be to read the schema from SQL itself, but I didn't have the time.

    Hal

     

    Create Trigger tiu_Alert on [dbo].[UT_Alert] for Insert, Update AS
    Declare
        @DelCount  int,
        @FieldName varchar(32),
        @InsCount  int,
        @LastMod   datetime,
        @NewValue  varchar(255),
        @OldValue  varchar(255),
        @RecID     int,
        @TableName varchar(32),
        @UserID    int

    Select @DelCount = count(*) from Deleted
    Select @InsCount = count(*) from Inserted
    if (@DelCount = 0 and @InsCount = 0) -- No data updated
    Begin
        Return
    End

    Select @RecID = [AlertID] from Inserted
    Select @TableName = 'UT_Alert'
    Select @UserID = [UserID] from Inserted
    Select @LastMod = getdate()

    if (@DelCount = 0) -- Doing Insert
    Begin
        Update [UT_Alert]
        set [LastMod] = @LastMod
        where [AlertID] = @RecID
        Insert into [TBL_Log]
            (UserID, TableName, RecID, Action, LastMod)
        values (@UserID, @TableName, @RecID, 'I', @LastMod)
        Return
    End

    -- Else Doing Update
    Update [UT_Alert]
    set [LastMod] = @LastMod
    where [AlertID] = @RecID

    if Update ([Alert])
    Begin
        Select @NewValue = [Alert] from Inserted
        Select @OldValue = [Alert] from Deleted
        Select @FieldName = 'Alert'
        Insert into [TBL_Log]
            (UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod)
        values (@UserID, @TableName, @FieldName, @RecID, 'U', @OldValue, @NewValue, @LastMod)
    End

    if Update ([EndDate])
    Begin
        Select @NewValue = [EndDate] from Inserted
        Select @OldValue = [EndDate] from Deleted
        Select @FieldName = 'EndDate'
        Insert into [TBL_Log]
            (UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod)
        values (@UserID, @TableName, @FieldName, @RecID, 'U', @OldValue, @NewValue, @LastMod)
    End

    if Update ([Notes])
    Begin
        Select @NewValue = [Notes] from Inserted
        Select @OldValue = [Notes] from Deleted
        Select @FieldName = 'Notes'
        Insert into [TBL_Log]
            (UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod)
        values (@UserID, @TableName, @FieldName, @RecID, 'U', @OldValue, @NewValue, @LastMod)
    End

    if Update ([StartDate])
    Begin
        Select @NewValue = [StartDate] from Inserted
        Select @OldValue = [StartDate] from Deleted
        Select @FieldName = 'StartDate'
        Insert into [TBL_Log]
            (UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod)
        values (@UserID, @TableName, @FieldName, @RecID, 'U', @OldValue, @NewValue, @LastMod)
    End
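    One caveat worth flagging (my observation, not from the thread): the trigger above assumes single-row inserts and updates, because `Select @RecID = [AlertID] from Inserted` captures only one row. If multi-row statements are possible, a set-based variant of one field's audit step might look like this sketch:

    ```sql
    -- Hypothetical multi-row-safe version of the [Alert] audit step:
    -- join Inserted to Deleted on the key so every modified row is logged.
    if Update ([Alert])
    Begin
        Insert into [TBL_Log]
            (UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod)
        Select  i.[UserID], 'UT_Alert', 'Alert', i.[AlertID], 'U',
                d.[Alert], i.[Alert], getdate()
        from    Inserted i
        join    Deleted  d on d.[AlertID] = i.[AlertID]
        where   i.[Alert] <> d.[Alert]
            or (i.[Alert] is null and d.[Alert] is not null)
            or (i.[Alert] is not null and d.[Alert] is null)
    End
    ```

    The where clause also filters out rows where the audited field did not actually change, which `if Update()` alone cannot do (it only tells you the column appeared in the SET list).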

     

  • Thanks again for getting back to me so quickly. Because I'm new to triggers and SQL Server itself, it's kind of hard to follow exactly how your trigger sends the log information from the user table to the log table, but I think I get the idea. tiu = trigger, insert/update, correct? Also, if you had, say, 20 fields in a table, then for the update section of the trigger you would have to specify each of the fields, correct?

    I think this could work well for application purposes, but given the sheer number of changes that will be made, I fear I should probably have a few separate bundled log tables, such as inventory logs, user/member logs, category logs, etc. Ever tried that? It just seems that if I had 100k companies, each with 10 or so users making changes, that is going to be one hell of a large log, no? Anyway, thanks again for helping and being so prompt! Any feedback on that address dilemma I'm also having?

    thanks again,

    David

  • Actually, yes, this log can grow quite large depending on the volume. We archived the log file monthly by appending it to a table in our data warehouse (it generated about 50k records per month). There were no indexes on the production side, to speed up adding data to the log file.
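    That monthly archive step might look something like this (a sketch only; the warehouse table name and the cutoff logic are my assumptions, not details from the thread):

    ```sql
    -- Hypothetical monthly archive job: copy older log records into the
    -- warehouse history table, then clear them from the production log.
    Begin Transaction
        Insert into [Warehouse].[dbo].[TBL_Log_History]
            (UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod)
        Select  UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod
        from    [dbo].[TBL_Log]
        where   [LastMod] < dateadd(month, -1, getdate())

        Delete from [dbo].[TBL_Log]
        where   [LastMod] < dateadd(month, -1, getdate())
    Commit Transaction
    ```

    Wrapping the copy and the delete in one transaction ensures no log row is lost between the two statements.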

    Yes, "tiu" is for trigger, insert/update. There were also a few tables that allowed deletes, and those would have a "td" or a "tiud" trigger. And yes, the more fields, the longer the update section. That is why I worked out a programmatic way to create these triggers.

    The "from Deleted" and "from Inserted" tables are the "before" and "after" snapshots of the record before the change is committed to the database.

     

    As far as the ZIP code goes, I don't see any reason why you couldn't store only the ZIP code in the database, as long as you know the accuracy of the data. My question would be: if the user entering the ZIP code got it wrong, would you have a way to know that?

    HTH,

    Hal

  • How did you reference the appended tables when you needed historical changed data?

     

    As far as the ZIP codes go, I was thinking of backwards compatibility on the application side. In other words, I was going to do a data drill-down starting at country first, then region, then state, etc., until I got to ZIP, where there would be a set number of ZIP codes available. Still not sure, though. I have my tables set up that way, so that one relates to the next: country to region, region to state, state to state region, state region to county, and county to city. But in the membership example on that data model site, the author puts the country and state in the address table, yet still shows the relationship between country and state. I think that's where I'm getting confused.

    Thanks again for your help!!!

    David

  • Actually, it's simple. I just UNION the current log table and the history log table in any select that needs information from the complete history. BOL has good examples of UNION.
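    A minimal sketch of that union, using the log table structure from earlier in the thread (the history table name is an assumption):

    ```sql
    -- Query the complete audit history: the current production log plus
    -- the archived warehouse copy.  UNION ALL skips UNION's duplicate-
    -- elimination step, which is safe here because an archived row should
    -- never also exist in the production log.
    Select  UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod
    from    [dbo].[TBL_Log]
    union all
    Select  UserID, TableName, FieldName, RecID, Action, OldValue, NewValue, LastMod
    from    [Warehouse].[dbo].[TBL_Log_History]
    order by LastMod
    ```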

     

    Hal

  • The union makes perfect sense to me! The only thing I don't understand is how this data warehouse is any different from the database you're currently using. Did you design another trigger that pulls out all the data once 50k records are reached? I guess I'm just a little confused about the term "data warehouse" and whether it's separate from the current database or not. I need to brush up on that, because I will have many more tables like this that could possibly generate that many transactions. Are there any good models you know of that would visually represent databases and data warehouses?

    Thanks again,

    david

