Checking whether data has been read

  • This is a tough one to describe but I'll do my best.

    We have a vendor product that hits our intranet views and tables. One particular "job" retrieves the latest data in a table. The problem is that I need to be able to flag data that has already been retrieved. I want to flag those records to ensure there are no duplicates. The vendor program is performing the scheduled part of this; it's beyond our control. Unfortunately I can't point the product at a proc, it has to be a view or table.

    Is there any kind of "SELECT trigger" that I could use to run my code at such time as the table data is read? Failing that, is there any auditing data I could read to get that information?

    As it stands we'll likely have the product schedule and the view schedule line up as best we can (have the view retrieve records added/modified in the last x minutes, where x matches the product's job settings). There are going to be either dupes or records missed, or both, but not too many hopefully. I'd much rather make it an exact science 🙂

    Hopefully all this is clear. I welcome any thoughts, and I'll do my best to explain further if necessary.

  • I don't think there is an option such as "SELECT Trigger" but there are some system objects that stores the information about when the index/table was read/write/seek/scan etc.

    Here is an article that contains the script

    http://sqlblog.com/blogs/aaron_bertrand/archive/2008/05/06/when-was-my-database-table-last-accessed.aspx

    Sample code from the article:

    WITH agg AS

    (

    SELECT

    [object_id],

    last_user_seek,

    last_user_scan,

    last_user_lookup,

    last_user_update

    FROM

    sys.dm_db_index_usage_stats

    WHERE

    database_id = DB_ID()

    )

    SELECT

    [Schema] = OBJECT_SCHEMA_NAME([object_id]),

    [Table_Or_View] = OBJECT_NAME([object_id]),

    last_read = MAX(last_read),

    last_write = MAX(last_write)

    FROM

    (

    SELECT [object_id], last_user_seek, NULL FROM agg

    UNION ALL

    SELECT [object_id], last_user_scan, NULL FROM agg

    UNION ALL

    SELECT [object_id], last_user_lookup, NULL FROM agg

    UNION ALL

    SELECT [object_id], NULL, last_user_update FROM agg

    ) AS x ([object_id], last_read, last_write)

    GROUP BY

    OBJECT_SCHEMA_NAME([object_id]),

    OBJECT_NAME([object_id])

    ORDER BY 1,2;

    --Ramesh


  • I agree. First option would be to query the DMV's to see what has been run. However, if you wanted to know what was run yesterday or last week, the DMV's are unlikely to have that information since they only store what's currently in cache. You could also take a look at the default trace. It's a built-in trace that captures data in a rolling file. However, after a period of time, depending on the load on the system, your data would once again by unavailable. To be sure of having all the data over a period of time, you'd need to look into setting up your own trace (you can use Profiler to set it up, but run a server-side trace, not a Profiler GUI trace) and storing the information from it as long as you needed it.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Thanks for the replies!

    Actually the period in question is one of minutes: I just need to know what was read by the vendor product in the past several minutes. Once I do, I can populate my own historical tables (or flag the original records if that works better) to keep records from being processed again later. Both the vendor's job and my sql job will be running every couple minutes so I can safely tuck away anything much older.

    I'll check into the solutions mentioned.

  • Then you can probably get what you need by querying the DMV's.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Looks like the DMV will work just fine. I ended up making a "dummy table" to join my data onto since you can't differentiate users from the DMVs--there are other processes that read from the base data and I didn't want to copy it to a temporary table, so now I can just check the latest scan date on that dummy table (the only process joining that dummy table will be the view driving the periodic query in question).

    Short version, I now have a plan, and thanks a bunch!

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply