Data masking - protecting confidential data

  • I have a requirement to sanitize some of the data in several of my databases where I need to make the data unrecognizable while preserving the physical characteristics of the data - length of the data string, numeric vs. alpha, etc.

    So, in the case of a social security number, I would want the sanitized numbers to be 999-99-9999 or 123-45-6789. For letters, I would like to be able to replace all vowels with a single letter and all consonants with a single letter. So, the name John Q. Doe might become abaa a. abb.

    Any tricks here?

  • I guess u do masking on your front-ends. Databases must store actual values. You might want to consider encryption if you need ur certain column to look encrypted. SQL 2k5 has inbuilt feature for encryption. Refer BOL for details.



    Pradeep Singh

  • Unfortunately, it isn't an encryption problem. It has to do with privacy laws and the use of third party contractors and maybe out-of-country developers. So, these people need to be able to see the data, but it must be rendered meaningless.

  • Tom John (6/22/2009)


    Unfortunately, it isn't an encryption problem. It has to do with privacy laws and the use of third party contractors and maybe out-of-country developers. So, these people need to be able to see the data, but it must be rendered meaningless.

    Do the third party guys access data directly by querying the database or via a front-end? If they access via an application, You an do masking at application level.

    However, if they need to query the databases, i dont think you can mask a particular column. Encrypting the column containing SSN will render it meaningless unless authorized user queries the table.

    I doubt masking a data at a column level is not available(All, pls correct me if i'm wrong here).



    Pradeep Singh

  • I am talking about developers debugging through programs that access the database. So, they will run native SQL queries and also be able to see data retrieved by VB or C++ programs.

  • Tom John (6/22/2009)


    I am talking about developers debugging through programs that access the database. So, they will run native SQL queries and also be able to see data retrieved by VB or C++ programs.

    Well in my previous organization, we used to scramble the columns containing sensitive information to be used by developers after pulling the data from production db. Production database was highly secured.

    Any other ideas anyone?



    Pradeep Singh

  • What about creating a new user and only assigning it access to views you have written? Then in your view, you can say SELECT '123-45-6789' as SSN, 'First M. Lastname' as Name FROM ....

    /* Anything is possible but is it worth it? */

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply