[Net 2000 Ltd. Home][Data Masker Home][Data Masker Manual][Data Masker FAQ]

Scrambling Data in the DMTest Database

The Data Masker software contains a set of masking rules for the sample DMTest database supplied with the Data Masker software. You are encouraged to add your own masking rules to this masking set. The first section of this page discusses the existing rules (as supplied with the installation) and the second part provides some hints of additional rules which might be added. You can create the DMTest database in an SQL Server instance of your choice using the Data Masker SQL Server Configuration Tool.


Masking Rules in the Sample DMTest Masking Set

The Existing Rules

If you wish to explore each rule further, load the masking set into the Data Masker software and double click on the rule with the mouse to launch it in editing mode. You can then view the configuration of the rule in detail.

Rule 0001
This is a Rule Controller. Notice how the other rules are dependent on it. A Rule Controller defines the database on which the other masking rules will act. If you change the login information in the Rule Controller its rules will operate on the new database. It is up to you to ensure that the Rule Controller points at an appropriate database. All other types of masking rule must have a parent Rule Controller and every masking set must contain at least one Rule Controller.
Rules 0002-0003
A group of Substitution Rules on the FIRST_NAME column in the DM_CUSTOMER table. These rules form a pair: Rule 002 masks the FIRST_NAME field in all rows in the table using the Male First Names data set. Then rule 0003 goes back over the same data and replaces the female first names with values from the Female First Names data set. A WHERE Clause on rule 0003 restricts the update values to rows which match where customer_gender='F'. The structure of this two-stage mechanism is a particularly important concept to grasp. If it were implemented so that both rules had a WHERE Clause (one for 'M' and one for 'F') then any row with a CUSTOMER_GENDER of NULL would not get masked. This is called a WHERE Clause Skip and is discussed in the Data Scrambling Issues white paper. You are strongly advised to read this white paper - it covers a number of very subtle issues.
Rule 0004
A Substitution Rule on the DM_CUSTOMER table. This rule uses the Random Surnames data set to provide the substitution data.
Rules 0005-0007
The logic behind this group of rules is that although the supplier table is not used in production, it will be needed in the test database. Accordingly, the table is truncated with a Command Rule (rule 0005) to make sure it is really empty and then rows of data are built for it. The SUPPLIER_ID column is populated using an Insertion Rule (rule 0006) and a standard Substitution Rule (rule 0007) is used to populate the SUPPLIER_NAME column.
Rule 0008
A Substitution Rule on the DM_EMPLOYEE table. This rule uses the Formatted Alpha-Numeric text data set to build a new random employee number in the original format. You should view this rule to see how the formatting is constructed.
Rule 0009
The rows in the DM_EMPLOYEE table are stored in a denormalized format. You should have a look at the table contents to see this for yourself. For any given employee, there are multiple rows each containing copies of the EMP_ID and FIRST_NAME, LAST_NAME etc. If the EMP_ID changes, it must be synchonized so that every row associated with the same employee receives an identical value. This requirement is called Table-Internal Synchronization. Rule 0009 is a Table-Internal Synchronization rule which re-introduces the correlation after the EMP_ID receives a random value from rule 0008. It is a good idea to have a look at this rule to see how the synchronization and group-by conditions are configured.
Rule 0010
The EMP_ID column also exists in the DM_ASSIGNMENTS table (see the ER diagram). Each EMP_ID in DM_ASSIGNMENTS table must change in a consistent manner to match the changes in the DM_EMPLOYEE table. This is an example of Table-To-Table Synchronization and it is an extremely common requirement when masking databases. Rule 0010 is a Table-To-Table Synchronization rule which correlates the EMP_ID column in both tables. It is also a good idea to have a look at this rule to see how the synchronization and join conditions are configured.

Note: The dependency relationship illustrated in rule groups 0002-0003 and 0005-0007 is purely arbitrary and is only done to collect the rules together to show they are related. Rules are always executed in the order in that they are visible on the screen. For example, rules 0005-0007 could be placed one after another on the same level and would have the same effect.

Additional Rules

Below are some suggestions for additional rules which can be added to the Pubs database sample masking set. An ER diagram for DMTest is available to assist you in your analysis of the database structure. If you do add new masking rules it is a good idea to save the modified masking set under a new name - that way if you upgrade the Data Masker software you will not overwrite your enhancements.

DM_CUSTOMER:...
Examine the remaining columns in DM_CUSTOMER table and make decisions about which ones contain sensitive information and implement suitable Substitution Rules for them.
DM_CUSTOMER_NOTES:customer_firstname, customer_lastname
The DM_CUSTOMER_NOTES table has two columns which must be synchronized with the DM_CUSTOMER table. The CUSTOMER_ID is the join key. Implement an appropriate Synchronization Rule to make things properly correlate.
DM_EMPLOYEE:...
Analyze the remaining columns in DM_EMPLOYEE table and make decisions about which ones contain sensitive information and implement suitable Substitution Rules for them. Note the FULL_NAME column contains information built from other columns in the same row. This will need to be re-built after the other columns are masked, and is an example of Row-Internal Synchronization. See the Pubs Masking Rules (rule 0011) for an example of the implementation of Row-Internal Synchronization.

[Net 2000 Ltd. Home][Data Masker Home][Data Masker Manual][Data Masker FAQ]