The Data Masker software contains a set of masking rules for the Sample database supplied with DB2 UDB. You are encouraged to add your own masking rules to this masking set. The first section of this page discusses the existing rules (as supplied with the installation) and the second section suggests additional rules which might be added.
Masking Rules in the DB2 Sample Masking Set
The Existing Rules
If you wish to explore a rule further, load the masking set into the Data Masker software and double-click the rule to open it in editing mode. You can then view the configuration of the rule in detail.
Rule 0001
This is a Rule Controller. Notice how the other rules are dependent on it. A Rule Controller defines the database on which the other masking rules will act. If you change the login information in the Rule Controller, its dependent rules will operate on the new database. It is up to you to ensure that the Rule Controller points at an appropriate database. All other types of masking rule must have a parent Rule Controller, and every masking set must contain at least one Rule Controller.
Rules 0002-0003
A pair of Substitution Rules on the FIRSTNME column in the EMPLOYEE table. Rule 0002 masks the FIRSTNME column in all rows in the table using the Male First Names data set. Rule 0003 then goes back over the same data and replaces the first names of female employees with values from the Female First Names data set. A WHERE Clause on rule 0003 restricts the update to rows for female employees. You should have a close look at rule 0003 since the WHERE Clause used is not obvious.
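The two-pass pattern can be sketched in a few lines of Python. This is only an illustration of the technique, not the Data Masker implementation: the name lists are hypothetical stand-ins for the Male and Female First Names data sets, the rows are invented, and the `SEX = 'F'` predicate is an assumed stand-in for the actual WHERE Clause on rule 0003, which may be written differently.

```python
import random

# Hypothetical stand-ins for the Male/Female First Names data sets.
MALE_FIRST_NAMES = ["JAMES", "ROBERT", "DAVID", "PETER"]
FEMALE_FIRST_NAMES = ["MARY", "LINDA", "SUSAN", "KAREN"]


def substitute(rows, column, dataset, where=lambda row: True, rng=random):
    """Replace `column` with a random dataset value in every row matching `where`."""
    for row in rows:
        if where(row):
            row[column] = rng.choice(dataset)


# Invented rows shaped like the EMPLOYEE table (only the columns needed here).
employees = [
    {"EMPNO": "000010", "FIRSTNME": "CHRISTINE", "SEX": "F"},
    {"EMPNO": "000020", "FIRSTNME": "MICHAEL", "SEX": "M"},
    {"EMPNO": "000030", "FIRSTNME": "SALLY", "SEX": "F"},
]

rng = random.Random(42)

# Pass 1 (the role of rule 0002): every row receives a male first name.
substitute(employees, "FIRSTNME", MALE_FIRST_NAMES, rng=rng)

# Pass 2 (the role of rule 0003): only rows matching the assumed
# predicate SEX = 'F' are overwritten with a female first name.
substitute(
    employees,
    "FIRSTNME",
    FEMALE_FIRST_NAMES,
    where=lambda row: row["SEX"] == "F",
    rng=rng,
)
```

If the passes ran in the opposite order, the unrestricted first pass would wipe out the female names, which is why the execution order of the two rules matters.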
Rule 0004
A Substitution Rule on the EMPLOYEE table. This rule uses the Random Surnames data set to provide the substitution data which masks the LASTNAME column.
Rule 0005
A Substitution Rule on the EMPLOYEE table. This rule uses the Random Integers data set to generate random four digit numbers for the telephone extensions in the PHONENO column.
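As a rough sketch of what such a substitution produces, the Python below generates four-character numeric strings. The function name `random_extension`, the range 0 to 9999, and the zero padding are assumptions made for illustration; the actual range configured on the Random Integers data set in rule 0005 may differ.

```python
import random


def random_extension(rng):
    """Return a four-character, zero-padded numeric string (hypothetical helper)."""
    return f"{rng.randint(0, 9999):04d}"


rng = random.Random(7)
extensions = [random_extension(rng) for _ in range(5)]
```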
Rule 0006
A Substitution Rule on the EMPLOYEE table. This rule uses the Number Variance data set to modify the values in the SALARY column by a random percentage in the range of plus or minus 10%. This technique is useful because every salary will receive a changed value - but the effect on the average or the sum total of all salaries will be minimal.
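The reason the total barely moves is that the individual shifts are independent and centred on zero, so they tend to cancel out across many rows. A minimal sketch, assuming the percentage is drawn uniformly (the distribution the Number Variance data set actually uses is not stated here) and using invented salary figures:

```python
import random


def number_variance(value, percent, rng):
    """Shift `value` by a uniformly random percentage in [-percent, +percent]."""
    factor = 1 + rng.uniform(-percent, percent) / 100.0
    return round(value * factor, 2)


rng = random.Random(1)

# Invented salaries for illustration only.
salaries = [52750.00, 41250.00, 38250.00, 40175.00, 32250.00, 36170.00]
masked = [number_variance(s, 10, rng) for s in salaries]

# Relative change in the sum total; it can never exceed the per-row bound
# and is typically much smaller on large tables.
relative_shift = abs(sum(masked) - sum(salaries)) / sum(salaries)
```

Each masked value stays within 10% of its original, while `relative_shift` shrinks as the number of rows grows.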
Note: The Data Masker software is multi-threaded. It can, and will, execute multiple rules simultaneously. Since they both operate on the same table and column, rule 0003 must only execute after rule 0002 completes - otherwise the actions of rule 0002 could overwrite the actions of rule 0003. The dependency relationship used in rules 0002-0003 forces rule 0003 to execute after rule 0002 completes. It is also possible to use rule blocks for this purpose. For example, rule 0003 could be placed in a higher rule block than rule 0002 and the same result would be achieved. The order in which the rules are visible on the screen is not necessarily the order in which they will execute. The execution order always needs to be explicitly specified using Dependencies and Rule Blocks. A dependency relationship was chosen in this case to emphasize that the rules are really just two aspects of the same masking operation.
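The effect of a dependency in a multi-threaded runner can be sketched as follows. This is a generic illustration in Python, not the Data Masker scheduler: the `Rule` class, its `depends_on` attribute, and the use of `threading.Event` are all assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

execution_log = []
log_lock = threading.Lock()


class Rule:
    """Hypothetical rule that optionally waits for a parent rule to finish."""

    def __init__(self, name, depends_on=None):
        self.name = name
        self.depends_on = depends_on
        self.done = threading.Event()

    def run(self):
        if self.depends_on is not None:
            # Block this worker until the parent rule has completed.
            self.depends_on.done.wait()
        with log_lock:
            execution_log.append(self.name)
        self.done.set()


rule_0002 = Rule("0002")
rule_0003 = Rule("0003", depends_on=rule_0002)
rule_0004 = Rule("0004")  # independent, free to run at any time

# Submit in an order that differs from the required execution order,
# mirroring the point that on-screen order does not decide execution order.
with ThreadPoolExecutor(max_workers=3) as pool:
    for rule in (rule_0003, rule_0004, rule_0002):
        pool.submit(rule.run)
```

Whatever order the threads are scheduled in, "0002" always appears before "0003" in `execution_log`, while "0004" may land anywhere.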
Additional Rules
Below are some suggestions for additional rules which can be added to the DB2 Sample database example masking set. An ER diagram for Sample is available to assist you in your analysis of the database structure. If you do add new masking rules, it is a good idea to save the modified masking set under a new name - that way if you upgrade the Data Masker software you will not overwrite your enhancements.
EMPLOYEE:hiredate
Implement a substitution rule on this field to mask the date contents. There are a number of datasets to choose from. Perhaps the Date Variance dataset would be appropriate.
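One way to picture a date variance is as a random shift of a bounded number of days. The sketch below assumes a uniform shift of up to 30 days in either direction and uses invented hire dates; the function name `date_variance` and the 30-day bound are illustrative choices, not settings taken from the Date Variance dataset.

```python
import random
from datetime import date, timedelta


def date_variance(value, max_days, rng):
    """Shift `value` by a random whole number of days in [-max_days, +max_days]."""
    # randint may return 0, so a date can occasionally stay unchanged.
    return value + timedelta(days=rng.randint(-max_days, max_days))


rng = random.Random(3)

# Invented hire dates for illustration only.
hire_dates = [date(1995, 1, 1), date(2003, 10, 10), date(2005, 4, 5)]
masked_dates = [date_variance(d, 30, rng) for d in hire_dates]
```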
EMPLOYEE:bonus
This value should be changed - a Number Variance dataset could be used. Perhaps a null value everywhere or a random bonus might also be appropriate. It might be best not to enter a value in a bonus field that is already null. A Where Clause like the one on rule 0005 can be used to prevent this.
EMPLOYEE:comm
This value should be changed - a Number Variance dataset could be used. Perhaps a null value everywhere or a random commission might also be appropriate. It might be best not to enter a value in a comm field that is already null. A Where Clause like the one on rule 0005 can be used to prevent this.
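The null-preserving idea for the bonus and comm fields can be sketched as below. The sketch assumes the Where Clause amounts to an `IS NOT NULL` filter on the column being masked, which is an assumption rather than a description of rule 0005; the rows, the helper name `vary_non_null`, and the 10% range are likewise invented for illustration.

```python
import random


def vary_non_null(rows, column, percent, rng):
    """Apply a +/- percent variance to `column`, skipping rows where it is null."""
    for row in rows:
        if row[column] is None:
            # Plays the role of an assumed "WHERE column IS NOT NULL" filter.
            continue
        factor = 1 + rng.uniform(-percent, percent) / 100.0
        row[column] = round(row[column] * factor, 2)


# Invented rows for illustration only.
employees = [
    {"EMPNO": "A", "BONUS": 1000.00, "COMM": 4220.00},
    {"EMPNO": "B", "BONUS": None, "COMM": 3300.00},
    {"EMPNO": "C", "BONUS": 800.00, "COMM": None},
]

rng = random.Random(5)
for column in ("BONUS", "COMM"):
    vary_non_null(employees, column, 10, rng)
```

Rows that started with a null keep it, so the masked data does not reveal or invent payments that never existed.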
DEPARTMENT:deptname
The department name can be changed to a new value. A Substitution Rule using the Text, Dictionary Words dataset would probably do the job. If you would like to try something innovative, perhaps implement a Row-Internal Sync. Rule which concatenates the values from two Text, Dictionary Words datasets together.
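The concatenation idea amounts to drawing two words independently and joining them. The Python below only illustrates that result; it does not show how a Row-Internal Sync. Rule is configured. The word list is a hypothetical stand-in for the Text, Dictionary Words dataset, and joining with a single space is an assumption.

```python
import random

# Hypothetical stand-in for the Text, Dictionary Words dataset.
DICTIONARY_WORDS = ["AMBER", "HARBOR", "LANTERN", "MEADOW", "SUMMIT", "WILLOW"]


def masked_department_name(rng):
    """Build a department name from two independently chosen dictionary words."""
    first = rng.choice(DICTIONARY_WORDS)
    second = rng.choice(DICTIONARY_WORDS)
    return f"{first} {second}"


rng = random.Random(11)
department_names = [masked_department_name(rng) for _ in range(4)]
```

When applying this to a real column, check that the longest possible combination still fits the column's defined length.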