[Net 2000 Ltd. Home][Data Masker Home][Data Masker Manual][Data Masker FAQ]

The Data Masker Datasets

Datasets provide data values for Insertion, Substitution and Row-Internal Synchronization rules. The datasets associated with the columns defined in the rule indicate which type of data will be entered into the specified target table and column. A wide variety of datasets (see below) are available to provide a range of realistic looking data.

For example, a column containing customer last names could be masked by applying a Substitution rule to it that uses the Names, Surnames, Random dataset. When the Substitution rule executes as part of a masking set run, a random last name is generated and substituted for each real customer last name. The true last name of the customer is thus hidden (preserving privacy and security), while the remaining data is still referentially relevant and usable as a test system.
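The effect of such a Substitution rule can be illustrated with a minimal Python sketch. This is not Data Masker's implementation: the `SURNAMES` list, the `substitute_column` helper and the sample rows are all hypothetical names invented for illustration.

```python
import random

# Hypothetical stand-in for the "Names, Surnames, Random" dataset.
SURNAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson"]


def substitute_column(rows, column, dataset, rng=None):
    """Return copies of the rows with `column` replaced by random dataset values."""
    rng = rng or random.Random()
    return [{**row, column: rng.choice(dataset)} for row in rows]


# Sample rows (invented): only last_name is sensitive here.
customers = [
    {"id": 1, "last_name": "Real-Name-A", "city": "Leeds"},
    {"id": 2, "last_name": "Real-Name-B", "city": "York"},
]

masked = substitute_column(customers, "last_name", SURNAMES, random.Random(0))
```

Only the targeted column changes; the `id` and `city` values pass through untouched, which is what keeps the masked rows usable for testing.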

A dataset is associated with the target table's column when the rule is created and can be changed at any time simply by editing the rule.

Datasets have options which provide further configuration information. Each dataset offers configuration options specific to its requirements. For example, the Dates, Random dataset offers the ability to set the starting and ending points of the date range.

Note: It is quite possible to add your own custom dataset to the Data Masker system. All that is required is to place a simple text file (with a special naming convention) in the datasets directory. Please see the User Defined datasets help page for more information on how to build your own datasets.

The datasets are installed when the Data Masker software is installed. By default the datasets are stored in a directory named DataSets located below the Data Masker installation directory. The location of this directory can be changed through the use of the configuration options on the Misc. Setup Tab.

The Datasets

Listed below are the datasets currently available with the Data Masker software. More datasets are added all the time - and we are always interested in hearing new ideas. If you have a requirement which cannot be fulfilled by the datasets below please do let us know by emailing us at Support@DataMasker.com.

Bank Sort Codes (UK)
Random UK bank sort codes.

Colours (Random)
A list of random colours.

Company Names (Generated)
A list of realistic looking company names.

Counties (UK)
Counties in the United Kingdom.

Country Names
A list of the countries of the world. Can also generate the ISO two letter country codes (us, uk, ca, fr etc) if required.

Credit Card Numbers, AMEX
A list of guaranteed invalid American Express Credit Card numbers.

Credit Card Numbers, Diners
A list of guaranteed invalid Diners Club Credit Card numbers.

Credit Card Numbers, Discover
A list of guaranteed invalid Discover Credit Card numbers.

Credit Card Numbers, MasterCard
A list of guaranteed invalid MasterCard Credit Card numbers.

Credit Card Numbers, VISA
A list of guaranteed invalid VISA Credit Card numbers.
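The manual does not say how the credit card datasets guarantee invalidity. One plausible approach, shown here purely as an assumption, is to generate numbers that fail the Luhn checksum used by payment card numbers. The function names, the default `"4"` prefix and the 16-digit length below are illustrative choices, not Data Masker settings.

```python
import random


def luhn_checksum_ok(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def invalid_card_number(prefix="4", length=16, rng=None) -> str:
    """Build a number with the given prefix that is certain to fail the Luhn check."""
    rng = rng or random.Random()
    body = prefix + "".join(
        str(rng.randrange(10)) for _ in range(length - len(prefix) - 1)
    )
    # Exactly one check digit makes the number valid; shift it by one to break it.
    for check in range(10):
        if luhn_checksum_ok(body + str(check)):
            return body + str((check + 1) % 10)
    raise AssertionError("unreachable: one check digit always validates")
```

Because the final digit is deliberately offset from the valid check digit, every generated number has a realistic shape but cannot pass a Luhn validation.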

Date Variance
The Date Variance dataset will take an existing date in the column and vary it by a random number of days, months or years. The maximum boundaries of the variation are user configurable.
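As a rough sketch of the idea, the following Python function shifts a date by a random number of days within a configurable bound. It covers only the day-based case; the `vary_date` name and its `max_days` parameter are assumptions for illustration, not Data Masker options.

```python
import random
from datetime import date, timedelta


def vary_date(original: date, max_days: int, rng=None) -> date:
    """Shift `original` by a random whole number of days in [-max_days, +max_days]."""
    rng = rng or random.Random()
    offset = rng.randint(-max_days, max_days)  # both bounds are inclusive
    return original + timedelta(days=offset)


# Example: move a date by at most 30 days in either direction.
varied = vary_date(date(1975, 6, 15), max_days=30, rng=random.Random(1))
```

Varying the existing value, rather than replacing it outright, keeps the masked dates close to the originals, so age bands and similar groupings remain roughly plausible.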

Date, User Specified
A fixed, user specified, constant date.

Dates, Random
Provides an infinite number of random dates between a configurable start and end point.

Dates, Random (as Text)
Same as the Random Dates dataset but provides formatting options so the date can be used in a char or varchar field.

Dates, Sequential
Generates a sequential series of dates. Has a user configurable start and increment value.

Dates, Sequential (as Text)
Same as the Sequential Dates dataset but provides formatting options so the date can be used in a char or varchar field.

Departments (FR)
Generates random French department names optionally including abbreviations.

E-Mail Addresses (Random)
Generates realistic looking Email addresses.

Names, First Names, Female (FR)
A list of typical French female first names.

Names, First Names, Female
A list of typical female first names.

Names, First Names, Male (FR)
A list of typical French male first names.

Names, First Names, Male+Female
A list of common male and female first names.

Names, First Names, Male
A list of typical male first names.

Names, Surname Suffixes
Typical post-name text such as Jr., III, Ph.D. etc.

Names, Surname Titles
Typical titles such as Dr., Mr., Mrs., Hon. etc.

Names, Surnames, Random (FR)
A large list of French surnames.

Names, Surnames, Random (Large List)
An extremely large list of surnames.

Names, Surnames, Random (Short List)
The 1000 most common UK surnames.

NI Numbers (UK)
Generates random United Kingdom NI numbers which will pass basic validity and formatting checks.

NULL Values
Substitutes or Inserts NULL values in the specified column.

Number Variance
Not a dataset exactly - but functions just like one. The Number Variance dataset will take the existing numeric data in the column and vary it by a random percentage. The bounds of this percentage are user configurable.
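The percentage variation can be sketched in a few lines of Python. The `vary_number` function and its `min_pct` and `max_pct` parameters are hypothetical names standing in for the configurable bounds; the actual option names in Data Masker may differ.

```python
import random


def vary_number(value: float, min_pct: float, max_pct: float, rng=None) -> float:
    """Scale `value` by a random percentage drawn from [min_pct, max_pct]."""
    rng = rng or random.Random()
    pct = rng.uniform(min_pct, max_pct)
    return value * (1 + pct / 100.0)


# Example: vary a salary by up to 10 percent in either direction.
masked_salary = vary_number(1000.0, -10.0, 10.0, random.Random(3))
```

With bounds of -10 and +10, a value of 1000 ends up somewhere between roughly 900 and 1100, so the masked figures stay in a realistic range.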

Numbers, Floating Point (Random)
Provides an infinite number of random floating point numbers between a configurable start and end point. The number of decimal places is also configurable.

Numbers, Floating Point (Random, as Text)
Same as the Random Floats dataset but provides formatting options so the number can be used in a char or varchar field.

Numbers, Integer (formatted)
A dataset which accepts a format string and replaces markers within it with random letters and digits. Valid substitution markers are %c, %C and %n, which substitute a random lower case letter, upper case letter and digit respectively for each occurrence in the format string.

For example, the format string MyText%C%C%n might produce a series of values such as MyTextZY7, MyTextFT3, MyTextDD0... etc. when used in a substitution rule. Use a double percent symbol %% if you wish the output to contain a literal percent character rather than having it treated as a marker.
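The marker expansion described above can be sketched in Python as follows. This is an illustrative reimplementation, not Data Masker's code: the `expand_format` name is hypothetical, markers are assumed to be case-sensitive as listed (%c, %C, %n, %%), and leaving unrecognised markers untouched is also an assumption.

```python
import random
import string


def expand_format(fmt: str, rng=None) -> str:
    """Replace %c, %C, %n and %% markers in `fmt` with random or literal characters."""
    rng = rng or random.Random()
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            marker = fmt[i + 1]
            if marker == "c":
                out.append(rng.choice(string.ascii_lowercase))
            elif marker == "C":
                out.append(rng.choice(string.ascii_uppercase))
            elif marker == "n":
                out.append(rng.choice(string.digits))
            elif marker == "%":
                out.append("%")  # %% yields one literal percent sign
            else:
                out.append(ch + marker)  # assumption: unknown markers pass through
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Example: produces values shaped like MyTextZY7.
sample = expand_format("MyText%C%C%n", random.Random(42))
```

Scanning the string one character at a time, rather than chaining string replacements, ensures that %% is handled correctly even when it sits next to another marker, as in `%%c`.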

Numbers, Integer (Random)
Provides an infinite number of random integers between a configurable start and end point.

Numbers, Integer (Random, as Text)
Same as the Random Integers dataset but provides formatting options so the number can be used in a char or varchar field.

Numbers, Integer, Sequential
Provides a sequential list of integers using a configurable start and increment value.

Numbers, Integer, Sequential (as Text)
Same as the Sequential Integers dataset but provides formatting options so the number can be used in a char or varchar field.

Numbers, User Specified
A fixed, user specified, constant number.

Occupations
A list of job titles and occupations.

Postcodes, Canadian (Invalid)
A list of guaranteed invalid random Canadian Postcodes.

Postcodes, FR
A list of random French Postcodes.

Postcodes, UK (Invalid)
A list of valid or invalid United Kingdom postcodes.

Provinces, Canadian
A list of Canadian Provinces. This dataset can also generate the standard two letter abbreviation for the province if required.

SIN Numbers (Canadian)
Generates random Canadian SIN numbers.

SSN Numbers, (USA)
Random US SSN numbers with correct high group codes. This dataset can also supply guaranteed invalid SSN numbers.

State Names (US)
A list of the states in the U.S.A. Can also generate the two letter state codes (tx, vt, ak, ny etc) if required.

Street Addresses (FR)
Realistic looking French street addresses.

Street Addresses
Realistic looking street addresses.

Telephone Numbers, (FR)
Generates telephone numbers in the French format. Can also generate always invalid numbers.

Telephone Numbers, (North America)
Generates telephone numbers in the North American format. Can also generate always invalid numbers.

Telephone Numbers, (UK)
Generates telephone numbers in the United Kingdom format. Can also generate always invalid numbers.

Text, Alpha-Numeric (Formatted)
Formatted text with a variety of random substitution options. Use this to generate random ID numbers (etc.) in specific formats.

Text, Dictionary Words
A large list of words.

Text, Paragraphs of Gibberish
Generates paragraphs of random character strings. The maximum word length and paragraph upper and lower bounds are user configurable.

Text, User Specified
Fixed, user specified, constant text.

Town/City Names (FR)
A large list of random French town and city names.

Town/City Names
A large list of random town and city names.

User Defined Dataset
Inserts (or substitutes) lines of text from a user supplied text file. Please see the User Defined datasets help page for more information on how to build your own datasets.

Vehicle Registrations (FR)
Generates random vehicle license plate numbers in the French format.

Vehicle Registrations (UK)
Generates random vehicle license plate numbers in the UK format. Offers options to choose from among the various styles which have appeared over the years.

Zip Codes, US (Invalid)
A list of valid or invalid United States zipcodes. Also offers the option of using Zip+4 notation.

