Data masking refers to replacing existing sensitive information in test or development databases with information that looks real but is of no use to anyone who might be tempted to misuse it.
Substitution As A Data Masking Technique
In a nutshell, this method consists of randomly replacing the contents of a column with information that looks similar but is unrelated to the real details. For instance, the surnames in a customer database could be masked by replacing the real last names with surnames drawn from a random list. Substitution is very effective at preserving the look and feel of the original data. Its major drawback is that a fairly large store of substitute information must be available for every column to be substituted.
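The surname example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `substitute_column` helper, the `REPLACEMENT_SURNAMES` list and the sample rows are all hypothetical, and a real substitution pool would need to be much larger.

```python
import random

# Hypothetical replacement pool; a real one would need many more entries.
REPLACEMENT_SURNAMES = ["Archer", "Bennett", "Castillo", "Dawson", "Ellis", "Fischer"]


def substitute_column(rows, column, pool, seed=None):
    """Return copies of the rows with `column` replaced by random picks from `pool`."""
    rng = random.Random(seed)  # a seed makes the masking run repeatable
    return [{**row, column: rng.choice(pool)} for row in rows]


# Made-up sample data standing in for a customer table.
customers = [
    {"id": 1, "surname": "Smith"},
    {"id": 2, "surname": "Jones"},
    {"id": 3, "surname": "Nguyen"},
]

masked = substitute_column(customers, "surname", REPLACEMENT_SURNAMES, seed=42)
for row in masked:
    print(row)
```

The masked rows keep their ids and still look like a plausible customer list, but each surname now comes from the replacement pool rather than from the real data.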
Shuffling As A Data Masking Technique
This method is similar to substitution, except that the substitution data is derived from the column itself. The data in a column is randomly moved between rows until it no longer bears any reasonable correlation with the remaining information in the row.
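A minimal sketch of shuffling, assuming the table is held in memory as a list of dictionaries. The `shuffle_column` helper and the sample rows are hypothetical. Note that a single random shuffle can leave some values in their original rows, especially in small tables, so this is a starting point rather than a guarantee.

```python
import random


def shuffle_column(rows, column, seed=None):
    """Return copies of the rows with the values of `column` redistributed at random."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)  # the substitute data comes from the column itself
    return [{**row, column: value} for row, value in zip(rows, values)]


# Made-up sample data.
employees = [
    {"id": 1, "surname": "Smith", "salary": 52000},
    {"id": 2, "surname": "Jones", "salary": 61000},
    {"id": 3, "surname": "Nguyen", "salary": 47000},
    {"id": 4, "surname": "Okafor", "salary": 75000},
]

shuffled = shuffle_column(employees, "salary", seed=7)
for row in shuffled:
    print(row)
```

The column keeps exactly the same set of values, so totals and distributions are unchanged. Only the link between a value and its row is altered.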
Encryption offers the option of leaving the data intact and visible to users holding the appropriate key, while the encrypted data remains effectively useless to anybody without the key. This would seem to be a very good option; unfortunately, for anonymous test databases, it is one of the least preferred techniques.
Masking Out Data
Apart from being the generic term for the process of data masking, masking out also means replacing certain fields with a mask character such as the letter X. This effectively camouflages the data content while preserving the same formatting on front-end screens and reports.
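Masking out can be illustrated with a small helper that replaces letters and digits with a mask character while leaving separators in place, so the value keeps its on-screen format. The `mask_out` function and its `visible` parameter (which leaves the last few characters readable) are assumptions made for this sketch, not something the technique prescribes.

```python
def mask_out(value, visible=0, mask_char="X"):
    """Mask letters and digits in `value`, keeping separators and the last `visible` characters."""
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - visible, 0)  # how many leading letters/digits to hide
    masked = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            masked.append(mask_char)
            to_mask -= 1
        else:
            masked.append(ch)  # spaces, dashes and the visible tail are kept
    return "".join(masked)


# Made-up sample values.
print(mask_out("555-1234"))                        # XXX-XXXX
print(mask_out("4346 6454 0020 5379", visible=4))  # XXXX XXXX XXXX 5379
```

One caveat with this sketch: a value with no more than `visible` letters or digits is returned unmasked, so short fields would need separate handling.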
Cross Database Synchronization Technique
The Cross Database Synchronization requirement is much the same as the requirement for Cross Schema Synchronization. The databases are collocated on the same server, but the masked datasets are located in separate databases (and also schemas). Provided this requirement exists, the analysis phase should be carefully planned and the database software should be able to support such a synchronization operation.
Nulling out, that is, simply erasing a column of data and replacing it with NULL values, is an effective way to ensure that the data is not inappropriately visible in test environments. Unfortunately, it is also one of the least preferred options from a test database standpoint.
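As a rough illustration, replacing a column with NULL values amounts to a single UPDATE statement. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The `customers` table, its columns and the sample values are all hypothetical.

```python
import sqlite3

# An in-memory database stands in for a test copy of the data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, surname TEXT, ssn TEXT)")
conn.executemany(
    "INSERT INTO customers (surname, ssn) VALUES (?, ?)",
    [("Smith", "000-00-0001"), ("Jones", "000-00-0002")],  # made-up values
)

# Replace the whole sensitive column with NULL values.
conn.execute("UPDATE customers SET ssn = NULL")
conn.commit()

rows = conn.execute("SELECT id, surname, ssn FROM customers ORDER BY id").fetchall()
for row in rows:
    print(row)
```

After the update every row still exists, but the `ssn` column carries no information at all, which is also why the technique is of limited use when tests depend on that column holding realistic values.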