34 | Test data
data will still retain full integrity; however, it will be difficult to identify a specific customer. This can also be used for detaching address IDs from customer IDs, allowing customer transaction details to be separated from address details, and for detaching sign-on information from personal information.

Independent functions
A library of simple functions to apply to data as it is extracted should be built up. It should include adding a small decimal increment to transaction values to mask individual transactions, and adding a number of days to all dates, a very simple transformation to implement, assuming all your dates are identified as a date data type. This also has the advantage of allowing time-dependent process testing to be more accurate. Bear in mind that end-of-month processing can be affected by this. You may be better off using a cross-reference table to match up periods.

Multi-table column values
Many column values are repeated in tables across the system. These values can be external identifiers (for example, an account number may be used extensively across a system and across other applications) or internal identifiers, for example an account ID. While changing external values is obvious, changing the internal identifiers should also be considered. If a user can identify an internal ID (they are sometimes displayed in reports, XML messages, error messages, etc.), then the data is no longer masked.

The first problem is tracking down all of the links between tables and columns. Once this is done, you need to make sure that the same scramble functions are applied to all of the related columns. Identifying these columns may need a tree walk across your model.

Functions that are used in multi-column scrambling should include:
• A simple character-by-character replacement. Basically, shift characters five and six in a string identifier to one less and one more respectively.
• Hashing, which is a key component in multi-column replacement. It allows values to be transformed to the same value every time, dependent on a hash key. For example, 1 would be transformed to 7, 2 to 6, 3 to 1, etc. Each value has a unique hashed value and is repeatable based on the hash key.
• Using dynamic seed tables, which will build an exact replacement value for each identifier. You need to protect the seed table as it contains the ‘key to crack the code’, and you must also protect the offset algorithm, as this can be used to identify data.

Using the above techniques, it is possible to apply incremental updates to your development environment. So, for example, if you extract sub sets of production transaction data to keep

Figure 2 – A tree walk using Datamaker to identify internal IDs
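To make the independent and multi-column functions described above more concrete, here is a minimal Python sketch. It is not taken from Datamaker or any other tool. Every name in it (`shift_date`, `mask_amount`, `shift_chars`, `hash_value`, `build_seed_table`), the 17-day offset, the 0.03 increment and the key are assumptions chosen purely for illustration. The character shift assumes that characters five and six of the identifier are digits. The seed table is built here as a seeded shuffle of the existing IDs, which is only one simple way of getting unique, repeatable replacements.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta
from decimal import Decimal

# Hypothetical settings; in a real project these would be kept secret.
DATE_OFFSET_DAYS = 17
AMOUNT_INCREMENT = Decimal("0.03")
HASH_KEY = b"example-hash-key"


def shift_date(d: date) -> date:
    """Independent function: add a fixed number of days to a date."""
    return d + timedelta(days=DATE_OFFSET_DAYS)


def mask_amount(amount: Decimal) -> Decimal:
    """Independent function: add a small decimal increment to a value."""
    return amount + AMOUNT_INCREMENT


def shift_chars(identifier: str) -> str:
    """Shift character five down by one and character six up by one.

    Assumes both characters are digits; values wrap around modulo 10.
    """
    chars = list(identifier)
    chars[4] = str((int(chars[4]) - 1) % 10)
    chars[5] = str((int(chars[5]) + 1) % 10)
    return "".join(chars)


def hash_value(value: str) -> str:
    """Keyed hash: the same input and key always give the same output.

    The digest is truncated for readability, so collisions are unlikely
    but not impossible.
    """
    digest = hmac.new(HASH_KEY, value.encode(), hashlib.sha256).hexdigest()
    return digest[:12]


def build_seed_table(ids, key: str) -> dict:
    """Build a repeatable one-to-one replacement table for a set of IDs.

    A seeded shuffle gives each ID exactly one replacement. An ID may
    occasionally map to itself. The random module is not cryptographically
    secure, so this is illustrative only.
    """
    originals = sorted(set(ids))
    replacements = originals[:]
    random.Random(key).shuffle(replacements)
    return dict(zip(originals, replacements))


if __name__ == "__main__":
    print(shift_date(date(2009, 3, 1)))
    print(mask_amount(Decimal("100.00")))
    print(shift_chars("ACC123456"))
    print(hash_value("1001"))
    print(build_seed_table([1, 2, 3, 4, 5], "example-seed"))
```

Whichever functions are chosen, the same function with the same key or seed has to be applied to every related column, otherwise the links between tables are broken. The seed table and the keys need the same protection as the production data itself.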
T.E.S.T | March 09