A SPOTLIGHT ON… De-Duplication
It’s quite common to find the same data backed up multiple times on a network. This duplication uses up additional storage space and wastes other resources. De-Duplication (also known as intelligent backup or intelligent compression) is designed to remove duplicated data and thus maximise your ICT resources. De-Duplication can massively reduce the amount of data that needs to be backed up: some vendors have claimed reductions of up to 80%.
De-Duplication can operate on a file, block or bit basis, and works by analysing the data that resides on a storage medium (such as a single disk drive or backup server). If identical data is found, a single copy of the original data is retained, and the remaining identical copies are replaced with a reference pointer, which directs the system to the original copy. This is useful if, for example, there are many dozens of identical mail attachments residing on your network. Whenever any new data is saved, it is analysed and compared with the stored data. If it is found to be identical, a new reference pointer is stored in its place. As far as users are concerned, there is no difference when it comes to accessing the data they need.
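The retain-one-copy-and-point-to-it idea can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular product works: the `DedupStore` class, its method names and the file names are all hypothetical, and it assumes that two pieces of data with the same SHA-256 digest are identical (real systems may compare the data itself as well).

```python
import hashlib


class DedupStore:
    """Toy file-level de-duplicating store (hypothetical, for illustration)."""

    def __init__(self):
        self.blobs = {}     # digest -> the single retained copy of the data
        self.pointers = {}  # file name -> digest (the "reference pointer")

    def save(self, name, data):
        """Store data under name; return True only if a new copy was kept."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data   # first time this content is seen
        self.pointers[name] = digest    # duplicates only get a pointer
        return is_new

    def read(self, name):
        """Follow the pointer, so users see an ordinary file."""
        return self.blobs[self.pointers[name]]

    def stored_bytes(self):
        """Bytes actually held, ignoring the small cost of the pointers."""
        return sum(len(blob) for blob in self.blobs.values())


# Three users save the same mail attachment.
store = DedupStore()
attachment = b"termly newsletter" * 1000
for user in ("user1", "user2", "user3"):
    store.save(f"{user}/newsletter.pdf", attachment)

print(len(store.pointers), "files,", store.stored_bytes(), "bytes stored")
```

Each of the three names can still be read back in full, but only one copy of the attachment occupies storage. The same approach applies at block level if each file is first split into blocks and every block is hashed separately.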
De-Duplication can be synchronous (performed in real time), whereby data is analysed before being written to disk. This is a very efficient system, as it means that no duplicated data is actually stored on your network. However, synchronous De-Duplication can be a resource-intensive process (particularly on the CPU) and can affect network performance. Asynchronous (also known as batch or offline) De-Duplication analyses data that has already been saved and stored on your network, and so is a less efficient process: duplicates are written in full before they are removed. However, because asynchronous De-Duplication can be performed when demand on network resources is low (such as overnight), it has little, if any, effect on network performance. Whatever process is used, De-Duplication can save storage space, improve network performance and lower costs.
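The difference between the two modes comes down to when the comparison happens. The sketch below is illustrative only: the function names, the in-memory dictionaries and the sample file names are assumptions made for the example, and SHA-256 digests stand in for whatever comparison a real product performs.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()


def write_synchronous(blobs, pointers, name, data):
    """Synchronous: hash before writing, so a duplicate never reaches storage."""
    key = digest(data)
    blobs.setdefault(key, data)  # keep the data only if it is not already held
    pointers[name] = key


def dedupe_batch(raw_files):
    """Asynchronous: raw_files (name -> bytes) were already stored in full;
    a later pass, for example overnight, collapses them to single copies."""
    blobs, pointers = {}, {}
    for name, data in raw_files.items():
        write_synchronous(blobs, pointers, name, data)
    return blobs, pointers


# Five home folders each hold the same document.
raw = {f"user{i}/policy.doc": b"school policy" * 500 for i in range(1, 6)}

# Asynchronous: everything is written first, then deduplicated.
stored_before = sum(len(data) for data in raw.values())
blobs, pointers = dedupe_batch(raw)
stored_after = sum(len(data) for data in blobs.values())

# Synchronous: each write is checked as it arrives.
sync_blobs, sync_pointers = {}, {}
for name, data in raw.items():
    write_synchronous(sync_blobs, sync_pointers, name, data)

print("before batch pass:", stored_before, "bytes; after:", stored_after, "bytes")
```

Both routes end with the same single copy and five pointers. The asynchronous route temporarily needs room for all five copies, while the synchronous route pays the hashing cost on every write.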
Currently, De-Duplication technology is still relatively expensive to deploy in a single school environment; the technology is more commonly found in high-end backup solutions, such as those used for backing up multiple schools over an internet connection. However, as the benefits of De-Duplication become more widely recognised, products suitable for use in schools are likely to appear.
[Figure: ASYNCHRONOUS DEDUPLICATION. The home folders of Users 1 to 5 on the user storage server are backed up to the backup server and its backup storage. Step 1: data is backed up normally to the backup server. Step 2: the backup server then deduplicates the data, removing any files or data that exist elsewhere in the backup.]