Data Deduplication

ExaGrid looked at the first-generation, traditional inline approaches to data deduplication and saw that all vendors had used block-level deduplication. This traditional method splits data into 4KB to 10KB “blocks.”

Backup software that performs deduplication uses larger, 64KB to 128KB fixed-length blocks due to CPU limitations. The challenge is that for every 10TB of backup data (assuming 8KB blocks), the tracking table, or “hash table,” holds more than one billion block entries. The hash table grows so large that it needs to be housed in a single front-end controller with additional disk shelves, an approach referred to as “scale-up.” As a result, only capacity is added as data grows. Because no additional bandwidth or processing resources are added, the backup window grows in length as data volumes increase. At some point, the backup window becomes too long and a new front-end controller is required, known as a “forklift upgrade.” This is disruptive and expensive.
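The block count above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming decimal units (1TB = 10^12 bytes, 1KB = 10^3 bytes); the function names and the 40-bytes-per-entry figure are hypothetical and only illustrate how quickly the table grows.

```python
TB = 10**12  # decimal terabyte (assumption; binary units give a larger count)
KB = 10**3   # decimal kilobyte (assumption)


def index_entries(backup_bytes: int, block_bytes: int) -> int:
    """Number of fixed-length blocks, and so hash table entries, rounded up."""
    return -(-backup_bytes // block_bytes)


def index_size_bytes(entries: int, bytes_per_entry: int = 40) -> int:
    """Hash table footprint. 40 bytes per entry is a hypothetical figure
    (a 32-byte digest plus an 8-byte disk location)."""
    return entries * bytes_per_entry


entries_8kb = index_entries(10 * TB, 8 * KB)
entries_64kb = index_entries(10 * TB, 64 * KB)

print(entries_8kb)                            # 1250000000
print(entries_64kb)                           # 156250000
print(index_size_bytes(entries_8kb) / 10**9)  # 50.0 (GB, under the assumed entry size)
```

Under these assumptions, 10TB at 8KB per block gives 1.25 billion entries, while 64KB blocks cut the count by a factor of eight.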

Because deduplication is compute intensive and is performed inline, on the way to disk, backup performance is very slow. In addition, all the data is stored deduplicated and has to be put back together (data rehydration) for every restore request.
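The inline write path and the rehydration step can be illustrated with a generic sketch of fixed-length block deduplication. It is not any vendor's implementation: the `BlockStore` class, its method names, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


class BlockStore:
    """Toy inline block-level deduplication store (illustrative only)."""

    def __init__(self, block_size: int = 8 * 1024):
        self.block_size = block_size
        self.blocks = {}  # the "hash table": digest -> unique block bytes

    def write(self, data: bytes) -> list:
        """Deduplicate inline: hash every block before it is stored."""
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).digest()
            self.blocks.setdefault(digest, block)  # store only unseen blocks
            recipe.append(digest)
        return recipe  # ordered digests needed to rebuild the data

    def rehydrate(self, recipe: list) -> bytes:
        """Put the data back together with one table look-up per block."""
        return b"".join(self.blocks[digest] for digest in recipe)


store = BlockStore(block_size=4)  # tiny blocks so the example is readable
data = b"ABCDABCDEFGH"
recipe = store.write(data)
print(len(recipe), len(store.blocks))   # 3 2
print(store.rehydrate(recipe) == data)  # True
```

Every write pays for a hash and a look-up per block, and every restore pays for a look-up per block, which is the cost the paragraph above describes.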

The net result is slow backups, slow restores, and a backup window that continues to grow as data grows (due to scale-up).

ExaGrid’s Tiered Backup Storage took a more innovative path. ExaGrid uses zone-level deduplication, which breaks data into larger “zones” and then performs similarity detection across the zones. This approach allows for the best of all worlds. First, the tracking table is one-thousandth the size of the block-level table, which makes it possible to use full appliances in a scale-out solution. As data grows, all resources are added: processor, memory, and bandwidth as well as disk. If data doubles, triples, or quadruples, then ExaGrid doubles, triples, or quadruples the processor, memory, bandwidth, and disk so that the backup window stays at a fixed length as data grows. Second, the zone approach is backup application agnostic, allowing ExaGrid to support virtually any backup application. Lastly, ExaGrid’s approach does not maintain a very large, ever-growing hash table and, therefore, avoids the need for expensive flash to accelerate hash table look-ups. This keeps the cost of the hardware low.
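The effect of scale-up versus scale-out on the backup window can be shown with a simple model. This is a sketch under stated assumptions: ingest throughput is the only bottleneck, it adds up linearly across appliances, and the capacity and throughput figures are hypothetical, not ExaGrid specifications.

```python
import math


def scale_up_window_hours(data_tb: float, controller_tb_per_hour: float) -> float:
    """Scale-up: one controller's throughput is fixed; only disk is added."""
    return data_tb / controller_tb_per_hour


def scale_out_window_hours(data_tb: float,
                           appliance_capacity_tb: float,
                           appliance_tb_per_hour: float) -> float:
    """Scale-out: each added appliance brings its own ingest throughput."""
    appliances = math.ceil(data_tb / appliance_capacity_tb)
    return data_tb / (appliances * appliance_tb_per_hour)


# Hypothetical figures: 10 TB/hour per controller or appliance,
# 100 TB of capacity per appliance.
for data_tb in (100, 200, 400):
    print(data_tb,
          scale_up_window_hours(data_tb, 10),
          scale_out_window_hours(data_tb, 100, 10))
# 100 10.0 10.0
# 200 20.0 10.0
# 400 40.0 10.0
```

In this model the scale-up window doubles each time the data doubles, while the scale-out window stays at the same length because throughput grows with capacity.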

ExaGrid provides a unique front-end disk-cache Landing Zone where backups are written without the performance overhead of deduplication. In addition, the most recent backups are kept in the Landing Zone in a non-deduplicated native backup application format. The result is the fastest backups and the fastest restores.
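The Landing Zone idea can be sketched as a two-tier store: backups land undeduplicated, deduplication runs afterward, and recent backups are restored without rehydration. This is a simplified illustration, not ExaGrid's implementation. The `TieredStore` class and its method names are hypothetical, and fixed-size chunk hashing stands in for zone-level deduplication, whose details are not described here.

```python
import hashlib


class TieredStore:
    """Toy landing-zone-plus-repository store (illustrative only)."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.landing_zone = {}  # backup name -> raw, undeduplicated bytes
        self.recipes = {}       # backup name -> ordered chunk digests
        self.chunks = {}        # digest -> unique chunk bytes

    def backup(self, name: str, data: bytes) -> None:
        """Write path: store the backup as-is, with no deduplication work."""
        self.landing_zone[name] = data

    def deduplicate(self, name: str) -> None:
        """Runs after the backup lands; the landing-zone copy is kept."""
        data = self.landing_zone[name]
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).digest()
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        self.recipes[name] = recipe

    def evict(self, name: str) -> None:
        """Drop an older backup from the landing zone once it is deduplicated."""
        if name not in self.recipes:
            raise ValueError("deduplicate the backup before evicting it")
        del self.landing_zone[name]

    def restore(self, name: str) -> bytes:
        """Recent backups come straight from the landing zone; older ones
        are rehydrated from the deduplicated repository."""
        if name in self.landing_zone:
            return self.landing_zone[name]
        return b"".join(self.chunks[d] for d in self.recipes[name])


tiered = TieredStore()
tiered.backup("monday", b"ABCDABCDEFGH")
tiered.deduplicate("monday")
print(tiered.restore("monday"))  # b'ABCDABCDEFGH' (read directly, no rehydration)
tiered.evict("monday")
print(tiered.restore("monday"))  # b'ABCDABCDEFGH' (rehydrated from chunks)
```

The design choice shown is that hashing is kept off the write path and off the restore path for recent backups, which is where the paragraph above places the performance benefit.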

In summary, block-level deduplication drives a scale-up architecture that only adds disk as data grows, or, with a scale-out node approach, requires expensive flash storage to perform large hash table look-ups. Since block-level deduplication is performed inline, backups and restores are slow. ExaGrid’s Tiered Backup Storage with zone-level deduplication includes full server appliances in a scale-out solution without large hash table look-ups, which results in the fastest backup and restore performance at the lowest price. ExaGrid’s approach also supports a wide range of backup applications and scales easily, resulting in a fixed-length backup window regardless of data growth. This Tiered Backup Storage approach provides the best of all worlds: performance, scalability, and low cost.

ExaGrid continues to innovate to fix backup storage…forever!

Talk to us about your needs

ExaGrid is the expert in backup storage—it’s all we do.

Request Pricing

Our team is trained to ensure that your system is properly sized and supported to meet your growing data needs.

Talk With One of Our System Engineers

With ExaGrid’s Tiered Backup Storage, each appliance in the system brings with it not only disk, but also memory, bandwidth, and processing power—all the elements needed to maintain high backup performance.

Schedule call »

Schedule Proof of Concept (POC)

Test ExaGrid by installing it in your environment to experience improved backup performance, faster restores, ease of use, and scalability. Put it to the test! 8 out of 10 who test it decide to keep it.

Schedule now »

Ready to Talk to a System Engineer?
