Disk Backup with Data Deduplication Product Line
ExaGrid is a scalable, cost-effective disk-based backup solution with deduplication that revolutionizes how organizations back up and protect their data. ExaGrid’s disk-based, scale-out GRID architecture constantly adjusts to ever-growing backup demands, and is the only solution that combines compute with capacity and a unique landing zone to permanently shorten backup windows and eliminate expensive forklift upgrades.
High Performance, Scalable Disk Backup with Data Deduplication
With ExaGrid, you get the only disk backup appliance purpose-built for backup, leveraging a unique architecture optimized for backup and restore performance, scalability, and price. Only ExaGrid’s performance-based GRID architecture offers you:
- Fastest backups up front with permanently short backup windows as data grows
- Instant recovery of full systems, VMs, and files so you have the least downtime
- Lowest total cost over time by eliminating “forklift” upgrades and product obsolescence
Our patented zone-level deduplication reduces the disk space needed by ratios of 10:1 to 50:1 by storing only the unique bytes across backups instead of redundant data. Post-process deduplication delivers the fastest backups, and as your data grows, only ExaGrid avoids expanding backup windows by adding full servers in a GRID. ExaGrid’s unique landing zone keeps a full copy of the most recent backup on disk, delivering the fastest restores, instant VM recovery, “Instant DR,” and fast tape copy. And, as data grows, ExaGrid saves you 50% in total system costs compared to competitive solutions by avoiding costly “forklift” upgrades.
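As an illustration of how deduplication reaches such ratios, here is a toy content-hash scheme in Python. This is a generic fixed-block sketch, not ExaGrid’s patented zone-level algorithm; the chunk size and sample data are assumptions for demonstration only.

```python
import hashlib
import os

def dedupe_ratio(backups, chunk_size=4096):
    """Toy fixed-block deduplication: hash each chunk and store each
    unique chunk only once. Illustrative only; real products differ."""
    store = {}      # chunk hash -> chunk bytes (unique chunks only)
    logical = 0     # total bytes the backup application wrote
    for data in backups:
        logical += len(data)
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            store[hashlib.sha256(chunk).hexdigest()] = chunk
    physical = sum(len(c) for c in store.values())
    return logical / physical   # e.g. 10.0 means a 10:1 reduction

# Ten nightly "full backups" that share a 1 MiB unchanged base and
# differ only in a trailing 4 KiB block dedupe heavily:
base = os.urandom(1024 * 1024)
backups = [base + bytes([n]) * 4096 for n in range(10)]
print(f"{dedupe_ratio(backups):.1f}:1")
```

Because only the small changed block of each night’s backup is new, the ten backups collapse to roughly one copy of the base plus ten small blocks, giving a ratio near 10:1 in this toy setup.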
Replace Tape with Cost-Effective Disk-Based Data Protection
Using ExaGrid’s disk backup solution to replace tape in the nightly backup process can reduce backup windows by up to 90%. A typical 12-hour backup window can be decreased to as little as two to three hours. ExaGrid improves the speed and reliability of your backups and restores, including support for advanced virtualized server recovery techniques such as instant VM recovery. For offsite long-term retention or disaster recovery, ExaGrid offers the ability to transfer backup data to an installed system at a remote location to supplement or eliminate offsite tapes. ExaGrid also supports multi-site topologies where multiple locations can transfer backup data to a centralized site for DR protection. ExaGrid is very cost effective at transferring backup data offsite because ExaGrid’s deduplication only moves changes, requiring minimal WAN bandwidth. The costs and reliability issues associated with tape handling, shipment, and storage are significantly reduced or eliminated.
The ExaGrid system includes standard appliances along with ExaGrid’s software to deliver a complete turnkey solution for disk backup with data deduplication. The ExaGrid appliance is rack-mountable and uses standard components, including Intel® processors, enterprise SATA/SAS drives, and Gigabit Ethernet connection(s).
ExaGrid works seamlessly with all popular backup applications, so you can preserve your investment in backup applications and processes. Using ExaGrid is as simple as pointing your existing backup jobs to a NAS share on the ExaGrid appliance. Backup jobs are sent directly from the backup application to the ExaGrid appliance for onsite disk backup. The backup application can create copies from the ExaGrid system directly to your tape library for offsite storage, or you can deploy a second site ExaGrid to reduce or replace offsite tape.
Highest Performance for Backups
- Fastest backup performance using post-process deduplication, so nothing interferes with the data writing directly to disk, at the speed of disk
- Backup windows kept permanently short as data grows by adding full servers (with processor, memory, disk, and bandwidth) in a GRID
Fastest Restores and Instant Recovery
- Fastest restore and tape copy performance from the most recent backup kept in its whole form. No reassembly from small blocks and large hash tables is required.
- Instant recovery of VMs from high-speed landing zone, which maintains a full copy of the latest backup. If the primary VM is unavailable, recover and run a VM from the ExaGrid system within minutes.
Most Cost-Effective Solution with No “Forklift” Upgrades
- Scalable next-generation GRID architecture with full servers provides plug-and-play expansion. To add an ExaGrid appliance, you simply plug it in and let ExaGrid’s GRID software virtualize the backup capacity pool.
- Appliance models allow full backups of 1TB, 2TB, 3TB, 4TB, 5TB, 7TB, 10TB, 13TB, or 21TB, with corresponding raw capacities of 5TB, 7TB, 9TB, 11TB, 13TB, 16TB, 26TB, 32TB, and 48TB, respectively. Appliances of any size can be mixed and matched, with up to ten servers combined into a single GRID of up to 480TB raw capacity, allowing full backups of up to 210TB.
- 50% lower total system cost vs competing systems over time by eliminating the costly “forklift” upgrades associated with a first-generation front-end controller/disk shelf architecture.
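The capacity figures in the bullets above can be checked with a small sketch. The model table below is transcribed from the appliance list; `grid_capacity` is a hypothetical helper for illustration, not ExaGrid software.

```python
# Appliance models: full-backup TB -> raw-capacity TB (from the list above).
MODELS = {1: 5, 2: 7, 3: 9, 4: 11, 5: 13, 7: 16, 10: 26, 13: 32, 21: 48}

def grid_capacity(appliances):
    """Total (full-backup TB, raw TB) of a mixed GRID of up to ten servers.
    Each element of `appliances` is a model's full-backup size in TB."""
    if len(appliances) > 10:
        raise ValueError("a single GRID holds at most ten servers")
    full = sum(appliances)
    raw = sum(MODELS[a] for a in appliances)
    return full, raw

# Ten of the largest (21TB) appliances reach the stated GRID maximum:
print(grid_capacity([21] * 10))   # (210, 480)
# A mixed GRID of a 2TB, a 5TB, and a 10TB appliance:
print(grid_capacity([2, 5, 10]))  # (17, 46)
```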
Why Choosing the Right Backup Architecture Is Critical to Long-Term Success and a Risk-Free Backup and Restore Process
As more and more organizations reduce or eliminate the use of tape by deploying disk that uses deduplication, the choice of overall architectural approach used by the appliance vendor can make a significant difference to the performance, scalability, and total cost of the selected solution.
Before discussing the pros and cons of the scale-up and scale-out approaches, let’s first define the terms:
- Scale-up typically refers to architectures that use a single, fixed resource controller for all processing. To add capacity, you attach disk shelves up to the maximum for which the controller is rated.
- Scale-out typically refers to architectures that scale performance and capacity separately or in lockstep by not relying on a single controller but instead providing processing power with each unit of disk.
See the diagram below for depictions of scale-up versus scale-out approaches.
[Diagram: scale-up disk-based backup vs. scale-out disk-based backup]
Comparing Scale-up to Scale-out for Backup, Restore, and Recovery, and the Impact of Deduplication
A key thing to understand about disk-based backup is that without deduplication, the economics do not work very well against tape. Because many organizations keep weeks, months, or even years of backup data, the actual amount of backup data is typically a multiple of the amount of live data in the environment. This makes straight disk too expensive for backup.
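The arithmetic behind that claim is easy to sketch. The figures below (10 TB of live data, 13 retained weekly fulls) are illustrative assumptions, not ExaGrid numbers.

```python
def retained_backup_tb(live_tb, fulls_kept):
    """Disk needed without deduplication, assuming each retained full
    backup is roughly a complete copy of the live data (incrementals
    and growth between fulls are ignored for simplicity)."""
    return live_tb * fulls_kept

# Keeping 13 weekly fulls of 10 TB of live data means storing 130 TB
# of backups, 13x the live data, which is why raw disk without
# deduplication struggles to compete with tape on cost.
print(retained_backup_tb(10, 13))  # 130
```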
Backup is more than a storage problem
So combining disk and deduplication is the first step to having an actual product for the backup and recovery market. And scale-up architectures represent that simple premise: disk plus deduplication creates a backup and recovery appliance that can meet the economics of backup. However, simply combining disk and deduplication assumes that backup and recovery represent just a storage problem. But is that really the case?
The answer is no, backup and recovery is more than just a storage problem. In fact, backup and recovery is a:
- Data movement problem – moving significant amounts of data within a pre-defined backup window
- Data processing problem – data needs to be processed to be stored in deduplicated form and restored/recovered back to its original form
- Storage problem – deduplication is essential to store more backup data in far less disk space
If you do not solve all three of these problems, you do not fix the host of issues organizations face with the backup and restore process, including backup window growth, scalability, and technology obsolescence.
Let’s look deeper at some of the advantages of scale-out approaches.
Managing Relentless Data Growth
Data growth leads to performance problems in a scale-up architecture, and the reason is simple. Because the architecture houses all network ports, processors, and memory in a single computing element, performance is limited by the capabilities of that component. As data inevitably grows, only capacity (and with it more workload) can be added, until the maximum capacity of the controller is reached.
This leads to two significant problems:
- During the period of data growth, the length of every process grows with it, including the backup window, deduplication time, replication time, and recovery time. Adding workload to a fixed resource without adding processing power obviously lengthens the time to complete it: quadruple a device’s workload with no additional processing power and it will take four times as long.
- When the controller reaches its maximum capacity, you are faced with a forklift upgrade to a more powerful controller, which can be very costly.
Scale-out architectures handle data growth very differently. In a scale-out architecture, each building block includes, or can include, additional elements of performance: network ports, processors, memory, and, yes, disk. As a result, as data grows and capacity is added, processing power is added with it.
This solves the challenges mentioned above:
- Data growth does not cause backup, deduplication, replication, and recovery times to grow. If the workload is quadrupled, the processing power of the architecture is also quadrupled.
- There is no “maximum capacity.” While vendors may limit how many devices can co-exist in a singly managed system, there is never a need for a forklift upgrade, as devices can continue to be added individually even if it means starting a “new system.”
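The contrast above comes down to simple throughput arithmetic. The sketch below uses a hypothetical 2 TB/hour per controller or per appliance; that figure is an assumption for illustration, not a vendor specification.

```python
def backup_window_hours(data_tb, throughput_tb_per_hour):
    """The backup window is simply data moved divided by throughput."""
    return data_tb / throughput_tb_per_hour

# Scale-up: one fixed controller at 2 TB/hour. Quadrupling the data
# (6 -> 24 TB) quadruples the window (3 -> 12 hours).
scale_up = [backup_window_hours(d, 2) for d in (6, 12, 24)]

# Scale-out: each appliance adds 2 TB/hour, so throughput grows with
# the data and the window stays flat at 3 hours.
scale_out = [backup_window_hours(d, 2 * n)
             for n, d in ((1, 6), (2, 12), (4, 24))]

print(scale_up)   # [3.0, 6.0, 12.0]
print(scale_out)  # [3.0, 3.0, 3.0]
```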
While you can eliminate tape by adding disk, you cannot permanently solve the backup window problem unless you use a scale-out architecture.
Matching Your System Size to Your Immediate Backup Needs
Another difficulty found with the scale-up approach relates to system sizing. Many scale-up vendors offer a variety of controller sizes—meaning controllers that can handle different amounts of maximum disk. And as you would expect, more powerful controllers that allow for more capacity come at a higher cost. A customer purchasing this approach therefore has to decide whether to:
- a) Purchase a controller that can handle a much larger environment than they currently have, to allow for a longer expansion period, or
- b) Purchase a smaller controller that matches their current environment, knowing they will reach maximum capacity sooner and have to replace that appliance with a more expensive one built around a larger controller.
Either way, this is a lose-lose scenario.
Choice a) leads to a much higher up-front cost for a controller built on today’s technology that will quickly become obsolete. Choice b) saves some up-front cost but leads to a forklift upgrade sooner, as the controller’s maximum capacity will be hit more quickly.
In marked contrast, scale-out approaches avoid the system-sizing problem. Because of the modularity of a scale-out architecture, customers can right-size their purchase to the current environment plus reasonable growth. Then, as data grows, more building blocks can be added as needed without concern for a forklift upgrade. This completely de-risks the up-front purchase, potentially makes it more cost-effective, and definitely avoids costly forklift upgrades downstream.
A final weakness of the scale-up approach is the challenge of technology obsolescence. IT professionals are all too familiar with buying a new data center product only to find it reaching end of life shortly after purchase. The problem is exacerbated when you buy a larger controller to allow for a greater expansion runway: the controller locks you into what is then the current technology, and when the vendor releases a controller based on newer technology, the only way to leverage it is to go through another forklift upgrade.
Scale-out approaches may avoid this (depending on the vendor) by allowing users to mix and match different generations of building blocks in the same system. Assuming the vendor guarantees that all of the hardware can be upgraded to the latest software, you can avoid the need to rip and replace expensive components to take advantage of the vendor’s newest offerings.
ExaGrid is the only scale-out vendor in the disk-based backup with deduplication market. And it has extended its lead with its announcement of the EX21000E. This new platform offers 62% more capacity, with double the performance, and a lower cost per TB than its predecessor platforms. Now customers can expand their installations to even larger data amounts while preserving the short backup window. Further, it is backward compatible with previous generation platforms so it continues to protect previous investments in ExaGrid’s appliances by its large and growing customer base.
Conclusion – only one architecture, and vendor, solves the backup and restore problem forever
A number of vendors offer scale-up approaches to disk-based backup, but only one offers scale-out: ExaGrid. If you are reviewing your approach to backup, it makes sense to de-risk your data protection and choose the one vendor with a scale-out approach, so you can have total trust that you will always be able to restore or recover your data when you need it most, in a way that takes cost out of the backup infrastructure.
Each time your data grows you will be given another reason why it was the smartest choice.
ExaGrid employs a unique form of deduplication called zone-level deduplication. What makes zone-level deduplication so unique and powerful is that it is the first truly scalable deduplication algorithm that is also generic enough to support most, if not all, of the backup applications and data protection utilities on the market today and in the future.
Prior to ExaGrid’s entry into the market, first-generation disk-based backup appliances utilized generic block-level algorithms to perform deduplication. Due to the need for blocks to exactly match other blocks to achieve deduplication, these implementations cut data into very small object sizes, such as 8k or 16k. As a result, they suffer significant scaling limitations due to the size of their tracking tables. For example, a product using an 8k object size would generate 1.25 billion objects to track when storing as little as 10 TB of data. This prevents vendors from distributing the tracking tables across multiple servers, limiting them to systems that employ a single server (called a controller) plus disk shelves, or small fixed-size appliances with no expansion.
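The tracking-table arithmetic in that example can be reproduced directly. The 32-byte index entry used below is a hypothetical sizing (roughly one SHA-256 hash per object), chosen only to show why such tables are hard to distribute across servers; decimal units are used, as in the text.

```python
def index_entries(stored_bytes, object_bytes):
    """Number of entries a block-level deduplication index must track."""
    return stored_bytes // object_bytes

# 10 TB stored at an 8k object size (decimal units):
n = index_entries(10 * 10**12, 8 * 10**3)
print(f"{n:,} objects")          # 1,250,000,000 objects

# Even a lean 32-byte entry per object yields a 40 GB index that must
# fit in one controller's memory/disk to stay fast (hypothetical sizing).
print(f"{n * 32 / 10**9:.0f} GB index")  # 40 GB index
```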
So, while block-level algorithms deliver on capacity reduction and are generic enough to support a wide variety of applications and utilities, they trade away the ability to have a highly scalable architecture surrounding them. ExaGrid’s zone-level deduplication does not force this trade-off.