Deduplication Frequently Asked Questions
Customers and prospects often have questions about data deduplication, ExaGrid’s products and technology as well as the company itself. A list of the most frequently asked questions and their answers is available below.
What backup applications does the ExaGrid appliance work with?
ExaGrid works with these backup applications and environments.
What data can be backed up into an ExaGrid appliance?
The ExaGrid system can be used with any data that comes through the above listed backup applications. It doesn’t matter if the data is made up of files, e-mail data, databases, etc. It also doesn’t matter whether it’s coming from a disk local to the target, a NAS system, a SAN, or a remote office via replication or WAN acceleration. All types of data can be effectively data-reduced with byte-level data deduplication.
How many copies of my backup data can I keep on an ExaGrid system?
If your backup schedule is to do:
- Weekly full backups of all data (files, database, e-mail), plus
- Daily incremental backups of files, plus
- Daily full backups of databases and e-mail
You can keep: Up to 16 weeks of weekly fulls plus 2 weeks of dailies
If your backup schedule is to do:
- Daily full backups of all data
You can keep: Up to 75 copies of your daily fulls
The above numbers are based on a typical mix of data with an industry average change rate of 2% per week at the byte level. In other words, in any given week, data changes an average of 2% through normal business activities (file edits, file adds, file deletes, e-mail traffic, and database transactions).
Ease of Use
What do I need to change to make my backup application(s) work with ExaGrid?
No changes are required. All backup applications support writing to disk in a number of ways: to a disk volume, to a tape library, to a NAS (Network Attached Storage) share over Ethernet, or through backup jobs that write snapshots to disk. ExaGrid presents NAS shares to the backup server. Existing backup jobs are simply redirected to write to the NAS shares on the ExaGrid system. Your current backup jobs and schedule stay intact.
Does ExaGrid require special software on my backup server or do I need to buy any additional software from my backup vendor?
As long as you are using one of the backup application versions listed above, then you have everything you need. ExaGrid does not require any additional software. Your backup application has everything it needs to write to disk.
How hard is it to set up?
You plug in a standard Gigabit Ethernet connection between the backup server and the ExaGrid system, set up some shares on it, point your existing backup jobs to the NAS shares, and go. Setup can be completed in a morning and backups can be sent to the ExaGrid system that night.
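Before pointing existing backup jobs at a new NAS share, it can help to confirm from the backup server that the share is mounted and writable. The following is a minimal sketch of such a check, not an ExaGrid tool; the function name `share_is_writable` and the path `/mnt/exagrid/backups` are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def share_is_writable(share_path):
    """Return True if a small temporary file can be created and removed under share_path."""
    if not os.path.isdir(share_path):
        return False
    try:
        # The temporary file is deleted automatically when the block exits.
        with tempfile.NamedTemporaryFile(dir=share_path):
            pass
        return True
    except OSError:
        return False


# Hypothetical mount point of a NAS share on the backup server.
share = "/mnt/exagrid/backups"
print(f"{share} writable: {share_is_writable(share)}")
```

The same check works for a share mounted over NFS or mapped over CIFS, since it only relies on ordinary file operations.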
Do you support backup servers running on Windows, Linux and UNIX?
Yes. ExaGrid supports two protocols: CIFS for Windows and NFS for Linux and UNIX.
Do I continue to use my backup application user interface?
Yes, you set up backup jobs, rotations, restore data, etc., as you do today. ExaGrid sits behind the backup server as a disk-based backup storage repository.
Will my backup window shrink?
Yes. Most customers report that their backup window drops by 40% to 90%. It is not uncommon to see a 10-hour backup window drop to 5 hours or less. ExaGrid writes at the speed of disk because it does all of its compression and zone-level data deduplication after the backup job is completed; it does not do any inline compression or deduplication. This allows the backups to go as fast as the disk can write.
What level of compression and data deduplication do you achieve?
ExaGrid compresses the most recent backup file and keeps it in its entirety. Average compression is about 2 to 1, so a 10TB backup file would be stored as 5TB. All previous backup files are then kept as the byte-level changes only, which average about 2% of the data, or about 200GB for every 10TB of data. This means that if you kept 20 weeks of backup retention for 10TB of primary data, the latest backup would be stored as 5TB and the 19 previous backups would be stored as 19 x 200GB = 3.8TB of byte-level changes, for a total of 5TB + 3.8TB = 8.8TB. Without compression and zone-level data deduplication, you would need 20 x 10TB = 200TB of storage space. In this example ExaGrid would use 8.8TB versus 200TB, which is about a 23 to 1 data reduction. Overall, we see anywhere from 10:1 to 50:1 reduction in disk consumption.
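The arithmetic in this example can be reproduced with a short calculation. This is only a sketch of the worked example above, using the stated averages (2 to 1 compression, 2% change per backup); the function and variable names are illustrative, and real results depend on the actual data mix.

```python
def retained_storage_tb(primary_tb, weeks, compression_ratio=2.0, change_rate=0.02):
    """Estimate disk used when the latest full is kept compressed
    and each older full is kept as its changes only."""
    latest_full = primary_tb / compression_ratio
    older_changes = (weeks - 1) * primary_tb * change_rate
    return latest_full + older_changes


def unreduced_storage_tb(primary_tb, weeks):
    """Disk needed if every weekly full were kept whole and uncompressed."""
    return primary_tb * weeks


primary_tb, weeks = 10, 20
reduced = retained_storage_tb(primary_tb, weeks)
raw = unreduced_storage_tb(primary_tb, weeks)
ratio = raw / reduced
print(f"{reduced:.1f} TB stored vs {raw} TB unreduced, about {ratio:.0f}:1")
```

Changing `weeks` or `change_rate` shows how the reduction ratio grows with longer retention and shrinks with a higher change rate.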
What happens to my restore speed if you store zone-level changes?
Ninety percent of restores come from the last backup, as this is the most up-to-date data. In this case, ExaGrid is extremely fast because it stores the last backup in its entirety. For the remaining 10% of restores, ExaGrid updates the last backup with the zone-level changes from the version requested. This happens very quickly, as ExaGrid ships with Intel Quad-Core Xeon processors.
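The general idea of keeping the latest backup whole and older versions as changes only can be illustrated with a toy example. This is a simplified sketch, not ExaGrid's actual storage format or algorithm: it works on single bytes rather than zones, assumes all versions have the same length, and the names `BackupStore`, `make_reverse_delta`, and `apply_delta` are invented for illustration.

```python
def make_reverse_delta(newer, older):
    """Record the bytes of `older` that differ from `newer` (equal lengths assumed)."""
    if len(newer) != len(older):
        raise ValueError("this sketch only handles equal-length versions")
    return {i: o for i, (n, o) in enumerate(zip(newer, older)) if n != o}


def apply_delta(data, delta):
    """Return a copy of `data` with the recorded bytes put back."""
    buf = bytearray(data)
    for offset, value in delta.items():
        buf[offset] = value
    return bytes(buf)


class BackupStore:
    """Keeps the latest backup whole and older backups as reverse deltas."""

    def __init__(self):
        self.latest = None
        self.deltas = []  # deltas[0] turns the latest into the previous version, and so on

    def add_backup(self, data):
        if self.latest is not None:
            self.deltas.insert(0, make_reverse_delta(data, self.latest))
        self.latest = data

    def restore(self, versions_back=0):
        data = self.latest  # the most common restore needs no delta work
        for delta in self.deltas[:versions_back]:
            data = apply_delta(data, delta)
        return data


store = BackupStore()
store.add_backup(b"week1 data AAAA")
store.add_backup(b"week2 data AABA")
store.add_backup(b"week3 data ABBA")
print(store.restore(0))  # latest backup, returned as stored
print(store.restore(2))  # oldest backup, rebuilt from the latest plus two deltas
```

In this layout the most recent backup is returned directly, while older versions cost extra work proportional to how far back they are.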
Onsite & Offsite Systems / Tape Elimination
Can I use an ExaGrid system onsite to eliminate tape and still use tape for offsite?
Yes, your existing backup application will write to ExaGrid for your standard nightly backups on Monday, Tuesday, Wednesday, Thursday, Friday, etc. You can set up a backup job, in your existing backup application, to make a copy from ExaGrid (NAS) through the backup server to tape. All backup applications can make a copy from NAS to tape. The benefit of using this feature already in your backup application is that restores can be completed directly from tape without ExaGrid in the middle.
Can I use ExaGrid for both onsite and offsite and shut tape off?
Yes. In fact, over half of ExaGrid’s customers have systems installed at more than one location and move zone-level changes offsite for long-term offsite retention and/or disaster recovery. ExaGrid supports two-site or multi-site topologies. Because ExaGrid moves only the zone-level changes from the local site to an offsite system, only about 1/50th of the data has to traverse the WAN.
How much WAN bandwidth would I need between ExaGrid systems?
Typically, about 2% of the bytes change from full backup to full backup. ExaGrid compares the two backups and only moves the changes at the byte level. For every 1TB of primary data, the byte-level changes would be about 20GB. Standard bandwidth math dictates that for 20GB of data, about 3 Mbps is needed to move the changes to the offsite location in less than a day (about 18 hours). If the change rate is higher, then additional bandwidth would be required. By comparison, a 1TB full backup would take about 38 days to move across to the second site, which is why you only want to move the byte-level changes from one site to the other.
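The bandwidth math can be sketched as follows. The function name and the `efficiency` parameter are illustrative assumptions: the text does not say how its figures were derived, but an assumed effective link utilization of about 80% roughly reproduces the "about 18 hours" and "about 38 days" quoted above, whereas an ideal, fully utilized 3 Mbps link would be somewhat faster.

```python
def transfer_hours(data_gb, link_mbps, efficiency=1.0):
    """Hours to move data_gb (decimal gigabytes) over a link of link_mbps
    megabits per second, derated by an assumed efficiency factor."""
    megabits = data_gb * 1000 * 8  # GB -> megabits, decimal units
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


changes_gb = 1000 * 0.02  # 2% of 1TB of primary data = 20GB
ideal = transfer_hours(changes_gb, 3)
derated = transfer_hours(changes_gb, 3, efficiency=0.8)  # 0.8 is an assumption
full_days = transfer_hours(1000, 3, efficiency=0.8) / 24

print(f"20GB of changes, ideal link:    {ideal:.1f} hours")
print(f"20GB of changes, 80% efficient: {derated:.1f} hours")
print(f"1TB full backup, 80% efficient: {full_days:.1f} days")
```

Raising the change rate or the amount of primary data scales the transfer time linearly, which is why a higher change rate calls for more bandwidth.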
How do I get the first full backup to the offsite if I have limited bandwidth?
You start with both systems together in the same data center, run a full backup into the primary system, and then replicate the full to the second system over the local network. The second system is then shipped to the offsite location. From that point forward, only changes at the byte level are replicated to the second site.
If only the changes are moved to the second site, do I have a complete backup on the offsite system?
The offsite system has the latest copy of the full backup because each time ExaGrid replicates changes to the offsite it updates the latest full with the changes. This creates an up-to-date full backup on both sides with identical byte-level changes for the previous copies. Both the primary and offsite are identical in every way.
Can I install the offsite system at another location and do backups into that location?
Yes, backups can be sent to the ExaGrid at the primary site and at the offsite location. The byte-level changes will replicate from the primary site to the offsite system, and the byte-level changes at the offsite will replicate to the primary site. In short, the systems will both act as primary backup storage for the site where they are located but will cross-protect each other as offsite backups.
Data Growth & System Expansion
What if my data grows or I decide to extend my retention?
ExaGrid is a highly scalable solution. ExaGrid ships with GRID computing software. As your data grows, you just plug in another ExaGrid server and it virtualizes into the existing system. To the backup server it just appears as a bigger system. The processor, memory and disk storage “virtualize” to make a larger system. ExaGrid has a back channel Gigabit Ethernet connection between servers for management and for load balancing as ExaGrid will automatically load and capacity balance across servers.
What if new drives or new processors ship; can the old work with the new?
Yes, this is the power of GRID computing. If you buy an ExaGrid server today and another in two years, after processors and drives have improved, the two systems will still virtualize into each other, because GRID computing works with different processors, processor speeds, disks, and disk sizes. For the first time, backup systems do not have to become obsolete as you go forward.
As I add more ExaGrid servers, do the backups slow down?
No. Each ExaGrid server comes with additional Gigabit Ethernet connectivity, as well as on-board processor, memory and disk. Therefore, each server comes equipped to handle the amount of data it is sized for. Also, shares can be simply migrated from one server to another such that the jobs can be spread across multiple servers in the GRID. ExaGrid tracks and knows where all the data is in the GRID, as the GRID is seen as one large virtualized system.
What size appliances does ExaGrid have and can they be mixed and matched in the GRID?
Multiple ExaGrid EX series appliances allow full backups of 1TB, 2TB, 3TB, 4TB, 5TB, 7TB, 10TB, 13TB, or 21TB with corresponding raw capacity of 55TB, 7TB, 9TB, 11TB, 13TB, 16TB, 23TB, 32TB, and 48TB, respectively. Appliances of any size can be mixed and matched in multiple different configurations, with up to ten servers combined into a single GRID configuration of up to 480TB raw capacity, allowing full backups of up to 210TB.
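As a rough illustration of how mixed appliance sizes add up in a GRID, the sketch below sums the full-backup size and raw capacity for a chosen set of servers. It covers the 2TB through 21TB models using the figures listed above; the names `RAW_TB_BY_FULL_TB` and `grid_totals` are invented for this example, and it assumes capacities simply add across servers.

```python
# Full-backup size (TB) -> raw capacity (TB), for the 2TB through 21TB models.
RAW_TB_BY_FULL_TB = {2: 7, 3: 9, 4: 11, 5: 13, 7: 16, 10: 23, 13: 32, 21: 48}
MAX_SERVERS = 10  # up to ten servers in a single GRID


def grid_totals(full_backup_sizes):
    """Return (total full-backup TB, total raw TB) for a list of appliance sizes."""
    if len(full_backup_sizes) > MAX_SERVERS:
        raise ValueError(f"a GRID holds at most {MAX_SERVERS} servers")
    total_full = sum(full_backup_sizes)
    total_raw = sum(RAW_TB_BY_FULL_TB[size] for size in full_backup_sizes)
    return total_full, total_raw


print(grid_totals([21] * 10))    # ten of the largest model
print(grid_totals([10, 10, 5]))  # a mixed three-server configuration
```

Ten of the largest appliances give the maximum configuration quoted above: 210TB of full backups and 480TB of raw capacity.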
Maintenance & Support
What happens if a disk drive fails?
The ExaGrid system is configured with RAID6 plus a hot spare. Each ExaGrid server can survive two simultaneous drive failures. The ExaGrid system will use the spare drive to start rebuilding immediately. As long as you are on yearly maintenance, all you have to do is notify ExaGrid and a new drive is sent to you (next business day). There is no charge. Simply take out the old drive and insert the new one. The drives are hot swappable so the system always stays in production.
What happens if a power supply fails?
ExaGrid ships with a redundant three-component power supply system. If a power supply module fails, simply call ExaGrid and the module will be sent the next business day. There is no charge. Simply slide out the bad module and slide in the new one. The modules are hot swappable so the system always stays in production. You must be on yearly maintenance for this service.
What happens if a server fails (motherboard failure)?
If you are on yearly maintenance, ExaGrid will send out a new server the next business day. There is no charge. You can take the power supply modules and the drives and slide them into the new server. When you turn on the server, everything will be there because all of the software configuration, data, etc. is on the drives.