Scalability Will Enable the ITS Group to Eliminate Tape
Barnes said that UCLA has deployed ExaGrid systems locally to handle primary backup and additional systems in its Berkeley datacenter for disaster recovery. Data is replicated automatically each night between the two locations. ExaGrid’s architecture will ensure that the systems can scale to handle increased backup requirements and will enable UCLA to create a network of backup units that all tie into a larger cluster for disaster recovery.
“Our grand plan is to help other departments with their backups and data deduplication by building a large cluster of ExaGrid units in Berkeley that they can connect into,” Barnes said. “We’re confident that we can easily add appliances to the system to increase capacity and performance over time.”
ExaGrid’s appliance models can be mixed and matched into a single scale-out system, allowing a full backup of up to 2.7PB with a combined ingest rate of 488TB/hr. The appliances automatically join the scale-out system. Each appliance includes the appropriate amount of processor, memory, disk, and bandwidth for the data size. Because compute is added along with capacity, the backup window remains fixed in length as the data grows. Automatic load balancing across all repositories allows for full utilization of all appliances. Data is deduplicated into an offline repository, and additionally, data is globally deduplicated across all repositories. UCLA is currently getting data deduplication ratios as high as 17:1, which helps to maximize the amount of data the University can store on the system. The technology also helps to make transmission between sites more efficient.
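The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, 1PB is assumed to equal 1,000TB, the 100TB of physical disk is a hypothetical figure, and 17:1 is the best-case ratio quoted above rather than a guaranteed average.

```python
def logical_capacity_tb(physical_tb: float, dedup_ratio: float) -> float:
    """Backup data (before deduplication) that fits on physical_tb of disk."""
    if physical_tb < 0 or dedup_ratio <= 0:
        raise ValueError("capacity must be >= 0 and ratio must be > 0")
    return physical_tb * dedup_ratio


def backup_window_hours(full_backup_tb: float, ingest_tb_per_hr: float) -> float:
    """Time to ingest one full backup at the given combined ingest rate."""
    if ingest_tb_per_hr <= 0:
        raise ValueError("ingest rate must be > 0")
    return full_backup_tb / ingest_tb_per_hr


if __name__ == "__main__":
    # Hypothetical 100TB of disk at the best-case 17:1 ratio.
    print(logical_capacity_tb(100, 17))
    # Maximum configuration quoted above: 2.7PB full backup at 488TB/hr.
    print(round(backup_window_hours(2700, 488), 2))
    # Doubling data and ingest rate together leaves the window unchanged.
    print(round(backup_window_hours(5400, 976), 2))
```

The last two calls show why adding compute with capacity keeps the backup window fixed: if the ingest rate grows in proportion to the data, the quotient does not change.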
“Our end goal is to eliminate tape campus-wide. The University of California system has a very high-speed Internet connection, and with the ExaGrid system, we send only changed data between systems, so transmission time is minimized,” he said. “I have quite a bit of bandwidth I can work with between here and Berkeley, but it’s not sensible to be sending the same data back and forth, and we don’t want to use all our bandwidth for replication.”
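The idea of sending only changed data can be illustrated with a generic hash-based sketch. This is not ExaGrid’s actual method, which the article does not describe; the fixed-size chunking, the tiny chunk size, and the names `replicate` and `restore` are all assumptions chosen to keep the example short.

```python
import hashlib
from typing import Dict, List, Tuple

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only for the demo


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def replicate(data: bytes, remote: Dict[str, bytes]) -> Tuple[List[str], int]:
    """Send only chunks the remote site lacks.

    Returns the recipe (ordered chunk hashes) and the bytes transmitted.
    """
    recipe: List[str] = []
    sent = 0
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in remote:
            remote[digest] = chunk  # only new chunks cross the link
            sent += len(chunk)
        recipe.append(digest)
    return recipe, sent


def restore(recipe: List[str], remote: Dict[str, bytes]) -> bytes:
    """Rebuild a backup at the remote site from its recipe."""
    return b"".join(remote[d] for d in recipe)


if __name__ == "__main__":
    remote_site: Dict[str, bytes] = {}
    night1, sent1 = replicate(b"AAAABBBBCCCC", remote_site)
    night2, sent2 = replicate(b"AAAABBBBDDDD", remote_site)
    print(sent1, sent2)  # the second night sends only the changed chunk
    assert restore(night2, remote_site) == b"AAAABBBBDDDD"
```

In this toy run the second night’s backup differs from the first in one chunk, so only that chunk is transmitted, yet both nights can still be restored in full at the remote site.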
ExaGrid writes backups directly to a disk-cache Landing Zone, avoiding inline processing and ensuring the highest possible backup performance, which results in the shortest backup window. Adaptive Deduplication performs deduplication and replication in parallel with backups for a strong recovery point objective (RPO). As data is being deduplicated to the repository, it can also be replicated to a second ExaGrid site or the public cloud for disaster recovery (DR).