
Solution architecture for a worldwide backup system for VM-based systems, including planning around network routing limitations in AWS, Google Cloud, and Azure.

Case Study: Designing a Global Backup Solution for BMW’s Virtual Machines and File Sharing Tools

Client: BMW / HPE

Project Overview:

BMW required a robust and scalable backup solution for their extensive and globally distributed virtual machine (VM) systems and file sharing tools. The primary challenge was to identify a backup solution capable of handling the massive data volume, amounting to hundreds of petabytes, while ensuring data reliability, efficient deduplication, and overcoming bandwidth limitations.

Objective:

To research, compare, and implement a comprehensive backup strategy that leverages both cloud storage providers and on-premises HPE deduplication servers, ensuring cost-efficiency and high reliability.

Solution Design Process:

Requirement Gathering:

  • Conducted detailed discussions with BMW’s IT team to understand the specific needs and constraints regarding data backup.
  • Identified critical parameters like data volume, bandwidth limitations, disk write speeds, and the importance of data deduplication.

Research and Analysis:

  • Performed extensive research on various cloud storage providers, evaluating their bandwidth limits, disk write speeds, and scalability.
  • Analyzed the cost implications of using different cloud providers for such massive data volumes.
  • Investigated HPE’s deduplication technology and its potential to integrate with both cloud and on-premises storage solutions.

Comparison of Cloud Providers:

Compared leading cloud storage providers (e.g., AWS, Google Cloud, Microsoft Azure) focusing on:

  • Bandwidth limits
  • Disk write speeds
  • Cost per petabyte of storage
  • Data redundancy and reliability features

Assessed the feasibility of these providers to handle BMW’s backup needs effectively.

Hybrid Backup Solution Development:

  • Designed a hybrid backup strategy combining multiple cloud providers to provide cross-provider redundancy and avoid single points of failure.
  • Incorporated HPE deduplication servers to significantly reduce the data volume needing backup, addressing the bandwidth limitations.
  • Ensured that the most critical data was backed up in a deduplicated manner to further enhance cost-efficiency and data protection.

Implementation:

Integration:

  • Implemented the hybrid backup solution with seamless integration between on-premises HPE deduplication servers and selected cloud storage providers.
  • Developed a systematic backup schedule to optimize bandwidth usage and prevent bottlenecks; a sketch of such a staggered schedule follows below.
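
As an illustration only: the crontab below sketches how regional backup windows could be staggered so transfers to headquarters never compete for the same WAN link. The region names and the backup-site.sh script are hypothetical placeholders, not part of the actual setup.

```sh
# Hypothetical staggered schedule: one backup window per region,
# offset so the shared WAN link is never saturated by two regions at once.
0 1 * * * /usr/local/bin/backup-site.sh emea   # 01:00 UTC
0 5 * * * /usr/local/bin/backup-site.sh apac   # 05:00 UTC
0 9 * * * /usr/local/bin/backup-site.sh amer   # 09:00 UTC
```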

Testing and Validation:

  • Conducted rigorous testing to verify data integrity, backup speeds, and the efficiency of the deduplication process.
  • Validated the reliability of the solution through multiple failover tests and data recovery drills.

Optimization:

  • Continuously monitored the backup processes to identify and rectify any inefficiencies.
  • Fine-tuned the balance between on-premises and cloud storage to maximize cost savings without compromising on data security and accessibility.

Results:

  • Cost Efficiency: Demonstrated that a self-hosted, deduplicated backup setup can be significantly more cost-effective than relying solely on cloud storage providers, especially for large-scale data volumes.
  • Data Reliability: Achieved high data reliability through a combination of multi-cloud redundancy and HPE’s robust deduplication technology.
  • Bandwidth Management: Successfully mitigated bandwidth limitations by utilizing HPE servers to reduce the data volume needing transfer to cloud storage.
  • Scalability: Ensured the solution could scale with BMW’s growing data needs, providing a sustainable long-term backup strategy.

Conclusion:

The project culminated in a highly efficient, scalable, and cost-effective backup solution for BMW’s worldwide VM systems and file sharing tools. By leveraging a hybrid approach with both cloud storage and on-premises HPE deduplication servers, we not only met but exceeded the client’s expectations, ensuring data reliability and significant cost savings.

Unlock unparalleled data reliability and cost savings with our hybrid backup solutions—contact us now to transform your data management strategy!

Full transcript

Everyone knows that backups are important, but rarely does anyone have control over every step of the process. What I mean is that the typical scenario would be to back up a virtual machine daily and just save the whole package somewhere in the cloud, without worrying about the exact difference between snapshots, without knowing what has been backed up, and so on. This is something BMW faced with their global backup strategy, because they had thousands of virtual machines all over the world, but they had to back up different states of them, while also deciding between sensitive and non-sensitive data. And just imagine you have thousands of virtual machines all over the world. What do you do? Do you back up each whole virtual machine, 50 to 100 gigabytes per machine, and ship it back to Germany to their headquarters? As you can imagine, this is not possible at all, especially transferring such a huge amount of data between continents and countries.
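
To see why, here is a quick back-of-the-envelope calculation; the VM count and link speed are illustrative assumptions, not BMW's actual figures:

```sh
# Illustrative assumption: 10,000 VMs x 100 GB each = ~1 PB of full backups
# per day, shipped over a dedicated 10 Gbit/s (~1.25 GB/s) link.
echo "scale=1; 10000 * 100 / 1.25 / 3600 / 24" | bc
# => 9.2  (days to move a single day's worth of full backups)
```

In other words, one daily backup cycle would take over a week to transfer, so full-VM backups to headquarters are ruled out by arithmetic alone.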

Therefore, a strategy was implemented to take only the difference between days, an approach often referred to as deduplication: take the difference between yesterday and today, package it into a compressed format, and transfer only that delta to Germany. And if you want to be more specific, you could even separate sensitive data from non-sensitive data and, for example, store the non-sensitive data at the backup location or in another data center, and transfer only the sensitive, personalized data back to Germany, to their headquarters.
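
One minimal way to express this "ship only the compressed daily difference" idea with standard tools is GNU tar's incremental mode; this is a generic sketch, not the HPE mechanism used in the project, and the paths and host are hypothetical:

```sh
# vm.snar records file state between runs, so each daily archive contains
# only what changed since yesterday, already gzip-compressed.
tar --listed-incremental=/var/backups/vm.snar \
    -czf /tmp/vm-delta-$(date +%F).tar.gz /srv/vm-data

# Transfer only the small delta to headquarters (hypothetical host).
scp /tmp/vm-delta-$(date +%F).tar.gz backup@hq.example.com:/backups/
```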

And the way we did it in this example was with HPE, because they had a special server that you can install in the data center, which takes care of the connection to the virtual machines, the storage providers, and the deduplication. But the way you would do this in an open-source environment would be, for example, with a tool called restic, because restic can compute the difference between a folder from yesterday and today and keep, for example, seven daily backups; once a backup is eight days old, it deletes it. On top of that you can keep, for example, one monthly snapshot and one yearly snapshot. There are also several ways to monitor the backup process with restic, which is an important part as well, because you do not want to forget about deleting old snapshots; otherwise your backups would just pile up and pile up and add up to a huge amount of data. And like I said, you could then further split between data that is sensitive for the business process and general data.
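
Here is a minimal sketch of that retention setup with restic; the repository endpoint, bucket, and credential values are placeholders, but the commands and flags are restic's real ones:

```sh
# Point restic at an S3-compatible repository (hypothetical endpoint/bucket).
export RESTIC_REPOSITORY=s3:https://minio.example.com/vm-backups
export RESTIC_PASSWORD_FILE=/etc/restic/password
export AWS_ACCESS_KEY_ID=backup-user           # placeholder credentials
export AWS_SECRET_ACCESS_KEY=backup-secret

restic init                  # run once to create the repository
restic backup /srv/vm-data   # deduplicated snapshot: only changed blocks upload

# Keep 7 daily, 1 monthly, and 1 yearly snapshot; forget and prune the rest.
restic forget --keep-daily 7 --keep-monthly 1 --keep-yearly 1 --prune

# Monitoring: list existing snapshots and verify repository integrity.
restic snapshots
restic check
```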

And for example, save the general data that is not too critical in a single-zone data center, for which we could use MinIO, an open-source S3-compatible object storage solution, and save the sensitive data in MinIO as well, but replicated across three different data centers, which basically means that we store each part of the sensitive data in three different locations for disaster recovery; a sketch of this setup follows below. And this is what it all boils down to: if we have a good data recovery strategy, we avoid costly accidents and costly restorations and have the data at hand if something happens. That is something companies discover way too late, once the accident has already happened or because they did not monitor their backup process. In that case, data worth millions may be lost and has to be rebuilt from the ground up. Therefore, a good backup strategy is very important, and especially monitoring what and when you are backing up. You can find some other tutorials, especially about restic and the MinIO setup I mentioned, on my website. Have a look around, or if you want, we can have a small discussion about it and evaluate whether you have an accident-proof backup strategy. All right, see you around.
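
Following up on the three-data-center setup mentioned above, here is a sketch using MinIO's site replication via the mc client; all endpoints, aliases, credentials, and bucket names are hypothetical:

```sh
# General, non-sensitive data: a single standalone MinIO deployment.
mc alias set single https://minio-backup.example.com ACCESS_KEY SECRET_KEY
mc mb single/general-data

# Sensitive data: three MinIO deployments joined by site replication, so
# every object ends up stored in all three data centers.
mc alias set dc1 https://minio-dc1.example.com ACCESS_KEY SECRET_KEY
mc alias set dc2 https://minio-dc2.example.com ACCESS_KEY SECRET_KEY
mc alias set dc3 https://minio-dc3.example.com ACCESS_KEY SECRET_KEY
mc admin replicate add dc1 dc2 dc3

mc mb dc1/sensitive-data   # created on dc1, replicated to dc2 and dc3
```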
