Which Analytic Workloads Should Move to the Cloud First?
The number one reason that companies cite for moving to the cloud is cost savings. But if that’s true, why aren’t more companies moving all their analytic workloads to the cloud? The answers vary and, frankly, aren’t that great.
So, let’s talk turkey. Which analytic workloads should be moved to the cloud as soon as possible? The obvious ones are Test/Dev/QA—systems that aren’t running your business 24/7. Another one we love to see in the cloud is disaster recovery (DR). But the combination is the holy grail. If you host a Test/Dev/QA system in the cloud and make that your DR system in case of emergency, you’re gaining efficiency and taking full advantage of the elasticity and cost savings that come from using the cloud.
There are specific reasons why these scenarios work so well in the cloud. First, for Test/Dev/QA—or any non-production, not-always-on application—it starts with determining (1) what data you need or want these systems to access; (2) where that data needs to be located to take full advantage of cloud compute power; and (3) how you will get your data to that location.
As an example, let’s assume you want a subset of your production data accessible to these systems. You would start by moving the data you need into public cloud object storage such as Amazon Simple Storage Service (S3), Azure Blob Storage, or the equivalent offered by your public cloud provider of choice. Next, you build out the compute power you need to work against that data, move the data to the compute (or, in some cases, access the data directly from storage without moving it), and away you go. When you’re done running your Test/Dev/QA job, just shut the compute resources down. This is what the public cloud was originally built for.
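As a rough illustration of that lifecycle on AWS, here is a minimal Python sketch using boto3. The bucket name, AMI ID, and instance type are placeholders, and a real Test/Dev/QA environment would add networking, security, and error handling.

```python
# Minimal sketch of the "store in S3, rent compute only when needed" pattern.
# Bucket name, AMI ID, and instance type are placeholder assumptions.
import boto3

s3 = boto3.client("s3")
ec2 = boto3.client("ec2")

# 1. Land the subset of production data you need in low-cost object storage.
s3.upload_file("extract/customers_sample.csv",
               "my-analytics-testdev-bucket",
               "testdev/customers_sample.csv")

# 2. Spin up compute only when a Test/Dev/QA run is scheduled.
run = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI with your analytics stack
    InstanceType="m5.xlarge",
    MinCount=1,
    MaxCount=1,
)
instance_id = run["Instances"][0]["InstanceId"]

# ... run the Test/Dev/QA workload against the data in S3 ...

# 3. When the job is done, shut the compute down so you stop paying for it.
ec2.terminate_instances(InstanceIds=[instance_id])
```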
Although analytics seems to be one of the last use cases moving to the public cloud, the use case we just described should be the first thing you move there. You pay for the storage on a constant basis, but pay only for compute when you actually need it—and that’s a beautiful thing.
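To make that pay-for-compute-only-when-needed point concrete, here is a back-of-the-envelope comparison. The prices and usage figures are illustrative assumptions, not quoted cloud rates.

```python
# Back-of-the-envelope cost comparison: always-on compute vs. pay-per-use.
# All prices and usage numbers are illustrative assumptions.
STORAGE_PER_GB_MONTH = 0.023      # object storage, $/GB-month (assumed)
COMPUTE_PER_HOUR = 0.77           # one analytics-sized instance, $/hour (assumed)

data_gb = 2_000                   # 2 TB Test/Dev/QA data set
hours_used_per_month = 80         # compute only runs during test cycles

always_on = COMPUTE_PER_HOUR * 24 * 30 + STORAGE_PER_GB_MONTH * data_gb
on_demand = COMPUTE_PER_HOUR * hours_used_per_month + STORAGE_PER_GB_MONTH * data_gb

print(f"Always-on compute: ${always_on:,.2f}/month")
print(f"On-demand compute: ${on_demand:,.2f}/month")
```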
The DR use case can be very similar, though it depends on the Recovery Time Objective (RTO) that your company needs. The cloud may be awesome, but it still must adhere to the laws of physics. If you have data sitting in cloud storage and have a DR event, it will still take you some time to launch the compute and move the data from storage to the compute so it can be used. The more data you have, the longer that will take, so we recommend sitting down with cloud experts to work through the numbers. Assuming they work out well, you can have a DR strategy in the cloud that will cost you a lot less than an “always on” on-premises DR system.
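A quick way to sanity-check whether the numbers work out is to estimate the data-restore portion of your RTO. The data volume and throughput below are assumptions you would replace with your own figures.

```python
# Rough RTO estimate for a cold-start DR scenario: time to move data from
# object storage to freshly launched compute. Both inputs are assumptions.
data_tb = 50                        # data needed to resume operations
effective_throughput_gbps = 5       # assumed sustained restore throughput

data_gigabits = data_tb * 1000 * 8
restore_hours = data_gigabits / effective_throughput_gbps / 3600
print(f"Estimated restore time: {restore_hours:.1f} hours (plus compute launch time)")
```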
Finally, let’s think about what happens when we combine the two. Some companies keep their Test/Dev/QA systems running more often than others, or their required RTO for DR means the compute needs to be ready faster than the scenario above can accommodate. That’s when we might recommend a dual-purpose system. You’ll need to check with your software vendors to make sure their cloud deployments can accommodate this, but if they can, it’s a great way to save money and meet the business needs for both Test/Dev/QA and DR in the cloud.
In this scenario, we would use cloud storage that can be attached to compute and stay ready to connect at the snap of a finger: Amazon Elastic Block Store (EBS) or Azure Premium storage, for example. This storage costs more than S3 or Blob storage, but it lets compute be attached and scaled much faster. To illustrate, I’ll use AWS storage and compute options, but you can do the same thing on Azure and other public clouds.
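As a rough sketch of what “ready to connect” looks like on AWS, the following boto3 snippet provisions an EBS volume and attaches it to an instance. The availability zone, volume size, and instance ID are placeholders.

```python
# Minimal sketch: provision persistent block storage that stays ready to attach.
# Availability zone, size, device name, and instance ID are placeholder assumptions.
import boto3

ec2 = boto3.client("ec2")

# Create a gp3 EBS volume sized to hold the Test/Dev/QA (and DR) data set.
vol = ec2.create_volume(
    AvailabilityZone="us-west-2a",
    Size=500,                 # GiB
    VolumeType="gp3",
)
volume_id = vol["VolumeId"]

# Attach it to whichever instance is currently doing the work (the instance must
# be in the same availability zone). The volume, and the data on it, persists
# even after that instance is stopped.
ec2.attach_volume(
    VolumeId=volume_id,
    InstanceId="i-0123456789abcdef0",   # placeholder instance ID
    Device="/dev/sdf",
)
```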
First, spin up the EBS volume, then choose the smallest compute that will accommodate your Test/Dev/QA workloads and move your data into the EBS storage. To accommodate a potential DR scenario in the future, you’ll want to move all the data you’d need in a DR situation, and replicate any changed data at a rate that meets your organization’s needs. That rate could be once a week, once a day, or even several times a day. Again, at this point you are only paying for the storage of your data plus the compute you need for your Test/Dev/QA systems; you can turn the compute off whenever it isn’t being used for Test/Dev/QA or for moving data into storage. Then, if/when you have a DR event, you can scale out or scale up your compute resources to deliver what’s needed.
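Putting it together, a minimal boto3 sketch of the dual-purpose pattern might look like the following: run small for Test/Dev/QA, stop the instance when idle, and resize and restart it when a DR event hits. The instance ID and instance types are assumptions.

```python
# Sketch of the dual-purpose pattern: small instance for Test/Dev/QA, stopped
# when idle, resized to production scale on a DR event. IDs and types are assumed.
import boto3

ec2 = boto3.client("ec2")
instance_id = "i-0123456789abcdef0"     # instance with the EBS data volume attached

# Day to day: stop the instance when no Test/Dev/QA job or data load is running.
ec2.stop_instances(InstanceIds=[instance_id])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

# DR event: scale the stopped instance up to production-sized compute and start it.
# The attached EBS volume, with the replicated production data, comes back with it.
ec2.modify_instance_attribute(
    InstanceId=instance_id,
    InstanceType={"Value": "m5.12xlarge"},   # assumed production-sized type
)
ec2.start_instances(InstanceIds=[instance_id])
```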
These are the types of cloud use cases that not only save money, but also meet an organization’s business needs for the right compute at the best cost, when needed.
To learn more about analytics in the cloud, follow the conversation at #CloudExperts or #BuiltForTheCloud, or reach out to your Teradata account executive.
Marc is Teradata’s Director of Cloud Strategy and the company’s resident Cloud Evangelist. Prior to Teradata, Marc worked at cloud startups and as a cloud consultant. He is a veteran cloud strategist and speaker who has participated in countless webinars, panels, and seminars as a featured expert on all things cloud. He is the committee chair for a cloud conference in San Diego and has nurtured its growth into an important regional event. Marc’s style is to challenge preconceived notions about cloud services and engage his audience with humor, insight, and uncommon sense. He has worked successfully with multiple cloud start-ups, VARs, and MSPs to help them build their cloud services.