Best Practices for SAP High Availability in AWS
By Brett Barwick, Senior Software Engineer, SIOS Technology
Any organization relying on an SAP ERP solution has, implicitly, an interest in control. That’s not a bad thing: you rely on SAP because the world in which you operate is complex. Your reputation and your own expectations demand that you deliver your products and engage with your customers in a predictable manner. But control also extends to the data center in which your SAP landscape has been running. Your IT organization may be comfortable ensuring the availability of that landscape when it is on-premises, in a data center it controls. At the same time, it may see very good reasons for that landscape to run in the AWS cloud.
The question is, how can an SAP landscape be configured in the AWS cloud to ensure the high availability (HA) you expect? On-premises, IT knows what to do. But in the AWS cloud?
That’s a question we can address today.
Ensuring HA for SAP in AWS
The key components of an SAP ERP system that one would deploy on-premises — at the presentation, application, and database layers — are also deployed in the AWS cloud. What differs from an HA perspective is how they are deployed. There are single points of failure that threaten the high availability of the solution. In the presentation layer, the SAP Apache content management system is one; in the application layer, the file system manager is another, as are certain core SAP services, such as the message server and the enqueue server. Should any of these services fail — regardless of whether that is due to human error or a hardware failure in an AWS Availability Zone (AZ) — the availability of your entire landscape is threatened.
Best practices exist to ensure operational continuity of your SAP landscape in AWS. To begin with, you should build out your HA SAP landscape across multiple AWS AZs. This ensures that if the virtual machines (VMs) or the underlying servers in one AZ go dark, the infrastructure in another AZ can be called into service immediately. For full disaster recovery you should consider creating a shadow landscape in a geographically distinct AWS region. This provides a hedge against a catastrophic failure that could take down multiple AZs in a single region.
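One way to make the multi-AZ practice concrete is to audit a landscape inventory for roles that live in only one AZ. The following is a minimal sketch in Python; the node names, role labels, and AZ names are hypothetical, and a real inventory would come from your own tooling rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical inventory: (node name, SAP role, Availability Zone).
LANDSCAPE = [
    ("msg-1", "message server", "us-east-1a"),
    ("msg-2", "message server", "us-east-1b"),
    ("enq-1", "enqueue server", "us-east-1a"),
    ("enq-2", "enqueue server", "us-east-1b"),
    ("db-1", "database", "us-east-1a"),
    ("db-2", "database", "us-east-1b"),
    ("nfs-1", "file share", "us-east-1a"),
]


def single_az_roles(nodes):
    """Return the roles whose nodes all sit in one AZ (a single point of failure)."""
    zones_by_role = defaultdict(set)
    for _name, role, zone in nodes:
        zones_by_role[role].add(zone)
    return sorted(role for role, zones in zones_by_role.items() if len(zones) < 2)


if __name__ == "__main__":
    for role in single_az_roles(LANDSCAPE):
        print(f"WARNING: '{role}' runs in only one AZ")
```

In this sample inventory only the file share is flagged, because every other role has a counterpart in a second AZ.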
You’ll treat the network resources connecting the components of your landscape differently as well. Within and among AZs in an AWS region there is ample bandwidth to move information among the nodes of your SAP landscape without delay. The environment is likely more complex than anything your IT team manages on-premises, though, which is why a certain amount of abstraction is advantageous. Best practices involve the use of floating IP addresses and cloud quorum/witness features so that underlying resources can be swapped in and out transparently as availability demands dictate.
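The interplay between a witness and a floating IP address can be sketched as a small decision function. This is an illustrative simulation only, assuming a two-node cluster plus one witness (three votes in total); the `NodeView` type and `choose_ip_holder` function are hypothetical names, and the actual reassignment of the address in AWS is left out.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """What one cluster node currently observes (hypothetical structure)."""
    name: str
    healthy: bool
    sees_peer: bool
    sees_witness: bool


def has_quorum(votes_seen, total_votes):
    """A node may act only if it sees a strict majority of all votes."""
    return votes_seen > total_votes // 2


def votes(view):
    # A node's own vote, plus one each for a reachable peer and witness.
    return 1 + int(view.sees_peer) + int(view.sees_witness)


def choose_ip_holder(current_holder, views, total_votes=3):
    """Return the node that should hold the floating IP, or None if no node may."""
    eligible = [v for v in views if v.healthy and has_quorum(votes(v), total_votes)]
    for view in eligible:
        if view.name == current_holder:
            return current_holder  # holder is still fine, so nothing moves
    return eligible[0].name if eligible else None


if __name__ == "__main__":
    # Network split: node-a is isolated, node-b still reaches the witness.
    split = [
        NodeView("node-a", healthy=True, sees_peer=False, sees_witness=False),
        NodeView("node-b", healthy=True, sees_peer=False, sees_witness=True),
    ]
    print(choose_ip_holder("node-a", split))
```

The witness breaks the tie during a network split: the isolated node holds only its own vote and stands down, so two nodes never claim the address at the same time.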
Finally, the manner in which data is stored in AWS must be considered and proactively addressed. You can’t use shared storage resources in the cloud in the same way you can on-premises, meaning you’ll need to anticipate how and where your data will be stored. Again, distributing storage across multiple AZs helps ensure HA, but you’ll also have to consider how that data is replicated among the distributed storage resources, and how you’ll track, manage, and protect important mount points for partitions, logical volumes, and NFS exports.
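Tracking mount points can start with something as simple as comparing a list of required mounts against what a node actually reports. The sketch below parses text in the Linux `/proc/mounts` format; the required paths, device names, and the `missing_mounts` helper are assumptions made for illustration, not a prescribed SAP layout.

```python
# Hypothetical requirement list: mount point -> expected filesystem type prefix.
REQUIRED_MOUNTS = {
    "/sapmnt": "nfs",
    "/usr/sap/trans": "nfs",
    "/db/data": "xfs",
}

# Sample text in /proc/mounts format: device, mount point, type, options, dump, pass.
SAMPLE_MOUNTS = """\
10.0.1.20:/export/sapmnt /sapmnt nfs4 rw,relatime 0 0
/dev/nvme1n1 /db/data xfs rw,noatime 0 0
"""


def parse_mounts(mounts_text):
    """Map each mount point to its filesystem type."""
    table = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            table[fields[1]] = fields[2]
    return table


def missing_mounts(required, mounts_text):
    """List required mount points that are absent or have an unexpected type."""
    mounted = parse_mounts(mounts_text)
    problems = []
    for point, fstype in sorted(required.items()):
        actual = mounted.get(point)
        if actual is None:
            problems.append(f"{point}: not mounted")
        elif not actual.startswith(fstype):
            problems.append(f"{point}: expected {fstype}, found {actual}")
    return problems


if __name__ == "__main__":
    for problem in missing_mounts(REQUIRED_MOUNTS, SAMPLE_MOUNTS):
        print(problem)
```

On a live Linux node the sample text could be replaced with the contents of `/proc/mounts`.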
Intelligent Management of Cloud Components
While distributing the components of your SAP landscape among different AWS AZs is critical for HA, so too is the deployment of a landscape monitoring and management solution. You need a solution that can monitor the health of each of these components, immediately identify any component that is not performing properly, and orchestrate the process of repairing the problem as quickly as possible.
The intelligence of such a solution is key, because different issues affect different components of an SAP landscape in different ways and no single remedy is universal. Some availability problems may be solved almost instantaneously by restarting a process on a specific VM; others may require a VM in another AZ to go online, and that may require the reassignment of IP addresses and the restarting of several SAP services in a specific sequence. Monitoring and management tools that are SAP-aware are key to an appropriate, orderly, and expeditious response.
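The escalation described above, a local restart first and a cross-AZ failover only when needed, can be sketched as a plan builder. The `START_ORDER` list, the restart limit, and the `recovery_plan` function are hypothetical; the correct start sequence for a real landscape depends on its own dependencies.

```python
# Hypothetical start order for services on the standby VM (illustrative only).
START_ORDER = ["file share", "enqueue server", "message server", "application server"]


def recovery_plan(failed_service, local_restarts_tried, vm_healthy, max_local_restarts=2):
    """Return an ordered list of recovery steps for a failed service."""
    # Cheapest remedy first: restart the process where it already runs.
    if vm_healthy and local_restarts_tried < max_local_restarts:
        return [f"restart {failed_service} on current VM"]
    # Otherwise escalate: bring up a VM elsewhere, move the address, then
    # start the services in their required sequence.
    steps = [
        "start standby VM in another AZ",
        "reassign floating IP to standby VM",
    ]
    steps += [f"start {service} on standby VM" for service in START_ORDER]
    return steps


if __name__ == "__main__":
    for step in recovery_plan("message server", local_restarts_tried=2, vm_healthy=True):
        print(step)
```

Returning the plan as data, rather than executing it directly, keeps the decision logic easy to test separately from whatever tooling carries out each step.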
In particular, you’ll want to look for monitoring and HA management tools that look at an SAP landscape across multiple dimensions. Tools that simply listen for a heartbeat from the SAP primary application server (PAS), for example, and then initiate a failover to a secondary server in the absence of a detected heartbeat are too heavy-handed. A suite of SAP-aware tools that can monitor the performance of critical processes on individual landscape components can provide a broader range of appropriate actions. With that kind of intelligence, lower-level issues that might eventually compromise a component in the cloud can be detected and automatically addressed before they cause serious problems.
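To illustrate the difference, the sketch below contrasts a heartbeat-only check with a check that also looks at individual processes. The `ProcessSample` type, the 500 ms threshold, and the sample values are assumptions chosen for the example, not measurements or vendor defaults.

```python
from dataclasses import dataclass


@dataclass
class ProcessSample:
    """One observation of a monitored process (hypothetical structure)."""
    name: str
    running: bool
    response_ms: float


def heartbeat_only(host_alive):
    """Coarse check: the only possible outcomes are 'ok' or a full failover."""
    return "ok" if host_alive else "failover"


def process_aware(host_alive, samples, slow_ms=500.0):
    """Finer check: restart or flag individual processes before failing over."""
    if not host_alive:
        return ["failover"]
    actions = []
    for sample in samples:
        if not sample.running:
            actions.append(f"restart {sample.name}")
        elif sample.response_ms > slow_ms:
            actions.append(f"warn: {sample.name} slow ({sample.response_ms:.0f} ms)")
    return actions or ["ok"]


if __name__ == "__main__":
    samples = [
        ProcessSample("message server", True, 40.0),
        ProcessSample("enqueue server", False, 0.0),
        ProcessSample("file share", True, 900.0),
    ]
    print(heartbeat_only(True))
    print(process_aware(True, samples))
```

With these samples the host still answers its heartbeat, so the coarse check reports nothing wrong, while the process-level check surfaces a stopped enqueue server and a slow file share.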
At the same time, this suite of tools must also be AWS-aware. They must be able to translate the appropriate responses to issues detected within components of the landscape to actions appropriate to AWS. If the appropriate response is to restart a particular component within the SAP landscape on a new VM, the tools must be able to orchestrate that restart in the appropriate AZ, make the relevant IP address changes, and so on. If an error causes a database in one AZ to go offline, an intelligent monitoring and management solution orchestrates the replication of data to secondary instances of that database and manages all aspects of bringing that secondary database online seamlessly to ensure ongoing availability of the SAP landscape.
For IT departments accustomed to ensuring the availability of an on-premises SAP landscape, the cloud offers many advantages and raises many questions. But these questions have answers. It is possible to run your SAP landscape in the cloud and achieve the level of HA that you’ve been accustomed to in your on-premises deployment. It requires planning, and it takes an appreciation of the core technical differences between an on-premises environment and a cloud environment like AWS. Add to that a suite of intelligent, SAP- and AWS-aware monitoring and management tools, and your enterprise will be in good hands going forward.