What Will SysAdmins Do in the Automated Cloud Future?

Nobody would dispute that system administrators have been integral to keeping IT environments running. But that hasn’t stopped people from wondering whether sysadmins will still have a role in a future world of highly automated clouds.

They will, and it will be just as critical. But that role will also be very different.

Today, sysadmins are all about the VM. They’re akin to workers on a manufacturer’s production line. Sometimes they’re at the beginning of the line, figuring out where to place the VM and what services to connect to it, and then handing it off to developers who will add applications inside of it. Sometimes they’re at the end of the process, deploying the VM. Many times they’re manning the station, ready to add memory to fix poor performance or move a VM when a server fails.

But increasingly, advanced analytics engines are able to identify infrastructure anomalies and recommend remediation steps, and automation tools can put many of them into action. So what does that mean for sysadmins?

Instead of focusing on discrete tasks and spending large amounts of time on daily firefighting, sysadmins will be more strategic, like pilots overseeing entire operations.

Even though airplanes are capable of getting from point A to point B on their own thanks to intelligent systems and automation, pilots still man the cockpit. Their expertise is required to oversee and, often, adjust the incredibly intricate and interdependent systems that keep planes flying. Pilots are the ones entrusted with getting passengers to destinations safely.

So, too, with sysadmins. The cloud, along with the notion of software-defined data centers, has added orders of magnitude more complexity to IT environments. Instead of an application running on one VM, it may run on dozens of VMs, each of which has storage, load balancing, database and other services attached to it. Instead of the VM being the container, the application becomes the container. And each container is a system with many interdependent parts and services.

The sysadmin’s new role is to optimize and manage those systems. But unlike the way sysadmins have been managing VMs, they can’t hand-hold each of these complex systems. They’d run out of hours in a day before even scratching the surface. Rather, now that products such as VMware vCO, vCAC and App Director are being combined into a single automation stack, deeply integrated with vCenter Operations Management, and working together with SRM, vSAN and NSX, software can automatically handle many of the daily tasks. When more complex, critical problems arise, they’ll be flagged for the sysadmins, who will pull from their broad knowledge and nuanced understanding of storage, networks, applications and more, to triage and resolve them.

The sysadmins’ responsibilities won’t end there. By working at this higher level, they will be able to influence those systems in ways that help businesses operate more efficiently, cost-effectively and competitively. And that’s where their real value lies. Sysadmins of the future will be planners and problem solvers who leverage automated cloud environments and their advanced analytics capabilities. Like pilots, they’ll ensure the IT systems that businesses rely on can take the enterprise where it needs to go.

By Leslie Muller

Business Continuity (BC) vs Disaster Recovery (DR) in VMware Site Recovery Manager (SRM) Design – (RPO, RTO, WRT, MTD)

Business Continuity vs Disaster Recovery

DR : – we hoped it would never happen, but it has…
       – get the business running again ASAP
       – it is a tactical and technical effort
BC : – a C-level executive concern
       – who, what, where, and when is needed
       – not simply technical; the whole business needs to be considered

RPO, RTO, WRT, MTD (Recovery Point Objective, Recovery Time Objective, Work Recovery Time, Maximum Tolerable Downtime)

This is a simple explanation of RPO and RTO, along with WRT and MTD, because few customers understand these terms completely. Even so, we need to discuss these criteria when designing Disaster Recovery, especially if we want to implement VMware SRM (Site Recovery Manager).


Consider the following scenario.

Stage 1: Business as usual

At this stage all systems are running production and working correctly.

Stage 2: Disaster occurs


At a given point in time, a disaster occurs and systems need to be recovered. At this point the Recovery Point Objective (RPO) determines the maximum acceptable amount of data loss, measured in time. For example, the maximum tolerable data loss is 15 minutes.
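As a rough illustration of how an RPO is checked, the sketch below compares the age of the last usable recovery point against the 15-minute example above. The function names and the timestamps are hypothetical, chosen only for this example; they are not part of VMware SRM or any other product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point: datetime, disaster_time: datetime) -> timedelta:
    """Return the span of data that would be lost: disaster time minus last recovery point."""
    if last_recovery_point > disaster_time:
        raise ValueError("recovery point cannot be later than the disaster")
    return disaster_time - last_recovery_point


def meets_rpo(last_recovery_point: datetime, disaster_time: datetime, rpo: timedelta) -> bool:
    """True when the data loss window does not exceed the RPO."""
    return data_loss_window(last_recovery_point, disaster_time) <= rpo


# Illustrative values only (assumptions, not measurements).
rpo = timedelta(minutes=15)
disaster = datetime(2024, 1, 1, 12, 0)

print(meets_rpo(datetime(2024, 1, 1, 11, 50), disaster, rpo))  # True: 10 minutes of data lost
print(meets_rpo(datetime(2024, 1, 1, 11, 40), disaster, rpo))  # False: 20 minutes of data lost
```

In practice the last recovery point would come from whatever replication or backup mechanism is in use; here it is simply passed in by hand.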

Stage 3: Recovery


At this stage the systems are recovered and back online but not yet ready for production. The Recovery Time Objective (RTO) determines the maximum tolerable amount of time needed to bring all critical systems back online. This covers, for example, restoring data from backup or fixing a failure. In most cases this part is carried out by system administrators, network administrators, storage administrators, etc.

Stage 4: Resume Production


At this stage all systems are recovered, the integrity of the systems and data is verified, and all critical systems can resume normal operations. The Work Recovery Time (WRT) determines the maximum tolerable amount of time needed to verify system and/or data integrity. This could be, for example, checking the databases and logs, and making sure the applications or services are running and available. In most cases those tasks are performed by application administrators, database administrators, etc. When all systems affected by the disaster are verified and/or recovered, the environment is ready to resume production.


The sum of RTO and WRT is defined as the Maximum Tolerable Downtime (MTD), which is the total amount of time that a business process can be disrupted without causing unacceptable consequences. This value should be defined by the business management team or by someone such as the CTO, CIO or IT manager.
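The relationship MTD = RTO + WRT can be sketched in a few lines. The 4-hour RTO and 2-hour WRT below are assumed values for illustration only, and the helper names are hypothetical.

```python
from datetime import timedelta


def maximum_tolerable_downtime(rto: timedelta, wrt: timedelta) -> timedelta:
    """MTD is the sum of the Recovery Time Objective and the Work Recovery Time."""
    return rto + wrt


def within_mtd(recovery_time: timedelta, verification_time: timedelta, mtd: timedelta) -> bool:
    """True when actual recovery plus actual verification fits inside the MTD."""
    return recovery_time + verification_time <= mtd


# Assumed objectives, for illustration only.
rto = timedelta(hours=4)
wrt = timedelta(hours=2)
mtd = maximum_tolerable_downtime(rto, wrt)
print(mtd)  # 6:00:00

# A recovery that took 3.5 hours plus 2 hours of verification stays within the MTD.
print(within_mtd(timedelta(hours=3, minutes=30), timedelta(hours=2), mtd))  # True
```

Because the business usually fixes the MTD first, the same arithmetic can be read the other way: once MTD and WRT are agreed, the remaining time is the budget available for RTO.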

The scenario above is, of course, a simple example of a Business Continuity/Disaster Recovery plan, and these values should be included in your Business Impact Analysis (BIA).

Referenced from: http://defaultreasoning.com/2013/12/10/rpo-rto-wrt-mtdwth/