Business Continuity (BC) vs Disaster Recovery (DR) in VMware Site Recovery Manager (SRM) Design – (RPO, RTO, WRT, MTD)

Business Continuity vs Disaster Recovery

DR : – we hoped it would never happen, but it has…
       – get the business running again ASAP
       – it is a tactical and technical effort
BC : – driven by C-level executives
       – who, what, where, and when is needed
       – not simply technical; the whole business needs to be considered

RPO, RTO, WRT, MTD (Recovery Point Objective, Recovery Time Objective, Work Recovery Time, Maximum Tolerable Downtime)

This is a simple explanation of RPO and RTO, and also of WRT and MTD, because few customers understand these terms completely. We nevertheless need to discuss these criteria when designing Disaster Recovery, especially if we want to implement VMware SRM (Site Recovery Manager).

 

Consider the following scenario.

Stage 1: Business as usual

At this stage all systems are in production and working correctly.

Stage 2: Disaster occurs


At a given point in time, a disaster occurs and systems need to be recovered. At this point the Recovery Point Objective (RPO) determines the maximum acceptable amount of data loss, measured in time. For example, the maximum tolerable data loss is 15 minutes.
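
To make the RPO idea concrete, here is a minimal Python sketch. It assumes we know the timestamp of the last usable recovery point (for example the last completed replication or backup) and the time of the disaster; the function names and the sample timestamps are hypothetical and only for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_recovery_point: datetime, disaster_time: datetime) -> timedelta:
    """Amount of recent data (measured in time) lost when recovering to the last copy."""
    if last_recovery_point > disaster_time:
        raise ValueError("the recovery point cannot be later than the disaster")
    return disaster_time - last_recovery_point


def meets_rpo(last_recovery_point: datetime, disaster_time: datetime, rpo: timedelta) -> bool:
    """True when the data loss stays within the agreed RPO."""
    return data_loss_window(last_recovery_point, disaster_time) <= rpo


# Hypothetical values: RPO of 15 minutes, last copy taken 10 minutes before the disaster.
rpo = timedelta(minutes=15)
disaster = datetime(2024, 1, 1, 12, 0)
last_copy = datetime(2024, 1, 1, 11, 50)
print(meets_rpo(last_copy, disaster, rpo))  # True
```

In practice this means the replication or backup interval has to be shorter than the RPO, otherwise the objective cannot be met no matter how fast the recovery is.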

Stage 3: Recovery


At this stage the systems are recovered and back online but not yet ready for production. The Recovery Time Objective (RTO) determines the maximum tolerable amount of time needed to bring all critical systems back online. This covers, for example, restoring data from backup or fixing a failure. In most cases this part is carried out by the system administrator, network administrator, storage administrator, and so on.

Stage 4: Resume Production


At this stage all systems are recovered, the integrity of the systems and data is verified, and all critical systems can resume normal operations. The Work Recovery Time (WRT) determines the maximum tolerable amount of time needed to verify system and/or data integrity. This could be, for example, checking the databases and logs and making sure the applications or services are running and available. In most cases these tasks are performed by the application administrator, database administrator, and so on. When all systems affected by the disaster are verified and/or recovered, the environment is ready to resume production.


The sum of RTO and WRT is defined as the Maximum Tolerable Downtime (MTD), which is the total amount of time that a business process can be disrupted without causing unacceptable consequences. This value should be defined by the business management team or by a role such as the CTO, CIO, or IT manager.
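
The relationship MTD = RTO + WRT can be sketched in a few lines of Python. The durations below are hypothetical examples, not recommendations; the function names are my own and are only for illustration.

```python
from datetime import timedelta


def maximum_tolerable_downtime(rto: timedelta, wrt: timedelta) -> timedelta:
    """MTD is the sum of the Recovery Time Objective and the Work Recovery Time."""
    return rto + wrt


def within_mtd(recovery: timedelta, verification: timedelta,
               rto: timedelta, wrt: timedelta) -> bool:
    """True when the actual recovery plus verification time fits inside the MTD."""
    return recovery + verification <= maximum_tolerable_downtime(rto, wrt)


# Hypothetical objectives: 4 hours to bring systems online, 2 hours to verify them.
rto = timedelta(hours=4)
wrt = timedelta(hours=2)
print(maximum_tolerable_downtime(rto, wrt))  # 6:00:00

# A DR test that took 3.5 hours to recover and 3 hours to verify exceeds the MTD.
print(within_mtd(timedelta(hours=3, minutes=30), timedelta(hours=3), rto, wrt))  # False
```

Comparing the timings measured in a DR test against these numbers is a simple way to see whether the plan meets what the business agreed to.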

This is of course a simple example of a Business Continuity/Disaster Recovery plan and should be included in your Business Impact Analysis (BIA).

Referenced from: http://defaultreasoning.com/2013/12/10/rpo-rto-wrt-mtdwth/

Review: Puppet vs. Chef vs. Ansible vs. Salt

Once again, I am taking this article from another website (http://www.infoworld.com/d/data-center/review-puppet-vs-chef-vs-ansible-vs-salt-231308). It is a very good article that I would like to remember, which is why I am re-posting it on my blog.


The leading configuration management and orchestration tools take different paths to server automation

 

The proliferation of virtualization coupled with the increasing power of industry-standard servers and the availability of cloud computing has led to a significant uptick in the number of servers that need to be managed within and without an organization. Where we once made do with racks of physical servers that we could access in the data center down the hall, we now have to manage many more servers that could be spread all over the globe.

This is where data center orchestration and configuration management tools come into play. In many cases, we’re managing groups of identical servers, running identical applications and services. They’re deployed on virtualization frameworks within the organization, or they’re running as cloud or hosted instances in remote data centers. In some cases, we may be talking about large installations that exist only to support very large applications or large installations that support myriad smaller services. In either case, the ability to wave a wand and cause them all to bend to the will of the admin cannot be discounted. It’s the only way to manage these large and growing infrastructures.


Puppet, Chef, Ansible, and Salt were all built with that very goal in mind: to make it much easier to configure and maintain dozens, hundreds, or even thousands of servers. That’s not to say that smaller shops won’t benefit from these tools, as automation and orchestration generally make life easier in an infrastructure of any size.

I looked at each of these four tools in depth, explored their design and function, and determined that, while some scored higher than others, there’s a place for each to fit in, depending on the goals of the deployment. Here, I summarize my findings.

Puppet Enterprise
Puppet arguably enjoys the biggest mind share of the four. It’s the most complete in terms of available actions, modules, and user interfaces. Puppet represents the whole picture of data center orchestration, encompassing just about every operating system and offering deep tools for the main OSes. Initial setup is relatively simple, requiring the installation of a master server and client agents on each system that is to be managed.

From there, the CLI (command-line interface) is straightforward, allowing module downloads and installation via the puppet command. Then, changes to the configuration files are required to tailor the module for the required task, and the clients that should receive the instructions will do so when they check in with the master or via a push that will trigger the modifications immediately.

There are also modules that can provision and configure cloud server instances and virtual server instances. All modules and configurations are built with a Puppet-specific language based on Ruby, or Ruby itself, and thus will require programmatic expertise in addition to system administration skills.

 

Test Center Scorecard

Product                        20%  20%  20%  20%  10%  10%  Overall  Rating
AnsibleWorks Ansible 1.3         9    7    8    8    9    9      8.2  VERY GOOD
Enterprise Chef 11.4             9    8    7    9    8    9      8.3  VERY GOOD
Puppet Enterprise 3.0            9    9    9    9    9    9      9.0  EXCELLENT
SaltStack Enterprise 0.17.0      9    8    9    9    9    9      8.8  VERY GOOD
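
The overall scores in the scorecard appear to be weighted averages of the six category scores, using the percentages shown as weights. Here is a small Python sketch that reproduces them. The weights and scores are copied from the scorecard; the function and variable names are my own.

```python
# Category weights in percent, in the order they appear in the scorecard.
WEIGHTS = (20, 20, 20, 20, 10, 10)

# Category scores per product, copied from the scorecard.
SCORES = {
    "AnsibleWorks Ansible 1.3": (9, 7, 8, 8, 9, 9),
    "Enterprise Chef 11.4": (9, 8, 7, 9, 8, 9),
    "Puppet Enterprise 3.0": (9, 9, 9, 9, 9, 9),
    "SaltStack Enterprise 0.17.0": (9, 8, 9, 9, 9, 9),
}


def weighted_score(scores, weights=WEIGHTS):
    """Weighted average of the category scores; the weights must add up to 100."""
    if sum(weights) != 100 or len(scores) != len(weights):
        raise ValueError("scores and weights do not match")
    return sum(s * w for s, w in zip(scores, weights)) / 100


for product, scores in SCORES.items():
    print(f"{product}: {weighted_score(scores):.1f}")
```

Running it gives 8.2, 8.3, 9.0, and 8.8, which matches the published totals.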

Puppet Enterprise has the most complete Web UI of the bunch, allowing for real-time control of managed nodes using prebuilt modules and cookbooks present on the master servers. The Web UI works well for management, but does not allow for much configuration of modules. The reporting tools are well developed, providing deep details on how agents are behaving and what changes have been made.

Enterprise Chef
Chef is similar to Puppet in terms of overall concept, in that there’s a master server and agents installed on managed nodes, but it differs in actual deployment. In addition to a master server, a Chef installation also requires a workstation to control the master. The agents can be installed from the workstation using the knife tool that uses SSH for deployment, easing the installation burden. Thereafter, managed nodes authenticate with the master through the use of certificates.


VMware TCO / ROI Calculator User’s Guide

You can use the tool directly at this link: http://roitco.vmware.com/

You can read a detailed description of TCO here: http://www.vmware.com/pdf/TCO.pdf

 

Here is a simple user guide on how to use the tool:

The VMware TCO Calculator was developed jointly by VMware, Inc. and ex-Gartner ROI / TCO experts from Alinean, Inc. to provide a Total Cost of Ownership (TCO) and Return on Investment (ROI) analysis for implementing VMware solutions to virtualize your IT environment including datacenter server infrastructure, testing/development labs and desktops.

You can quickly assess potential cost savings by completing five easy steps:

Step 1: Fill out a Simple Survey Questionnaire

Answer a few questions about your company (such as the type of industry you are in, and location) and which of the VMware solution sets you are most interested in:
1. Data Center Server Consolidation Cost Savings (Using VMware Infrastructure 3)
2. Virtual Lab Automation Benefits (Using VMware Lab Manager)
3. Desktop Control and Manageability Cost Savings (Using VMware Virtual Desktop Infrastructure VDI)

For each selected solution area of interest, you will be asked five to ten additional questions about your existing assets such as the number of servers or desktops you intend to virtualize, and about cost reduction opportunities such as current server management, management activities, storage and other metrics.

Default metrics are provided for all of the questions based on Alinean research regarding your specified industry and location. These defaults can be reviewed and adjusted to best match your own unique metrics. This is the only information that is necessary to get a quick estimate of potential cost savings and return on investment (ROI)!

Step 2: Customize Assumptions

The tool uses over 200 additional metrics to help calculate a credible and achievable cost-benefit / ROI analysis. Each of the 200 metrics is set using default assumptions based on the metrics you provided on the questionnaire regarding your industry, location, assets and current cost opportunities, and using industry average and third party research. All of these values are preset using this research, but can be easily reviewed and refined by clicking on the “View/Edit Default Assumptions” link on the bottom right corner of each solution set on the Questionnaire page. You can then review each default assumption and make adjustments to precisely reflect your current environment and actual costs as needed. For a quick analysis, you can skip this step, revisiting the default assumptions for refinement later.

Step 3: Review Total Cost Savings and Return on Investment (ROI)

Click on the “Next” button to see your customized cost savings, comparing your current (As Is) total cost of ownership (TCO) over the next three years (managing a non-virtualized / business as usual environment), and the cost savings and business benefits estimated for a virtualized VMware environment. Key metric and financial improvements are summarized at the top, while specific details for each solution are provided in the tables, graphs and charts. For each selected VMware solution, you can click on the solution name in the table or the tab to view cost savings / benefit, investment and ROI details for VMware Infrastructure 3, VMware Lab Manager and VDI independently.
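
To give a feel for the arithmetic behind this kind of comparison, here is a rough Python sketch. It uses the common textbook definition of ROI (net benefit divided by investment) over a three-year horizon. I do not know the exact formulas or the 200+ metrics the VMware tool uses, so the function, its parameters, and the sample figures are all assumptions for illustration only.

```python
def three_year_roi(as_is_annual_cost, virtualized_annual_cost, investment, years=3):
    """Return (total savings, ROI) for a simple As-Is vs. virtualized comparison.

    ROI here is (savings - investment) / investment, a textbook definition;
    the real calculator may compute it differently.
    """
    if investment <= 0:
        raise ValueError("investment must be positive")
    as_is_tco = as_is_annual_cost * years              # business-as-usual cost
    virtualized_tco = virtualized_annual_cost * years  # running cost after virtualization
    savings = as_is_tco - virtualized_tco
    return savings, (savings - investment) / investment


# Hypothetical figures: 1,000,000 per year today, 600,000 per year after
# virtualization, and a one-time investment of 400,000.
savings, roi = three_year_roi(1_000_000, 600_000, 400_000)
print(f"savings: {savings:,}  ROI: {roi:.0%}")  # savings: 1,200,000  ROI: 200%
```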

Agile Infrastructure Design to Support Software Development Life Cycle (SDLC)

Below is one of the slides I once made as a quick overview of some of the benefits of a Private Cloud aimed specifically at the people who build applications (software developers).

 

A single portal for carrying out all of the activities above (Capture, Deploy, Flexible Resource), with automation, orchestration, monitoring, and chargeback capabilities, gives the greatest benefit to application developers and, of course, to infrastructure system administrators as well.

 

The Key to Change is to Let Go of Fear!


The riskiest thing we can do is just maintain the status quo – Bob Iger

The Future belongs to Those who Innovate! – Ziad Abdelnour

It is more fun to think of the future than dwell on the past – Sam Shepard

Tomorrow belongs to those who can hear it coming – David Bowie

The Future depends on what we do Today – Gandhi

The Best way to Predict the Future is to Create it! – Abe Lincoln

Do not chase: (1) titles (2) money (3) imitations (4) compliments | Do chase: (1) opportunity to help (2) your passion (3) being better (4) meaningful work – Anonymous

Google's Experience Implementing Scrum

Here I would like to share something I found while browsing the Internet about Scrum: a fairly good presentation in which Jeff Sutherland reviews the experience of implementing Scrum at the giant company Google.

Please watch it at the original link below (I could not embed the video here, and the presentation slides are available there as well):

http://www.infoq.com/presentations/Agile-Management-Google-Jeff-Sutherland

Summary
A retrospective on Google’s first Scrum implementation. Jeff Sutherland visited Google to do an analysis of the first Google implementation of Scrum on one of their largest distributed projects. Their strategy for inserting Scrum step by step into the Google engineering teams showed great insight and provides helpful lessons learned for all Agile teams.

Bio
Well known as the Co-Creator of the Scrum Agile Development Process which influenced the design of the other leading Agile process in the U.S., i.e. eXtreme Programming (XP). Scrum is a team organization process that brings focus, clarity, and enthusiasm to any project team in any domain.

About the conference
QCon is a conference that is organized by the community, for the community. The result is a high quality conference experience where a tremendous amount of attention and investment has gone into having the best content on the most important topics presented by the leaders in our community. QCon is designed with the technical depth and enterprise focus of interest to technical team leads, architects, and project managers.

Happy Scrumming 😀

The Importance of a Standard Operating Procedure for Deploying to Production

In this post, we want to discuss the importance of a Standard Operating Procedure for deploying a product that has been completed in a Sprint to the production machines.

Agile, and Scrum in particular, briefly describes the mechanics of requirements, analysis and design, coding, and product finalization. It does not, however, describe how to put the finished product into production. At first glance this looks like a very simple matter that need not be blown out of proportion. In reality it is the most crucial of all the process stages, because it is the end point of everything that came before. If this mechanism goes wrong, the business is at risk.

In the i-pandawa framework, there is a dedicated SOP for deploying completed application work (whether it is a BUG FIX or NEW FEATURES). In essence, AT MINIMUM two mechanisms must be agreed upon:

1. Approval documentation, consisting of: the list of application items changed in the release; the versioning mechanism for the application being published; other requirements for this deployment (for example connectivity with third parties, QA requirements, version display, and so on); and, no less important, the approval mechanism for the product (from the developer, QA, and the product owner, at the sprint review).

2. Step-by-step work documentation, consisting of: who does what, where it is done, when it is done and for how long, and the risk mitigation if a failure occurs while the process is running.
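
The two documents above can also be expressed as a simple pre-deployment check. The Python sketch below is only an illustration of the idea, not part of the i-pandawa framework; the class names, field names, and approver roles are assumptions I made up from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Approvals required before publishing (roles taken from the approval list above).
REQUIRED_APPROVERS = {"developer", "qa", "product_owner"}


@dataclass
class Step:
    who: str
    what: str
    where: str
    when: str
    duration_minutes: int
    rollback: str  # risk mitigation if this step fails


@dataclass
class DeploymentRequest:
    version: str
    changes: list[str]
    approvals: set[str] = field(default_factory=set)
    steps: list[Step] = field(default_factory=list)


def blocking_issues(request: DeploymentRequest) -> list[str]:
    """Return the reasons this deployment is not ready; an empty list means go."""
    issues = []
    if not request.changes:
        issues.append("no change list")
    missing = REQUIRED_APPROVERS - request.approvals
    if missing:
        issues.append("missing approvals: " + ", ".join(sorted(missing)))
    if not request.steps:
        issues.append("no step-by-step plan")
    issues.extend(f"step '{s.what}' has no rollback" for s in request.steps if not s.rollback)
    return issues


# Hypothetical release: everything is in place except the product owner's approval.
request = DeploymentRequest(
    version="1.4.2",
    changes=["fix login bug"],
    approvals={"developer", "qa"},
    steps=[Step("ops", "deploy build", "prod-web-01", "22:00", 30, "restore previous build")],
)
print(blocking_issues(request))  # ['missing approvals: product_owner']
```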

 

In an IT Workplace, We Would Rather Have a Super Team than a Superman!

In our Scrum team, “if we may choose” (and we can choose, ‘coz life is a choice)... we prefer to work with people who behave as a SuperTeam rather than as a SuperMan. If someone is a SuperMan, that person must be able to pass their “super”-ness on to every member of the team, so that the team becomes super too :)

Why do we not need a Superman? Because we do not want to hang our “necks” on a single person who holds the entire “system”. Transparency is therefore very important for the whole team: transparency about what each person is working on, about the business processes being implemented, and about the technology being used, so that the whole team shares the same understanding and can eventually work cross-functionally, filling in for and helping one another. YES, Communication does Matter! (the ability to communicate effectively with each other is important)

From the stakeholders' point of view, a SuperTeam safeguards the continuity of the business, because losing one team member (for example to illness or leave) will not disrupt ongoing operations.

But what if this arrangement makes someone reluctant to grow, out of “fear” that becoming a SuperMan brings the extra effort of passing their “super”-ness on to the rest of the team? Well, continuous improvement is really important in a Scrum team environment. Everyone should grow, and must grow!

Every Retrospective Meeting is expected to produce new ideas, whether about process, methodology, or technology. Besides, becoming a Superman is very good for the person concerned and also counts in their favor in the company's assessment, but that value only counts once they can apply it in the real world and pass it on to the other team members. Coz, we are in this together.

So, Happy Scrumming Guyz :)

Software Development Using the AGILE Methodology

The Software Development Lifecycle (SDLC), or software development, is a very interesting subject to study because there are many strategies for carrying it out. Information Technology Project Management is itself a unique and highly dynamic topic. After much research, training, and hands-on experience, we concluded that, compared with other software project management methodologies such as Waterfall, Agile is more flexible in handling change. In broad terms, the Agile flow is as follows:

This flexibility is the main reason we decided to use the Agile methodology. There are in fact many methodologies for implementing Agile: Agile Modeling, Agile Unified Process (AUP), Dynamic Systems Development Method (DSDM), Essential Unified Process (EssUP), Extreme Programming (XP), Feature Driven Development (FDD), Open Unified Process (OpenUP), Scrum, and Velocity tracking. We wanted to combine Agile with a larger, more complete SDLC framework (as a reference only; the main focus remains Agile). There were two options: Microsoft Solutions Framework (MSF) and Rational Unified Process (RUP). After fairly lengthy research, we finally decided to use RUP with the Agile Scrum concept implemented within it.

Regards,
Doddi Priyambodo