Troubleshooting slow application performance on VMware virtualization

Once we step into the world of IT operations, many operational matters will require troubleshooting. Usually the trigger is slow performance of an application. If this happens in a virtualization environment, we need to make sure that the infrastructure we manage can deliver the SLA we agreed on beforehand.

Here are the key areas to look at when troubleshooting a VM, at a high level:
1. Make sure the problem is not on the application side, by also working together with the apps team: application logic, memory leaks, I/O command efficiency, etc.
2. Check the infrastructure side: the VM itself and the infrastructure behind it (compute, storage, network).

Here is what we can do while troubleshooting:

1. Check the health of the Virtual Machines

Capacity Issues (Example)
•CPU Demand > 90%
•CPU Run Queue > 3 per vCPU
•CPU Swap Wait high, CPU IO Wait high
•RAM Free < 250 MB
•RAM Committed > 70%
•Page-In Rate is high
•Disk Queue Length > ___
•Disk IOPS or Throughput or OIO is high
•Low disk space
•Network Usage is high

Non-Capacity Issues (Example)
•Wrong driver (storage driver, network driver) or wrong driver settings
•Too many snapshots or large snapshots
•Tools not running
•VM vCPU Usage unbalanced
•App configured wrongly, not indexed
•Memory leak
•Network latency is high, or TCP retransmits
•VM too big, process ping-pong, high context switching
•NUMA effect
•Guest OS power setting
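
To make the capacity checks above concrete, here is a minimal sketch in Python that applies the numeric thresholds from the list to a set of guest metrics. The metric names (cpu_demand_pct, cpu_run_queue, ram_free_mb, and so on) are hypothetical and do not come from any VMware API; in practice you would fill the dictionary from whatever monitoring tool you use. Items without a number in the list (for example Disk Queue Length) are left out on purpose.

```python
from typing import Dict, List


def find_capacity_issues(metrics: Dict[str, float]) -> List[str]:
    """Return the capacity thresholds from the list above that a VM exceeds.

    The keys of `metrics` are hypothetical names chosen for this sketch.
    """
    issues: List[str] = []

    if metrics.get("cpu_demand_pct", 0.0) > 90:
        issues.append("CPU Demand > 90%")

    # The run queue threshold is per vCPU, so normalise by the vCPU count.
    vcpus = max(1, int(metrics.get("vcpu_count", 1)))
    if metrics.get("cpu_run_queue", 0.0) / vcpus > 3:
        issues.append("CPU Run Queue > 3 per vCPU")

    # A missing value is treated as "plenty of free RAM".
    if metrics.get("ram_free_mb", float("inf")) < 250:
        issues.append("RAM Free < 250 MB")

    if metrics.get("ram_committed_pct", 0.0) > 70:
        issues.append("RAM Committed > 70%")

    return issues


# Made-up sample values, only to show the call.
sample = {
    "vcpu_count": 4,
    "cpu_demand_pct": 95.0,
    "cpu_run_queue": 8,
    "ram_free_mb": 180,
    "ram_committed_pct": 65.0,
}
for issue in find_capacity_issues(sample):
    print(issue)
```

With the sample values above, the VM is flagged for CPU Demand and RAM Free only: a run queue of 8 on 4 vCPUs is 2 per vCPU, which is still under the threshold.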

2. Check the health of the Infrastructure layer

Infra Is Unable to Cope (Example)
•ESXi CPU insufficient: Demand > 90%, VM CPU Co-Stop > 1%, CPU Ready > 5%, number of cores too small for the VM
•ESXi RAM insufficient: VM Balloon active, VM RAM Swap-in is high, NUMA migration
•ESXi Disk IOPS or Throughput is high
•ESXi vmkernel queue or latency is high
•Datastore latency is high
•ESXi vmnic usage is high

Other Issues (Example)
•VM was vMotioned
•ESXi vmnic drops packets or generates errors
•ESXi wrong configuration: power management, multi-pathing, driver version, queue depth setting
•Hardware fault: disk soft error, bad sector, RAM error
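
The CPU thresholds above are percentages, while vCenter performance charts report CPU Ready as a summation in milliseconds per sampling interval. Here is a small sketch of the commonly used conversion, assuming the real-time chart interval of 20 seconds and averaging across the VM's vCPUs. The function and field names are hypothetical and only serve this example.

```python
from typing import Dict, List


def ready_percent(ready_ms: float, interval_s: int = 20, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation (ms) into a per-vCPU percentage.

    Assumes the 20-second real-time sampling interval by default.
    """
    return ready_ms * 100 / (interval_s * 1000 * vcpus)


def find_host_contention(vm: Dict[str, float]) -> List[str]:
    """Apply the ESXi-side thresholds from the list above to one VM.

    The keys of `vm` are hypothetical names chosen for this sketch.
    """
    issues: List[str] = []
    vcpus = max(1, int(vm.get("vcpu_count", 1)))

    if ready_percent(vm.get("cpu_ready_ms", 0.0), vcpus=vcpus) > 5:
        issues.append("CPU Ready > 5%")
    if vm.get("cpu_costop_pct", 0.0) > 1:
        issues.append("CPU Co-Stop > 1%")
    if vm.get("balloon_mb", 0.0) > 0:
        issues.append("VM Balloon active")

    return issues


# Made-up sample: 3000 ms of ready time in a 20 s interval on a 2-vCPU VM.
sample_vm = {"vcpu_count": 2, "cpu_ready_ms": 3000, "cpu_costop_pct": 0.4, "balloon_mb": 0}
print(ready_percent(3000, vcpus=2))
print(find_host_contention(sample_vm))
```

For other chart intervals, pass the matching interval_s value instead of the default.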

The next question is how to check those parameters as quickly and as easily as you can, so you can troubleshoot and solve the issue you are facing right now. Well, the quickest answer is to turn to the tool I discussed in my previous post: VMware vRealize Operations Manager.

 

Kind Regards,
Doddi Priyambodo

 
