AIX Health Check


How can you monitor a system - let alone tune it - without taking the time to understand how it's built?
December 2007 | by Ken Milberg

Systems management means different things to different people. It requires various utilities and commands to help with many systems-administration functions, including making operational configuration changes, systems monitoring and system documentation.

Administrators carry out these functions in several ways: through the AIX* Systems Management Interface Tool (SMIT), through WebSM or from the command line. SMIT comes in two operation modes and gives users fast paths to the most frequently used administrative tasks. Newcomers to AIX may not know that SMIT is often preferred over the command line for making changes. For example, if you use ifconfig to make a network change, the change won't survive a reboot, because it's never written to the Object Data Manager (ODM), a critical link in AIX that holds the system's configuration. A change made through SMIT is recorded there.

Regarding partitioning information, we used to be limited to using a hardware-management console (HMC); now users can choose the Integrated Virtualization Manager, which lets them manage LPARs and configure virtual Ethernet and SCSI without an HMC. You can also back up and restore LPAR configuration information and view application logs and device inventory, which is particularly important for system documentation.

Monitoring With nmon


One of the best utilities I've seen for AIX, nmon, interestingly isn't even a supported IBM* utility. The beauty of nmon is that you can use it either to quickly ascertain what's going on in your system or to capture data for historical trending and analysis. Most tools do one or the other, but nmon does both equally well. Using nmon to capture and examine data is simple. After the capture has finished, sort the output file and rename it with the .csv suffix, FTP the file to your PC, start the nmon analyzer and click on "Analyze nmon Data." You should see the output on a spreadsheet with graphs, shown in Figure 1. The only thing nmon lacks is the capability to present this data for many LPARs concurrently, though you can use a tool called Ganglia to integrate the nmon analysis into a database.

The lparstat command is also useful. It provides information regarding performance and systems configuration, such as mode, entitled capacity and whether it's an uncapped partition. A similar command is mpstat, which displays the performance of all logical CPUs on the system. You can run smtctl to confirm that you're running in simultaneous multithreading (SMT) mode.

Another wonderful command I use frequently - particularly when I'm asked to work on a box I'm unfamiliar with - is prtconf. This command gives you partition hardware information (CPU, clock speed, RAM, firmware level, devices), network information, paging space, volume group information, etc. I'm a big believer in trying to get as much information as possible before doing anything to a system, including monitoring and systems tuning. How can you monitor a system - let alone tune it - without taking the time to understand how it's built? Several commands, such as lsattr, can be run to get information about your kernel, CPU, disk and networking parameters.
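As a rough illustration of the lparstat point above, the sketch below pulls the partition type, mode and entitled capacity out of `lparstat -i` output. It assumes the usual "Label : value" layout of that report. The function name `lpar_summary` is hypothetical, and the exact labels can vary by AIX level, so treat it as a starting point and not a definitive parser.

```shell
# lpar_summary: read `lparstat -i` output on stdin and print selected fields.
# Assumption: each line looks like "Label      : value".
lpar_summary() {
    awk -F: '
        {
            key = $1; val = $2
            # Trim the padding that surrounds the label and the value.
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
        }
        # Exact matches only, so labels that merely contain "Mode" are skipped.
        key == "Type" || key == "Mode" || key == "Entitled Capacity" {
            print key "=" val
        }
    '
}

# On an AIX partition (not run here):
#   lparstat -i | lpar_summary
```

Keeping the parsing in a function that reads standard input means the same code can be pointed at a saved copy of the report when you document a system you can't log in to.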

Other Monitoring Utilities


When considering other basic monitoring utilities besides nmon, I use vmstat when the system is slow. It provides information quickly, albeit without any bells or whistles.

Dealing with bottlenecks often depends on what your system is configured to do. Tuning a network file system (NFS) server is much different from tuning an Oracle online transaction processing server. Unlike in years past, today we set maxperm to 90 because changing the lru_file_repage parameter is a more effective way of tuning. This parameter indicates whether the AIX Virtual Memory Manager (VMM) re-page counts are considered and what type of memory it should steal. Change the default of 1 to 0 so the VMM steals only file pages, which protects computational pages and lets Oracle use its own cache. For an Oracle server it's preferable not to use AIX file caching at all: using both Oracle and AIX cache is like double taxation - not good. Table 1 shows how to change kernel parameters and further document other subsystems, identifying which tools to use for what purpose.

Another worthwhile command is lsps, which provides important information on paging space. AIX servers rarely crash, but when they do, it's usually because the paging space has filled up. This shouldn't happen if you monitor paging space regularly, running lsps daily and making sure your systems have adequate paging space.

A utility - other than prtconf - that can help you architect your systems is the IBM System Planning Tool (SPT), the successor to the LPAR Validation Tool. Plans generated with SPT can be deployed from the HMC itself. SPT is most often used to assist the architect in designing the managed system, and it validates the configuration you intend to deploy. It's a PC-based browser application designed to be run in a standalone environment.

Make sure you fully document your virtual I/O servers (VIOs). The commands are different on these special partitions, but that doesn't mean you shouldn't document these systems. The most important documentation-type command to know for your VIOs is lsmap -all.
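The daily paging-space check described earlier in this section could be scripted along the following lines. This is a minimal sketch that assumes the two-line summary printed by `lsps -s` (a header line, then total size and percent used). The function name `paging_alert` and the 70 percent threshold in the usage comment are illustrative assumptions, not recommended values.

```shell
# paging_alert LIMIT: read `lsps -s` output on stdin and report whether
# paging-space use has reached LIMIT percent.
# Assumption: line 2 of the input holds "<total size>  <percent used>%".
paging_alert() {
    awk -v limit="$1" '
        NR == 2 {
            used = $2
            sub(/%/, "", used)          # "85%" becomes "85"
            status = (used + 0 >= limit + 0) ? "WARNING" : "OK"
            printf "%s: paging space %d%% used (total %s)\n", status, used, $1
        }
    '
}

# On AIX, for example from a daily cron job (not run here):
#   lsps -s | paging_alert 70
```

The report line can then be mailed or logged, so a steadily climbing percentage is noticed well before the paging space fills.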

Upgrades and Fixes


IBM's Service Update Management Assistant (SUMA) provides a way to automate technology upgrades, letting you define policies that automatically download fixes and even entire technology levels. It could simplify your life. Administrators should also look at the Network Installation Manager (NIM), which provides a way to install and manage AIX filesets on machines over your network, as well as maintain several basic images of the environments you configure. Using NIM, you no longer spend hours installing software after you install the OS - the applications can be part of the image you maintain with NIM. Today you can integrate SUMA and NIM to provide even more functionality by using the niminv command to inventory all NIM clients and then using SUMA to automatically download fixes to the NIM server. NIM then deploys and tracks the fixes.

IBM released a Redbooks* publication in May updating the popular "NIM from A to Z." This is worthwhile reading for anyone thinking about creating a NIM environment to help automate the process of creating new AIX images.

Electronic Service Agent is a free tool that automatically reports hardware problems to IBM. It also collects hardware, software, performance management and systems configuration information to enhance IBM's capability to help with problems and ensure swift support for service-level agreements. Many customers neglect to configure this tool, which is a big mistake. If you have a tool that can increase your availability, use it - especially if it's free and supported.

Parting Tips
Finally, make sure you back up your HMC. Too many administrators with 200 mksysb tapes in their desk drawer, all nicely labeled and validated, don't sufficiently back up the partition data on their managed systems. Under normal circumstances, if your HMC fails, you have the comfort that your service processor can be read by a replacement HMC. There are two different backup-related tasks on the HMC: Backup Critical Console Data and Save Upgrade Data. The VIO must also be backed up using special commands: backupios for the rootvg data and savevgstruct for the structure of the volume group.

AIX systems management is too great a topic to cover in one article, so I hope this information gives you some food for thought. Systems administrators must consider performance, availability, scalability and ways to automate processes so they can spend more time architecting systems and looking at the big picture. You should script as many routine administrative tasks as you can to help automate your life.
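As one small example of scripting a routine task, the sketch below runs a list of documentation commands and collects their output in a dated file. The function name `save_snapshot` and the directory /var/adm/sysdoc in the usage comment are hypothetical, and the command list is only a suggestion drawn from the utilities discussed in this article.

```shell
# save_snapshot DIR COMMAND...: run each command line and store all output
# in DIR/<hostname>_<YYYYMMDD>.txt, then print the name of that file.
save_snapshot() {
    dir=$1
    shift
    out="$dir/$(uname -n)_$(date +%Y%m%d).txt"
    mkdir -p "$dir" || return 1
    : > "$out"                              # start with an empty file
    for cmd in "$@"; do
        printf '===== %s =====\n' "$cmd" >> "$out"
        sh -c "$cmd" >> "$out" 2>&1         # each argument is one command line
    done
    printf '%s\n' "$out"
}

# On AIX, for example (not run here):
#   save_snapshot /var/adm/sysdoc "prtconf" "lsps -a" "lparstat -i" "lsattr -El sys0"
```

Run from cron, a script like this leaves a dated record of how each system was built, which is useful both for documentation and for spotting what changed before a problem appeared.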

References
nmon: www.ibm.com/developerworks/aix/library/au-analyze_aix/
Electronic Service Agent: www.ibm.com/systems/p/pm/exec_summary.html

"NIM from A to Z" Redbooks publication: www.redbooks.ibm.com/abstracts/sg247296.html?Open

IBM Systems Magazine is a trademark of International Business Machines Corporation. The editorial content of IBM Systems Magazine is placed on this website by MSP TechMedia under license from International Business Machines Corporation.

2010 MSP Communications, Inc. All rights reserved.

http://www.ibmsystemsmag.com/CMSTemplates/IBMSystemsMag/Print.aspx?path=/ai...

7/1/2011
