Wednesday, September 14, 2011

Monitor services in the cloud

Discover the secrets of monitoring cloud services through tips, tricks, and tools

Alex Amies, Senior Software Engineer, IBM
John Sanchez, Architect, IBM
Dominique Vernier, IT Architect, IBM
Xu Dong Zheng, Staff Software Engineer, IBM

Summary: Monitoring cloud services is one of the major cornerstones of the cloud. By monitoring cloud services, you can determine whether you're extracting the most you can out of your resource utilization. In this article, the authors define monitoring with a specific eye on monitoring in the IBM® Cloud environment and discuss the fundamental options for monitoring in the IBM Cloud. They walk you through two scenarios designed to illustrate the technical process of monitoring cloud services, demonstrate IBM Tivoli® Monitoring (ITM) Autonomous Agent and RESTful APIs, two tools for monitoring services, and take you step by step into setting up and using the IBM Tivoli Monitoring Autonomous Agent on the IBM Cloud.

Tags for this article: cloud, demos, monitoring, services

Date: 16 Feb 2011
Level: Intermediate

We all know why the concept of cloud computing excites us. We want to be able to leverage the cloud features to provide optimal utilization of resources. The ability to monitor cloud services is a key component to getting to a state in which we're wringing the most use (and value) out of every single resource.

IBM Smart Business Development and Test on the IBM Cloud is an infrastructure as a service (IaaS) public cloud hosted on the Development and Test Cloud. IBM is unique in providing a public cloud offering that is suitable for enterprises. IBM's differentiators include a broad range of services and products, a delegated administration model that enables collaboration, enterprise-suitable business support services, and a large catalog of supported open-source and commercial software images pre-configured by product teams that are ready to run on demand. System monitoring is an important requirement that allows enterprises to deliver reliable services with excellent levels of service.

There are many business scenarios where cloud services need monitoring. This article focuses on scenarios that are likely to be encountered by users of the Development and Test public cloud. The first scenario shows how to monitor key events in the lifecycle of a virtual machine instance. The cloud environment lends itself to monitoring and automated reaction even better than traditional physical computing environments do. One of the great things about the IBM Smart Business Development and Test Cloud is that you can directly leverage configurations set up by the IBM Tivoli Monitoring product team. The second scenario takes advantage of the virtual LAN (VLAN) capabilities that let you move resources on the cloud from the public Internet and connect them to your organization's private network through a fully encrypted VPN, so that you can allocate resources where they are most needed.

Monitoring includes both availability and performance monitoring. Typically an external program monitors a heartbeat of a virtual machine or an application running on a virtual machine to check availability. That is, an external program pings the system, and if the system does not respond to the ping, then an alarm is raised. Performance monitoring involves tracking key metrics of system performance. For operating systems this is usually CPU, memory, hard disk, and network usage. Application performance monitoring typically includes response times for transactions, search, page generation, or other operations and related measures of the load that the application is sustaining.
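
The heartbeat idea described above can be sketched in a few lines of Python. This is a minimal illustration, not a production monitor: the function names is_reachable and check_heartbeat, the timeout value, and the "UP"/"ALARM" status strings are assumptions made for this example. It treats a successful TCP connection to a port (for example, the SSH port 22 on a virtual machine) as the heartbeat.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_heartbeat(host: str, port: int, probe=is_reachable) -> str:
    """Map the probe result to a simple availability status (names are illustrative)."""
    return "UP" if probe(host, port) else "ALARM"
```

Passing the probe as a parameter keeps the alarm decision separate from the network call, so the same check could be reused with an HTTP or database probe.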

An important application of monitoring is to predict and respond to problems in a timely way. For example, a monitoring system might let you set a trigger when disk utilization reaches a certain point. Perhaps you want to send a system administrator an alert when the disk utilization reaches 80 percent. In a traditional computing environment, a system administrator, having received the alert, can mount a new disk and avoid an availability problem resulting from a disk-out-of-space condition.
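
As a rough sketch of the disk utilization trigger described above, the following Python fragment compares the current utilization of a file system with a threshold. The function names and the default path are assumptions for illustration; the 80 percent threshold is the value used in the example above. How the alert is delivered (email, SNMP trap, and so on) is left out.

```python
import shutil


def disk_utilization(path: str = "/") -> float:
    """Return the used fraction (0.0 to 1.0) of the file system containing path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def should_alert(utilization: float, threshold: float = 0.80) -> bool:
    """Return True when utilization has reached or passed the threshold."""
    return utilization >= threshold
```

A scheduler such as cron could call should_alert(disk_utilization("/")) periodically and notify an administrator when it returns True.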

Traditional monitoring requirements apply in cloud computing as well. However, there are several differences in cloud environments.

  • A cloud is an environment where actions can be automated more easily than in traditional environments. For example, IBM Smart Business Development and Test on the IBM Cloud provides a REST API for managing cloud resources. The REST API can be used to respond to monitoring notifications.
  • Cloud resources are paid for based on usage. So, monitoring plays a part in making the most economical use of resources.
  • Cloud resources are created more dynamically than traditional computing resources, which leads to proliferation. This can result in a sprawl of virtual resources that can easily get out of control if not monitored and managed effectively.

This article details the process of how to monitor cloud services through two scenarios:

  • Monitoring key events in the lifecycle of a virtual machine instance.
  • Managing resources on the cloud that are an extension of your enterprise's network through an encrypted VPN.

Fundamental options for monitoring in the IBM Cloud

Before looking at the scenarios, let's review some fundamental options for monitoring on IBM Smart Business Development and Test on the IBM Cloud. Each option is described in detail so that you can decide which method gives you the most benefit on the IBM Cloud.

  • IBM Smart Business Development and Test on the IBM Cloud supports Linux® and Windows™ operating systems and many software products. The operating systems themselves include some good monitoring options out of the box, such as top on Linux and Task Manager on Windows. These two examples include information like CPU and memory usage by process or task, in addition to network metrics. Other native tools provide more advanced metrics. Some of the things missing from the native tools are the ability to trigger an event, send a notification, manage a large amount of data from many systems centrally in a dashboard, or conveniently access information programmatically. The best use of the native monitoring tools is for casual monitoring or for a detailed investigation into a performance event triggered from a monitoring system.
  • A second option is a do-it-yourself script for availability monitoring of key lifecycle events. For example, you can write a Java™ program that uses an SSH library to ping a virtual machine and create an event when the system comes up or goes down. That has become a popular strategy with SSH being a first choice to access virtual machines in the cloud. The same strategy can be applied to other services that are available via network services, such as databases, web servers, LDAP servers, and so on. For example, you can write a Java program that uses the open-source Ganymed SSH library or the Apache HttpClient library for HTTP services. Programs that are more sophisticated might use operating system signals and Simple Network Management Protocol (SNMP) to manage availability. IBM Systems Director is a commercial product based on these principles and provides a number of availability monitoring, notification, and centralized management capabilities.
  • A third option is to write your own program to collect and analyze information from native monitoring sources. For example, you can write your own Perl program to parse the output from the Linux top tool. However, you will find that doing that reliably for monitoring and notification scenarios can be challenging. For example, suppose that the CPU use for a particular process suddenly spikes to 100% and then, just as quickly, goes back down. Providing meaningful monitoring notifications can be difficult.
  • The fourth monitoring option is to use a purpose-built monitoring system such as IBM Tivoli Monitoring. Such products are purpose-built to handle scenarios that are difficult to program yourself, including moving averages and other statistical measures, and provide a robust set of notification and centralized management options, including IBM Tivoli Enterprise Portal. Tivoli Live is a new cloud offering from IBM that allows you to use a full IBM Tivoli Monitoring Server, including an IBM Tivoli Enterprise Portal installation, hosted by IBM. That lets you monitor systems in your enterprise without having to install or maintain the IBM Tivoli Enterprise Portal monitoring console.
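
To make the difficulty with short CPU spikes concrete, here is a minimal Python sketch of the moving-average idea mentioned in the options above. It is not how IBM Tivoli Monitoring implements its statistics; the class name MovingAverageAlarm, the window size, and the threshold are assumptions chosen for illustration. The alarm stays quiet until a full window of samples averages at or above the threshold, so a single spike to 100% does not trigger a notification.

```python
from collections import deque


class MovingAverageAlarm:
    """Signal an alarm only when the average of the last `window` samples is high."""

    def __init__(self, window: int, threshold: float):
        self.samples = deque(maxlen=window)  # oldest sample is dropped automatically
        self.threshold = threshold

    def add(self, value: float) -> bool:
        """Record a sample and return True if the alarm condition holds."""
        self.samples.append(value)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data yet to judge
        return sum(self.samples) / len(self.samples) >= self.threshold


alarm = MovingAverageAlarm(window=3, threshold=90.0)
# A single spike (100) between low readings is ignored; sustained load is not.
results = [alarm.add(v) for v in [10, 100, 10, 95, 100, 100]]
print(results)  # [False, False, False, False, False, True]
```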

This article focuses on the use of the IBM Tivoli Monitoring agents, including the autonomous agent capability, which provides a powerful yet lightweight solution and is explored in more detail below. Screen captures are included to illustrate this capability. IBM Tivoli Monitoring also provides an Agent Builder that lets you conveniently create a custom agent for your own applications.

One of the characteristics of enterprise IT systems is that they need to be maintained over a long period, even though many systems are created by people who may leave or move elsewhere once they finish their projects. That makes it particularly important to have a centralized monitoring approach that is standardized over many different kinds of systems. In a production environment, monitoring will most likely be performed by someone dedicated to the task. It is important to provide that person with tools to monitor many systems in an efficient way. In a development and test environment, monitoring is one more thing that can make a developer's life more complex. However, the nice thing about cloud is that you can use assets created by others with less effort than in traditional environments. That puts monitoring solutions within the reach of development and test teams. Also, the expense of development and test labor makes it important to keep developers and testers productive by keeping the systems they depend on operational.

Pros and cons of monitoring options
Native tools
  Pros:
  • Provided with operating systems and many software products out of the box; might be the only thing available
  • Can be used to do further investigation based on a more general monitoring event
  Cons:
  • Lack event triggering, notification, and centralized management
  • Each monitoring tool works in a different way, making them expensive to use routinely
Do-it-yourself based on heartbeats
  Cons:
  • Development needed
  • Difficult to implement a robust set of features
Do-it-yourself based on native monitoring
  Cons:
  • Development needed
  • Difficult to implement a robust set of features
IBM Tivoli® Monitoring
  Pros:
  • Centralized
  • Robust set of features
  • Manages many systems in a standard way

Scenario 1: Watch key events in a virtual machine instance lifecycle

This section describes monitoring key events in the lifecycle of a virtual machine instance. Example events include startup of the operating system and high usage of CPU, memory, and hard disk. Knowledge of these events can allow management software to start up additional instances to share load, mount additional storage, or send a warning to a system administrator. This section steps through the process using IBM Tivoli Monitoring to create a simple example.

Create a virtual machine instance

  1. Log on to the Development and Test Cloud and go to the Instances area of the control panel (see Figure 1). That displays any virtual machine instances you have.

    Figure 1. View virtual machine instances
    View virtual machine instances

  2. Click Add Instance. Available software images are listed.

    Figure 2. Add instance and view software options
    Add instance and view software options

  3. Select IBM Tivoli Monitoring and click Next.
  4. On the Configure Image panel select Bronze 32 bit for server size and a system-generated IP address. If you have not already created a key, you can create and save one now without interrupting the flow of the Add Instance wizard. Do not forget to save the private key generated; otherwise, you will be out of luck when you try to connect to the virtual machine with an SSH client. You can download the public key later from the Account tab under Profile, but the Development and Test Cloud will not save your private key. Click Next.
  5. The Additional configuration parameters panel contains parameters specific to IBM Tivoli Monitoring. Enter TivoliMonitoring for the CTIRA hostname parameter, which is the name of the host that appears in the IBM Tivoli Enterprise Portal. Enter the sysadmin password. Click Next.
  6. The Verify configuration panel displays a summary view of the parameters. Click Next.
  7. On the Service agreement panel, click the radio button to accept the agreement and then click Submit. The request to create the virtual machine instance is submitted.

The Development and Test Cloud control panel shows the status of the request, which changes to "Active" when the virtual machine is ready to use. The control panel shows the configuration parameters, including the IP address to connect to the instance. With Linux operating system images, you normally connect to the virtual machine over SSH. In this case, the image is also configured so that you can use a web browser to connect to IBM Tivoli Enterprise Portal on port 1920.

Log in to IBM Tivoli Enterprise Portal using the user ID sysadmin and the password that you provided in Step 5. In IBM Tivoli Enterprise Portal, you should see an agentless monitor and an operating system agent. You should also see the SNMP data received for the agentless system with the disk, CPU, memory, and network usage.

Figure 3. Memory utilization
Memory utilization

Monitoring an agentless server

One of the easiest ways to monitor resources is to use the IBM Tivoli Monitoring agentless monitoring capability. The agentless approach is based on the collection of data using remote protocols. In general, when there is a choice between agentless and agent-based monitoring, agentless is preferable because there is no software to install or maintain. However, agents are necessary when the monitoring information needed can be collected only by a method that is not available remotely. In addition, the autonomous agent capability is provided by agents.

To monitor an agentless server, first create a server as a virtual machine instance, called the "monitored server" in this article, configure SNMP, and update the firewall to let the SNMP requests from IBM Tivoli Enterprise Portal come in on that server.

  1. Create a server and then open an SSH session on it using idcuser. There are two ways to create a server: use the Development and Test Cloud console interface or the command line.
  2. From the command prompt, enter sudo bash to get root access.

    You might need to install net-snmp before configuration using the steps below, since SNMP is not a default software package in SUSE Linux Enterprise Server 11. To do that, open yast on a command line, select Software Manager, search for snmp, and install net-snmp.

  3. Edit the /etc/snmp/snmpd.conf file and change the line rocommunity public by replacing the IP address with your IBM Tivoli Enterprise Portal IP address (see the example in Figure 4).

    Figure 4. IP address replacement
    IP address replacement

  4. Restart the SNMP daemon using the command service snmpd restart.
  5. Open the firewall to allow User Datagram Protocol (UDP) access from the IBM Tivoli Enterprise Portal server to your server. To do that, launch yast2 and select Security and Users.

    Figure 5. Open firewall to allow UDP access
    Open firewall to allow UDP access

  6. Choose Firewall and Custom Rules.

    Figure 6. Firewall configuration
    Firewall configuration

  7. Select Add and enter your IBM Tivoli Enterprise Portal IP address as source, UDP, and 161 as destination.

    Figure 7. Add IBM Tivoli Enterprise Portal IP address, UDP, and destination
    Add IBM Tivoli Enterprise Portal IP address, UDP, and destination

  8. Click Next, Finish, and Quit.

Test your SNMP connection by logging in to the IBM Tivoli Monitoring server, becoming root with sudo bash, and running:

snmpwalk -v 1 -c public 


You are now ready to add this server in the IBM Tivoli Enterprise Portal (see Figure 8).

  1. Open your browser at http://:1920///cnp/kdh/lib/cnp.html (the triple "/" is not a typo).

    Figure 8. Log in to IBM Tivoli Enterprise Portal
    Log in to Tivoli Enterprise Portal

  2. Expand Linux System and TivoliMonitoring, and right-click on Linux System.

    Figure 9. Linux System display
    Linux System display

  3. Click Take Action > Select.

    Figure 10. Take Action
    Take Action

  4. For Action Name, select LinuxSnmpMonitorStart.

    Figure 11. Select LinuxSnmpMonitorStart
    Select LinuxSnmpMonitorStart

    Provide a name and the IP address of the server you would like to monitor. Enter public in the community field. Click OK and OK again.

    Figure 12. Action Status
    Action Status

  5. Click OK one more time.
  6. Refresh the Navigator (opposite arrows next to the red bullet in the icon task bar). Select your monitored server in the navigator tree. You can now navigate around the different diagrams.

    Figure 13. Navigate diagrams
    Navigate diagrams

Operating system agents can be deployed using SSH. To deploy a Linux operating system agent, log on using a command line. If you are on a Windows client, use PuTTY. If you are on a Linux client, use the SSH client already provided to connect to the virtual machine using the IP address given on the control panel. Log in with the user ID idcuser and the private key that you saved in a previous step.

  1. Log in to tacmd using /opt/IBM/ITM/bin/tacmd login -s localhost.
  2. Enter sysadmin as user name and password.
  3. On successful login, execute:
    /opt/IBM/ITM/bin/tacmd createNode -h  -d /opt/IBM/ITM -p  PROTOCOL=IP.PIPE PORT=1918 SERVER=

  4. Enter the destination machine username and password.
  5. A request is queued for deployment. To check the status of deployment, enter /opt/IBM/ITM/bin/tacmd getdeploystatus.
  6. After the successful deployment, log out of tacmd using /opt/IBM/ITM/bin/tacmd logout.

You can read about how the images are set up in the Development and Test Cloud catalog. You might want to read the Getting Started documents for the IBM Tivoli Monitoring image and the base operating 32-bit SUSE Linux Enterprise Server 11 image. There are also short videos introducing each image that can be viewed online. The IBM Tivoli Monitoring image includes:

  • IBM Tivoli Enterprise Monitoring Server V06.22.01.00
  • IBM Tivoli Enterprise Portal Server V06.22.01.00
  • IBM Tivoli Enterprise Portal Desktop Client V06.22.01.00
  • IBM Tivoli Monitoring Linux Operating Systems Agent V06.22.01.00
  • IBM Tivoli Agentless Monitoring for Linux Operating Systems V06.22.01.00

There is a lot of capability in this image not described in this article, such as using IBM Tivoli Enterprise Portal to create situations and generate notifications, the event data warehouse, detailed reports of monitoring data, monitoring systems other than operating systems, and using the Agent Builder to create custom agents. The rest of this article focuses on use of the IBM Tivoli Monitoring Linux Operating Systems Agent to get started integrating IBM Tivoli Monitoring with the Development and Test Cloud APIs. The monitoring of operating systems is an important foundation for any monitoring solution.

Some of the benefits of being able to perform operating systems monitoring are:

  • Learn how to monitor cloud resources in IBM Tivoli Enterprise Portal with agentless and agent-based approaches.
  • Take advantage of a system already set up by an IBM Tivoli Monitoring expert.
  • Make your own customizations and save the image after you have a working system with IBM Tivoli Monitoring set up. Then you can start up any virtual machine instances of that image any time as needed. Using the new enterprise communities feature of Development and Test Cloud, you can share your image with other people in your organization.
  • Use Development and Test Cloud external storage to save monitoring and configuration data that can be reused from other IBM Tivoli Enterprise Portal installations.

Scenario 2: Manage resources via an encrypted VPN connection to the cloud

Suppose you discover that some of the computing resources that you have on the public Internet are underutilized, and resources on your organization's internal network are overloaded. The IBM Development and Test Cloud enables each enterprise to have its own virtual local area network (VLAN) to isolate its instances from the public Internet and connect them to the enterprise's private network via a VPN connection. This section describes how to use the Development and Test Cloud command line tool to manage resources on the cloud from the public Internet and add them to your organization's private network via a fully encrypted VPN connection.

A VLAN is an abstraction of the traditional concept of a local area network (LAN). It lets you group and isolate your computing resources on their own network without running Ethernet cables and installing physical network devices as in traditional IT networks. This is valuable because, otherwise, your resources in a public cloud are on the public Internet. You connect to your VLAN using an encrypted virtual private network (VPN) connection.

The scenario described in the previous section related to managing cloud resources. The significance of VLAN/VPN technology is that you can connect cloud solutions to your own enterprise IT infrastructure, including both monitoring and resource balancing. In the old days, you created a network by stringing an Ethernet cable between different computers and network devices. Today, you can request a number of IP addresses in different network zones and build networks simply by assigning the addresses to different virtual machines.

Create a virtual machine instance

Creating a virtual machine attached to your enterprise's VLAN is easy with the Development and Test Cloud wizard.

  1. In the catalog, choose the SUSE Linux Enterprise Server 11 for x86.

    Figure 14. Choose SUSE Linux Enterprise Server 11 for x86
    Choose SUSE Linux Enterprise Server 11 for x86

  2. Give the instance a name, choose Bronze 32-bit server size, and select the Private VLAN option.

    Figure 15. Configure your instance selection
    Configure your instance selection

  3. Monitor the provisioning status of the request in My Instances on the control panel.

    Figure 16. Monitor provisioning status of request
    Monitor provisioning status of request

    When the instance becomes active, an IP address is displayed on the instance detail panel.

    Figure 17. Instance detail panel
    Instance detail panel

To connect to this kind of instance, you need to set up a VPN connection to the VLAN.

Virtual machines on a VLAN are only visible to other resources on the VLAN, but they can see resources on the public Internet. That is important for the connection to an IBM Tivoli Enterprise Portal console. If the IBM Tivoli Enterprise Portal console is not attached to the VLAN, it will not be able to see the resources on the VLAN.

Now that you have performed these tasks:

  • You understand the use of VLANs in public cloud computing.
  • You know how to provision virtual machines only accessible on your own company's VLAN.

Automate resource management with command line scripts

This section provides details on how to integrate cloud applications using the cloud and monitoring command line scripts with the IBM Tivoli Monitoring Autonomous Agent. IBM Smart Business Development and Test on the IBM Cloud provides REST and Java APIs and a command line tool for automating resource management actions that are similar to the capabilities in the self-service user interface. These actions include creating and managing virtual machine instances, storage volumes, and IP addresses. This section explains how to automate resource management using command line scripts.

To download the IBM Development and Test Cloud client, click the Support tab in the cloud portal. See the Command Line Tool Reference for details on the available commands.

Get started by making a directory for your command line scripts. The assumption is that you are using a Windows client. Linux is similar.

  1. Install an IBM version 1.6 JDK if you do not have one already installed.
  2. Define JAVA_HOME.
  3. Extract the command line tool zip into a directory on your system without spaces in the path. This directory is referred to as CC_HOME.
  4. If you prefer to store your commands in script files, create a directory for your scripts. This directory is referred to as MY_HOME.

Once you have created a directory for your scripts, initialize your client with a password file. This is required, and it protects your real Development and Test Cloud password because you do not need to store it in scripts or in clear text.

  1. Type the following commands to perform this task:
    > set JAVA_HOME=D:\Program Files\IBM\SDP\jdk
    > set PATH=%JAVA_HOME%\bin;%PATH%
    > set CC_HOME=D:\dev_test_cloud\cmd
    > set MY_HOME=D:\myhome\script
    > cd %CC_HOME%
    > ic-create-password -u -p secret -w unlock -g %MY_HOME%/mykey.ext

  2. Depending on whether you use Windows or Linux, the command extension should be .cmd or .sh, respectively. These commands set the Java home, the system path, the command line tool home, and the script home, change to the command line tool directory, and run the ic-create-password command to create the password file.
  3. The passphrase to unlock the password file is unlock.
  4. Substitute your own values for the user ID, password (secret), and passphrase (unlock).

One way to achieve your goal of moving resources is to delete the instance on the public Internet and create another instance on your company's VLAN. In the onboarding package you will receive information on how to connect to the VLAN using a VPN connection.

  1. To get a list of the instances that you own, use the command:
    > ic-describe-instances -u -w unlock -g %MY_HOME%/mykey.ext 

    That command gives the ID and other details of the instance to be deleted.

  2. Using the ID of the instance to be deleted, INSTANCE_ID, type the command:
    > ic-delete-instance -u -w unlock -g %MY_HOME%/mykey.ext -l INSTANCE_ID

    The user ID passed with -u is a fictitious user name; substitute your own.
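
The commands above share the same credential options (-u, -w, and -g), so a script that calls them repeatedly can assemble the argument list in one place. The following Python sketch shows one way to do that. The function name ic_command, the user ID user@example.com, and the key file name are assumptions for illustration; only the tool names and flags are taken from the commands shown above, and on a real system the tool name needs the .cmd or .sh extension.

```python
from typing import List


def ic_command(tool: str, user: str, passphrase: str,
               password_file: str, *extra: str) -> List[str]:
    """Build an argument list for one of the ic-* command line tools."""
    return [tool, "-u", user, "-w", passphrase, "-g", password_file, *extra]


# Hypothetical credentials, used only to show the resulting argument lists
creds = ("user@example.com", "unlock", "mykey.ext")
describe_cmd = ic_command("ic-describe-instances", *creds)
delete_cmd = ic_command("ic-delete-instance", *creds, "-l", "INSTANCE_ID")
print(describe_cmd)
print(delete_cmd)
```

The resulting list can be passed to subprocess.run, which avoids quoting problems that arise when a command line is built as a single string.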

Follow these steps to create a new instance on your VLAN.

  1. Choose the image for which you want to create a virtual machine. To get a list of images, use the command:
    > ic-describe-images -u -w unlock -g %MY_HOME%/mykey.ext > images.txt 

  2. Because the image list is so long, save it to the file images.txt. The image ID is needed below.
  3. The output of the command, as saved in the file, looks similar to this:
    ...
    ID : 20003206
    Name : IBM Lotus Web Content Management 6.1.5 - BYOL
    Visibility : PUBLIC
    State : AVAILABLE
    Owner : SYSTEM
    Platform : SUSE Linux Enterprise Server/11
    Location : 41
        ~~~~~
        InstanceType ID : BRZ32.1/2048/175
        Label : Bronze 32 bit
    ...

Let's provision an instance of the image with name SUSE Linux Enterprise Server 11 for x86 and ID 20001150.

  1. Provision the image on a Bronze 32-bit system, which has instance type ID BRZ32.1/2048/175.
  2. Since you want to provision the virtual machine on your company's network, you also need to find out how the VLAN is identified. To do that, type the command:
    > ic-describe-vlans -u -w unlock -g %MY_HOME%/mykey.ext 

  3. The output looks something like this:
    Executing action: DescribeVLANs ...
    ----------------------------------
    ID : 1
    Name : Private VLAN Raleigh
    Location : 41
    ...
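
If you want a script to pick the VLAN ID and location out of this output rather than reading it by eye, a small parser is enough. The Python sketch below assumes that the tool prints one "Key : Value" pair per line and separates records with a line of dashes, as the sample output suggests; the function name parse_records is an illustrative choice, not part of the command line tool.

```python
import re
from typing import Dict, List

# "Key : Value" with an optional value, for example "ID : 1" or "Hostname :"
FIELD = re.compile(r"^(?P<key>[^:]+?) :(?: (?P<value>.*))?$")


def parse_records(output: str) -> List[Dict[str, str]]:
    """Split command output into records, one dict of fields per record."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for raw in output.splitlines():
        line = raw.strip()
        if line and set(line) == {"-"}:  # a dashed line starts a new record
            if current:
                records.append(current)
                current = {}
            continue
        match = FIELD.match(line)
        if match:
            current[match.group("key")] = match.group("value") or ""
    if current:
        records.append(current)
    return records


sample = """Executing action: DescribeVLANs ...
----------------------------------
ID : 1
Name : Private VLAN Raleigh
Location : 41
"""
vlan = parse_records(sample)[0]
print(vlan["ID"], vlan["Location"])  # 1 41
```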

The SSH key used to connect to the virtual machine can be generated using the Development and Test Cloud user interface or using the command line.

  1. To generate a key called MyKey, type the command:
    > ic-generate-keypair -u -w unlock -g %MY_HOME%/mykey.ext -c MyKey

  2. Cut and paste the text returned into a text file for use in your SSH client.

You now have all the information needed to provision a virtual machine.

  1. Use the VLAN ID returned and the other information above in the ic-create-instance command to create the virtual machine.
  2. The information returned also includes a data center ID associated with the VLAN. You need to provision the virtual machine in the same data center as the VLAN. To create the instance, type this command:
    > ic-create-instance -u -w unlock -g %MY_HOME%/mykey.ext -t BRZ32.1/2048/175 -n MySUSE -k 20001150 -c MyKey -d MyDescription -L DATA_CENTER_ID -x VLAN_ID

  3. The DATA_CENTER_ID variable should be replaced by 41 and VLAN_ID by the value 1 in this command.
  4. The IP address of the virtual machine is generated by the system and on the given VLAN. To find the IP address, type the describe instances command again.
  5. That command also gives you the status of the instance, allowing you to find out when the instance has been started. In response to the ic-create-instance command, you should see something like this:
    Executing action: CreateInstance ...
    The request has been submitted successfully.
    1 instances!
    ----------------------------------
    ID : 36519
    Name : MySUSE
    Hostname :
    InstanceType : BRZ32.1/2048/175
    IP :
    KeyName : MyKey
    Owner :
    RequestID : 36819
    RequestName : MySUSE
    Status : NEW
    Volume IDs :
    ----------------------------------
    Executing CreateInstance finished

  6. You can watch the status of the request in the Development and Test Cloud control panel or use the ic-describe-instances command above to find out when the instance is ready to use. The output looks something like this:
    Executing action: DescribeInstances ...
    1 instances!
    ----------------------------------
    ID : 36519
    Name : MySUSE
    Hostname :
    InstanceType : BRZ32.1/2048/175
    IP :
    KeyName : MyKey
    Owner :
    RequestID : 36819
    RequestName : MySUSE
    Status : ACTIVE
    Location : 41
    Vlan ID : 1
    Vlan Name : Private VLAN Raleigh
    Vlan Location : 41
    Volume IDs :
    Disk Size : 175
    Root Only : null

  7. Notice that the status is ACTIVE and remember the IP address to connect to the instance.
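
Waiting for the status to change from NEW to ACTIVE is a natural step to automate. The following Python sketch polls until the instance is active. It assumes the output layout shown above, with one "Key : Value" pair per line; the function names, the number of attempts, and the delay are illustrative assumptions. The describe parameter is any callable that returns the text printed by ic-describe-instances, for example a wrapper around subprocess.run.

```python
import time
from typing import Callable, Optional


def instance_status(output: str, instance_id: str) -> Optional[str]:
    """Return the Status field of the record whose ID matches, or None."""
    current_id = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("ID : "):  # does not match "RequestID" or "Vlan ID"
            current_id = line[len("ID : "):].strip()
        elif line.startswith("Status : ") and current_id == instance_id:
            return line[len("Status : "):].strip()
    return None


def wait_until_active(describe: Callable[[], str], instance_id: str,
                      attempts: int = 30, delay: float = 20.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll describe() until the instance is ACTIVE or attempts run out."""
    for attempt in range(attempts):
        if instance_status(describe(), instance_id) == "ACTIVE":
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before asking again
    return False
```

Injecting describe and sleep keeps the polling logic independent of the command line tool, which also makes it easy to try out with canned output.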

Now that you have performed these tasks, you know how to automate management of cloud computing resources using command line scripts.

Set up an autonomous agent step by step

An IBM Tivoli Enterprise Monitoring Agent can be used independently from the Tivoli Enterprise Portal using the autonomous agent capability. A Tivoli system Monitoring Agent is an operating system agent that is installed and configured to have no dependency on IBM Tivoli Enterprise Portal. Agents are configured for autonomous capabilities by default. In this mode, an agent runs, collects data, runs situations, and generates events independently. It provides a simple, standalone HTML/XML interface, REST API, and secure authentication, and emits SNMP events.

This can be an appropriate solution for development and test environments where the cost of a full monitoring solution is not justified and development skills are available. It can also be appropriate as a lightweight monitoring component integrated into a reusable cloud application. For example, today it is becoming common to offer products as virtual appliances that include an operating system and configured software in situations where, in the past, a software install image was offered as the product. A virtual appliance offering may include a lightweight monitoring solution. However, in mission-critical applications, such as banking, insurance, e-commerce, shipping, and transportation, a full monitoring solution is the appropriate choice.

Private events are events that are processed locally in contrast to enterprise events that are processed by the monitoring server. The events emitted can be SNMP or Event Integration Facility (EIF) events. XML files are used to configure the situations for private events in a similar way to enterprise events, which are created with the IBM Tivoli Enterprise Portal situation editor.

Events can be defined by situations where data is missing or reaches a particular value. Here is an example of a situation:

                       ~/cpulog.txt]]>          000100      
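
A private situation of the kind described in this section might look like the XML embedded in the sketch below. This is a hedged reconstruction, not the authors' file: the element names follow the IBM Tivoli Monitoring private situation format, while the situation name, the exact criteria string (including the Aggregate value for the CPU ID), and the echo text are assumptions. Only the cpulog.txt target, the threshold of 10, the When and Frequency values, and the 000100 interval come from the article. The Python code parses the fragment and converts the interval, assuming an HHMMSS layout.

```python
import xml.etree.ElementTree as ET

# Assumed reconstruction of a high-CPU private situation (see lead-in).
SITUATION_XML = """\
<PRIVATECONFIGURATION>
  <PRIVATESIT>
    <SITUATION>Linux_High_CPU_Overload</SITUATION>
    <CRITERIA>
      <![CDATA[ *VALUE Linux_CPU.Idle_CPU *LT 10 *AND *VALUE Linux_CPU.CPU_ID *EQ Aggregate ]]>
    </CRITERIA>
    <CMD><![CDATA[echo "High CPU detected" >> ~/cpulog.txt]]></CMD>
    <AUTOSOPT When="Y" Frequency="Y" />
    <INTERVAL>000100</INTERVAL>
  </PRIVATESIT>
</PRIVATECONFIGURATION>
"""


def interval_seconds(hhmmss):
    """Convert an interval such as '000100' to seconds (HHMMSS is assumed)."""
    hhmmss = hhmmss.strip()
    if len(hhmmss) != 6 or not hhmmss.isdigit():
        raise ValueError("expected six digits, got %r" % hhmmss)
    hours, minutes, seconds = int(hhmmss[:2]), int(hhmmss[2:4]), int(hhmmss[4:])
    return hours * 3600 + minutes * 60 + seconds


def summarize(xml_text):
    """Return one summary dict per PRIVATESIT element in the document."""
    root = ET.fromstring(xml_text)
    summaries = []
    for sit in root.iter("PRIVATESIT"):
        opts = sit.find("AUTOSOPT")
        interval = sit.findtext("INTERVAL")
        summaries.append({
            "name": sit.findtext("SITUATION", default="").strip(),
            "criteria": sit.findtext("CRITERIA", default="").strip(),
            "command": sit.findtext("CMD", default="").strip(),
            "interval_s": interval_seconds(interval) if interval else None,
            "when": opts.get("When") if opts is not None else None,
            "frequency": opts.get("Frequency") if opts is not None else None,
        })
    return summaries
```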

  • This XML fragment contains one private situation. It detects high CPU by averaging over the virtual machine CPUs. The interval is one minute (000100).
  • The expression uses the *VALUE, *LT, *AND, and *EQ criteria functions to test for the condition CPU idle time being too low.
  • If this condition is true, then the command in the <CMD> element will be executed. The command is an example that logs a line to the file cpulog.txt. In a real implementation, this should be replaced with Development and Test Cloud command line scripts, introduced above.
  • The <AUTOSOPT> element is required when a <CMD> element is present to specify how often to execute the command.
  • The When attribute value of Y means that the command should be executed for each item that evaluates to true.
  • The Frequency attribute value of Y indicates that the command should be executed every time the criterion evaluates to true.
  • To test that it works properly, change the CPU value from 10 to 100 so it will be triggered every time the situation is sampled.
  • The available attributes are listed in the file:

There are hundreds of monitoring attributes reported by the Linux agent, including login parameters, disk usage, network usage, CPU, processes, system statistics (swapping and others), disk IO, and NFS. Following is a list of some of the available attributes to help you get started:

  • Login variables: KLZ_User_Login.System_Name, KLZ_User_Login.Timestamp, KLZ_User_Login.User_Name, KLZ_User_Login.Login_PID, KLZ_User_Login.Line, KLZ_User_Login.Login_Time, KLZ_User_Login.Idle_Time, KLZ_User_Login.From_Hostname

  • Disk usage: KLZ_Disk.System_Name, KLZ_Disk.Timestamp, KLZ_Disk.Disk_Name, KLZ_Disk.Mount_Point, KLZ_Disk.FS_Type, KLZ_Disk.Size, KLZ_Disk.Disk_Used, KLZ_Disk.Disk_Free, Linux_Disk.Space_Available_Percent

  • Network usage: KLZ_Network.System_Name, KLZ_Network.Timestamp, KLZ_Network.Network_Interface_Name, KLZ_Network.Interface_IP_Address, KLZ_Network.Interface_Status, KLZ_Network.Transmission_Unit_Maximum, KLZ_Network.KBytes_Received_Count, KLZ_Network.Bytes_Received_per_sec, KLZ_Network.KBytes_Transmitted_Count, KLZ_Network.Bytes_Transmitted_per_sec

  • CPU: KLZ_CPU.System_Name, KLZ_CPU.Timestamp, KLZ_CPU.CPU_ID, KLZ_CPU.User_CPU, KLZ_CPU.User_Nice_CPU, KLZ_CPU.System_CPU, KLZ_CPU.Idle_CPU, KLZ_CPU.Busy_CPU, KLZ_CPU.Wait_IO_CPU, KLZ_CPU.User_Sys_Pct, KLZ_CPU_Averages.System_Name, KLZ_CPU_Averages.Timestamp, KLZ_CPU_Averages.Days_to_CPU_Upgrade, KLZ_CPU_Averages.CPU_Usage_Current_Average, KLZ_CPU_Averages.CPU_Usage_Moving_Average, Linux_CPU.Idle_CPU, Linux_CPU.CPU_ID, Linux_Process.Busy_CPU

  • Processes: KLZ_Process.System_Name, KLZ_Process.Timestamp, KLZ_Process.Process_ID, KLZ_Process.Parent_Process_ID, KLZ_Process.Process_Command_Name, KLZ_Process.Proc_CMD_Line, KLZ_Process.State, KLZ_Process.Proc_System_CPU, KLZ_Process.Total_Size_Memory, KLZ_Process.Threads

  • System statistics: KLZ_System_Statistics.System_Name, KLZ_System_Statistics.Timestamp, KLZ_System_Statistics.Ctxt_Switches_per_sec, KLZ_System_Statistics.Pct_Change_Ctxt_Switches, KLZ_System_Statistics.System_Load_1min, KLZ_System_Statistics.System_Load_5min, KLZ_System_Statistics.System_Load_15min, KLZ_System_Statistics.Pages_paged_in, KLZ_System_Statistics.Pages_Swapped_in, KLZ_Swap_Rate.System_Name, Linux_System_Statistics.Pages_Swap_in_per_sec, Linux_System_Statistics.Pages_Swap_out_per_sec

  • Disk IO: KLZ_Disk_IO.System_Name, KLZ_Disk_IO.Transfers_per_sec, KLZ_Disk_IO.Blk_Rds_per_sec, KLZ_Disk_IO.Blk_wrtn_per_sec

  • NFS: KLZ_NFS_Statistics.System_Name, KLZ_NFS_Statistics.NFS_lookups, KLZ_NFS_Statistics.NFS_Read_Calls, KLZ_NFS_Statistics.NFS_Writes

The agent configuration file is /opt/IBM/ITM/config/lz.ini. Check that the file has the line: IRA_AUTONOMOUS_MODE=Y.
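
The check on lz.ini can be automated before the agent is restarted. The sketch below is one possible approach, assuming the file holds plain KEY=VALUE lines; the function name is illustrative.

```python
def has_autonomous_mode_line(ini_text):
    """Return True if any line sets IRA_AUTONOMOUS_MODE to Y.

    Whitespace around the key and value is ignored. A line whose key
    carries a prefix (for example '#IRA_AUTONOMOUS_MODE') does not match.
    """
    for raw in ini_text.splitlines():
        key, sep, value = raw.partition("=")
        if sep and key.strip() == "IRA_AUTONOMOUS_MODE":
            if value.strip().upper() == "Y":
                return True
    return False
```

To use it on a monitored system, read /opt/IBM/ITM/config/lz.ini into a string and pass the contents to the function.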

Try out this private situation:

  1. Cut and paste the XML above into a file called lz_situations.xml.
  2. Place the file in the directory /opt/IBM/ITM/localconfig/lz.
  3. Make sure that you don't include any characters that are not valid XML in the cut-and-paste process. Check this by loading the file into a web browser.
  4. Restart the agent using the commands:
    sudo /etc/init.d/ITMAgents1 stop
    sudo /etc/init.d/ITMAgents1 start
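
The well-formedness check in step 3 can also be done without a browser. The following sketch uses the Python standard library XML parser; the function name is illustrative, and the check covers XML syntax only, not whether the agent accepts the situation definition.

```python
import xml.etree.ElementTree as ET


def check_well_formed(path):
    """Return (True, None) if the file parses as XML, else (False, message).

    The message from the parser includes the line and column of the
    first syntax error, which helps locate stray characters that were
    introduced while copying the file.
    """
    try:
        ET.parse(path)
    except ET.ParseError as err:
        return False, str(err)
    return True, None
```

For example, check_well_formed("/opt/IBM/ITM/localconfig/lz/lz_situations.xml") could be run before restarting the agent.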

You can also define parameters for local collection of historic monitoring data. See the Agent Autonomy section of the IBM Tivoli Monitoring 6.2.2 Information Center for more details on agent autonomy. The Information Center has several private situation example configuration files.

Use the agent service interface to receive information from the agent, for example, reports of agent information, private situations, and history. The agent service is accessed through the IBM Tivoli Monitoring Index Service facility, which operates as an HTTP server.

  1. To start the agent service interface, enter the URL http://<hostname>:1920 or https://<hostname>:3661/ into your browser, where <hostname> is the host name or IP address of the monitored system.
  2. You should see something similar to Figure 18:

    Figure 18. IBM Tivoli Monitoring Service Index

  3. Follow the IBM Tivoli LZ Agent Service Interface link to see Agent Information, Situations, History, Queries, and Agent Service Request. If you have trouble at this point, check and adjust the firewall settings using YaST (on SUSE).
  4. You need to open the port for the Agent Service Interface, which is set at random every time the agent starts. To do this, type yast at the command line and go to Security and Users > Firewall.
  5. Select Advanced under Allowed Services.
  6. Enter the TCP port shown when you hover over the LZ Agent Service Interface link. The Microsoft Internet Explorer browser seems to work better for these pages.
  7. If you have problems, check the agent logs in /opt/IBM/ITM/logs. Search the logs with a command such as grep 'private situation' /opt/IBM/ITM/logs/*.* to check for errors.
  8. You should see the Linux agent service interface screen:

    Figure 19. Linux agent service interface

  • The Queries link lets you view monitoring information, such as the process list:

    Figure 20. Process list

  • The Situations link displays situations defined in an XML file. The Linux High CPU Overload situation is shown in Figure 21, defined in a previous XML file example:

    Figure 21. Linux high CPU overload situation

  • To test execution of the event, look for the file cpulog.txt in the root user's home directory and a non-zero value of TRUESAMPLES in the web interface.
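
The log search from step 7 can be scripted when the same check has to run on several systems. The following is a minimal sketch that mirrors the grep pattern; the function name and return shape are illustrative.

```python
import glob
import os


def find_log_lines(log_dir, needle="private situation"):
    """Return (path, line_number, line) for each log line containing needle.

    Files are matched with the same '*.*' pattern as the grep example,
    and undecodable bytes are replaced so binary noise does not stop
    the scan.
    """
    matches = []
    for path in sorted(glob.glob(os.path.join(log_dir, "*.*"))):
        if not os.path.isfile(path):
            continue
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    matches.append((path, number, line.rstrip("\n")))
    return matches
```

Calling find_log_lines("/opt/IBM/ITM/logs") returns the matching lines with their file names and line numbers, which makes it easier to tell which agent log reported a problem.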

At this point, you have all the pieces that you need for a lightweight monitoring solution. One of the drawbacks to this solution is that you need to deploy the Development and Test Cloud command line bundle on each system that you are monitoring. That can be addressed by the autonomous agent REST interface, mentioned below.

  • The Service Request Interface link in the Agent Service Interface is the REST interface for retrieving data collected by the agent in XML format.
  • The IBM Tivoli Monitoring Information Center gives detail on the REST application programming interface.
  • The IBM Tivoli Monitoring Agent Service Interface Client (see Figure 22) is a nice tool that lets you test the REST service interface.
  • A private situation control allows you to start, stop, or recycle a private situation on the monitoring agent with a request.

    Figure 22. IBM Tivoli Monitoring Agent Service Interface Client

  • For troubleshooting, remember to check firewall settings using YaST, as described above.
  • For remote access, each agent requires outbound port 1918 to be open. The managing server requires ports 1918 and 1920.
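
When a firewall is suspected, a quick reachability test from the client side can narrow the problem down. The sketch below is one way to do it with the Python standard library, using the Service Index ports mentioned earlier (1920 for HTTP and 3661 for HTTPS) as defaults; the function names are illustrative. The Agent Service Interface port itself changes on every agent start, so it has to be passed explicitly.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_ports(host, ports=(1920, 3661), timeout=3.0):
    """Return a dict that maps each port to its reachability."""
    return {port: port_open(host, port, timeout) for port in ports}
```

A result of False for a port that the agent is known to be listening on usually points to a firewall rule between the client and the monitored system.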

In conclusion

Now that you have worked through the steps in the article, you know how to:

  • Create a virtual machine using Development and Test on the IBM Cloud.
  • Use IBM Tivoli Monitoring on the cloud to monitor your systems.
  • Use the Development and Test on the IBM Cloud command line tools to automate creation and deletion of virtual machines.
  • Use the IBM Tivoli Monitoring autonomous agent capabilities to implement a lightweight monitoring solution.

You can use the IBM Tivoli Monitoring image in the Development and Test on the IBM Cloud catalog to experiment for yourself.



Get products and technologies

  • See the product images available on the IBM Smart Business Development and Test on the IBM Cloud.


About the authors

Alex Amies is a senior software engineer in the IBM GTS Development Lab in China. He is currently an architect working on the design of the IBM Smart Business Development and Test on the IBM Cloud. Previously, he acted as an architect and a developer on cloud and security products in other groups within IBM.

John Sanchez is an architect with the IBM Tivoli monitoring team. He has developed images for IBM Tivoli Monitoring on the IBM Smart Business Development and Test on the IBM Cloud.

In recent years, Dominique Vernier has focused on Java technologies and cloud architecture. Over a long career in information technology, he has gained broad knowledge of technologies and products such as messaging, databases, SOA, EAI, client/server, C/C++, and existing frameworks. Dominique also has extensive knowledge of industry areas such as telecom, CRM, logistics, and insurance. He is the author or co-author of four patents having to do with state engines and resource management. At present, Dominique is in charge of the Smart Business Development and Test on IBM Cloud solutions on the IBM GTS Global Team.

Xu Dong Zheng is a staff software engineer in the IBM GTS Development Lab in China working on IBM Smart Business Development and Test on the IBM Cloud, specializing in performance.

Posted by ShashiKiran2 on 22 February 2011
