Physical Memory Usage on Linux/NET-SNMP


rajib
01-24-2009, 01:16 AM
As many of our users have noticed, the "Physical Memory Usage" test on Linux (monitored using the NET-SNMP agent) often returns a high value such as 98%. This is because Linux uses otherwise idle memory as I/O cache and releases that cached memory when an application needs it. While this improves I/O performance (ever wonder why the second 'find' command is much faster than the first one? :-), it can create confusion and false alarms in Traverse.

Fortunately, you can use the Composite Monitor to work around this issue. Here are the steps to create a new test that reflects the true memory utilization:

Step 1: Rename Existing Memory Usage Test

- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Locate the "Physical Memory Usage" test and click on the modify icon
- Change the test name to "Total Memory Usage"
- Change warning and critical thresholds to 100
- Take note of the "Maximum Value" parameter; we will need this for the next step
- Click on "Submit"

Step 2: Monitor Cached Memory Usage

- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Click on "Create New Advanced Tests"
- Enable "Advanced SNMP Test" and set the following parameters:

Test Name : I/O Cache Memory Usage
Warning Threshold : 100
Critical Threshold : 100
SNMP Object ID : .1.3.6.1.4.1.2021.4.15.0
Maximum Value : (maximum value from previous step)
Post Processing Directive : Percent
Test Units : %
As test value rises, severity: Ascends
Monitor Instance: Use Existing

- Click on "Provision Tests"
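
To see what this test ends up reporting, here is a minimal Python sketch of the arithmetic. It assumes the "Percent" directive divides the polled value by the "Maximum Value" and multiplies by 100, and that the "Maximum Value" from step 1 is the total RAM in the same unit as the polled value (kB); neither assumption is confirmed here. The OID above is memCached from UCD-SNMP-MIB, and the sample figures are made up.

```python
def percent_of_max(raw_value_kb, maximum_kb):
    """Express a raw kB reading as a percentage of the configured maximum."""
    if maximum_kb <= 0:
        raise ValueError("maximum must be positive")
    return 100.0 * raw_value_kb / maximum_kb


# Hypothetical host: 4 GB of RAM, 2.5 GB of it currently holding I/O cache.
total_ram_kb = 4 * 1024 * 1024   # stands in for the "Maximum Value" from step 1
cached_kb = 2621440              # sample memCached reading (2.5 GB in kB)

cache_pct = percent_of_max(cached_kb, total_ram_kb)
print(f"I/O Cache Memory Usage: {cache_pct:.1f}%")
```

If the two tests do not share the same maximum, the percentages are not comparable and the subtraction in step 3 will be off, which is why step 1 asks you to note the value.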

Step 3: Calculate Real Memory Utilization

- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Click on "Create New Advanced Tests"
- Enable "Composite Test" and set the following parameters:

Test Name : Physical Memory Usage
Warning Threshold : 85
Critical Threshold : 95

- Click on "Add" (Child Tests)
- From the pop-up, select "I/O Cache Memory Usage" and "Total Memory Usage"
- Click on "Add Tests" (pop-up closes)
- In the "Expression" field, enter "T1 - T2" (without the quotes), where T1 is "Total Memory Usage" and T2 is "I/O Cache Memory Usage"
- Click on "Provision Tests"

Now navigate to Status -> Devices and drill down into the device in question. Within a few minutes, you should see the correct memory utilization (without the portion used by I/O cache) reflected in the (composite) "Physical Memory Usage" test.
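
As a rough illustration of why the composite test stops the false alarms, the sketch below evaluates "T1 - T2" and checks the result against the 85/95 thresholds from step 3. The sample percentages are invented, and treating a value equal to a threshold as crossing it is an assumption, not documented Traverse behavior.

```python
def composite_usage(total_pct, cache_pct):
    """Evaluate the composite expression T1 - T2."""
    return total_pct - cache_pct


def severity(value, warning=85.0, critical=95.0):
    """Classify an ascending-severity test value against its thresholds."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"


t1 = 98.0   # Total Memory Usage (%), the misleading raw figure
t2 = 62.5   # I/O Cache Memory Usage (%)

real = composite_usage(t1, t2)
print(f"raw:  {t1:.1f}% -> {severity(t1)}")
print(f"real: {real:.1f}% -> {severity(real)}")
```

With these numbers the raw test would sit in the critical range while the composite test stays well below the warning threshold.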

mgh4
04-06-2009, 07:33 PM
Great, we were just setting thresholds high and living with it. This is much better.

It would be nice to include the .1.3.6.1.4.1.2021.4.15.0 in the ucd xml

danwoods
07-28-2009, 01:58 PM
Dunno how useful this is, but it's a step saver for sure if anybody is interested... I put together a short little perl plugin that will do the same thing that Rajib describes here... Been using it for a couple of days and it seems to work fine.

API Syntax:

test.update "deviceName=<device_name>" "testName=Real Memory Usage","testType=realmemuse","subType=real_memoryuse","warningThreshold=85","criticalThreshold=95"

The default polling interval is 10 minutes, but if you are using the API to update multiple devices, then you can add the "interval=300" parameter if you want 5 minutes, for example...

-dw

lauraj
10-16-2010, 12:14 AM
Some customers have pointed out that although the above solution counts the I/O cache as free memory, it does not do the same for buffered memory. If you would like to incorporate that into the test, two further steps are needed. First, we will create an advanced SNMP test called "Buffer Memory Usage"; second, we will modify the composite test to include this new child test in the expression.

1. Add a new advanced SNMP test called "Buffer Memory Usage".

- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Click on "Create New Advanced Tests"
- Enable "Advanced SNMP Test" and set the following parameters:

Test Name : Buffer Memory Usage
Warning Threshold : 100
Critical Threshold : 100
SNMP Object ID : .1.3.6.1.4.1.2021.4.14.0
Maximum Value : <maximum value from I/O Cache Memory Usage>
Post Processing Directive : Percent
Test Units : %
As test value rises, severity: Ascends
Monitor Instance: Use Existing

2. Modify the "Physical Memory Usage" composite test created earlier.

- Navigate to Administration -> Devices
- Locate the device in question and click on "Tests"
- Find the "Physical Memory Usage" test in the list and click on the modify icon
- Click on "Add" (Child Tests)
- From the pop-up, select "Buffer Memory Usage"
- Click on "Add Tests" (pop-up closes)
- In the "Expression" field, change "T1 - T2" to "T1 - T2 - T3", where T1 = "Total Memory Usage", T2 = "I/O Cache Memory Usage" and T3 = "Buffer Memory Usage"
- Click on "Provision Tests"

The "Physical Memory Usage" test will now reflect the memory utilization without the portion held in buffers or I/O cache, and will more closely match what the "free" command in Linux reports on the "-/+ buffers/cache" line.
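
If you want to sanity-check the composite test against the host itself, the following Python sketch performs the same subtraction on /proc/meminfo-style text. The sample values are invented and the helper names are only for illustration; on a live Linux host you could pass in the contents of /proc/meminfo instead of the sample string.

```python
SAMPLE_MEMINFO = """\
MemTotal:        4194304 kB
MemFree:           83886 kB
Buffers:          209715 kB
Cached:          2621440 kB
"""


def parse_meminfo(text):
    """Return a dict of field name -> numeric value from meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def usage_percentages(info):
    """Return (raw usage %, usage % net of buffers and cache)."""
    total = info["MemTotal"]
    used = total - info["MemFree"]
    real_used = used - info["Buffers"] - info["Cached"]
    return 100.0 * used / total, 100.0 * real_used / total


raw_pct, real_pct = usage_percentages(parse_meminfo(SAMPLE_MEMINFO))
print(f"raw usage:  {raw_pct:.1f}%")
print(f"real usage: {real_pct:.1f}%")
```

The second figure corresponds to T1 - T2 - T3 in the composite test; small differences from the Traverse value are to be expected because the SNMP polls and the local read happen at different times.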