Very Old nmon and Very Old AIX Questions
OLD Question 1: I have a problem with nmon running on AIX 4.0.3 (or any really old AIX version)?
- Hard luck
- I will actively help get AIX 5 bugs fixed but older versions are much less interesting.
- In particular, on AIX 4.1.5 the TOP processes display does not work, but I am not going to fix it unless someone offers me hard currency.
OLD Question 2: Can I get the adapter stats from other tools?
- Not in AIX 4 - there are no adapter stats in this AIX.
- This is available in AIX 5 and higher via the libperfstat library, so programmers can get this information. A warning though: this is data derived from the connected disks (NOT tape drives) because there are no adapter stats as such.
OLD Question 3: When I start nmon 9 on a system where it used to run fine, I now get an error message?
- The error mentions "lslpp" (on AIX 5.1 from about ML03 onwards), or the WLM stats go missing after upgrading to AIX 5.2 ML5. Can you fix nmon?
- These are bugs in AIX and not nmon - there are fixes available.
- Please report these problems to your AIX support channel and not to me. nmon 10 has also been backported to AIX 5.1 and AIX 5.2, has code to work around these bugs and can be used instead of nmon9a.
OLD Question 4: Can you add the monitoring of process priority?
- Available from AIX 5.1 onwards
- Always available in nmon for Linux
OLD Question 5: nmon on AIX, nmon 9 does not run, please fix?
- With reports like:
- read error: No such device or address
- nmon file=nmon.c line=1278 version=XXX
- 95% of the time it is because AIX was upgraded or a maintenance level was added but the system was not rebooted. It is very easy to miss the "You must reboot" message in the gallons of installp output. The reboot is required because the AIX kernel image has been updated and a reboot is the only way to activate the new /unix file. nmon reads the /unix file to find kernel data structure addresses, but if the /unix file does not match what is actually running, you get this message.
- You can also get really weird effects if you have messed up LIBPATH.
OLD Question 6: Old nmon version question: nmon and AIX commands do not agree?
- A lot of this happens with nmon 10 and Shared Processor Logical Partitions (SPLPAR) - what marketing calls Micro-partitions.
- Some of it is because the AIX commands are very unclear about what they are reporting.
- What used to be plain CPU numbers can now be Physical CPU, Logical CPU or Virtual CPU numbers, and the documentation does not say which.
- So you may not be comparing "like with like". This has been improved in nmon 11 - please report further issues from nmon 11 onwards.
- Also see Question 26.
OLD Question 7: Old nmon for AIX: Adapter stats seem to be missing and IOADAPT is not saved to the nmon file with AIX 5.1?
- Correct, this data is not available on AIX 5.1 from the libperfstat library.
- This also causes a problem on nmon2rrd version 10 where it expects the IOADAPT section and crashes.
- Recommended action: upgrade AIX, as 5.1 is not supported without purchasing extended support.
OLD Question 8: nmon for AIX will not start on AIX 5.1 due to a libperfstat error?
- The error is something like:
- exec(): 0509-036 Cannot load program <nmon binary file here> because of the following errors:
- 0509-150 Dependent module libperfstat.a(shr.o) could not be loaded.
- 0509-022 Cannot load module libperfstat.a(shr.o).
- 0509-026 System error: A file or directory in the path name does not exist.
- You need to install the libperfstat library from the AIX CDROMs.
- It is in the bos.perf.libperfstat package.
- I hope you realise that AIX 5.1 is not normally supported as it is so old.
OLD Question 9: Old nmon version: AIX 5.3 updated but then nmon gives "Illegal instruction(coredump)"
- This has been reported shortly after an upgrade to a higher AIX 5.3 ML (like ML5 or ML6) and a reboot.
- After a lot of research and experiments the following was found by a persistent nmon user called Xi Chen.
- The problem seems to be nmon jumping to a library like libperfstat and the jump vectors are not right so the library/system call jumps to address zero and attempts to execute instruction zero (invalid, of course).
- This is a bug in AIX and its update process where the libperfstat kernel package does not match the library.
- Try the following command: # lslpp -L | grep -i perfstat
- You may get something like:
# lslpp -L | grep -i perfstat
bos.perf.libperfstat 18.104.22.168 C F Performance Statistics Library
bos.perf.perfstat 22.214.171.124 C F Performance Statistics
- Update the package bos.perf.libperfstat to the same (126.96.36.199) or at least much closer levels (like 188.8.131.52 and 184.108.40.206) as bos.perf.perfstat. Preferably, the latest available levels.
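The level check above can be scripted. This is only a minimal sketch and not part of nmon: the function name check_perfstat_levels is made up for illustration, and it assumes the lslpp -L lines have the fileset name in the first column and the level in the second, as in the listing above.

```shell
# Hypothetical helper (not part of nmon or AIX): compare the levels of the
# two perfstat filesets.
# Reads "lslpp -L" style lines on stdin, assuming field 1 is the fileset
# name and field 2 is the level.
# Prints "match LEVEL", "mismatch LIBLEVEL KERNLEVEL" or "missing".
check_perfstat_levels() {
    awk '
        $1 == "bos.perf.libperfstat" { lib = $2 }
        $1 == "bos.perf.perfstat"    { kern = $2 }
        END {
            if (lib == "" || kern == "") { print "missing"; exit 2 }
            # Appending "" forces a string comparison of the two levels.
            if ((lib "") == (kern ""))   { print "match " lib; exit 0 }
            print "mismatch " lib " " kern
            exit 1
        }'
}
```

On an AIX system it would be used as: lslpp -L | check_perfstat_levels. A "mismatch" answer only says the two levels differ; it does not tell you which fix to apply.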
OLD Question 10: Old AIX version: AIX 5.3 updated but then nmon gives "Assert Failure"
- This has been reported shortly after an upgrade - some machines have this problem while others don't.
- There does not seem to be a pattern. There has been a lot of investigation of this issue with tools being written but it is still a mystery.
- The libperfstat library is claiming that an invalid parameter has been passed but tools have shown this is not true.
- The three parameters are a pointer to memory (just malloc'ed in the code), the number of adapters (just returned by the previous call to libperfstat) and the size of the diskadapter structure (which has never changed). The output looks like this:
ERROR: Assert Failure in file="nmon11.c" in function="main" at line=3300
ERROR: Reason=System call returned -1
ERROR: Expression=[[perfstat_diskadapter((perfstat_id_t * )FIRST_DISKADAPTER, p->adapt, sizeof(perfstat_diskadapter_t), adapters)]]
ERROR: errno means : Invalid argument
- It has since been found that a reboot fixes most of these Assert Failures. We don't fully understand this but it may be adapters in funny states, kernel modules that need to be reloaded or libperfstat in a twist - one thing we do know - it's not nmon! If you hit this problem:
- Check the software levels, see Question 53
- Do you only think that you rebooted after the upgrade, or do you know for absolutely sure?
- Try: export NMON_IGNORE_ASSERT=1 and then start nmon from this same ksh. This may work around the problem as nmon bravely tries to carry on even with library errors.
- Try the latest beta version of nmon (if it supports your AIX level).
- I know rebooting can be a problem with production systems but it fixes this the vast majority of the time.
- If it is still a problem, let us know via the usual AIX Performance Tools Forum.
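The NMON_IGNORE_ASSERT workaround can also be applied to a single command without exporting the variable in your login shell. A minimal sketch: the wrapper name run_ignoring_asserts is made up for illustration, and only the variable name comes from nmon.

```shell
# Hypothetical wrapper (the name is an assumption, not an nmon feature):
# run one command with NMON_IGNORE_ASSERT=1 in its environment.
# env sets the variable for the child process only, so the calling shell
# is left unchanged.
run_ignoring_asserts() {
    env NMON_IGNORE_ASSERT=1 "$@"
}
```

It would be used as: run_ignoring_asserts nmon. This has the same effect as the export above but only for that one run.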
OLD Question 11: Old AIX version: On AIX 5.3 ML6, nmon output files contain zeros, missing CPU stats, corrupt ZZZ lines and "nfs" strings found in the stats?
- This is yet another bug in the AIX libperfstat library at ML6. The NFS data returned to nmon is corrupt and these characters may be output directly from the library (very bad form chaps!). The workaround is one of:
- Do not include NFS statistics (remove the -N)
- Move to nmon12 that codes around these bugs.
OLD Question 12: Old AIX version: Why is the Process memory percentage zero? (same for System and User percent)
- This seems to happen in AIX 5.3 TL07 or thereabouts. In fact, it is the AIX libperfstat library, which nmon uses, that has a bug in it that returns a large negative number for the Process% value. The Process, System and User percentages are approximations (remember memory has many modes, types and uses and some overlap) and the calculation goes wrong.
- nmon reports this problem by showing 0% - which is clearly impossible.
- The bug was very hard to reproduce and track down because the problem only happens in particular circumstances and with changes in memory use (like starting and stopping large memory applications). I am pretty sure there is a good chance of the number being fixed if you reboot the machine/LPAR, at least for some time, but the problem may reappear.
- The fix is to update AIX to AIX 5.3 TL09 (or even better AIX 6) but there may be a PTF or efix. You will have to ask AIX Support for a fix to the libperfstat library that corrects the real_system, real_process and real_user members of the perfstat_memory_total_t structure. That will give them the right details to search for in the Retain database. Do not ask for nmon classic support as the answer could be short and/or rude!
- In my experience AIX systems administrators don't like adding these updates to a production machine. So it may be better to just accept that if any of these numbers are zero then do not use any of these percentages.
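The "if any are zero, trust none of them" rule is easy to apply in a script that post-processes nmon numbers. A minimal sketch: the function name mem_percentages_usable is made up for illustration, and it assumes you have already extracted the Process, System and User percentages yourself.

```shell
# Hypothetical helper (not part of nmon): decide whether the Process%,
# System% and User% memory figures can be trusted.
# Following the advice above, if any one of them is zero, all of them are
# treated as unusable. Returns 0 if usable, 1 if not.
mem_percentages_usable() {
    for pct in "$@"; do
        # awk does the numeric test so values like "0.0" also count as zero.
        # Non-numeric input evaluates to 0 and is therefore rejected too.
        if awk -v v="$pct" 'BEGIN { exit (v + 0 == 0) ? 0 : 1 }'; then
            return 1
        fi
    done
    return 0
}
```

For example, mem_percentages_usable 12.5 0.0 40.2 returns non-zero, so a script would skip all three values for that snapshot.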