Njmon

njmon is like nmon but saves its data in JSON format or sends it directly to InfluxDB

- for modern performance stats tooling

- covering AIX, VIOS and Linux

Briefly

njmon is like nmon. It is a performance and configuration data collector for AIX, VIOS and Linux.

  • njmon is written in the C language. It is a single file and takes a second or two to compile. Only "make" and a C compiler like GCC are required.
  • It collects a lot more performance and configuration data than nmon.
  • The output is in JSON or InfluxDB Line Protocol formats. Both are self documenting using name=value style.
  • JSON is a popular format that can be uploaded immediately to various performance stats time-series databases or sent directly to the "njmond" daemon on the network for insertion into InfluxDB.
  • The Python language can deal with JSON data very quickly. "njmond" is written in Python and multi-threaded.
  • The InfluxDB Line Protocol can be directly inserted into InfluxDB (without using njmond).
  • InfluxDB is recommended but other databases/repositories such as Prometheus, Elastic Stack (ELK) or Splunk can be used to receive, store, share and manage the statistics long term.
  • To visualise the statistics Grafana is recommended. It runs in your web browser and takes the data from InfluxDB (and other databases) for beautiful and powerful near real-time graphing.
  • If the binary is called "njmon" it creates JSON data. If the binary is called "nimon" it creates InfluxDB Line Protocol.
  • There is one version of njmon for AIX (and VIOS) and another version of njmon for any Linux on any hardware.
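
To make the difference between the two output formats concrete, here is a minimal Python sketch that renders the same name=value statistics as JSON and as InfluxDB Line Protocol. It is not njmon's own code (njmon is written in C). The measure name, tag names and values are made up for illustration, and the escaping is simplified.

```python
import json

def _escape(text, chars):
    """Backslash-escape the characters that are special in Line Protocol."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text

def _format_field(value):
    """Format a field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement,tag=value,... field=value,... [timestamp]."""
    line = _escape(measurement, ", ")
    for key, value in sorted(tags.items()):
        line += f",{_escape(key, ',= ')}={_escape(str(value), ',= ')}"
    line += " " + ",".join(
        f"{_escape(key, ',= ')}={_format_field(value)}" for key, value in fields.items()
    )
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line

if __name__ == "__main__":
    tags = {"host": "server1", "os": "AIX"}          # hypothetical tags
    fields = {"user": 12.5, "sys": 3.0, "ncpus": 8}  # hypothetical stats
    print(json.dumps({"tags": tags, "cpu_total": fields}))  # JSON style
    print(to_line_protocol("cpu_total", tags, fields))      # Line Protocol style
```

The JSON form keeps the statistics as nested name/value pairs, while the Line Protocol form flattens each measure into one text line that InfluxDB can ingest without a translation step.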

Supported hardware and operating systems

  • njmon and nimon for AIX (and VIOS) on Power Systems
    • Uses the AIX libperfstat to extract the performance stats and other UNIX system calls for things like filesystems and processes.
    • Roughly: 1400 stats for AIX.
    • If you are monitoring the Virtual I/O Server there are an additional 55 statistics. If running a VIOS Shared Storage Pool (SSP) there are an additional 35 statistics.
  • njmon and nimon for Linux (many hardware platforms (POWER, Intel, AMD, ARM) and OS distributions/versions)
    • The performance stats come mostly from /proc file system and UNIX like system calls.
    • Roughly: 800 stats - possible future stats include FC SAN stats.
    • Spectrum Scale (GPFS), GPU statistics & Process stats recently added.
    • Ideas welcome, if you know where the stats can be found on Linux.

In both cases the njmon program runs as a small daemon on each server, virtual machine or operating system and regularly (typically, every 30 seconds) sends the data to a central data repository for analysis and graphing.

Other njmon and nimon related tools

  • ninstall - Korn shell script that installs njmon to a suitable directory, creates a hard link to nimon, installs manual pages and sets permissions.
  • njmond - written in Python. This daemon provides a central repository for njmon JSON data. The data can be saved to .json files or immediately forwarded on to InfluxDB using the InfluxDB Python REST API.
  • nmeasure - written in C. This simple tool lets users add their own data to the njmon database and includes the same hostname, serial_no, MTM, HW and OS details, so it can be graphed the same way as njmon data. Typically, this would be run in a Ksh script and started regularly with cron.
  • njmonchart - written in Python to convert njmon JSON data to a web page of Google chart graphs (to be viewed in a web browser) or CSV files. Similar to the older nmon and nmonchart.
  • Grafana graphs - assumes you are using InfluxDB. Many sample graphs for njmon are available for download from grafana.com as a good starting point to develop your own graphs.

If you want to know more here is: Two hours of Training on njmon via YouTube by the developer:

  • Note these videos first cover njmon saving JSON via the njmond collector.
  • Video 9 covers nimon. nimon is recommended as it is simpler to set up. You may choose to skip video 3 if you decide to use nimon.
  • The data saved into InfluxDB is the same and graphed the same way.
  • njmon + InfluxDB + Grafana Series YouTube Playlist
    1. Introduction njmon - 15 minutes
    2. Installing InfluxDB & Grafana - 11 minutes
    3. Install njmon and Set-up - 22 minutes
    4. Using njmon Graph Dashboard / Templates - 18 minutes
    5. 1st njmon graphs - 9 minutes
    6. Creating a Template to switch server - 10 minutes
    7. Graphing Multiple Resources - 18 minutes
    8. njmonchart graphs via a .html file - 9 minutes
    9. njmond Install & Set-up
    10. nimon Intro & Setup (like njmon but the data goes straight to InfluxDB, no njmond)
    11. Tags, Measures and Statistics
    12. Adding Your Own Data
    13. Grafana 7.1 adding your First Graphs a Demo

Let me know what is not well explained at nigelargriffiths at hotmail DOT com

Latest Downloads - February 2024

Downloads are listed per release and then per platform: AIX and VIOS, Linux - ppc64 and AMD64, and Linux older release - Mainframe/Z/s390.

njmon version 83

  • AIX and VIOS: njmon_aix_v83beta.zip
    • Upgraded this release = no longer considered a beta.
    • Code and binary for AIX 7.1, 7.2 and 7.3. No further bugs reported.
    • Fixed Shared Processor Pools stats units (entitled_capacity_cores, max_capacity_cores), catching very rare Segmentation Violation in perfstat library, when LPM between P9 and P10 servers.
    • Now automatically restarts njmon with original parameters on LPAR Dynamic events like LPM, CPU/memory, and entitlement change. Dynamic toggle on/off debug output via SIGUSR1
    • njmon_aix_v83_manual_page
  • Linux - ppc64 and AMD64: njmon_linux_v83.zip
    • Code, makefile, ninstall (shell script installs code and manual pages), manual pages, example output files (.json and influxlp), bug fixes including /proc/diskstats new discard, flush stats. Many improvements to manual page, njmon -h output.
    • This time x86_64/AMD64 binary programs only. For Power CPU binary programs: please compile your own.
    • njmon_linux_v83_manual_page
    • njmon_linux_v83_online_help
  • Linux older release - Mainframe/Z/s390:
    • No access to RHEL or SLES on a mainframe also called Z.
    • If you can help me with access (Internet login/password with a C compiler) or compile it for me, please let me know.

njmon version 81

  • AIX and VIOS: njmon_aix_code_v81.zip, njmon_aix_binaries_v81.zip
  • Linux - ppc64 and AMD64: njmon_linux_code_v81.zip

njmon version 80 FINAL

  • AIX and VIOS: njmon_aix_code_v80.zip, njmon_aix_binaries_v80.zip
  • Linux - ppc64 and AMD64: njmon_linux_code_v80.zip, njmon_linux_binaries_v80.zip
  • Linux older release - Mainframe/Z/s390: njmon_s390_code_v70.zip, njmon_s390_binaries_v70.zip including "ninstall" and manual pages, RHEL 7, SLES 12, Ubuntu 18, Mainframe, Z, s390

njmond v80

  • Linux - ppc64 and AMD64: njmond Central Python Service njmond_linux_v80beta2.zip

nmeasure v80

  • AIX and VIOS: Add your own stats nmeasure_aix_v80beta2_influxdb2.zip
  • Linux - ppc64 and AMD64: Add your own stats nmeasure_linux_v80beta2_influxdb2.zip
  • Linux older release - Mainframe/Z/s390: Add your own stats nmeasure_linux_S390_v70.zip

Grafana Dashboards

  • How to get them (AIX and Linux):
    1. Go to grafana.com/dashboards then search for njmon, browse, & view sample graphs, or
    2. Import directly into your Grafana: click the left side menu "+" then Import, add the Dashboard ID & click Load
  • AIX and VIOS dashboard IDs:
    • 10891 = Very simple six graphs
    • 13701 = above plus config
    • 15043 = above plus njmon monitoring itself
    • 14509 = large numbers of graphs of many types
    • 11445 = AIX process monitoring
    • 16278 = examples of njmon v80 new stats
    • 11899 = experimental new Grafana graphs
    • 10832 = Monitoring a whole server
    • 10895 = lots of config values
    • 12573 = Carpet heat map
    • 13703 = NFS stats
  • Linux - ppc64 and AMD64 dashboard IDs:
    • 10844 = Very simple six graphs plus config
    • 15086 = Linux process monitoring
    • Sorry, seem to have lost the Linux Large Dashboard. Anyone got Linux dashboards they can share?
  • Linux older release - Mainframe/Z/s390: See Linux Dashboards

Listing the measures and stats

  • AIX and VIOS: nimon measure and stats
  • Linux - ppc64 and AMD64: see the AIX and VIOS entry

  • Older experimental Python3 injector using Splunk HTTP Event Collector (HEC) for JSON data njmon_splunk_injector_v70.py
  • Alternative 1 is using nimon -> telegraf with Splunk plugin -> Splunk
  • Alternative 2 if using njmon -> central njmond saving JSON files and Splunk collects the JSON files like it does for log files.
  • njmon_tools_v56.zip
    • njmon JSON output files are formatted with all the stats from data capture in a single long line of text.
    • Python script to convert njmon JSON to a pretty readable format for humans - line2pretty.py: $ line2pretty <nmon_output.json > readable.json
    • Plus the pretty2line.py to convert it the other way.
    • Please ignore the other contents of this .zip
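
The conversion these two scripts perform can be sketched with the Python standard library. This is an illustration of the idea only, not the code shipped in njmon_tools_v56.zip. The function names are hypothetical and it assumes one JSON record per input string.

```python
import json

def line_to_pretty(line):
    """Expand one single-line JSON record into an indented, readable form."""
    return json.dumps(json.loads(line), indent=2)

def pretty_to_line(text):
    """Collapse an indented JSON record back onto a single compact line."""
    return json.dumps(json.loads(text), separators=(",", ":"))

if __name__ == "__main__":
    record = '{"identity":{"hostname":"server1"},"cpu_total":{"user":12.5}}'
    pretty = line_to_pretty(record)
    print(pretty)
    print(pretty_to_line(pretty))
```

Because Python dictionaries preserve insertion order, a round trip through both functions keeps the statistics in their original order.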

Current development plans

  • COMPLETED: full hostname tags for users with multiple virtual machines with the same short hostname
  • COMPLETED: Multiple thread monitoring for njmond.py
  • nimon to use Secure Sockets to InfluxDB
  • Further testing of VIOS Virtual resource
  • COMPLETED Test Grafana working Alert thresholds and sending emails
  • Move to InfluxDB Version 2.<latest> and create new Videos on the setup.

njmonchart

Recent Release Notes

New Version 81 Linux Release - 11th April 2023

  • njmon for Linux gets the tag functions like the AIX version below i.e. a catch up for Linux.
  • New User defined additional tags for InfluxDB GroupBy variables and host searches.
    • Option example: -q loc=London,owner=NG,App=OracleRDBMS
  • New measure called "tags" with the original tags and user defined ones to allow these in Grafana Single-Stat panels.
    • Also makes showing hostname, os, mtm, architecture, serial_no simpler to find.
  • Changed Serial_no, MTM, Architecture to be dynamically updated for LPM changes.
  • Fixed njmon_mode so it outputs njmon-JSON or nimon-InfluxDB.
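
As an illustration of the user defined tags option, the following Python sketch splits a -q style argument into name/value pairs. It is an assumption about the format based only on the example above, not njmon's actual C parser, and the function name is hypothetical.

```python
def parse_user_tags(option_value):
    """Split 'name=value,name=value' into a dict, rejecting malformed pairs."""
    tags = {}
    for pair in option_value.split(","):
        name, separator, value = pair.partition("=")
        if not separator or not name or not value:
            raise ValueError(f"expected name=value but got {pair!r}")
        tags[name] = value
    return tags

if __name__ == "__main__":
    # The example option value from the release notes above
    print(parse_user_tags("loc=London,owner=NG,App=OracleRDBMS"))
```

Each resulting pair becomes an extra InfluxDB tag alongside the standard ones, which is what makes it usable in Grafana Group By clauses and host searches.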

New Version 81 AIX Release - 1st August 2022

  • New User defined additional tags for InfluxDB GroupBy variables and host searches.
    • Option example: -q loc=London,owner=NG,App=OracleRDBMS
  • New measure called "tags" with the original tags and user defined ones to allow these in Grafana Single-Stat panels.
    • Also makes showing hostname, os, mtm, architecture, serial_no simpler to find.
  • Added rPerf calculation for new Power10 models for Scale-out like S1024 or S1022 and E1050.
  • Changed Serial_no, MTM, Architecture to be dynamically updated for LPM changes.
  • Changed lsattr handling for older AIX versions without current output options.
  • Fixed njmon_mode so it outputs njmon-JSON or nimon-InfluxDB.

New Version 80 Linux Released - 23 May 2022

  • Changes
    • InfluxDB 2+ ready. Options: -x bucket -O organisation -T token
    • New stats for Load Averages from /proc/loadavg. Measure: loadavg stats: min1, min5, min15, runable, schedulable, last_started_pid
    • New stats for Swaps - swapping files stats from /proc/swaps. Measure swaps stats: filename, type, size, used, priority
    • New memory rates in events per second for /proc/vmstat for incrementing counters (all 39 of them as a number per second). Too many to name here. All end with _rate
    • Include manual pages and program help info: njmon -h or nimon -h
    • Better defaults for nimon = InfluxDB database name defaults to njmon and InfluxDB port to 8086 - so no need to specify these
    • Added local IP Address
    • Added -A hostname for manual setting of the end point hostname for people with duplicate hostnames
    • Debug output directed to stderr
    • -H nimon save full hostname (FQDN)
    • njmond.py v80 = central service to save JSON files or inject into InfluxDB. Fixes a problem in version 55 caused by removing the old secret password system.
  • Changes for disks
    • Current measure "disks" works as before.
    • Added to disk stats: discards and discards merge, busy, sectors, flushes, time - useful if using thin provisioning.
      • Using command lsblk to strip out the other junk (like disk partitions, paths and mapper) in /proc/diskstats
    • -D Every line in /proc/diskstats now in new measure "diskstats"
    • -M measure "filesystems" will use mount point names (like /, /tmp, /home ...). This is like AIX output.
      • The default uses file system devices - which is different to AIX.
    • -B adds btrfs stats. btrfs = better file system (good for SAP HANA)
    • Added disk add/remove automatically handled - NEEDS MORE FIELD TESTING

New Version 80 AIX FINAL Release - 13 May 2022

  • Full AIX Workload Manager = groups processes together into classes for monitoring and control via shares
    • See Redbook https://www.redbooks.ibm.com/redbooks/pdfs/sg245977.pdf
    • Reminder AIX WLM commands are : wlmcntrl, wlmstat, wlmassign
  • Added Disk Path stats and Disk Path status' - good checks for RAS and before VIOS updates
  • New "-A hostname" for the user to set the njmon hostname - useful for people using duplicate hostnames!
  • Five new stats for AIX & VIOS inside Measure: server = requested by nimon users
    • autorestart="true" Will the virtual machine reboot on a failure or server power-on
    • systemid="067804930" First 2 numbers = the IBM manufacturing site 06=Mexico
    • fwversion="FW950.11(VL950_075)" Firmware version of the server itself
    • ex_intr_virt="true" Using External Interrupt Virtualisation Engine (XIVE) - New in AIX 7.2 TL4 on POWER9 or higher
    • partition_uuid="6be25438-3ae3-449e-8725-900d7ac4492e" Unique id used by Cloud Management Console (CMC)
  • Added local njmon agent IP Address
  • Added disk add/remove automatically handled. Works for my limited systems but needs more field testing
  • AIX updated for Power10 rperf ratings
  • See Grafana graphs examples here https://grafana.com/grafana/dashboards/16278
  • WARNING: New InfluxDB 2+ option appears to work but still very new

WARNING: Use of InfluxDB 2+ is still in Beta Testing

  • It appears to be working for InfluxDB but not the Grafana Graphs
  • I still need to document the new/strange Influx system admin commands.
  • If you know InfluxDB2 go ahead but I may not help much if you get stuck.
  • To Do Check Grafana Dashboards work with InfluxDB 2.1 - I suspect a V1 compatibility package needs to be added.
  • To Do More detail on using InfluxDB 2.1 - it has been a struggle learning the new CLI and GUI.
  • COMPLETED nmeasure AIX & Linux support for InfluxDB 2.1 - nmeasure lets you directly add your own stats to the InfluxDB njmon bucket
  • COMPLETED njmond.py for InfluxDB 2.1 for centrally handling JSON format from njmon. Note: changed .conf options. Remove the previous version 80 beta 1 code and binaries
  • Hints below:
    • InfluxDB 2+ is of course for the nimon command only.
    • InfluxDB 2.0+ support, handled with two new options
      • -T token from InfluxDB 2.1 GUI (Mandatory and switch to InfluxDB 2 API mode). Security Token is 50 characters (mega password).
      • -O org InfluxDB 2 organises users, dashboards and buckets with organisation (example company or dept). If not set "default" is used
      • Note: InfluxDB 2 "databases" are now called "buckets" for each organisation
      • Browse assuming a browser on the local InfluxDB 2 VM: https://localhost:8086 - first time you create a user + passwd, org, bucket.
        • Find your security token "LoadData" -> API Token
        • To use the CLI you MUST create a config with YOUR name, URL, ORG and TOKEN and activate the config file.
        • influx config create --config-name nigel.conf --host-url http://localhost:8086 --org default --token <100-character-password> --active
      • Hints on creating a org and token on InfluxDB 2.1 using the "influx" command CLI (assumes you have InfluxDB 2.1 setup):
        • $ influx org list
        • $ influx auth list
    • Example nimon with InfluxDB 2.1: nimon -s30 -c 2880 -i influxIP -p 8086 -x njmon -O IBM -T <large-hexadecimal-key>
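
To show how the organisation, bucket and token fit together, here is a Python sketch that builds (but does not send) a write request for the InfluxDB 2 HTTP API. It is not how nimon itself is implemented. The host, organisation, bucket and token values are placeholders and the function name is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_write_request(host, port, org, bucket, token, lines, precision="s"):
    """Build a POST to /api/v2/write carrying Line Protocol text."""
    query = urlencode({"org": org, "bucket": bucket, "precision": precision})
    url = f"http://{host}:{port}/api/v2/write?{query}"
    body = "\n".join(lines).encode("utf-8")
    headers = {
        "Authorization": f"Token {token}",  # InfluxDB 2 token authentication
        "Content-Type": "text/plain; charset=utf-8",
    }
    return Request(url, data=body, headers=headers, method="POST")

if __name__ == "__main__":
    # Placeholder values mirroring the nimon example options -i, -p, -O, -x, -T
    request = build_write_request(
        "influxIP", 8086, "IBM", "njmon", "<token>",
        ["cpu_total,host=server1 user=12.5"],
    )
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would send it to a real InfluxDB 2 server
```

In the nimon example above, -s30 -c 2880 means 2880 snapshots at 30 second intervals, which is 86,400 seconds or one day of data.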

Release Notes for njmon Older Releases


Once you have InfluxDB & Grafana: Try nextract_plus for more data

  • This extracts Power Systems detail on Server, VIOS and LPAR levels from the HMC
  • See this article for details and downloading the Python3 code: Click Here
  • See the following YouTube video Series: Click Here
    1. nextract_plus: Output Grafana Graphs - actually many more graphs available now
    2. nextract_plus: Data Handling and Statistics
    3. nextract_plus: Install and Setup

What you will need to perform:

  • Assuming you will use InfluxDB for your time-series database and Grafana for creating your graph dashboards.
    • Other software is available like Elastic (use filebeats to monitor the .json files) and Splunk (not covered here).
  1. Install InfluxDB and create a njmon database (one command).
    • InfluxDB & Grafana are supported on AMD64 (x86_64) or Linux on POWER8 or POWER9 and other platforms.
  2. Install Grafana on the same machine or VM and connect it to InfluxDB (very simple).
  3. Install the small program njmon or nimon on each of your endpoint operating systems
    • i.e. AIX, VIOS or Linux (on any hardware).
  4. Two options:
    1. Use central daemon njmond.py
      • Install the njmond.py and set the variables for your environment: user/password, hostname, database name etc.
      • Set your njmon variables to contact the njmond.py to send it the data
      • Add the njmon program to crontab so it starts every day
    2. Use the direct to InfluxDB option called nimon
      • Set your nimon variables to contact InfluxDB to send it the data
      • Add the nimon program to crontab so it starts every day
  5. Download the Grafana njmon sample dashboards for a flying start monitoring your servers.
  6. Modify the Grafana dashboard or create your own to investigate your data.

Options for prototype or implementing:

  • All open-source = no costs (no permission!)
  • Recommended operating system for a simple life: RHEL7+ & Ubuntu18+
  • Own hardware (On-Prem)
    1. AIX on POWER8/9 based LPAR
    2. Linux on POWER8/9 (ppc64le) based LPAR
    3. Linux on AMD64 (x86_64) Server or VM
    4. Raspberry Pi 3 or 4 (Ubuntu 16)
  • Your personal laptop / workstation
    1. Linux = very possible and tested RHEL 7.7
    2. Windows 10 (works but ugh!)
    3. Apple Mac - no idea
    4. But when you power off or disconnect your laptop, you can't capture njmon stats
  • Cloud-Based
    1. Corlysis.com Works, very space limited - reduce snapshots to 15 minutes
    2. Influxdata.com Cloud 2.0 free option = limited rate & volume or from $150/month
    3. Grafana.com Cloud $50/month could use laptop
    4. Rent a Cloud Virtual Machine (POWER or AMD64), see "own hardware"

Why njmon?


Infrastructure Choices


njmon AIX v80 new statistics and graphs


Example: Simple Six Plus Dashboard of Graphs


Yet more of a whole POWER server and its VMs


More Graphs: the Large Dashboard for AIX


Alert Thresholds - to send you email messages about any problems


Architecture Diagrams


InfluxDB and Grafana are used as my primary tools for storing & graphing njmon data:

  • Open source, popular, focused on time-series stats (just like njmon) and recommended to me
  • Fast install and setup
    • I use Ubuntu as it takes about 5 to 10 minutes to install both
  • Simple Python data inserting
    • See below for bulk upload and online inserting new data
  • Direct access to the data for rapid high impact graphs and exploring the stats

Of course, any other tools that can accept JSON data can be used. Another popular toolset is elastic: Elasticsearch, Logstash and Kibana (ELK):

  • Open source and popular for people first wanting to save and analyse/graph log file data
  • From what I have read, logstash actually stores its data as JSON
  • Use elastic filebeat to load your njmon JSON
    • I found this surprisingly easy to set up and use
  • Use filebeat to monitor a file for extra data and send that too
  • As JSON is already structured it is easy to then graph with the integrated Kibana (or Grafana!)

If you are an "elastic" expert, please share your experience and hints & tips.


Prometheus can also be handled using the Influxdata Telegraf tool. This also handles the difference between the njmon push and Prometheus pull protocols.

The original njmon creates JSON data which is sent to the njmond.py central daemon for loading into InfluxDB


Later the nimon version short-circuits the JSON and the data goes straight to InfluxDB


Details of the sockets and commands for nimon


The nmeasure tool makes it simple to add your own data to the njmon by adding the required tags: hostname, serial number, architecture and operating system.


More Details

njmon endpoints: I copy the njmon executable to

  • AIX /usr/lbin/njmon
  • Linux /usr/local/bin
  • njmon is proving pretty stable and does not require daily restarting
  • Currently, I start njmon every hour with a -k option (no -c option), mainly for "belt and braces" safety net.
  • I also have the InfluxDB server with a DNS hostname alias of "influx"
  • So this works: 0 * * * * /usr/lbin/njmon -s 60 -k -i influx -p 8181
  • If using a secret: 0 * * * * NJMON_SECRET=abc123 /usr/lbin/njmon -s 60 -k -i influx -p 8181
  • Total one file and one crontab entry

On my central Linux InfluxDB I have:

  • User "nag"
  • Home directory /home/nag/njmon containing: njmond.py, njmon.conf and njmon2influx.py
  • Plus the reformatting tools line2pretty.py, pretty2line.py and njmonold2line.py
  • Plus a sub-directory of data - for the log file (njmon.log) and any .json files (if you are saving JSON data)
  • TOTAL 6 files plus a njmon startup mechanism. I use a hands-on nohup command - in development I am restarting often.
  • You could run njmond.py as a service - hints on how to do this would be useful, please.

Notes:

  • njmon endpoints now create a new socket for every packet of performance data (like a browser)
  • If the back end collector (now njmond.py) is stopped, it no longer crashes njmon
    • njmon ignores the failed socket request & sends the next packet of data at the next snapshot time
  • The collector & injector are merged into one Python program called njmond.py
    • njmond.py has a simple config file to make life simpler. It provides InfluxDB injection
    • njmond.py can also save the data to .json files. These are now one file per hostname
    • njmond.py is using queues and back-end threads to do the injection - not two processes per endpoint
  • Use logrotate to compress and remove older JSON data and for the log file njmond.log
    • The files get huge and the file system filled, at which point it's mayhem
  • If you want to add njmon data from a file or njmon via a pipe or njmon via ssh we use a new tool called njmon2influx
  • njmon is no longer using light encryption of an initial socket connection packet
    • The 1st packet data is extracted from each performance data JSON record
    • The identity JSON structure includes the "secret" - called "cookie"
    • Later njmon version might encrypt that to ensure the data origin & security - ideas welcome
    • If security is vital use the ssh method
    • You can reject the data containing an invalid "cookie" or just set "njmon_secret": ignore
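
The per-packet socket behaviour described in the notes above can be sketched in Python. This is an illustration rather than njmon's C implementation. The newline-terminated framing and the function name are assumptions.

```python
import json
import socket

def send_snapshot(host, port, record, timeout=5.0):
    """Open a new TCP socket, send one JSON record, then close the socket.

    Returns False instead of raising when the collector is unreachable, so
    the caller can carry on and try again at the next snapshot time.
    """
    payload = (json.dumps(record) + "\n").encode("utf-8")  # assumed framing
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(payload)
        return True
    except OSError:
        return False

if __name__ == "__main__":
    # Hypothetical record; "influx" and 8181 match the crontab example above
    ok = send_snapshot("influx", 8181, {"identity": {"hostname": "server1"}})
    print("sent" if ok else "collector not reachable, will retry next snapshot")
```

Opening a new connection per snapshot costs a little more than keeping one socket open, but it means a restarted collector is picked up automatically at the next snapshot.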

njmon modes of operation and which tools to use:

  1. njmon makes a local .json file (-f option), you take the files to the InfluxDB server & use njmon2influx.py to
    upload the stats in batch mode
  2. njmon used with ssh to get the data to the InfluxDB server. Use njmon2influx.py in stream mode (batch = 1)
  3. If your endpoint has Python3 you can do straight to InfluxDB injection - njmon | njmon2influx.py in stream mode (batch=1)
  4. Previously using the Collector + Injector? Now use njmond.py with inject = true
  5. Previously using the Collector to centralise .json files? Now use njmond.py with json = true
  6. For Elasticsearch, elastic, ELK use njmond.py to get the files to a central server (see above) and then use elastic
    "filebeats" to upload data
    1. Note there is an elastic option for njmon -e to save the JSON data in a slightly different way
  7. For Splunk (similar to elastic) but I need to rework the Splunk injector for one record per line format
    • I will only do that, if asked!

njmond.conf - or any file name you want

  • I am hoping
  • {
    "njmon_port": 8181,
    "njmon_secret": "ignore",
    "data_inject": true,
    "data_json": false,
    # Directory for njmond.log and any .json files
    "directory": "/home/nag/njmon/data",
    "influx_host": "localhost",
    "influx_port": 8086,
    "influx_user": "Nigel",
    "influx_password": "passw0rd",
    "influx_dbname": "njmon",
    # The number of Python threads
    "workers": 2,
    "debug": false,
    # next line only used by njmon2influx - Note: lines with 1st char # are ignored.
    # 1 = stream each packet as it arrives
    # 50 or 100 = batch record from a file to higher throughput insertion
    "batch": 50
    }
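
Because the config file above is JSON with extra # comment lines, a plain JSON parser rejects it as it stands. A minimal Python sketch of one way to read it follows. This is not the actual njmond.py code, the function name is hypothetical, and it assumes comment lines may be indented.

```python
import json

def load_conf(text):
    """Drop lines whose first non-blank character is '#', then parse as JSON."""
    kept = [
        line for line in text.splitlines()
        if not line.lstrip().startswith("#")
    ]
    return json.loads("\n".join(kept))

if __name__ == "__main__":
    sample = """{
    "njmon_port": 8181,
    # The number of Python threads
    "workers": 2,
    "debug": false
    }"""
    conf = load_conf(sample)
    print(conf["njmon_port"], conf["workers"], conf["debug"])
```

Only whole-line comments are handled. A # appearing after a value on the same line would still break the JSON parse.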
    

For Linux logrotate - I am successfully testing with /etc/logrotate.d/njmon:

  • /home/nag/njmon/data/njmond.log {
        daily
        rotate 8
        su nag nag
        compress
        delaycompress
    }
    
    /home/nag/njmon/data/*.json {
        daily
        rotate 8
        su nag nag
        compress
        delaycompress
    }
    

Migration to v50

  • If your network allows you two port numbers then you can migrate to v50 painlessly:
  1. Assuming you were using the Collector + Injector and you were using port 8080 and can setup port 8181
  2. Set up on your central InfluxDB a njmon directory like mine (above)
  3. Edit the njmond.conf for port number and set the InfluxDB username and password
  4. Start: nohup ./njmond.py njmond.conf &
  5. Then for each endpoint
    1. Stop njmon with ps -ef | grep njmon and kill the njmon processes
    2. Add the new version 50 njmon to local bin
    3. Change the crontab to use port 8181
    4. Restart njmon with your favourite options
  6. Once all endpoints are done you can stop the old collector & injector
    1. Remove the old code and tool files
  7. Set up logrotate

The njmond.log file is improved with lots of info per line and one line per snapshot arriving - so it gets big

  • Here is a sample

Key:

  1. Time and date
  2. Port - 8181 should only be one, if you use one njmond.py, but you can run more than one using different ports
  3. Hostname
  4. Operating system
  5. Hardware
  6. Serial Number (tricky on Linux)
  7. Endpoint njmon version

Sample njmon graphs via InfluxDB & Grafana


  • Light style first screen's worth
  • single stat settings of VP, Entitlement, GB Memory. Graphs: CPU Memory pie chart Network trans/recv, CPU GHz, Run Queue, Physical CPU consumed, Systems calls and process switches

  • Light style second screen
  • Graphs: memory details, paging, AIX Active Memory Expansion (currently switched off), Paging space disk use, Disk read/write, Disk Service Times, Network

  • Dark style
  • Server Level top left one line for each LPAR / VM
  • Other graphs are for LPARs / VMs CPU details
  • Right-hand graphs are for AIX LPARs
  • Left hand graphs for Linux VMs
  • VIOS at the bottom (just showing)

  • Dark style
  • Single AIX LPAR/VM in more detail
  • Graphs Logical CPU Utilisation
  • Pie charts for Memory Use and Physical CPU Use
  • More Graphs for CPU frequency (GHz), Run queue, Physical CPU consumed, system calls and process switches

Downloadable Grafana Graph Dashboard for njmon data


  • Download from Grafana Dashboard Website
    • If not taking this direct link, search for "njmon".
    1. Once downloaded Import the Template into your Grafana website via the left side "+" then click on "Create Import".
      1. You should get asked the InfluxDB data source database name (for example the recommended name njmon).
  • Three Templates
    1. njmon for AIX
      • Use njmon v31+. Needs plugin Pie Chart & Clock. Using many graph types as examples.
      • For variable numbers of resources like CPUs, disks, networks, file systems it uses Group By to get them dynamically named.
      • Time in UTC zone. Variable to select the hostname also selecting only AIX systems (AIX and Linux stats are very
        different so it is otherwise confusing to mix them on one template/dashboard).
    2. njmon for Linux
      • njmon data gathers for Linux on any platform: AMD64, POWER8/9, Raspberry Pi and any Linux regardless of the
        hardware platform. Tested on POWER8/9, AMD64/x86_64, ARM (Raspberry Pi)
    3. njmon Whole Server by Serial Number for AIX & VIOS
      • POWER Server LPARs by Serial Number and running AIX or VIOS. Basics stats CPU, Memory, Network.
      • A similar template could work for Linux
  • Screenshots - see the Grafana Dashboard webpage for large images
  • You can find 3 other excellent Dashboard Template examples from other people too.

Switch Hostname Dashboards AIX and Linux data


njmon for AIX and njmon for Linux can be in a single InfluxDB database

  • The main point of doing this was to separate the two data sets, so we get Template host lists which are AIX or
    Linux but not both.
  • I found a way to do this via the Grafana variable settings
  • Then use $AIXHOST in the graph query like: where host = $AIXHOST
  • or for Linux $LINUXHOST

Big New News for Power guys running Linux

The experts at Power DevOps have recompiled InfluxDB & Grafana for the POWER processor (ppc64le)

I have installed and am using these (*) now. Note: ppc64le (little-endian) only applies to POWER8 & POWER9


Links to more Information


JSON format

Details of the Stats collected

Frequently Asked Questions

njmon End Point and Back End Architecture Diagrams

  • njmon Architecture
  • njmon for an Intel free backend and Linux version supported
    • Early testing used InfluxDB and Grafana on Linux on Intel
    • To run InfluxDB & Grafana on Power see the above Power-DevOp downloads
    • Grafana on my Windows 10 Pro laptop/workstation - works OK

Can nmon benefit from InfluxDB and Grafana

Setting up InfluxDB and Grafana on Intel - initial setup

Setting up InfluxDB and Grafana on a Raspberry Pi

  • This is a low-cost way to try njmon, InfluxDB and Grafana - InfluxDB and Grafana on Raspberry Pi
  • At a conference, we ran 14 users at the same time and the CPU was not particularly busy.
  • I used a Raspberry Pi 3b - the first limit will be the memory size of 1 GB
  • Do not start X Windows (takes a lot of RAM) and especially not a browser (kills the Pi as it goes into heavy paging)

Command Syntax for njmon for AIX

Command Syntax for njmon for Linux

njmon Collector

  • njmon data agent can directly send the data to a njmon_collector for saving to file and/or injection into InfluxDB
  • njmon_collector

Debugging if njmon crashes

njmon Architecture alternatives to InfluxDB

What is single level and multiple level JSON?

Old News: Running InfluxDB on Linux on POWER (see Power-DevOp code above)


- - - The End - - -