dstdev/ib_tools

ib_tools

A small collection of shell utilities to parse and report InfiniBand (IB) / Mellanox diagnostics generated by ibdiagnet and to query local system IB information.

This repository contains scripts used to extract host and firmware information, look up GUIDs in ibdiagnet outputs, find duplicate node entries, and detect firmware/driver mismatches.

Quick Start

  1. Ensure you have the required dependencies installed (see Requirements below).
  2. Clone or download this repository to a management host or the host you are auditing.
  3. Make the scripts executable and (optionally) move them into your PATH.
  4. Run IB_INFO.sh on each host, either from a local copy or from shared storage, to collect local system and Mellanox card information. For example: clush -a /shared/path/to/IB_INFO.sh >> ib_info_output.csv
  5. Collect ibdiagnet output files (usually under /var/tmp/ibdiagnet2/) from a host with access to the IB fabric if needed.
  6. Run: ib_fw_warn.sh
  7. Run: find_dups.sh
  8. Run: find_ib_fw_mofed_mismatch.sh on a host with access to ibdiagnet output files (usually under /var/tmp/ibdiagnet2/).

Contents

  • IB_INFO.sh — Collects local system information about InfiniBand interfaces and Mellanox cards and emits one CSV line per unique card: Hostname, Serial, Model, OS, Kernel, Card Model, Driver Type, Driver Version, Firmware Version.
  • ib_fw_warn.sh — Parses an ibdiagnet2.db_csv file and extracts the WARNINGS_FW_CHECK section, replacing GUIDs with node names (from the NODES section). Writes fw_check_output.csv by default.
  • find_dups.sh — Scans an ibdiagnet2.db_csv file for duplicate node entries and produces output.csv, duplicates.csv, and duplicate_counts.csv.
  • find_ib_fw_mofed_mismatch.sh — Finds hosts reported with firmware mismatches relative to what the switch MOFED expects, using ibdiagnet output.

Requirements

These are bash scripts that expect a sysadmin environment with the following utilities available (most are standard on the RHEL/CentOS variants used with Mellanox hardware):

  • bash
  • hostname, ip, awk, sed, grep, sort, cut, xargs
  • dmidecode (to get chassis/serial/model)
  • lspci (to find Mellanox cards)
  • ethtool (to query firmware/driver for interfaces)
  • lsb_release or /etc/os-release (for OS detection)
  • rpm and ofed_info (to detect installed IB driver stacks)
  • ibdiagnet (for scripts that parse its output; some scripts call ibdiagnet)

Most scripts must be run as root or with sufficient privileges, because some of the tools they call (for example dmidecode) require them.

If a required tool is missing, the script will usually print an error and exit.
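
The kind of dependency check described above can be sketched as follows. This is a minimal illustration, not the code used by the scripts in this repository; the function name check_deps is hypothetical.

```shell
# Hypothetical helper (not part of this repository): print each command
# that is not found on PATH and return non-zero if any are missing.
check_deps() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check some of the tools listed above before doing any work.
if ! check_deps awk sed grep sort cut xargs; then
    echo "install the missing tools and retry" >&2
fi
```

Extend the argument list with dmidecode, lspci, ethtool and the other tools a given script needs.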

Installation

  1. Clone or copy the repository to a management host or the host you are auditing:

    git clone

  2. Make the scripts executable and (optionally) move them into your PATH:

    chmod +x IB_INFO.sh ib_fw_warn.sh find_dups.sh find_ib_fw_mofed_mismatch.sh
    sudo mv IB_INFO.sh ib_fw_warn.sh find_dups.sh find_ib_fw_mofed_mismatch.sh /usr/local/sbin/

  3. Ensure ibdiagnet output files are available under /var/tmp/ibdiagnet2/ when running the scripts that parse ibdiagnet CSV output, or use the script flags (where available) to point to the correct files.

Usage

Below are short examples of how to run each script. Use -h or --help for more details.

IB_INFO.sh

Purpose: gather local system and Mellanox card details and emit CSV lines for inventory.

Example:

sudo ./IB_INFO.sh

Output: CSV lines with the following fields:

Hostname,Serial Number,Model,OS,Kernel,Mellanox Card Model,Driver Type,Driver Version,Firmware Version

Notes:

  • Requires an active InfiniBand interface (script looks for ib* links that are UP).
  • Requires dmidecode, lspci, ethtool and ofed_info/rpm to be present to collect all fields.
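
To check whether a host meets the first requirement before running the script, something like the following sketch can help. It assumes the one-line-per-interface format printed by ip -o link show; the function name list_up_ib_links and the sample data are hypothetical, and IB_INFO.sh may detect links differently.

```shell
# Hypothetical helper: read `ip -o link show` output on stdin and print
# the names of ib* interfaces whose operational state is UP.
list_up_ib_links() {
    awk '$2 ~ /^ib/ && / state UP / { sub(/:$/, "", $2); print $2 }'
}

# Made-up sample in the format of `ip -o link show`.
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
4: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP mode DEFAULT group default qlen 256
5: ib1: <BROADCAST,MULTICAST> mtu 4092 qdisc noop state DOWN mode DEFAULT group default qlen 256'

printf '%s\n' "$sample" | list_up_ib_links   # prints: ib0
```

On a real host, pipe the live output instead: ip -o link show | list_up_ib_links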

ib_fw_warn.sh

Purpose: parse ibdiagnet2.db_csv, extract the WARNINGS_FW_CHECK section and replace GUIDs with node names from the NODES section; writes fw_check_output.csv by default.

Example (default paths):

./ib_fw_warn.sh

Example (with custom files):

./ib_fw_warn.sh -f /path/to/ibdiagnet2.db_csv -l /path/to/ibdiagnet2.db_csv -o /tmp/fw_check_output.csv

Output: a two-column CSV (column_2,column_6), where column_2 is the node name and column_6 is the summary.
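
The GUID-to-name substitution can be illustrated with a two-file awk lookup. This is a sketch under stated assumptions, not the script's actual code: it assumes a "guid,name" lookup file already derived from the NODES section, and warning rows that carry the GUID in column 2 and the summary in column 6. The function name guid_to_name, the GUIDs, and the row layout are made up for the example.

```shell
# Hypothetical helper: $1 is a "guid,name" lookup file, $2 holds warning rows.
# Prints "name,summary", falling back to the raw GUID when no name is known.
guid_to_name() {
    awk -F, '
        NR == FNR { name[$1] = $2; next }      # first file: build the lookup
        { id = ($2 in name) ? name[$2] : $2    # second file: swap GUID for name
          print id "," $6 }
    ' "$1" "$2"
}

# Made-up sample data written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' '0x0002c90300a1b2c0,node01' > "$tmpdir/guid_map.csv"
printf '%s\n' \
    'NODE,0x0002c90300a1b2c0,0,0,FW_CHECK,firmware older than latest' \
    'NODE,0x0002c90300ffffff,0,0,FW_CHECK,firmware older than latest' \
    > "$tmpdir/warnings.csv"

guid_to_name "$tmpdir/guid_map.csv" "$tmpdir/warnings.csv"
# prints:
#   node01,firmware older than latest
#   0x0002c90300ffffff,firmware older than latest
rm -rf "$tmpdir"
```

Keeping the raw GUID when no name is found makes unmatched entries easy to spot in the output.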

find_dups.sh

Purpose: find duplicate node entries inside an ibdiagnet2.db_csv file and build three CSV files with unique hostnames, duplicates, and duplicate counts.

Example:

./find_dups.sh -f /var/tmp/ibdiagnet2/ibdiagnet2.db_csv

Outputs:

  • output.csv — unique hostnames and vendor card IDs
  • duplicates.csv — duplicate hostnames and their unique IDs
  • duplicate_counts.csv — hostname, duplicate count, and unique IDs
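
The idea behind the duplicate count can be sketched with awk. This is an illustration only, assuming input rows of the form "hostname,id"; the function name count_dups and the sample rows are hypothetical, and find_dups.sh may work differently.

```shell
# Hypothetical helper: read "hostname,id" rows on stdin and print
# "hostname,count,id1;id2;..." for every hostname seen more than once.
count_dups() {
    awk -F, '
        { count[$1]++
          ids[$1] = ($1 in ids) ? ids[$1] ";" $2 : $2 }
        END { for (h in count) if (count[h] > 1) print h "," count[h] "," ids[h] }
    ' | sort
}

# Made-up sample rows: node01 appears twice with different IDs.
printf '%s\n' \
    'node01,0x0002c90300a1b2c0' \
    'node02,0x0002c90300a1b2d0' \
    'node01,0x0002c90300a1b2c8' | count_dups
# prints: node01,2,0x0002c90300a1b2c0;0x0002c90300a1b2c8
```

The sketch lists IDs in input order and does not deduplicate them; the final sort only makes the row order stable.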

find_ib_fw_mofed_mismatch.sh

Purpose: identify hosts with firmware versions that don't match the switch MOFED expectations using ibdiagnet output.

Example:

./find_ib_fw_mofed_mismatch.sh

By default the script prints hostnames with mismatched firmware. Use -o <file> to write results to a file.

Helpful bash commands

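List the unique first-column values of the NODES section (the lines between START_NODES and END_NODES) in ibdiagnet2.db_csv, with double quotes removed:

sed '/^START_NODES$/,/^END_NODES$/!d;//d' ibdiagnet2.db_csv | awk -F, '{print $1}' | sed 's/\"//g' | sort -u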
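
The section-extraction step of the sed command above can also be written as a small awk function that works for any START_<name>/END_<name> pair. This is a sketch; the function name extract_section and the sample file contents are hypothetical.

```shell
# Hypothetical helper: print the lines strictly between START_<name>
# and END_<name> in the file given as $2.
extract_section() {
    awk -v s="START_$1" -v e="END_$1" '
        $0 == e { inside = 0 }   # stop before printing the end marker
        inside  { print }
        $0 == s { inside = 1 }   # start after the start marker
    ' "$2"
}

# Made-up sample file with two sections.
tmpfile=$(mktemp)
printf '%s\n' \
    'START_NODES' '"node01",0x1' '"node02",0x2' 'END_NODES' \
    'START_OTHER' 'x,y' 'END_OTHER' > "$tmpfile"

extract_section NODES "$tmpfile"
# prints:
#   "node01",0x1
#   "node02",0x2
rm -f "$tmpfile"
```

The output can be piped through the same awk, sed and sort stages as the one-liner.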

Troubleshooting

  • Permission errors: run scripts with sudo or as root when they call dmidecode, ethtool, lspci or when reading /var/tmp/ibdiagnet2 output files.
  • Missing commands: install the missing packages (for example ethtool, pciutils, dmidecode).
  • No Mellanox devices found: verify the host actually has Mellanox adapters and that the IB link is up (ip link) — some scripts bail out if no ib interface is present.

Contributing

Small improvements and fixes are welcome. Suggested workflow:

  1. Open an issue describing the problem or enhancement.
  2. Send a PR with a focused change and a short description of why it helps.

If you add new scripts, include a short usage example and update this README.

License

No license is provided in this repository. If you intend to share or publish these scripts, please add a LICENSE file (for example MIT or Apache 2.0) to make redistribution terms explicit.

Contact

If you need help adapting the scripts to your environment, open an issue or contact the repository owner.
