Category Archives: HPUX

How to change process priority in Linux or Unix

Learn how to change process priority in Linux or Unix using the renice command, and understand how to alter the priority of running processes.

Every process has a scheduling priority which decides how soon it gets served by the kernel. If you have a loaded system and need some processes to be served before others, you need to change process priority. This is also called renicing, since we use the renice command to change process priority.

Nice values range from -20 to 19 on Linux (some Unix flavors go up to 20), with 0 being the default. A process with a low nice value gets served before a process with a high nice value, so if you want a particular process to get served first, you need to lower its nice value. Only administrators can assign negative nice values (down to -20) to speed processes up further. Let’s see how to change process priority:

There are 3 ways to select a target for the renice command: one can submit a process ID (PID), a user ID (UID), or a process group ID. Normally we use PID and UID in the real world, so we will look at these options. The new priority (nice value) is defined with the -n option. The current nice value can be viewed in the NI column of the top command, or it can be checked using the command below:

# ps -eo pid,user,nice,command | grep 30411
30411 root       0 top
31567 root       0 grep 30411

In the above example, the nice value is set to 0 for the given PID.

Renice by process ID :

A process ID can be submitted to renice using the -p option. In the example below, we set the nice value to 2 for PID 30411.

# renice -n 2 -p 30411
30411: old priority 0, new priority 2
# ps -eo pid,user,nice,command | grep 30411
  747 root       0 grep 30411
30411 root       2 top

The renice command itself shows the old and new nice values in its output. We also verified the new nice value using the ps command.

Renice by user ID :

If you want to change the priority of all processes owned by a particular user, you can submit that user's UID with the -u option. This is useful when you want all of that user's processes to complete faster; an administrator can even renice the user to -20 to get things speedy!

# ps -eo pid,user,nice,command | grep top
 3859 user4   0 top
 3892 user4   0 top
 4588 root    0 grep top
# renice -n 2 -u user4
54323: old priority 0, new priority 2
# ps -eo pid,user,nice,command | grep top
 3859 user4   2 top
 3892 user4   2 top
 4966 root    0 grep top

In the above example, there are two processes owned by user4 with priority 0. We changed the priority of user4 to 2. So both processes had their priority changed to 2.

Normal users can change the priority of their own processes too, but they can only raise the nice value (lower the priority); they cannot set negative values or override a priority set by the administrator. -20 is the minimum nice value one can set on the system and hence the highest priority: a process running at -20 gets scheduled ahead of everything else to finish its task.
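
If you want a process to start out with a particular nice value instead of renicing it later, the nice command does that at launch time. A minimal sketch (the script name /root/backup.sh is just a placeholder):

# nice -n 10 /root/backup.sh &
# ps -eo pid,user,nice,command | grep backup.sh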

How to do safe and graceful Measureware service restart in HPUX

A how-to guide for a safe and graceful Measureware service restart on HPUX machines. Learn how to preserve old log files during the service restart and avoid overwriting them.

Measureware is a native HPUX utility for performance measurement. It is responsible for collecting system utilization data in the background. The Measureware agent mwa runs in the background and stores data in log files called datafiles. If you attempt a Measureware service restart without moving the datafiles, it will overwrite the current files and all historical data will be lost. Hence you need to stop it, move the datafiles to another location, and then start it again. In this sequence, the agents create new blank datafiles to save data.

You can view the current status of all Measureware services using the command below:

# mwa status all
 Perf Agent status:
    Running scopeux               (Perf Agent data collector) pid 2814
    Running midaemon              (Measurement Interface daemon) pid 2842
    Running ttd                   (ARM registration daemon) pid 2703

 Perf Agent Server status:

    Running ovcd                  (OV control component) pid 3483
    Running ovbbccb               (BBC5 communication broker) pid 3484
    Running coda                  (perf component) pid(s) 3485
       Configured DataSources(1)
                  SCOPE

    Running perfalarm             (alarm generator) pid(s) 2845

If any of the components is not running or is having issues, it may call for a Measureware service restart. Let's see the process of gracefully shutting down and starting Measureware services in HPUX.

1. Stop mwa

Stop all Measureware services with a single command as below:

# mwa stop all

Shutting down Perf Agent collection software
         Shutting down scopeux, pid(s) 2814
         The Perf Agent collector, scopeux has been shut down successfully.
NOTE:   The ARM registration daemon ttd will be left running.

Shutting down the alarm generator perfalarm, pid(s) 2845
         The perfalarm process has terminated

OVOA is running. Not shutting down coda

As you can see in the above output, ttd is left running by the command. You need to kill it using the command below:

# ttd -k

Also, midaemon still runs after the above command. You can terminate it using:

#  midaemon -T

These three commands collectively shut down everything related to Measureware services. You can confirm that midaemon, ttd, and scopeux are down with the status command again:

#  mwa status all
 Perf Agent status:
WARNING: scopeux    is not active (Perf Agent data collector)
WARNING: midaemon   is not active (Measurement Interface daemon)
WARNING: ttd        is not active (ARM registration daemon)

 Perf Agent Server status:

    Running ovcd                  (OV control component) pid 3483
    Running ovbbccb               (BBC5 communication broker) pid 3484
    Running coda                  (perf component) pid(s) 3485
       Configured DataSources(1)
                  SCOPE

WARNING: perfalarm is not active (alarm generator)

This confirms you can proceed with log movement before starting mwa again.

2. Log movement

Datafiles (all starting with log) reside in the /var/opt/perf/datafiles directory. The list of datafiles is as below:

# ll /var/opt/perf/datafiles/log*
-rw-r--r--   1 root       users      11064908 Jan  1 03:05 /var/opt/perf/datafiles/logappl
-rw-r--r--   1 root       root       43951620 Jan  1 03:05 /var/opt/perf/datafiles/logdev
-rw-r--r--   1 root       users      9556384 Jan  1 03:05 /var/opt/perf/datafiles/logglob
-rw-r--r--   1 root       root         15716 Jan  1 03:01 /var/opt/perf/datafiles/logindx
-rw-r--r--   1 root       users           15 Nov  4  2009 /var/opt/perf/datafiles/logpcmd0
-rw-r--r--   1 root       root       76492020 Jan  1 03:05 /var/opt/perf/datafiles/logproc
-rw-r--r--   1 root       root       96153856 Jan  1 03:05 /var/opt/perf/datafiles/logtran

Now move the current datafiles to a different directory. You can use the small inline script below, or move them one by one manually.

# cd /var/opt/perf/datafiles
# nowis=`date +%d%b%y-%H:%M`
# mkdir /var/opt/perf/datafiles.old.`echo $nowis`
# cp log* /var/opt/perf/datafiles.old.`echo $nowis`

Make sure the datafiles were copied to the destination correctly, then proceed to start the services again.
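
One quick way to confirm the copy is to compare checksums of the source and destination files; a minimal sketch, assuming the same $nowis variable set above:

# cksum /var/opt/perf/datafiles/log*
# cksum /var/opt/perf/datafiles.old.`echo $nowis`/log*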

3. Start mwa

Start it using the command below:

#  mwa start all

The Perf Agent scope collector is being started.
         The ARM registration daemon
         /opt/perf/bin/ttd has been started.

         The Performance collection daemon
         /opt/perf/bin/scopeux has been started.

         The coda daemon /opt/OV/lbin/perf/coda is already running.
The Perf Agent alarm generator is being started.
         The alarm generator /opt/perf/bin/perfalarm
         has been started.

Observe that while shutting down we used three commands to stop the different components, but starting up needed only a single command. You can check the status with mwa status all to make sure all components have started. This pretty much sums up how to do a safe and graceful Measureware service restart.

All examples in this post are from a machine running HPUX 11.31. Let us know if you have any queries, suggestions, or corrections in the comments.

How to install patch/software in HPUX

A how-to guide for the installation of a patch or software in the HPUX operating environment, including steps to install, configure, and verify.

This post is a how-to guide with steps to install, configure, and verify a patch or software on the HPUX operating system. Before beginning the patching process, finalize the downtime window with the concerned teams. Also, if individual patches are planned, the patch IDs need to be known so they can be downloaded from the HP portal. An HP Passport ID is required to sign in to the HP portal.

Also read: HPUX patch naming conventions

HPUX patching provides the system with new features and enhancements, and removes bugs if any. HP releases patch bundles for HPUX every 6 months, i.e. twice a year. These bundles can be downloaded and applied to the system. Individual patches can also be downloaded; for those, one should know the patch ID first.

Prerequisites :

  • Log in to the HP portal and download the required patches. HP provides the facility to download your required patch along with all of its dependencies, so you need not search for dependencies separately. When prompted for the download, select the gzip bundle format.
  • Upload the patch to the server using FTP. Unzip the patch file, which contains a create_depot script. When executed, this script creates a $PWD/depot directory which includes all patches to be installed. Make sure the depot is created properly using:
# /usr/sbin/swlist  -s /tmp/PACPTHP_00001.depot

# Initializing...
# Contacting target "testsrv2"...
#
# Target:  testsrv2:/tmp/PACPTHP_00001.depot
#

#
# No Bundle(s) on testsrv2:/tmp/PACPTHP_00001.depot
# Product(s):
#

  HPOvLcore     6.00.000       HP Software Core Functionality
  HPOvLcore     6.00.000       HP Software Core Functionality
  HPOvPerf      4.70.000       HP OpenView Performance
  HPOvPerf      4.70.000       HP OpenView Performance
  • A server backup is an important thing to do before any activity. Take an Ignite backup over the network or to tape. Also collect nickel and sysinfo outputs from the server. Ensure the downtime window and other related ITIL processes are completed and approved.

Installation & configuration :

Before starting the activity, make sure all applications on the server are brought down by the respective teams. Once confirmed, proceed with patch installation. Install the patch using the command:

# /usr/sbin/swinstall -s /tmp/PACPTHP_00001.depot

NOTE:    The interactive UI was invoked, since no software was
         specified.

Starting the terminal version of swinstall...

To move around in swinstall:

- use the "Tab" key to move between screen elements
- use the arrow keys to move within screen elements
- use "Ctrl-F" for context-sensitive help anywhere in swinstall

On screens with a menubar at the top like this:

        ------------------------------------------------------
       |File View Options Actions                         Help|
       | ---- ---- ------- ------------------------------- ---|

- use "Tab" to move from the list to the menubar
- use the arrow keys to move around
- use "Return" to pull down a menu or select a menu item
- use "Tab" to move from the menubar to the list without selecting a menu item
- use the spacebar to select an item in the list

On any screen, press "CTRL-K" for more information on how to use the keyboard.

This will start the text-based interface. You can navigate this menu using the arrow keys and the spacebar. Select all the listed patches using the arrow keys and the spacebar to highlight them.

Once marked, select the Install option from the Actions menu.

Installing and configuring the patch will take time depending upon the size of the depot. The configuration happens in the background as part of this same command execution.

If required, the system will be rebooted post-installation. To check whether a patch needs a reboot after install, you can run the above swinstall command with the -p (preview) option, which only checks dependencies and depot health and shows you a report. This report includes whether a reboot is required or not. Even while downloading the patch from the HP portal, you can view its details, which also include information about reboot requirements.
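
For example, a preview-only run (nothing is actually installed) might look like this, assuming the same depot path used above:

# /usr/sbin/swinstall -p -s /tmp/PACPTHP_00001.depot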

Verification :

Once the installation is complete, or after the reboot, check whether all installed patches are configured in the kernel using the command:

# swlist -l fileset -a state | grep -v -e "#" -e "configured"

The output of the above command should be blank. If any filesets are left in an unconfigured or installed state, the above command's output will reveal them. So, if the output is not blank, you need to configure those filesets manually. To configure all patches again, use the command below:

# /usr/sbin/swconfig \*

=======  12/27/16 10:17:15 SST  BEGIN swconfig SESSION
         (non-interactive) (jobid=testsrv2-0396)

       * Session started for user "root@testsrv2".

       * Beginning Selection
       * Target connection succeeded for "testsrv2:/".

----- output clipped -----
 * Selection succeeded.


       * Beginning Analysis
       * Session selections have been saved in the file
         "/home/user3/.sw/sessions/swconfig.last".
       * "testsrv2:/":  2831 filesets have already been configured.
       * Analysis succeeded.


       * Beginning Execution
       * "testsrv2:/":  2831 software objects were determined to be
         skipped in the analysis phase.
       * Execution succeeded.


NOTE:    More information may be found in the agent logfile using the
         command "swjob -a log testsrv2-0396 @ testsrv2:/".

=======  12/27/16 10:17:59 SST  END swconfig SESSION (non-interactive)
         (jobid=testsrv2-0396)

This command will try to configure the remaining patches. After it completes, verify your installation once again.
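
Re-running the same swlist filter used earlier should now come back empty:

# swlist -l fileset -a state | grep -v -e "#" -e "configured"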

sar command (Part III) : Disk, Network reporting

Learn the System Activity Report sar command with real-world scenario examples. Understand how to do disk and network reporting using this command.

In the last two parts of the sar command tutorial, we saw the time formats to be used with the command, its data files, and CPU and memory reporting. In this last part, we will look at disk and network reporting using the sar command.

Disk IO reporting

sar provides a disk (block device) report with the -d option. Normally it shows the parameter values below; a few of them are watched far more often than others during performance monitoring:

  • DEV: Block device name. It follows the dev m-n format, where m is the major and n the minor number of the block device.
  • tps: Transfers per second
  • rd_sec/s: Sectors read per second (a sector is 512 bytes)
  • wr_sec/s: Sectors written per second
  • avgrq-sz: Average size (in sectors) of the requests that were issued to the device
  • avgqu-sz: Average queue length of the requests that were issued to the device
  • await: The average time (in milliseconds) for I/O requests issued to the device to be served, including time spent in the queue
  • svctm: The average service time (in milliseconds) for I/O requests
  • %util: Percentage of CPU time during which I/O requests were issued to the device
# sar -d 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/21/2016      _x86_64_        (4 CPU)

01:20:19 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:20:21 AM    dev8-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:20:21 AM   dev8-64      3.00      2.00      1.00      1.00      0.00      0.50      0.50      0.15
01:20:21 AM   dev8-32      3.00      2.00      1.00      1.00      0.00      0.50      0.33      0.10
01:20:21 AM  dev252-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:20:21 AM  dev252-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:20:21 AM  dev252-2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

In the above output, the device names are not very user friendly. To identify devices easily, the -p option is available. It prints pretty device names in the DEV column and should always be used together with the -d option.

# sar -d -p 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/21/2016      _x86_64_        (4 CPU)

01:20:38 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:20:40 AM       sda      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:20:40 AM       sdb      4.98      0.00     91.54     18.40      0.00      0.80      0.20      0.10
01:20:40 AM       sdc      2.99      1.99      1.00      1.00      0.00      0.67      0.67      0.20
01:20:40 AM      dm-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:20:40 AM      dm-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:20:40 AM      dm-2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Now see the above output, where the device names are easily identifiable: sda means disk /dev/sda, and so on.

Network utilization reporting

The -n option gives all the network stats. There are a total of 18 different keywords (like NFS, IP, DEV, TCP, etc.) which can be used with the -n option to get the related parameters. Each keyword has roughly 8-10 parameters to display, so if you use the ALL keyword, the output is a huge list of parameters that is difficult to digest.

To keep it short, here we will see only one example, the DEV keyword, i.e. device. It shows the network card's parameter values. Most of the time it is NIC performance that gets checked, hence we are using this keyword as the example.

# sar -n DEV 2 1
Linux 2.6.39-200.24.1.el6uek.x86_64 (textsrv2)         12/21/2016      _x86_64_        (4 CPU)

01:35:22 AM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
01:35:24 AM        lo      6.00      6.00      0.29      0.29      0.00      0.00      0.00
01:35:24 AM      eth0     15.50      0.50      0.91      0.04      0.00      0.00      0.00
01:35:24 AM      eth1      6.50      4.50      0.97      0.77      0.00      0.00      0.00
01:35:24 AM      eth3      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Average:        IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
Average:           lo      6.00      6.00      0.29      0.29      0.00      0.00      0.00
Average:         eth0     15.50      0.50      0.91      0.04      0.00      0.00      0.00
Average:         eth1      6.50      4.50      0.97      0.77      0.00      0.00      0.00
Average:         eth3      0.00      0.00      0.00      0.00      0.00      0.00      0.00

In the above example, I used the DEV keyword along with the -n option and took only one iteration of output. The parameters displayed for the DEV keyword are:

  • IFACE: Interface name. You can easily see the eth0, eth1, and loopback (lo) interfaces here.
  • rxpck/s: Packets received per second
  • txpck/s: Packets transmitted per second
  • rxkB/s: Kilobytes received per second
  • txkB/s: Kilobytes transmitted per second
  • rxcmp/s: Compressed packets received per second
  • txcmp/s: Compressed packets transmitted per second
  • rxmcst/s: Number of multicast packets received per second
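
If you suspect errors rather than plain traffic levels, the EDEV keyword reports per-interface error counters (receive/transmit errors, collisions, drops) in the same layout, for example:

# sar -n EDEV 2 1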

This concludes part III of the sar command tutorial, about disk and network reporting. This is a three-part tutorial with example outputs included. Put your queries, suggestions, and feedback in the comments below. You can also reach us using our contact form.

sar command (Part II) : CPU, Memory reporting

Learn the System Activity Report sar command with real-world scenario examples. Understand CPU and memory reporting using this command.

In the last post on the sar command, we saw its data file structure, how to extract data from it, and the time formats to be used with the command. In this post, let's see how to get CPU and memory utilization reports, from data files or in real time, using the sar command.

The sar command follows the format below:

# sar [ options ] [ <interval> [ <count> ] ]

We have already seen what interval and count are in the last post. Now we will look at the different options that can be used to get different system resource utilization stats. Since we have already seen how to get historic data from sar data files, I will be using only real-time commands (i.e. without the -f option) for all the examples below.

Before we start with resource reporting, here is a quick tip about the start and end time of reports when you are extracting data from datafiles. The two options below can be used with the sar command (in conjunction with -f) so that data for a specific time window can be extracted; see the example after the list.

  • -s hh:mm:ss Start time of the report. sar starts output from the record tagged with this time, or the next available time-tagged record. The default start time is 08:00:00.
  • -e hh:mm:ss End time of the report. The default end time is 18:00:00.
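
For example, to pull only the morning window out of an existing datafile (using the sa15 file shown in part I of this series):

# sar -u -f /var/log/sa/sa15 -s 09:00:00 -e 12:00:00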

CPU utilization reporting using sar

For CPU statistics, the sar command has the -u option. Executing sar with -u gives us the utilization metrics below; a few of them are watched far more often than others during performance monitoring:

  • %user: CPU % used by user processes
  • %nice: CPU % used by user processes running with a nice priority
  • %system: CPU % used by system processes
  • %iowait: % of time the CPU was idle while the system had outstanding disk I/O requests
  • %steal: % of time the virtual CPU spent in involuntary wait while the hypervisor was servicing another virtual processor (a virtualization aspect)
  • %idle: CPU % idle
# sar -u 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/20/2016      _x86_64_        (16 CPU)

03:34:51 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
03:34:53 AM     all     33.00      0.00      9.99     32.22      0.00     24.80
03:34:55 AM     all     37.32      0.00     10.44     32.95      0.00     19.29
03:34:57 AM     all     36.04      0.00     11.90     29.83      0.00     22.24
Average:        all     35.46      0.00     10.78     31.66      0.00     22.11

See the above example to get the values of the parameters explained above. The output starts with a line that has the OS kernel version, the hostname in brackets, the date, the architecture, and the total number of CPUs. It is followed by data at the interval (2 sec) and count (3) specified in the command. Finally, it also gives us the average value of each parameter. The value all in the CPU column indicates these are averaged values across all 16 CPUs for the given time instance.

If you are interested in seeing values for each processor, the -P option (per-processor reporting) can be used with a CPU number of your choice, or with ALL. When ALL is specified, each processor's data is shown; otherwise only the specified CPU's data is processed.

# sar -P ALL -u 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr2)         12/20/2016      _x86_64_        (16 CPU)

04:08:33 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
04:08:35 AM     all     34.50      0.00      6.38      7.66      0.00     51.45
04:08:35 AM       0     32.66      0.00      9.55      4.52      0.00     53.27
04:08:35 AM       1     76.24      0.00      2.48      5.45      0.00     15.84
04:08:35 AM       2     24.62      0.00     10.05      6.53      0.00     58.79
04:08:35 AM       3     38.50      0.00     14.50      5.50      0.00     41.50
04:08:35 AM       4      3.05      0.00      4.06      0.51      0.00     92.39
04:08:35 AM       5      1.99      0.00      1.49      9.45      0.00     87.06
04:08:35 AM       6     99.00      0.00      1.00      0.00      0.00      0.00
04:08:35 AM       7      1.50      0.00      1.00      0.00      0.00     97.50
04:08:35 AM       8     62.00      0.00     13.00     13.50      0.00     11.50
04:08:35 AM       9     91.96      0.00      6.53      0.00      0.00      1.51
04:08:35 AM      10     34.67      0.00     10.55     18.09      0.00     36.68
04:08:35 AM      11     57.00      0.00      4.00      4.00      0.00     35.00
04:08:35 AM      12     11.50      0.00      5.50     20.50      0.00     62.50
04:08:35 AM      13      6.47      0.00      2.49     15.42      0.00     75.62
04:08:35 AM      14      3.00      0.00      2.00      3.50      0.00     91.50
04:08:35 AM      15      7.54      0.00     15.08     15.58      0.00     61.81
----- output clipped -----

See the above example where I passed ALL to the -P option and got each processor's utilization report. In the example below, only CPU number 2's data is extracted.

# sar -P 2 -u 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr2)         12/20/2016      _x86_64_        (16 CPU)

04:11:11 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
04:11:13 AM       2     49.75      0.00      3.98     26.87      0.00     19.40
04:11:15 AM       2     97.50      0.00      1.50      1.00      0.00      0.00
04:11:17 AM       2     97.50      0.00      1.50      0.50      0.00      0.50
Average:          2     81.53      0.00      2.33      9.48      0.00      6.66

Another small processor statistic is power stats. Here sar shows you the processor clock frequency at a given instant of time, which helps in estimating the power being used by the CPU. The -m option gives this data, and it can be used with the per-processor reporting option too.

# sar -m 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/20/2016      _x86_64_        (16 CPU)

04:15:39 AM     CPU       MHz
04:15:41 AM     all   1970.50
04:15:43 AM     all   1845.81
04:15:45 AM     all   1587.93
Average:        all   1801.41

# sar -P ALL 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/20/2016      _x86_64_        (16 CPU)

04:15:52 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
04:15:54 AM     all     15.67      0.00      7.59      7.02      0.00     69.72
04:15:54 AM       0     20.00      0.00      7.00     37.50      0.00     35.50
04:15:54 AM       1     27.86      0.00     19.40      6.97      0.00     45.77
04:15:54 AM       2     47.50      0.00     15.50      2.50      0.00     34.50
04:15:54 AM       3      2.49      0.00      1.99      1.99      0.00     93.53
04:15:54 AM       4      3.02      0.00      5.03      1.01      0.00     90.95
04:15:54 AM       5      1.00      0.00      7.00      1.00      0.00     91.00
04:15:54 AM       6      0.51      0.00      0.51      0.00      0.00     98.97
04:15:54 AM       7      1.00      0.00      0.50      0.00      0.00     98.50
04:15:54 AM       8     35.18      0.00     21.61     20.10      0.00     23.12
04:15:54 AM       9     51.24      0.00     22.89      4.98      0.00     20.90
04:15:54 AM      10     28.64      0.00      8.04     14.57      0.00     48.74
04:15:54 AM      11     12.94      0.00      5.97     11.94      0.00     69.15
04:15:54 AM      12      8.50      0.00      3.50      7.50      0.00     80.50
04:15:54 AM      13      7.04      0.00      2.51      1.51      0.00     88.94
04:15:54 AM      14      1.01      0.00      0.51      1.52      0.00     96.97
04:15:54 AM      15      1.49      0.00      0.99      0.00      0.00     97.52
----- output clipped -----

Memory utilization reporting using sar

Memory stats can be extracted with the -r option. When sar runs with -r, it presents the parameters below; a few of them are watched far more often than others during performance monitoring:

  • kbmemfree: Free memory available in kilobytes.
  • kbmemused: Memory used (excluding kernel usage)
  • %memused: Percentage of memory used
  • kbbuffers:  memory used as buffers by the kernel in kilobytes
  • kbcached: memory used to cache data by the kernel in kilobytes
  • kbcommit: memory in kilobytes needed for the current workload. (commitment!)
  • %commit: % of memory needed for the current workload in relation to the total memory (RAM+swap)
# sar -r 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/20/2016      _x86_64_        (16 CPU)

03:42:07 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
03:42:09 AM 119133212 145423112     54.97   1263568 109109560 106882060     32.23
03:42:11 AM 119073748 145482576     54.99   1263572 109135424 107032108     32.27
03:42:13 AM 119015480 145540844     55.01   1263572 109162404 106976556     32.25
Average:    119074147 145482177     54.99   1263571 109135796 106963575     32.25

The output above shows the parameter values. Its sections are the same as explained for the CPU report example above: the first line of details, the final average row, etc. Note that %commit can be more than 100% too, since the kernel over-commits memory.

Paging statistics can be obtained using the -B option. Normally the parameters in this option's output are not watched day-to-day by sysadmins; it is used only when in-depth troubleshooting or monitoring is required. It shows the parameters below:

  • pgpgin/s:  Number of kilobytes the system paged in from disk per second.
  • pgpgout/s: Number of kilobytes the system paged out to disk per second.
  • fault/s: Number of page faults per second.
  • majflt/s: Number of major page faults per second.
  • pgfree/s: Number of pages placed on the free list by the system per second.
  • pgscank/s: Number of pages scanned by the kswapd daemon per second.
  • pgscand/s: Number of pages scanned directly per second.
  • pgsteal/s: Number of pages the system has reclaimed from cache per second.
  • %vmeff: This is a metric of the efficiency of page reclaim.
# sar -B 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/21/2016      _x86_64_        (4 CPU)

12:59:41 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
12:59:43 AM      3.05     40.10 120411.68      0.00  99052.28      0.00      0.00      0.00      0.00
12:59:45 AM      2.99      7.46  42649.75      0.00  37486.57      0.00      0.00      0.00      0.00
12:59:47 AM      3.00     47.50     43.00      0.00     80.50      0.00      0.00      0.00      0.00
Average:         3.01     31.61  54017.22      0.00  45257.86      0.00      0.00      0.00      0.00

Swap statistics can be obtained with the -S option. Swap is another aspect of memory, hence monitoring its utilization is as important as monitoring memory. The -S option shows the parameters below; a few of them are watched far more often than others during performance monitoring:

  • kbswpfree: Free swap in kilobytes
  • kbswpused: Used swap in kilobytes
  • %swpused: % of swap used
  • kbswpcad: Amount of cached swap memory in kilobytes
  • %swpcad: % of cached swap memory in relation to used swap.
# sar -S 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsrv2)         12/21/2016      _x86_64_        (4 CPU)

01:01:45 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
01:01:47 AM   8388604         0      0.00         0      0.00
01:01:49 AM   8388604         0      0.00         0      0.00
01:01:51 AM   8388604         0      0.00         0      0.00
Average:      8388604         0      0.00         0      0.00

In the above example, you can see a total of 8 GB of swap available on the server and none of it used. Swap only gets hit when memory is close to fully utilized.

This concludes the second part of the sar tutorial. We will look at disk and network reporting in the next part.

sar command (Part I): All you need to know with examples

Learn the System Activity Report sar command with real-world scenario examples. Understand the command's log files, execution, and different usages.

SAR! System Activity Report! The sar command is the second-best command to check system performance or utilization, after the top command. From the man page: 'The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. The accounting system, based on the values in the count and interval parameters, writes information the specified number of times spaced at the specified intervals in seconds.' No doubt this is one of the best performance monitoring tools available to any sysadmin.

Command log file management:

sar keeps collecting system resource utilization data and stores it in binary files. These files are called datafiles and are located at /var/log/sa/saXX, where XX is the date in dd format. So this could be one of the locations to check when you are troubleshooting file system utilization.

# ll /var/log/sa
total 29024
-rw-r--r-- 1 root root 494100 Dec  1 23:50 sa01
-rw-r--r-- 1 root root 494100 Dec  2 23:50 sa02
-rw-r--r-- 1 root root 494100 Dec  3 23:50 sa03
-rw-r--r-- 1 root root 494100 Dec  4 23:50 sa04
-rw-r--r-- 1 root root 494100 Dec  5 23:50 sa05
-rw-r--r-- 1 root root 494100 Dec  6 23:50 sa06
-rw-r--r-- 1 root root 494100 Dec  7 23:50 sa07

----- output clipped -----

The log files are binary, hence they can be read only by sar using the -f option. A normal sar command shows you data in real time when executed. If you need to check historic data, you need to use the -f option and provide the path of the particular data file.

# sar -u 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr1)         12/19/2016      _x86_64_        (4 CPU)

11:44:29 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:44:31 AM     all     25.37      0.00     10.12      0.00      0.00     64.50
11:44:33 AM     all     25.41      0.00     10.39      0.13      0.00     64.08
11:44:35 AM     all     27.84      0.00     11.36      0.12      0.00     60.67
Average:        all     26.21      0.00     10.62      0.08      0.00     63.08

In the above example, the command runs for 3 iterations (we will see what that means in a later part of this post) of 2 seconds each and shows you output in real time. Let's see the -f option:

# sar -u 2 3 -f /var/log/sa/sa15
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr1)         12/15/2016      _x86_64_        (4 CPU)

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all     10.24      0.00      5.18      0.17      0.00     84.41
12:20:01 AM     all     11.55      0.00      5.02      0.19      0.00     83.24
12:30:01 AM     all     10.79      0.00      4.79      0.17      0.00     84.25
Average:        all     10.86      0.00      5.00      0.17      0.00     83.97

In the above example, we ran the sar command against the datafile /var/log/sa/sa15, hence the data is read from an older/historic data file and is not real-time. The file's first entry is always treated as the first iteration, and further data is displayed according to the command arguments. Hence you can see the first entry is from 12 AM.

Another beauty of this command for log management is that you can save real-time output to a log file of your choice. Say you need to share the output of a specific monitoring window: you can save the output to a log file and share that, so you don't have to share the complete day's datafile. Use the -o option along with a file path of your choice.

# sar -u 2 3 -o /tmp/logfile
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr1)         12/19/2016      _x86_64_        (4 CPU)

11:51:42 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:51:44 AM     all     27.75      0.00      9.88      0.12      0.00     62.25
11:51:46 AM     all     26.00      0.00      9.88      0.12      0.00     64.00
11:51:48 AM     all     25.53      0.00     10.26      0.00      0.00     64.21
Average:        all     26.43      0.00     10.00      0.08      0.00     63.48
# ls -lrt /tmp/logfile
-rw-r--r-- 1 root root 63672 Dec 19 11:51 /tmp/logfile

In the above example, you can see the output is displayed on the terminal as well as written to the file given in the command options. Note that this file is a binary file too.
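
Since the file written with -o is in the same binary datafile format, it can be read back later with -f, for example:

# sar -u -f /tmp/logfile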

Command Intervals and Iterations :

The command takes two arguments that define the time factors of the output.

The interval is the time in seconds between two output samples, normally 2, 5, or 10 seconds. The iteration or count is the number of samples to be taken, each separated by the interval. So a command like sar 2 5 means an interval of 2 and 5 iterations, i.e. take 5 samples separated by 2 seconds each. If the command is fired at 12:00:00, the output will include samples for 12:00:02, 12:00:04, and so on up to 12:00:10. Check any of the above examples and you will see how it works.

If the interval parameter is set to zero, the sar command displays the average statistics for the time since the system was started. If the interval parameter is specified without the count parameter, reports are generated continuously, as shown below.

# sar -u 2
Linux 2.6.39-200.24.1.el6uek.x86_64 (oratest02)         12/19/2016      _x86_64_        (4 CPU)

12:09:28 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:09:30 PM     all      0.75      0.00      0.50      0.25      0.00     98.50
12:09:32 PM     all      0.88      0.00      0.38      0.13      0.00     98.62
12:09:34 PM     all      1.12      0.00      1.75      0.25      0.00     96.88
12:09:36 PM     all      2.38      0.00      1.38      0.12      0.00     96.12
12:09:38 PM     all     14.79      0.00      7.39      0.50      0.00     77.32
------- continuous reports being generated, output clipped -----

We will see useful monitoring examples of this command in the next post.

How to scan new lun / disk in Linux & HPUX

A how-to guide to scanning new disks or LUNs on Linux or HPUX machines. This guide explains the steps to scan and then identify new disk device names.

When you add a new disk to the system, you need to scan it so that the kernel can identify the new hardware and assign a device name to it. The new disk can be local or from storage. If it's local, it is an addition of a disk in a free disk slot attached to the server. If it's a storage LUN, it is masked and zoned at the storage level to the WWN of the server.

Once the disk/LUN is made available/visible to the server, the next step is to scan it. The kernel keeps a known hardware tree with it; this tree needs to be updated with the new disk information. To let the kernel know that a new disk is available, disk scanning is required. If the disk is from a storage array, chances are you have storage vendor utilities/scripts available to scan storage on the server, for example evainfo (for EVA storage), xpinfo (for XP12K storage), or powermt (for EMC storage). If these utilities are not available, you can still scan from the OS.

HPUX disk scan :

In HPUX, we have the dedicated ioscan command to scan new hardware. You can ask the command to scan only hard disks with the -C option, i.e. class. Before executing this command, keep the output of the existing disks (ioscan -funC disk) handy. That output can be compared with the new output (command below) to identify the new disk.

# ioscan -fnC disk
Class     I  H/W Path        Driver  S/W State   H/W Type     Description
==========================================================================
disk      4  0/0/1/1.0.0     sdisk   CLAIMED     DEVICE       HP 36.4GST373455LC#36
                            /dev/dsk/c1t0d0   /dev/rdsk/c1t0d0
disk      0  0/0/1/1.2.0     sdisk   CLAIMED     DEVICE       HP 36.4GST373455LC#36
                            /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
disk      1  0/0/2/0.2.0     sdisk   CLAIMED     DEVICE       HP 36.4GST373455LC#36
                            /dev/dsk/c2t2d0   /dev/rdsk/c2t2d0
disk      2  0/0/2/1.2.0     sdisk   CLAIMED     DEVICE       HP      DVD-ROM 305
                            /dev/dsk/c3t2d0   /dev/rdsk/c3t2d0
disk      3  0/10/0/1.0.0.0  sdisk   CLAIMED     DEVICE       I2O     RAID5
                            /dev/dsk/c4t0d0   /dev/rdsk/c4t0d0

The scan output shows all detected disks on the system and their assigned device names in CTD format. Sometimes ioscan is unable to install special device files for newly detected disks; in that situation you can run the insf (install special files) command to ensure all detected hardware has device files in place.

# insf -e
insf: Installing special files for btlan instance 0 address 0/0/0/0
insf: Installing special files for stape instance 1 address 0/0/1/0.1.0
insf: Installing special files for sctl instance 0 address 0/0/1/0.7.0
insf: Installing special files for sdisk instance 4 address 0/0/1/1.0.0
insf: Installing special files for sdisk instance 0 address 0/0/1/1.2.0
insf: Installing special files for sctl instance 1 address 0/0/1/1.7.0
----- output clipped ----

A new disk can also be identified by comparing the directory listing of /dev/disk or /dev/dsk before and after the scan. Any new addition to these directories after the scan is your new disk.
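
A minimal sketch of that comparison (the file names under /tmp are just placeholders):

# ls /dev/dsk > /tmp/disks.before
# ioscan -fnC disk
# insf -e
# ls /dev/dsk > /tmp/disks.after
# diff /tmp/disks.before /tmp/disks.after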

Once you identify this new disk, you can use it on the system via volume managers like LVM.

Linux Disk scan:

In Linux, it's a bit tricky since there is no direct ioscan equivalent. First, get the currently available disk details using the fdisk command as below:

# fdisk -l |egrep '^Disk' |egrep -v 'dm-'|grep -v identifier
Disk /dev/sda: 74.1 GB, 74088185856 bytes
Disk /dev/sdb: 107.4 GB, 107374182400 bytes
Disk /dev/sdd: 2147 MB, 2147483648 bytes
Disk /dev/sde: 2147 MB, 2147483648 bytes
Disk /dev/sdc: 2147 MB, 2147483648 bytes

Keep this list handy to compare with the list after scan.

Scan SCSI disks

Now, if the disks are connected via SCSI, you need to scan the SCSI hosts on the server. Check the current list of hosts on the server as below:

# ls /sys/class/scsi_host/
host0  host1  host2  host3

Now, you have 4 hosts on this server (in the example above). You need to scan all 4 of these hosts in order to detect new disks attached to them. This is done by writing "- - -" into their respective scan files. See the commands below:

echo "- - -" > /sys/class/scsi_host/host0/scan
echo "- - -" > /sys/class/scsi_host/host1/scan
echo "- - -" > /sys/class/scsi_host/host2/scan
echo "- - -" > /sys/class/scsi_host/host3/scan

This completes your scan of the SCSI hosts on the server. Now you can run the fdisk command we saw previously and compare the new output with the old one. You will see the new disk added to the system, along with its device name.
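
Rather than typing one echo per host, a small loop over all SCSI hosts does the same thing (a sketch, assuming a bash shell):

# for host in /sys/class/scsi_host/host*; do echo "- - -" > $host/scan; done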

Scan FC LUNs:

If the disks are connected via FC, you need to scan the FC hosts on the server. Check the current list of hosts on the server as below:

# ls /sys/class/fc_host
host0  host1

Now there are 2 FC hosts on the server. Again, we need to scan them by writing 1 into their respective issue_lip files, along with the scan steps from above.

# echo "1" > /sys/class/fc_host/host0/issue_lip
# echo "- - -" > /sys/class/scsi_host/host0/scan
# echo "1" > /sys/class/fc_host/host1/issue_lip
# echo "- - -" > /sys/class/scsi_host/host1/scan

This will make your FC HBAs scan for newly visible disks. Once the operation completes (check syslog for the completion event), you can use the fdisk command to list disks. Compare the output with the 'before scan' output and get the new disk names!

Move disks/LUN from one server to another without losing data

Howto guide to moving disks or LUN from one server to another without losing any data. This guide is related to disks or LUN which are configured under LVM.

In Unix or Linux infrastructure, it's a pretty common scenario to have to move disks or storage LUNs from one server to another with the data on them intact. This is what happens automatically in clusters, i.e. it is handled by cluster services: when the primary node goes down, cluster services move the disks or LUNs from the primary to the secondary node and make the secondary node available for use.

We are going to see how to do this manually using commands. This how-to guide gives you an insight into what cluster services do in the background to move data across nodes in case of failures. We will be using LVM (Logical Volume Manager) as our volume manager in this guide, since it is the most widely used volume manager next to VxVM (Veritas Volume Manager).

The flow of steps goes like this :

  1. Stop disk access on server1
  2. Remove disk / LUN from server1
  3. Present disk / LUN to server2
  4. Identify new disk / LUN on server2
  5. Import it into LVM
  6. Make it available to use on server2

Let's see these steps one by one in detail with commands and their outputs. We will be moving the mount point /data from server1 to server2. /data is mounted on /dev/vg01/lvol1.

1. Stop disk access on server1

Firstly, you have to stop all user/application access to the related mount points. In our case it's /data. You can check if anyone is accessing the mount point, and kill those processes, using the fuser command.

# fuser -cu /data         #TO VIEW USERS
/data:   223412c(user1)
# fuser -cku /data        #TO KILL USERS
# fuser -cu /data

Once you are sure no one is accessing the mount point, go ahead and unmount it.

# umount /data

2. Remove disk / LUN from server1

Now we need to remove the disk or LUN from the LVM of server1 so that it can be detached from the server gracefully. For this we will be using the vgexport command, so that the configuration backup can be imported on the destination server.

# vgchange -a n /dev/vg01
Volume group "/dev/vg01" has been successfully changed.
# vgexport -v -m /tmp/vg01.map vg01
Beginning the export process on Volume Group "/dev/vg01".
vgexport: Volume Group "/dev/vg01" has been successfully removed.

To export the VG, you need to de-activate the volume group first and then export it with a map file. Transfer this map file to server2 with FTP or sftp so that it can be used while importing the VG there.
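
For example, copying the map file over with scp (the hostname server2 and the /tmp path are just the ones used in this walkthrough):

# scp /tmp/vg01.map server2:/tmp/vg01.map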

Now your VG vanishes from server1, i.e. the related disk/LUN is no longer associated with the LVM of server1. Since the VG is only exported, the data is still physically intact on the disk/LUN.

3. Present disk / LUN to server2

Now you need to physically remove the disk from server1 and physically attach it to server2. If it's a LUN, remove the mapping of the LUN to server1 and map it to server2. This requires zoning at the storage level against the WWNs of both servers.

At this stage, your disk/LUN is removed from server1 and is now available/visible to server2. But it's not yet known to the LVM of server2.

4. Identify new disk / LUN on server2

To identify this newly presented/mapped disk/LUN on server2, you need to scan the hardware or FC. Once you get the disk name for it (as identified by the kernel), proceed with the next LVM steps.

Read here : Howto scan new lun / disk in Linux & HPUX

5. Import it into LVM

Now we have the disk/LUN identified on server2, along with the VG map file from server1. Using this file and the disk name, proceed with importing the VG on server2.

# vgimport -v -m /tmp/vg01.map /dev/vg01 list_of_disk
vgimport: Volume group "/dev/vg01" has been successfully created.
Warning: A backup of this volume group may not exist on this machine.
Please remember to take a backup using the vgcfgbackup command after activating the volume group.
# vgchange -a y vg01
Volume group "/dev/vg01" has been successfully changed.
# vgcfgbackup /dev/vg01
Volume Group configuration for /dev/vg01 has been saved in /etc/lvmconf/vg01.conf

First, import the VG with the vgimport command. In place of the list_of_disk argument in the above example, give your disk name(s). You can use any VG name here; it's not mandatory to use the same VG name as on the first server. After a successful import, activate the VG with vgchange.

6. Make it available to use on server2

At this stage, your disk/LUN is available in the LVM of server2 with all data on it intact. To make it available for use, we need to mount it on a directory. Use the mount command:

# mount /dev/vg01/lvol1 /data2

Add an entry in /etc/fstab as well to make sure the mount point gets mounted at boot too.
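
A sample /etc/fstab line for this mount might look like the one below; the filesystem type (vxfs here) and the mount options are assumptions and should match whatever the logical volume actually carries:

/dev/vg01/lvol1  /data2  vxfs  delaylog  0  2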

cut command and its examples

Learn how to use the cut command in various scenarios and in scripting, with lots of examples. Understand how to extract selective data using the cut command.

Everybody in a Linux or Unix environment is familiar with the grep command, which is used for pattern finding/filtering, a kind of selective data extraction from source data. cut is another command which is useful for selective data extraction.

The cut command basically extracts a portion of data from each line of a given file or of supplied data. The syntax for the cut command is:

# cut [options] [file]

The different options below can be used with the cut command:

-c To select only these characters.

Roughly, you can say it's the column (character position) to be extracted. See the example below; note it uses -b, which behaves the same as -c for plain ASCII text:

# cat test
sd
teh
dxff
cq2w31q5
# cut -b 4 test


f
w

Here, cut is instructed to select only the 4th character. If you look closely at the output, it shows only the letters in the 4th position. Lines having fewer than 4 characters are shown as blank in the cut output!

This option is useful in scripting when there is a need for single-column extraction from supplied data.

-b To select only these bytes.

This option tells the command to select only the given byte positions from each line of data and show them in the output. In most cases its output is the same as with the -c option, since each character is one byte in a plain ASCII text file.

-d Use specified delimiter instead of TAB

Normally cut uses TAB as the delimiter while filtering data. If your file has a different delimiter, like a comma in a CSV file or a colon in /etc/passwd, then you can specify that delimiter to the cut command and it will process the data accordingly. This option should always be used with the -b, -c, or -f options; using -d alone results in the error below:

# cut -d a test
cut: you must specify a list of bytes, characters, or fields
Try `cut --help' for more information.

-f To select only the specified fields.

This option specifies which fields are to be extracted from the data. If the delimiter (given with -d, or the default TAB) is not found on a line, the whole line is printed. For example, if you say -f 2, cut looks for TAB or the supplied delimiter on each line; if found, it prints the 2nd field of that line, and for lines where it doesn't find the delimiter, it prints the whole line.

# cat test
this is test file
raw data
asdgfghtr
data
# cut -f1 test
this is test file
raw data
asdgfghtr
# cut -d" " -f2 test
is
data
asdgfghtr
data

Field numbering is as below:

Field one is the text to the left of the first occurrence of the delimiter.
Field two is the text between the first and second occurrences of the delimiter.
Field three is the text between the second and third occurrences of the delimiter, and so on.

Observe the above example, where all lines are printed as-is when the delimiter is not found (the -f1 run uses the default TAB, which the file doesn't contain). When the delimiter is defined as a single space " ", field two is printed according to the numbering explained above.

-s To print only lines with delimiters

A missing delimiter causes cut to print the whole line as-is, which may pollute the desired output. This is dangerous when the command is used in scripts. Hence the -s option is available, which stops cut from printing lines that don't contain the delimiter.

# cut -f1 -s test

Using the same test file from the above example, when -s is specified there is no output. This is because the default delimiter TAB does not exist in the file.

Numbers to be used with the -b, -c and -f options:

In all the above examples, we declared single numbers like 2 or 4 for the -b, -c and -f options. But we can also use the forms below:

  • x: A single number, like we used in all the examples above
  • x-: From the xth byte, character, or field to the end of the line
  • x-y: From the xth byte, character, or field to the yth
  • -y: From the first byte, character, or field to the yth
# cut -d" " -f2- test
is test file
data
asdgfghtr
data
# cut -d" " -f2-3 test
is test
data
asdgfghtr
data
# cut -d" " -f-1 test
this
raw
asdgfghtr
data

A few examples of the cut command :

To get the list of users from the /etc/passwd file, we use : as the delimiter and cut out the first field.

# cut -d: -f1 /etc/passwd
root
bin
daemon
adm
lp
------ output clipped -----

The cut command can also be fed data from a pipe. In that case the last [file] parameter shouldn't be given; the command reads input from the pipe and processes it accordingly. For example, here we grep lines containing :0: (uid or gid 0) and then get the usernames using cut:

# cat /etc/passwd |grep :0: | cut -d: -f1
root
sync
shutdown
halt
operator

Getting the user ID and group ID from the /etc/passwd file, along with the usernames:

# cat /etc/passwd |cut -d: -f1-4
root:x:0:0
bin:x:1:1
daemon:x:2:2
adm:x:3:4
lp:x:4:7
sync:x:5:0

File encryption / password protect file in HPUX

Learn how to password-protect files in HPUX. This is helpful to encrypt publicly readable files using a password and decrypt them whenever needed.

It's pretty obvious that you can control file access using permissions, but sometimes you may want to protect a file lying in a public directory with a password of your choice. Or sometimes you may want even root not to be able to read your files 😉

In that case, you can use the crypt command to encrypt your file with a password of your choice. This command is available in HPUX and Solaris; I couldn't find it in Linux though. The crypt command is used to encrypt or decrypt files: you encrypt your file with a key of your choice, and whenever you want to read it back, you decrypt it by supplying the same password/key you chose at the time of encryption.

Locking / encrypting file with key

Let's take a sample file myfile.txt for encryption. You need to supply this file as input to the crypt command and define the output file (see the example below).

# cat myfile.txt
This is test file for crypt.

# crypt < myfile.txt > myfile.crypt
Enter key:

# cat myfile.crypt
3▒▒▒x▒▒X▒n▒d▒6▒▒=▒▒q▒j

Now, the crypt command will ask you for a key. It's a password of your choice. Note that it won't ask you to retype the key. Once the command has run, you can see the new output file created (with the name given in the command). This file is encrypted and can't be read using cat, more, etc.!

That's it! Your file is encrypted. Now you can safely delete the original file myfile.txt and keep the encrypted copy on the machine.

Unlocking / decrypting file with key

Now, to retrieve the file content, i.e. to decrypt the file, run the same command with the input and output file names swapped: the encrypted file becomes the input and the original filename becomes the output.

# rm myfile.txt
# crypt < myfile.crypt > myfile.txt
Enter key:
# ll myfile.txt
-rw-------   1 root       users           29 Dec 12 11:51 myfile.txt
# cat myfile.txt
This is test file for crypt.

The crypt command checks the input file and recognizes that it is encrypted, so it uses the key supplied by the user to decrypt it into the output file specified in the command. You get your file back as it was!