
sar command (Part I): All you need to know with examples

Learn the System Activity Report sar command with real-world scenario examples. Understand the command's log files, execution, and different usage options.

SAR! System Activity Report! The sar command is the second most used command for checking system performance or utilization, after the top command. From the man page: 'The sar command writes to standard output the contents of selected cumulative activity counters in the operating system. The accounting system, based on
the values in the count and interval parameters, writes information the specified number of times spaced at the specified intervals in seconds.' No doubt, this is one of the best performance monitoring tools available to any sysadmin.


Command log file management:

sar keeps collecting system resource utilization and stores it in binary files. These files are called datafiles and are located at the path /var/log/sa/saXX, where XX is the day of the month in dd format. So this could be one of the locations to check when you are troubleshooting file system utilization.

# ll /var/log/sa
total 29024
-rw-r--r-- 1 root root 494100 Dec  1 23:50 sa01
-rw-r--r-- 1 root root 494100 Dec  2 23:50 sa02
-rw-r--r-- 1 root root 494100 Dec  3 23:50 sa03
-rw-r--r-- 1 root root 494100 Dec  4 23:50 sa04
-rw-r--r-- 1 root root 494100 Dec  5 23:50 sa05
-rw-r--r-- 1 root root 494100 Dec  6 23:50 sa06
-rw-r--r-- 1 root root 494100 Dec  7 23:50 sa07

----- output clipped -----

Log files are binary and hence can be read only with sar, using the -f option. A normal sar command shows you data in real-time when executed. If you need to check historic data, you need to use the -f option and provide the path of the particular datafile.

# sar -u 2 3
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr1)         12/19/2016      _x86_64_        (4 CPU)

11:44:29 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:44:31 AM     all     25.37      0.00     10.12      0.00      0.00     64.50
11:44:33 AM     all     25.41      0.00     10.39      0.13      0.00     64.08
11:44:35 AM     all     27.84      0.00     11.36      0.12      0.00     60.67
Average:        all     26.21      0.00     10.62      0.08      0.00     63.08

In the above example, when executed, the command runs for 3 iterations (we will see what these numbers mean later in this post) spaced 2 seconds apart, and shows you output in real-time. Let's see the -f option:

# sar -u 2 3 -f /var/log/sa/sa15
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr1)         12/15/2016      _x86_64_        (4 CPU)

12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:10:01 AM     all     10.24      0.00      5.18      0.17      0.00     84.41
12:20:01 AM     all     11.55      0.00      5.02      0.19      0.00     83.24
12:30:01 AM     all     10.79      0.00      4.79      0.17      0.00     84.25
Average:        all     10.86      0.00      5.00      0.17      0.00     83.97

In the above example, we ran the sar command on a datafile, /var/log/sa/sa15. Hence the data is read from older/historic datafiles and is not real-time. The file's first entry is always treated as the first iteration, and further data is displayed according to the command arguments. Hence you can see the first entry is at 12 AM.

Another beauty of this command for log management is that you can save real-time command output in a log file of your choice. Let's say you need to share the output of a specific monitoring window; you can save the output to a log file and share that. This way, you don't have to share the complete day's datafile. You have to use the -o option along with a file path of your choice.

# sar -u 2 3 -o /tmp/logfile
Linux 2.6.39-200.24.1.el6uek.x86_64 (testsvr1)         12/19/2016      _x86_64_        (4 CPU)

11:51:42 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:51:44 AM     all     27.75      0.00      9.88      0.12      0.00     62.25
11:51:46 AM     all     26.00      0.00      9.88      0.12      0.00     64.00
11:51:48 AM     all     25.53      0.00     10.26      0.00      0.00     64.21
Average:        all     26.43      0.00     10.00      0.08      0.00     63.48
# ls -lrt /tmp/logfile
-rw-r--r-- 1 root root 63672 Dec 19 11:51 /tmp/logfile

In the above example, you can see the output is displayed on the terminal as well as written to the file provided in the command options. Note that this file is a binary file too.
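
Since the saved file is binary, you can read it back later the same way you read the system datafiles, using the -f option we saw earlier:

# sar -u -f /tmp/logfile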

Command Intervals and Iterations:

This command takes these two arguments which will define the time factors of output.

The interval is the time in seconds between two output samples, normally chosen as 2, 5, or 10 seconds. The iteration or count is the number of samples to be taken, each separated by the defined interval. So a command like sar 2 5 means a 2-second interval and 5 iterations, i.e. take 5 samples separated by 2 seconds each. So if the command is fired at 12:00:00, the output will include samples for 12:00:02, 12:00:04, and so on until 12:00:10. Check any example above and you will figure out how it works.

If the interval parameter is set to zero, the sar command displays the average statistics for the time since the system was started. If the interval parameter is specified without the count parameter, then reports are generated continuously, as shown below.

# sar -u 2
Linux 2.6.39-200.24.1.el6uek.x86_64 (oratest02)         12/19/2016      _x86_64_        (4 CPU)

12:09:28 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
12:09:30 PM     all      0.75      0.00      0.50      0.25      0.00     98.50
12:09:32 PM     all      0.88      0.00      0.38      0.13      0.00     98.62
12:09:34 PM     all      1.12      0.00      1.75      0.25      0.00     96.88
12:09:36 PM     all      2.38      0.00      1.38      0.12      0.00     96.12
12:09:38 PM     all     14.79      0.00      7.39      0.50      0.00     77.32
------- continuous reports being generated, output clipped -----

We will see useful monitoring example of this command in next post.

How to scan new lun / disk in Linux & HPUX

Howto guide to scan new disk or LUNs on Linux or HPUX machines. This guide explains steps to scan and then identify new disk device names.

When you add a new disk to the system, you need to scan it so that the kernel can identify the new hardware and assign a disk name to it. The newly added disk can be local or from storage. If it's local, it's an addition of a disk into a free disk slot attached to the server. If it's a storage LUN, it's masking and zoning at the storage level to the WWN of the server.

Once the disk / LUN is made available/visible to the server, the next step is to scan it. The kernel keeps a known hardware tree, and this tree needs to be updated with the new disk information. To let the kernel know that a new disk is available to the server, disk scanning is required. If the disk is from a storage array, there are chances you have storage vendor utilities/scripts available to scan storage on the server, for example: evainfo (for EVA storage), xpinfo (for XP12K storage), powermt (for EMC storage). If these utilities are not available, you can still scan from the OS.

HPUX disk scan:

In HPUX, we have the dedicated ioscan command to scan new hardware. You can ask the command to scan only hard disks with the -C option, i.e. class. Before executing this command, keep the output of the previously detected disks (ioscan -fnC disk) handy. This output can be compared with the new output (command below) to identify the new disk.

# ioscan -fnC disk
Class     I  H/W Path        Driver  S/W State   H/W Type     Description
==========================================================================
disk      4  0/0/1/1.0.0     sdisk   CLAIMED     DEVICE       HP 36.4GST373455LC#36
                            /dev/dsk/c1t0d0   /dev/rdsk/c1t0d0
disk      0  0/0/1/1.2.0     sdisk   CLAIMED     DEVICE       HP 36.4GST373455LC#36
                            /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
disk      1  0/0/2/0.2.0     sdisk   CLAIMED     DEVICE       HP 36.4GST373455LC#36
                            /dev/dsk/c2t2d0   /dev/rdsk/c2t2d0
disk      2  0/0/2/1.2.0     sdisk   CLAIMED     DEVICE       HP      DVD-ROM 305
                            /dev/dsk/c3t2d0   /dev/rdsk/c3t2d0
disk      3  0/10/0/1.0.0.0  sdisk   CLAIMED     DEVICE       I2O     RAID5
                            /dev/dsk/c4t0d0   /dev/rdsk/c4t0d0

The scan output shows you all detected disks on the system and their assigned disk names in CTD format. Sometimes ioscan is unable to install special device files for newly detected disks; in such a situation, you can run the insf (install special files) command to ensure all detected hardware has device files in place.

# insf -e
insf: Installing special files for btlan instance 0 address 0/0/0/0
insf: Installing special files for stape instance 1 address 0/0/1/0.1.0
insf: Installing special files for sctl instance 0 address 0/0/1/0.7.0
insf: Installing special files for sdisk instance 4 address 0/0/1/1.0.0
insf: Installing special files for sdisk instance 0 address 0/0/1/1.2.0
insf: Installing special files for sctl instance 1 address 0/0/1/1.7.0
----- output clipped ----

A new disk can also be identified by comparing the directory structure of /dev/disk or /dev/dsk/ before and after the scan. Any new addition to these directories during the scan is your new disk.

Once you identify this new disk, you can use it on the system via volume managers like LVM.

Linux Disk scan:

In Linux, it’s a bit tricky since there is no direct ioscan available. First, you need to get currently available disk details using fdisk command as below :

# fdisk -l |egrep '^Disk' |egrep -v 'dm-'|grep -v identifier
Disk /dev/sda: 74.1 GB, 74088185856 bytes
Disk /dev/sdb: 107.4 GB, 107374182400 bytes
Disk /dev/sdd: 2147 MB, 2147483648 bytes
Disk /dev/sde: 2147 MB, 2147483648 bytes
Disk /dev/sdc: 2147 MB, 2147483648 bytes

Keep this list handy to compare with the list after scan.
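
One way to make the comparison painless is to save the disk list before and after the scan and diff the two (a small sketch reusing the same fdisk filter; the /tmp paths are just examples):

# fdisk -l | egrep '^Disk' | egrep -v 'dm-' | grep -v identifier > /tmp/disks.before

Then, after performing the scan steps below:

# fdisk -l | egrep '^Disk' | egrep -v 'dm-' | grep -v identifier > /tmp/disks.after
# diff /tmp/disks.before /tmp/disks.after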

Scan SCSI disks

Now, if you have connected disks via SCSI then you need to scan SCSI hosts on the server. Check the current list of hosts on the server as below :

# ls /sys/class/scsi_host/
host0  host1  host2  host3

Now, you have 4 SCSI hosts on this server (in the example above). You need to scan all these 4 hosts in order to detect any new disks attached to them. This can be done by writing "- - -" into their respective scan files. See the commands below:

echo "- - -" > /sys/class/scsi_host/host0/scan
echo "- - -" > /sys/class/scsi_host/host1/scan
echo "- - -" > /sys/class/scsi_host/host2/scan
echo "- - -" > /sys/class/scsi_host/host3/scan

This completes your scan on SCSI hosts on the server. Now you can again run fdisk command we saw previously and compare the new output with the old one. You will see a new disk being added to the system and its respective device name too.

Scan FC LUNs:

If you have connected disks via FC then you need to scan FC hosts on the server. Check the current list of hosts on the server as below :

# ls /sys/class/fc_host
host0  host1

Now there are 2 FC hosts on the server. Again, we need to scan them by writing 1 into their respective issue_lip files, along with the scan steps from above.

# echo "1" > /sys/class/fc_host/host0/issue_lip
# echo "- - -" > /sys/class/scsi_host/host0/scan
# echo "1" > /sys/class/fc_host/host1/issue_lip
# echo "- - -" > /sys/class/scsi_host/host1/scan

This will scan your FC HBAs for newly visible disks. Once the commands complete (check syslog for the completion event), you can use the fdisk command to list disks. Compare the output with the 'before scan' output to get the new disk names!

Move disks/LUN from one server to another without losing data

Howto guide for moving disks or LUNs from one server to another without losing any data. This guide applies to disks or LUNs configured under LVM.

In Unix or Linux infra, it's a pretty common scenario to move disks or storage LUNs from one server to another with the data on them intact. This is something that happens in clusters automatically, i.e. handled by cluster services. When the primary node goes down, cluster services move disks or LUNs from the primary to the secondary node and make the secondary node available for use.

We are going to see how to do this manually using commands. This howto guide gives you an insight into what cluster services do in the background to move data across nodes in case of failures. We will be using LVM (Logical Volume Manager) as our disk manager in this guide, since it's the most widely used volume manager, next to VxVM (Veritas Volume Manager).

Steps flow goes like this :

  1. Stop disk access on server1
  2. Remove disk / LUN from server1
  3. Present disk / LUN to server2
  4. Identify new disk / LUN on server2
  5. Import it into LVM
  6. Make it available to use on server2

Let’s see these steps one by one in detail with commands and their outputs. We will be moving mount point /data from server1 to server2. /data is mounted on /dev/vg01/lvol1.

1. Stop disk access on server1

First, stop all user/app access to the related mount points. In our case it's /data. You can check whether anyone is accessing the mount point, and kill their processes, using the fuser command.

# fuser -cu /data         #TO VIEW USERS
/data:   223412c(user1)
# fuser -cku /data        #TO KILL USERS
# fuser -cu /data

Once you are sure no one is accessing mount point, go ahead and unmount it.

# umount /data

2. Remove disk / LUN from server1

Now, we need to remove the disk or LUN from server1's LVM so that it can be detached from the server gracefully. For this we will be using the vgexport command, so that a configuration backup can be imported on the destination server.

# vgchange -a n /dev/vg01
Volume group "/dev/vg01" has been successfully changed.
# vgexport -v -m /tmp/vg01.map vg01
Beginning the export process on Volume Group "/dev/vg01".
vgexport: Volume Group "/dev/vg01" has been successfully removed.

To export the VG, you need to deactivate the volume group first and then export it with a map file. Transfer this map file to server2 with FTP or SFTP so that it can be used while importing the VG there.

Now your VG vanishes from server1, i.e. the related disk / LUN is no longer associated with server1's LVM. Since the VG is only exported, the data is physically intact on the disk / LUN.

3. Present disk / LUN to server2

Now, you need to physically remove the disk from server1 and physically attach it to server2. If it's a LUN, then remove the mapping of the LUN to server1 and map it to server2. You will need to redo the zoning at the storage level for the WWNs of both servers.

At this stage, your disk / LUN is removed from server1 and now available/visible to server2. But, it’s not yet known to LVM of server2.

4. Identify new disk / LUN on server2

To identify the newly presented/mapped disk/LUN on server2, you need to scan the hardware or FC. Once you get the disk name for it (as identified by the kernel), proceed with the next LVM steps.

Read here : Howto scan new lun / disk in Linux & HPUX

5. Import it in LVM

Now, we have disk / LUN identified on server2 along with the VG map file from server1. Using this file and disk name, proceed with importing VG in server2.

# vgimport -v -m /tmp/vg01.map /dev/vg01 list_of_disk
vgimport: Volume group "/dev/vg01" has been successfully created.
Warning: A backup of this volume group may not exist on this machine.
Please remember to take a backup using the vgcfgbackup command after activating the volume group.
# vgchange -a y vg01
Volume group "/dev/vg01" has been successfully changed.
# vgcfgbackup /dev/vg01
Volume Group configuration for /dev/vg01 has been saved in /etc/lvmconf/vg01.conf

First, import the VG with the vgimport command. In place of the list_of_disk argument in the above example, give your disk name(s). You can use any VG name here; it's not mandatory to use the same VG name as on the first server. After a successful import, activate the VG with vgchange.
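
For example, with a hypothetical disk device file (yours will differ):

# vgimport -v -m /tmp/vg01.map /dev/vg01 /dev/dsk/c4t1d0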

6. Make it available to use on server2

At this stage, your disk / LUN is available in server2's LVM with all data on it intact. To make it available for use, we need to mount it on a directory. Use the mount command:

# mount /dev/vg01/lvol1 /data2

Add an entry in /etc/fstab as well, to make sure the mount point gets mounted at boot too.
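
A sample entry would look like the line below (a sketch assuming a VxFS filesystem on HP-UX; use the filesystem type and mount options that apply to your environment):

/dev/vg01/lvol1  /data2  vxfs  delaylog  0  2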

cut command and its examples

Learn how to use the cut command in various scenarios and in scripting with lots of examples. Understand how to extract selective data using cut command.

Everybody in a Linux or Unix environment is familiar with the grep command, which is used for pattern finding/filtering. This is a kind of selective data extraction from source data. cut is another command which is useful for selective data extraction.

The cut command basically extracts a portion of data from each line of a given file or of supplied data. The syntax for the cut command is:

# cut [options] [file]

The different options below can be used with the cut command:

-c To select only this number of characters.

Roughly, you can say it's the character position to be extracted. See the below example:

# cat test
sd
teh
dxff
cq2w31q5
# cut -c 4 test


f
w

Here, cut is instructed to select only the 4th character. If you look closely at the output, it shows only the letters in the 4th position. Lines having fewer than 4 characters are shown blank in the cut output!

This option is useful in scripting when there is a need for single-column extraction from supplied data.
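
For instance, a quick character extraction from piped data (a trivial sketch; the 1-6 range notation is covered later in this post):

# echo "kerneltalks" | cut -c1-6
kernel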

-b To select only this number of bytes.

This option tells the command to select only the given byte position from each line of data and show it in the output. In most cases, its output is the same as with the -c option, since each character is treated as one byte in a plain-text file.

-d Use specified delimiter instead of TAB

Normally, cut uses TAB as the delimiter while filtering data. If your file has a different delimiter, like a comma , in a CSV file or a colon : in the /etc/passwd file, then you can specify that delimiter to the cut command and it will process the data accordingly. This option should always be used together with the -f option. Using only the -d option will result in the below error:

# cut -d a test
cut: you must specify a list of bytes, characters, or fields
Try `cut --help' for more information.

-f To select only the specified fields.

This option specifies which fields are to be filtered from the data. If no delimiter is found in a line (TAB by default, or the character supplied with the -d option), the whole line is printed. For example, if you say -f 2, cut searches each line for TAB or the supplied delimiter; if found, it prints the 2nd field of that line. For lines where it doesn't find the delimiter, it prints the whole line.

# cat test
this is test file
raw data
asdgfghtr
data
# cut -f1 test
this is test file
raw data
asdgfghtr
# cut -d" " -f2 test
is
data
asdgfghtr
data

Field numbering is as below:

Field one is the text to the left of the first occurrence of the delimiter
Field two is the text between the first and second occurrences of the delimiter
Field three is the text between the second and third occurrences of the delimiter, and so on

Observe the above example, where it prints all lines as-is when the delimiter is not specified. When the delimiter is defined as a single space " ", it prints field two according to the numbering explained above.

-s To print only lines with delimiters

A missing delimiter causes cut to print the whole line as-is, which may pollute the desired output. This is dangerous when the command is used in scripts. Hence the -s option is available, which stops cut from printing lines with a missing delimiter.

# cut -f1 -s test

Using the same test file from the above example, there is no output when -s is specified. This is because the default delimiter, TAB, does not exist anywhere in the file.
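
With the delimiter set to a space, -s keeps only the lines that actually contain a space:

# cut -d" " -f1 -s test
this
raw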

Numbers to be used with the -b, -c and -f options:

In all the above examples, we declared single integers like 2 or 4 for the -b, -c and -f options. But we can also use the notation below:

  • x: a single number, as used in all the examples above
  • x-: from the xth byte, character or field till the end of the line
  • x-y: from the xth byte, character or field till the yth
  • -y: from the first byte, character or field till the yth
# cut -d" " -f2- test
is test file
data
asdgfghtr
data
# cut -d" " -f2-3 test
is test
data
asdgfghtr
data
# cut -d" " -f-1 test
this
raw
asdgfghtr
data

A few examples of the cut command:

To get a list of users from the /etc/passwd file, we will use : as the delimiter and cut out the first field.

# cut -d: -f1 /etc/passwd
root
bin
daemon
adm
lp
------ output clipped -----

The cut command can be fed data from a pipe as well. In that case, the last [file] parameter shouldn't be defined; the command will read its input from the pipe and process the data accordingly. For example, here we grep lines containing :0: (UID or GID 0) and then get the usernames using cut, like below:

# cat /etc/passwd |grep :0: | cut -d: -f1
root
sync
shutdown
halt
operator

Getting the user ID and group ID from the /etc/passwd file, along with the usernames.

# cat /etc/passwd |cut -d: -f1-4
root:x:0:0
bin:x:1:1
daemon:x:2:2
adm:x:3:4
lp:x:4:7
sync:x:5:0

How to get directory size in Linux

Learn how to get directory total size in Linux without calculating each and every file’s size within. A very handy command to troubleshoot mount point utilization.

Many times we need to check a specific directory's size to hunt down the culprit of mount point utilization. There are scenarios where mount points keep getting full and we need to investigate which file or directory is hogging most of the space.

To check the highest-sized files or directories collectively, I already briefed you in the post Highest size files in the mount point. Let's see how we can get a directory's collective size in one go.

We will be using the disk usage command du and the below options:

  • -s: Display only a summary of each element
  • -h: Human readable format for sizes i.e. KB, MB, GB
  • -c: Produce a grand total i.e. display the total size
# du -sch /dump1/test/mydir
13G     /dump1/test/mydir
13G     total

Here, the specified directory is 13GB. It's the size of the /dump1/test/mydir directory, not of /dump1.

If you want to check the size of every object beneath the specified directory, then you can skip -s, i.e. the summary option, from the command.

# du -ch /tmp
4.0K    /tmp/hsperfdata_oracle11
4.0K    /tmp/orbit-root

----- output clipped -----

4.0K    /tmp/keyring-VPiB3D
16K     /tmp/lost+found
652K    /tmp
652K    total

In the above output, you can see each and every object's size, and at the end the total is given, which is the final size of the specified directory!
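
To quickly spot the biggest space consumers under a directory, you can combine du with sort (a sketch assuming GNU sort, whose -h flag sorts human-readable sizes):

# du -sh /tmp/* | sort -rh | head -5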

Let me know if you have any queries/questions/feedback in the comments. Also, drop us a message on our contact form.

File encryption / password protect file in HPUX

Learn how to password protect files in HPUX. This is helpful for encrypting publicly readable files using a password and decrypting them whenever needed.

It's pretty obvious that you can control file access using permissions, but sometimes you may want to protect a file lying in a public directory with a password of your choice. Or sometimes you may want even root to not be able to read your files 😉

In that case, you can use the crypt command to encrypt your file with a password of your choice. This command is available in HPUX and Solaris; I couldn't find it in Linux, though. The crypt command is basically used to encrypt or decrypt files. So you will be encrypting your file with a key of your choice, and whenever you want to read it back, you need to decrypt it by supplying the password/key you chose at the time of encryption.

Locking / encrypting file with key

Let's take a sample file, myfile.txt, for encryption. You need to supply this file as input to the crypt command and define the output file (see the example below).

# cat myfile.txt
This is test file for crypt.

# crypt < myfile.txt > myfile.crypt
Enter key:

# cat myfile.crypt
3▒▒▒x▒▒X▒n▒d▒6▒▒=▒▒q▒j

Now, the crypt command will ask you for a key. It's a password of your choice. Note that it won't ask you to retype the key. Once the command executes, you can see the new output file created (with the name given in the command). This file is encrypted and can't be read using cat, more, etc. commands!

That's it! Your file is encrypted. Now you can safely delete your original file myfile.txt and keep the encrypted copy on the machine.

Unlocking / decrypting file with key

Now, to retrieve the file content, i.e. to decrypt the file, you run the same command; only the input and output file names exchange their positions. Now the encrypted filename is your input file and the original filename is the output file name.

# rm myfile.txt
# crypt < myfile.crypt > myfile.txt
Enter key:
# ll myfile.txt
-rw-------   1 root       users           29 Dec 12 11:51 myfile.txt
# cat myfile.txt
This is test file for crypt.

The crypt command checks the input file and recognizes that it is an encrypted one. So it uses the key supplied by the user to decrypt it into the output file specified in the command. You get your file back as it was!

How to configure NTP client in Linux

Learn how to configure NTP (Network Time Protocol) on Linux machines to sync time with the NTP server over the network. Also, learn how to manually sync time with the NTP server.

Nowadays, NTP (Network Time Protocol) is one of the essential things in any IT infrastructure. Apart from production, even development and test environments are backed by NTP to ensure smooth operations. Let's see what NTP is and how to configure it on a Linux machine.

What is NTP

NTP is the protocol used to sync the time of machines with an NTP server (which can be an appliance or another Linux machine) over the network. It aims at keeping all machines' clocks in sync so that there will be no time difference between any two machines in a network. This is very crucial in production environments running financial data. The Network Time Protocol runs on UDP port 123. This port should be open between the time server and client in both directions.

What is NTP server

An NTP server can be another machine running the NTP server-side configuration, or it can be a dedicated NTP appliance. An NTP appliance is a small rackmount, server-like device that has an antenna attached to it. The antenna can be extended to building rooftops to ensure better signal reception. These appliances receive signals and synchronize their own time with satellites in space. Their system time is then used as a benchmark to sync other machines over the network. The appliance comes with its own configuration, which can be done via front display buttons or by connecting to its console. Each vendor has a different set of configs and different methods to set them.

Configure NTP client 

Let's assume we already have an NTP appliance named ntpappliance1.xyz.com with IP 10.10.1.2 in our infra. Now we will see, step by step, how to configure a Linux server to sync time with this appliance over the network.

1. Make sure you have the ntp package installed. If not, install it using the steps defined here.

# rpm -q ntp
ntp-4.2.6p5-1.el6.x86_64
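
If the package is missing, it can be installed from the distribution repositories; on the RHEL/CentOS family shown in this example, that is simply:

# yum install ntp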

2. Edit the /etc/ntp.conf file to add the appliance or NTP server name into it. Add the IP/hostname at the end of the file. If you supply a hostname here, then make sure a relevant entry is present in the /etc/hosts file. I have shown both kinds of entries, IP and hostname, in the below example.

# cat /etc/ntp.conf

------- output clipped ------
# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats
#server 2.sg.pool.ntp.org

server ntpappliance1.xyz.com prefer
server 10.10.1.3

In the above output, two time servers are defined (a third one is commented out), and the one with the prefer option will always get preference while syncing time. If it's not reachable, then one of the remaining servers is chosen by the daemon to sync time with.

3. That’s it! Start your ntp daemon and make sure it’s running.

# /etc/init.d/ntpd start
OR
# service ntpd start
# service ntpd status
ntpd (pid  2261) is running...

4. Check time sync status using command :

# ntpq -p
     remote              refid       st t when poll reach   delay   offset  jitter
==================================================================================
+ntpappliance2.xyz.com 10.10.1.3      3 u   40   64  377  180.764    0.719   0.458
*ntpappliance1.xyz.com 10.10.1.2      3 u   50   64  377  180.851   -0.272   0.149

Here different fields are :

  • remote : Remote time server hostname/IP
  • refid : reference ID of the time source the remote server is synced to
  • st : stratum
  • t : u: unicast, b: broadcast, l: local
  • when : sec/min/hr since last received packet
  • poll : poll interval (log2 s)
  • reach : reach shift register (octal)
  • delay : roundtrip delay
  • offset : offset from server time
  • jitter : Jitter (noise)

Also, the very first character displayed is the state, i.e. the + or * sign. The possible values are:

  • + means good connectivity and a preferred remote server
  • * means the currently selected time server for sync
  • - means do not use to sync due to out of tolerance (cluster algorithm)
  • x means do not use to sync due to out of tolerance (intersection algorithm)
  • # means good connectivity but not used to sync yet

Manually sync time with the server

The NTP daemon runs in the background and syncs time according to the polling configuration. But if you want to manually sync time with the time server right away, you can do it with the below command:

# ntpdate -u ntpappliance2.xyz.com
10 Dec 13:20:05 ntpdate[30337]: adjust time server 10.10.1.3 offset -0.000437 sec

It will update the time with the given time server (passed as an argument to the command) right away.

If you are having issues with time server connectivity, then first troubleshoot at the OS and firewall level. You can also grep your syslog for the NTP keyword; all NTP-related messages are logged there, which may help you in troubleshooting.
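
For example, a quick way to pull the NTP-related messages out of syslog:

# grep -i ntp /var/log/messages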

Amazing “who” command

Learn the 'who' command in Linux and its related files. Understand its usage, different options, and scenarios in which the command can be useful.

If you are in Linux/Unix administration, you have probably used the who command many times, mostly for checking the list of users currently logged in to the system. But there is much other stuff this command can do which we overlook. Let's see the power of the who command.

‘who’ command is capable of doing the following :

  1. Provide information about logged-in users
  2. List dead processes
  3. Shows last system boot time
  4. List system login processes
  5. List processes spawned by init
  6. Shows runlevel info
  7. Track last system clock change
  8. User’s message status

Let's see all the above uses one by one and understand where the command fetches this information from.

1. Provide information about logged-in users

This is a pretty famous use of the who command. When run without any argument, it shows output similar to the one below:

# who
root     pts/1        2016-12-09 10:48 (10.10.42.56)
user4    pts/2        2016-12-09 10:53 (10.10.42.22)

The output consists of 5 columns where,

  • The first field is the username of the user who is logged in currently
  • The second field is the terminal from where the user is logged in
  • Third and fourth fields are date and time of login
  • The fifth field is the IP address/hostname from where the user logged in.

who reads information from the /var/run/utmp file and presents it in a formatted manner in this output.

Read also: How to check bad logins in HPUX

2. List dead processes

This is helpful during troubleshooting of performance issues or system cleanup. Some processes go dead on the system, i.e. they are not properly terminated or closed after their execution completes. These processes can be seen using who -d.

# who -d
         pts/1        2016-12-09 11:46             27696 id=ts/1  term=0 exit=0
         pts/2        2016-12-03 00:34             23816 id=ts/2  term=0 exit=0
         pts/3        2016-12-03 00:34             23856 id=ts/3  term=0 exit=0

The output shows the terminal from which the process was started in the first field, followed by the date, time, process ID and other details.

3. Shows last system boot time

If you want to quickly check when the system was booted, then who -b is your way out. No need to search through log files or to back-calculate the date from the system uptime. Just one short command and you will be presented with the last boot time.

# who -b
         system boot  2016-03-17 19:39

4. List system login processes

These are the currently active login processes, i.e. getty processes, on the system. Using who -l you get details of the login processes. This information can also be traced/verified in the ps -ef output.

# who -l
LOGIN    tty3         2016-03-17 19:39              2502 id=3
LOGIN    tty6         2016-03-17 19:39              2522 id=6
LOGIN    tty5         2016-03-17 19:39              2515 id=5
LOGIN    tty2         2016-03-17 19:39              2497 id=2
LOGIN    tty4         2016-03-17 19:39              2510 id=4

Here,

  • The first field is the process type
  • The second field denotes terminal used
  • Third and fourth are the date and time of spawn
  • The fifth field is process id i.e. PID
  • Sixth is identification/sequence number

The above processes can be seen in ps -ef filtered output as well.

# ps -ef |grep -i getty
root      2497     1  0 Mar17 tty2     00:00:00 /sbin/mingetty /dev/tty2
root      2502     1  0 Mar17 tty3     00:00:00 /sbin/mingetty /dev/tty3
root      2510     1  0 Mar17 tty4     00:00:00 /sbin/mingetty /dev/tty4
root      2515     1  0 Mar17 tty5     00:00:00 /sbin/mingetty /dev/tty5
root      2522     1  0 Mar17 tty6     00:00:00 /sbin/mingetty /dev/tty6

5. List processes spawned by init

Whenever the system finishes booting, the kernel loads services/processes with the help of the init program. Any active processes on the server that were spawned by the init program can be viewed using who -p. Currently, my server has no such processes, so there is no output to share here.

6. Shows runlevel info

Another famous use of who, with the -r argument: to check the current run level of the system we use who -r.

# who -r
         run-level 5  2016-03-17 19:39

In the output, it shows the current run level number and the date and time when the system entered this run level. In HPUX, this command's output also shows the previous run level the system was in before entering the current one.

Read also: Run levels in HPUX

7. Track last system clock change

Normally this option, i.e. who -t, is not very useful nowadays. Since all systems now run with NTP (Network Time Protocol), time syncs with the NTP server and there is no manual intervention for system time changes. The who -t command shows the details of the last system clock change, if done manually.

8. User’s message status

Linux has a native messaging system using which one can send messages to a logged-in user's terminal. who -T gives you the users' message status, i.e. whether the messaging service is enabled or disabled for each user. A user may opt out of this service from his terminal using the mesg n command. This will prevent any messages from being displayed on his terminal.

By observing the users' message status, one can determine which users won't receive a broadcast message when it is sent. This helps to analyze which users may not be aware of happenings that are notified through broadcast messages. Mainly, system reboot and shutdown send out broadcast messages to all logged-in users. If a user doesn't see them, he may lose his work during the system down event.

# who -T
user3      + pts/0        2016-12-09 11:42 (10.10.49.12)
testuser2  - pts/1        2016-12-09 12:38 (10.10.49.12)

In the above output, I logged in as testuser2 and opted out of the messaging service with mesg n. You can see the - sign against testuser2, while user3 is shown with +, i.e. he has messaging enabled and will receive messages on his terminal.
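
To check or toggle your own terminal's message status, use mesg directly (a quick sketch; y enables messaging, n disables it):

# mesg
is y
# mesg n
# mesg
is n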

Understanding /etc/hosts file

/etc/hosts is a key file for name resolution in any Linux or Unix system. Learn the fields and format within the /etc/hosts file. Understand the meaning of each field and how it can be set.

This is also one of the important files in a Linux-Unix system, like /etc/passwd or /etc/fstab. Name resolution in the Lx-Ux system is handled by this file. Whenever the system needs to resolve a hostname to an IP, it searches for it in the /etc/hosts file. If DNS is configured on the system (and preferred by the name service switch), then DNS is used and this file doesn't play much of a role in name resolution. Basically, this file is a static IP lookup table on the server.

It's a text file that can be viewed using the cat, more, less, etc. commands. One can edit this file using a text editor like vi. A sample /etc/hosts file is shown below:

# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 
::1         localhost localhost.localdomain localhost6
10.10.1.64  server34 #This server

# Test servers 
10.10.1.12 test01
10.10.1.121 test02

# NTP server
10.10.1.85 ntpsvr1.kerneltalks.com  ntpsrv1
10.10.1.86 ntpsvr2.kerneltalks.com     #standby server

The format followed is <IP> <FQDN> <alias>, where fields are separated by spaces or tabs, with one IP per line. Comments can be added as lines starting with the # symbol, and can also be added on the same line as an IP entry; any text following the # symbol is ignored until the end of the line. These lines are ignored by the kernel/shell/program when it reads this file. They are just comments added for the understanding of the (human) user. In the above example, 'NTP server' and 'standby server' are both comments.

A hostname can contain only alphanumeric characters, the minus sign - and the period . It should always start with a letter and end with an alphanumeric character.

There is also a 3rd field in each row, which is optional. This field is for aliases: short names, alternate names, etc. for the same IP. In the above example, ntpsrv1 is an alias for IP 10.10.1.85.
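
On Linux, you can verify how such an entry resolves through the configured name services with getent, for example using the alias from the sample file above:

# getent hosts ntpsrv1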

You will see a couple of default entries in every /etc/hosts file in your environment. The most common are the loopback address, i.e. 127.0.0.1, pointing to localhost, and the server's own IP & hostname entry.

Build Syslog server in Linux for centralized log management

Step by step guide to configure Syslog Server in a Linux environment. Learn how to enable remote Syslog logging in Linux for centralized log management.

In many IT infrastructure environments, clients choose to have one centralized Syslog server into which logs from all remote systems are collected. It is then easier to filter, monitor, and verify reports in a single location rather than querying all systems in the infra. In this post, we will see how to configure a Linux machine to act as a Syslog server.

The configuration has two parts: first, server-side configuration on the Linux machine which will act as the Syslog server; second, client-side configuration on the remote systems that will send logs to the Syslog server.

Server side configurations:

A machine which will be acting as the Syslog server should have the below pre-requisites done:

  1. syslog daemon i.e. syslogd should be up and running
  2. portmap and xinetd services should be running
  3. Targeted client machines' IP range should be able to reach the Syslog server over the network.
# service syslog status
syslogd (pid  3774) is running...
klogd (pid  3777) is running...
# service portmap  status
portmap (pid 3891) is running...
# service xinetd  status
xinetd (pid  4410) is running...

Once you make sure all related services are running, proceed to edit the syslogd options file, i.e. /etc/sysconfig/syslog. You need to add the -r option there, which enables the daemon to receive logs from remote machines.

# cat /etc/sysconfig/syslog

# Options to syslogd
# -m 0 disables 'MARK' messages.
# -r enables logging from remote machines
# -x disables DNS lookups on messages recieved with -r
# See syslogd(8) for more details
SYSLOGD_OPTIONS="-m 0"
# Options to klogd
# -2 prints all kernel oops messages twice; once for klogd to decode, and
#    once for processing with 'ksymoops'
# -x disables all klogd processing of oops messages entirely
# See klogd(8) for more details
KLOGD_OPTIONS="-x"
#
SYSLOG_UMASK=077
# set this to a umask value to use for all log files as in umask(1).
# By default, all permissions are removed for "group" and "other".

Here you can see a row with the parameter SYSLOGD_OPTIONS="-m 0". This needs the -r option added, like SYSLOGD_OPTIONS="-r -m 0".

Edit the file with a text editor like vi and add the -r option as stated above. To pick up these new changes, restart the Syslog service.

# service syslog restart
Shutting down kernel logger:          [  OK  ]
Shutting down system logger:          [  OK  ]
Starting system logger:               [  OK  ]
Starting kernel logger:               [  OK  ]

Now your server's Syslog daemon is ready to accept logs from remote machines. All messages from remote machines, as well as the Syslog server's own messages, will be logged in /var/log/messages on the Syslog server. Its own messages will have "localhost" in the 2nd field after the date, while remote machine logs will have the remote IP/hostname in the 2nd field instead.

It should look like below once it starts populating the remote machines' logs too. The first entry is its own and the second one is a remote server's log.

Nov 10 12:34:44 localhost syslogd 1.4.1: restart (remote reception).
Nov 10 12:34:44 server3  snmpd[4380]: Connection from UDP: [10.100.49.125]:55234

Client side configurations:

On the client machine, you need to edit the Syslog configuration file /etc/syslog.conf. Here you need to instruct the Syslog daemon to send logs to the remote Syslog server.

Open the /etc/syslog.conf configuration file and append user.* @<server IP> to the end of it, where <server IP> is your Syslog server's IP. If you have added the Syslog server's IP to /etc/hosts on the client machine, then you can give the hostname in the above entry instead of the IP.

user.* defines the type of log messages to be sent to the Syslog server. If you want to send all messages to the Syslog server, you can use *.*, or you can choose from the message types defined in this config file itself. Read the file below and you will get to know the different types. Defining *.* is not advisable, since it will flood the Syslog server with logs and its storage might fill up if many machines send logs to the server at a time.

This should look like below. Check last line of file :

# cat /etc/syslog.conf

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.*                                                 /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none                /var/log/messages

# The authpriv file has restricted access.
authpriv.*                                              /var/log/secure

# Log all the mail messages in one place.
mail.*                                                  -/var/log/maillog


# Log cron stuff
cron.*                                                  /var/log/cron

# Everybody gets emergency messages
*.emerg                                                 *

# Save news errors of level crit and higher in a special file.
uucp,news.crit                                          /var/log/spooler

# Save boot messages also to boot.log
local7.*                                                /var/log/boot.log

user.*          @10.12.2.5

After editing the conf file, restart the syslog daemon to put this new config into action.

You can send a test log to check if your setup is working, using the below command:

# logger -p user.info "Test log message"

This will send a user.info type message to Syslog locally. It will be logged to the local /var/log/messages and also get forwarded to the Syslog server at the mentioned IP. You should see the below entries:

On local i.e. client 
# tail -1 /var/log/messages
Dec  7 01:27:09 localhost root: "Test log message"

On syslog server 
# tail -1 /var/log/messages
Dec  7 01:27:09 server3 root: "Test log message"

This confirms your Syslog server is accepting remote logs perfectly, and the machine you configured as the client is sending logs to the server too!