Beginner’s guide: 4 Linux group management commands

Learn to manage groups in Linux with these group management commands. The article includes how to create, modify, delete, and administer groups.

Group management in Linux

Groups on the Linux system are a bunch of users created for easy access/permission management. One user can be a member of one or many groups. Users will have only one primary and one/many secondary groups. In our other article we have seen user management commands in Linux/Unix. In this article we will discuss group management. There are mainly 4 commands used to manage user groups on Linux systems :

  1. groupadd
  2. groupmod
  3. groupdel
  4. gpasswd

Let’s check all these commands and fields they are responsible in /etc/group file.

groupadd command

As the name suggests, it is used to create new groups on the Linux system. groupadd command needs a group name as an argument.

# groupadd sysadmins

# cat /etc/group
sysadmins:x:502:

This command creates a group named sysadmins. A newly created group can be verified in /etc/group file. Study fields in /etc/group file here.

Several common switches which works with groupadd are :

  • -g : Specify GID of your choice
  • -o : Create a group with non-unique GID
  • -r : Create a system group. (GID will be taken from system group GID range)

groupmod command

If you want to edit parameters like name, GID, uniqueness of group which already exist in the system then you can modify group using groupmod. Below the list of the switch with their desired values should feed to this command –

  • -g : new GID
  • -o : Make it non-unique
  • -n : New name
# groupmod -n newsysadmins sysadmins
# cat /etc/group |grep sys
newsysadmins:x:502:

# groupmod -g 9999 sysadmins
# cat /etc/group 
sysadmins:x:9999:

# groupmod -o -g 3 sysadmins
# cat /etc/group |grep sys
sys:x:3:bin,adm
sysadmins:x:3:


Observe above outputs where we changed the name, gid of the group and lastly we assigned the same GID 3 (non-unique) to our group which was already existing.

groupdel command

That’s the command where group ends their life! Yes, group deletion is performed using this command. This command is pretty simple. Just supply your group name and it will be deleted from the system.

# groupdel sysadmins

gpasswd command

This command is used to administer group. Administering groups includes :

  1. Adding/removing users to/from group
  2. Setting and removing group password
  3. Making a user administrator/member of a group

Adding and removing user in the group is done with switch -a and -d followed by user name and lastly group name. Check below examples :

# gpasswd -a shri sysadmins
Adding user shri to group sysadmins

# cat /etc/group | grep sysadmin
sysadmins:x:3:shri

# gpasswd -d shri sysadmins
Removing user shri from group sysadmins

# cat /etc/group | grep sysadmin
sysadmins:x:3:


Password set is done without any switch while password removal is with -r switch as below :

# gpasswd sysadmins
Changing the password for group sysadmins
New Password:
Re-enter new password:

What is the use of group password in Linux? 

This question comes to many of us. Hardly rather no one uses this feature at all. The idea must be to secure a group from non-member users. But since a group password should be known to all group members, it actually doesn’t make any sense to use it. Then you might ask then why group passwords exist in the first place? It may be just following the user (password security) model to groups as well to maintain symmetry in design. I mean it’s just my thought. Let me know if you have any other reason which suits group password existence!

Making any user administrator of the group grants him the privilege to administer the group. Member, the user is just a member of the group and can not administer it. You can make user administrator of the group with -A switch and member with -M. By default, the user is added to the group as a member

# gpasswd -A shri sysadmins
# gpasswd -M shri sysadmins

Those are all group management commands in Linux with their most used switches. Let us know any addition/correction/feedback in the comments!

Finger command in Linux

Learn how to get user details by using finger command in Linux. List of switches for finger command and list of parameter information in this article.

Finger command howto!

The finger is a user information lookup program in Linux. It is used to get system user details like the user, home directory, last login, user shell, etc. This command is useful to see these parameters which otherwise you have to look under /etc/passwd and last login records.

Sometimes you will not find finger command in your out of box distribution. You can install a finger package and proceed to use this command. Sample output of installation on Redhat for your reference.

# yum install finger
Loaded plugins: amazon-id, rhui-lb, search-disabled-repos, security
Setting up Install Process
epel/metalink                                                                                                                         |  15 kB     00:00
epel                                                                                                                                  | 4.3 kB     00:00
epel/primary_db                                                                                                                       | 5.9 MB     00:04
rhui-REGION-client-config-server-6                                                                                                    | 2.9 kB     00:00
rhui-REGION-rhel-server-releases                                                                                                      | 3.5 kB     00:00
rhui-REGION-rhel-server-releases/primary_db                                                                                           |  56 MB     00:00
rhui-REGION-rhel-server-releases-optional                                                                                             | 3.5 kB     00:00
rhui-REGION-rhel-server-releases-optional/primary_db                                                                                  | 5.4 MB     00:00
rhui-REGION-rhel-server-rh-common                                                                                                     | 3.8 kB     00:00
Resolving Dependencies
--> Running transaction check
---> Package finger.x86_64 0:0.17-40.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================
 Package                      Arch                         Version                              Repository                                              Size
=============================================================================================================================================================
Installing:
 finger                       x86_64                       0.17-40.el6                          rhui-REGION-rhel-server-releases                        22 k

Transaction Summary
=============================================================================================================================================================
Install       1 Package(s)

Total download size: 22 k
Installed size: 27 k
Is this ok [y/N]: y
Downloading Packages:
finger-0.17-40.el6.x86_64.rpm                                                                                                         |  22 kB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : finger-0.17-40.el6.x86_64                                                                                                                 1/1
  Verifying  : finger-0.17-40.el6.x86_64                                                                                                                 1/1

Installed:
  finger.x86_64 0:0.17-40.el6

Complete!

Finger command requires the username as an argument. Without any switch finger shows below details.

# finger shrikant
Login: shrikant                         Name: Shrikant Lavhate
Directory: /home/ec2-user               Shell: /bin/bash
On since Wed Jul  5 00:31 (EDT) on pts/0 from 59.184.183.234
No mail.
No Plan.

It displays –

  1. Login: Login id
  2. Name: Comment in /etc/passwd against that user
  3. Directory: Home directory of the user
  4. Shell: The user login shell
  5. Last login time and IP from where he/she was logged in
  6. Email status
  7. Plan : (the content of .plan file in user’s home directory)

Email status can be one of the below –

  • No Mail. :  if there is no mail at all
  • Mail last read DDD MMM ## HH:MM YYYY (TZ):  if the person has
    looked at their mailbox since new mail arriving
  • New mail received … : Same as above
  • Unread since … :  if the user has new mail

Finger command switches

Finger command supports a few switches. The above output without any switch is the same output for the switch -l (multi-line listing). It also displays the content of the files .plan, .project, .pgpkey, and .forward from the user’s a home directory if they exist.

Another switch is -s which can be used for more information like terminal name, write status, idle time, login time and contact details, etc.

# finger -s  shri
Login     Name              Tty        Idle  Login Time   Office     Office Phone
shri      Shrikant Lavhate   *     *    No     logins

In this output you can see * for :

  • terminal: When the unknown device
  • Write status: If write permission is denied
  • login and idle time: If nonexistent

The last switch is -m which prevents user matching. Finger command matches the supplied user name in userid and user comment details. To avoid matching it in comment details and only check-in user ids this switch can be used.

Finger can even be used to lookup remote user information by using user@host format.

How-to guide: LVM snapshot

Understand what is LVM snapshot and why to use it. Learn how to create, mount, unmount, and delete the LVM snapshot.

LVM snapshot How to and when to!

LVM (Logical Volume Manager) is one of the widely used volume managers on Unix and Linux. There were series of LVM articles we published at kerneltalks in the past. You can refer them in below list :

In this article we will be discussing how to take snapshots of the logical volume. Since LVM is widely used, you must know how to take a backup of logical volume on the LVM level. Continue reading this article to know more about LVM snapshots.

What is LVM snapshot?

LVM snapshot is a frozen image of the logical volume. It is an exact copy of LVM volume which has all the data of volume at the time of its creation. LVM copies blocks or chinks from volume to volume so it’s comparatively fast than file-based backup. Backup volume is read-only in LVM and read-write in LVM2 by default.

Why to use LVM snapshot?

LVM snapshots are a quick and fast way to create a data copy. If your volume size is big and you can not tolerate performance issues while doing online data backup or downtime while doing offline data backup then LVM snapshot is the way to go. You can create snapshots and then use your system normally. While the system is serving production, you can take a backup of your backup volume at your leisure without hampering system operations.

Another benefit of using snapshot is your data remain un-changed while the backup is going on. If you take a backup of production/live volume then data on volume may change resulting inconsistency in the backup. But in case of snapshot since no app/user using it or writing data on it, taking backup from a snapshot is consistent and smooth.

How to take LVM snapshot?

LVM snapshots are easy to take. You need to use the same lvcreate command with -s switch (denotes snapshot) and supply the name of the volume to back up in the end. Important things here are to decide the size of the backup volume. If backup volume size is unable to hold data volume then your snapshot becomes unusable and it will be dropped. So make sure you have little extra space in backup volume than your original data volume (per man page 15-20% might be enough).

Lets create LVM snapshot.

# df -h /mydata
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/vg01-lvol0   93M  1.6M   87M   2% /mydata

# lvcreate -L 150 -s -n backup /dev/vg01/lvol0
  Rounding up size to full physical extent 152.00 MiB
  Reducing COW size 152.00 MiB down to maximum usable size 104.00 MiB.
  Logical volume "backup" created.

In the above example, we created a snapshot of 100MB logical volume. For snapshot we defined 150MB space to be on the safer side. But command itself reduced it to 104MB optimum value to save on space. We also named this volume as a backup with -n switch.

Once the snapshot is created it can be mounted as any other logical volume. Once mounted it can be used as a normal mount point. Remember, LVM snapshots are read-only and LVM2 snapshots are read-write by default.

# mount /dev/vg01/backup /mybackup

# df -h
Filesystem              Size  Used Avail Use% Mounted on
/dev/mapper/vg01-lvol0  93M  1.6M   87M   2% /mydata
/dev/mapper/vg01-backup 93M  1.6M   87M   2% /mybackup

You can see after mounting backup volume its the exact same copy of the original volume. Original volume mounted on /mydata and backup volume mounted on /mybackup.

Now, you can take a backup of the mounted backup volume leaving original volume un-touched and without impacting its service to the environment.

How to delete LVM snapshot?

LVM snapshot deletion can be done as you delete the logical volume.  After backup you may want to delete snapshot then you just need to unmount it and use lvremove command.

# umount /mybackup
# lvremove /dev/vg01/backup
Do you really want to remove active logical volume backup? [y/n]: y
  Logical volume "backup" successfully removed

Conclusion

LVM snapshots are special volumes that hold an exact copy of your data volume. It helps in reducing the overhead of read-write on data volume when taking backup. It also helps for consistent backup since no app/user altering data on snapshot volume.

Process states in Linux

Learn different process states in Linux. Guide explaining what are they, how to identify them, and what do they do.

Different process states

In this article we will walk you through different process states in Linux. This will be helpful in analyzing processes during troubleshooting. Process states define what process is doing and what it is expected to do in the near time. The performance of the system depends on a major number of process states.

From birth (spawn) till death (kill/terminate or exit), the process has a life cycle going through several states. Some processes exist in process table even after they are killed/died, those processes are called zombie processes. We have seen much about the zombie process in this article. Let’s check different process states now. Broadly process states are :

  1. Running or Runnable
  2. Sleeping or waiting
  3. Stopped
  4. Zombie

How to check process state

top command lists total count of all these states in its output header.

top - 00:24:10 up 4 min,  1 user,  load average: 0.00, 0.01, 0.00
Tasks:  88 total,   1 running,  87 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1018308k total,   187992k used,   830316k free,    15064k buffers
Swap:        0k total,        0k used,        0k free,    67116k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      20   0 19352 1552 1240 S  0.0  0.2   0:01.80 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
----- output clipped -----

See highlighted row named Tasks above. It shows the total number of processes and their state-wise split up.

Later in above top output observe column with heading S. This column shows process states. In the output we can see 2 processes in the sleeping state.

You can even use ps command to check process state. Use below syntax :

# ps -o pid,state,command
  PID S COMMAND
 1661 S sudo su -
 1662 S su -
 1663 S -bash
 1713 R ps -o pid,state,command

In the above output you can see column titled S shows state of the process. We have here 3 sleeping and one running process. Let’s dive into each state.

Process state: Running

The most healthy state of all. It indicates the process is active and serving its requests. The process is properly getting system resources (especially CPU) to perform its operations. Running process is a process which is being served by CPU currently. It can be identified by state flag R in ps or top output.

The runnable state is when the process has got all the system resources to perform its operation except CPU. This means the process is ready to go once the CPU is free. Runnable processes are also flagged with state flag R

Process state: Sleeping

The sleeping process is the one who waits for resources to run. Since its on the waiting stand, it gives up CPU and goes to sleep mode. Once its required resource is free, it gets placed in the scheduler queue for CPU to execute. There are two types of sleep modes: Interruptible and Uninterruptible

Interruptible sleep mode

This mode process waits for a particular time slot or a specific event to occur. If those conditions occur, the process will come out of sleep mode. These processes are shown with state S in ps or top output.

Uninterruptible sleep mode

The process in this sleep mode gets its timeout value before going to sleep. Once the timeout sets off, it awakes. Or it awakes when waited-upon resources become available for it. It can be identified by the state D in outputs.

Process state : Stopped

The process ends or terminates when they receive the kill signal or they enter exit status. At this moment, the process gives up all the occupied resources but does not release entry in the process table. Instead it sends signals about termination to its parent process. This helps the parent process to decide if a child is exited successfully or not. Once SIGCHLD received by the parent process, it takes action and releases child process entry in the process table.

Process state: Zombie

As explained above, while the exiting process sends SIGCHLD to parents. During the time between sending a signal to parent and then parent clearing out process slot in the process table, the process enters zombie mode. The process can stay in zombie mode if its parent died before it releases the child process’s slot in the process table. It can be identified with Z in outputs.

So complete life cycle of process can be circle as –

  • Spawn
  • Waiting
  • Runnable
  • Running
  • Stopped
  • Zombie
  • Removed from process table

Bash fork bomb: How does it work

Insights of Bash fork bomb. Understand how the fork bomb works, what it could to your system, and how to prevent it.

Insight of Bash fork bomb

We will be discussing Bash fork bomb in this article. This article will walk you through what is fork bomb, how it works, and how to prevent it from your systems. Before we proceed kindly read the notice carefully.

Caution: Fork bomb may crash your system if not configured properly. And also can bring your system performance down hence do not run it in production/live systems.

Fork bombs are normally used to test systems before sending them to production/live setup. Fork bombs once exploded can not be stopped. The only way to stop it, to kill all instances of it in one go or reboot your system. Hence you should be very careful when dropping it on the system, since you won’t be able to use that system until you reboot it. Let’s start with insights into the fork bomb.

What is fork bomb?

Its a type of DoS attack (Denial of Service). Little bash function magic! Fork bomb as the name suggests has the capability to fork its own child processes in system indefinably. Means once you start fork bomb it keeps on spawning new processes on the system. These new processes will stay alive in the background and keeps eating system resources until the system hangs. Mostly these new processes don’t do anything and keep idle in the background. Also, by spawning new processes it fills up the kernel process limit, and then the user won’t be able to start any new process i.e. won’t be able to use the system at all.

How fork bomb works?

Now, technically speaking: fork bomb is a function. It calls himself recursively in an indefinite loop. Check the below small example :

bombfn ()
{
bombfn | bombfn &
}
bombfn

Above is the smallest variant of the fork bomb function. Going step by step to read this code :

  1. bombfn () : Line one is to define new function named bombfn.
  2. { and } : Contains the function code
  3. bombfn : This last line calls the function (bombfn) to execute
  4. Now the function code within { } – Here the first occurrence of bombfn means function calls himself within him i.e. recursion. Then its output is piped to another call of the same function. Lastly & puts it in background.

Overall, the fork bomb is a function which calls itself recursively, piped output to another call to himself, and then put it in background.  This recursive call makes this whole process repeat itself for indefinite time till system crashes due to resource utilization full.

This function can be shortened as :(){ :|: & };: Observe carefully, bombfn is above example is replaced by : and the last call of function is inlined with ;

 :(){ :|: & };:

Since this seems just a handful of symbols, the victim easily fell for it thinking that even running it on terminal won’t cause any harm as anyway, it will return an error. But it’s not!

How to prevent bash fork bomb?

Fork bomb can be prevented by limiting user processes. If you don’t allow the user to fork many processes then fork bomb won’t be allowed to spawn many processes that could bring down the system. But yes it may slow down system and user running fork bomb won’t be able to run any commands from his session.

Caution : This solution works for non-root accounts only.

Process limit can be set under /etc/security/limits.conf or PAM configuration.

To define process limit add below code in /etc/security/limits.conf file :

user1            soft    nproc          XXX
user1            hard    nproc          YYYYY

Where :

  • user1 is username for which limits are being defined
  • soft and hard are the type of limits
  • nproc is variable for process limit
  • last numbers (XXX, YYYYY) are a total number of processes a user can fork. Set them according to your system capacity.

You can verify the process limit value of user by running ulimit -u from the user’s login.

Once you limit processes and run fork bomb, it will show you below warning until you kill it. But you will be able to use the system properly provided you set limits well within range.

-bash: fork: retry: Resource temporarily unavailable

You can kill it by pressing cntl+c on the next iteration of warning. Or you can kill using root account from another session.

Cowsay: Fun in Linux terminal

Cowsay command to have some ASCII graphics in your Linux terminal! This command displays strings of your choice as a cow is saying/thinking in graphical format.

Cowsay command!

Another article to have some fun in the Linux terminal. Previously we have seen how to create fancy ASCII banners and matrix falling code in Linux terminal. In this article we will see another small utility called cowsay which prints thinking cow ASCII picture on the terminal with a message of your choice.  Cowsay is useful to write eye-catchy messages to users in motd (message of the day)!

From the man page “Cowsay generates an ASCII picture of a cow saying something provided by the user. If run with no arguments, it accepts standard input, word-wraps the message given at about 40 columns, and prints the cow saying the given message on standard output.” That explains the functionality of cowsay. Let’s see it in action!

# cowsay I love kerneltalks.com
 ________________________
< I love kerneltalks.com >
 ------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

😀 this is how it looks in terminal! Awesome eh?

Installation is pretty simple. Install the cowsay package in your Linux and that’s it. For reference below are installation logs on my AWS EC2 Linux server.

# yum install cowsay
Loaded plugins: amazon-id, rhui-lb, search-disabled-repos, security
Setting up Install Process
epel/metalink                                                                                                                         |  12 kB     00:00
epel                                                                                                                                  | 4.2 kB     00:00
http://mirror.math.princeton.edu/pub/epel/6/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel
Trying other mirror.
epel                                                                                                                                  | 4.3 kB     00:00
epel/primary_db                                                                                                                       | 5.9 MB     00:09
rhui-REGION-client-config-server-6                                                                                                    | 2.9 kB     00:00
rhui-REGION-rhel-server-releases                                                                                                      | 3.5 kB     00:00
rhui-REGION-rhel-server-releases-optional                                                                                             | 3.5 kB     00:00
rhui-REGION-rhel-server-rh-common                                                                                                     | 3.8 kB     00:00
Resolving Dependencies
--> Running transaction check
---> Package cowsay.noarch 0:3.03-8.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================================================================================================
 Package                              Arch                                 Version                                  Repository                          Size
=============================================================================================================================================================
Installing:
 cowsay                               noarch                               3.03-8.el6                               epel                                25 k

Transaction Summary
=============================================================================================================================================================
Install       1 Package(s)

Total download size: 25 k
Installed size: 31 k
Is this ok [y/N]: y
Downloading Packages:
cowsay-3.03-8.el6.noarch.rpm                                                                                                          |  25 kB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : cowsay-3.03-8.el6.noarch                                                                                                                  1/1
  Verifying  : cowsay-3.03-8.el6.noarch                                                                                                                  1/1

Installed:
  cowsay.noarch 0:3.03-8.el6

Complete!

Once successfully installed you can run cowsay command followed by the text you want the cow to say! There are different cow modes which you can use to change the appearance of cow 😀 (outputs later in this post)

  1. -b: borg mode
  2. -d: Cow appears dead
  3. -g: greedy mode
  4. -s: stoned cow
  5. -t: tired cow
  6. -y: Young cow 😛

Different Cowsay command examples

Normally cowsay word wraps. If you want fancy banners in cowsay you should use -n switch so that cowsay won’t word wrap and you get nice formatted output.

# figlet kerneltalks | cowsay -n
 __________________________________________________
/  _                        _ _        _ _         \
| | | _____ _ __ _ __   ___| | |_ __ _| | | _____  |
| | |/ / _ \ '__| '_ \ / _ \ | __/ _` | | |/ / __| |
| |   <  __/ |  | | | |  __/ | || (_| | |   <\__ \ |
| |_|\_\___|_|  |_| |_|\___|_|\__\__,_|_|_|\_\___/ |
\                                                  /
 --------------------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||


Check out below cow appearences as listed above with different switches.

# cowsay -b kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  (==)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
# cowsay -d kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  (xx)\_______
            (__)\       )\/\
             U  ||----w |
                ||     ||
# cowsay -g kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  ($)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
# cowsay -s kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  (**)\_______
            (__)\       )\/\
             U  ||----w |
                ||     ||
# cowsay -t kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  (--)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
# cowsay -y kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  (..)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

If you observe in all different modes eyes and tongues are the only entities that change. So, you can define and change them manually too! You can define eyes with -e switch and tongue with -T switch.

# cowsay -e 88  kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  (88)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
# cowsay -T X kerneltalks
 _____________
< kerneltalks >
 -------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
             X ||----w |
                ||     ||

In above example I defined 88 as eyes and X as tongue!

It’s cool that developers coded so much versatility for such funny command too! Too much of switch support, man pages, and all!

Enable debugging to log NFS logs in Linux

Learn how to set up debugging to generate NFS logs. By default NFS daemon does not provide a dedicated log file. So you have to setup debugging.

Capturing NFS logs

One of our readers asked me to query that where are NFS logs are located? I decided to write this post then to answer his query since it’s not easy to just name a log file. There is a process you need to execute well in advance to capture your NFS logs. This article will help you to find answers for where are my NFS logs? Find the NFS log file location or where NFS daemon logs events?

There is NFS logging utility in Solaris called nfslogd (NFS transfer logs). It has a configuration file /etc/nfs/nfslog.conf and stores logs in a file /var/nfs/nfslog. But, I am yet to see/use it in Linux (if it does exist for Lx). If you any insight about it, let me know in the comments.

By default, NFS daemon does not have a dedicated log file whose configuration can be done while you setup the NFS server. You need to enable debugging for NFS daemon so that its events can be logged in /var/log/messages  syslog logfile. Sometimes, even without this debugging enabled few events may be logged to Syslog. These logs are not enough when you try to troubleshoot NFS related errors. So we need to enable debugging for NFS daemon and plenty handful of information will be available for you to analyze when to start NFS troubleshooting.

Below are NFS service start-stop logs in Syslog when debugging is not enabled

Jun 24 00:31:24 kerneltalks.com rpc.mountd[3310]: Version 1.2.3 starting
Jun 24 00:31:24 kerneltalks.com kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Jun 24 00:31:24 kerneltalks.com kernel: NFSD: starting 90-second grace period
Jun 24 00:31:46 kerneltalks.com kernel: nfsd: last server has exited, flushing export cache
Jun 24 00:31:46 kerneltalks.com rpc.mountd[3310]: Caught signal 15, un-registering and exiting.

rpcdebug is the command used to set NFS & RPC debug flags? This command supports below switch :

  • -m: specify modules to set or clear
  • -s: set given debug flags
  • -c: Clear flags

Pretty simple! If you want to enable debugging use -s, if you want to turn off/disable debugging use -c! Below is a list of important debug flags you can set or clear.

  • nfs: NFS client
  • nfsd: NFS server
  • NLM : Network lock manager of client or server
  • RPC : Remote procedure call module of client or server

Enable debugging for NFS logs :

Use the below command to enable NFS logs. Here are enabling all modules. You can instead use the module of your requirement from the above list instead of all.

# rpcdebug -m nfsd all
nfsd       sock fh export svc proc fileop auth repcache xdr lockd

In the above output you can see its enabled list of modules for debugging (on right) for daemon nfsd (on left). Once this is done you need to restart your NFS daemon. After restarting you can check Syslog and voila! There are your NFS logs!

Jun 24 00:31:46 kerneltalks.com kernel: nfsd: last server has exited, flushing export cache
Jun 24 00:31:46 kerneltalks.com rpc.mountd[3310]: Caught signal 15, un-registering and exiting.
Jun 24 00:32:03 kerneltalks.com kernel: svc_export_parse: '-test-client- /shareit  3 8192 65534 65534 0'
Jun 24 00:32:03 kerneltalks.com rpc.mountd[3496]: Version 1.2.3 starting
Jun 24 00:32:03 kerneltalks.com kernel: set_max_drc nfsd_drc_max_mem 962560
Jun 24 00:32:03 kerneltalks.com kernel: nfsd: creating service
Jun 24 00:32:03 kerneltalks.com kernel: nfsd: allocating 32 readahead buffers.
Jun 24 00:32:03 kerneltalks.com kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Jun 24 00:32:03 kerneltalks.com kernel: NFSD: starting 90-second grace period

Logs follow standard Syslog format. Date, time, hostname, service, and finally message. You can compare these logs with logs given at the start of this post when debugging was not enabled. You can see there are pretty extra logs are being generated by debugging.

Disable debugging for NFS logs

Disabling debugging will stop logging NFS daemon logs. It can be done with -c switch.

# rpcdebug -m  nfsd -c  all
nfsd      <no flags set>

You can see, command clears all set flags and shows no flags set for nfsd to log any debug messages.

How to rescan disk in Linux after extending VMware disk

Learn to rescan disk in Linux VM when its backed vdisk in VMware is extended. This method does not require downtime and no data loss.

Re-scan vdisk in Linux

Sometimes we get a disk utilization situations and needs to increase disk space. In the VMware environment, this can be done on the fly at VMware level. VM assigned disk can be increased in size without any downtime. But, you need to take care of increasing space at OS level within VM. In such a scenario we often think, how to increase disk size in Linux when VMware disk size is increased? or how to increase mount point size when vdisk size is increased? or steps for expanding LVM partitions in VMware Linux guest? or how to rescan disk when vdisk expanded? We are going to see steps to achieve this without any downtime.

In our example here, we have one disk /dev/sdd assigned to VM of 1GB. It is part of volume group vg01 and mount point /mydrive is carved out of it. Now, we will increase the size of the disk to 2GB at VMware level and then will add up this space in the mount point /mydrive.

If you re using the whole disk in LVM without any fdisk partitioning, then skip step 1 and step 3.

Step 1:

See below fdisk -l output snippet showing disk /dev/sdd of 1GB size. We have created a single primary partition on it /dev/sdd1 which in turn forms vg01 as stated earlier. Always make sure you have data backup in place of the disk you are working on.

Disk /dev/sdd: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x8bd61ee2

    Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         130     1044193+  83  Linux LVM

# ll /mydrive
total 24
drwx------. 2 root root 16384 Jun 23 11:00 lost+found
-rw-r--r--. 1 root root 0 Jun 23 11:01 shri
drwxr-xr-x. 3 root root 4096 Jun 23 11:01 .
dr-xr-xr-x. 28 root root 4096 Jun 23 11:04 ..

Step 2:

Now, change disk size at VMware level. We are increasing it by 1 more GB so the final size is 2GB now. At this stage disk need to be re-scanned in Linux so that kernel identifies this size change. Re-scan disk using below command :

echo 1>/sys/class/block/sdd/device/rescan
OR
echo 1>/sys/class/scsi_device/X:X:X:X/device/block/device/rescan

Make sure you use the correct disk name in command (before rescan). You can match your SCSI number (X:X:X:X) with VMware disk using this method.

Note : Sending “– – -” to /sys/class/scsi_host/hostX/scan is scanning SCSI host adapters for new disks on every channel (first -), every target (second -), and every device i.e. disk/lun (third -) i.e. CTD format. This will only help to scan when new devices are attached to the system. It will not help us to re-scan already identified devices.

That’s why we have to send “1” to /sys/class/block/XYZ/device/rescan to respective SCSI block device to refresh device information like the size. So this will be helpful here since our device is already identified by the kernel but we want the kernel to re-read its new size and update itself accordingly.

Now, kernel re-scan disk and fetch its new size. You can see new size is being shown in your fdisk -l output.

Disk /dev/sdd: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x8bd61ee2

    Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         130     1044193+  83  Linux LVM

Step 3:

At this stage, our kernel know the new size of the disk but our partition (/dev/sdd1) is still of old 1GB size. This left us no choice but to delete this partition and re-create it again with full size. Make a note here your data is safe and make sure your (old & new) partition is marked as Linux LVM using hex code  8e or else your will mess up the whole configuration.

Delete and re-create partition using fdisk console as below:

# fdisk /dev/sdd

Command (m for help): d
Selected partition 1
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-261, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-261, default 261):
Using default value 261
Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 8e

Command (m for help): p

Disk /dev/xvdf: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x8bd61ee2

    Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1         261     2095458+  83  Linux LVM

All fdisk prompt commands are highlighted in the above output. Now you can see the new partition /dev/sdd1 is of 2GB size. But this partition table is not yet written to disk. Use w command at fdisk prompt to write table.

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.

You may see the warning and error like above. If yes, you can use partprobe -s and you should be good. If you still below error with partprobe then you need to reboot your system (which is sad ).

Warning: WARNING: the kernel failed to re-read the partition table on /dev/sdd (Device or resource busy).  As a result, it may not reflect all of your changes until after reboot.

Step 4:

Now rest of the part should be tackled by LVM. You need to resize PV so that LVM identifies this new space. This can be done with pvresize command.

# pvresize /dev/sdd1
  Physical volume "/dev/sdd1" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized

As new PV size is learned by LVM you should see free/extra space available in VG.

# vgdisplay vg01
  --- Volume group ---
  VG Name               vg01
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               2.00 GiB
  PE Size               4.00 MiB
  Total PE              511
  Alloc PE / Size       250 / 1000.00 MiB
  Free  PE / Size       261 / 1.02 GiB
  VG UUID               0F8C4o-Jvd4-g2p9-E515-NSps-XsWQ-K2ehoq

You can see our VG now have 2GB space! Now you can use this space to create new lvol in this VG or extend existing lvol using LVM commands. Further you can extend filesystem online which is sitting on logical volumes.

You can observe all lvol in this VG will be unaffected by this activity and data is still there as it was previously.

# ll /mydrive
total 24
drwx------.  2 root root 16384 Jun 23 11:00 lost+found
-rw-r--r--.  1 root root     0 Jun 23 11:01 shri
drwxr-xr-x.  3 root root  4096 Jun 23 11:01 .
dr-xr-xr-x. 28 root root  4096 Jun 23 11:04 ..

11 log files you should see on your Linux system

Listing of important Linux log files and their formats. These logs play a vital role in troubleshooting and every sysadmin should be aware of them.

Important log files in Linux

Troubleshooting any issues with your system needs proper knowledge of log file structure and their locations. As a sysadmin you should know for which log file should be checked for a particular service issue. In this article we will walk through several hand-picked log files which are pretty much helpful for conducting preliminary analysis during troubleshooting. 10 log files we selected to discuss here are :

  1. /var/log/messages : General message and system related stuff
  2. /var/log/boot.log : Services success/failures at boot
  3. /var/log/secure or /var/log/auth.log : Authentication log
  4. /var/log/utmp or /var/log/wtmp : Login records
  5. /var/log/btmp  : Failed login records
  6. /var/log/cron : Scheduler log file
  7. /var/log/maillog : Mail logs
  8. /var/log/xferlog : File transfer logs
  9. /var/log/lastlog : Last login details
  10. dmesg : Device driver messages
  11. /var/crash logs : System crash dump

Lets see each log file one by one.

System consolidated log file : /var/log/messages

All system services which do not have their own special log file, normally write to this log file. Most of the system activity can be seen here hence its also called Syslog (system’s log). Every sysadmin first opens up this log when he starts troubleshooting! Sample log looks like below :

May 22 02:00:29 server1 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="999" x-info="http://www.rsyslog.com"] exiting on signal 15.
May 22 02:00:29 server1 kernel: imklog 5.8.10, log source = /proc/kmsg started.
May 22 02:00:29 server1 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1698" x-info="http://www.rsyslog.com"] start
May 22 02:17:43 server1 dhclient[916]: DHCPREQUEST on eth0 to 172.31.0.1 port 67 (xid=0x445faedb)

File can be read from left as:

  1. Date
  2. Time
  3. System hostname
  4. Service name (and sometimes PID as well)
  5. Message text

System hostname stands value when you have a centralized Syslog server. Message text and service name help you narrow your search. You can directly grep on this file to find specific service-related logs omitting other clutter. All file operations like cat, more, less, etc can be done on this file since its plain text file.

Since this log fills very fast on a busy system (as it logs almost everything), you should consider configuring logrotate for it so that it won’t full your mount point.

Service success/failures at boot : /var/log/boot.log

At the time of booting Linux server, you can see services being started and their success or failure status is displayed on local console. The same logs can be obtained from the boot log post-boot. This file lists all service’s success/failure status at boot time so that it can be referred later to troubleshoot any service-related issues.

$ cat /var/log/boot.log
growroot: FAILED: GPT partition found but no sgdisk
                Welcome to Red Hat Enterprise Linux Server
Starting udev: udevd[404]: can not read 'https://z5.kerneltalks.com/etc/udev/rules.d/75-persistent-net-generator.rules'
udevd[404]: can not read 'https://z5.kerneltalks.com/etc/udev/rules.d/75-persistent-net-generator.rules'

                                                           [  OK  ]
Setting hostname ip-172-31-1-120.ap-south-1.compute.interna[  OK  ]
Setting up Logical Volume Management:                      [  OK  ]
Checking filesystems
/dev/xvda1: clean, 77980/393216 files, 811583/1572864 blocks
                                                           [  OK  ]
Remounting root filesystem in read-write mode:             [  OK  ]
Mounting local filesystems:                                [  OK  ]
Enabling local filesystem quotas:                          [  OK  ]
Enabling /etc/fstab swaps:                                 [  OK  ]
Entering non-interactive startup

The above sample, file shows services, their status (on right), and any error/warning messages written to console (by service daemons).

Authentication logs : /var/log/secure, /var/log/auth.log

This log file is crucial to check user access logs. Log files stores information on user logins along with authentication used. It also stores sudo logs.

Jun 14 23:41:00 server1 sshd[1586]: Accepted publickey for ec2-user from 59.184.130.135 port 51265 ssh2
Jun 14 23:41:01 server1 sshd[1586]: pam_unix(sshd:session): session opened for user ec2-user by (uid=0)
Jun 14 23:41:04 server1 sudo: ec2-user : TTY=pts/0 ; PWD=/home/ec2-user ; USER=root ; COMMAND=/bin/su -
Jun 14 23:41:04 server1 su: pam_unix(su-l:session): session opened for user root by ec2-user(uid=0)
Jun 14 23:43:45 server1 su: pam_unix(su-l:session): session closed for user root

Log file can be read from left to right as :

  1. Date
  2. Time
  3. Server hostname
  4. Authentication service or daemon (sometimes along with PID)
  5. Message

Login records : /var/log/utmp, /var/log/wtmp

These wtmp or utmp file stores user login details. utmp stores login details like id, time, duration, the terminal used, system reboot details, etc. wtmp provides historical utmp data.  These files are not plain text files and hence need to be parsed to last command to read data within. You can use last -f <path> to read file.

ec2-user pts/1        59.184.170.243   Mon Jun 19 07:24   still logged in
ec2-user pts/0        59.184.170.243   Mon Jun 19 07:21 - 07:24  (00:02)
reboot   system boot  2.6.32-696.el6.x Mon Jun 19 07:20 - 07:39  (00:18)
ec2-user pts/0        59.184.130.135   Wed Jun 14 23:41 - 00:09  (00:28)
reboot   system boot  2.6.32-696.el6.x Wed Jun 14 23:40 - 00:09  (00:29)

In above output of last -f /var/log/wtmp you can see from left to right :

  1. User or event (reboot)
  2. Terminal used
  3. IP from which user was connected
  4. Date/time of login
  5. Date /time of log out
  6. Duration of session

Failed login records : /var/log/btmp

Its file dedicated to logging only failed login attempts. This too can be read using last command as mentioned above.

user1    ssh:notty    31.207.47.50     Tue Jun 13 01:18    gone - no logout
user     ssh:notty    31.207.47.50     Tue Jun 13 01:18 - 01:18  (00:00)
ubnt     ssh:notty    31.207.47.50     Tue Jun 13 01:18 - 01:18  (00:00)

It also shows user id, terminal, source IP, and time who tried to log into the system but failed to do so.

Scheduler logs : /var/log/cron

Cron (Linux scheduler) logs are saved under /var/log/cron. Job run status, daemon logs are saved here. It helps in understanding job execution and troubleshooting of scheduled jobs. Its plain text file and supports all normal file operations.

Jun 15 00:09:51 server1 crond[1473]: (CRON) INFO (Shutting down)
Jun 19 07:21:18 server1 crond[1474]: (CRON) STARTUP (1.4.4)
Jun 19 07:21:18 server1 crond[1474]: (CRON) INFO (RANDOM_DELAY will be scaled with factor 26% if used.)
Jun 19 07:21:20 server1 crond[1474]: (CRON) INFO (running with inotify support)
Jun 19 07:30:01 server1 CROND[1676]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Jun 19 07:40:01 server1 CROND[1702]: (root) CMD (/usr/lib64/sa/sa1 1 1)

Since the server was shut on 15 Jun and started on 19 Jun you can see cron daemon wrote down and up logs in this log file. For job executed, log code CMD is used. Daemon messages are tagged with INFO.

Mail logs : /var/log/maillog

Mail related logs saved here. It logs date-time, hostname, mail protocol (with PID sometimes), and messages.

Jun 15 00:09:51 server1 postfix/postfix-script[1939]: stopping the Postfix mail system
Jun 15 00:09:51 server1 postfix/master[1432]: terminating on signal 15
Jun 19 07:21:17 server1 postfix/postfix-script[1432]: starting the Postfix mail system
Jun 19 07:21:17 server1 postfix/master[1433]: daemon started -- version 2.6.6, configuration /etc/postfix

Since our server doesn’t have an SMTP configuration, there is nothing much in our file shown above.

File transfer logs : /var/log/xferlog

This log contains information from file transfer program/service daemons.

Mon Jun 19 04:44:43 2017 1 10.4.1.22 20 /CSV/test1.csv a _ o r testuser ftp 0 * c
Mon Jun 19 04:51:33 2017 1 10.4.1.22 432 /CSV/test4.csv a _ o r testuser ftp 0 * c
Mon Jun 19 04:57:15 2017 1 10.4.1.22 110 /CSV/test14.csv a _ o r testuser ftp 0 * c
Mon Jun 19 04:57:19 2017 1 10.4.1.22 2505 /CSV/master.csv a _ o r testuser ftp 0 * c

Sample log files hows date, time, source IP, file path copied, options used, the user who copied files, the protocol used details.

Last login details : /var/log/lastlog

System’s all user’s recent login details are saved to this log file. Its not plain text file and hence you need to use lastlog command which reads data from this log file and print on the terminal in a human-readable format. It sorts users with their order in /etc/passwd file.

# lastlog
Username         Port     From             Latest
root                                       **Never logged in**
bin                                        **Never logged in**
oprofile                                   **Never logged in**
tcpdump                                    **Never logged in**
ec2-user         pts/1    59.184.170.243   Mon Jun 19 07:24:14 -0400 2017
apache                                     **Never logged in**

Above output is pretty self explanatory.

Device driver messages : dmesg

Boot time device/driver writes messages on the console. Those messages can be viewed as post boot as well using dmesg command. These messages help in troubleshooting devices or driver initialization issues.

# dmesg
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-696.el6.x86_64 (mockbuild@x86-027.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) ) #1 SMP Tue Feb 21 00:53:17 EST 2017
Command line: console=ttyS0 ro root=UUID=1b7ea291-67b4-48da-802a-be4b2bcb64d5 rd_NO_LUKS  KEYBOARDTYPE=pc KEYTABLE=us LANG=en_US.UTF-8 xen_blkfront.sda_is_xvda=1 console=ttyS0,115200n8 console=tty0 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_NO_LVM rd_NO_DM
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
 BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)

----- output trimmed -----

Those are all boot time messages just before /var/log/boot.log

System crash dump : /var/crash

This is the directory under which the core-dump is saved in case of a system crash. Obviously you need to configure dump configs on the system for it. This information is very important to get the root cause of the system crash. This is almost the image of the current state of the system at the time of the crash. Most of the vendor asks this dump files to analyze when we log a case with them for a system crash.