HPUX: Add persistent device names in VG

Learn the command to add persistent device names to an existing volume group in HPUX. Also, learn how to match legacy devices with persistent device names.

Newer HPUX releases such as 11i v3 introduced persistent device names, also called persistent DSFs (device special files). These are more convenient and user-friendly than the ctd-format (controller, target, device) names. The ctd-format names are called legacy DSFs and the newer ones are called persistent DSFs.

Persistent DSFs take care of native multipathing. With multipathing, the same disk is reachable through several ctd paths and therefore has several legacy names. There is only one persistent device name per disk, no matter how many ctd paths exist for that disk. You can see this in the device file name mapping below.

For example, a legacy disk file name looks like /dev/dsk/c0t1d0 whereas a persistent disk file name looks like /dev/disk/disk1. A system with persistent device names still has the legacy names in the kernel too. Persistent names can be mapped to legacy names with the ioscan command, as shown below:

# ioscan -m dsf
Persistent DSF             Legacy DSF(s) 
======================================== 
/dev/rdisk/disk1           /dev/rdsk/c1t0d0 
/dev/rdisk/disk2           /dev/rdsk/c4t0d0 
/dev/rdisk/disk3           /dev/rdsk/c2t0d1 
                           /dev/rdsk/c2t0d2 
/dev/rdisk/disk4           /dev/rdsk/c3t0d1 
                           /dev/rdsk/c3t0d2

In the above output, the persistent device name is on the left and its related legacy names are on the right. You can see that the persistent device files take care of multipathing, since there is only one persistent name but several legacy device names for the same disk (disk3 and disk4 each have two).
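The lookup can also be scripted. The following is a minimal sketch, assuming the ioscan -m dsf output keeps the two-column layout shown above. The function name persistent_for and the embedded sample data are illustrative only; on a real system you would pipe the live ioscan -m dsf output into the function instead.

```shell
#!/bin/sh
# Sample data copied from the ioscan -m dsf output shown above.
sample_map() {
cat <<'EOF'
Persistent DSF             Legacy DSF(s)
========================================
/dev/rdisk/disk1           /dev/rdsk/c1t0d0
/dev/rdisk/disk2           /dev/rdsk/c4t0d0
/dev/rdisk/disk3           /dev/rdsk/c2t0d1
                           /dev/rdsk/c2t0d2
/dev/rdisk/disk4           /dev/rdsk/c3t0d1
                           /dev/rdsk/c3t0d2
EOF
}

# persistent_for LEGACY_DSF: read the mapping on stdin and print the
# persistent DSF that owns the given legacy DSF (hypothetical helper).
persistent_for() {
    awk -v legacy="$1" '
        NR <= 2 { next }                       # skip the two header lines
        NF == 2 { current = $1; path = $2 }    # row that starts a new persistent DSF
        NF == 1 { path = $1 }                  # continuation row: extra legacy path
        path == legacy { print current; exit }
    '
}

sample_map | persistent_for /dev/rdsk/c2t0d2
```

With the sample data, the second path of a multipathed disk resolves to the same persistent name as the first path.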

Adding persistent devices in VG :

Suppose you created a volume group on a system using legacy device files and now want to add persistent device files to the VG. You can do it with the vgextend command and all the persistent disk paths. But this method is tedious, since you need to look up the persistent name for every legacy device in the VG and add each one to the VG.

HPUX v3 has a special command, vgdsf, which does this task for you. You just need to provide the VG name, and the command adds all the persistent devices to the VG and removes the legacy devices from it.

# vgdsf -c /dev/vg01

Converting legacy DSFs to persistent DSFs in VG /dev/vg01
Persistent DSF /dev/disk/disk3 added to VG /dev/vg01
Persistent DSF /dev/disk/disk4 added to VG /dev/vg01
Legacy DSF /dev/dsk/c2t0d1 removed from VG /dev/vg01
Legacy DSF /dev/dsk/c2t0d2 removed from VG /dev/vg01
Legacy DSF /dev/dsk/c3t0d1 removed from VG /dev/vg01
Legacy DSF /dev/dsk/c3t0d2 removed from VG /dev/vg01

In the above output, you can see that it first adds the persistent DSFs to the VG and then removes the legacy DSFs from it. You can verify that the VG contains only persistent devices using the vgdisplay command or by examining the /etc/lvmtab file.

How to check and test APA in HPUX

A how-to guide for checking and testing APA configurations in HPUX. Auto Port Aggregation is used for NIC redundancy and is similar to NIC teaming in Linux.

APA stands for Auto Port Aggregation. It is a software (operating-system level) configuration that provides NIC (Network Interface Card, also referred to as LAN card) redundancy. Under APA in HPUX, two NICs are configured together as a single virtual card at the OS level. To the OS it is a single NIC, but physically two NICs handle the requests sent to this virtual card. If either physical card fails, the other one keeps serving the OS (through the virtual card) without hampering operations.

Complete guide : How to configure APA in HPUX

Normally, physical NICs are numbered lan0, lan1, lan2, and so on. APA in HPUX names the new virtual cards lan900, lan901, and so on. The current list of lan cards on the system can be obtained using the below command :

# ioscan -fnClan
Class     I  H/W Path    Driver S/W State   H/W Type     Description
=====================================================================
lan       0  2/0/0/1/0   igelan   CLAIMED     INTERFACE    HP PCI-X 1000Base-T Built-in
lan       1  2/0/4/1/0   iether   CLAIMED     INTERFACE    HP A7012-60601 PCI/PCI-X 1000Base-T Dual-port Adapter
lan       2  2/0/4/1/1   iether   CLAIMED     INTERFACE    HP A7012-60601 PCI/PCI-X 1000Base-T Dual-port Adapter
lan       3  2/0/6/1/0   iether   CLAIMED     INTERFACE    HP A7012-60601 PCI/PCI-X 1000Base-T Dual-port Adapter

In the above output, the second column shows the lan number. So we have 4 lan cards here, numbered lan0 to lan3. But this output does not show the APA interfaces, i.e. the virtual NICs.

For checking APA interfaces you need to use lanscan command.

# lanscan -q
2
3
900
901   0  1
902
903
904

Here you can see that lan0 and lan1 together form the lan901 interface, which is the APA card. Since they are used in APA, you do not see them as separate entries like lan2 and lan3. This way you can trace the physical NICs and their respective virtual or APA interfaces.
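Reading the aggregate membership out of this output can be scripted. The following is a minimal sketch, assuming lanscan -q keeps the layout shown above (the aggregate number first, followed by its member port numbers on the same line). The helper name apa_members is illustrative only.

```shell
#!/bin/sh
# Sample data copied from the lanscan -q output shown above.
sample_lanscan() {
cat <<'EOF'
2
3
900
901   0  1
902
903
904
EOF
}

# apa_members: read lanscan -q style output on stdin and print one line
# per aggregate that has member ports (hypothetical helper).
apa_members() {
    awk 'NF > 1 {
        printf "lan%s:", $1
        for (i = 2; i <= NF; i++) printf " lan%s", $i
        printf "\n"
    }'
}

sample_lanscan | apa_members
```

On a live system the input would come from lanscan -q itself rather than from the embedded sample.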

Testing APA:

Testing APA means checking that network connectivity through the APA interface stays uninterrupted when one of the physical NICs fails.

You could test this by physically removing one of the NICs from the system board. But this is not recommended, since abruptly removing cards from the board invites unforeseen hardware issues. So, to test APA we have to emulate a NIC failure or shut the NIC down without touching the hardware.

This can be achieved by resetting the NIC using the lanadmin command. Resetting a NIC makes the card unavailable for a few seconds. This time is enough for us to test APA in HPUX.

Complete test can be carried out in below order:

  1. Identify IP defined on lan901 (our APA interface)
  2. Keep continuous ping on for this IP
  3. Reset lan0
  4. Observe ping
  5. Once lan0 comes back up reset lan1
  6. Observe ping
  7. Make sure both lan0 and lan1 are back online.

To reset lan you can use below command :

# lanadmin -r 0

To check whether the lan being reset is online or offline in APA :

# lanscan -q
2
3
900
901   1   <<<< missing 0 means lan0 is offline

Repeat the above command until lan901 shows both interfaces, 0 and 1, again.
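Instead of re-running the command by hand, the check can be wrapped in a small helper. This is a minimal sketch under the same assumption about the lanscan -q layout; the helper name apa_has_all is hypothetical, and the sample lines stand in for live output.

```shell
#!/bin/sh
# apa_has_all AGG PPA...: read lanscan -q style output on stdin and succeed
# only if aggregate AGG lists every given member port (hypothetical helper).
apa_has_all() {
    agg=$1; shift
    awk -v agg="$agg" -v want="$*" '
        $1 == agg { for (i = 2; i <= NF; i++) seen[$i] = 1 }
        END {
            n = split(want, w, " ")
            for (i = 1; i <= n; i++) if (!(w[i] in seen)) exit 1
            exit 0
        }
    '
}

# During the reset only port 1 is listed, so the check fails.
printf '2\n3\n900\n901   1\n' | apa_has_all 901 0 1 || echo "lan901 degraded"
# After recovery both ports are listed again.
printf '2\n3\n900\n901   0  1\n' | apa_has_all 901 0 1 && echo "lan901 healthy"

# On a live system the same check could drive a wait loop, for example:
#   until lanscan -q | apa_has_all 901 0 1; do sleep 2; done
```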

During this test, you may observe the loss of one or two pings. This is due to APA shifting the load to the only available interface. This ping loss will not hamper the operating environment because it is far shorter than the timeout values defined in the software and tools used on the OS. Hence redundancy is maintained in case of a NIC failure.

The above test will fail, i.e. you will completely lose ping to the IP, in the below scenarios :

  1. Your APA configuration is erroneous
  2. One or both lan interfaces are not configured properly at the network level (VLAN configurations)

How to rename logical volume in Linux and HPUX

Learn how to rename a logical volume in Linux or Unix. Understand what happens in the background when you change the name of an existing LVOL.

LVM, i.e. the logical volume manager, is one of the most widely used volume managers in Linux and Unix. A logical volume is a portion of a volume group which can be mounted on a mount point. Once mounted, the space belonging to that logical volume is available for use by the end-user.

In this post, we are going to see step by step how to rename a logical volume. In Linux, lvrename is a direct command which does this for you. But first, we will see how it works in the background so that you know the flow and can rename an LV even without the lvrename command.

The LV renaming procedure follows the below flow :

  1. Stop all user/app access to the related mount point (on which the lvol is mounted) using fuser
  2. Un-mount the LV using umount
  3. Rename the device files of the lvol using mv
  4. Edit the /etc/fstab entry related to this lvol using vi
  5. Mount the LV using mount

Let’s see an example where we are renaming /dev/vg01/lvol1 which is mounted on /data to /dev/vg01/lvol_one. See the below output for the above-mentioned steps (HPUX console).

# bdf /data
/dev/vg01/lvol1 524288 49360 471256 9% /data
# fuser -cku /data
/data:   223412c(user1)
# umount /data
# mv /dev/vg01/lvol1 /dev/vg01/lvol_one
# mv /dev/vg01/rlvol1 /dev/vg01/rlvol_one
# vi /etc/fstab     <<<< change /dev/vg01/lvol1 to /dev/vg01/lvol_one
# mount /data
# bdf /data
/dev/vg01/lvol_one    524288    49360   471256  9%   /data

In the above output, you can see how we renamed logical volume just by renaming its device files.
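The sequence can be captured as a dry-run helper that only prints the commands for review. This is a sketch, not a tested procedure: the function name lv_rename_plan is hypothetical, nothing is executed, and the /etc/fstab update is placed before the mount because mount /data reads the device name from /etc/fstab.

```shell
#!/bin/sh
# lv_rename_plan VG OLD NEW MOUNTPOINT: print (not run) the HPUX command
# sequence for renaming a logical volume (hypothetical helper).
lv_rename_plan() {
    vg=$1 old=$2 new=$3 mnt=$4
    echo "fuser -cku $mnt"
    echo "umount $mnt"
    echo "mv /dev/$vg/$old /dev/$vg/$new"
    echo "mv /dev/$vg/r$old /dev/$vg/r$new"
    echo "vi /etc/fstab    # replace /dev/$vg/$old with /dev/$vg/$new"
    echo "mount $mnt"
}

lv_rename_plan vg01 lvol1 lvol_one /data
```

Printing the plan first makes it easy to check that both the block device file (lvol1) and the raw device file (rlvol1) are covered before anything is unmounted.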

In Linux, a single command, lvrename, takes care of the rename itself: it updates the logical volume name in the LVM metadata and its device nodes. You just need to provide the old and new lvol names along with the volume group the lvol belongs to. You still have to update the /etc/fstab entry yourself. So, the above scenario needs the below command :

# lvrename vg01 lvol1 lvol_one
  Renamed "lvol1" to "lvol_one" in volume group "vg01"

You can see in the output that a single command renamed lvol1 to lvol_one! This command also supports the below options :

  • -t For test
  • -v Verbose mode
  • -f Forceful operation
  • -d debug

How to map Linux disk to vmware disk

Learn how to map Linux disk to VMware disk for virtual machines hosted on VMware. This ensures you are treating the correct disk on the virtualization layer.

It is always a challenge to identify the correct physical disk when it is used through a virtualization layer, since the disk or hardware is physically attached to the host and only made visible to the guest server. Any activity on the guest machine that relates to a physical attribute needs an exact mapping of the hardware from guest to host. In other posts, we already explained mapping iVM disks to host disks in HPUX (HPUX virtualization). In this post, we will see how to map a Linux disk to a VMware disk (VMware virtualization).

Unlike HPUX, Linux on VMware has no direct command that shows the disk mapping. With HP, the hardware (server), the OS (HPUX), and the virtualization technology (iVM) are all owned and developed by HP, which makes it possible to integrate the task into a single command. Since VMware and Linux do not come from a single vendor, I think it is not yet possible to get this done with a single-line command.

To map VMware disks to Linux VM, we need to check and relate the SCSI id of disks.

In vmware :

Check the VM settings and identify the SCSI id of the disk. In this example it is 0:0.

In Linux VM:

Now login to Linux VM and execute below command :

# dmesg | grep -i 'Attached SCSI disk'
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0

We are filtering the kernel boot messages for disk entries to get the SCSI id of each disk as identified by the kernel. The output shows each disk name along with four numbers: SCSI host, channel, id, and lun. We are interested here in the channel and id numbers.

For example, disk sda in the above output has the numbers 0:0:0:0 whereas disk sdb has 0:0:1:0. Look at the second and third numbers, i.e. channel 0 and id 0 for sda, and channel 0 and id 1 for sdb. Sometimes you have to check the first and third fields instead to match the numbers.

Now match this SCSI id with the id you got from the VMware console (VM settings panel). 0:0 matches, which means disk sda in Linux is the disk shown as Hard Disk 1 in VMware.
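Pulling the four numbers out of the kernel messages can be scripted. The following is a minimal sketch that assumes the exact message format shown above; other kernels may log these lines in a different format, in which case the pattern needs adjusting. The helper name scsi_addresses is illustrative only.

```shell
#!/bin/sh
# scsi_addresses: read "Attached scsi disk" lines (in the format shown above)
# on stdin and print "disk host:channel:id:lun" (hypothetical helper).
scsi_addresses() {
    sed -n 's/.*Attached scsi disk \([a-z]*\) at scsi\([0-9]*\), channel \([0-9]*\), id \([0-9]*\), lun \([0-9]*\).*/\1 \2:\3:\4:\5/p'
}

# Sample lines copied from the dmesg output shown above.
printf '%s\n' \
  'Attached scsi disk sda at scsi0, channel 0, id 0, lun 0' \
  'Attached scsi disk sdb at scsi0, channel 0, id 1, lun 0' | scsi_addresses
```

On a live VM the input would come from dmesg, and the channel:id pair of each printed address is what gets compared with the SCSI id in the VM settings.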

Sometimes RDM disks are assigned to guests from VMware. In that case the above method is not sufficient to identify the disks, and you need another approach.

Get sg number from Linux VM :

# dmesg|grep sg |grep Attached
sr 0:0:0:0: Attached scsi generic sg0 type 5
sd 2:0:0:0: Attached scsi generic sg1 type 0
sd 2:0:1:0: Attached scsi generic sg2 type 0

This sgX number is always one less than the VMware “Hard Disk X” number, so sgX+1 = Hard Disk X. In other words, in Linux the sg numbering starts at sg0 but in VMware it starts at Hard Disk 1. In the above example, sg1 will be Hard Disk 2 in VMware.
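The sgX+1 rule is simple enough to apply in a one-line awk filter. This is a sketch that only encodes the rule as stated in this post; it is not a VMware-provided mapping, and the helper name sg_to_hard_disk is hypothetical. It skips the sr (CD-ROM) line and reports only sd entries.

```shell
#!/bin/sh
# sg_to_hard_disk: read "Attached scsi generic" lines on stdin and print
# "sgN H:C:T:L Hard Disk N+1" for every sd entry (hypothetical helper).
sg_to_hard_disk() {
    awk '$1 == "sd" {
        n = substr($6, 3)          # "sg1" -> "1"
        sub(/:$/, "", $2)          # drop the trailing colon of the address
        printf "%s %s Hard Disk %d\n", $6, $2, n + 1
    }'
}

# Sample lines copied from the dmesg output shown above.
printf '%s\n' \
  'sr 0:0:0:0: Attached scsi generic sg0 type 5' \
  'sd 2:0:0:0: Attached scsi generic sg1 type 0' \
  'sd 2:0:1:0: Attached scsi generic sg2 type 0' | sg_to_hard_disk
```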

Match the numbers accordingly and you will be able to map the guest disk to the VMware disk!

If you know any other method to get this task done, please let us know in the comments!

Ulimit value : All you need to know

Understand what ulimit is, how to set it, which system resources can be limited using ulimit, and how to view the current ulimit settings.

In this article, we are going to see everything about the bash built-in ulimit. This is your key to keeping the system safe from fork bombs or malicious code that aims to hang the system by exhausting its resources.

What is ulimit?

It can roughly be called a user limit! Using this value you limit how much of certain system resources the shell and its forked processes may use. This helps in managing system resources, and in turn processes, efficiently. Using ulimit you can make sure that all important processes on the server always get resources while the least important ones cannot hog more than they should get. Several different parameters can be defined under the ulimit umbrella, which we will see ahead.

To view your current ulimit setting run below command :

# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 95174
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

As you can see in the above output, the left column lists the parameters that can be limited using ulimit, the parentheses hold the unit of measurement and the option to use, and the last column shows the currently set value.

ulimit controlled parameters :

See the below list of parameters that can be limited using ulimit, with their details. The list is from the man page and the parameters are self-explanatory. Since ulimit is a bash built-in, its man page covers all bash built-in commands. You have to scroll that man page all the way to the bottom (since it is sorted alphabetically) to get to the ulimit section. There you will find these parameters.

Different ulimit parameters

Option  Parameter
-a      All current limits are reported
-b      The maximum socket buffer size
-c      The maximum size of core files created
-d      The maximum size of a process’s data segment
-e      The maximum scheduling priority (“nice”)
-f      The maximum size of files written by the shell and its children
-i      The maximum number of pending signals
-l      The maximum size that may be locked into memory
-m      The maximum resident set size (many systems do not honor this limit)
-n      The maximum number of open file descriptors (most systems do not allow this value to be set)
-p      The pipe size in 512-byte blocks (this may not be set)
-q      The maximum number of bytes in POSIX message queues
-r      The maximum real-time scheduling priority
-s      The maximum stack size
-t      The maximum amount of cpu time in seconds
-u      The maximum number of processes available to a single user
-v      The maximum amount of virtual memory available to the shell
-x      The maximum number of file locks
-T      The maximum number of threads

To set specific parameter limit values, you can issue the command :

# ulimit -option <value>

Once done, this limits the parameter for the current shell (the shell from which the command was run) and all of its forked processes. A more efficient way to implement limits is through profiles, which are discussed next.
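The scoping to the current shell and its children can be observed directly. The following is a minimal sketch that lowers the soft open-files limit inside a command substitution (which runs in a subshell) and shows that the parent shell keeps its own value. The value 64 is an arbitrary illustration, chosen low because lowering a soft limit is always permitted.

```shell
#!/bin/bash
# Soft open-files limit of the parent shell before the experiment.
before=$(ulimit -S -n)

# The command substitution runs in a subshell: the new limit applies there only.
inside=$(ulimit -S -n 64; ulimit -S -n)

# The parent shell is unaffected by what the subshell did.
after=$(ulimit -S -n)

echo "parent before: $before"
echo "subshell:      $inside"
echo "parent after:  $after"
```

The same scoping is why a limit set in a login profile reaches everything started from that login shell, but nothing started elsewhere.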

How to setup ulimit :

The most common use in corporate infrastructure is on database servers, since a DB is a resource-hungry application. Its ulimits are often specified in terms of -p, -n, and so on. The setting is done for the DB owner user id (the user id the DB application runs as, such as oracle) in the .bash_profile in its home directory, in /etc/profile, or through custom scripts that load when the DB starts. Find below a code snippet which can be used in /etc/profile:

if [ "$USER" = "oracle" ] || [ "$USER" = "oradb" ]; then
    # ksh logins and all other shells get different limits
    if [ "$SHELL" = "/bin/ksh" ]; then
        ulimit -p 16384
        ulimit -n 65536
    else
        ulimit -u 10000 -n 35000
    fi
fi

Whenever a user logs in, /etc/profile gets executed. It checks whether the user is oracle or oradb; if so, it sets one set of ulimit values when the login shell is ksh and another set for any other shell.

This ensures the parameters are set for that user’s shell before the DB application starts in it. Since ulimit limits resources for the shell and its forked processes, these values are imposed on the DB applications running under that user’s shell!

How to run your script with system boot in HPUX

A how-to guide for running a script or starting your custom-coded daemon/service at system boot. Use the RC script directory to load custom daemons at boot.

There are many daemons and system services which start at system boot. You might be wondering how to add your own script, customized daemon, or service to the boot sequence so that it is already running once the system has booted.

In this article, we will see how to run your script at system boot in HPUX. First, we will create a small script that takes start and stop arguments and calls your original script/daemon/service. Then we will place this script in the RC directory of the appropriate run level so that it is executed when the system enters that run level.

Let’s assume you have /usr/sbin/my_agentd to start at boot. You can create an additional script /sbin/init.d/my_agent which takes the start and stop options like below :

# cat /sbin/init.d/my_agent

#!/sbin/sh
choice=$1
case "$choice" in
"start")
        cd /usr/sbin
        ./my_agentd     # append other options here, if any
        ;;
"stop")
        ps -ef|grep -i my_agentd|grep -v grep|awk '{print $2}'|xargs kill -9
        ;;
esac

The above script takes start and stop as arguments. It executes your agent binary when supplied with the start argument and kills your running agent when supplied with the stop argument. It is advisable to keep this script in the /sbin/init.d directory, since that is where the start/stop scripts of all daemons and services live.

Make sure you give proper executable permissions to this newly crafted script file.

# chmod 555 /sbin/init.d/my_agent

Now the last step is to have this script executed at run level 3 (multi-user mode). To accomplish this, create a link to this file in the /sbin/rc3.d directory.

The /sbin/rc3.d directory contains all run level 3 related startup scripts. Keeping yours in it makes sure that it will start with run level 3.

# cd /sbin/rc3.d
# ln -s /sbin/init.d/my_agent S99my_agent

You are all set!

Now, whenever your system enters run level 3, it will execute the S99my_agent file with the start argument, which in turn calls /sbin/init.d/my_agent since it is a link. When /sbin/init.d/my_agent (our script) gets the start argument, it calls /usr/sbin/my_agentd, which is your customized daemon/service/script.

How to restart Apache server in Linux

Learn how to restart Apache webserver in Linux from the command line. Know log file locations to look for troubleshooting during the restart process.

Apache is one of the most widely used web servers in the Linux environment. Its easy webserver configuration, quick SSL configuration, and separate log files make it easy for a sysadmin to manage.

In this post, we will see how to restart Apache instances in Linux from the command line. We will also look at its log files, which can help while troubleshooting the restart process.

An Apache instance normally resides in the /etc/httpd directory. If you have multiple instances running on the same machine, they might be in different directories such as /etc/httpd/httpd-Name1, /etc/httpd/httpd-Name2, etc. It is recommended to run different instances under different user ids so that they can be managed separately.

Check running Apache:

The currently running Apache instance can be identified by using either of the below commands :

root@kerneltalks # ps -ef |grep -i httpd
apache   15785 20667  0 07:50 ?        00:00:02 /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -k start
apache   15786 20667  0 07:50 ?        00:00:02 /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -k start
apache   15787 20667  0 07:50 ?        00:00:02 /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -k start
apache   15788 20667  0 07:50 ?        00:00:02 /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -k start
apache   15789 20667  0 07:50 ?        00:00:02 /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -k start
apache   15790 20667  0 07:50 ?        00:00:02 /usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -k start

All the httpd processes listed above have the same PPID. This is the PID of the main httpd process, which you can confirm from the service status :

root@kerneltalks # service httpd status
httpd (pid  20667) is running...

To stop Apache :

To stop Apache we need to invoke Apache control script and provide it with the configuration file name (so that it knows which instance to stop in case of multiple instances).

# /usr/sbin/apachectl -f /etc/httpd/conf/httpd.conf -k stop

In the above example, apachectl is supplied with the conf file location (-f) and the option (-k) to stop. Once the above command is executed, you will notice that the httpd processes are no longer visible in the ps -ef output.

If there is only a single instance, you can stop the service directly to stop Apache.

# service httpd stop

Once Apache is stopped, you will also notice that /var/httpd/logs/access.log stops populating.

To start Apache:

To start it again you need to invoke apachectl with start argument.

# /usr/sbin/apachectl -f /etc/httpd/conf/httpd.conf -k start

If there is only a single instance, you can start the service directly.

# service httpd start

Once Apache is started back up, you will see the httpd processes in the ps -ef output again. The access.log will also start populating again.

If Apache does not start (for example after you changed its configuration), check the error.log file for the reason. It resides under the /etc/httpd/logs directory.

To restart Apache :

Both of the above operations can be combined with the restart option. When invoked, it stops the instance first and then starts it again.

# /usr/sbin/apachectl -f /etc/httpd/conf/httpd.conf -k restart

# service httpd restart

Both of the above commands restart the Apache instance.

6 ways to manage service startups using chkconfig in Linux

Learn chkconfig command to list, turn on, or turn off system services during different run levels. Also, see how to control xinetd managed system service.

The chkconfig utility in Linux is a command-line tool to manage which system services start at which run level, from 0 to 6. Run levels are different system states in which the system runs with different numbers and combinations of services available to its users. There are 7 run levels in total, but 0, 1, and 6 are not relevant to this post: 0 and 6 are the halt and reboot run levels and 1 is single-user mode, whereas service management normally comes into the picture for multi-user mode. So 2, 3, 4, and 5 are the run levels that chkconfig changes by default when you alter a service.

chkconfig controls which system service starts at which run level. It can even shut a service off completely so that it does not run at any run level. There are also xinetd-controlled system services that can be turned on or off irrespective of run level. These can be managed through chkconfig too.

Chkconfig is capable of below tasks:

  1. Overview of list of services and their run-level wise availability
  2. Details of individual system service
  3. Enable service at certain/all run levels
  4. Disable service
  5. Enable xinetd managed service
  6. Disable xinetd managed service

If you are on an Ubuntu server, chkconfig will probably not be available. You can use the update-rc.d command there instead.

We will see each of the above options in detail with related options and examples.

1. Overview of services and their run-level availability :

For this task, chkconfig displays a list of system services. For every service, it displays all run levels 0 to 6 with an on/off flag. On means the service is enabled at that run level, and off means the service is disabled at that run level. To view this, run the chkconfig command without any argument or with the --list option.

# chkconfig --list
NetworkManager  0:off   1:off   2:off   3:off   4:off   5:off   6:off
abrt-ccpp       0:off   1:off   2:off   3:on    4:off   5:on    6:off
abrt-oops       0:off   1:off   2:off   3:on    4:off   5:on    6:off
abrtd           0:off   1:off   2:off   3:on    4:off   5:on    6:off
acpid           0:off   1:off   2:on    3:on    4:on    5:on    6:off
atd             0:off   1:off   2:off   3:on    4:on    5:on    6:off
auditd          0:off   1:off   2:on    3:on    4:on    5:off   6:off
autofs          0:off   1:off   2:off   3:on    4:on    5:on    6:off

-----output clipped-----
xinetd based services:
        chargen-dgram:  off
        chargen-stream: off
        time-stream:    off
----- output clipped -----

In the above output, for example, the system service autofs is enabled for run levels 3, 4, and 5 and disabled for 0, 1, 2, and 6. The output also shows the status of xinetd based services at the bottom. These services only have an on or off status (no run level based status).
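Reading this listing for one run level at a time is easy to script. The following is a minimal sketch, assuming the chkconfig --list layout shown above (service name followed by level:state pairs). The helper name enabled_at is illustrative only, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# enabled_at LEVEL: read chkconfig --list style output on stdin and print the
# services that are "on" at the given run level (hypothetical helper).
enabled_at() {
    awk -v tag="$1:on" '{
        for (i = 2; i <= NF; i++) if ($i == tag) { print $1; break }
    }'
}

# Sample lines copied from the chkconfig --list output shown above.
cat <<'EOF' | enabled_at 2
NetworkManager  0:off   1:off   2:off   3:off   4:off   5:off   6:off
acpid           0:off   1:off   2:on    3:on    4:on    5:on    6:off
atd             0:off   1:off   2:off   3:on    4:on    5:on    6:off
auditd          0:off   1:off   2:on    3:on    4:on    5:off   6:off
EOF
```

The xinetd based entries have no level:state pairs, so this filter skips them.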

2. Details of individual service

In this task, the same output as above is obtained for one particular service only. It is as good as grepping that service line out of the above output. This is done by passing the service name as an argument to the --list option.

# chkconfig --list nfs
nfs             0:off   1:off   2:off   3:off   4:off   5:off   6:off

# chkconfig --list rsync
rsync           off

3. Enable service at certain/all run levels :

Let’s see how to enable a service for particular or all run levels. For this task, you need to specify the levels (--level) at which the service needs to be enabled, followed by the service name and lastly the state ‘on’.

# chkconfig --level 34 autofs on

In above example we are enabling autofs service in run level 3 and 4.

To enable it in all run levels i.e. 2,3,4,5 you can skip --level option.

# chkconfig autofs on

The above example enables autofs system service at all run levels. You can verify if the service state is set properly as specified in the command, by viewing chkconfig --list output.

4. Disable service

To disable a service at certain or all run levels, the same command as above can be used; only the ‘off’ state needs to be specified at the end instead of the ‘on’ state.

# chkconfig --level 34 autofs off     <<<< turns off at run levels 3 and 4

# chkconfig autofs off     <<<< turns off at all run levels

Note that this change is not immediate. Enabling or disabling a service changes the setting, but it takes effect only the next time the system enters the given run level.

5. Enable xinetd managed service:

Since xinetd managed services cannot be tied to run levels, you cannot specify the --level option in commands associated with them. To enable an xinetd managed service, mention the service name followed by the ‘on’ state.

# chkconfig rsync on

Note that a change to an xinetd managed service takes effect immediately after the command is executed.

6. Disable xinetd managed service:

To disable an xinetd managed service, use the ‘off’ state in the above command.

# chkconfig rsync off




Observe state change in chkconfig --list output.

How to create tar file in Linux or Unix

Learn how to create tar file in Linux or Unix operating system. Understand how to create, list, extract tar files with preserving file permissions.

TAR is short for tape archive. Tape archives are read and written sequentially. Packing a huge list of files or a deep directory structure into a single file (a tar file) is what we are seeing in this post. The resulting single file is sometimes called a tarball as well.

Creating tape archives of files or directories is extremely useful when you are planning to FTP a huge pile of them to another server. Since each file transfer opens a new data connection and closes it after the transfer finishes, FTP takes forever to complete when the number of files is huge (even if each file is small). Even with ample bandwidth, the open and close operations slow FTP to a crawl. In such cases, archiving all files into a single tarball makes sense. This single archive can be transferred at maximum speed, which makes the transfer quick. After the transfer, you can expand the archive again and get all files in place, the same as on the source.

Normally it is good practice to first zip the files or directories and then create a tape archive of them. Zipping reduces their size and tar packs these reduced-size files into a single tarball! In Linux or Unix, the tar command is used to create a tar file. Let’s see the different operations that can be handled by the tar command.

How to create tar file :

To create a tar file, the -c option is used with the tar command. It should be followed by the filename of the archive, which normally has a .tar extension. The extension is there for users to identify the file as a tar archive, because Linux/Unix really does not care about extensions.

# tar -cvf file.tar file2 file3
file2
file3
# ll
total 32
-rw-r--r-- 1 root users    40 Jan  3 00:46 file2
-rw-r--r-- 1 root users   114 Jan  3 00:46 file3
-rw-r--r-- 1 root users 10240 Jan  9 10:22 file.tar

In the above output, we have used -v (verbose mode) along with the -c option. file.tar is the archive name, and the rest of the command is the list of files to be added to the archive. The output shows each filename on the terminal as it is being added to the archive. Once all files are processed, you can see the new archive in the specified path.

To add directories to the tar file, give the directory path at the end of the same command.

# tar -cvf dir.tar dir4/
dir4/
dir4/file.tar
dir4/file1
dir4/file3
dir4/file2

If you use a full absolute path in the command, tar removes the leading / and so makes the member paths relative. When the archive is extracted, all members get a path starting at the current working directory (or the directory specified) rather than at /.

# tar -cvf dir2.tar /tmp/dir4
tar: Removing leading `/' from member names
/tmp/dir4/
/tmp/dir4/file.tar
/tmp/dir4/file1
/tmp/dir4/file3
/tmp/dir4/file2

You can see the warning about the leading / being removed in the output above.

How to view content of tar file :

Tar file content can be viewed without extracting it. The -t, i.e. table, option does this task. It shows file permissions, ownership, size, date and timestamp, and the relative file path.

# tar -tvf dir4.tar
drwxr-xr-x root/users      0 2017-01-09 10:28 tmp/dir4/
-rw-r--r-- root/users  10240 2017-01-09 10:22 tmp/dir4/file.tar
-rw-r--r-- root/users  10240 2017-01-09 10:21 tmp/dir4/file1
-rw-r--r-- root/users    114 2017-01-03 00:46 tmp/dir4/file3
-rw-r--r-- root/users     40 2017-01-03 00:46 tmp/dir4/file2

In the above output, you can see the mentioned parameters displayed in the same sequence. Relative file path means that when this tar file is extracted, it creates the files and directories under the PWD or under the path specified on the command line. For example, if you extract the above file in the /home/user4/data directory, it creates a tmp directory under /home/user4/data and extracts all files into it, i.e. under /home/user4/data/tmp.

How to extract tar file :

The -x option extracts files from the tar file. As explained above, it extracts the files and directories within the tar file into a relative path.

# tar -xvf dir4.tar
tmp/dir4/
tmp/dir4/file.tar
tmp/dir4/file1
tmp/dir4/file3
tmp/dir4/file2
# ll
total 36
-rw-r--r-- 1 root users 30720 Jan  9 10:28 dir4.tar
drwxr-xr-x 3 root users  4096 Jan  9 10:46 tmp
# ll tmp
total 4
drwxr-xr-x 2 root users 4096 Jan  9 10:28 dir4
# ll tmp/dir4
total 32
-rw-r--r-- 1 root users 10240 Jan  9 10:21 file1
-rw-r--r-- 1 root users    40 Jan  3 00:46 file2
-rw-r--r-- 1 root users   114 Jan  3 00:46 file3
-rw-r--r-- 1 root users 10240 Jan  9 10:22 file.tar

In the above outputs, you can see step by step that the tar command created the tmp and dir4 directory structure and extracted all files into it.
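The create, list, and extract steps can be tried end to end in a throwaway directory. The following is a minimal sketch: the directory and file names are made up for illustration, and it uses tar's -C option (not covered above) to change directory before archiving or extracting, so that member names stay relative.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing on the system is touched.
work=$(mktemp -d)
mkdir -p "$work/src/dir4" "$work/restore"
echo "hello" > "$work/src/dir4/file2"
echo "world" > "$work/src/dir4/file3"

# Create: -C switches directory first, so members are stored as dir4/...
tar -cf "$work/dir4.tar" -C "$work/src" dir4

# List the archive content without extracting it.
tar -tf "$work/dir4.tar"

# Extract under a different directory; the relative paths are recreated there.
tar -xf "$work/dir4.tar" -C "$work/restore"
cat "$work/restore/dir4/file2"
```

Because the members are stored with relative paths, the same archive can be restored under any directory, just like the tmp/dir4 example above.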

The tar command supports a few more options, such as appending files to existing archives, zipping, and updating new files in existing archives. We will see them in another post later. Leave us a comment if you have any queries or suggestions.