Linux infrastructure handover checklist

A checklist to help you take over Linux infrastructure from another support party during a handover or transition.

Pointers for Linux infra handover

Handover or transition is a project phase that comes up in every sysadmin’s life. It’s the process of taking over roles and responsibilities from one operations party to another because of a change in support contracts, business, etc.

The obvious goal is to understand the current setup and working procedures so that you can continue them once the previous support party leaves. We will walk you through a list of points and questions that will help you with a Linux infrastructure handover or transition. You can treat this as a questionnaire or checklist for Linux handover.

If you are going to handle servers hosted in a public cloud like AWS or Azure, then the majority of the pointers below hold little value 🙂

Hardware

We are assuming remote support here, so managing hardware is not really in the scope of the handover. Generic knowledge about the hardware is enough and no detailed analysis is required. If your transition/handover includes taking over hardware management as well, then you might need more detailed information than listed below.

  1. Hardware details of proprietary systems (e.g. those running HP-UX or AIX) and of blade and rackmount servers, for inventory purposes.
  2. Datacenter logical diagram with the location of all CIs (configuration items). This will be helpful for locating a CI quickly for hardware maintenance.
  3. Vendor details along with SLA, datacenter contacts and OEM contacts for hardware support at datacenter, escalation matrix.
  4. Vendor coordination process for HW support at the datacenter
  5. DR site details and connectivity details between primary and DR site

Server Connectivity

Connectivity is one of the prime requirements whenever you take over any Linux infra. The first thing to know is how you can reach remote or even local Linux servers, along with their console access.

  1. How are servers accessed from remote locations? Jump server details, if any.
  2. VPN access details, if any. The process to get new VPN access, etc.
  3. Accounts on Linux servers for logins (LDAP, root, etc., if implemented)
  4. How is console access provided for physical servers?

Licensing & contracts

When it comes to supporting infrastructure, you should be well aware of the contracts you have with hardware and software vendors so that you can escalate issues when they need an expert’s eyes.

  1. Vendor contract information for the OS being used (Red Hat, SUSE, OEL, etc.), including start/end dates, SLA details, level of support included, products included, escalation matrix, etc.
  2. Software licenses for all tools along with middleware software being used in infrastructure.
  3. Procedure or contacts of the team to renew the above said contracts or licenses.
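
The contract dates gathered above lend themselves to a simple expiry check. Below is a minimal Python sketch, assuming a hypothetical in-house contract inventory; the `Contract` fields, the sample entries, and the 90-day threshold are all illustrative and not taken from any real contract.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Contract:
    # All fields are hypothetical; adapt them to your own contract inventory.
    vendor: str
    product: str
    end_date: date


def expiring_contracts(contracts, today, within_days=90):
    """Return contracts already expired or ending within `within_days` of `today`."""
    cutoff = today + timedelta(days=within_days)
    due = [c for c in contracts if c.end_date <= cutoff]
    # Soonest (or longest-expired) contract first.
    return sorted(due, key=lambda c: c.end_date)


# Illustrative data only.
contracts = [
    Contract("Red Hat", "RHEL subscription", date(2025, 3, 31)),
    Contract("SUSE", "SLES subscription", date(2026, 1, 15)),
    Contract("ExampleBackup Inc.", "Backup agent licence", date(2024, 12, 1)),
]

for c in expiring_contracts(contracts, today=date(2025, 1, 10)):
    print(f"{c.end_date}  {c.vendor}: {c.product}")
```

Passing `today` explicitly keeps the check repeatable; in day-to-day use you would probably pass `date.today()`.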

Risk mitigation plans for old technology

Every company runs at least a few CIs on old technology, so you should consider the upgrade of these CIs while taking the handover. Old technology dies out over time and becomes harder to support day by day. Hence it’s always advisable to identify such CIs as a risk before taking the handover and to get clarity on their mitigation from the owner.

  1. Future roadmap for Linux servers running an old OS (i.e. end of life or end of support)
  2. Discuss plans to migrate servers running Unix flavours such as AIX or HP-UX to Linux if their vendor contracts and support run out in the near future.
  3. Ask for a migration plan for servers running Linux flavours without a vendor support contract, like CentOS, Fedora, Oracle Linux installed without a support subscription, etc.
  4. Same investigation for tools or solutions in Linux infra being used for monitoring, patching, automation, etc.
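
To spot servers running an end-of-life OS, one option is to collect each server's `/etc/os-release` content and compare it against a lifecycle table. The Python sketch below assumes you have already gathered that file's text per server; the `EOL_DATES` table is only an example and every date in it should be verified against the vendor's published lifecycle before you rely on it.

```python
from datetime import date


def parse_os_release(text):
    """Parse the KEY=value lines of an /etc/os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


# Example end-of-life table keyed by (ID, major version).
# Assumption: verify each date against the vendor's lifecycle page.
EOL_DATES = {
    ("centos", "7"): date(2024, 6, 30),
    ("centos", "8"): date(2021, 12, 31),
}


def eol_status(os_release_text, today):
    """Return 'eol', 'supported' or 'unknown' for one server's os-release text."""
    info = parse_os_release(os_release_text)
    major = info.get("VERSION_ID", "").split(".")[0]
    eol = EOL_DATES.get((info.get("ID", ""), major))
    if eol is None:
        return "unknown"
    return "eol" if today > eol else "supported"


sample = 'NAME="CentOS Linux"\nVERSION_ID="7"\nID="centos"\n'
print(eol_status(sample, today=date(2025, 1, 10)))
```

Servers that come back as `unknown` are worth listing separately, since they point to gaps in the lifecycle table rather than to a safe OS.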

Linux patching

Quarterly planned activity! Patching is an inseparable part of the Linux lifecycle, so we have given it a separate section. Gather whatever details you can on this topic from the owner or the previous support party.

  1. Which patching solutions are being used, e.g. Spacewalk server, SUSE Manager server, etc.?
  2. If none, what is the procedure to obtain new patches? If they come from the vendor, then check the related licenses, portal logins, etc.
  3. What patching cycles are being followed, i.e. frequency of patching, patching calendar if any?
  4. Patching procedure, downtime approval process, ITSM tool’s role in patching activities, co-ordination process, etc.
  5. Check if any patching automation is implemented.

Monitoring

Infrastructure monitoring is a vast topic, and some organizations have dedicated teams for it. If that’s the case, you will need to gather very little on this topic.

  1. Details of monitoring tools implemented e.g. tool’s servers, portal logins, licensing and infra management details of that tool, etc.
  2. SOP to configure monitoring for new CI, Alert suppression, etc.
  3. Alert policy, threshold, etc. definition process on that tool
  4. Monitoring tool’s integration with other software like ticketing tool/ITSM
  5. If the tool is managed by a separate team, then contact details, escalation matrix, etc. for that team.

Backup solutions

Backup is another major topic for organizations and, considering its importance, it’s mostly handled by a dedicated team. Still, it’s better to have ground knowledge of the backup solutions implemented in the infrastructure.

  1. Details of backup solutions
  2. SOP for backup related activities like adding, updating, deleting new/old CI, policy definitions, etc.
  3. List of activities under the backup stream
  4. Backup recovery testing schedules, DR replication details if applicable
  5. Major recurring backup schedules, like weekends, so that you can plan your activities accordingly

Security compliance

Audits require you to keep your Linux infra security compliant. All Linux servers should comply with the security policies defined by the organization, be free of known vulnerabilities, and run the latest software. Below are a few pointers to consider here:

  1. Solution or tool for running security scans on Linux servers
  2. SOP for the same, along with operating details.
  3. Password policies to be defined on Linux servers.
  4. Hardening checklist for newly built servers
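
As one way to check the password-policy item, the password aging defaults in `/etc/login.defs` can be compared against the organization's policy. The Python sketch below is only an illustration: the limits in `POLICY` are hypothetical, and it covers just the aging defaults, which apply to local accounts at creation time and say nothing about existing accounts or PAM-based complexity rules.

```python
def parse_login_defs(text):
    """Parse 'KEY value' lines from /etc/login.defs content, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


# Hypothetical organizational policy: (setting, kind of limit, limit).
POLICY = [
    ("PASS_MAX_DAYS", "max", 90),  # passwords must expire within 90 days
    ("PASS_MIN_DAYS", "min", 1),   # at least 1 day between password changes
    ("PASS_WARN_AGE", "min", 7),   # warn at least 7 days before expiry
]


def policy_violations(text):
    """Return a list of human-readable violations of POLICY."""
    settings = parse_login_defs(text)
    problems = []
    for key, kind, limit in POLICY:
        try:
            value = int(settings.get(key))
        except (TypeError, ValueError):
            problems.append(f"{key} is missing or not a number")
            continue
        if value < 0:
            # Negative values disable the restriction entirely.
            problems.append(f"{key}={value} disables the restriction")
        elif (kind == "max" and value > limit) or (kind == "min" and value < limit):
            problems.append(f"{key}={value} violates {kind} limit {limit}")
    return problems


sample = """
# Password aging controls
PASS_MAX_DAYS   99999
PASS_MIN_DAYS   0
PASS_WARN_AGE   7
"""
for problem in policy_violations(sample):
    print(problem)
```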

Network infra details

The network is the backbone of any IT infrastructure. It’s almost always run by a dedicated team, so you are not required to have in-depth knowledge of it, and it’s not in the scope of your transition. But you should know a few basics to keep your day-to-day sysadmin life running smoothly.

  1. SOP for proxy details, how to get ports opened, IP requirements, etc. 
  2. Network team contact details, process to get network services, escalation matrix, etc.
  3. How internet connectivity is implemented for servers
  4. Understanding the network perimeter and zones like DMZ, Public, Private in the context of the DC.

Documentation repository

When you kick off support for a new infrastructure, the document repository is a gold mine for you. So make sure it gets populated with all kinds of related documents to make it worthwhile.

  1. Location & access details of documentation. It could be a shared drive, a file server, a portal like SharePoint, etc.
  2. It should include inventories, SOP documents, process documents, audit documents, etc.
  3. Versioning and approval process for new/existing documents, if any

Reporting

This area falls to the sysadmin, so gather all the details regarding it.

  1. List of all reports that currently exist for the Linux infrastructure
  2. What is the report frequency (daily, weekly, monthly)?
  3. Check if reports are automated. If not, ask for the SOP to generate/pull reports; automating them is then an improvement area for you.
  4. How and why is report analysis done? This will help you understand what is expected from report outputs.
  5. Any further procedure for reports, like forwarding to management, signoff from any authority, etc.
  6. Report repository, if any. This is covered in the documentation repository section as well.

Applications

This area is not actually in scope for the sysadmin, but it helps them work in a process-oriented environment. It also helps to trace the criticality of, and impact on, applications running on servers when an underlying CI runs into trouble.

  1. ITSM tool (IT Service Management tool) used for ticketing & asset management, and all details related to the ITSM tool like access, authorization, etc.
  2. Also, ask for a small training session to get familiar with the ITSM tool, as it’s customized according to the organization’s operating structure.
  3. Architectural overview of applications running on Linux servers.
  4. Critical applications along with their CI mapping to track down application impact in case of issues with server
  5. Communication and escalation matrices for applications.
  6. Software repository being used, e.g. for software setups, installables, OS ISO images, VM templates, etc.
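
The application-to-CI mapping from the list above is most useful when it can be queried by server. Below is a minimal Python sketch of such a lookup; the application names, server names, and criticality levels are hypothetical, and in practice the mapping would likely be exported from the ITSM tool.

```python
from collections import defaultdict

# Hypothetical application-to-CI mapping, e.g. as exported from an ITSM tool.
APP_TO_CIS = {
    "payroll": {"criticality": "high", "servers": ["lnxapp01", "lnxdb01"]},
    "intranet": {"criticality": "low", "servers": ["lnxweb01"]},
    "reporting": {"criticality": "medium", "servers": ["lnxdb01", "lnxrpt01"]},
}


def build_server_index(app_to_cis):
    """Invert the mapping: server name -> list of (application, criticality)."""
    index = defaultdict(list)
    for app, details in app_to_cis.items():
        for server in details["servers"]:
            index[server].append((app, details["criticality"]))
    return index


def impacted_apps(server, index):
    """Return applications hosted on `server`, most critical first."""
    rank = {"high": 0, "medium": 1, "low": 2}
    return sorted(index.get(server, []), key=lambda item: rank[item[1]])


index = build_server_index(APP_TO_CIS)
for app, criticality in impacted_apps("lnxdb01", index):
    print(f"{app} ({criticality})")
```

A server that returns an empty list is either unused or missing from the mapping, which is itself a useful question to raise during the handover.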

Operations

In all the above points, we gathered data that is put to use in this phase, i.e. actually supporting the Linux infrastructure.

  1. List of day to day activities and expected support model
  2. Logistics for operations like phone lines, ODC requirement, IT hardware needed for support etc.
  3. Process for decommissioning old servers and commissioning new servers
  4. New CI onboarding process
  5. Details of DR drill activities
  6. Escalation/management matrices on the owner organization’s side for all the above tech solutions

That’s all I could think of right now. If you have any more pointers, let me know in the comments and I will be happy to add them here.
