Monthly Archives: August 2025

Migrating from Amazon Linux 2 to Bottlerocket AMI in EKS Nodes

In this post, I share my journey transitioning from Amazon Linux 2 to Bottlerocket as the EKS node OS, aiming for enhanced security with a hardened OS image.

In my environment, running Kubernetes workloads on Amazon EKS with Amazon Linux 2 (AL2) worker nodes has been a tried-and-tested approach. It’s stable, compatible with most tooling, and offers the flexibility of a general-purpose Linux OS.

AL2 is a full Linux distribution, meaning it ships with many binaries, libraries, and utilities that aren’t strictly needed for running containers. This full-blown OS increases the attack surface if not hardened properly. If an attacker compromises a node (through a container escape, misconfiguration, or another vector), these extra tools and privileges can be leveraged for deeper intrusion, persistence, and lateral movement.

Hence, it made sense to explore Bottlerocket, which is CIS hardened out of the box, as an EKS node OS.

Bottlerocket: Minimal, Secure, Container-First OS

Bottlerocket is an open-source Linux-based OS purpose-built by AWS to run containers securely and with minimal overhead. It’s now officially published and supported for EKS and ECS, making it a good alternative to AL2 for container platforms. As Bottlerocket is CIS hardened out of the box, it saves much of the manual and automation work of hardening an OS image.

Key Security Advantages of Bottlerocket over AL2

  1. Immutable Root File System
    • The root filesystem is read-only and protected with dm-verity (integrity verification).
  2. No Direct Package Installation
    • There are no package managers (yum/apt).
    • All additional functionality runs in special-purpose containers (control or admin container), isolating changes from the host OS.
  3. No Default SSH Access
    • Bottlerocket blocks SSH by default.
    • Administrative access is through AWS Systems Manager (SSM) Session Manager, so access is governed by IAM and audited in CloudTrail.
  4. Locked-Down System & Kernel
    • No direct systemd or kernel-level access from workloads.
    • The OS is configured and updated via a local API (protected by SELinux policies), avoiding risky manual edits.
  5. Atomic, Signed OS Updates with Rollback
    • Updates are applied as a full image to an inactive partition, verified with cryptographic signatures, and made active only after reboot.

Why Bottlerocket Could Be a Good Choice

Moving from AL2 to Bottlerocket removes unnecessary OS-level tools and privileges from your nodes, reducing the blast radius. Instead of manually hardening AL2 with CIS benchmarks, SELinux policies, and SSH lockdowns, Bottlerocket bakes these controls in by default.

This means:

  • Lower operational risk.
  • Less maintenance effort to stay compliant.
  • Better alignment with Kubernetes’ container-first security model.

Official Bottlerocket Documentation → https://bottlerocket.dev/en/os/1.42.x/

Our Migration Journey with Karpenter

In the earlier section, we discussed why we focused on Bottlerocket. Now, let’s talk about the how: the actual activities we performed during our migration.

Our EKS cluster uses Karpenter for node provisioning instead of EKS-managed node groups. Hence this post focuses on Karpenter-specific configurations for using Bottlerocket AMIs.

Let’s get into the stepwise procedure –

1. Updating the EC2NodeClass Manifest

To provision Bottlerocket nodes with Karpenter, we updated our EC2NodeClass manifest as follows:

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: bottlerocket-nodes
spec:
  amiFamily: Bottlerocket
  amiSelectorTerms:
    - alias: bottlerocket@v1.42.0
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        deleteOnTermination: true
        encrypted: true
        kmsKeyID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
        volumeSize: 10Gi
        volumeType: gp3
    - deviceName: /dev/xvdb
      ebs:
        deleteOnTermination: true
        encrypted: true
        kmsKeyID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx
        volumeSize: 30Gi
        volumeType: gp3

Key Points

  • amiFamily must be set to Bottlerocket.
  • amiSelectorTerms.alias specifies the desired Bottlerocket version.
  • Avoid using bottlerocket@latest in production environments. It points to an AWS-managed SSM parameter, so when AWS updates the parameter value, Karpenter detects AMI drift and recycles nodes (the default check interval is 1 hour), which reshuffles all the pods in the cluster.

Pro tip: Instead of relying on @latest, you can implement your own AMI automation pipeline. Tag new AMIs with predefined keys (e.g., owner=myteam, version=1.42.0) and reference those tags in amiSelectorTerms.
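
As a minimal sketch of the tag-based approach, the helper below prints an amiSelectorTerms fragment that selects AMIs by tag instead of by alias. The function name is hypothetical, and the tag keys owner and version are just the illustrative ones from the tip above; use whatever keys your own AMI pipeline applies.

```bash
# Hypothetical helper: print an amiSelectorTerms fragment that selects AMIs by tag.
# The tag keys (owner, version) are illustrative, not a Karpenter requirement.
render_ami_selector_terms() {
  local owner="$1" version="$2"
  cat <<EOF
amiSelectorTerms:
  - tags:
      owner: ${owner}
      version: "${version}"
EOF
}

# Example: print a fragment for AMIs tagged owner=myteam, version=1.42.0.
render_ami_selector_terms myteam 1.42.0
```

The printed fragment would replace the alias-based amiSelectorTerms in the EC2NodeClass spec, with amiFamily: Bottlerocket left in place. Nodes then only roll when you tag a new AMI, not when AWS moves the @latest parameter.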

2. Understanding Block Device Mappings

We explicitly defined two block devices for Bottlerocket nodes:

  • /dev/xvda — OS Volume
    • Holds the Bottlerocket OS itself.
    • Contains active/passive partitions, bootloader, dm-verity metadata, and the Bottlerocket API datastore.
  • /dev/xvdb — Data Volume
    • Used for everything running on top of Bottlerocket i.e. container images, runtime storage, and Kubernetes persistent volumes.

If you don’t define /dev/xvdb in your manifest:

  • Karpenter defaults it to 20 GB of type gp2. I prefer gp3 for its better price-performance ratio.
  • You may run into insufficient disk space incidents.
  • /dev/xvda may end up larger than necessary, wasting EBS storage for the OS.
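
One way to avoid silently falling back to those defaults is a small pre-apply check on the manifest. The sketch below is a hypothetical helper (not part of Karpenter or kubectl) that simply greps an EC2NodeClass file for both device names and fails if either is missing.

```bash
# Hypothetical pre-flight check: confirm a manifest declares both Bottlerocket volumes.
# $1 is the path to the EC2NodeClass YAML file.
check_block_devices() {
  local manifest="$1" missing=0 dev
  for dev in /dev/xvda /dev/xvdb; do
    # Match a line such as "- deviceName: /dev/xvdb" with nothing after the device name.
    if ! grep -Eq "deviceName:[[:space:]]*${dev}[[:space:]]*\$" "$manifest"; then
      echo "missing blockDeviceMappings entry for ${dev}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (file name is an example):
#   check_block_devices ec2nodeclass.yaml && kubectl apply -f ec2nodeclass.yaml
```

This is a plain text match, so it does not validate the YAML structure; it only catches the case where the data volume was left out entirely.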

By making these changes, we were able to seamlessly migrate from AL2 to Bottlerocket in our Karpenter-managed EKS environment, gaining all the hardened security benefits without disrupting workloads.

3. User Data Script: Custom GuardDuty DNS Mapping on Bottlerocket

Background

In our setup, the EKS cluster uses a self-hosted DNS instead of AWS’s default DNS. We’ve also enabled Amazon GuardDuty threat detection for the cluster.

When GuardDuty protection is enabled, it creates a PrivateLink VPC endpoint whose DNS name is resolved inside the respective pods. This PrivateLink is available in the subnets/AZs where it’s created (in our case: 3 subnets, AZs a, b, and c).

For GuardDuty’s DaemonSet to function correctly, all EKS nodes must be able to resolve its PrivateLink endpoint from within the same subnet they launched in.

How We Did It on AL2

On Amazon Linux 2, this was simple:

  • Add a shell script to EC2 user data.
  • The script fetches the subnet-specific PrivateLink IP.
  • It appends the mapping to /etc/hosts.

The Bottlerocket Challenge

Bottlerocket cannot execute raw shell scripts directly via EC2 user data.
Instead:

  • It uses TOML-formatted user data.
  • OS changes are made through the Bottlerocket API (apiclient).

Also, /etc/hosts exists on the read-only root filesystem, so direct edits are not possible.

Our Solution

After researching the Bottlerocket design, we found three possible approaches:

  • Host containers (admin, control): Could run the script but admin requires enabling an SSH keypair, which we wanted to avoid.
  • Bootstrap containers: Run a container at instance boot before the kubelet starts.
  • apiclient API calls: The correct way to update /etc/hosts on Bottlerocket.

We opted to go ahead with bootstrap containers + apiclient.

Final User Data Configuration

Here’s the relevant part of our EC2NodeClass manifest for Karpenter:

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: bottlerocket-guardduty
spec:
  amiFamily: Bottlerocket
  amiSelectorTerms:
    - alias: bottlerocket@v1.42.0
  blockDeviceMappings:
    ...
  userData: |
    [settings.bootstrap-containers]
    [settings.bootstrap-containers.guardduty]
    source = "public.ecr.aws/bottlerocket/bottlerocket-bootstrap:v0.2.4"
    mode = "once"
    user-data = "xxxvYmlxxxxxxxxxxxxxxxxxxxxxxxyBXT1JMXX=="

Note: The base64-encoded string shown in the bootstrap-container user-data is only a placeholder. Below are the detailed steps to generate the actual base64-encoded string for your script.

Implementation Steps

a) Create the shell script (myscript.sh)
This script uses apiclient to inject the GuardDuty PrivateLink mapping into /etc/hosts via the Bottlerocket API.

#!/bin/bash
set -euo pipefail

echo "[BOOTSTRAP] Starting GuardDuty host entry setup..."

# Get IMDSv2 token
TOKEN=$(curl -sX PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 60")

# Get metadata
MAC1=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/network/interfaces/macs/ | head -1 | tr -d '/')

VPCID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/network/interfaces/macs/$MAC1/vpc-id)

AZ=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/dynamic/<AWS-ACCOUNT-ALIAS>/document | jq -r .availabilityZone)

REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/dynamic/<AWS-ACCOUNT-ALIAS>/document | jq -r .region)

# Get GuardDuty ENI IP
ENIIP=$(aws ec2 describe-network-interfaces \
  --filters Name=vpc-id,Values=$VPCID \
            Name=availability-zone,Values=$AZ \
            Name=group-name,Values="GuardDutyManagedSecurityGroup-vpc-*" \
  --query 'NetworkInterfaces[0].PrivateIpAddress' \
  --region "$REGION" --output text)

if [[ -z "$ENIIP" || "$ENIIP" == "None" ]]; then
    echo "[BOOTSTRAP] No GuardDuty ENI IP found"
    exit 0
fi

cat > hosts.json <<EOF
{
  "settings": {
    "network": {
      "hosts": [
        ["$ENIIP", ["guardduty-data.$REGION.amazonaws.com"]]
      ]
    }
  }
}
EOF

apiclient apply < hosts.json
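
As an optional hardening of the last two steps, the payload can be built by a small function that first checks the value looks like an IPv4 address, so a stray string such as "None" never reaches the API. This is a sketch with hypothetical function names; the JSON shape is the same settings.network.hosts structure used in the script above.

```bash
# Hypothetical guard: loose shape check for a dotted-quad IPv4 address.
# It does not verify that each octet is within 0-255.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Print the settings payload for the GuardDuty endpoint mapping, or fail on a bad IP.
render_hosts_json() {
  local ip="$1" region="$2"
  is_ipv4 "$ip" || { echo "not an IPv4 address: $ip" >&2; return 1; }
  cat <<EOF
{"settings": {"network": {"hosts": [["$ip", ["guardduty-data.$region.amazonaws.com"]]]}}}
EOF
}

# Usage inside the bootstrap script, replacing the heredoc and the final apply:
#   render_hosts_json "$ENIIP" "$REGION" | apiclient apply
```

Piping into apiclient apply relies on it reading the payload from standard input, as the redirect in the original script already does.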

b) Encode the script in Base64
Bootstrap container user data must be Base64-encoded for TOML.

base64 -w 0 myscript.sh

c) Embed the Base64 string in user data
Paste the encoded string into:

[settings.bootstrap-containers.guardduty]
user-data = "<base64-encoded-script>"
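
Steps b and c can be combined so the encoded string is never copied by hand. The helper below is a hypothetical sketch: it encodes a script file and prints the bootstrap-container TOML with the same source image and mode used earlier in this post. It assumes GNU coreutils base64, where -w 0 disables line wrapping.

```bash
# Hypothetical helper: print the bootstrap-container TOML for a script file.
# $1 = path to the script, $2 = container name (defaults to "guardduty").
# Assumes GNU coreutils base64; -w 0 keeps the output on a single line.
render_bootstrap_toml() {
  local script="$1" name="${2:-guardduty}" encoded
  encoded="$(base64 -w 0 "$script")" || return 1
  cat <<EOF
[settings.bootstrap-containers.${name}]
source = "public.ecr.aws/bottlerocket/bottlerocket-bootstrap:v0.2.4"
mode = "once"
user-data = "${encoded}"
EOF
}

# Usage: render_bootstrap_toml myscript.sh
# To double-check the result, decode the user-data value with: base64 -d
```

The output can be pasted under userData in the EC2NodeClass manifest, indented to match the YAML block scalar.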

d) Validate execution
You can verify your bootstrap container ran successfully using:

EC2 console: select the instance --> Actions --> Monitor and troubleshoot --> Get system log
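
The same log can be pulled from the command line. As a rough sketch, the filter below keeps only the lines carrying the [BOOTSTRAP] marker that the script echoes; the function name is hypothetical and the instance ID in the usage comment is a placeholder.

```bash
# Hypothetical filter: keep only the bootstrap script's own log lines.
# Reads console output on stdin; -F treats "[BOOTSTRAP]" as a literal string.
bootstrap_log_lines() {
  grep -F '[BOOTSTRAP]'
}

# Usage (instance ID is a placeholder):
#   aws ec2 get-console-output --instance-id i-0123456789abcdef0 --latest --output text \
#     | bootstrap_log_lines
```

This assumes the bootstrap container's output reaches the system log, as it did in our case. From a session on the node, apiclient get settings.network.hosts should also show the applied mapping.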

Key Takeaways:

  • Use the network.hosts API setting to modify the contents of /etc/hosts.
  • Bootstrap containers are the best way to run initialization scripts at boot.
  • Avoid enabling the admin host container with SSH just for automation; it defeats the purpose of Bottlerocket’s out-of-the-box hardening.

Final Thoughts

In this blog, we’ve shared the insights and hands-on learnings we gathered while working with Bottlerocket. Since there’s limited practical guidance available online, we thought it worth sharing our experience. In summary, migrating from Amazon Linux 2 to Bottlerocket for EKS node hardening not only strengthens security but also changes how we interact with the underlying OS. While certain tasks, like running user data scripts, require a different approach, Bottlerocket’s design ensures a minimal attack surface, immutable infrastructure, and tighter control over system access. With the right methods, such as bootstrap containers and the Bottlerocket API, you can still meet your operational requirements without compromising on security.

KubeCon + CloudNativeCon India 2025: My experience

From scaling Kubernetes and observability to powering huge AI projects and real stories from PepsiCo, Flipkart, and Intuit, KubeCon + CloudNativeCon India 2025 in Hyderabad was a deep dive into where the cloud-native ecosystem is headed. Here’s my personal experience of the event.

I had the chance to attend KubeCon + CloudNativeCon India this year. It’s a two-day technical conference packed with thoughtfully curated keynotes, tech talks, and lightning sessions from the people shaping the Kubernetes ecosystem and the cloud-native world. The event also featured ~30 product/solutions booths showcasing solutions to help organizations with Kubernetes workloads, security, scaling, observability, platform engineering, and their AI journey. Before we get into more detail, here are some key numbers and facts:

  • Held at Hyderabad International Convention Centre (HICC).
  • Happened on 6-7 August 2025.
  • Sessions
    • 12 Keynote sessions
    • 10 lightning talks
    • 56 Tech sessions on K8s, Observability, Scaling, Security, AI + ML, Platform engineering, etc.
  • ~4000 registrations
  • ~30 product/solutions booths by various companies.
  • The 2026 KubeCon India is announced to be held in Mumbai!
  • Watch this event’s session recordings, available on the CNCF YouTube Channel.
  • Review session slides from speakers who provided them via the event schedule.

With so much on the agenda, it’s very difficult for one person to attend everything. That’s why pre-planning is essential for events of this scale. It’s worth creating a schedule focused on the sessions and booths that match your interests. The Sched app made this much easier, helping me build my personal agenda so I wouldn’t miss anything important.

These were the ~30 product/solutions booths at the event (sorted alphabetically):

  • Akamai – Akamai Cloud. Simple, Scalable and out of this world.
  • Atlassian – Delivering greatness is impossible alone. Jira. Confluence. Loom. Rovo.
  • AWS – Scalable, Reliable and Secure Kubernetes
  • CAST AI – Kubernetes Automation and Performance
  • ClickHouse + HyperDX – The world’s fastest analytical database
  • Cloudflare
  • Coralogix – Complete Observability, Zero compromises.
  • D E Shaw & Co
  • DigitalOcean
  • Dragonfly – In memory data store. Cut Redis costs by 50%
  • EDB Postgres II by CloudNativePG – Setting the Standard for Postgres in Kubernetes
  • Expedia Group – Careers
  • F5 – Optimize, Scale, and Secure Apps in Kubernetes
  • GitHub
  • GitLab – Develop, secure and deploy software faster
  • Google Cloud
  • Grafana Labs – Open, composable and cost-effective observability.
  • Intel – AI inside for a new Era
  • Kloudfuse – Unified Observability. Your Cloud. Your Control.
  • KodeKloud – Kubernetes learning platform
  • Kong – Kube-native. Platform-forward.
  • Last9 – Ship fast. triage faster.
  • Learning Lounge CNCF – The Linux Foundation Education
  • Microsoft Azure – Achieve more with Azure hosted Kubernetes
  • Nudgebee – AI-Agentic Assistants. Troubleshooting. FinOps. CloudOps.
  • Nutanix – Kubernetes. Virtualization. Data. Anywhere.
  • PerfectScale by DoiT – Autonomous Kubernetes. Optimization and Governance.
  • Red Hat OpenShift
  • ScyllaDB – Predictable Performance at Scale – Anywhere
  • Sonatype – Scalable Artifact Management
  • VMware – Kubernetes Made Simpler, More Flexible and Secure.
  • VULTR – Global cloud infrastructure and managed Kubernetes for AI-native applications

My experience

I traveled with my team to attend the event and, travel fatigue aside, we made it to HICC by 8 AM for the check-in formalities. Once the security checks were done and we had received our event badges, we were ready. The venue welcomed us with tea, coffee, and cookies – a much-needed warm-up before a day packed with technical deep dives.

By 9:30, as per the agenda, the keynote sessions started. The crowd was so massive that the main hall quickly filled up, but the crew arranged adjacent halls with big screens for live streaming. We settled into one of those adjacent halls, but due to some technical glitches with the screen and audio, we decided to head back to the main hall and stand along the sides just to catch the keynotes live. After an hour of inspiring talks (and sore legs :D), we stepped out for a break and then dove into exploring the product and solution booths.

The booths were sparkling with energy! Attendees were deep in conversation with booth teams, exploring new features and solutions; it was an energetic sight! It was great to see fresh graduates in the mix, already well-versed in cloud technologies. The enthusiasm was contagious, and of course, no developer conference is complete without a shower of freebies and laptop stickers. We did collect a healthy stash!

The iconic “Kubestronauts” blue jackets were turning heads throughout the venue. It’s a title one can earn by completing five Kubernetes and cloud-native certifications, a badge of honor that carries serious weight in the Kubernetes world today. Seeing a few Kubestronauts proudly sporting those jackets was truly inspiring!

We split up to cover more ground, each visiting different booths. My own highlights were the insightful discussions I had at the AWS and Nudgebee booths. Both offered interesting perspectives and solutions worth exploring further.

The organizers smartly arranged lunch across multiple designated dining areas, which helped spread out the crowd and made it possible for attendees to enjoy a peaceful, crowd-free meal.

We continued the second day with a similar agenda of sessions and booths, gathering every bit of information that could help us keep tabs on the cloud-native and Kubernetes world and on industry progress in those domains: ArgoCD scaling, platform engineering, EKS Auto Mode, EKS dashboards, observability, and AI integration for troubleshooting and Ops, to name a few.

During the event, I also ran into a former colleague and reconnected with some AWS contacts I had worked with in the past. I also got to see some open source community leaders and well-known personalities from the Kubernetes/cloud world in person. These technical conferences are the best way to network and socialize!
