Akash Kaveti — Platform & Security Engineer

Currently

01 What I'm building right now

Live work

Deep in self-hosted LLM infrastructure — standing up production-grade inference with vLLM on GPU clusters, serving open-weight models with the throughput and latency of a hosted API, but without a single token ever leaving the perimeter.

This is where it gets exciting: sovereign, private model serving on Kubernetes, with continuous batching, tensor parallelism, GPU scheduling, hardened multi-tenant isolation, and signed, reproducible deploys. The kind of AI infrastructure where trust isn't a marketing word: it's enforced all the way down to the commit.

vLLM · Self-hosted LLMs · GPU inference · Open-weight models · Kubernetes · Private & sovereign

What I build

02 Security-critical infrastructure

Secrets & Identity

Secret management, workload identity, and short-lived scoped credentials — removing standing access without slowing engineers down.

Supply-Chain Integrity

Signed GitOps deploys, hardened base images, reproducible releases, and dependency hygiene enforced in CI.

Platform Hardening

Kubernetes RBAC, network policy, service mesh, and pod-level isolation as a default, not an afterthought.

Observability for Security

Metrics, logs, and audit-trail routing that tell infrastructure failure apart from anomalous workload behaviour.

Policy-as-Code

Terraform-enforced guardrails and GitOps as the only write path — every change reviewed, signed, and auditable to a commit.

Reproducible Delivery

Build and deploy systems that produce the same artifact every time, with a clear chain from source to running workload.

Track record

03 Selected experience

2024 — Present

Lead DevSecOps / Platform Engineer

Sovereign AI research institute

Own the secure platform researchers train and serve foundation models on — GPU-enabled OpenShift on-prem and AWS, with hardened images, signed deployments, and per-tenant isolation.
Building self-hosted LLM inference with vLLM on GPU clusters — production-grade serving of open-weight models that stays entirely inside the perimeter.
Designed the cluster-wide observability and alerting model that separates infrastructure failure from anomalous workload behaviour.
Built one-click developer environments with workload identity and ephemeral credentials — removing standing access without slowing research.
Drove the migration to GitOps as the only path to production: every change reviewed, signed, and auditable to a commit.

2022 — 2024

Senior Platform / Security Engineer

Global media-intelligence SaaS

Led the org-wide rollout of secret management on Kubernetes — eliminating long-lived secrets checked into config repos across dozens of services.
Architected the multi-cluster GitOps platform with environment promotion, drift detection, and a single signed source of truth for all production manifests.
Ran the migration from self-managed clusters to managed Kubernetes, shrinking the unpatched control-plane surface.
Established a FinOps practice that cut cloud spend ~30% while tightening tagging and ownership.

2021 — 2022

Cloud Platform Reliability Engineer

Global automotive enterprise

Operated the centralised Kubernetes platform that hundreds of internal applications ran on, with SLOs and on-call rotation.
Drove the move to GitOps — pull-based deploys, signed commits, and no direct production access from laptops.
Supported teams adopting workload identity, secret injection, and hardened base images.

2016 — 2021

SRE / Technical Lead & earlier platform roles

Consultancy & early-stage startups

Technical lead for DevOps/SRE across engagements — set platform standards, reviewed designs, and coached engineers on secure-by-default practice.
First infrastructure hire at an early-stage startup: designed, deployed, and ran the entire cloud and on-prem footprint, including CI/CD and base-image standards.
Stood up monitoring and CI/CD pipelines, and owned infrastructure-as-code across multiple product teams.

Toolbox

04 Core stack

cloud & orchestration

AWS · Azure · Kubernetes · OpenShift · Rancher · Docker · Podman

iac & packaging

Terraform · Helm · Kustomize · Helmfile · Ansible · Puppet

ci/cd & gitops

ArgoCD · FluxCD · GitLab CI · GoCD · Jenkins · TeamCity

security & identity

Secrets management · Workload identity · Signed artifacts · RBAC · Network policy (Cilium) · Service mesh (Istio)

observability & languages

Prometheus · Grafana · Loki · Datadog · CloudWatch · Elastic / Kibana · Python · Bash · Linux

Background

05 Education

M.Sc., Telecommunication Systems

2014 — 2015 · Sweden

B.Tech., Electronics & Communications

2010 — 2013 · India
