I'm Nelson. I build and operate production AWS infrastructure, from self-managed Kubernetes clusters to AI-powered self-healing agents.

My path to DevOps started at AWS. I joined as a Technical Customer Service Associate after graduating with a Computer Science degree in Dublin. That role put me on the front lines of customer infrastructure problems: digging through CloudTrail logs, debugging IAM permission boundaries, and helping people figure out why their ECS tasks weren't starting or why their costs suddenly doubled. You learn fast when every ticket is someone else's production issue.

But there's a gap between understanding AWS services and building with them. So I built my own portfolio as real production infrastructure. What started as a single deployment grew into a self-managed Kubernetes cluster on AWS, provisioned with kubeadm, defined entirely in CDK (TypeScript), and spanning multiple operational domains. Workloads deploy through ArgoCD GitOps with ApplicationSet, images build and push via reusable GitHub Actions workflows with OIDC auth, and Day-1 orchestration runs through Step Functions and SSM Automation.

The full observability stack runs on that same cluster: Prometheus Operator for metrics, Grafana for dashboards, Loki for log aggregation, and Tempo for distributed tracing, all deployed via Helm charts through ArgoCD. Traffic flows from CloudFront through an NLB to Traefik IngressRoutes, with Calico CNI and NetworkPolicy enforcing pod-level segmentation. Every architecture choice is documented in ADRs: why self-managed K8s over EKS, why Traefik over ALB, why ArgoCD over Flux.

More recently, I've built AI-powered infrastructure tooling. A self-healing agent uses Bedrock AgentCore with MCP to automatically diagnose and remediate the issues behind CloudWatch alarms, complete with Cognito M2M authentication, FinOps token-budget guardrails, and a dead-letter queue safety net. An AI content pipeline uses Bedrock with RAG (Pinecone) to generate articles from briefs, with adaptive thinking budgets that scale with content complexity. The entire platform runs on a minimal monthly spend by design.

The articles I write here come from problems I've actually hit. Not theory, not course material. If I write about CDK stack separation or Kubernetes networking or ArgoCD sync strategies, it's because I dealt with the broken version first and had to fix it. I hope they save someone else a few hours of debugging.

AWS Certified DevOps Engineer - Professional  |  BSc Computer Science, Dublin

Beyond Code

When I'm not building infrastructure, I'm making music. It's a creative outlet that keeps me sane — and a reminder that not every problem requires a YAML file. Have a listen →