About the Role
We are hiring a DevOps Engineer to design, build, and maintain the infrastructure that powers our products. You will partner closely with engineering teams to deliver reliable, secure, and scalable systems. The role covers cloud infrastructure, automation, observability, and security across the full delivery pipeline.
Responsibilities
Design and manage cloud infrastructure on AWS across multiple environments.
Deploy, scale, and operate containerized workloads using Kubernetes and Helm.
Provision and maintain infrastructure as code with Terraform.
Build and improve CI/CD pipelines using GitHub Actions for fast, safe releases.
Containerize applications with Docker and maintain image build standards.
Set up monitoring, alerting, and dashboards using Grafana and Prometheus.
Implement error tracking and incident response workflows to reduce downtime.
Apply SecOps practices across infrastructure, pipelines, and runtime environments.
Troubleshoot production issues and lead root cause analysis.
Document architecture, runbooks, and operational procedures.
Required Skills
Strong hands-on experience with Kubernetes and Helm in production.
Proven expertise in Terraform for infrastructure as code.
Solid working knowledge of core AWS services (EC2, VPC, IAM, S3, RDS, EKS).
Experience designing CI/CD pipelines with GitHub Actions.
Working knowledge of SecOps principles, including secrets management, IAM hardening, and vulnerability scanning.
Experience with Grafana and Prometheus for metrics, alerting, and dashboards.
Strong command of Docker and container best practices.
Familiarity with error tracking tools and modern observability practices.
Nice to Have
Experience with GitOps tools such as ArgoCD or Flux.
Background in service mesh, ingress controllers, or API gateways.
Exposure to cost optimization and FinOps on AWS.
Scripting skills in Bash, Python, or Go.
What We Look For
Strong ownership and a bias for automation.
Clear communication and solid documentation habits.
Calm, structured approach to production incidents.
Curiosity to learn new tools and improve existing systems.
Team-first mindset with a willingness to mentor and share knowledge.