About the job
Role Introduction:
We seek an experienced and highly skilled DevOps Team Lead who combines deep technical knowledge with leadership skills. This is an on-site role in which you will lead a team responsible for building, securing, scaling, and maintaining critical cloud and on-premise infrastructure.
You must have strong hands-on experience with GCP and on-premise solutions, cybersecurity, CI/CD pipelines, high-availability architectures, and networking.
This high-impact position offers growth, cutting-edge work, and excellent perks.
Key Responsibilities:
Leadership and Team Management:
Lead, mentor, and grow a team of DevOps engineers and SREs.
Set clear goals, establish best practices, and drive technical excellence.
Conduct regular code and architecture reviews with the team.
Act as an escalation point for critical issues and incidents.
Foster a culture of ownership, reliability, continuous improvement, and security-first thinking.
Infrastructure and Systems:
Architect, build, and manage highly available, scalable, and fault-tolerant systems on GCP and in on-premise environments.
Design and implement active-active architectures and Disaster Recovery (DR) sites with clear RTO and RPO objectives.
Implement infrastructure as code (IaC) using Terraform, Helm, and similar technologies.
Manage and optimize databases (e.g., PostgreSQL, MySQL) and cache stores (e.g., Redis, Memcached).
Implement and manage monitoring, logging, alerting, and observability tools (e.g., Prometheus, Grafana, ELK, Google Cloud Operations, formerly Stackdriver).
CI/CD Automation:
Build and maintain robust CI/CD pipelines for microservices and monolithic architectures.
Automate repetitive operational tasks through scripts and infrastructure tooling (Python, Bash, Go, etc.).
Integrate DevSecOps practices into all stages of the SDLC to ensure secure deployments.
Manage release processes and coordinate deployments with development teams.
Networking, Security, and Compliance:
Manage and secure network configurations: VPCs, VPNs, load balancers, API gateways, service meshes, proxies, SSL/TLS.
Drive authentication and authorization designs across systems (OAuth2, SAML, OpenID Connect, IAM policies, RBAC).
Implement and monitor vulnerability management, patching processes, and security incident response.
Conduct regular risk assessments, penetration tests, and audit readiness exercises.
Maintain compliance with relevant security frameworks (ISO 27001, SOC 2, HIPAA, etc.).
Champion zero-trust security models and adopt best practices for cloud security posture management (CSPM).
Incident Management and Reliability:
Design and operate systems with strong resiliency and self-healing capabilities.
Lead incident response and postmortems, perform root cause analyses (RCAs), and implement corrective action plans.
Continuously improve performance against SLAs, SLOs, and SLIs, and keep operational runbooks current.
Ensure 24x7 uptime for critical services, participating in on-call rotations as necessary.
Collaboration and Strategy:
Work closely with development, security, product, and QA teams to align infrastructure and deployment strategies.
Translate business needs into technical requirements and infrastructure solutions.
Evaluate new tools, technologies, and methodologies that can enhance productivity, security, and system reliability.
Mandatory Qualifications:
Google Cloud Professional certification (Cloud Architect, Cloud DevOps Engineer, or equivalent).
Certified Kubernetes Administrator (CKA).
Security Certification (e.g., CISSP, CISM, CEH, or equivalent).
8+ years of experience in DevOps, SRE, or Cloud Infrastructure roles.
Deep hands-on expertise in GCP and on-premise deployments.
Proven experience managing CI/CD pipelines, production-grade Kubernetes clusters, databases, and cache stores.
Strong background in network engineering and cloud-native security.
Solid scripting and automation skills (Python, Bash, Go).
Hands-on experience with vulnerability management, intrusion detection, and compliance.
Preferred Skills:
Familiarity with multi-cloud strategies (AWS, Azure).
Experience building zero-trust architectures and service mesh security (Istio, Linkerd).
Hands-on knowledge of SIEM tools, security event management, and threat hunting.
Strong understanding of distributed systems design and scaling patterns.
Contributions to open-source DevOps/security tools (a big plus!).
Perks:
Certification reimbursement for continuous learning.
Access to state-of-the-art infrastructure and tools.
Work with industry-leading experts in security, cloud, and operations.
Opportunities to lead high-visibility, business-critical projects.
A culture that values ownership, technical excellence, and career growth.
Our Commitment:
We are committed to fostering an environment of diversity, inclusion, continuous learning, and career development. Join us and make an impact where it matters.
Note:
This is an on-site role in Karachi.
Pay basis: monthly.
Location: Karachi Division, Pakistan