The Senior Director of Cloud Operations is responsible for the operational integrity, performance, and reliability of enterprise cloud environments. The role leads a global, data-driven operations team with a strong emphasis on incident management, service continuity, and continuous improvement, and reports directly to the Vice President of Cloud.

This position leads a global team of cloud engineers and owns the SRE practice, service management tooling, and operations, using a metrics-first approach.

Cloud Infrastructure Operations
Oversee the daily operations of cloud platforms (AWS, Azure, GCP), ensuring high availability and performance across global regions.
Lead the development and execution of operational runbooks, SOPs, and escalation paths.
Incident Management & Response
Own the end-to-end incident management lifecycle: detection, triage, escalation, resolution, and post-incident review.
Lead a global incident response team with 24/7 coverage, ensuring seamless handoffs across time zones.
Implement real-time monitoring, alerting, and automated remediation to reduce MTTD and MTTR.
Use data analytics to identify incident trends, recurring issues, and systemic risks.
Conduct blameless postmortems and ensure corrective actions are prioritized and tracked to closure.
Data-Driven Operational Leadership
Build and lead a global team of cloud engineers, SREs, and operations analysts using a metrics-first approach.
Define and track operational KPIs (e.g., uptime, incident frequency, resolution time, change success rate) to drive accountability and performance.
Leverage dashboards and analytics platforms (e.g., Datadog, Grafana, Splunk, ServiceNow) to provide real-time visibility into system health and team performance.
Use data to inform staffing models, on-call rotations, and workload balancing across regions.
Foster a culture of continuous improvement through data-backed retrospectives and operational reviews.
AI-Enabled Focus
Drive AI and ML adoption in operational workflows (e.g., predictive monitoring, incident pattern analysis) to improve uptime and automate repetitive tasks.
Define and execute an AI-driven observability strategy using tools such as AIOps platforms for intelligent alerting and root cause analysis.
Collaborate with Engineering, Security, and Product teams to embed AI-enabled automation in deployment pipelines, change management, and related processes.
Establish and maintain SLOs/SLAs, using AI-generated insights to prioritize engineering work that improves reliability and customer experience.
Oversee incident management, post-mortems, and continuous improvement, incorporating AI tools for impact analysis and knowledge retention.
Operational Governance
Define and enforce SLAs, SLOs, and operational KPIs.
Ensure compliance with security, regulatory, and audit requirements.
Manage change control, configuration management, and release processes to minimize operational risk.
Cost & Vendor Management
Monitor and optimize cloud spend through cost governance and usage analysis.
Manage vendor relationships, contracts, and service-level agreements.
Collaboration & Communication
Partner with engineering, security, and business teams to align operations with product and service goals.
Provide regular reporting and updates to executive leadership on operational health, risks, and incident trends.

Education
Bachelor’s or master’s degree in Computer Science, Information Systems, or a related field.
Experience
14+ years in IT operations, with 7+ years in cloud infrastructure and operations leadership.
Proven experience leading global teams and managing high-severity incidents in large-scale environments.
Skills
Deep expertise in cloud operations, incident response, and service reliability.
Strong knowledge of ITIL, SRE, and DevOps practices.
Proficiency in operational analytics and observability tools.
Excellent leadership, communication, and cross-functional collaboration skills.
Strong presentation skills, including experience presenting to large global audiences.
Certifications (Preferred)
AWS Certified DevOps Engineer – Professional
Azure Administrator Associate
ITIL Foundation or Practitioner

At Granicus, we offer a comprehensive and flexible benefits package designed to support your well-being, growth, and work-life balance.

Here’s what you can expect as an India-based team member:

Flexibility & Balance

Paid Time Off – Take the time you need to rest, recharge, and live your life.
Company-Wide Wellbeing Days – Paid days off to unplug and focus on your mental health.
Work From Home Reimbursement – Support a productive home office environment.

Health & Wellness

Private Healthcare Benefits – Comprehensive coverage for you and your family.
On-Demand Mental Health Support – Access to Headspace and other wellness tools.
Fitness Reimbursement & Cycle Program – Stay active, your way.
Critical Illness and Life Insurance Benefits

Family & Future

Paid Parental Leave – For both birthing and non-birthing parents.
Pension plan with employer contributions

Growth & Recognition

Online Learning Platforms – Fuel your professional development.
Competitive Salary & Bonuses – Your contributions are valued and rewarded.

Senior Director, Cloud Operations (SRE, SDM)
