The Company
Serving the People Who Serve the People
Granicus is driven by the excitement of building, implementing, and maintaining technology that is transforming the GovTech industry by bringing governments and their constituents together. We are on a mission to support our customers by meeting the needs of their communities and implementing our technology in ways that are equitable and inclusive. Granicus has consistently appeared on the GovTech 100 list over the past 5 years and has been recognized as one of the best companies to work for on BuiltIn.
Over the last 25 years, we have served 5,500 federal, state, and local government agencies and more than 300 million citizen subscribers powering an unmatched Subscriber Network that uses our digital solutions to make the world a better place. With comprehensive cloud-based solutions for communications, government website design, meeting and agenda management software, records management, and digital services, Granicus empowers stronger relationships between government and residents across the U.S., U.K., Australia, New Zealand, and Canada. By simplifying interactions with residents, while disseminating critical information, Granicus brings governments closer to the people they serve—driving meaningful change for communities around the globe.
Want to know more? See more of what we do here.

Job Summary

Granicus is seeking an experienced and highly skilled Senior Site Reliability Engineer (SRE) to join our SRE team. As a Senior SRE, you will play a pivotal role in ensuring the reliability, scalability, and performance of our services. You will lead efforts in building and maintaining a robust infrastructure, automating processes, and guiding the team to implement best practices in site reliability.

What Your Impact Will Look Like

Essential Functions
On-call Production Support: Provide production support in shifts, according to the team's on-call roster.
Work on tickets raised by customers and by internal engineering and implementation teams when not on call for production support. For example, a client may request a data correction on the database server that cannot be made through the web interface.
Work on SRE backlog items.
Monitor and Maintain Systems: Continuously monitor the health and performance of our services, systems, and infrastructure. Respond to alerts and incidents promptly to ensure high availability.
Automate Processes: Develop and maintain automation scripts and tools to streamline operations and reduce manual intervention.
Incident Management: Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence.
System Improvements: Participate in designing and implementing system improvements to enhance reliability, scalability, and performance.
Collaboration: Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes.
Documentation: Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing within the team.
Capacity Planning: Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth.
Security: Implement and adhere to security best practices to protect our systems and data.

You Will Love This Job If You Have

Technical Skills: Good understanding of Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud). Experience with scripting languages such as Python, Bash, or Ruby.
Education: Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
Experience: 5+ years of experience in site reliability engineering, system administration, or a similar role, with a proven track record of managing large-scale, high-availability systems.
Technical Skills: Expertise in Linux/Unix systems, networking, and cloud services (AWS, Azure, or Google Cloud). Proficiency in scripting languages (Python, Bash, Ruby) and programming languages (Go, Java, C++).
Tools and Technologies: Advanced knowledge of monitoring and logging tools (Prometheus, Grafana, Splunk), configuration management (Ansible, Chef, Puppet), and CI/CD pipelines.
Problem-Solving: Strong analytical and problem-solving skills with the ability to diagnose and resolve complex issues efficiently.
Communication: Excellent verbal and written communication skills, with the ability to convey complex technical concepts to non-technical stakeholders.
Leadership: Demonstrated ability to lead and mentor a team, drive projects to completion, and manage cross-functional initiatives.

The Benefits

At Granicus, we offer a comprehensive and flexible benefits package designed to support your well-being, growth, and work-life balance.

Here’s what you can expect as an India-based team member:

Flexibility & Balance

Paid Time Off – Take the time you need to rest, recharge, and live your life.
Company-Wide Wellbeing Days – Paid days off to unplug and focus on your mental health.
Work From Home Reimbursement – Support a productive home office environment.

Health & Wellness

Private Healthcare Benefits – Comprehensive coverage for you and your family.
On-Demand Mental Health Support – Access to Headspace and other wellness tools.
Fitness Reimbursement & Cycle Program – Stay active, your way.
Critical Illness and Life Insurance Benefits

Family & Future

Paid Parental Leave – For both birthing and non-birthing parents.
Pension plan with employer contributions

Growth & Recognition

Online Learning Platforms – Fuel your professional development.
Competitive Salary & Bonuses – Your contributions are valued and rewarded.

Site Reliability Engineer 3 - GCP
