Site Reliability Engineer, Manager

Job Locations: US
Requisition ID: 2026-166095
Position Category: Information Technology
Clearance: Public Trust

Responsibilities

Peraton is seeking a Site Reliability Engineer (SRE), Manager: a highly experienced professional responsible for ensuring the availability, reliability, and performance of complex systems in a multi-vendor environment. This role combines deep technical expertise in infrastructure, automation, and system architecture with leadership and collaboration skills to drive reliability frameworks, proactive monitoring, and incident response across diverse platforms and teams.

The Site Reliability Engineer, Manager operates with significant autonomy, architecting solutions that enhance system observability, scalability, and fault tolerance. They lead reliability initiatives, mentor engineering teams, and collaborate with multiple vendors and internal stakeholders to align reliability strategies with business objectives and customer needs. This role is ideal for a highly skilled engineer who excels in technical leadership, complex system architecture, and multi-stakeholder environments. The Site Reliability Engineer, Manager is key to building resilient systems that scale efficiently while minimizing downtime and risk.

This opportunity will support the modernization of a large-scale multi-tenant cloud ecosystem, providing critical enterprise-wide support for more than 40 million users in a complex stakeholder environment. This position requires senior-level leadership skills combined with modern cloud and industry-leading technical capabilities, including product development, strict security compliance, current cloud solutions, reliable application delivery with SaaS and Artificial Intelligence integrations, and rapid continuous delivery.

Core Responsibilities

  • Reliability Architecture and Automation: Design, implement, and oversee reliability frameworks, including SLOs, error budgets, and automated incident response systems. Develop and maintain CI/CD pipelines to ensure seamless deployment and procedural efficiency.
  • Observability and Monitoring: Lead the creation and enhancement of observability platforms using metrics, logging, and tracing tools. Utilize modern technologies like OpenTelemetry, AI/ML for anomaly detection, and streaming data platforms to proactively detect and resolve issues.
  • Multi-Vendor Collaboration: Coordinate with external vendors and internal teams to integrate and manage diverse systems and tools. Ensure consistent reliability standards and practices are maintained across different technology stacks and service providers.
  • Incident Management and Risk Mitigation: Drive incident response strategy by leading root cause analysis, post-mortem reviews, and continuous improvement efforts. Identify potential risks and implement mitigation strategies to prevent service disruptions. 

Leadership and Collaboration

  • Technical Leadership: Mentor site reliability and engineering teams, fostering a culture of reliability, automation, and continuous learning. Advocate for best practices in system design and reliability engineering.
  • Cross-Functional Partnership: Work closely with product development, DevOps, and security teams to integrate reliability into the software development lifecycle. Influence platform strategy and roadmap based on reliability insights.
  • Strategic Influence: Collaborate with senior stakeholders and vendors on long-term reliability goals. Prepare executive-level presentations that translate technical challenges into business impact.
  • Agile and DevOps Practices: Lead and refine agile workflows to enhance team productivity and reliability outcomes. Champion DevOps methodologies to align development and cloud services efforts.

**Position could support/work across multiple enterprise-wide efforts within Peraton.**

Qualifications

Key Skills and Qualifications:

  • Extensive experience (10+ years) in site reliability engineering or related roles, preferably in multi-vendor and complex environments. 
  • Deep knowledge of cloud-native infrastructure, container orchestration (e.g., Kubernetes), and automation tools such as Terraform, Ansible, or Chef.
  • Proficiency in observability technologies, such as Prometheus, Grafana, OpenTelemetry, log aggregation systems, etc.
  • Strong programming and scripting skills for automation and tooling (Python, Go, or similar).
  • Expertise in defining and implementing SLIs, SLOs, and error budgets.
  • Excellent communication skills for collaboration with diverse teams and external vendors.
  • Proven ability to lead large-scale reliability initiatives and mentor engineering teams.
  • Strategic thinker with a focus on aligning reliability engineering with business priorities and customer experience. 

Clearance Requirements:

  • U.S. Citizenship required
  • Ability to obtain agency clearance (Public Trust)

Preferred Qualifications:

  • Top Secret clearance preferred

Peraton Overview

Peraton is a next-generation national security company that drives missions of consequence spanning the globe and extending to the farthest reaches of the galaxy. As the world’s leading mission capability integrator and transformative enterprise IT provider, we deliver trusted, highly differentiated solutions and technologies to protect our nation and allies. Peraton operates at the critical nexus between traditional and nontraditional threats across all domains: land, sea, space, air, and cyberspace. The company serves as a valued partner to essential government agencies and supports every branch of the U.S. armed forces. Each day, our employees do the can’t be done by solving the most daunting challenges facing our customers. Visit peraton.com to learn how we’re keeping people around the world safe and secure.

Target Salary Range

$135,000 - $216,000. This represents the typical salary range for this position. Salary is determined by various factors, including, but not limited to, the scope and responsibilities of the position; the individual’s experience, education, knowledge, skills, and competencies; and geographic location and business and contract considerations. Depending on the position, employees may be eligible for overtime, shift differential, and a discretionary bonus in addition to base pay.

EEO

EEO: Equal opportunity employer, including disability and protected veterans, or other characteristics protected by law.
