Get the job you really want.
Top Senior Site Reliability Engineer Jobs in Boston, MA
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
As a Lead Site Reliability Engineer, you will enhance infrastructure reliability, scalability, and efficiency by leading automation projects, mentoring engineers, and developing software-driven infrastructure solutions. You will shape deployment strategies and monitor performance to support the organization's rapid growth.
Top Skills:
.Net, Ansible, AWS, C#, Chef, Containerd, Docker, GCP, Go, Java, Kubernetes, Linux, Nutanix, Python, Terraform, Vsphere
Artificial Intelligence • Big Data • Information Technology • Software
The Lead Site Reliability Engineer will oversee cloud infrastructure management, develop SRE processes, ensure FedRAMP compliance, and lead a team of engineers.
Top Skills:
Ansible, AWS, Azure, Bash, CloudFormation, Crossplane, Docker, GCP, Git, Gitlab, Go, Ids/Ips, Jenkins, Kubernetes, Python, SIEM, Terraform
Fintech • Information Technology • Payments • Financial Services • Cryptocurrency
The Senior Site Reliability Engineer will manage the production environment for the FedNow Service, implementing monitoring tools and CI/CD automation, supporting technical operations, interfacing with internal stakeholders, and driving continuous improvement initiatives while ensuring system reliability and scalability.
Artificial Intelligence • Enterprise Web • Information Technology • Machine Learning • Mobile • Software • Analytics
As a Site Reliability Engineer, you'll enhance system stability and performance, manage alert quality, and ensure operational security while collaborating on engineering initiatives.
Top Skills:
Cloud Technologies, Gke, Kubernetes, Nginx
Aerospace • Artificial Intelligence • Logistics • Machine Learning • Software • Transportation • Defense
Lead efforts to deliver the Flyways AI Platform through coding, deploying, and maintaining services in a secure cloud infrastructure, while managing complex systems and collaborating with teams.
Top Skills:
AWS, CircleCI, Docker, Grafana, Helm, Jenkins, K8S, Postgres, Python, Terraform
Big Data • Cloud • Software • Database
Lead the Fabric team as a Site Reliability Engineer, focusing on building resilient infrastructure for secure service communication, while overseeing team direction and addressing technical issues.
Top Skills:
AWS, Azure, Bgp, Dns, GCP, Kubernetes, Tcp/Ip, Tls/Mtls, Vpcs
Artificial Intelligence • Healthtech • Machine Learning • Natural Language Processing • Software
The SRE Cloud Architect will design and optimize AWS cloud infrastructure focusing on scalability, reliability, and cost efficiency, while mentoring teams and ensuring best practices in security and operational excellence.
Top Skills:
Ansible, Api Gateway, AWS, Aws Cdk, Aws Cloudwatch, Aws Guardduty, Bash, CloudFormation, Cloudfront, Cloudtrail, Documentdb, Ec2, Eks, Gitlab, Grafana, Lambda, Loki, Mimir, Prometheus, Python, Rds, S3, Secrets Manager, Security Hub, Ssm, Tempo, Terraform
Big Data • Cloud • Software • Database
The Lead Site Reliability Engineer will manage the Fabric team, ensuring secure communication infrastructure, guiding engineering practices, and participating in on-call support.
Top Skills:
AWS, Azure, Bgp, Dns, GCP, Kubernetes, Sdn, Tcp/Ip, Tls/Mtls
Featured Jobs
Artificial Intelligence • Fintech • Information Technology • Software • Data Privacy
The Principal Site Reliability Engineer ensures SaaS products are fast and stable, focuses on automation, system monitoring, and collaborates with teams to improve product performance.
Top Skills:
C#, .Net, Java, Harness, Azure Devops, Ansible, Jenkins, New Relic, Dynatrace, Datadog, Appdynamics, Powershell, Python, Bash, Terraform, Sql, Cosmos, Solarwinds Database Performance Analyzer, Idera Sql Diagnostic Manager, Redgate Sql Monitor, Kubernetes, Aks, Eks
6 Days Ago
Easy Apply
Hardware • Information Technology • Security • Software • Cybersecurity • Conversational AI
The Lead Site Reliability Engineer will design, develop, and operate observability systems, ensuring service reliability in large distributed environments. Responsibilities include scaling observability systems, writing monitoring libraries, and collaborating with engineering teams.
Top Skills:
Ansible, Bash, Elasticsearch, Go, Kafka, Prometheus, Python, Ruby, Scala, Terraform
Computer Vision • Healthtech • Information Technology • Logistics • Machine Learning • Software • Manufacturing
As a Senior Software Engineer II, you'll design scalable infrastructure to support Dandy's products, ensuring quality and performance in a collaborative environment.
Top Skills:
Chronosphere, GCP, GraphQL, Kubernetes, Nestjs, Node.js, Postgres, Pulumi, React, Redux, Temporal, Typescript
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will design scalable systems, automate processes, and ensure high availability of the Atlas platform, collaborating with multiple teams.
Top Skills:
AWS, Azure, Dns, GCP, Go, HTTP, Linux, Python, Ruby, Tls
Fintech • Information Technology • Payments • Financial Services • Cryptocurrency
As a Principal Engineer in the SRE/Production Operations team for FedNow, you will oversee production environments, implement monitoring and tooling, and ensure reliable, scalable systems. Responsibilities include CI/CD automation design, capacity planning, and collaborating with internal teams to manage technical operations and continuous improvement initiatives.
Artificial Intelligence • Big Data • Information Technology • Software
The Senior Site Reliability Engineer will build and manage cloud infrastructure, ensure security compliance, and lead incident management efforts while collaborating with various teams on performance optimization.
Top Skills:
Ansible, AWS, Azure, Bash, CloudFormation, Crossplane, Docker, Firewalls, GCP, Git, Gitlab, Go, Ids/Ips, Jenkins, Kubernetes, Python, SIEM, Terraform
Cloud • Greentech • Other • Energy
As a Site Reliability Engineer II on the Observability team, you'll manage and improve observability stacks, support engineering teams with monitoring, develop new tools, and analyze system performance for enhanced reliability.
Top Skills:
Ansible, CircleCI, Cloud Formation, Docker, Github Actions, Gitlab Ci/Cd, Go, Kubernetes, Python, Terraform
Artificial Intelligence • Enterprise Web • Machine Learning • Natural Language Processing • Software • Conversational AI • Automation
As a Site Reliability Engineer, you'll enhance infrastructure security, automate deployments, optimize CI/CD processes, and drive engineering best practices while ensuring compliance and observability.
Top Skills:
Aws Cloud, Elasticsearch, Go, JavaScript, MongoDB, Node.js, React, Redis, Terraform
22 Days Ago
Easy Apply
Big Data • Fintech • Mobile • Payments • Financial Services
As a Senior Software Engineer in SRE, you will lead teams in building reliable backend systems, driving incident management, and fostering a culture of quality, while supporting product development and handling operational metrics.
Top Skills:
AWS, Kotlin, Kubernetes, MySQL, Python
22 Days Ago
Easy Apply
Big Data • Fintech • Mobile • Payments • Financial Services
As a Staff Software Engineer in SRE, you will design and enhance backend systems, ensuring reliability and operational excellence while developing a culture of quality and mentorship within the team.
Top Skills:
AWS, Kotlin, Kubernetes, MySQL, Python, Spark
Information Technology • Software
The Site Reliability Engineer will design and maintain resilient infrastructure for a SaaS platform, ensuring security and performance through AWS services and effective monitoring.
Top Skills:
Api Gateway, Aurora Serverless, AWS, Cloudwatch, Fusionauth, Grafana, Guardduty, Lambda, Opensearch Serverless, Prometheus, Secrets Manager, Shield, Terraform, Waf
Hardware • Information Technology • Security • Software • Cybersecurity • Conversational AI
As a Lead Site Reliability Engineer, you will enhance cloud infrastructure, automate operations, and troubleshoot complex production issues in a secure environment.
Top Skills:
Ansible, AWS, Bash, Chef, Direct Connect, Docker, Go, Kubernetes, Puppet, Python, Rest, Ruby, Scala, Soap, Tls, Transit Gateway, Unix/Linux, Vpc
Consumer Web • Digital Media • Information Technology • News + Entertainment • Social Media
The Senior Site Reliability Engineer will enhance infrastructure resilience, optimize system performance, and improve both physical and cloud systems while collaborating with engineering teams.
Top Skills:
Ansible, C, C++, Docker, Go, Java, Kubernetes, Python, Terraform, Unix/Linux
Security • Software
Design and implement AWS infrastructure, manage automation with CloudFormation and Terraform, and ensure availability and reliability of cloud architectures. Support teams and advocate for improvements in architecture.
Top Skills:
Ansible, AWS, C#, C++, CloudFormation, Datadog, Docker, Elasticsearch, Grafana, Helm, Influxdb, Java, Kubernetes, Logstash, Python, Salt, Terraform
Travel
Seeking a Lead Site Reliability Engineer with over 7 years of experience in Ops or DevOps. Responsibilities include architecting reliable systems, collaborating with teams, and ensuring system security and uptime.
Top Skills:
AWS, Backbone, Chef, Datadog, Git, Java, JavaScript, Jquery, MongoDB, NoSQL, Prometheus, React, Requirejs, Terraform
Software
The Principal Site Reliability Engineer will architect and maintain fault-tolerant systems in the Jama Cloud, focusing on automation and reliability practices while guiding teams in engineering processes.
Top Skills:
Arangodb, AWS, Bash, Datadog, Docker, J2Ee, Ms Sql, MySQL, Neo4J, Postgres, Python, Terraform
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will design, implement, and enhance systems for infrastructure development, focusing on automation, reliability, and developer experience.
Top Skills:
AWS, Azure, Bazel, Crossplane, GCP, Github Actions, Kubernetes, Terraform