- Shell Scripting
- Distributed Systems
We are looking for a Site Reliability Engineer with experience in distributed systems to join our team working on a new transaction monitoring application used across multiple businesses.
As a Site Reliability Engineer, you will lead the team’s thinking when it comes to hosting applications in production. You will understand the functional and non-functional requirements and design an approach that will allow us to productionise the new system as quickly as possible, and then iterate to a robust approach. You will integrate with other systems for provisioning, monitoring, and alerting. Working closely with our support teams and development teams, you will help do all this with a high level of automation and reliability.
We want to minimize manual processes, so strong software engineering proficiency is key in this role.
This is an exciting opportunity to work on an important project, which will have a huge impact on the business and our future architecture.
Working closely with a data-centric application, hosting algorithms to detect possible market abuse.
Designing a deployment and hosting strategy that scales in a performant way to support multiple algorithms and many end users, and that delivers value early and often. Implementing against this strategy.
Building out our CI/CD pipeline to deploy all layers of the architecture to production in a robust, easy-to-use and automated way.
Providing alerting, technical documentation and operational support so that we can identify and resolve incidents affecting the health of the application as quickly as possible. Handing this approach over to the support teams in the production environment, possibly through the writing of runbooks.
Building a close relationship with clients and stakeholders to understand the use case for the platform, and prioritise work accordingly
Working well in a multidisciplinary DevOps-focused team, building a close relationship with other developers, Quants/Data Scientists and production support teams
Skills & Qualifications:
You have experience taking applications to production, ideally written in R or Python. You can design a fault-tolerant, scalable system from scratch based on a set of requirements. You will have worked with a legacy application and understand how to iterate it towards a more desirable approach.
You have experience building CI/CD pipelines from scratch for microservice architectures, ideally with experience in TeamCity, IBM UrbanCode Deploy and/or Jenkins.
You will have experience developing or supporting applications with a significant ETL component. You can deploy, scale and monitor these applications.
You have experience creating, deploying and supporting Docker images and are familiar with parts of the ecosystem such as OpenShift or Kubernetes
You are fluent in Bash/Shell and at least one other programming/scripting language, preferably Python
An ideal candidate would have experience working with R code in a production system.
You have high development standards, especially for code quality, code reviews, unit testing, continuous integration and deployment
You have proven capability to interact with clients and deliver results, taking ideas to production
You have experience working in fast paced development environments
You agree that verbal and written communication skills are vital
Grade: All Job Level - All Job Functions - GB
Time Type :
Citi is an equal opportunity and affirmative action employer.
Minority/Female/Veteran/Individuals with Disabilities/Sexual Orientation/Gender Identity.
Citigroup Inc. and its subsidiaries ("Citi") invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity .