Site Reliability Engineer
Salary: £70-90k
About Quix
Our mission is to help developers use live data in their applications, faster.
The use of event streaming is booming. Organisations are in a race to become data-driven, but working with live data-in-flight remains difficult, time-consuming and costly.
Our innovative developer-first platform enables developers and enterprises to build, test and deploy live models and services directly on Kafka, fast.
With built-in scale, efficiency and resilience, our unique Stream SDK supports time-series data, events, metadata and binary blobs, integrating seamlessly into the application stack to deliver an end-to-end platform that developers love, in minutes.
Quix is used by developers across the video game, racing, manufacturing, automotive, health and telco industries.
Our team rapidly developed into a remote-first organisation during 2020 with people now living and working across the world.
We are building a category-defining platform which will launch a new data-driven epoch.
Join us and bring your passion!
Role
As a Site Reliability Engineer you will help deliver and scale a platform that developers love to use. You will:
Maintain existing services to guarantee uptime.
Build and implement disaster recovery for when services go down, and make improvements so that outages stay rare.
Keep services running, and get them back up quickly when a failure occurs.
Ensure that we ship software that meets security requirements.
Automate work, including infrastructure needs, failover solutions and failure mitigation.
Improve monitoring and alerting solutions.
Maintain documentation for recurring issues and prepare incident reports for production issues.
Migrate the platform to AWS, GCP and additional cloud platforms as required.
Design and implement on-prem and hybrid-cloud solutions.
Required skills and knowledge
- Professional communication skills, both verbal and written
- Experience operating large-scale production systems, with a keen understanding of design principles and implementation best practices
- Knowledge of:
  - Networking (DNS, load balancers, etc.)
  - Unix / Linux shell
  - Encryption for data in flight and at rest
- Experience in the following technologies:
  - Kafka
  - Kubernetes
  - Docker
  - Azure Cloud Service
  - Ansible, Terraform or alternatives
  - Source control (git)
  - Helm
Nice to have
- Experience in the following technologies:
  - AWS
  - Google Cloud
  - Chaos Engineering
Benefits
- Work from home anywhere in the UK and EU (may be required to travel occasionally)
- Team meet-ups in EU destinations
- 37 days holiday (including all public holidays in your region)
- 2 additional paid days off a year for volunteering work
- Generous salary according to the experience and skills of the candidate
- Options to earn stock in the company
- Budget to choose your own hardware and office set-up
- Training and personal development budget
- Regular socials with food/drink allowance
If you are interested in joining Quix, please send your CV to agnese@quix.ai