About the Company
IBM is a global technology and consulting company headquartered in Armonk, New York, with operations in over 170 countries. We are a leader in hybrid cloud, AI, and enterprise services, constantly pushing the boundaries of innovation to solve the world’s most complex problems. Our mission is to be essential to our clients, to the world, and to each other.
Job Description
As a Site Reliability Engineer (SRE) at IBM in Portsmouth, you will play a crucial role in ensuring the reliability, performance, and scalability of our critical systems and applications. You will apply software engineering principles to operations: automating tasks, troubleshooting complex issues, and implementing robust solutions that prevent problems from recurring. This is a hybrid role that combines remote work with in-person collaboration at our Portsmouth office.
Key Responsibilities
- Design, implement, and maintain scalable, reliable, and efficient infrastructure and applications.
- Develop and improve monitoring, alerting, and logging systems to proactively identify and resolve issues.
- Automate operational tasks, including deployment, scaling, and system maintenance.
- Participate in on-call rotations to provide 24/7 support for critical systems.
- Perform root cause analysis for production incidents and implement preventative measures.
- Collaborate with development teams to ensure new features are designed for reliability and operability.
- Manage and optimize cloud resources (e.g., IBM Cloud, AWS, Azure, GCP).
- Contribute to the continuous improvement of SRE practices and tools.
Required Skills
- Strong proficiency in at least one programming language (e.g., Python, Go, Java, Ruby).
- Experience with cloud platforms (e.g., IBM Cloud, AWS, Azure, GCP).
- Familiarity with containerization technologies (e.g., Docker, Kubernetes).
- Solid understanding of Linux/Unix operating systems and networking.
- Experience with CI/CD pipelines and automation tools (e.g., Jenkins, GitLab CI, Ansible, Terraform).
- Knowledge of monitoring and alerting tools (e.g., Prometheus, Grafana, ELK Stack, Splunk).
- Excellent problem-solving and communication skills.
Preferred Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Experience with large-scale distributed systems.
- Knowledge of database systems (SQL and NoSQL).
- Understanding of ITIL or similar service management frameworks.
- Certifications in cloud platforms (e.g., AWS Certified DevOps Engineer - Professional, Google Cloud Professional Cloud DevOps Engineer).
Perks & Benefits
- Competitive salary and performance-based bonuses.
- Comprehensive health, dental, and vision insurance.
- Generous paid time off and flexible work arrangements.
- Employee stock purchase plan.
- Pension scheme with company contributions.
- Continuous learning and development opportunities, including access to IBM's extensive training platforms.
- On-site gym and wellness programs (at applicable locations).
- Employee assistance program.
- Childcare vouchers.
How to Apply
If you are interested in this position, please apply using the "Apply Now" button on this listing. To ensure your application is properly considered, please prepare the following:
- An up-to-date resume or CV
- A brief cover letter summarizing your experience and motivation
Applications are reviewed on a rolling basis. Only shortlisted candidates will be contacted for an interview.
⚠️ Important Disclaimer
Welcome to Westford Trust. We publish job opportunities aggregated from public sources, employers, and job portals. We never charge any fees to access or use our website; all information is provided entirely for free.
Westford Trust does not directly offer or manage these positions, nor are we directly involved in the hiring process for the vacancies published on https://jobs.westfordtrust.com.
If you suspect a fraudulent listing or have any questions, please contact us at techturna@gmail.com.