1 June 2024

An exciting opportunity for a Site Reliability Engineer (SRE) who will monitor critical applications for a leading banking institution.

Mandatory Skill(s)

  • Degree in Computer Science, Mathematics, Engineering, Econometrics or similar;
  • Knowledge of and experience in Windows systems administration, including events/services and ASP.NET / .NET Core applications;
  • Understanding of core SRE concepts (SLI / SLA / error budgets) and experience implementing various SRE models within an organization;
  • Proficient in Unix / Linux administration, Tomcat application administration, PowerShell scripting and automation;
  • Experience in monitoring and observability implementations, especially on the Windows stack, with knowledge of networking concepts and Windows networking;
  • Knowledge of RDBMS concepts and performance management (Oracle / MS SQL) and Azure Pipelines;
  • Self-driven, committed, and reliable team player;
  • Passionate about learning new technologies;
  • Knowledge of SRE (Site Reliability Engineering) practices covering monitoring, observability, performance management, automation, and resiliency;
  • Ability to stay calm under pressure and remain well organized in progressing assigned tasks.

Desirable Skill(s)

  • Experience in applying SRE concepts to front-office banking processes (Credit Approval, Monitoring);
  • Experience in Python scripting and Ansible.

Responsibilities

  • Monitor the performance of our production systems using a host of monitoring tools;
  • Identify and troubleshoot issues such as software bugs, misconfigurations and performance bottlenecks, and coordinate their fixes;
  • Continuously run technical health assessments on production infrastructure and systems;
  • Actively monitor SLAs and ensure that services perform within the promised levels;
  • Architect, create and automatically manage ‘runners’ or ‘bots’ that fully automate tasks across infrastructure and applications;
  • Coordinate incident management and service restoration;
  • Take responsibility for Disaster Recovery (DR) and Business Continuity Planning (BCP);
  • Gather relevant data and provide accurate production reporting for availability, reliability, performance and capacity.

If you are interested in this role, click the “Apply to this job” button below, or send your CV to Meenakshi Saklani at meenakshi.saklani@sciente.com, quoting the job title.

Meenakshi Saklani
EA Reg No.: R1876920
Lead Consultant
Apply to this Job