SRE Lead

navneetkaur | Updated: January 23, 2025

About Trantor

Trantor is a technology services company focused on outsourced product development and digital re-engineering. Leveraging our CaptiveCoE™ engagement model, we operate as a seamless extension of our clients’ teams to provide rapid scalability with predictable budgets. Founded in 2012, Trantor has worked with customers across the Tech, FinTech, Media, and Cyber Security industries. We have centers in the US, India, Canada, and Costa Rica. We are consistently rated as the #1 employer in the region, with the ability to attract and retain technical talent. Our commitment to excellence and impactful results has translated into long-term relationships and value for our clients and solution partners. Please visit us at: https://trantorinc.com

Role and Responsibilities

Reliability Engineering:

Design and implement strategies to ensure system reliability and scalability, meeting high traffic demands during peak events like FIFA 2026.
Define and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for all critical systems.

Monitoring and Observability:

Set up and maintain monitoring and alerting systems using tools like CloudWatch, Datadog, or Prometheus.
Create dashboards and alerts for key performance metrics, including latency, error rates, and throughput.

Incident Management:

Develop and maintain incident response plans, ensuring quick resolution and root cause analysis.
Lead on-call rotations and manage post-mortem processes for critical incidents.

Automation and Scalability:

Automate routine operational tasks, including scaling, failover, and disaster recovery, using tools like AWS Lambda and scripting languages.
Collaborate with DevOps and Platform Engineering teams to improve CI/CD pipelines and deployment reliability.

Security and Compliance:

Work with the security team to implement best practices, including IAM role enforcement, encryption, and secure logging.
Monitor compliance with regulatory requirements, including GDPR and AWS-specific standards.

Knowledge Sharing and Leadership:

Provide mentorship to junior team members on SRE practices.
Collaborate with the AWS architect to ensure operational alignment with infrastructure designs.

Skills and Qualifications

7+ years of experience in SRE or a related role with a focus on system reliability and performance.
Strong proficiency in monitoring and observability tools (e.g., Datadog, Prometheus, Grafana).
Expertise in scripting (Python, Go, or Bash) for automation and tooling.
Hands-on experience with AWS services, including CloudWatch, Lambda, DynamoDB, and S3.
Proven track record of managing high-traffic, consumer-facing applications with strict uptime requirements.
Experience implementing and managing SLAs, SLOs, and SLIs.
Knowledge of security best practices, including vulnerability assessments and IAM management.
Familiarity with chaos engineering practices to test system resilience.
AWS Certified DevOps Engineer – Professional certification.

Job Category: SRE

Job Type: Full Time

Job Location: Chandigarh/Gurgaon/Noida/Remote

Shift Timing: General Shift
