Principal Engineer - Availability Engineering

Salesforce

1mo ago

  • Job
    Full-time
    Expert / Leadership (9+ years)
  • Dublin

AI generated summary

  • You should have extensive experience in engineering, robust problem-solving skills, and a deep understanding of system availability and reliability best practices.
  • You will ensure system reliability, improve availability processes, troubleshoot issues, and collaborate with teams to enhance engineering practices and service performance.

Requirements

  • 15+ years of hands-on software development experience
  • 5+ years in a Tech Lead, Principal or Architect capacity
  • Ability to reverse engineer solutions through independent code and architecture review, then envision, define and contribute to the delivery of availability improvement refactoring projects
  • Mastery of object-oriented delivery in one or more languages such as Java, Golang, Apex, Python
  • Deep experience working with core web technologies: HTTP, JSON, REST, XML
  • Proficiency with databases including Oracle or other relational and/or NoSQL solutions
  • Experience owning and operating multiple instances of a critical service
  • Experience running critical infrastructure services: monitoring, alerting, logging, tracing and reporting
  • Domain expertise in service ownership standard processes and SLO/SLI/SLA definition, driving proactive operational awareness, and experience with incident and problem management
  • Thorough knowledge of Agile development methodology, with experience in both Test-Driven and Behavior-Driven Development practices

Responsibilities

  • As part of a specialist unit focused on availability and resilience, you will embed with delivery teams, acting in a Lead capacity, creating bandwidth and prioritizing a focus on corrective and proactive availability measures.
  • You will contribute to designing, developing, debugging, and operating resilient applications and platforms deployed on distributed systems that span thousands of compute nodes in multiple data centers.
  • You will champion resiliency standard processes: observability tool integration, horizontal/vertical sizing and auto-scaling, release rollback and recovery workflows, and integration tests and validation procedures for applications running on self-hosted infrastructure as well as public cloud platforms such as AWS, GCP, Azure and Alibaba.
  • Using and contributing to open source technology (Spinnaker, Zookeeper, etc.)
  • Developing / demonstrating Infrastructure-as-Code using Terraform.
  • Building / integrating with APIs and microservices deployed on containerization frameworks such as Kubernetes, Docker, Mesos, etc.
  • Resolving complex technical issues and driving innovations that improve system availability, resilience, and performance.
  • You have experience balancing live runtime management, feature delivery, and retirement of technical debt.
  • You will participate in the team’s on-call rotation to address complex problems in real time and keep services operational and highly available.

FAQs

What is the job title for this position?

The job title is Principal Engineer - Availability Engineering.

What company is hiring for this position?

Salesforce is hiring for this position.

What are the primary responsibilities of the Principal Engineer in Availability Engineering?

The primary responsibilities include embedding with delivery teams to focus on availability measures, designing and operating resilient applications across distributed systems, championing resiliency standard processes, and resolving complex technical issues to improve system availability and performance.

What years of experience are required for this position?

The position requires 15+ years of hands-on software development experience and 5+ years in a Tech Lead, Principal, or Architect capacity.

What programming languages should candidates be proficient in?

Candidates should have mastery of one or more object-oriented delivery languages such as Java, Golang, Apex, or Python.

What kind of systems and technologies will this role work with?

This role will work with multi-substrate engineering platforms, distributed systems, APIs and microservices deployed on containerization frameworks such as Kubernetes and Docker, as well as infrastructure-as-code using Terraform.

Does the position involve on-call duties?

Yes, the position involves participating in the team’s on-call rotation to address complex problems in real time and maintain high availability of services.

What are the key skills required for candidates applying for this position?

Key skills required include experience in availability improvement, proficiency with core web technologies (HTTP, JSON, REST, XML), database expertise (including Oracle or NoSQL solutions), and knowledge of incident/problem management processes.

How does Salesforce approach equality and diversity in hiring?

Salesforce is committed to creating a workforce that reflects society through inclusive programs and initiatives, including equal pay, employee resource groups, and inclusive benefits.

Where can applicants find more information about Salesforce's benefits and equality initiatives?

Applicants can learn more about Equality at www.equality.com and explore company benefits at www.salesforcebenefits.com.

👋 We’re Salesforce, the Customer Company. AI + Data + CRM = Customer Magic. ✨

Industry: Technology
Employees: 10,001+

Mission & Purpose

Salesforce is a leading cloud-based software company that provides customer relationship management (CRM) solutions and a wide range of enterprise applications. Their platform enables businesses to manage customer interactions, sales processes, marketing campaigns, and service operations in a centralised and efficient manner. Salesforce's ultimate mission is to empower companies to connect with their customers, partners, and employees in meaningful ways, fostering stronger relationships and driving business growth. Their purpose is to revolutionise the way businesses operate by offering a comprehensive suite of cloud-based tools and applications that streamline processes, enhance collaboration, and enable organisations to make data-driven decisions. With a strong focus on innovation, customer success,