Cloud Operations Engineer - US Remote

Actian

2mo ago

Applications are closed

  • Job
    Full-time
    Mid Level
  • Software Engineering

Requirements

  • Bachelor’s degree in computer science or equivalent experience related to Information Technology
  • 3+ years’ experience as a Cloud Operations Engineer or Site Reliability Engineer managing a SaaS / PaaS / IaaS environment
  • Experience managing Linux and Windows Server
  • Experience with configuration and automation toolsets such as Terraform, Puppet, Chef, and Ansible
  • Experience monitoring a global cloud footprint, with hands-on use of modern monitoring platforms and time-series databases such as Grafana, Prometheus, Datadog, Sumo Logic, Nagios, or Zenoss
  • Experience in the design and/or deployment of Public Cloud technologies (AWS, Azure, GCP)
  • Experience with network services such as DNS, DHCP, WAN routing, TCP/IP networking, LDAP, NFS, and SMTP
  • Knowledge of RDBMS systems such as MySQL and SQL Server
  • Experience with containerization and container orchestration, especially Docker and Kubernetes
  • Experience in the deployment and management of microservices
  • Experience maintaining and managing Spark, Kafka, Tomcat, Cassandra, and MySQL-based systems
  • Proficiency in Python, Bash, SQL, or Java
  • Ability to write and present effective materials, including presentations, status reports, technical diagrams, and flowcharts
  • Ability to use problem-solving techniques, such as root cause analysis, to resolve issues
  • Solid understanding of incident management, change management, and problem management

Nice to Haves

  • Experience working with a globally distributed team
  • Understanding of the software development lifecycle and CI/CD pipelines
  • Experience architecting and optimizing cloud platforms
  • Certification in any of the following: AWS, Azure, or GCP

Responsibilities

  • Monitor and debug issues across the platforms (applications, networks, databases)
  • Administer, maintain, and automate systems to ensure reliability, resiliency, scalability, and security
  • Deploy, maintain, and enhance monitoring solutions and provide technical resolutions and root cause analysis for high severity incidents
  • Work closely with Engineering and Software Development teams to design, deploy, and operate components/services that are automated, resilient, and scalable
  • Ensure that documented SSAE policies and procedures are followed and enforced
  • Create, update, and maintain documentation for all configurations for the production environment
  • Maintain and ensure the readiness and availability of disaster recovery environments
  • Develop and deliver timely reports on service metrics, including but not limited to availability, capacity, performance, and latency, across all production systems
  • Manage a 24x7x365 regional operational team

FAQs

What are the main responsibilities of a Cloud Operations Engineer?

Cloud Operations Engineers are responsible for keeping all production systems at Actian running smoothly, developing and delivering automation, creating monitoring and telemetry, enhancing the CD pipeline, driving operational excellence through automation and monitoring, and working closely with development teams to build reliability into products and architecture.

What skills are important for a Cloud Operations Engineer to have?

A successful Cloud Operations Engineer should be customer focused, a self-starter, and able and willing to work with geo-dispersed teams. Strong engineering principles, operational discipline, automation skills, monitoring and telemetry knowledge, CD pipeline development experience, and the ability to drive improvements and change across the organization are also important for this role.

What is the primary goal of a Cloud Operations Engineer at Actian?

The primary goal of a Cloud Operations Engineer at Actian is to ensure the smooth operation of all production systems, drive operational excellence through automation and monitoring, and accelerate the adoption of containers and Kubernetes within the organization.

Powering the data-driven enterprise

Industry: Technology
Employees: 201-500
Founded Year: 2005

Mission & Purpose

Actian makes data easy. We deliver cloud and on-premises data solutions that simplify how people connect, manage, and analyze data. We transform business by enabling customers to make confident, data-driven decisions that accelerate their organization’s growth. Our data platform integrates seamlessly, performs reliably, and delivers at industry-leading speeds.