Expoint - all jobs in one place

Honeywell Site Reliability Engineer 
United States 
627770977

03.07.2024
JOB DESCRIPTION

Key Responsibilities

  • Hands-on design, analysis, development and troubleshooting of highly distributed large-scale production systems and event-driven, cloud-based services
  • Primarily Linux administration: managing a fleet of Linux and Windows VMs as part of the application solution
  • Infrastructure-as-code development using Terraform, shell, and Python
  • Ensuring the repeatability, traceability, and transparency of our infrastructure automation
  • Support on-call rotations for operational duties that have not been addressed with automation
  • Support healthy software development practices, including complying with the chosen software development methodology (Agile, or alternatives), building standards for code reviews, work packaging, etc.
  • Create and maintain monitoring technologies and processes that improve visibility into our applications' performance and business metrics and keep operational workload in check.
  • Partner with security engineers to develop plans and automation that respond aggressively and safely to new risks and vulnerabilities.
  • Develop, collaborate, and monitor standard processes to promote the long-term health and sustainability of operational development tasks.
  • Participate in technical training events, game day scenarios, and professional conferences

YOU MUST HAVE

  • 2+ years of experience in system administration, application development, infrastructure development or related areas
  • 2+ years of experience in Azure cloud administration and solution design.
  • 2+ years of experience programming in languages like JavaScript, Python, PHP, Go, Java or Ruby
  • 2+ years of experience reading, understanding and writing code in the same languages
  • 3+ years of mastery of infrastructure automation technologies (like Terraform, CodeDeploy, Puppet, Ansible, Chef)
  • 2+ years of expertise in container and container-fleet orchestration technologies (like Kubernetes, AKS, EKS, Docker, Vagrant, etcd, ZooKeeper)
  • 2+ years of cloud- and container-native Linux administration, build and management skills

WE VALUE

  • Versatility with troubleshooting diverse sets of hosting technologies strongly desired. These include web server platforms, application platforms, operating systems, network components, virtualization technologies, storage, and database platforms.
  • Expertise with cloud-based, continuous-deployment software development lifecycles (e.g. CI/CD)
  • Cloud database operations and deployment experience (RDS MySQL/Postgres/Aurora), and caching operations and deployment experience (Memcached, Redis)
  • Expertise with Lean/Agile deployment processes (blue/green, zero-downtime (ZDT), canary, load balancer/DNS strategies, A/B testing, feature flagging methodologies)
  • Familiarity with site and infrastructure monitoring systems (like ELK, Datadog, AppDynamics, New Relic, Splunk, Sumo Logic, Grafana)
  • Strong problem solving, root cause analysis and systems engineering skills
  • Excellent presentation and communication skills
  • Ability to design and manage escalation response plans driven by monitoring, and to react, respond, remediate and run retrospectives in culturally aligned (proactive, customer-focused, collaborative, data-driven) ways.
  • Demonstrated expertise building and managing highly scaled production infrastructure in the cloud (Azure required; GCP, AWS, OpenStack a plus)
  • Expertise with SDLC branching, SCM, and code deployment systems (Bitbucket, Git/Gitflow, Jenkins, CircleCI, Travis CI, etc.)

Additional Information

  • JOB ID: HRD236198
  • Category: Engineering
  • Location: 715 Peachtree Street, N.E., Atlanta, Georgia, 30308, United States
  • Exempt
  • Must be a US Person or able to obtain export authorization.