Expoint - all jobs in one place

Where the best experts and companies meet

Wix Senior Site Reliability Engineer - DeviantArt
Israel, Tel Aviv District, Tel Aviv-Yafo 
536052289

02.04.2024

As a Site Reliability Engineer at DeviantArt, you will be responsible for ensuring the robustness, scalability, and security of the platform infrastructure that supports over 1.5 billion monthly page views. The role balances daily operations (troubleshooting, server maintenance, and small tasks) with architecting, developing, and completing larger infrastructure projects, often in conjunction with other teams and stakeholders.

Infrastructure Scalability and High Load Management:

  • Maintain and architect a scalable, highly available infrastructure on AWS through load balancing and auto-scaling, capable of handling over 1.5 billion page views monthly with optimal performance
  • Ensure high availability of site and critical infrastructure, addressing downtime and degradation issues quickly to restore critical systems and services
  • Maintain a developer environment in parity with production systems, to ensure changes can be appropriately tested before release
  • Develop and maintain CI/CD pipelines using Terraform and Kubernetes, enhancing deployment strategies for high efficiency and zero downtime
  • Utilize configuration management tools to automate and streamline infrastructure provisioning and management, including writing tests and documentation

Database Performance and Scalability:

  • Optimize, maintain, and scale sharded MySQL databases to ensure fast, efficient, and reliable data access and storage amidst increasing data ingest
  • Troubleshoot slow queries and bottlenecks on MySQL servers to quickly mitigate production issues

Security and DDoS Mitigation:

  • Develop and enforce stringent security protocols to protect infrastructure from threats, with a particular focus on DDoS attack mitigation
  • Upgrade AWS components, servers, containers, and packages regularly to proactively and retroactively address any security issues
  • Continuously monitor, analyze, and improve internal security measures to ensure the highest level of protection against evolving threats

Cost Optimization and Resource Management:

  • Monitor and optimize cloud resource utilization to ensure cost-effective operation and efficient use of computing resources, aiming for low infrastructure costs without compromising performance