Analyze and refine Spark workloads to enhance performance and resource efficiency. This includes tuning parameters, re-engineering data processing pipelines, and ensuring optimal execution strategies.
Implement monitoring solutions to detect and diagnose performance bottlenecks or failures in Spark applications. Solve complex runtime issues related to resource allocation, execution errors, or data inconsistencies.
Collaborate with various parts of the organization to design and build a framework that allows Business Data Cloud workloads to scale effortlessly and elastically across distributed data environments.
Produce documentation on best practices for Spark optimization and conduct training sessions or workshops to enhance the broader team’s expertise in performance tuning and support strategies.
Lead AI-automation initiatives to improve workload scalability and efficiency.
Mentor engineering teams and lead cross-functional global collaborations.
Qualifications:
Deep familiarity with principles of distributed computing, including concurrency, fault tolerance, and network latency, which are essential for optimizing distributed data processing.
Comprehensive knowledge of Apache Spark architecture, including its core components like the driver, executor, and cluster manager, as well as Spark's execution model.
Expertise in performance profiling and tuning Spark applications, including optimizing resource allocation, parallelism, and shuffling processes to reduce execution time and improve efficiency.
Skilled in writing efficient SQL queries and transformations using SparkSQL and DataFrames, optimizing operations to reduce computation overhead.
Strong debugging skills to identify and resolve runtime issues, optimize code paths, and rectify configuration or environment-related problems.
Experience with tools such as Spark's web UI, Ganglia, Grafana, or Prometheus for monitoring application status and diagnosing performance bottlenecks.
Comprehensive understanding of AI and ML trends and their applications in distributed data query execution.
Excellent leadership, mentorship, and communication skills.
Strategic vision and deep analytical skills to foresee industry advancements.
We win with inclusion
Successful candidates might be required to undergo a background verification with an external vendor.