Job responsibilities
- Executes creative software solutions, design, development, and technical troubleshooting, with the ability to think beyond routine or conventional approaches to build solutions or break down technical problems
- Implements real-time data processing solutions to handle large volumes of data efficiently.
- Ensures data processing solutions adhere to security and compliance standards.
- Documents data processing workflows, architecture, and best practices.
- Optimizes data processing pipelines for performance and scalability.
- Monitors and troubleshoots performance issues in Kafka and Spark applications.
- Leads communities of practice across Software Engineering to drive awareness and use of new and leading-edge technologies
- Adds to team culture of diversity, opportunity, inclusion, and respect
Required qualifications, capabilities, and skills
- Formal training or certification in software engineering concepts and 5+ years of applied experience
- Proven senior Java developer with 10+ years of expertise in Java, Kafka, Spark, Structured Streaming, and Spark SQL.
- Experience designing and implementing scalable data processing pipelines using Apache Kafka, Apache Spark, and Structured Streaming.
- Experience developing and maintaining Java applications for data ingestion, transformation, and storage.
- Experience integrating data processing solutions with AWS services such as Amazon MSK (managed Apache Kafka), Amazon S3, AWS Lambda, and Amazon EMR.
- Strong experience with AWS services and cloud-based architectures.
- Advanced experience with Spark Structured Streaming and Spark SQL.
- Experience with data enrichment, transformation, and optimization techniques.
- Experience in developing, debugging, and maintaining code in a large corporate environment with one or more modern programming languages and database querying languages.
- Proficiency in designing and implementing real-time data processing solutions.
Preferred qualifications, capabilities, and skills
- Experience with Python/shell scripting and working in a Linux environment.
- Experience building distributed systems at Internet scale.