Responsibilities:
Develop and deliver data solutions to accomplish technology and business goals.
Perform code design and delivery tasks associated with the integration, cleaning, transformation, and control of data in operational and analytics data systems.
Work with stakeholders, Product Owners, and Software Engineers to aid in the implementation of data requirements, performance analysis, research, and troubleshooting.
Work with data engineering practices and contribute to story refinement/defining requirements.
Participate in estimating work necessary to realize a story/requirement through the delivery lifecycle.
Code solutions to integrate, clean, transform, and control data in operational and/or analytics data systems per the defined acceptance criteria.
Use Java, Scala, Python, and the Apache Kafka and Cloudera architectures, components, and ecosystems to maintain system operations, enhance existing data-processing routines, and develop new methods using the latest offerings.
Create advanced data pipelines by using Kafka APIs, including producers, consumers, and Kafka Streams (see the producer and topology sketches after this list).
Utilize Kafka Connect, Schema Registry, and KSQL to perform inline data enrichments and calculations and to build near-real-time data products.
Develop objects and metadata for integration of CDC data using fully transactional Hive managed tables (see the merge sketch after this list).
Perform troubleshooting and performance tuning on the Cloudera Data Platform.
Perform Python and Spark development with an emphasis on Spark performance tuning and advanced parallel-processing Spark applications (see the tuning sketch after this list).
Remote work may be permitted within a commutable distance from the worksite.
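By way of illustration, a minimal Scala sketch of the Kafka producer API referenced above; the broker address, topic name, key, and payload are hypothetical placeholders, not details from this posting.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // assumed broker address
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // "account-events" is a hypothetical topic name used only for illustration.
    producer.send(new ProducerRecord[String, String]("account-events", "acct-42", """{"balance":100.0}"""))
    producer.flush()
    producer.close()
  }
}
```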
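A sketch of the kind of Kafka Streams topology such a pipeline might use, written against the Java Streams DSL from Scala; the application id, the raw-events and enriched-events topics, and the uppercasing step (a stand-in for a real inline enrichment) are all assumptions.

```scala
import java.util.Properties
import org.apache.kafka.streams.{KafkaStreams, StreamsBuilder, StreamsConfig}
import org.apache.kafka.streams.kstream.KStream

object EnrichmentTopologySketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "enrichment-app") // hypothetical id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
      "org.apache.kafka.common.serialization.Serdes$StringSerde")
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
      "org.apache.kafka.common.serialization.Serdes$StringSerde")

    val builder = new StreamsBuilder()
    val raw: KStream[String, String] = builder.stream("raw-events")
    raw.filter((_, value) => value != null && value.nonEmpty) // drop empty records
      .mapValues(value => value.toUpperCase)                  // stand-in enrichment step
      .to("enriched-events")

    val streams = new KafkaStreams(builder.build(), props)
    streams.start()
    sys.addShutdownHook(streams.close())
  }
}
```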
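Likewise, a sketch of landing CDC changes in a fully transactional (ACID) Hive managed table, here issued over Hive JDBC; it assumes a reachable HiveServer2 with the Hive JDBC driver on the classpath, and the connection URL, table names, columns, and the 'op' delete-flag convention are hypothetical.

```scala
import java.sql.DriverManager

object CdcMergeSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical HiveServer2 endpoint and credentials.
    val conn = DriverManager.getConnection("jdbc:hive2://hive-host:10000/default", "etl_user", "")
    val stmt = conn.createStatement()

    // Fully transactional Hive managed tables require ORC storage and the
    // 'transactional' table property.
    stmt.execute(
      """CREATE TABLE IF NOT EXISTS accounts (
        |  id BIGINT, balance DOUBLE, updated_at TIMESTAMP)
        |STORED AS ORC
        |TBLPROPERTIES ('transactional' = 'true')""".stripMargin)

    // Apply a staged batch of CDC rows (accounts_cdc carries an 'op' flag)
    // with Hive's MERGE, which is supported only on transactional tables.
    stmt.execute(
      """MERGE INTO accounts AS t
        |USING accounts_cdc AS s ON t.id = s.id
        |WHEN MATCHED AND s.op = 'D' THEN DELETE
        |WHEN MATCHED THEN UPDATE SET balance = s.balance, updated_at = s.updated_at
        |WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.balance, s.updated_at)""".stripMargin)

    conn.close()
  }
}
```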
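Finally, a sketch of the sort of Spark tuning the role describes: sizing shuffle partitions to the cluster, broadcasting the small side of a join, and caching only reused results. The input paths, join key, and partition count are placeholder values.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object SparkTuningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("tuning-sketch")
      // Size shuffle parallelism to the cluster instead of the default of 200.
      .config("spark.sql.shuffle.partitions", "64")
      .getOrCreate()

    val events = spark.read.parquet("/data/events")      // large fact data (placeholder path)
    val dims   = spark.read.parquet("/data/dimensions")  // small dimension data (placeholder path)

    // Broadcasting the small side avoids shuffling the large side across the cluster.
    val joined = events.join(broadcast(dims), Seq("dim_id"))

    // Cache only when the result feeds multiple downstream actions.
    joined.cache()
    println(joined.count())

    spark.stop()
  }
}
```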
Required Skills & Experience:
Master's degree or equivalent in Computer and Information Science, Management Information Systems, Engineering (any), or a related field; and
2 years of experience in the job offered or a related IT occupation.
Must include 2 years of experience in each of the following:
Using Java, Scala, Python, and the Apache Kafka and Cloudera architectures, components, and ecosystems to maintain system operations, enhance existing data-processing routines, and develop new methods using the latest offerings;
Creating advanced data pipelines by using Kafka APIs, including producers, consumers, and Kafka Streams;
Utilizing Kafka Connect, Schema Registry, and KSQL to perform inline data enrichments and calculations and to build near-real-time data products;
Developing objects and metadata for integration of CDC data using fully transactional Hive managed tables;
Performing troubleshooting and performance tuning on the Cloudera Data Platform; and
Performing Python and Spark development with an emphasis on Spark performance tuning and advanced parallel-processing Spark applications.
If interested, apply online or email your resume, and reference the job title of the role and the requisition number.
Shift:
1st shift (United States of America)