As an Unstructured Data Engineering Lead, you will have the opportunity to tap into your curiosity and collaborate with some of the most innovative and diverse people around the world. Here, you will make an impact by:
- Leading the development of pipelines for preprocessing unstructured data, eliminating duplicate data and text noise, chunking data, and generating vector embeddings.
- Implementing efficient and scalable solutions using Python programming skills and cloud engineering expertise to handle unstructured data effectively.
- Determining the best approaches and techniques for data preprocessing tasks, driving innovation and efficiency in handling unstructured data.
- Supporting the team by providing guidance, mentorship, and technical expertise in data engineering, particularly in the context of unstructured data.
By taking on this role, you will play a crucial part in driving the success of our organization's unstructured data initiatives and contribute to the advancement of data engineering practices.
Key Responsibilities:
- Lead the design and development of data pipelines for processing and analyzing unstructured data.
- Preprocess unstructured data to eliminate duplicate data, text noise, and other irrelevant information.
- Chunk large volumes of unstructured data into manageable segments for efficient processing.
- Develop and implement algorithms and techniques for generating vector embeddings from unstructured data.
- Collaborate with data scientists, machine learning engineers, and domain experts to define data requirements and objectives.
- Optimize data preprocessing and embedding generation pipelines for scalability and performance.
- Leverage strong Python programming skills to develop efficient and reliable data engineering solutions.
- Utilize cloud engineering expertise to design and implement scalable and cost-effective data processing architectures.
- Explore and leverage open source software and tools to drive innovation and efficiency in handling unstructured data.
- Stay up-to-date with the latest advancements in data engineering and unstructured data processing techniques.
- Mentor and guide junior engineers, fostering a collaborative and innovative team environment.
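For candidates who want a concrete picture of the preprocessing work listed above, the following is a minimal Python sketch of noise removal, duplicate elimination, and chunking. It is illustrative only and does not describe 3M's actual pipeline: the function names, the exact-match hashing strategy, and the word-based chunk sizes are all assumptions. Embedding generation is left out because this posting does not name a model or library.

```python
import hashlib
import re


def clean_text(text: str) -> str:
    """Replace control characters with spaces and collapse whitespace runs."""
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(docs: list[str]) -> list[str]:
    """Drop empty documents and exact duplicates, keeping first occurrences.

    Assumption: two documents count as duplicates when their cleaned,
    lowercased text is identical. Near-duplicate detection is out of scope.
    """
    seen: set[str] = set()
    unique: list[str] = []
    for doc in docs:
        cleaned = clean_text(doc)
        if not cleaned:
            continue
        digest = hashlib.sha256(cleaned.lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(cleaned)
    return unique


def chunk_words(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into word-based chunks of at most `size` words.

    Consecutive chunks share `overlap` words. The defaults are hypothetical.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

In a sketch like this, each chunk returned by chunk_words would then be passed to whichever embedding model the team selects.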
To set you up for success in this role from day one, 3M requires (at a minimum) the following qualifications:
- Bachelor's degree or higher (completed and verified prior to start) in Computer Science or Engineering.
- Three (3) years of experience in unstructured data engineering at a large manufacturing company in a private, public, government, or military environment.
- Three (3) years of experience as a data engineer, with expertise in handling unstructured data.
Additional qualifications that could help you succeed even further in this role include:
- Master’s degree in Computer Science, Engineering, or a related field from an accredited institution.
- Strong understanding of data engineering concepts and best practices.
- Proficiency in Python programming, with the ability to develop efficient and reliable data engineering solutions.
- Expertise in cloud engineering, with experience in designing and implementing scalable and cost-effective data processing architectures.
- Familiarity with open source software and tools for data engineering and unstructured data processing.
- Experience with data preprocessing techniques, including duplicate elimination, noise removal, and chunking.
- Knowledge of algorithms and methods for generating vector embeddings from unstructured data.
- Knowledge of distributed computing frameworks, such as Apache Spark or Hadoop.
- Strong analytical and problem-solving skills, with the ability to optimize data processing pipelines.
- Excellent communication and collaboration abilities, with the capacity to work effectively in cross-functional teams.
- Ability to adapt to a fast-paced and dynamic environment.
Work location:
- Hybrid Eligible (Job Duties allow for some remote work but require travel to Maplewood, MN at least 2 days per week)
Please note: your application may not be considered if you do not provide your education and work history, either by: 1) uploading a resume, or 2) entering the information into the application fields directly.
Please access the linked document, select the country where you are applying for employment, and review the terms. Before submitting your application, you will be asked to confirm your agreement with the terms.