Job Description*
An expert who envisions emerging technology trends, researches them, and keeps track of new software & tools that can help build the Data Science platform.
Responsibilities*
- Lead the architecture and design for building scalable, resilient, and secure distributed applications, ensuring compliance with organizational technology guidelines, security standards, and industry best practices such as the 12-Factor App principles and Well-Architected Framework guidelines.
- Actively contribute to hands-on coding, building core components, APIs and microservices while ensuring high code quality, maintainability, and performance.
- Ensure adherence to engineering excellence standards and compliance with key organizational metrics such as code quality, test coverage and defect rates.
- Integrate secure development practices, including data encryption, secure authentication, and vulnerability management into the application lifecycle.
- Adopt and align development practices with CI/CD best practices to enable efficient build and deployment of applications on target platforms such as VMs and/or container orchestration platforms like Kubernetes and OpenShift.
- Collaborate with stakeholders to align technical solutions with business requirements, driving informed decision-making and effective communication across teams.
- Mentor team members, advocate best practices, and promote a culture of continuous improvement and innovation in engineering processes.
- Develop efficient utilities, automation frameworks, and data science platforms that can be utilized across multiple Data Science teams.
- Propose and build a variety of efficient data pipelines to support ML model building & deployment.
- Propose and build automated deployment pipelines to enable a self-service continuous deployment process for the Data Science teams.
- Analyze, understand, execute, and resolve issues in user scripts, models, and code.
- Perform release and upgrade activities as required.
- Well versed in open-source technology and aware of emerging third-party technologies & tools in the AI/ML space.
- Ability to firefight, propose fixes, and guide the team through day-to-day production issues.
- Ability to train partner Data Science teams on the frameworks and platform.
- Flexible with time and shifts to support project requirements; this doesn’t include any night shifts.
- This position doesn’t include any L1 or L2 (first/second line of support) responsibilities.
Education*
- Graduation / Post Graduation: BE/B.Tech/MCA/M.Tech
- Certifications (If Any): Generative AI, Data Science & NLP
Experience Range*
Foundational Skills*
- Experience with Data Science, Artificial Intelligence, and Machine Learning tools and technologies (Python, R, H2O, Spark, SparkML)
- Strong knowledge of cloud platform technologies; experience with at least one major cloud platform such as AWS, Azure, or GCP is good to have.
- Extensive hands-on experience in designing, developing, and maintaining software frameworks using Python, Spark, and shell scripts.
- Strong knowledge of DevOps practices, CI/CD technologies, and container technologies and platforms such as Docker, Podman, and Kubernetes/OpenShift.
Desired Skills*
- Experience building and supporting E2E Data Science (using AI and ML) and Advanced Analytics platforms for model development, building & deployment.
- Exposure to Driverless AI and virtual environment/kernel management will be a distinct added advantage.
- Extensive hands-on experience supporting platforms that allow modelers and analysts to go through complete model lifecycle management (data munging, model development/training, governance, deployment).
- Experience with model deployment, scoring, and monitoring for batch and real-time workloads across a variety of technologies and platforms.
- Experience with Hadoop clusters and integration, including ETL, streaming, and API styles of integration.
- Experience in deployment automation using Ansible playbooks and scripting.
- Experience designing, building, and deploying streaming and batch data pipelines capable of processing and storing large volumes of data (terabytes) quickly and reliably using Kafka, Spark, and YARN.
- Experience designing and building full-stack solutions utilizing distributed computing or multi-node architectures for large datasets (terabyte to petabyte scale).
- Experience with processing and deployment technologies such as YARN, Kubernetes/containers, and serverless compute for model development and training.
- Hands-on experience working in a cloud platform (AWS/Azure/GCP) to support Data Science workloads.
- Effective communication and strong stakeholder engagement skills, with a proven ability to lead and mentor a team of software engineers in a dynamic environment.
Work Timings*