Job Description
The Opportunity
- Based in Hyderabad, join a global healthcare biopharma company and be part of a 130-year legacy of success backed by ethical integrity, forward momentum, and an inspiring mission to achieve new milestones in global healthcare.
- Be part of an organization driven by digital technology and data-backed approaches that support a diversified portfolio of prescription medicines, vaccines, and animal health products.
- Drive innovation and execution excellence. Join a team passionate about using data, analytics, and insights to inform decision-making and build custom software that helps tackle some of the world's greatest health threats.
Role Overview
- Design, develop, and maintain data pipelines to extract data from various sources and populate a data lake and data warehouse.
- Work closely with data scientists, analysts, and business teams to understand data requirements and deliver solutions aligned with business goals.
- Build and maintain platforms that support data ingestion, transformation, and orchestration across various data sources, both internal and external.
- Use data orchestration, logging, and monitoring tools to build resilient pipelines.
- Automate data flows and pipeline monitoring to ensure scalability, performance, and resilience of the platform.
- Monitor, troubleshoot, and resolve issues related to the data integration platform, ensuring uptime and reliability.
- Maintain thorough documentation for integration processes, configurations, and code to ensure easy onboarding for new team members and future scalability.
- Develop pipelines to ingest data into cloud data warehouses (an illustrative sketch of such a pipeline follows this list).
- Establish, modify, and maintain data structures and associated components.
- Create and deliver standard reports that meet stakeholder needs and conform to agreed standards.
- Work within a matrix organizational structure, reporting to both the functional manager and the project manager.
- Participate in project planning, execution, and delivery, ensuring alignment with both functional and project goals.
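For illustration only, the sketch below shows the kind of orchestrated extract-transform-load pipeline described above, written as a minimal Apache Airflow DAG (Airflow 2.4+). The DAG name, schedule, and data shapes are hypothetical and are not part of this posting.

```python
# Illustrative sketch only: a minimal Airflow DAG for a daily ingest-and-load flow.
# All names (DAG id, fields, targets) are hypothetical examples.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["example"])
def daily_sales_ingest():
    @task
    def extract() -> list[dict]:
        # A real pipeline would pull from an API, a database, or an S3 prefix.
        return [{"order_id": 1, "amount": 125.0}, {"order_id": 2, "amount": 80.5}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Apply lightweight cleansing and typing before loading.
        return [{**r, "amount": round(float(r["amount"]), 2)} for r in rows]

    @task
    def load(rows: list[dict]) -> None:
        # A real task would write to the warehouse via the relevant provider hook.
        print(f"Loaded {len(rows)} rows into the warehouse staging table")

    load(transform(extract()))


daily_sales_ingest()
```

In production the load step would use a warehouse connection (for example, a Redshift or Databricks provider hook) rather than a print statement, and the DAG would carry logging and alerting to support the monitoring responsibilities listed above.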
What should you have?
Foundational Data Concepts
- SQL (Intermediate / Advanced)
- Python (Intermediate)
Cloud Fundamentals (AWS Focus)
- AWS Console, IAM roles, regions, and core cloud computing concepts
- AWS S3
Data Processing & Transformation
- Apache Spark (Concepts & Usage); an illustrative sketch follows this list
- Databricks (Platform Usage), Unity Catalog, Delta Lake
ETL & Orchestration
- AWS Glue (ETL, Data Catalog), AWS Lambda
- Apache Airflow (DAGs and orchestration) or another orchestration tool
- dbt (Data Build Tool)
- Matillion (or a similar ETL tool)
Data Storage & Querying
- Amazon Redshift / Azure Synapse
- Trino or equivalent
- AWS Athena / query federation
Data Quality & Governance
- Data quality concepts and implementation
- Data observability concepts
- Collibra or an equivalent tool
Real-time / Streaming
- Apache Kafka (Concepts & Usage)
DevOps & Automation
- CI/CD concepts and pipelines (GitHub Actions / Jenkins / Azure DevOps)
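As a purely illustrative companion to the Spark and data quality items above, the sketch below shows a small PySpark cleanse-and-curate step. The S3 paths and column names are invented for the example, and it assumes a configured Spark environment (for instance, Databricks or AWS Glue).

```python
# Illustrative sketch only: basic Spark cleansing of a raw landing-zone dataset.
# Paths and columns are hypothetical examples, not taken from the posting.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_cleanup").getOrCreate()

# Read raw data from a landing zone (e.g., an S3 prefix).
orders = spark.read.json("s3://example-bucket/landing/orders/")

clean = (
    orders
    .dropDuplicates(["order_id"])                     # de-duplicate on the business key
    .filter(F.col("amount") > 0)                      # simple data-quality filter
    .withColumn("order_date", F.to_date("order_ts"))  # derive a partition column
)

# Write curated data as a partitioned table (Delta on Databricks, Parquet elsewhere).
clean.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/orders/"
)
```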
What we look for:
*A job posting is effective until 11:59:59 PM on the day BEFORE the listed job posting end date. Please ensure you apply to a job posting no later than the day BEFORE the job posting end date.