Design and develop overall data architecture to ensure the effective storage, retrieval, and analysis of large-scale data sets
Build and maintain scalable data pipelines for ingesting, processing, and transforming data from various sources into our data warehouse or data lake. Ensure data quality and data consistency throughout the pipeline
Optimize data storage, retrieval, and processing performance through indexing, partitioning, and caching techniques
Evaluate and select appropriate technologies, tools, and platforms for data processing
Improve the availability and reliability of data streaming pipelines
Evaluate the scalability of data architecture to accommodate future growth and changing business needs
Develop build and deployment automation for microservices using CI/CD
Essential Requirements:
Bachelor's degree in Computer Science, Engineering, or a related field
8+ years of experience developing backend services with Java, Scala, or Python
Strong knowledge of modern data stores, including traditional data warehouses and data lakes
Experience designing microservice-based solutions that handle large-scale datasets
Experience automating operational tasks through development and coding
Hands-on experience using Maven, Jenkins, Git, and JUnit
5+ years of working experience with cloud services on AWS or another cloud provider
Excellent communication skills
Preferred Requirements:
Knowledge of Spark (Java, Scala)
Knowledge of Databricks
Hands-on experience with Docker and Kubernetes
Experience working in cloud environments: AWS and/or Azure
Familiarity with performance monitoring using Datadog