Keep our customers’ data separated and secure to meet compliance and regulatory requirements.
Design, build, and operate the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources, using SQL, cloud (mainly AWS) migration tooling, and ‘big data’ technologies.
Optimize various RDBMS engines in the cloud and solve customers' security, performance, and operational problems.
Design, build, and operate large, complex data lakes that meet functional and non-functional business requirements.
Optimize the ingestion, storage, processing, and retrieval of diverse data types, from near-real-time events and IoT telemetry to unstructured data such as images, audio, video, and documents, and everything in between.
Use Jupyter Notebooks to build and deploy ML models.
Leverage AWS AI/ML pre-built solutions to accelerate work for customers.
Work with customers and internal stakeholders, including the Executive, Product, Data, Software Development, and Design teams, to assist with data-related technical issues and support their data infrastructure and business needs.
Requirements
Summary of Key Requirements
We seek a candidate with 3+ years of experience in a Data Scientist/Machine Learning Engineer role who holds a Bachelor's degree (graduate degree preferred) in Computer Science, Mathematics, Informatics, Information Systems, or another quantitative field. They should also have experience with the following software and tools:
Experience with big data tools: Spark, Elasticsearch, Hadoop, Kafka, Kinesis, etc.
Experience with relational SQL and NoSQL databases, such as MySQL or Postgres, and DynamoDB or Cassandra.
Experience with AWS cloud services: EC2, RDS, EMR, Redshift etc.
Experience with object-oriented, functional, and scripting languages: Python, Java, Scala, etc.
Experience with various ML models for classification, scoring, and related tasks.
Experience with deep learning neural networks (convolutional networks, NLP, etc.)
Experience with AWS AI/ML Services
Experience with Python coding
Advanced working knowledge of SQL, including experience with relational databases and query authoring, as well as familiarity with a variety of database systems.
Experience building and optimizing ‘big data’ data pipelines, architectures, and data sets.
Strong analytical skills related to working with unstructured datasets.
Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
Experience supporting and working with external customers in a dynamic environment.