As a Technology Support Lead in Corporate Technology Infrastructure Management, you will play a leadership role in ensuring the operational stability, availability, and performance of our production services. Critical thinking will be key as you oversee day-to-day maintenance of the firm’s systems, identifying, troubleshooting, and resolving issues to ensure a seamless user experience.
Job responsibilities
- Provide cutting-edge AI/ML platform solutions using LLMs, public cloud, and modern standards and patterns.
- Provide solutions using AWS cloud services for compute, storage, databases, and security, as well as Azure services.
- Develop solutions using Generative AI Models
- Work closely with the Product team to design, build and deliver capabilities in agile sprints.
- Collaborate with cross-functional teams, including data scientists, software engineers, and designers, to integrate generative AI into various applications and products.
- Develop and implement state-of-the-art generative AI services leveraging Azure OpenAI models and the AWS Bedrock service.
- Lead teams of technologists that provide end-to-end application or infrastructure service delivery for the successful business operations of the firm
- Execute policies and procedures that ensure operational stability and availability
- Monitor production environments for anomalies, address issues, and drive wider and more effective use of standard observability tools
- Escalate and communicate issues and solutions to the business and technology stakeholders, actively participating from incident resolution to service restoration
- Lead incident, problem, and change management in support of full stack technology systems, applications, or infrastructure
Required qualifications, capabilities, and skills
- 5+ years of experience or equivalent expertise troubleshooting, resolving, and maintaining information technology services
- Hands-on experience developing AWS cloud-based applications using EC2, EKS, Lambda, SQS, SNS, RDS Aurora MySQL and Postgres, DynamoDB, and Kinesis.
- Strong understanding of AWS networking, security, IAM roles, monitoring and application debugging
- Deep expertise across application, data, security, and infrastructure disciplines
- Experience setting up public cloud infrastructure using Terraform.
- Experience working with containerized services on Kubernetes or ECS
- Experience with Python, Java, and REST APIs.
- Solid understanding of how to diagnose and resolve backend performance bottlenecks.
- Experience with application production readiness, production monitoring, and production issue triaging
- Knowledge of industry-standard software best practices, development lifecycle processes, Agile tools, methodologies, and security best practices.
- Strong engineering background in deploying and scaling ML and AI models on public cloud services.
- Proficiency in programming languages such as Python and Java
Preferred qualifications, capabilities, and skills
- Knowledge of AWS SageMaker and data analytics tools
- Ability to adapt to new technologies and learn quickly in a fast-paced environment.
- Experience with Single Sign-On/OIDC integration, a deep understanding of OAuth and JWT/JWE/JWS, and IDAnywhere
- Knowledge of or experience working with Azure is a plus