Elevate your career as a Technology Support III team member in Commercial & Investment Bank, where you will drive innovation and modernize cloud infrastructure for a future-ready financial ecosystem. You will play a vital role in ensuring the operational stability, availability, and performance of our production application flows. Your expertise will be crucial in maintaining both internally and externally developed systems and in troubleshooting and resolving production service interruptions across them, ensuring a seamless user experience and fostering a culture of continuous improvement. In this dynamic role, you will have the opportunity to analyze, optimize, automate, and modernize the cloud infrastructure that hosts our applications, driving innovation and efficiency in our technological landscape.
Job Responsibilities:
- Treat operations as software problems: resolve system failures, minimize client impact, and ensure all systems function seamlessly to meet business needs.
- Analyze and troubleshoot production application flows to ensure end-to-end application or infrastructure service delivery supporting the business operations of the firm.
- Monitor production environments for anomalies and address issues utilizing standard observability tools.
- Enable proactive event management through observability, and develop automation to minimize toil.
- Work closely with Software Engineering, Product teams, and other stakeholders to understand issues and to drive resolutions and stability enhancements that improve the operational stability, performance, and availability of the applications.
- Create, continuously improve, and maintain real-time monitoring dashboards to ensure system operability and performance.
- Identify trends and assist in the management of incidents, problems, and changes in support of full-stack technology systems, applications, or infrastructure.
- Participate in public cloud infrastructure management and cost optimization.
- Ensure resiliency of the platform through managed Disaster Recovery (DR), Site Reliability (SR), and High Availability (HA) events.
Required Qualifications, Capabilities, and Skills:
- 6+ years of experience or equivalent expertise troubleshooting, resolving, and maintaining information technology applications, infrastructure, and services.
- Experience supporting public/private cloud-based applications.
- Knowledge of automating daily support operations using scripting and tooling.
- Experience with one or more observability and monitoring tools, such as Datadog, Dynatrace, Splunk, or ITRS Geneos.
- Knowledge of one or more general-purpose programming languages or automation scripting (Python, UNIX shell scripting, etc.).
- Experience with one or more major cloud services (AWS, Azure or GCP) and Infrastructure as Code tooling.
- Understanding of networking concepts and troubleshooting.
- Knowledge of source code management tools such as Git and Bitbucket, and CI/CD tools such as Jenkins.
- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals.
- A continuous improvement mindset.
Preferred Qualifications, Capabilities, and Skills:
- Advanced certification in Cloud technologies and Infrastructure as Code tooling.
- Familiarity with ITIL support methodologies and concepts.