Own and provide L1/L2 production management for mission-critical and GSP applications. This includes resolving application issues quickly and driving stability, efficiency, and effectiveness improvements to help us and the business succeed.
Maintain production application systems that have completed the development stage and are running in the daily operations of the firm.
Work closely with the development, infrastructure, QA, and business support teams, and with the GSP business, to determine strategy and priorities and to ensure that the team is meeting the business's requirements. Ensure the team delivers on these priorities and communicates progress effectively to all stakeholders.
Develop technical solutions that streamline and automate existing processes and provide smart monitoring.
Perform appropriate troubleshooting before handing an issue to another team, and pass it on in the previously agreed-upon format with logs, etc.
Own, coordinate, communicate, and execute disaster recovery testing and application production releases for your applications.
Analyze applications to identify risks, vulnerabilities, and performance and stability issues before they occur, and coordinate with other teams to ensure they are addressed.
Create and maintain a knowledge base to ensure that knowledge transfer takes place within the team.
Develop a comprehensive understanding of how applications collectively integrate to contribute to achieving business goals. Liaise with business support teams and application development groups.
Provide technical support coverage for all GSP applications, actively monitoring the applications and the technical platform SLAs and KPIs.
This individual should have detailed knowledge of the applications and their upstream and downstream dependencies. They should understand the application processes and database schema at an expert level.
When dealing with major issues, the group would be expected to make key technical recommendations based on their knowledge of the systems and the process flows involved.
Acts as mentor or coach to newer or more junior analysts.
The candidate will spend a portion of their time on development tasks such as new features, scaling, recovery, and automation of manual tasks, continuous integration, and continuous delivery.
Champion stability initiatives that enable high application availability for Business-As-Usual, including better monitoring, failover, and resiliency.
The candidate will partner closely with each area to perform diagnostic and forensic investigations of outages caused by scale issues in applications.
Performs controlled resolution of incidents and problems, including prioritization, escalation to relevant groups when appropriate, and root cause analysis of all problems with follow-through to resolution.
The successful candidate will have the technical skills and aptitude to bring the latest technology ideas to fruition and support an excellent client experience.
Engage Product/Technology Owners to establish business service level objectives and service level indicators to measure performance, then identify hotspots and the architectural refinements required.
The candidate will also be responsible for providing best-in-class production availability, resiliency, and predictability to the GSP business and trading functions by improving application logging for better proactive monitoring, designing fault-tolerant solutions, and building always-on systems for high availability.
Deep-dive into current production incidents, understand current design and architectural issues, and develop innovative technical tooling to improve production stability, enable faster recovery, and reduce toil.
Ability to handle incidents, problems and change at a global enterprise level. Calm and analytical when faced with major incidents on critical systems.
Exhibits sound and comprehensive communication and diplomacy skills to exchange complex information with inherent confidence with operations and technology partners on a regional or global basis.
Qualifications:
Demonstrated experience in an Application Support, Production Management, or related role would be preferred
Experience installing, configuring, or supporting business applications.
Experience tracking issues through tools such as JIRA and ServiceNow
Good all-around technical skills/background
Experience with some of the below technologies is required:
UNIX (AIX/Linux) environment
Databases - Oracle, Microsoft SQL Server
Scripting languages - Python, Shell, Java/JavaScript, Perl
Demonstrated analytical skills
Advanced execution capabilities and ability to adjust quickly to changes and re-prioritization
Ability to plan, organize and prioritize workload.
Consistently demonstrates clear and concise written and verbal communication skills
Ability to communicate appropriately with relevant stakeholders
Ability to share information effectively with other support team members and other technology teams, and to work with colleagues in an all-around team environment
Willingness to learn, self-motivated
Experience in, or knowledge/understanding of, the fixed income business is preferred
Experience in supporting Sales and Trading, Risk and Trade Management Platforms or similar systems.
Exposure to or high-level understanding of cloud computing platforms (AWS, Azure)
Experience with ITRS Geneos, AppDynamics, or alternative APM/monitoring tools
Knowledge/experience of incident, change, and problem management procedures
Knowledge of Kubernetes, OpenShift, container management
Configuration automation using Ansible
Ability to engage a large audience and lead the discussion with clear, articulate, and highly assertive communication
Experience with log aggregation tools such as ELK/Splunk
Experience with dashboard/reporting tools such as Grafana, Prometheus, or Tableau is nice to have
Previous experience of Site Reliability Engineering concepts
Applications Support | Full time | New York, New York, United States | $142,320.00 - $213,480.00