Location
Dubai (Onsite), United Arab Emirates
About
We are seeking a highly experienced Senior Production Lead with over 15 years of hands-on experience in managing production support for high-volume, customer-facing applications within the banking and financial services domain. This role is accountable for the end-to-end management of production incidents, service requests, and escalations, ensuring timely resolution, effective communication, and minimal business impact. The Senior Production Lead will oversee a team of production support engineers and analysts, driving operational excellence through incident management, root cause analysis, automation, and continuous improvement.
Job Description
Must-have Skills: Incident Management, Java (Expert), Cloud, Production Support, Stakeholder & Vendor Management
Knowledge & Experience:
- Own and manage end-to-end production support for digital applications, ensuring timely resolution of incidents and service requests.
- Lead the production support team, assigning tickets, overseeing issue analysis, and coordinating fixes and deployments.
- Ensure production go-live handoffs are seamless, with proper documentation, readiness checks, and support coverage.
Incident & Problem Management
- Act as the primary escalation point for major incidents, SLA breaches, and recurring issues.
- Lead incident triage, root cause analysis (RCA), and post-incident reviews to drive long-term resolution.
- Maintain clear and timely communication with IT management, business users, and stakeholders during incidents.
Monitoring & Health Checks
- Proactively monitor applications to pre-empt issues, minimize downtime, and ensure performance against SLAs.
- Implement and maintain health check routines, dashboards, and alerting mechanisms for critical systems.
Automation & SOP Development
- Identify and automate manual, repetitive tasks to improve operational efficiency.
- Develop and maintain Standard Operating Procedures (SOPs) for production support activities and incident handling.
Technical Troubleshooting & Consultation
- Provide hands-on troubleshooting for Java backend systems, web and mobile technologies, and various frameworks.
- Understand and resolve issues related to certificate exchanges, API integrations, cloud infrastructure, and system dependencies.
- Offer technical consultation to support teams and contribute to solution design reviews when needed.
Cloud & Infrastructure Coordination
- Leverage hands-on experience with Azure (primary) and AWS (secondary) cloud platforms for troubleshooting and deployment support.
- Coordinate with network, firewall, SecOps, and infrastructure teams to resolve cross-domain issues.
Stakeholder & Vendor Management
- Maintain strong relationships with business users, IT teams, and external vendors, ensuring alignment and timely resolution.
- Review vendor fixes and designs, and coordinate production deployments with third-party teams.
Communication & Reporting
- Own communication for major incidents, including status updates, impact assessments, and resolution timelines.
- Prepare and deliver incident reports, RCA documentation, and performance dashboards to senior stakeholders.
Team Leadership & Mentorship
- Mentor and coach production leads and support engineers, fostering a culture of ownership and continuous improvement.
- Ensure team members are equipped with the right tools, knowledge, and support to handle production issues effectively.
Cross-Functional Collaboration
- Work closely with application development teams, infrastructure teams, and business units to ensure smooth operations and issue resolution.
- Participate in project planning and go-live readiness to ensure production support requirements are met.
Compliance & Policy Adherence
- Ensure all production support activities conform to IT and bank policies, procedures, and security standards.
- Support Business Resumption Plan (BRP) testing and other compliance-related activities as required.
Operational Continuity & Coverage
- Provide coverage during team absences and be prepared for temporary reassignment or promotion when business needs require it.
- Understand the roles of peers and superiors well enough to ensure continuity of operations during emergencies or transitions.
Skills
- Strong troubleshooting skills in Java backend systems
- Understanding of web and mobile technologies and various frameworks
- Hands-on experience with Azure Cloud (primary) and AWS Cloud (secondary)
- Knowledge of certificate management, including exchange and renewal processes
- Familiarity with API integrations, system interfaces, and backend services
- Ability to analyze and resolve complex production issues across distributed systems
Qualifications
The ideal candidate will be a problem solver and strategic thinker, capable of challenging build teams and architects, coordinating across multiple technical domains (network, security, firewall, application teams), and ensuring seamless production handoffs during go-lives. A deep understanding of certificates, integrations, system dependencies, and cross-functional collaboration is essential.