Provide 24/7 Level 1 (L1) support for autonomous event management and alert monitoring across multiple network domains, including Fixed Transport, Mobile (CS/PS) Core, Fixed Core, and Radio Access Network, as well as the company's enterprise IT infrastructure and critical business IT applications. The role is responsible for continuously observing AI-generated alerts, performing initial triage, escalating incidents according to defined procedures, and ensuring a timely response to operational events. The engineer works within a structured SOC environment to maintain service continuity and support incident resolution using AI-driven tools and dashboards.
Key Accountabilities
- Perform first-level analysis of alerts, validate severity, and escalate incidents to appropriate teams following predefined SOPs and SLAs.
- Use AI-powered dashboards (e.g., Grafana, Kibana) to track system health, visualize alerts, and ensure real-time visibility of operational events.
- Operate and interact with AI agents and automation platforms to support event classification, incident creation, and information sharing.
- Monitor and escalate all actionable fault alarms using AI tools to improve Trouble Ticket SLAs.
- Execute basic diagnostic scripts (Python/ML-based) to support fault validation and reduce false positives during or after incident handling.
- Log all monitored alerts, actions taken, and escalations in Trouble Ticketing systems with clear and accurate details.
- Assist in identifying repetitive alert patterns and contribute to automation initiatives aimed at reducing manual intervention.
- Participate in disaster recovery and business continuity activities by ensuring timely alert detection and escalation during critical events.
- Support SOC operations by working in shifts as part of the 24/7 duty roster, ensuring uninterrupted monitoring and response coverage.
Requirements
- Bachelor's degree in computer networks, data science, artificial intelligence, or a relevant field.
- Basic understanding of mobile network architecture (e.g., LTE, 5G, GSM, VoLTE, etc.).
- Familiarity with IT infrastructure components: servers, storage, networking, and cloud platforms.
- Certifications in Python for networks, AI, machine learning, and data science are a plus.
- Basic knowledge of AI/ML algorithms for anomaly detection and predictive alerting.
- Experience or knowledge in training machine learning models in AI/ML/DS projects.
- Familiarity with Event Handling (alert monitoring workflows) and escalation protocols.
- Exposure to application monitoring tools.
- Basic knowledge of alarm classification and unsupervised machine learning algorithms.
- Willingness to work in shifts or on-call rotations.
- Strong skills in complex problem solving, inter- and intra-team collaboration, and communication, with initiative and a commitment to achieving results.
Competencies
- Think strategically (Level 2 of 5)
- Achieve tangible results (Level 2 of 5)
- Lead breakthrough change (Level 2 of 5)
- Exceed customer expectations (Level 2 of 5)
- Nurture, inspire and motivate (Level 2 of 5)
- Target win-win outcomes (Level 2 of 5)
Location
United Arab Emirates