- Experience
- 2+ years
- Salary
- —
- Openings
- 1
- Posted
- 5 hours ago
- Work mode
- On-site
- Education
- Diploma or Degree in Electrical Engineering, Mechanical Engineering, Facilities Engineering, or related discipline
- Eligibility
- Open to candidates with at least 2 years of experience in 24x7 facilities, network operations, command center, critical environment, or data center operations. Applicants must be ready for rotating shifts and on-site work at client data center facilities.
- Resume
- Required to apply
Job description
Role overview
The Incident Response Analyst II is responsible for managing the full lifecycle of facilities-related incidents in a 24x7 data center operations setting. The role covers detection, triage, escalation, coordination, documentation, and follow-up actions to help maintain uptime, stability, and safe operations across critical infrastructure.
Incident and event management
- Review alarms and unusual operating conditions, then acknowledge and respond promptly.
- Act as the primary point of response for facility events by using monitoring and automation tools.
- Judge incident severity and business impact, then choose the correct escalation route.
- Run or support incident bridges and coordinate communication during serious events.
- Take ownership of major facility incidents as the incident coordinator.
- Keep incident logs current, including timelines, actions taken, and ticket updates.
- Work closely with site operations, vendors, engineering teams, and management stakeholders.
- Perform early-stage root cause analysis and flag repeating problems.
- Contribute to operational improvements and lessons learned activities.
- Work in line with SOPs, MOPs, EOPs, runbooks, and playbooks.
- Follow a rotating shift schedule in a 24x7 operation.
Facilities monitoring and alarm operations
- Track alerts from BMS, DCIM, and EPMS platforms.
- Monitor alarms related to utility power, UPS units, battery systems, generators, ATS equipment, PDUs, HVAC and cooling assets, CRAH and CRAC units, chilled water systems, environmental sensors, leak detection, and fire protection systems.
- Classify and acknowledge alarms, then evaluate their urgency and impact.
- Escalate issues to technicians, facilities engineers, or management based on procedure.
- Follow incidents through to closure while keeping stakeholders informed.
- Record all alarm activity accurately in ticketing tools.
- Use SOPs, MOPs, EOPs, runbooks, and playbooks while performing daily duties.
- Monitor CCTV systems and review footage when needed to verify events or assist investigations.
- Document incident reports and event logs carefully.
- Use physical security tools such as Lenel, Genetec, and Avigilon when applicable.
Critical event and emergency response
- Help coordinate response efforts during utility interruptions, equipment failures, environmental alarms, and other emergency situations.
- Maintain communication bridges and share status updates with relevant stakeholders.
- Support emergency response for fire alarms, generator activity, cooling issues, and evacuation events.
- Coordinate with vendors, facilities engineers, and local site teams to speed up recovery.
- Capture event timelines, response actions, and lessons learned.
- Take part in emergency drills and business continuity exercises.
- Adhere to emergency operating procedures at all times.
Reporting and continuous improvement
- Prepare incident summaries and shift handover reports.
- Maintain accurate records for alarms, escalations, and corrective actions.
- Assist with identifying trends and recurring incidents.
- Participate in root cause analysis and post-incident reviews.
- Suggest updates to procedures, runbooks, and escalation workflows.
- Support KPI and SLA reporting.
- Contribute to continuous improvement and operational excellence programs.
Qualifications
Experience: At least 2 years in a Facilities Operations Center, Network Operations Center, Command Center, Critical Environment, or another 24x7 operations environment. Experience supporting mission-critical facilities or data center operations is expected.
Technical knowledge: Working familiarity with BMS, DCIM, EPMS, critical power and cooling infrastructure, fire detection and suppression systems, environmental monitoring, CCTV, access control systems, and incident/ticketing platforms.
Soft skills: Strong analytical and troubleshooting ability; the capacity to manage several incidents at once; clear written and verbal communication; composure in high-severity situations; strong teamwork and stakeholder coordination; ability to work independently and collaboratively; and flexibility to work rotating shifts, including nights, weekends, and public holidays.
Work location: This is an on-site position based at client data center facilities.
Preferred profile
A diploma or degree in Electrical Engineering, Mechanical Engineering, Facilities Engineering, or a related field is preferred. Exposure to Schneider Electric EcoStruxure, Vertiv, Siemens, Johnson Controls, or similar BMS/DCIM solutions is an advantage. Familiarity with EPMS platforms, physical security systems such as Lenel, Genetec, and Avigilon, incident management frameworks, and operational best practices is helpful. Certifications such as Schneider Electric Data Center Certified Associate (DCCA), Uptime Institute ATD or ATS, CDCP, BICSI DDCI, or ITIL Foundation are also valued. Experience in hyperscale or colocation data center environments is preferred.
Working conditions
The role operates in a rotating shift model with 24x7 coverage. Candidates should be prepared for night shifts, weekends, and public holidays.