Global Operations Centre Support Specialist
Sustainable Metal Cloud was founded with a vision to move cloud computing towards net zero, with solutions forged through advanced technology.
An NVIDIA CSP partner, we operate a zero-compromise, large-scale GPU cloud service that furthers the adoption of artificial intelligence responsibly and sustainably. Part of the ST Telemedia Global Data Centres group, we're an expert in large-scale, mission-critical infrastructure, with operational excellence in our DNA.
Our technology stack is founded on revolutionary, high-efficiency, energy-saving technology that consumes materially less power when hosting and running GPU-based workloads. Delivering this to the customer as a managed service, we are at the forefront of sustainability in the cloud – enabling large-scale AI to be built with a carbon footprint of up to half that of competing solutions.
- A fast-paced and dynamic environment working with next-gen technology. You’ll be operating at the intersection of sustainability and artificial intelligence – helping to transform an industry.
- Working with, and having access to, colleagues who are true innovators and leaders in their field.
- Real growth and career progression opportunities. We’re still very young, with plenty of room to grow.
- As an emerging company, we work as a close-knit team. Work with the founders, grow a strong network, and witness the impact you make first-hand as we democratise AI tools for everyone – more sustainably, and more affordably.
ROLES AND RESPONSIBILITIES
As a Global Operations Centre (GOC) Support Specialist, your primary responsibility is to support network operations, server hardware and alerts from the data centre facilities. You’ll work in a 24x7 shift-based environment, ensuring timely responses, incident resolution, and adherence to service level agreements (SLAs). Here are the key tasks:
Network Monitoring and Operations:
- Manage tier-1 network monitoring round the clock.
- Monitor the performance and health of data centre facilities, networks, and hardware systems.
- Identify potential issues and troubleshoot technical problems promptly to minimise downtime.
- Log, update, and progress tickets to resolution.
- Collaborate with cross-functional teams to deploy, repair, and maintain data centre infrastructure.
Incident Management and Escalation:
- Follow escalation procedures during critical incidents.
- Act as the point of contact for significant events or technical escalations.
- Coordinate and supervise data centre maintenance efforts, collaborating with internal and external parties.
Documentation and Communication:
- Compile daily monitoring and operational tasks into documents.
- Ensure compliance with physical security procedures, including visitor management and equipment movement.
- Maintain access logs and service reports.
- Track equipment movement in and out of data centres and warehouse facilities.
Quality Assurance and Assessment:
- Ensure subcontractors deliver quality services and meet contractual obligations.
- Conduct routine assessments of data centre infrastructure systems and critical facilities.
- Perform other duties as assigned by line managers.
SKILLS AND EXPERIENCE
- Holds a Diploma or professional certification in Electrical, Mechanical, or Computer Science/Information Technology, or possesses equivalent experience.
- Demonstrates a career path marked by continuous personal development with at least two years of experience in Data Centres or 24/7 critical environment management.
- Able to work shifts, including weekends and public holidays.
- Has a basic understanding of, and experience with, electrical and mechanical systems commonly used in a Data Centre.
- Proficient in network and server hardware troubleshooting.
- Possesses a working knowledge of physical IT infrastructure components.
- Comfortable working on rotating shifts.
- Has working knowledge of, and experience with, monitoring and ticketing tools.
- Demonstrates good problem-solving and communication skills.
- Proven ability to work effectively under direction.