Stanford Research Computing is looking for an experienced facility engineer to join our team. Our staff work directly with some of the world's top researchers in a broad range of disciplines, across Stanford’s seven schools — while also supporting and learning from each other in cross-project endeavors. We maintain and manage several advanced research computing facilities, and we support a variety of environments for Stanford research. In Stanford Research Computing, you’ll have a rare opportunity to contribute to discoveries and inventions that have global reach and positive impact, and to share in the curiosity and commitment of the scholars and scientists who lead these projects.
This position, reporting to the Research Computing Data Center Manager, will provide facility engineering support for data centers under the management of Stanford Research Computing. Research Computing offers High Performance Computing (HPC) hosting services, computational and data systems, services, and support for researchers from a variety of Stanford and SLAC organizations. The majority of the HPC systems are hosted in the Stanford Research Computing Facilities (SRCF1 and SRCF2), both located on the SLAC campus. SRCF2 is a state-of-the-art expansion of the original facility (SRCF1), designed to accommodate high-density computational infrastructure. The Facilities Engineer will assist in the operation of both sites: monitoring temperatures, facilitating deliveries, racking servers and storage devices, troubleshooting and repairing servers, installing network wiring, and installing and securing racks. This position will also provide backup to an existing Facilities Engineer position as needed. It may occasionally be responsible for interfacing with vendors for installations, equipment repairs, and maintenance.
Core Duties
- Ensure ongoing production facility operation to maximize the reliability, dependability, and availability of the facility and the systems hosted therein. Verify that all systems are operating efficiently.
- Monitor and observe the operation of mechanical and electrical systems on a daily basis.
- Perform regular physical inspections of all systems, looking for abnormal noises, fluid leaks, air leakage, etc.
- Inspect PDUs and electrical meters for alarms and for abnormal current and voltage levels.
- Initiate corrective action following standard operating procedures as established by the manager.
- Serve as an expert, providing design and technical recommendations in areas of specialization. This includes interfacing and coordination with and across specialties and disciplines.
- Work with groups or individual researchers from Stanford, Stanford Medicine, and SLAC who have equipment hosted at SRCF2 to ensure that the facility operates safely and that standards are met.
- Respond to and address facility problems as directed by the data center manager. This will require occasional in-person responses outside normal working hours.
- Identify and resolve complex technical systems issues that may have high risk/consequences of failure.
- Review and analyze projects’ technical plans and specifications for conformance with design guidelines and other related criteria, and pertinent codes and standards
- Serve as a technical lead on assigned projects in areas of expertise across multiple contractor activities, through all project phases.
- Conduct risk assessments and develop contingency plans
Minimum Requirements
Education and Experience
- Bachelor’s degree in a relevant discipline and ten years of relevant work experience, or a combination of education and relevant experience.
Knowledge, Skills, and Abilities
The following are required for this role:
- Strong interpersonal skills
- Demonstrated breadth of understanding of electrical and mechanical systems found in a data center environment, such as power distribution systems, UPS, variable frequency drives, generators, air handlers, chillers, PLCs, etc.
- Knowledge and proficiency with engineering software related to assignment, specifically Building Control and Monitoring Systems
- Deep technical knowledge in area of expertise
- Strong customer service focus
- Excellent project management skills
- Excellent written and oral communication skills
- General use of hand tools (e.g., electrical meters)
- Demonstrated proficiency using office productivity tools, including Google Docs and Sheets, Zoom, and Slack
- Working knowledge of computer systems
- Ability to identify and mitigate potential project risk components
Highly desirable knowledge, skills and abilities include:
- Data center certifications such as Certified Data Center Professional (CDCP), Certified Data Center Management Professional (CDCM), Certified Data Center Technician Professional (CDCTP), or Data Center Energy Practitioner (DCEP).
- Understanding of the basics of high performance computing and associated networking features and functions.
Certifications and Licenses:
- Data center operation certification is highly desired; see above.
Physical Requirements
- Occasionally sit, perform desk-based computer tasks, lift/carry/push/pull objects that weigh up to 10 pounds.
- Routinely stand/walk, twist/bend/stoop/squat, grasp lightly/fine manipulation, use a telephone, lift/carry/push/pull objects that weigh from 11 to 60 pounds.
- Frequently kneel/crawl, climb (ladders, scaffolds, or other), reach/work above shoulders, grasp forcefully, write by hand, sort/file paperwork or parts, lift/carry/push/pull objects that weigh >40 pounds.
Working Conditions
- Requires 24-hour response availability seven days per week for emergency situations. You must be able to travel to the data center facilities from your place of residence within 45 minutes.
- Expected to work an 8:00 to 5:00 or 8:30 to 5:30 schedule.
- Requires after-hours and weekend work on occasion for emergency situations, planned maintenance, and/or project-related activities.
- May be exposed to noise > 80 decibels.
- May work at heights of 4 to 10 ft.
Work Standards
- Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues, clients, and with external organizations.
- Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for safety; communicates safety concerns; uses and promotes safe behaviors based on training and lessons learned.
- Subject to and expected to comply with all applicable University and unit policies and procedures, including but not limited to the personnel policies and other policies found in the University’s Administrative Guide.
The job duties listed are typical examples of work performed by positions in this job classification and are not designed to contain or be interpreted as a comprehensive inventory of all duties, tasks, and responsibilities. Specific duties and responsibilities may vary depending on department or program needs without changing the general nature and scope of the job or level of responsibility. Employees may also perform other duties as assigned.