Data Engineer

Stanford University • Full-time • Stanford, California

Stanford University is launching an interdisciplinary Neuro-AI project dedicated to building a foundation model of the brain. This endeavor will involve multiple labs and faculty across the Stanford campus, including the Wu Tsai Neurosciences Institute, Stanford Bio-X, and the Human-Centered Artificial Intelligence Institute. Leveraging cutting-edge advances in electrophysiology and machine learning, this project aims to create a functional "digital twin" — a model that captures both the activity dynamics of the brain at cellular resolution and the intelligent behavior it generates, including perception, motor planning, learning, reasoning, and problem-solving.

This ambitious initiative promises to offer unprecedented insights into the brain's algorithms of perception and cognition while serving as a key resource for aligning artificial intelligence models with human-like neural representations. As part of this project, we are seeking talented data engineers with extensive experience in data infrastructure engineering. The team will be responsible for designing, building, and operating the data pipeline infrastructure, which covers the entire flow of data from neurophysiological data acquisition to storage, processing, and preparation for large-scale training of foundation models. Ideal candidates will have practical experience in designing and scaling big data pipelines, with proficiency in big-data storage architectures (data lakehouse) and relevant software tools and frameworks including but not limited to Delta Lake, Apache Spark, Apache Parquet, and Apache Airflow.

This position promises a vibrant and cooperative atmosphere within the laboratories of Andreas Tolias (https://toliaslab.org), Tirin Moore (https://www.moorelabstanford.com) and other labs at Stanford University renowned for their expertise in perception, cognition, pioneering neural recording techniques, computational neuroscience, machine learning, and Neuro-AI research.

Role & Responsibilities:

•Work in a team of engineers and scientists to design, build, and maintain large-scale data pipelines.

•Set up and maintain the hardware and software infrastructure to support distributed computation, data orchestration, and high-throughput distributed storage capable of holding petabytes of data.

•Coordinate with experimentalists, research scientists, and machine learning engineers to accelerate and facilitate the workflows for large-scale neuroscientific data analyses and foundation model training.

Key qualifications:

•PhD, Master’s, or Bachelor’s degree in Computer Science or a related field.

•2-3 years of experience in designing and managing big data pipelines with a particular focus on data infrastructure engineering.

•Experience in working with real-time/high-throughput data transfers to cloud-based data storage and compute nodes (e.g. AWS).

•Expertise in modern big data tools and frameworks (e.g. Apache Spark, Airflow, and Delta Lake).

•Experience in setting up and managing large-scale data storage and compute infrastructure to support high-throughput data processing workflows.

•Strong communication skills to work effectively within an interdisciplinary team whose members have varying levels of technical expertise.

Preferred qualifications:

•Experience with machine learning techniques and their associated challenges for data pipeline engineering.

•A strong software engineering background for ensuring high-quality code and continuous development of data analysis pipelines in coordination with other teams.

•Experience in working in a large interdisciplinary team, managing software and/or hardware infrastructure for data storage and analyses.

What we offer:

•Work on a collaborative and uniquely positioned project spanning several disciplines, from neuroscience to artificial intelligence and engineering.

•Work jointly with a vibrant team of researchers and scientists on a cutting-edge project dedicated to one mission, rooted in academia but inspired by science in industry.

•Competitive salary and benefits.

•Strong mentoring in career development.

Please complete the basic application on the Stanford Careers site, and send your CV and a one-page statement of interest to: recruiting@enigmaproject.ai

The expected pay range for this position is $102,000 to $122,000 per annum.

Stanford University has provided a pay range representing its good faith estimate of what the university reasonably expects to pay for the position. The pay offered to the selected candidate will be determined based on factors including (but not limited to) the experience and qualifications of the selected candidate, including equivalent years since their applicable education, field, or discipline; departmental budget availability; and internal equity.
