CV

Basics

Name J. Jake Nichol
Label Computer Scientist
Email [my_github_username][at]gmail[dot]com
Summary I'm a Postdoctoral Appointee in the Scientific Machine Learning department at Sandia National Laboratories, where I conduct research on causal discovery methods and trustworthy machine learning. I recently completed my Ph.D. in Computer Science at the University of New Mexico, with a dissertation focused on recovering spatiotemporal causal structures, particularly in climate and other Earth science data.

Education

  • 2018 - 2025

    Albuquerque, NM, USA

    Ph.D.
    University of New Mexico, School of Engineering
    Computer Science
  • 2015 - 2017

    Albuquerque, NM, USA

    M.B.A.
    University of New Mexico, Anderson School of Business
    Business Management
  • 2011 - 2016

    Albuquerque, NM, USA

    B.S.
    University of New Mexico, School of Engineering
    Computer Science

Work

  • 2025 - ...
    Postdoctoral Appointee
    Sandia National Laboratories
    Postdoc in the Scientific Machine Learning department.
  • 2019 - 2025
    R&D Graduate Intern
    Sandia National Laboratories
    Year-round intern in the Scientific Machine Learning department.
    • Participated in the Laboratory Directed Research and Development (LDRD) project CLDERA, which seeks to develop tools to identify source-to-impact pathways in climate systems.
    • Researched the application of causal discovery, specifically the PC and PCMCI algorithms, for identifying causal pathways from the 1991 Mt. Pinatubo eruption.
    • Developed the CaStLe algorithm for constructing spatiotemporal causal graphs of local dependencies in climate data.
    • Researched random forest machine learning techniques for learning about emergent dynamics in climate models.
    • Used feature importance analysis to compare observed and simulated data and identify faults in the simulations.
  • 2018 - 2019
    Graduate Research Assistant
    University of New Mexico
    A research assistant under Dr. Lydia Tapia and Dr. Marina Kogan.
    • Robotics under Dr. Tapia
    • Computational sociology under Dr. Kogan
  • 2017 - 2019
    Owner
    Swarming Technologies LLC
    Ran a business full-time manufacturing, repairing, and developing autonomous ground-based robots.
    • The robots, known as 'Swarmies', were designed for swarm robotics and autonomous robotics research and education.
    • Swarmies were featured in the NASA Swarmathon, NASA Minds competition, and CS4All NM.
  • 2015 - 2017
    Robotics Engineer and Designer
    NASA Swarmathon & Moses Biological Computation Lab
    Designed and developed autonomous swarming robots called Swarmies. The robots were part of a nationwide, college-level robotics competition called the NASA Swarmathon. Development included designing with Autodesk Inventor, manufacturing parts with SLS, SLA, and FDM 3D printing, and electronics development.
    • Analyzed data to track progress and success determinants using MySQL and Python’s numpy, scipy, and pandas.
    • Developed automated deployment for code on robots using Ansible and Docker.
    • Conducted swarm robotics research with iAnts, Arduino/iPod Touch-controlled robots, using a genetic algorithm to tune their behavior.
  • 2014 - 2014
    Software Engineering Intern
    Intel Corporation
    Product surveying, QA, troubleshooting, and testing.
    • Arduino/Galileo programming and debugging.
    • Design and construction of a robotic car, controlled by the Intel Galileo Gen 2.
    • Developed an Arduino/Intel Galileo-controlled robot that took input via Bluetooth from an Android phone and used multiple sensors for line following and obstacle avoidance.

Volunteer

  • 2018 - ...

    Santa Fe, NM, USA

    Adaptive Ski Instructor
    Adaptive Sports Program New Mexico Inc
    Teach skiing to people with various disabilities.
  • 2017 - 2020

    Albuquerque, NM, USA

    Troop 9 Board of Review Member
    Boy Scouts of America
    Attended board of review meetings to assist in scout advancement.

Publications

  • 2025
    Space-Time Causal Discovery in Earth System Science: A Local Stencil Learning Approach
    Journal of Geophysical Research: Machine Learning and Computation
    Causal discovery tools enable scientists to infer meaningful relationships from observational data, spurring advances in fields as diverse as biology, economics, and climate science. Despite these successes, the application of causal discovery to space-time systems remains immensely challenging due to the high-dimensional nature of the data. For example, in climate sciences, modern observational temperature records over the past few decades regularly measure thousands of locations around the globe. To address these challenges, we introduce Causal Space-Time Stencil Learning (CaStLe), a novel meta-algorithm for discovering causal structures in complex space-time systems. CaStLe leverages regularities in local space-time dependencies to learn governing global dynamics. This local perspective eliminates spurious confounding and drastically reduces sample complexity, making space-time causal discovery practical and effective.
  • 2023
    Benchmarking the PCMCI Causal Discovery Algorithm for Spatiotemporal Systems
    OSTI
    Causal discovery algorithms construct hypothesized causal graphs that depict causal dependencies among variables in observational data. While powerful, the accuracy of these algorithms is highly sensitive to the underlying dynamics of the system in ways that have not been fully characterized in the literature. In this report, we benchmark the PCMCI causal discovery algorithm in its application to gridded spatiotemporal systems. Effectively computing grid-level causal graphs on large grids will enable analysis of the causal impacts of transient and mobile spatial phenomena in large systems, such as the Earth's climate.
  • 2021
    Learning Why: Data-Driven Causal Evaluations of Climate Models
    ICML 2021 Workshop Tackling Climate Change with Machine Learning
    We plan to use nascent data-driven causal discovery methods to find and compare causal relationships in observed data and climate model output. We will look at ten different features in the Arctic climate collected from public databases and from the Energy Exascale Earth System Model (E3SM). In identifying and comparing the resulting causal networks, we hope to find important differences between observed causal relationships and those in climate models. With these, climate modeling experts will be able to improve the coupling and parameterization of E3SM and other climate models.
  • 2021
    Machine learning feature analysis illuminates disparity between E3SM climate models and observed climate change
    Journal of Computational and Applied Mathematics
    In September of 2020, Arctic sea ice extent was the second-lowest on record. State of the art climate prediction uses Earth system models (ESMs), driven by systems of differential equations representing the laws of physics. Previously, these models have tended to underestimate Arctic sea ice loss. The issue is grave because accurate modeling is critical for economic, ecological, and geopolitical planning. We use machine learning techniques, including random forest regression and Gini importance, to show that the Energy Exascale Earth System Model (E3SM) relies too heavily on just one of the ten chosen climatological quantities to predict September sea ice averages. Furthermore, E3SM gives too much importance to six of those quantities when compared to observed data. Identifying the features that climate models incorrectly rely on should allow climatologists to improve prediction accuracy.
  • 2018
    The Swarmathon: An Autonomous Swarm Robotics Competition
    arXiv
    The Swarmathon is a swarm robotics programming challenge that engages college students from minority-serving institutions in NASA's Journey to Mars. Teams compete by programming a group of robots to search for, pick up, and drop off resources in a collection zone. The Swarmathon produces prototypes for robot swarms that would collect resources on the surface of Mars. Robots operate completely autonomously with no global map, and each team's algorithm must be sufficiently flexible to effectively find resources from a variety of unknown distributions.

Interests

Causal Inference
Causal Discovery
Causal Structure Learning
Causal Machine Learning
Machine Learning
Scientific machine learning
Domain/physics-informed machine learning
Climatological machine learning
ML feature importance, such as random forest Gini importance, permutation importance, drop-column importance, and SHAP
Explainable & trustworthy machine learning
Artificial Intelligence
AI for Earth systems science
AI for science
Trusted AI
Explainable AI
Fairness and ethics in AI

Skills

Programming
Python libraries like Pandas, Xarray, Dask, and Tigramite
LaTeX
HPC frameworks Slurm and PBS
GNU Parallel
MATLAB
Minor experience with Docker and Ansible.