Trustworthiness of Foundation Models and What They Generate

A mini workshop as a part of SMC-IT/SCC 2024, July 15-19, 2024
Computer History Museum, Mountain View, CA, USA

Organizers: Daniel Crichton and Richard Doyle (JPL/IEEE) and Ashish Mahabal (Caltech)

Image credit: Robert Hurt (Caltech/IPAC)

Motivation for the workshop:

Interest in Foundation Models (FMs) has burgeoned, with much attention focused on Large Language Models (LLMs): their notable performance, along with open issues such as hallucinations. LLM success appears to rest on the sheer power of learning statistical patterns from massive public databases, and on inherent ordering constraints within a domain, exploited by completion operations. Yet there is growing interest in other applications where such constraints may be less prevalent, or different in form. Within NASA, there is interest in exploring the potential of FMs for analyzing and understanding image and time-series data.

Trustworthiness is a key challenge, and addressing it must entail some form of validation of the FMs themselves. Explainability is an established research challenge relevant to uses of Artificial Intelligence (AI), particularly Machine Learning (ML); closely related is human-machine interaction. In the context of FMs, users bear an important and insufficiently emphasized responsibility to help validate Generative AI outputs, if not the FMs themselves, as one antidote to hallucinations. Our workshop will convene subject-matter experts (SMEs) in these research areas to discuss extant challenges and possible paths to solutions, noting, as is not unusual with AI, where existing best practices (e.g., in verification and validation, V&V) may be adaptable or extensible, and where new methodologies may be required.

One of the claims, and attractions, of FMs is their alleged zero-shot applicability to diverse corpora of text, images, and other data types. Validation of such claims lies at the intersection of developing FMs for fundamental sciences and establishing their trustworthiness. For instance, can the Segment Anything Model (SAM) work for fuzzy images (say, a cancer starting to attack an adjacent organ and thus blurring boundaries)? Can an LLM address questions from a field that has sparse corpora, even when supported with mechanisms like retrieval-augmented generation? With time series, zero-shot applicability will be under even more scrutiny (for instance, for gappy data, as in much of terrestrial astronomy).

Workshop Structure:

The workshop will have two 90-minute tracks, each featuring three 20-minute invited talks followed by a panel of speakers, moderated by the workshop co-chairs. The SMEs will be drawn from NASA, DoD, Academia, and Industry, including Commercial Space. The workshop hosts will seek opportunities to cross-fertilize discussion across both near-term and far-term considerations, with panel members and with workshop attendees.

Speakers and Abstracts

15 July 2024
Agenda (slight changes possible):

Time          Speaker(s)                   Title
1:15 - 1:20   Crichton, Doyle, & Mahabal   Welcome and introductory comments
1:20 - 1:40   Geoffrey Fox (UVA)           Foundation Models and Patterns for Science Time Series
1:40 - 2:00   Manil Maskey (NASA)          AI Foundation Models for NASA Science: a Culture of Openness
2:00 - 2:20   Jack Lightholder (JPL)       Intelligent Parsing of Academic Literature Using Large Language Models
2:20 - 2:55   Fox, Lightholder, Maskey     Panel Discussion
2:55 - 3:15
3:15 - 3:45
3:45 - 4:05   Brian Patrick Green (SCU)    Ethics and Trustworthy Foundation Models
4:05 - 4:25   Bjorn Andersson (CMU)        Leveraging AI for Assurance of Critical Software Systems
4:25 - 5:00   Andersson, Green, Erik Linstead   Panel Discussion
5:00 - 5:15   Crichton, Doyle, & Mahabal   Concluding remarks