We employ anomaly and novelty detection methods to support science investigations, enabling subsequent review to separate anomalies of potential scientific interest (which merit follow-up) from those that indicate artifacts or noise in the data.
We are developing and evaluating onboard data analysis methods for use by the Europa Clipper spacecraft, a future mission that will make 40+ flybys of Europa. These methods include onboard detection of thermal anomalies (hot spots) on the surface, areas of unusual mineralogical composition, active plumes erupting from the surface, and more. These methods must be highly efficient to operate within the limited computational and memory resources available onboard.
We apply image classification to data collected by spacecraft (e.g., Mars orbiters and rovers) to enable content-based search (e.g., "all images containing craters").
We are creating an automated system that scans through scientific publications about Mars to extract information about named surface targets (rocks, soils, etc.) and their composition. The extracted information is compiled into a searchable encyclopedia, which is publicly available through the MSL Analyst's Notebook (click "Search", then "Targets").
This has been a focus of my research since I joined JPL, and (not coincidentally) it is a major emphasis of my group's work in general. I have looked at methods for on-board analysis for Earth orbiters, Mars orbiters (to detect thermal anomalies, track the CO2 polar caps, and estimate the dust and water ice content of the atmosphere), Mars landers/rovers (to detect dust devils), and a hypothetical mission to Enceladus (to detect plumes).
Most of this work is cutting-edge in terms of the problems we tackle and the restrictions of the computing environment, rather than in terms of novel machine learning methods. Thus, most of our publications appear in science venues, where we seek to publicize the benefits of machine learning technology for spacecraft.
We are developing new algorithms to quickly detect transient (very short-duration) events in radio astronomy data, to enable archiving the raw data for only events of interest. We have an operational system (V-FASTR) at the 10-station Very Long Baseline Array (VLBA) and ultimately aim to support the ambitious plans of the 3,000-antenna Square Kilometer Array (SKA). You can see the latest V-FASTR activity and detections here.
Ensuring that an implementation exactly matches its specification is always challenging, and it is particularly critical in flight software systems. We advocate state-based software system modeling for flight software development, in which the system behavior is encoded in a UML state diagram, then converted into C or C++ code using the JPL Autocoder.
We demonstrated this approach by encoding part of the Proximity-1 communications protocol, used by the Mars Reconnaissance Orbiter (MRO) and the Mars Exploration Rovers, into a state chart that we then converted into C code. We linked this code in with the MRO flight software for its Electra radio and successfully demonstrated that the auto-generated code passed all relevant flight software tests. In addition, we demonstrated the benefits of this development process, which include reduced development time, earlier detection of errors, and ease of communicating system design. As a result, a simplified version of the Autocoder was adopted by the Mars Science Laboratory software team, and we received follow-on funding to produce a full reference implementation of Proximity-1, using the same process, for use in testing future radios in the JPL Protocol TestLab.
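To illustrate the general idea of state-based modeling, here is a minimal sketch in Python of a table-driven state machine. The states and events are hypothetical placeholders for a simplified communications link; they are not taken from the Proximity-1 specification, and this is not the code the JPL Autocoder generates (which is C or C++).

```python
# Hypothetical transition table: (current_state, event) -> next_state.
# In a state-based workflow, a table like this would be derived from the
# state diagram rather than written by hand.
TRANSITIONS = {
    ("IDLE", "start_hail"): "HAILING",
    ("HAILING", "hail_acknowledged"): "CONNECTED",
    ("HAILING", "timeout"): "IDLE",
    ("CONNECTED", "link_lost"): "IDLE",
}


def step(state, event):
    """Return the next state; an event with no matching transition is ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="IDLE"):
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Because every permitted transition is listed explicitly in one place, the behavior can be reviewed against the design directly, which is one reason a state-based description is easy to communicate.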
I completed a Master's degree in Geological Sciences at the University of Southern California in December 2008. This work focused on what methods from information theory can tell us about the potential biogenicity (creation by life) of a sample.
We seek to develop methods by which the individual nodes in a sensor network can collaboratively learn new concepts, by querying each other and sharing data and learned information.
Funded by a four-year grant from the NSF's Robust Intelligence Program (NSF award number IIS-0705681), this project is a collaboration with Terran Lane of UNM.
Constrained clustering was an idea I invented for my dissertation at Cornell University (Ph.D. 2002). Unsatisfied with the performance of regular clustering on the challenging task of noun phrase clustering, I suggested that we might be able to build in our existing domain knowledge as constraints on the clustering process. Others have since taken up this idea and extended it in many directions.
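As a rough illustration of clustering under constraints, the sketch below runs a k-means-style loop in which each point is assigned to the nearest cluster that does not violate a constraint. It assumes that domain knowledge is encoded as pairwise "must-link" and "cannot-link" constraints, which is one common formulation; the function names, parameters, and data layout are hypothetical and chosen only for this example.

```python
import math


def violates(point, cluster, assignment, must_link, cannot_link):
    """True if placing `point` in `cluster` breaks a constraint, given the
    points already assigned in this pass."""
    for a, b in must_link:
        other = b if a == point else a if b == point else None
        if other is not None and other in assignment and assignment[other] != cluster:
            return True
    for a, b in cannot_link:
        other = b if a == point else a if b == point else None
        if other is not None and assignment.get(other) == cluster:
            return True
    return False


def constrained_kmeans(points, k, must_link, cannot_link, init_centers, max_iter=20):
    """Return {point_index: cluster_index}, or None if some point has no
    feasible cluster under the constraints."""
    centers = [list(c) for c in init_centers]
    assignment = {}
    for _ in range(max_iter):
        new_assignment = {}
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest; take the first feasible one.
            order = sorted(range(k), key=lambda c: math.dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new_assignment, must_link, cannot_link):
                    new_assignment[i] = c
                    break
            else:
                return None  # every cluster violates a constraint for point i
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Recompute each non-empty cluster's center as the mean of its members.
        for c in range(k):
            members = [points[i] for i, a in assignment.items() if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment
```

For example, with two well-separated groups of points, a cannot-link constraint between two neighbors forces them into different clusters even though distance alone would group them together.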
We developed and demonstrated an interactive machine learning analysis toolkit for use with remote sensing data. In particular, the PixelLearn system provides SVM classification, SVM regression, and clustering methods and supports a range of different remote sensing data types (for imagers and spectrometers).
In this project, we sought to identify connections between weather and agriculture (e.g., crop yield). We integrated data from orbiting satellites, weather stations on the ground, and historical crop yield archives to generate crop yield predictions in California and Kansas, for wheat and corn.
Funded by a two-year grant from NASA's Earth Science Technology Office (grant number AIST-QRS-04-3004), then extended an additional 9 months by an ATI infusion grant, this project was a collaboration with Dominic Mazzoni of JPL and Stephan Sain of NCAR.
We seek to develop classification, clustering, and preference modeling methods that can take advantage of existing domain knowledge.
Funded by a five-year grant from the NSF's Information Technology Research Program (NSF award number ITR-0325329), this project is a collaboration with Marie desJardins of UMBC.
Computing in a high-radiation environment, such as in orbit, can be risky; data can be corrupted by radiation during the course of a long computation. As we seek to have more data analysis done onboard spacecraft, this issue becomes more critical. Radiation-hardened hardware lags behind the "soft" state of the art by years, and is orders of magnitude slower and more expensive. In this project, we investigated alternative ways to compute safely in a high-radiation environment, using software solutions rather than specialized hardware. We also designed and built a software simulator that can subject a program to a simulated high-radiation environment by injecting "bit flips" into RAM while the program is running.
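The bit-flip injection idea can be sketched in a few lines of Python. This is only an illustration, not the simulator described above: it uses a `bytearray` as a stand-in for a program's RAM, and the function names and parameters are hypothetical.

```python
import random


def inject_bit_flips(memory, n_flips, rng):
    """Flip `n_flips` distinct, randomly chosen bits of `memory` in place.

    `memory` is a bytearray standing in for RAM (an assumption of this sketch).
    Returns the flipped bit positions, counted from bit 0 of byte 0.
    """
    total_bits = len(memory) * 8
    if n_flips > total_bits:
        raise ValueError("cannot flip more bits than the buffer contains")
    positions = rng.sample(range(total_bits), n_flips)
    for pos in positions:
        byte_index, bit_index = divmod(pos, 8)
        memory[byte_index] ^= 1 << bit_index  # XOR with a one-bit mask toggles the bit
    return positions


def hamming_distance(a, b):
    """Number of differing bits between two equal-length byte sequences."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

Passing a seeded `random.Random` instance makes a corruption pattern reproducible, so the same fault can be replayed against different software protection strategies.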
Currently, most analysis of images collected by orbiters around other planets proceeds manually. Interesting features, such as craters, fissures, gullies, and volcanoes, are annotated in a painstaking process. We seek to provide automated means to annotate these features of interest, and particularly to detect any surface changes. These changes include the appearance or disappearance of the mysterious dark slope streaks on Mars as well as dust devil tracks and any other new discoveries we may uncover. Our approach focuses on the automated identification of "landmarks", which are statistically salient features within an image, and the subsequent detection of any changes.