Fifteen years ago, from a Lockheed L-1011 TriStar jet flying somewhere 40,000 feet above the Pacific, NASA’s IBEX satellite, the Interstellar Boundary Explorer, shot skyward. So began a many-year mission to map the interstellar boundary, a thin band surrounding the sun that helps shield Earth and our solar system from dangerous levels of interstellar radiation. Onboard IBEX was a Los Alamos–built instrument, IBEX-Hi, that detects particles that originate some 11 billion miles from the sun in the heliosheath, the portion of the heliosphere (the entire region shielded by the interstellar boundary) that bumps up against the interstellar medium. IBEX-Hi detects a particle every ten seconds or so, but only half of the detections are the particles of interest. The rest are data noise.
“There just aren’t that many detections for us to work with,” says Dan Reisenfeld, a space scientist in charge of the Lab’s work with IBEX. Two years ago, Reisenfeld started trying to squeeze more information out of the data they’d captured. The goal was to build better maps of the interstellar boundary.
The particles IBEX-Hi measures are called ENAs, or energetic neutral atoms. ENAs begin their lives in the sun’s corona as protons that sweep outward until, somewhere far past the most distant planets, they meet hydrogen atoms flowing inward from the interstellar medium. Here, an electron exchange occurs and the ENAs are born. Only a tiny percentage of these ENAs is directed earthward, and an even smaller percentage is detected by the sensors on the Earth-orbiting satellite.
Reisenfeld and collaborators use these ENA observations to build the maps needed to understand the physics of the heliosheath. How does the environment where ENAs are born shield life on Earth from the most dangerous cosmic rays? Do other planets have similar shields? But because IBEX-Hi’s data are noisy, so are the interstellar boundary maps. To get better answers, Reisenfeld needed better maps. To get better maps, he needed better data.
Improving the data through statistics was the only way. Since IBEX launched, the standard way of analyzing its data has been, as Reisenfeld characterized it, “rudimentary stats techniques and error propagation methods taught in undergraduate lab classes.” So he asked Los Alamos statistician Dave Osthus to help. Osthus, who had spent the past three years working on epidemiology models of COVID-19, jumped at the challenge.
Osthus started by building an automated statistical learning technique with the goal of quieting the noisy data. He dubbed the code Plinko after a game on The Price Is Right in which contestants win money based on which bin a bouncing token lands in. The game is structured randomness, and so was the way the ENAs’ origination points were determined. Osthus’s code used the factors that dictated the quality of each IBEX-Hi observation to more accurately guess at the particle’s origination point, like using the Plinko token’s drop position to predict which bin it will land in. The code first squeezed more from IBEX-Hi’s observations by comparing them to each other. It then weighted the ENA observations by their quality and probabilistically placed each within the range of locations where it had most likely been born.
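The probabilistic placement described above can be sketched in a few lines of Python. This is a toy illustration, not Osthus’s Plinko code: the one-dimensional ring of sky bins, the Gaussian spread, the made-up detections, and the names `place_observation` and `build_map` are all assumptions made for the example.

```python
import random

def place_observation(center_bin, spread, n_bins, rng):
    """Draw one plausible origin bin for a single detection.

    center_bin is the best-guess bin and spread is its uncertainty in bins;
    both are hypothetical stand-ins for real pointing information.
    """
    draw = rng.gauss(center_bin, spread)
    return int(round(draw)) % n_bins  # wrap around the ring of bins

def build_map(observations, n_bins, rng):
    """Build one possible map from (center_bin, spread, quality) tuples."""
    sky = [0.0] * n_bins
    for center_bin, spread, quality in observations:
        # Each detection lands in one randomly drawn bin, weighted by quality.
        sky[place_observation(center_bin, spread, n_bins, rng)] += quality
    return sky

# Made-up detections: (best-guess bin, spread in bins, quality weight).
observations = [(3, 1.0, 0.9), (3, 2.0, 0.4), (10, 0.5, 1.0)]
rng = random.Random(0)
one_map = build_map(observations, n_bins=12, rng=rng)
print(one_map)
```

Calling `build_map` again with the same detections would generally give a different map, because each call makes fresh random draws. The total weight on the map stays the same; only where it lands changes.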
Out of this approach grew 1000 new interstellar boundary maps. Each looked roughly like a handful of Skittles strewn across a flattened globe, the colors indicating different rates of ENA observations. Each map was just one possible rendering of the interstellar boundary. Not one was accurate, but that was never the point. “Their value lies in the average of them all,” Osthus says.
Osthus flattened the 1000 maps into one averaged snapshot of the interstellar boundary. With the data noise hushed, the enriched map had, as Reisenfeld put it, cleared the fog. But it still wasn’t right. “Too blurry, too inaccurate. We couldn’t discern any localized features in it,” Osthus says.
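Why averaging many individually inaccurate maps quiets the noise can be seen in a small Python sketch. It is a simplified stand-in, not the actual analysis: the five-pixel "true" map, the independent Gaussian noise, and the helper names `average_maps` and `rms_error` are assumptions chosen for illustration.

```python
import math
import random
import statistics

def average_maps(maps):
    """Pixel-by-pixel mean of a list of equally sized maps."""
    return [statistics.fmean(pixel_values) for pixel_values in zip(*maps)]

def rms_error(candidate, truth):
    """Root-mean-square difference between a map and the true rates."""
    squared = [(c - t) ** 2 for c, t in zip(candidate, truth)]
    return math.sqrt(statistics.fmean(squared))

rng = random.Random(42)
true_rates = [1.0, 2.0, 4.0, 2.0, 1.0]  # made-up rates for five pixels

# 1000 noisy renderings, each one wrong in its own random way.
maps = [[rng.gauss(rate, 1.0) for rate in true_rates] for _ in range(1000)]
mean_map = average_maps(maps)

typical_single_error = statistics.fmean(rms_error(m, true_rates) for m in maps)
print("typical single-map error:", round(typical_single_error, 3))
print("averaged-map error:", round(rms_error(mean_map, true_rates), 3))
```

In this toy setup the random errors largely cancel in the average, so the averaged map sits far closer to the true rates than a typical single map does. Averaging on its own does nothing to sharpen fine features, which is consistent with the blurriness described above.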
So he began anew, this time trying to bring into focus the 1000 less-noisy maps he’d already made. His approach mirrored the way software uses calculus and optimization to focus digital pictures, but it didn’t work reliably. Rather than focusing the maps, the code highlighted the randomized method he’d used to build them: some looked crisp; others looked like he’d spray-painted his laptop screen. “It was actually a relief,” Osthus says. “It showed me how I was doing it wrong, and hinted at how to fix it.”
To do it right, he wrote a new code that tethered each ENA to its calculated origination point while also bringing the interstellar boundary’s features into sharper focus. After one more round of code development, Osthus handed Reisenfeld maps with features sharper than had ever been seen, like the still-mysterious ribbon of ENAs (a subject of research in its own right) that wraps around the interstellar boundary. Osthus had teased out new clarity with statistics alone.
“Dave developed a whole new tool for us to understand the heliosphere,” says Reisenfeld. “This is the first time ever that these sorts of statistical methods have been applied in the field of space science on such a large scale.”