For around 20 years, astronomers have struggled to find an ancient group of stars mixed in with the gas, dust and newer stars of our galaxy’s bulge. These “fossil” stars preceded the Milky Way and should have been discernible by their distinctive chemistry and orbits. Yet until recently, only a small number of them had ever been found.
Now, a determined effort using data-intensive machine learning has unearthed a trove of them, bringing into focus their features and fates. The methods used in their discovery have enabled scientists to update their understanding of the Milky Way’s formation and of disk galaxies in general.
Astronomers believe that the Milky Way was preceded by something called a proto-galaxy — a violent, chaotic place containing young stars with wild orbits. Its origin story starts out credibly enough. After the Big Bang, dark matter coalesced in our region of space. The dark matter attracted ordinary matter. The first waves of stars then arose, but how these stars got there was anyone’s guess.
“People didn’t have a really good idea of what the proto-galaxy looked like,” said Vedant Chandra, an astrophysicist at Harvard University and one of the lead authors on a recent paper detailing the ancient star discoveries.
By the 2000s, scientists had settled on two formation theories. Either the proto-galaxy gave birth to the Milky Way’s first stars internally, as gas coalesced into stars, or it cannibalized other galaxies, ripping out stars and siphoning off dark matter. To settle the question, astronomers would need to isolate the Milky Way’s earliest star population. Studies identified candidate stars, but if the internal-nursery theory was correct, a much larger fossil population lay undiscovered.
The opportunity to find them arrived in 2022 when the European Space Agency’s Gaia space telescope released its third full set of data, called DR3. Gaia was launched in 2013 to survey the Milky Way, and each successive data release has included more accurate position measurements than prior releases.
Importantly, DR3 also included stellar spectra — measurements of how bright a star is at different wavelengths of light. These spectrometry measurements are commonly used to examine the chemical elements inside a star.
To determine star birth dates, the team relied on a standard spectroscopic technique that looks for the signatures of heavy elements. (In astronomy, “heavy” means anything more massive than hydrogen or helium.) As the universe ages, hydrogen-rich stars detonate into supernovas and die, spewing out elements such as carbon and oxygen. This material then coalesces into new, heavier-element stars, also known as metal-rich stars. So more recent stars are metal-rich, and metal-poor stars must have originated in the proto-galaxy.
When the team saw the Gaia DR3 data, however, they were disappointed to discover that the spectrometer readings were too broad to reveal individual chemical peaks. “The spectral information for about 200 million stars was released, but these are very low-resolution spectra. If you look at the spectrum it’s just a bunch of wiggles,” Chandra said.
So the team turned to machine learning to extract the signals of heavier elements from the noisy, low-resolution spectra. They used an off-the-shelf algorithm called XGBoost, and trained it using high-quality spectral data from other surveys. With this training, the algorithm was able to reveal the stars’ metallicity based solely on the low-quality Gaia wiggles. When the team double-checked their predictions against data collected by three other independent high-quality sky surveys in three unique sections of the Milky Way, they found tight agreement.
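For readers curious how a boosted-tree model can pull a metallicity out of featureless wiggles, here is a minimal sketch. It is not the team's pipeline and it does not use XGBoost itself: it is a tiny pure-Python gradient-boosting loop built from one-split trees, trained on synthetic spectra. Every name in it (`make_spectrum`, `fit_stump`, `fit_boosted`, `predict`) and the assumed link between dip depth and metallicity are hypothetical and chosen only for illustration.

```python
import random

def make_spectrum(metallicity, rng, n_pixels=8, line_pixel=3):
    """Toy low-resolution spectrum: a noisy flat continuum with one dip.
    Assumption for illustration: the dip deepens as metallicity rises."""
    depth = 0.05 + 0.1 * (metallicity + 3.0)  # metallicity drawn from [-3, 0]
    flux = [1.0 + rng.gauss(0.0, 0.01) for _ in range(n_pixels)]
    flux[line_pixel] -= depth
    return flux

def fit_stump(X, residuals):
    """Pick the (pixel, threshold) split that best reduces squared error."""
    n = len(X)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        for q in range(1, 10):  # try the deciles of this pixel as thresholds
            t = X[order[q * n // 10]][j]
            left = [residuals[i] for i in range(n) if X[i][j] <= t]
            right = [residuals[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:]

def fit_boosted(X, y, n_rounds=50, rate=0.3):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        preds = [p + rate * (lm if x[j] <= t else rm) for p, x in zip(preds, X)]
    return base, rate, stumps

def predict(model, x):
    base, rate, stumps = model
    return base + rate * sum(lm if x[j] <= t else rm for j, t, lm, rm in stumps)

rng = random.Random(0)
# "High-quality survey" stand-in: stars whose metallicity is already known.
train_y = [rng.uniform(-3.0, 0.0) for _ in range(200)]
train_X = [make_spectrum(m, rng) for m in train_y]
# Held-out stars, playing the role of the independent cross-check.
test_y = [rng.uniform(-3.0, 0.0) for _ in range(100)]
test_X = [make_spectrum(m, rng) for m in test_y]

model = fit_boosted(train_X, train_y)
mean_abs_error = sum(abs(predict(model, x) - m) for x, m in zip(test_X, test_y)) / len(test_y)
baseline_error = sum(abs(model[0] - m) for m in test_y) / len(test_y)
used_pixels = {j for j, _, _, _ in model[2]}
print(f"model error {mean_abs_error:.2f} vs. always-guess-the-mean {baseline_error:.2f}")
print("pixels the model split on:", sorted(used_pixels))
```

The held-out comparison mirrors, in miniature, the cross-check against independent surveys described above. Listing which pixels the stumps split on is a crude version of asking a trained model which parts of the spectrum it relied on.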
Looking into the inner secrets of the algorithm, Chandra found that it decided a star’s heavy-element abundance based almost exclusively on the star’s calcium and magnesium absorption lines. It also corrected for potential sources of error, such as the dense tangles of cosmic dust and gas that lie between Earth and the center of the Milky Way. “The shape of those wiggles will change if there’s a lot of dust in the line of sight to the star,” he said. “And that’s important because we’re studying the center of the galaxy, which is filled with dust.”
The team whittled down a population of 1.5 million stars to about 18,000 early stars with low metallicity located in the Milky Way’s bulge. “A decade ago, I was thrilled to have a sample of almost 1,000 low-metallicity bulge stars,” said Melissa Ness, an astronomer at Columbia University. “We are now in a regime of having many thousands of these metal-poor stars. That’s an incredible data set to work with.”
The researchers still needed to answer at least one more question: Where were the proto-galaxy’s stars headed? The answer came from another type of measurement newly available in the Gaia DR3 release — the speed at which the stars are moving along our line of sight. Knowing this velocity made it possible to uncover each star’s orbit.
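To see why the line-of-sight speed was the missing piece, consider a rough sketch of how it combines with a star's sideways drift across the sky. The sketch assumes the star's proper motion (its angular drift, in milliarcseconds per year) and its distance are already known; the function names are hypothetical. It gives only the star's speed relative to the sun. Tracing a full orbit would also require a model of the galaxy's gravity, which is not attempted here.

```python
import math

# Standard conversion factor: a proper motion of 1 milliarcsecond per year
# at a distance of 1 kiloparsec corresponds to about 4.74 km/s.
KM_S_PER_MAS_YR_KPC = 4.74047

def tangential_speed(pm_ra_mas_yr, pm_dec_mas_yr, distance_kpc):
    """Speed across the sky in km/s from the two proper-motion components.
    Assumes pm_ra already includes the cos(declination) factor."""
    pm_total = math.hypot(pm_ra_mas_yr, pm_dec_mas_yr)
    return KM_S_PER_MAS_YR_KPC * pm_total * distance_kpc

def total_speed(pm_ra_mas_yr, pm_dec_mas_yr, distance_kpc, radial_velocity_km_s):
    """Full 3D speed in km/s: sideways motion plus line-of-sight motion."""
    v_tan = tangential_speed(pm_ra_mas_yr, pm_dec_mas_yr, distance_kpc)
    return math.hypot(v_tan, radial_velocity_km_s)

# Made-up example star: 8 kiloparsecs away, drifting 5 mas/yr, receding at 100 km/s.
v_sideways = tangential_speed(3.0, 4.0, 8.0)
v_full = total_speed(3.0, 4.0, 8.0, 100.0)
print(f"sideways: {v_sideways:.1f} km/s, full 3D: {v_full:.1f} km/s")
```

Without the line-of-sight term, only two of the three velocity components are known, and many different orbits remain consistent with the same sideways drift.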
What emerged was a portrait of a halo-shaped proto-galaxy, as anticipated by some theorists. The population of elderly, metal-poor stars orbited in a small, tight sphere with a radius of 9,000 light-years, which the team dubbed the “poor old heart” of the Milky Way.
Overall, the findings suggest that the proto-galaxy didn’t steal stars from other galaxies. If it had, the orbits of those stolen stars would carry them toward regions beyond the Milky Way.
With the velocity and spectrometry measurements already in hand for 1.5 million Milky Way stars, Chandra turned to related theories that the data could test. One recent proposal stood out.
In 2022, two papers hinted at a timeline for the Milky Way’s disk formation. The theory goes that after the proto-galaxy arose, the region “simmered,” collecting gas and creating metal-poor stars. After a billion years, the emergent galaxy “boiled,” frantically giving birth to metal-rich stars for 2 billion to 3 billion years. These newer stars were different. They followed flatter orbits. As the galaxy cooled down, a razor-thin disk formed, filled with the newly minted stars (including our sun) moving in tidy circular orbits around the galactic center.
The 1.5 million stars in Chandra’s data set confirmed this timeline. “What we’re looking at is the Milky Way spinning up for the first time,” he explained. “You’re essentially seeing the birth of the disk of the galaxy.” He and his colleagues are now using the full 30-million-star data set to provide an even more comprehensive look. “The bulge has been officially confusing for decades,” said Will Clarkson, an astronomer at the University of Michigan, Dearborn. “This has been a good opening of a new window into this fossil population.”