The Internet Couldn’t Handle All the Data From the First Black Hole Photo
On Wednesday, astronomers announced that they’d captured the world’s first image of a black hole, and the internet couldn’t handle it. No, we’re not talking about black hole Shrek memes or snarky opinion pieces about how this image of an object 55 million light-years away was “so blurry.” We’re talking about how the internet literally couldn’t handle the quantity of data collected by the Event Horizon Telescope experiment, an array of eight telescopes across five continents that captured this image of the black hole at the center of the galaxy Messier 87.
Instead, the massive quantity of data collected by the radio antennae had to be flown on airplanes to central data centers where it could be cleaned and analyzed. So in addition to being a massive achievement of human ingenuity and understanding, one that confirmed several theories about black holes, the M87 black hole image was also a Herculean feat of data storage and management.
Over seven days in April 2017, the EHT experiment turned all eight telescopes toward M87. Synchronized by custom-made atomic clocks, they all started collecting the incoming radio signals from the distant black hole and logging the data on super-fast data recorders that had been built for this very task.
"There’s no internet that can compete with 5 petabytes of data on a plane."
“We had 5 petabytes of data recorded,” Dan Marrone, Ph.D., an associate professor of astronomy at the University of Arizona who specialized in data storage for the EHT experiment, told reporters on Wednesday. “It amounts to more than half a ton of hard drives. Five petabytes is a lot of data: It’s equivalent to 5,000 years of MP3 files.”
Here’s why and how this one picture required the data equivalent of 1.39 billion copies of “Old Town Road” by Lil Nas X.
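For a rough sense of where those numbers come from, here’s a quick back-of-the-envelope check in Python. The 256 kbps bitrate and the song’s roughly 1:53 running time are our own assumptions for the sake of the arithmetic, not figures from the EHT team:

```python
# Back-of-envelope check of the article's numbers. Assumptions (ours, not
# the EHT team's): 5 petabytes means 5 x 10^15 bytes, MP3s are encoded at
# 256 kbps, and "Old Town Road" runs about 1 minute 53 seconds.
total_bytes = 5 * 10**15

mp3_bytes_per_second = 256_000 / 8          # 256 kbps -> 32,000 bytes/s
seconds_per_year = 365 * 24 * 3600

years_of_mp3 = total_bytes / mp3_bytes_per_second / seconds_per_year
print(f"{years_of_mp3:,.0f} years of MP3 audio")         # ~5,000 years

song_seconds = 113                          # ~1:53 running time
copies = total_bytes / (song_seconds * mp3_bytes_per_second)
print(f"{copies:,.0f} copies of the song")                # ~1.4 billion copies
```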
Eight Synchronized Telescopes
The EHT experiment employed a technique called very long baseline interferometry, which used the eight simultaneously recording telescopes to essentially turn the Earth into a single, rotating telescope dish. Each of those telescopes recorded the raw incoming radio signals, piling up enormous amounts of data.
In other words, it’s as if eight people filmed the same faraway phenomenon from different angles, then combined their footage into one much clearer video. In this scenario, though, the object was really far away, and the telescopes were really far apart.
The advantage of these long baselines is resolution: the farther apart the telescopes, the finer the detail the virtual Earth-sized dish can pick out. And as the Earth rotated over the course of each night, every pair of telescopes viewed the black hole from a slightly different angle, filling in more of that virtual dish.
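To see why the length of the baseline matters so much, a rough diffraction-limit estimate (resolution is roughly wavelength divided by baseline) gives a feel for what an Earth-sized dish buys you. The round numbers below, 1.3 millimeters for the EHT’s observing wavelength and one Earth diameter for the longest baseline, are our own simplifying assumptions:

```python
import math

# Rough diffraction-limit estimate: angular resolution ~ wavelength / baseline.
# Assumed round numbers: 1.3 mm observing wavelength, one Earth-diameter baseline.
wavelength_m = 1.3e-3
baseline_m = 12_742e3                       # Earth's diameter, ~12,742 km

theta_rad = wavelength_m / baseline_m
theta_microarcsec = math.degrees(theta_rad) * 3600 * 1e6

print(f"~{theta_microarcsec:.0f} microarcseconds")   # on the order of 20
```

A few tens of microarcseconds is roughly the apparent size of M87’s black hole shadow as seen from Earth, which is why the virtual dish has to span the planet rather than a single mountaintop.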
Cleaning the Data
Once all 1,000 pounds of hard drives were filled with these 5 petabytes of raw data, they were loaded onto airplanes and flown to two centralized “correlators,” located in Massachusetts and Germany.
“The fastest way to do that is not over the internet, it’s actually to put them on planes,” said Marrone. “There’s no internet that can compete with 5 petabytes of data on a plane.”
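Marrone’s point is easy to check with rough numbers. Assuming, for illustration, a dedicated 1 or 10 gigabit-per-second link (our numbers, not the EHT’s), moving 5 petabytes over the network would take anywhere from weeks to more than a year:

```python
# How long would 5 petabytes take over a fast network connection?
# The link speeds are illustrative assumptions; the flight is not.
total_bits = 5 * 10**15 * 8

for name, bits_per_second in [("1 Gbps", 1e9), ("10 Gbps", 1e10)]:
    days = total_bits / bits_per_second / 86_400
    print(f"{name}: about {days:,.0f} days of continuous transfer")

# 1 Gbps:  about 463 days
# 10 Gbps: about 46 days
# A long-haul flight with half a ton of drives in the hold: under a day.
```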
Adding to this challenge, the scientists had to wait until the Antarctic summer to fly the hard drives out from the South Pole Telescope, since the data were recorded just as the station’s winter was setting in and no planes could reach it until then.
The correlators then began the job of syncing up the data from all eight telescopes. These supercomputers took the raw observational data and used the atomic clock timestamps to line the recordings up with one another, reconstructing a seamless record of the wavefront of light from the black hole as it swept across Earth.
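The heart of that alignment step is cross-correlation: slide one recording against another until they match best, and the offset that produces the best match is the delay between the stations. Here’s a toy sketch of the idea in Python with NumPy; it is nothing like the real EHT correlator software, just the underlying principle demonstrated on fake data:

```python
import numpy as np

# Toy illustration of the correlation step (not the EHT pipeline): the same
# noisy "wavefront" is recorded at two stations, but one recording lags the
# other by an unknown number of samples. Cross-correlating the recordings
# reveals that delay, which is what lets them be lined up.
rng = np.random.default_rng(42)

signal = rng.standard_normal(4096)        # the incoming wavefront
true_delay = 37                           # samples of delay at station B

station_a = signal
station_b = np.roll(signal, true_delay) + 0.1 * rng.standard_normal(4096)

corr = np.correlate(station_b, station_a, mode="full")
lags = np.arange(-len(signal) + 1, len(signal))
estimated_delay = lags[np.argmax(corr)]

print(estimated_delay)                    # 37 -- the delay we injected
```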
Exchanging Tools With Silicon Valley
Chi-Kwan Chan, Ph.D., a computational astrophysicist at the University of Arizona who handled computation for the M87 imaging project, tells Inverse that once the correlators had cleaned the data, the task got a lot more granular.
“After that step, people usually just use a workstation and do the computations on them,” he says. “But the contribution I made was I brought cloud computing technology into the collaboration so we could launch many powerful virtual machines in the cloud to do data analysis and accelerate it.”
Chan and his team developed this cloud-based software, which helped the EHT team clean the data even further to create the final composite image, a file of just a few hundred kilobytes. He’s hopeful that the tech industry will be able to use the software for network architecture in the future.
“In that sense we’re also giving back to society,” he says.
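Chan’s cloud approach boils down to a familiar pattern: the analysis jobs are independent, so instead of queuing them up on a single workstation, you fan them out across many machines at once. The sketch below is a generic illustration of that pattern, using Python’s built-in process pool as a stand-in for cloud virtual machines; the workload function is hypothetical and has nothing to do with the actual EHT analysis code:

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np


def analyze_scan(seed: int) -> float:
    """Hypothetical stand-in for one independent chunk of data analysis."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal(1_000_000)
    return float(np.mean(np.abs(data)))


if __name__ == "__main__":
    # Because each chunk is independent, the jobs can run side by side:
    # on local CPU cores here, or on a fleet of cloud machines in practice.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(analyze_scan, range(32)))
    print(f"processed {len(results)} chunks")
```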
Notably, the University of Arizona’s computers that Chan and his team have used to run black hole simulations are built around graphics processing units, which are computationally very powerful. These are the same kind of graphics cards that have been in extremely high demand among cryptocurrency miners. So just as the black hole project created software that will persist for others to use, it also took cues from a vastly different corner of computing, all in the name of discovery.
Ironically, Chan’s team used these powerful GPUs to simulate so many black holes in advance of the M87 observation that they already knew what to expect from the real thing.
“We created a huge library of black hole images,” he says. “Because we saw so many of them and saw so many possibilities, we were not surprised when we saw the real one.”