Rolling Zettabytes: Quantifying the Data Impact of Connected Cars

Jan. 21, 2020
Autonomous cars could generate up to 5 TB of data an hour. Edge Computing World provided a closer look at how self-driving cars and the edge network will work together, and the scenarios that will require data and storage.

MOUNTAIN VIEW, Calif. – Self-driving cars won’t be rolling down your street tomorrow. But when they do, some of them will be generating huge amounts of data. How can we prepare for the network and storage requirements to support this autonomous future?

The first step is understanding the requirements. The Automotive Edge Computing Consortium (AECC) is working to help stakeholders understand the infrastructure requirements for connected cars. At Edge Computing World, AECC board member Vish Nandlall outlined the group’s findings on the volume of data created by autonomous cars, and the challenges they will create.

“Autonomous cars are one of the poster children for edge computing,” said Nandlall in his talk, titled “Driving the Zettabyte Edge.”

“Automotive data volume will drive the edge, and we’re going to hit zettascale volumes,” he said. “We’re really starting to challenge the limits of the cloud technologies we’ve been using. It’s a challenge to the infrastructure and cloud communities, and a challenge to the automotive community.”

Nandlall believes that the IT requirements of self-driving cars have the potential to transform automakers into hyperscale computing companies.

“By 2028, a single vendor will operate at zettascale,” said Nandlall, who is Vice President of Emerging Technology and Ecosystems at Dell Technologies. “These automotive OEMs will become the landlords of major amounts of data real estate.”

How Much Data?

There’s been a lot of hype and big numbers around the data needs of autonomous cars. Nandlall helped quantify the data generation, which will vary widely based upon the car’s activity. The 5 TB an hour rate is for a “vehicle under task,” one that is actively fetching and sending data. To understand this, let’s step back and review the data aspects of autonomous cars.

Autonomous vehicles are the equivalent of supercomputers rolling down the highway, generating and transmitting a mind-boggling amount of data. Most analysts expect the data requirements of self-driving vehicles will be split, with essential functions managed by powerful on-board computers, and data offloaded to external facilities for additional data-crunching and storage.

Self-driving cars rely upon computer vision using a series of video cameras, radar and LiDAR laser light detection sensors, which allow the car to perceive the world around it. Neural networks (AI) allow powerful on-board computers to recognize objects – including other vehicles, pedestrians, stop signs and traffic lights – and understand the distance between objects.

These technologies create an enormous volume of data. At 30 frames per second, video cameras create data at a rate that ranges from about 300 GB an hour for 720p video all the way up to 5.4 TB an hour for 4K video.
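As a rough check on those figures, the sketch below computes the raw output of a single camera. It assumes uncompressed video at 3 bytes per pixel (24-bit color) and decimal units; neither assumption is stated in the talk, and the function name is hypothetical. Under these assumptions the 720p result lands close to the 300 GB an hour cited above, while 4K comes out near 2.7 TB an hour, so the 5.4 TB figure presumably rests on different inputs that the source does not spell out.

```python
GB = 10**9
TB = 10**12


def raw_video_bytes_per_hour(width, height, fps=30, bytes_per_pixel=3):
    """Uncompressed data volume from one camera over one hour, in bytes.

    bytes_per_pixel=3 (24-bit color) is an assumption, not a figure from the AECC.
    """
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame * fps * 3600


hd_720p = raw_video_bytes_per_hour(1280, 720)
uhd_4k = raw_video_bytes_per_hour(3840, 2160)

print(f"720p: {hd_720p / GB:.1f} GB per hour")  # 298.6 GB per hour
print(f"4K:   {uhd_4k / TB:.2f} TB per hour")   # 2.69 TB per hour
```

Real camera pipelines typically compress video, so actual volumes depend heavily on the codec and sensor configuration.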

Not all of that data is useful, and not all of it will need to be transmitted off the vehicle to a local network or the cloud. The AECC estimates that about 30 percent of video will be uploaded to help refine models and train algorithms, and about 2 percent of video will be uploaded to retain for audit trails for accidents or traffic incidents.

The bottom line: Autonomous cars could require data transfer offload ranging from 383 GB an hour to 5.17 TB an hour. “We’re going to have to do edge data offload to manage this,” said Nandlall.

A diagram of autonomous vehicle data generation, from the Automotive Edge Computing Consortium.

On-Board Capabilities vs. Augmented Safety

There’s been a lot of confusion about the need for data transfer off the car, and the role of AI models and data analysts in the operation of self-driving cars. Nandlall says these vehicles will encounter a range of scenarios, with only a subset requiring the fetching or sending of data.

Nandlall says there’s a misconception that fast network connections (like 5G wireless) are essential to help self-driving cars operate safely. That’s not the case, he said.

“The car will have sufficient safety,” said Nandlall. “All of the on-board telemetry should be able to make decisions without the surrounding network. Inferencing (AI-guided decision) for all central functions will occur on the car.”

But there are efforts to extend the capabilities of on-board sensors and cameras using external data. “We want to increase the safety threshold, and that’s where augmented safety comes into play,” he added. “The surrounding network will provide augmented safety information.”

One example is high-definition mapping tools, which extend a self-driving vehicle’s awareness of its surroundings and the road ahead. HD maps are being developed by location specialists like TomTom and HERE Technologies, which collect data from car cameras to assemble maps of major roadways, identifying lane boundaries, curbs, guardrails and medians with centimeter-level accuracy.

This allows autonomous vehicles to extend their horizon and “see around corners” to understand the road ahead. This has immediate applications in snow and fog, when cameras may experience difficulty visualizing road conditions.

Over time, the industry’s vision is for these HD maps to collect data in real time, creating “self-healing” HD maps that are constantly updated to reflect current conditions. Nandlall says HD maps have the long-term potential to warn drivers of potholes and icy conditions. To be useful, these live HD maps must be updated over the air.

Waymo’s fully self-driving Chrysler Pacifica Hybrid minivan on public roads in Chandler, Arizona. (Photo: Waymo)

“It’s not going to be just batch work,” Nandlall said. “This is going to be streaming. How do we manage that data?”

The long-term vision is to add more intelligence by creating networks of connected vehicles that “talk” to one another using vehicle-to-vehicle (V2V) communications over low-latency wireless connections, increasing safety and reliability by sharing autonomous pathfinding data. The autonomous vehicle roadmap also includes vehicle-to-infrastructure (V2I) communications that enable robot cars to connect with traffic lights and parking meters.

V2V and V2I technologies are crucial to hopes that autonomous vehicles can dramatically reduce traffic congestion in major cities, integrating location-aware cars, traffic lights and toll gates to enable seamless merging at speed, easing the backups seen at intersections and toll plazas.

Infrastructure Requirements and Funding

To create this AI-enabled future, driverless vehicles will require low-latency wireless connections to fiber networks and data centers. This connectivity, storage and data-crunching infrastructure may need to extend to almost everywhere cars can drive.

Given the data generation rates the AECC is seeing, self-driving cars will create demand for additional infrastructure, as well as new ways of sorting out data based on its value.

“There is a ton of data coming into our data centers,” said Nandlall. “We need more sophisticated methods of managing all this data. We can’t afford to consolidate everything in a data lake. We need more dynamic ways of promoting and demoting data. We have to shift the way we view the world to a data-centric view.”

The AECC was formed in 2018 to develop strategies for moving big data between vehicles, edge data centers and the cloud. AECC founding members include AT&T, Intel, Ericsson, NTT, KDDI and Sumitomo Electric. The AECC estimates that data traffic from autonomous vehicles could surpass 10 exabytes per month by 2025 – about 1,000 times the present volume.

Those are eye-popping numbers. One of the audience questions during Nandlall’s session was about storage, and whether current hardware would prove adequate. The largest current commercial hard disk drive (HDD) holds 16 terabytes of data.
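To put those numbers side by side, here is a back-of-the-envelope sketch using only the figures quoted above: 10 exabytes per month, a 1,000-fold increase, and a 16 TB drive. It assumes decimal units and that every byte of traffic is stored once, with no compression, redundancy or expiry, which is a simplification rather than anything the AECC claims. The variable names are illustrative.

```python
PB = 10**15
EB = 10**18
TB = 10**12

projected_monthly_bytes = 10 * EB  # AECC estimate for 2025
growth_factor = 1_000              # "about 1,000 times the present volume"
drive_capacity_bytes = 16 * TB     # largest commercial HDD cited in the article

# Present-day traffic implied by the 1,000x claim.
implied_present_bytes = projected_monthly_bytes // growth_factor

# Ceiling division: whole drives needed to hold one month of projected traffic.
drives_per_month = -(-projected_monthly_bytes // drive_capacity_bytes)

print(f"Implied present volume: {implied_present_bytes // PB} PB per month")  # 10 PB
print(f"16 TB drives per month: {drives_per_month:,}")                        # 625,000
```

Even this simplified count of 625,000 drives for a single month of traffic shows why the audience question focused on storage hardware.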

Nandlall agreed that storage management will become an issue. He said the workable strategies would feature tiers grouped by storage performance and storage media, with data mobility and management policies, and perhaps new thinking about physical storage media.

In this data-driven future, the business models also lack clarity, particularly when it comes to financing this additional infrastructure.

“Who will build it?” said Nandlall. “I believe many telco players are interested in building this out, but there’s a lot of capital expense involved, and the return on that investment is not well understood.”

About the Author

Rich Miller

I write about the places where the Internet lives, telling the story of data centers and the people who build them. I founded Data Center Knowledge, the data center industry's leading news site. Now I'm exploring the future of cloud computing at Data Center Frontier.
