Data Sources: Do You Know Where Your Information Is Coming From?

Oct. 1, 2018
Data is becoming critical to our lives and economy. There are many different data points and creation mechanisms, and it’s important to understand why they matter. Bill Kleyman is here to help.

People are becoming walking, talking, data-generating engines. As we continue to immerse ourselves in a data-driven world, all this data helps us in our everyday lives and makes our businesses more competitive. Once siloed, remote, inaccessible, and mostly underutilized, data has become essential to our society and our individual lives. Data is becoming life-critical.

That’s why it’s really important to understand where this data is coming from, and what we in the data center world are doing with it.

All Data is Not Created Equal

There are many different data points and creation mechanisms. Let’s start with embedded data. IDC reports that by 2025, embedded data will constitute nearly 20 percent of all data created. Embedded data could come from a number of origins, including wearable devices, IoT, cars, building automation, machine tools, RFID readers, chip cards, and so much more.

According to the report, the embedding of computing power in a large number of endpoint devices has become a key contributor to data growth in our present era. Today, the number of embedded system devices feeding into data centers is less than one per person globally, and over the next 10 years, that number will increase to more than four per person. While individual files from embedded systems tend to be small compared with entertainment and other consumer content, the number of files generated will be very large, measuring in the quintillions per year. All these embedded devices creating data fuel the growth and value of Big Data applications and metadata.

For example, let’s say you’re a content provider. You’d want to analyze the metadata of this source to better understand usage, latency, satisfaction, and even where to introduce new services.
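As a rough sketch of that idea, here’s what aggregating request-log metadata per region might look like. The log fields (`region`, `latency_ms`, `title`) and the records themselves are hypothetical, invented purely for illustration:

```python
from statistics import mean

# Hypothetical request-log metadata a content provider might collect.
logs = [
    {"region": "us-east", "latency_ms": 42, "title": "show-a"},
    {"region": "us-east", "latency_ms": 58, "title": "show-b"},
    {"region": "eu-west", "latency_ms": 95, "title": "show-a"},
]

def summarize(records):
    """Group records by region and report request count and mean latency."""
    by_region = {}
    for r in records:
        by_region.setdefault(r["region"], []).append(r["latency_ms"])
    return {
        region: {"requests": len(vals), "avg_latency_ms": mean(vals)}
        for region, vals in by_region.items()
    }

summary = summarize(logs)
```

Even a simple roll-up like this surfaces where users are concentrated and where latency is worst — exactly the signals that tell you where to introduce new services.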

This also means, as a data center operator, you have to be very careful about storage and data processing. If this is your line of business, be sure to plan accordingly.

Productivity Data

Next, we have productivity data. Productivity data comes from a set of traditional computing platforms such as PCs, servers, phones, and tablets. The volume of this information will also continue to rise as we become more mobile. By 2025, connected users will number 75 percent of the world’s population, including previously unconnected groups like young children, the elderly, and people in emerging markets.

The growth of the real-time data being generated from all of these data points will cause a shift in the type of digital storage needed in the future. It will also heighten the focus on low-latency responsiveness from enterprise edge computing storage solutions and offerings.

IDC estimates that the percentage of data in the datasphere that is processed, stored, or delivered by public cloud data centers will nearly double to 26 percent from 2016 to 2025. Such clouds will process, store, or deliver not just IT services but also entertainment, grid telemetry, and telecommunications.

Understanding the Impact on the Data Center

Data ingestion has been a really hot topic for a lot of data center providers. The reality is that you can create some powerful services around data ingestion and even processing. That said, if you’re looking to build data analytics services focused on data ingestion, consider the following:

  • Consider the sources and the location. The source of the data is certainly important, but so is its location. Understanding where your users are will be critical to how you ingest and process information. If you’re investing in a data processing platform, know both the source and the location of the incoming data.
  • Latency can mean everything. Data usage is being analyzed by its level of criticality, as indicated by factors such as the need for real-time processing and low latency, the ad hoc nature of usage, and the severity of consequences should the data become unavailable (e.g., a medical application is considered more consequential than a streaming TV program). IDC estimates that by 2025, nearly 20 percent of the data in the datasphere will be critical to our lives and 10 percent of that will be hypercritical. What is the latency of your use case? Is it a streaming service or a critical healthcare application? Bridging from the previous point, it’s also important to plan your network architecture around your ingestion points, which are shaped by the location and the source of the data. Are you ingesting small bits of data or large files? Similarly, how far does that data need to travel? Smart network systems will allow you to prioritize data and ingest it appropriately.
  • Edge systems are designed for this. Edge architectures are really powerful solutions built for processing large amounts of data as close to the source as possible. Software-defined technologies can allow for the rapid creation and migration of edge storage environments wherein the intersection of live data and Big Data analytics can actually occur. This helps you meet the needs of local and mobile analytic workloads. Edge will help you create smaller data center footprints and remove major challenges around latency. Working with edge for data ingestion as well as processing can make a lot of sense.
  • Leverage cloud for even more ingesting and processing solutions. If you want to create a truly powerful data ingestion and processing platform, working with the cloud could make a lot of sense. AWS Direct Connect gives you the ability to integrate with a variety of APN Technology and Consulting Partners. There are also lots of great providers to choose from, based on your requirements and your region. Examples include CoreSite, Equinix, Lightower, CenturyLink, CyrusOne, Datapipe, XO, Level 3, and many others. This is similar to Azure ExpressRoute where you can work with a variety of partners including Aryaka, CenturyLink, CoreSite, Equinix, Level 3, NTT Communications, Telefonica, Telus, Verizon, and several others. So, you could have your ingestion points at one of these data center sites while the processing occurs in the cloud. Don’t be afraid to design around hybrid to give yourself as much flexibility as possible.
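To make the prioritization idea above concrete, here is a minimal sketch of a criticality-aware ingest queue. The tier names (`hypercritical`, `critical`, `normal`) echo the IDC framing discussed earlier, but the class, its API, and the sample payloads are all hypothetical, not any particular product’s interface:

```python
import heapq

# Hypothetical criticality tiers; a lower number means the data is ingested first.
PRIORITY = {"hypercritical": 0, "critical": 1, "normal": 2}

class IngestQueue:
    """Order incoming payloads so latency-sensitive data is processed first."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker preserves FIFO order within a tier

    def push(self, payload, tier="normal"):
        heapq.heappush(self._heap, (PRIORITY[tier], self._seq, payload))
        self._seq += 1

    def pop(self):
        # Returns the highest-priority payload still in the queue.
        return heapq.heappop(self._heap)[2]

q = IngestQueue()
q.push("nightly-archive", tier="normal")
q.push("patient-telemetry", tier="hypercritical")
q.push("stream-chunk", tier="critical")
```

With this ordering, the hypothetical patient-telemetry payload is processed before the streaming chunk or the archive job — the same trade-off a smart network system makes when it prioritizes a medical application over a TV stream.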

Remember, cloud partners aren’t the only ones who will benefit from the influx of data in our world. Data center vendors and providers are actively going after a growing market that wants to leverage data to make better business decisions. Investment in edge continues to increase as we try to reduce latency and support new customer initiatives.

As you design your own data ingestion initiatives, be sure to take all of these points into consideration. Data is the lifeblood of business, and you’re here to help bring value to this data.

Explore the evolving world of edge computing further through Data Center Frontier’s special report series and ongoing coverage.

About the Author

Bill Kleyman

Bill Kleyman is a veteran, enthusiastic technologist with experience in data center design, management and deployment. Bill is currently a freelance analyst, speaker, and author for some of our industry's leading publications.
