Data Sources: Do You Know Where Your Information Is Coming From?

Oct. 1, 2018
Data is becoming critical to our lives and economy. There are many different data points and creation mechanisms, and it’s important to understand why they matter. Bill Kleyman is here to help.

People are becoming walking, talking, data-generating engines. As we continue to immerse ourselves in a data-driven world, all this data helps us in our everyday lives and makes our businesses more competitive. Once siloed, remote, inaccessible, and mostly underutilized, data has become essential to our society and our individual lives. Data is becoming life-critical.

That’s why it’s really important to understand where this data is coming from, and what we in the data center world are doing with it.

All Data is Not Created Equal

There are many different data points and creation mechanisms. Let’s start with embedded data. IDC reports that by 2025, embedded data will constitute nearly 20 percent of all data created. Embedded data could come from a number of origins, including wearable devices, IoT, cars, building automation, machine tools, RFID readers, chip cards, and so much more.

According to the report, the embedding of computing power in a large number of endpoint devices has become a key contributor to data growth in our present era. Today, the number of embedded system devices feeding into data centers is less than one per person globally, and over the next 10 years, that number will increase to more than four per person. While data from embedded systems tends to be very efficient compared with data from entertainment and other consumer usage, the number of files generated will be very large, measuring in the quintillions per year. All these embedded devices creating data fuel the growth and value of Big Data applications and metadata.

For example, let’s say you’re a content provider. You’d want to analyze the metadata from these sources to better understand usage, latency, satisfaction, and even where to introduce new services.

This also means, as a data center operator, you have to be very careful about storage and data processing. If this is your line of business, be sure to plan accordingly.

Productivity Data

Next, we have productivity data. Productivity data comes from a set of traditional computing platforms such as PCs, servers, phones, and tablets. This information will also continue to rise as we become much more mobile. By 2025, connected users will number 75% of the world’s population, including previously unconnected groups like young children, the elderly, and people in emerging markets.

The growth of the real-time data being generated from all of these data points will cause a shift in the type of digital storage needed in the future. It will also heighten the focus on low-latency responsiveness from enterprise edge computing storage solutions and offerings.

IDC estimates that the percentage of data in the datasphere that is processed, stored, or delivered by public cloud data centers will nearly double to 26% from 2016 to 2025. Such clouds will process, store, or deliver not just IT services but also entertainment, grid telemetry, and telecommunications.

Understanding the Impact on the Data Center

Data ingestion has been a really hot topic for a lot of data center providers. The reality is that you can create some really powerful services around data ingestion and even processing. That said, if you’re looking to create some fun data analytics services focused on data ingestion, consider the following:

  • Consider the sources and the location. You’re not only concerned about the source of the data, although that’s certainly important. Understanding where your users are will be critical in how you ingest and process information. If you’re looking at investing in a data processing platform, know the source and the location of the incoming data.
  • Latency can mean everything. Data usage is being analyzed by its level of criticality as indicated by factors such as the need for real-time processing and low latency, the ad hoc nature of usage, and the severity of consequences should the data become unavailable (e.g., a medical application is considered to be more consequential than a streaming TV program). IDC estimates that by 2025, nearly 20 percent of the data in the datasphere will be critical to our lives and 10 percent of that will be hypercritical. What is the latency of your use case? Is it a streaming service or a critical healthcare application? Bridging from the previous point, it’s also important to plan your network architecture around data ingestion points which revolve around location and the source. Are you ingesting small bits of data or large files? Similarly, how far does that data need to travel? Smart network systems will allow you to prioritize data and ingest it appropriately.
  • Edge systems are designed for this. Edge architectures are really powerful solutions built for processing large amounts of data as close to the source as possible. Software-defined technologies can allow for the rapid creation and migration of edge storage environments wherein the intersection of live data and Big Data analytics can actually occur. This helps you meet the needs of local and mobile analytic workloads. Edge will help you create smaller data center footprints and remove major challenges around latency. Working with edge for data ingestion as well as processing can make a lot of sense.
  • Leverage cloud for even more ingesting and processing solutions. If you want to create a truly powerful data ingestion and processing platform, working with the cloud could make a lot of sense. AWS Direct Connect gives you the ability to integrate with a variety of APN Technology and Consulting Partners. There are also lots of great providers to choose from, based on your requirements and your region. Examples include CoreSite, Equinix, Lightower, CenturyLink, CyrusOne, Datapipe, XO, Level 3, and many others. This is similar to Azure ExpressRoute where you can work with a variety of partners including Aryaka, CenturyLink, CoreSite, Equinix, Level 3, NTT Communications, Telefonica, Telus, Verizon, and several others. So, you could have your ingestion points at one of these data center sites while the processing occurs in the cloud. Don’t be afraid to design around hybrid to give yourself as much flexibility as possible.
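To make the latency point above a bit more concrete, here is a minimal sketch of what "prioritize data and ingest it appropriately" could look like in software. It is only an illustration under stated assumptions: the three criticality tiers, the `IngestQueue` class, and the source names are hypothetical and do not come from the IDC report or any specific product. The idea is simply that records tagged as more critical are drained first, and records in the same tier keep their arrival order.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Hypothetical criticality tiers (assumption); a lower number is ingested first.
CRITICALITY = {"hypercritical": 0, "critical": 1, "standard": 2}


@dataclass(order=True)
class Record:
    priority: int                        # tier rank, compared first
    seq: int                             # arrival counter, breaks ties FIFO
    source: str = field(compare=False)
    payload: bytes = field(compare=False)


class IngestQueue:
    """Illustrative in-memory queue that drains the most critical data first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, source, payload, criticality="standard"):
        # Unknown tier names raise KeyError rather than being silently accepted.
        rec = Record(CRITICALITY[criticality], next(self._counter), source, payload)
        heapq.heappush(self._heap, rec)

    def drain(self):
        # Yield records in priority order until the queue is empty.
        while self._heap:
            yield heapq.heappop(self._heap)


queue = IngestQueue()
queue.submit("streaming-tv", b"frame", "standard")
queue.submit("patient-monitor", b"ecg", "hypercritical")
queue.submit("grid-telemetry", b"volts", "critical")
queue.submit("streaming-tv", b"frame-2", "standard")

order = [(rec.source, rec.payload) for rec in queue.drain()]
print(order)
```

In a real deployment this kind of ordering would more likely be handled by network quality-of-service policies or a message broker than by application code; the sketch only shows the decision being made.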

Remember, cloud partners aren’t the only ones who will benefit from the influx of data in our world. Data center vendors and providers are actively going after a growing market that’s trying to better leverage data to make better business decisions. Investment in edge continues to increase as we try to reduce latency and support new customer initiatives.

As you design your own data ingestion initiatives, be sure to take into consideration all of the points we’ve discussed. Data is the lifeblood of business, and you’re here to help bring value to this data.

Explore the evolving world of edge computing further through Data Center Frontier’s special report series and ongoing coverage.

About the Author

Bill Kleyman

Bill Kleyman is a veteran, enthusiastic technologist with experience in data center design, management and deployment. Bill is currently a freelance analyst, speaker, and author for some of our industry's leading publications.
