07 Dec 2024

Why You Need a Plan for Ongoing Unstructured Data Mobility

In the hybrid-cloud, AI-enhanced enterprise, unstructured data is growing exponentially everywhere. In this article, I discuss why unstructured data mobility is not a one-time event but an ongoing opportunity to right-place data as organizational needs change.

Many enterprise IT leaders store petabytes of data spread across silos in their data centers, at edge locations and in the cloud. Most of this data is unstructured and stored as files of many types and sizes, such as documents, images, video, genomics, IoT and research data.

Unstructured data is expensive to store, protect and manage due to its sheer volume and pace of growth. IT organizations are realizing that since 80% of unstructured data typically goes cold within months of creation, treating cold data differently can cut significant costs without compromising user access. New threats such as ransomware add to the urgency of managing unstructured data efficiently.

The result is that unstructured data is increasingly in motion throughout its lifecycle: to less expensive storage and backup tiers, to data lakes, and to analytics applications. You need a strategy to manage that ongoing mobility.

Why Is Data Mobility So Critical?

First, let’s look at what makes data mobility essential for business success.

  1. Data growth: Unstructured data volumes can be enormous, often made up of many small files, and they grow exponentially year over year. The days when you could buy one or two storage appliances, set them in the data center and stop worrying are over. Enterprises regularly need to add capacity to their NAS, SAN or other storage devices, and supply chain disruptions since the pandemic have slowed this process. It's therefore imperative to take a nuanced approach to data rather than treat it all the same: treating it all alike is unsustainable, too expensive and wasteful.
  2. Cutting overall costs: Most enterprises spend at least 30% of their IT budget on data storage, according to the 2022 State of Unstructured Data Management Report. Storing all your data on Tier 1 storage drives up not just the primary storage bill but also the cost of backups and disaster recovery. Backups are often the larger part of the bill, since operational data typically has three copies: every terabyte tiered off primary storage also removes roughly three terabytes of backup data. Data mobility that shrinks the operational data set can therefore lower overall storage costs dramatically.
  3. Data lifecycles: Most organizations keep all or most of their data indefinitely, but as data ages, its value changes. Some data becomes “cold”, infrequently accessed or no longer needed after 30 days, yet must be retained for regulatory or compliance reasons; some data should be deleted; and some may be required later for research or analytics. A seemingly easy answer is to move that data to secure storage in the cloud, but choosing the wrong cloud storage class is risky: cloud file storage is often 10x to 50x more expensive than cheaper cloud tiers. Ensuring easy mobility for data as it ages, and understanding the best options for different data segments, is paramount (see the lifecycle-rule sketch after this list).
  4. Technology refresh: Storage architectures typically become obsolete every three to five years, and new options are always on the horizon; cloud vendors usually introduce new price-performance options every year. Taking advantage of the latest options can significantly improve price performance, availability and data usability, yet doing so requires data migrations and data lifecycle management across vendors and storage architectures.
  5. Data reuse: Another reason unstructured data mobility is imperative is the growing adoption of AI and machine learning. Once data is no longer in active use, it has the potential for a second or third life in large-scale data analytics programs. You might migrate some data to a low-cost cloud tier for archival purposes, but IT or other departments with the correct permissions should be able to quickly and easily discover it later and move it to a cloud data lake or AI tool as new use cases arise.
  6. New business strategies: When an organization is undergoing a merger, acquisition, or divestiture, it must meet new governance and compliance requirements for data. Similarly, the enterprise may embark on a new cloud strategy or adopt a new data architecture. In all these examples, data mobility needs will change. You need a flexible unstructured data management architecture to meet new requirements as they come up so you can find, segment and move data to new locations without undue hassle or cost.
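
To make the lifecycle point concrete, here is a minimal sketch of how an aging-out rule might be automated in the cloud, using boto3, the AWS SDK for Python. The bucket name, prefix and day thresholds are illustrative assumptions, not figures from this article:

```python
import boto3

# Transition data that has gone cold (here, after 30 days) to cheaper
# storage classes automatically. Bucket, prefix and thresholds are
# hypothetical placeholders.
s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",            # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-aging-project-data",
                "Filter": {"Prefix": "projects/"},  # hypothetical prefix
                "Status": "Enabled",
                "Transitions": [
                    # Infrequent Access once data goes cold ...
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    # ... then deep archive for long-term retention.
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```

Equivalent lifecycle rules exist in Azure Blob Storage and Google Cloud Storage; the point is that the policy, not a manual migration, does the ongoing work.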

What New Requirements Does Ongoing Data Mobility Bring? 

Ad hoc approaches to data mobility no longer work in this complex data environment, where requirements and needs are in constant flux. IT leaders need a systematic way to manage data movement, meet new requirements, cut costs, stay sustainable, and support new projects for unstructured data analytics. Here's what's involved:

  1. Visibility of data: The ability to look at data across storage silos, to see trends, patterns and anomalies and to do cost modeling, is critical to making intelligent decisions. Similarly, a unified way to search for data across silos is essential for finding specific data sets and moving them to new locations as needed.
  2. Analysis of data: IT organizations need to understand data across various characteristics to make the right decisions for its management. Data age and time of last access, file size and type, top data owners, costs, data volume and data growth rates are some of the top metrics to track (see the scan-and-tier sketch after this list).
  3. Cold data tiering: Segment and tier inactive or cold data before you migrate. Too often, organizations send large data sets to the cloud to save money but miss out on significant savings because they are lifting and shifting data from one expensive storage location to another. Move rarely accessed data to low-cost object storage such as Amazon S3 Glacier or Azure Blob Storage, and keep hot or warm data on a high-performing tier until it ages out according to your policies, as in the sketch after this list.
  4. Understand cloud storage classes: Cloud storage options are constantly changing and maturing, and the choice can be overwhelming. Partner with a cloud data storage expert to help guide these decisions so that you can efficiently map the right data sets to the right cloud storage service and create a plan for cloud data lifecycle management.
  5. Departmental collaboration: Today, IT organizations are focused on managing data, not just storage. To that end, working directly with data owners on strategies is essential to avoid conflicts and to ensure that data mobility and management decisions are sound.
  6. Policy automation: In large-scale data environments, especially at a large enterprise with many stakeholders, shares and directories, you can't support data lifecycle management manually. Use an unstructured data management solution to easily create and automate policies that copy, tier, migrate and confine or delete distinct data sets (a toy policy sketch follows below). Ultimately, policy automation results in more savings, better compliance and the assurance that data is always living in the right place at the right time.
  7. Native access to data: Data is a corporate asset. Wherever you migrate or tier data, you must ensure that it remains easily accessible and usable in its target destination. Native access means that if you move data to a new storage location, such as object storage in the cloud, you can access it there and move it somewhere else without going back through your file storage layer, which incurs licensing fees and requires adequate capacity. Cloud-native access is also required for using cloud-based AI and ML services; otherwise, your data is locked in and unavailable for additional value-added activities.
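
To illustrate items 2 and 3 above, here is a minimal sketch, assuming a hypothetical NAS mount and archive bucket, that first profiles files by last-access age and then tiers only the cold ones to low-cost object storage with boto3. The paths, bucket and 180-day threshold are placeholder assumptions; a production deployment would use a purpose-built data management tool rather than a one-off script:

```python
import time
from collections import Counter
from pathlib import Path

import boto3

SCAN_ROOT = Path("/mnt/nas/projects")       # hypothetical NAS mount
ARCHIVE_BUCKET = "example-archive-bucket"   # hypothetical bucket
COLD_AFTER_DAYS = 180                       # illustrative threshold
NOW = time.time()

def age_days(path: Path) -> float:
    # Last-access age in days; st_atime is unreliable on filesystems
    # mounted with noatime, where st_mtime is a common fallback.
    return (NOW - path.stat().st_atime) / 86400

# Item 2: profile the share by last-access age.
profile = Counter()
for p in SCAN_ROOT.rglob("*"):
    if p.is_file():
        a = age_days(p)
        profile["hot (<30d)" if a < 30
                else "warm (<180d)" if a < COLD_AFTER_DAYS
                else "cold"] += 1
print(profile)

# Item 3: tier only the cold files to an archival storage class.
s3 = boto3.client("s3")
for p in SCAN_ROOT.rglob("*"):
    if p.is_file() and age_days(p) >= COLD_AFTER_DAYS:
        s3.upload_file(
            str(p),
            ARCHIVE_BUCKET,
            p.relative_to(SCAN_ROOT.parent).as_posix(),
            # GLACIER stands in for any low-cost object tier.
            ExtraArgs={"StorageClass": "GLACIER"},
        )
```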

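To illustrate item 6, here is a toy sketch of declarative policy automation; it is not any vendor's policy engine, and the rules, suffixes and actions are invented for illustration. The sweep at the bottom is a dry run that only reports what each policy would do:

```python
import time
from pathlib import Path
from typing import Optional

# Invented declarative policies: match files by suffix and age, then
# name an action. Real products express these in their own policy language.
POLICIES = [
    {"suffix": ".log", "older_than_days": 90,  "action": "delete"},
    {"suffix": ".bam", "older_than_days": 180, "action": "tier"},
    {"suffix": ".mp4", "older_than_days": 365, "action": "archive"},
]

def decide(path: Path, now: Optional[float] = None) -> Optional[str]:
    """Return the first matching policy action for a file, if any."""
    now = now or time.time()
    age_days = (now - path.stat().st_mtime) / 86400
    for rule in POLICIES:
        if path.suffix == rule["suffix"] and age_days > rule["older_than_days"]:
            return rule["action"]
    return None

# Dry run over a hypothetical share: report, don't act.
for f in Path("/mnt/nas/projects").rglob("*"):
    if f.is_file() and (action := decide(f)):
        print(f"{action}: {f}")
```
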
Unstructured data is both a liability and an asset. Managing it properly with a plan for long-term data mobility should be one of the top initiatives for enterprise IT today. By doing so, you can get more value from massive unstructured data volumes, be as cost-effective as possible and enable new ways of finding and using data to serve the broader organization better.