Imagine if you could remember everything you’ve ever read, seen, said or heard, at a moment’s notice. How much would that change your day-to-day life? Anyone could ask you any question about anything you’ve experienced, and in a flash, you’d have the answer.
Countless hours could be saved on studying for exams, or preparing for big meetings. Having total recall of whatever you’ve learned, always at the ready, would turn you into a real-time supercomputing being. You’d win “Jeopardy” every time, and never lose Trivial Pursuit.
If you had that kind of capacity in your mind, you’d have a corporeal version of what Hammerspace calls a global namespace. You’d be able to do in the real world what they do in the realm of computing. The significance of this innovation cannot be overstated.
Since the earliest days of DM Radio, way back in 2008, yours truly would ask experts from leading data vendors: “Do you have a strategic view of which data sets are moving where and when?” Reason being, such a comprehensive view would be incredibly valuable.
Firstly, you’d know where all the redundancies are: which fields and whole data sets you are moving multiple times each day when a single move would suffice. You’d thus be able to optimize license use, compute costs and, most importantly, the valuable time of personnel.
You would also save your operations teams incalculable amounts of effort and energy. Data engineers often pull their hair out because the demands of business users exceed what the average human can do in a day. Sure, it’s great to be a hero, but not every day!
You’d also be able to figure out which data sets provide real value to business processes. This is of paramount importance when trying to figure out your ideal information architecture, knowing where to store which data, at what cost and latency.
You’d also be able to finally dissolve the age-old problem of data copies! Yes, you need a backup or two. Even Hadoop defaulted to just three copies of your data. Do you really need seven? Or nine? Research suggests the actual number is 11 copies of enterprise data files!
To quote Nigel Tufnel of the cult movie classic This Is Spinal Tap: “These go to eleven.”
How It Works:
Hammerspace’s Global Data Platform software manages data across storage systems from any vendor, on-prem and in any cloud. It works as follows:
• Metadata-driven: Hammerspace assimilates file system metadata from existing storage without the need to move your data. This metadata is information about the data, and is the actual file/folder structure that users see on their desktop. This includes information like file name, location, access permissions, and more.
• Unified view: The metadata layer creates a unified global view via standard SMB/NFS/S3 protocols to all your data, regardless of where it’s physically stored (on-prem servers, cloud storage). Users don’t need client software, or any alterations to their workflows. They simply see all their files across all storage globally via a standard mount point.
• Abstraction: This global namespace abstracts away the underlying storage infrastructure. You interact with your files as if they’re all in one place, even if they’re scattered across different locations. This also means that policy-based data orchestration for tiering, compute workloads, etc. is completely transparent to users, even on live files that are in use.
Imagine a library with books stored in multiple rooms and buildings. Hammerspace’s Global Data Platform is like a master catalog that lists every book and its location. You can access a book via the global catalog and find it easily, no matter where it’s physically stored.
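The master catalog idea can be pictured with a few lines of Python. This is only a toy sketch, not Hammerspace's software or API: the `FileRecord` class, the `assimilate` and `list_dir` functions, and the backend names and file paths are all hypothetical, invented for illustration. It merges metadata listings from two storage systems into one view without touching the file contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    """Metadata only: where a file lives, not its contents."""
    path: str     # logical path users see in the unified view
    backend: str  # hypothetical storage system name
    size: int     # size in bytes


def assimilate(backends: dict[str, dict[str, int]]) -> dict[str, FileRecord]:
    """Merge per-backend listings ({path: size}) into one catalog keyed by path."""
    catalog: dict[str, FileRecord] = {}
    for backend, listing in backends.items():
        for path, size in listing.items():
            catalog[path] = FileRecord(path, backend, size)
    return catalog


def list_dir(catalog: dict[str, FileRecord], prefix: str) -> list[str]:
    """List every logical path under a directory, whatever backend holds it."""
    prefix = prefix.rstrip("/") + "/"
    return sorted(p for p in catalog if p.startswith(prefix))


# Hypothetical backends and files, for illustration only.
backends = {
    "onprem-nas": {"/projects/q1/report.docx": 48_000},
    "cloud-bucket": {
        "/projects/q1/video.mp4": 9_000_000,
        "/archive/2019/log.txt": 1_200,
    },
}
catalog = assimilate(backends)
print(list_dir(catalog, "/projects/q1"))
print(catalog["/projects/q1/video.mp4"].backend)
```

The directory listing shows both files side by side even though one sits on the NAS and the other in the cloud bucket, which is the point of the library analogy.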
Key benefits of the Hammerspace Global Data Platform:
• Simplified data management: Access and manage all your data wherever it is via standard protocols from a single pane of glass.
• Data mobility: Automate data orchestration to move data between different storage systems without disrupting user access, even on live data.
• Improved collaboration: Data across multiple storage types and locations is globally accessible to users from anywhere.
• Enhanced data governance: Apply consistent policies and controls globally across your entire data estate.
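To make the data mobility and governance points above more concrete, here is a minimal sketch of what a policy-based tiering rule might look like. It assumes a simple rule (files idle for more than 90 days move to a cheaper tier); the `FileMeta` class, the tier names "fast" and "archive", and the threshold are all hypothetical and do not reflect Hammerspace's actual policy engine.

```python
from dataclasses import dataclass


@dataclass
class FileMeta:
    path: str               # logical path users see
    tier: str               # hypothetical tier name: "fast" or "archive"
    days_since_access: int


def apply_tiering_policy(files, max_idle_days=90):
    """Move files idle longer than max_idle_days to the archive tier.

    Only the tier field changes; the path users see stays the same.
    Returns the paths that were moved.
    """
    moved = []
    for f in files:
        if f.tier == "fast" and f.days_since_access > max_idle_days:
            f.tier = "archive"
            moved.append(f.path)
    return moved


# Hypothetical files, for illustration only.
files = [
    FileMeta("/projects/q1/report.docx", "fast", 3),
    FileMeta("/projects/2019/render.mov", "fast", 400),
]
moved = apply_tiering_policy(files)
print(moved)
```

The same rule runs over every file no matter which system holds it, which is what applying a policy globally amounts to in this toy model.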
In essence, Hammerspace provides a powerful way to manage and access data in a hybrid, multi-cloud environment. If deployed properly, it can eliminate the need to worry about an Information Architecture ever again. That’s just crazy!
The Power of Indirection
Hammerspace uses a technique called “metadata indirection” to optimize data paths and improve performance. Here’s how it works:
Traditional File Systems: In a typical file system, the directory structure and file location information (metadata) are embedded within each proprietary storage platform, at the infrastructure layer. When you access a file, you have to traverse this directory structure to find where the file is actually stored. When your data set is siloed across multiple storage systems, your access to those files is also fragmented across multiple file systems.
Hammerspace’s Approach: Hammerspace elevates the file system above the infrastructure layer, decoupling it from the physical data location. This Parallel Global File System creates a global namespace, which is the unified view of all data across any on-prem or cloud storage from any vendor. Users access their files globally via standard SMB, NFS, and/or S3 protocols, regardless of which storage the data is on today, or may move to in the future. No changes to user systems and no client software would be required.
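The decoupling described above can be sketched in a few lines. This is a toy model under stated assumptions, not Hammerspace's implementation: the `Namespace` class, its methods, and the backend names are hypothetical. The one idea it illustrates is that users address a file by a stable logical path while a separate table records where the bytes currently live.

```python
class Namespace:
    """Toy indirection layer: logical path -> name of the backend holding it."""

    def __init__(self, backends):
        # Each hypothetical backend is modeled as a dict of path -> bytes.
        self._stores = {name: {} for name in backends}
        self._location = {}  # logical path -> backend name

    def write(self, path, data, backend):
        self._stores[backend][path] = data
        self._location[path] = backend

    def read(self, path):
        # Callers never name a backend; the lookup table decides.
        return self._stores[self._location[path]][path]

    def where(self, path):
        return self._location[path]

    def move(self, path, target):
        """Relocate the bytes and update the pointer; the path is unchanged."""
        source = self._location[path]
        if source == target:
            return
        self._stores[target][path] = self._stores[source].pop(path)
        self._location[path] = target


ns = Namespace(["onprem-nas", "cloud-bucket"])
ns.write("/data/model.bin", b"weights", "onprem-nas")
before = ns.read("/data/model.bin")
ns.move("/data/model.bin", "cloud-bucket")
after = ns.read("/data/model.bin")
print(before == after, ns.where("/data/model.bin"))
```

The read before and after the move uses the identical path, so nothing on the user side has to change when the data is relocated.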
Benefits of Indirection:
• Optimized Data Paths: By decoupling metadata from physical location, Hammerspace can optimize data paths based on factors like network proximity, storage performance, and user access patterns. This ensures that data is accessed from the most efficient location.
• Flexibility and Scalability: Hammerspace ensures that changes in the storage environment, such as adding new storage or moving data, are transparent to the user’s view of the data. All infrastructure changes happen in the background, without interrupting users who see the consistent global view via the metadata layer.
• Improved Performance: By intelligently routing data requests, Hammerspace can reduce latency and improve overall performance. This is especially beneficial for GPU-based workloads used for AI and deep learning, which need direct access to high-performance compute resources.
• Simplified Management: Hammerspace provides automated global control for all data, regardless of where it is physically stored. This simplifies critical data services that need to span all silos and locations, such as data protection, access control, disaster recovery and much more.
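As a rough illustration of the optimized data paths bullet, the sketch below picks which copy of a file to read based on network proximity alone. It assumes a file has replicas at more than one site and that a latency table is available; the `pick_replica` function, the site names, and the latency figures are hypothetical, and a real system would weigh more factors (storage performance, access patterns) than this one.

```python
def pick_replica(replica_sites, latency_ms, client_site):
    """Return the replica site with the lowest latency from client_site."""
    return min(replica_sites, key=lambda site: latency_ms[(client_site, site)])


# Hypothetical round-trip latencies in milliseconds: (client site, replica site).
latency_ms = {
    ("london", "london"): 1.0,
    ("london", "us-east"): 80.0,
    ("us-east", "london"): 80.0,
    ("us-east", "us-east"): 1.5,
}
replica_sites = ["london", "us-east"]

print(pick_replica(replica_sites, latency_ms, "us-east"))
print(pick_replica(replica_sites, latency_ms, "london"))
```

Because the choice is made behind the logical path, two users in different cities can open the same file and each be served from the nearer copy.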
In essence, Hammerspace’s indirection layer acts as a smart traffic director for data, ensuring that requests are routed efficiently and that users always have the best possible access to their data.
This approach is similar to how DNS works for website addresses. You type in a human-readable domain name (like dmradio.biz), and DNS resolves it to the actual IP address of the server. Hammerspace does something similar for file paths, making the data access more efficient and adaptable.
“This is the missing level of indirection we’ve never had,” noted Hammerspace CEO David Flynn on a recent episode of DM Radio. You can check out that full episode right here: https://www.youtube.com/watch?v=bNp9Nbf8d-M
As I pointed out at the beginning of that episode, there is a powerful analogy for the long-and-winding data management story in a profound line spoken by the character Jerry in Edward Albee’s The Zoo Story:
“Sometimes it is necessary to go a long distance out of the way, in order to come back a short distance correctly.”
That’s exactly what we’re now seeing in the world of data access. It’s time to come back a short distance correctly. I haven’t seen any other company, or open-source project for that matter, focused on this absolutely game-changing functionality. Kudos to Hammerspace!
They quite literally pull data rabbits out of hats.