Seagate is going to change the way you look at enterprise storage. It just took the wraps off its first direct-access-over-Ethernet Kinetic hard drive (Fig. 1). The drive is essentially a modified 3.5-in., 5900 rpm, 4 Tbyte Seagate Terascale device. It keeps the same SAS connector but uses the pins for a pair of 1 Gbit/s Ethernet ports instead.
The drive is part of Seagate’s Kinetic Open Storage platform. It changes not only the interface ports but the protocol as well. The usual low-level block I/O is replaced with a Seagate key/value API that is better suited to applications like Hadoop (see Essentials Of The Hadoop Open Source Project) or distributed object stores like OpenStack’s Swift. The drive itself handles free-space management and the low-level mapping of objects to sectors.
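To make the contrast with block I/O concrete, here is a minimal sketch of what a key/value drive interface looks like from the application's side. It is a toy in-memory model, not Seagate's API: the class name `ToyKeyValueDrive` and its `put`, `get`, and `delete` methods are hypothetical, and a Python dictionary stands in for the object-to-sector mapping that the drive performs internally.

```python
from typing import Dict, Optional


class ToyKeyValueDrive:
    """Hypothetical stand-in for a drive that stores objects by key.

    This is not the Kinetic API; it only illustrates the access model.
    """

    def __init__(self) -> None:
        # A real drive maps objects to sectors itself; a dict plays that role here.
        self._objects: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        """Store or overwrite the object held under this key."""
        self._objects[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        """Return the stored object, or None if the key is unknown."""
        return self._objects.get(key)

    def delete(self, key: bytes) -> bool:
        """Remove the object; report whether it existed."""
        return self._objects.pop(key, None) is not None


drive = ToyKeyValueDrive()
drive.put(b"sensor/0001", b"first reading")
print(drive.get(b"sensor/0001"))
```

The application never sees a logical block address. It names the object and leaves placement to the drive.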
Of course, the drive's bandwidth needs to be compared to 6 Gbit/s SATA and 12 Gbit/s SAS. The 1 Gbit/s ports may seem like a limitation, but large arrays of SAS and SATA drives often sit behind a controller that has limitations of its own. Ethernet switches, on the other hand, can provide access to a large number of drives. Ethernet speeds have surpassed those of SAS and SATA, and the switches deliver even more aggregate bandwidth. The drive interfaces could eventually jump to 10 Gbit/s or 40 Gbit/s, but for now 1 Gbit/s seems more than adequate.
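A quick back-of-the-envelope calculation shows why the per-drive link speed matters less than the aggregate. The sketch below assumes both Ethernet ports on a drive carry traffic at the same time (the ports may instead be intended for redundancy) and uses the 60-drive enclosure mentioned later in this article. The function name `aggregate_gbits` is purely illustrative.

```python
def aggregate_gbits(drives: int, ports_per_drive: int = 2, port_gbits: int = 1) -> int:
    """Raw link bandwidth across all drives, in Gbit/s.

    Assumes every port is active at once; this is an upper bound on link
    capacity, not a measured throughput figure.
    """
    return drives * ports_per_drive * port_gbits


SAS_LINK_GBITS = 12  # single 12 Gbit/s SAS link, for comparison

total = aggregate_gbits(60)
print(f"60 drives: {total} Gbit/s of raw Ethernet link capacity")
print(f"Equivalent to {total // SAS_LINK_GBITS} SAS links at 12 Gbit/s")
```

Under those assumptions the enclosure exposes 120 Gbit/s of raw link capacity to the switch, which is the figure to weigh against whatever a single SAS or SATA controller can pass.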
The drives use a variable-length key up to 4 Kbits long. The API defines a host of other fields in control packets, including access tokens for security. At this point Seagate recommends that the data portion be under 1 Mbyte, but the API can handle larger objects. Still, objects will tend to be smaller because small objects can be distributed across drives, so accessing them can draw on more of the overall system bandwidth. Transport Layer Security (TLS) support allows encrypted transmission of data.
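An application would typically check these limits before sending an object to a drive. The sketch below takes its numbers from the figures quoted above, reading 4 Kbits as 4096 bits (512 bytes) and 1 Mbyte as 2**20 bytes. Both readings are assumptions and should be checked against the limits the drive actually reports. The function `check_object` is hypothetical and not part of any Seagate library.

```python
from typing import List

# Limits as quoted in the article; the exact byte counts are assumptions.
MAX_KEY_BYTES = 4 * 1024 // 8        # 4 Kbits read as 4096 bits = 512 bytes
RECOMMENDED_VALUE_BYTES = 1 << 20    # "under 1 Mbyte" read as 2**20 bytes


def check_object(key: bytes, value: bytes) -> List[str]:
    """Return a list of problems; an empty list means the object fits."""
    problems = []
    if len(key) > MAX_KEY_BYTES:
        problems.append(f"key is {len(key)} bytes, limit is {MAX_KEY_BYTES}")
    if len(value) >= RECOMMENDED_VALUE_BYTES:
        # The size guidance is a recommendation, so this is a warning, not an error.
        problems.append(f"value is {len(value)} bytes, recommended size is under "
                        f"{RECOMMENDED_VALUE_BYTES}")
    return problems


print(check_object(b"sensor/0001", b"small payload"))
print(check_object(b"k" * 600, b"small payload"))
```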
The objects stored on a drive are independent of each other. Any object linking is done at the application level. Splitting large, compound objects into many smaller ones is already done by many applications.
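One common way to do that splitting at the application level is to store fixed-size chunks under derived keys and keep a small manifest object that lists them. The sketch below illustrates the idea under stated assumptions: plain dictionaries stand in for drives, the 512 Kbyte chunk size is an arbitrary choice below the recommended 1 Mbyte, and the key layout, `store_large`, and `load_large` are all hypothetical.

```python
import json
from typing import Dict, List

CHUNK_BYTES = 512 * 1024  # assumption: any size under the recommended 1 Mbyte

Drive = Dict[bytes, bytes]  # a dict stands in for one key/value drive


def store_large(drives: List[Drive], name: str, data: bytes) -> None:
    """Split data into chunks, spread them round-robin, and write a manifest."""
    chunk_keys = []
    for index, offset in enumerate(range(0, len(data), CHUNK_BYTES)):
        key = f"{name}/chunk/{index:08d}"
        drives[index % len(drives)][key.encode()] = data[offset:offset + CHUNK_BYTES]
        chunk_keys.append(key)
    # The manifest is the application-level link between otherwise independent objects.
    manifest = json.dumps({"size": len(data), "chunks": chunk_keys}).encode()
    drives[0][f"{name}/manifest".encode()] = manifest


def load_large(drives: List[Drive], name: str) -> bytes:
    """Read the manifest, then fetch and join the chunks in order."""
    manifest = json.loads(drives[0][f"{name}/manifest".encode()])
    parts = [drives[index % len(drives)][key.encode()]
             for index, key in enumerate(manifest["chunks"])]
    return b"".join(parts)


drives: List[Drive] = [{} for _ in range(3)]
payload = bytes(range(256)) * 8192  # 2 Mbytes of sample data
store_large(drives, "video-42", payload)
print(load_large(drives, "video-42") == payload)
```

Because the chunks land on different drives, a reader could fetch them in parallel. The drives themselves know nothing about the relationship between the chunks.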
Operating systems can provide access to these drives, but APIs are available for languages like C, C++, and Java for direct access. The intent of this type of software-defined storage is to minimize the layers of interface between the application and the data.
This software-defined storage is initially being delivered with hard disk drives, but the approach is applicable to solid-state and hybrid drives, or even to whole systems. Seagate's 4U Kinetic Terascale enclosure will hold 60 drives, or 240 Tbytes. This JBOK (Just a Bunch of Kinetics) can connect all the drives to a top-of-rack Ethernet switch that in turn links the storage to virtual machine clusters.
Seagate is targeting the enterprise with this type of drive, but it has interesting implications for embedded applications, especially those that require high reliability and high availability.