Storage of Data
The storage system described here has a total capacity of 250 TB. When working with a large number of high-capacity drives, we faced the challenge of random errors that can corrupt stored data [4]. Use of a redundant array of independent drives (RAID) is a common strategy to prevent data corruption. Instead of a dedicated RAID card, we used a software solution that relies on a file system and operating system (OS) optimized to consolidate a large number of drives, monitor the health of volumes and data, and regularly scrub data to correct any errors. For this purpose we picked the freely available, open-source FreeNAS [5] system, which comes with all features of the mature, server-grade FreeBSD OS and the ZFS file system and adds a user-friendly web interface for configuration and management of the storage.
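To make the software-RAID approach concrete, the sketch below shows how a double-parity ZFS pool might be created, scrubbed, and checked from Python. It is an illustration rather than our exact FreeNAS configuration; the pool name "tank" and the device names are hypothetical.

```python
# Sketch: creating a double-parity (RAID-Z2) ZFS pool and running a scrub,
# assuming a FreeBSD/FreeNAS host with the zpool CLI available.
# The pool name "tank" and device names da0-da5 are hypothetical.
import subprocess

DEVICES = [f"da{i}" for i in range(6)]  # six drives per double-parity vdev

def run(cmd):
    """Run a ZFS admin command and raise if it fails."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Create a pool with one RAID-Z2 vdev (analogous to a RAID6 stripe of six drives).
run(["zpool", "create", "tank", "raidz2", *DEVICES])

# Start a scrub, which reads every block and repairs checksum errors from parity.
run(["zpool", "scrub", "tank"])

# Report pool health and scrub progress.
run(["zpool", "status", "tank"])
```

Because ZFS checksums every block, a scrub can tell which copy of a block is damaged and rewrite it from parity, which is the error-correction behavior relied on above.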
Physical design and power redundancy. We used server-grade hardware to implement the storage. The main server has 24 drive bays, each accepting a standard 3.5-inch hard drive of 4 TB capacity. The server is equipped with two Xeon processors and 256 GB of RAM. We used an expansion box with 45 bays, colloquially referred to as Just a Bunch of Disks (JBOD), in which all drives are attached to a host-bus adapter (HBA). The JBOD is connected to the main server using a single SAS3 cable. Data drives are organized in RAID6 stripes of six drives. The system drives are two 120 GB solid-state disks (SSD) in mirror configuration (RAID1).
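A quick way to see what this layout yields in usable space is to subtract the two parity drives from each six-drive RAID6 stripe, as in the sketch below. The stripe counts are illustrative and do not reproduce our exact drive inventory.

```python
# Sketch: back-of-the-envelope usable capacity for RAID6 stripes of six drives
# (4 data + 2 parity per stripe). Stripe counts are illustrative, not an exact
# inventory of the system described in the text; filesystem overhead is ignored.
DRIVE_TB = 4           # capacity of each data drive, TB
STRIPE_WIDTH = 6       # drives per RAID6 stripe
PARITY_PER_STRIPE = 2  # RAID6 tolerates two failed drives per stripe

def usable_tb(num_stripes: int) -> int:
    """Usable capacity of num_stripes RAID6 stripes of six drives."""
    data_drives = num_stripes * (STRIPE_WIDTH - PARITY_PER_STRIPE)
    return data_drives * DRIVE_TB

for stripes in (4, 8, 11):
    raw = stripes * STRIPE_WIDTH * DRIVE_TB
    print(f"{stripes:2d} stripes: {raw:4d} TB raw -> {usable_tb(stripes):4d} TB usable")
```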
Network connectivity was established using 10 GbE cards that can accommodate copper-wire or optical-fiber connectors. Recently, servers have started to provide built-in 10 GbE connectors on the motherboard. The storage server is connected to two networks: a local network between the data storage system and a light-sheet microscope running at 10 Gbps, and a general-use 1 GbE connection to the rest of the university network. This dual connection ensures that even if the university network is unavailable due to failure or maintenance, we can still use the storage server to store large data sets from the light-sheet microscope. The local 10 GbE network was set up using an affordable 8-port switch, and it connects the storage server, acquisition computer, and analysis workstation within one rack.
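The practical difference between the two links is easy to estimate. The sketch below assumes a 2 TB acquisition and roughly 70% of nominal line rate after protocol and disk overhead; both figures are assumptions for illustration only.

```python
# Sketch: transfer time for a large light-sheet data set over the two network
# links described above. The 2 TB data-set size and the ~70% efficiency factor
# (protocol overhead, disk limits) are assumed values for illustration.
DATASET_TB = 2.0
EFFICIENCY = 0.7  # realistic fraction of the nominal line rate

def transfer_hours(link_gbps: float) -> float:
    bits = DATASET_TB * 1e12 * 8                      # data-set size in bits
    seconds = bits / (link_gbps * 1e9 * EFFICIENCY)   # effective throughput
    return seconds / 3600

for name, gbps in [("1 GbE (campus)", 1.0), ("10 GbE (local)", 10.0)]:
    print(f"{name}: {transfer_hours(gbps):.1f} hours for {DATASET_TB} TB")
```

At campus speeds a single acquisition can occupy the link for several hours, which is why the microscope and the storage share a dedicated 10 GbE segment.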
The main data server, JBOD, and other computers mentioned have redundant power supplies, which allows us to use two uninterruptible power supply (UPS) units running at half capacity each. If one UPS fails, the second can power the whole setup. When picking the UPS power rating we estimated a 7-minute run on battery, which is the time interval needed for the power generator in the building to be activated. Installing this power required contract work with the institution. The final assembly of servers and UPS units is depicted in Figure 3.
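A rough sizing check for the UPS pair might look like the sketch below. The load and battery-energy figures are hypothetical, and real battery runtime is not linear in load, so the vendor's runtime charts should be consulted for the actual rating.

```python
# Sketch: checking that a pair of UPS units at half load can bridge the
# ~7 minutes needed for the building generator to start. Load and battery
# figures are assumptions; runtime is treated as linear, which real batteries
# only approximate.
LOAD_W = 1600          # combined draw of server, JBOD, switch, workstations (assumed)
UPS_BATTERY_WH = 800   # usable battery energy per UPS (assumed)
REQUIRED_MIN = 7       # generator start-up interval from the text

def runtime_min(load_w: float, battery_wh: float) -> float:
    """Idealized runtime in minutes at a constant load."""
    return battery_wh / load_w * 60

per_ups = runtime_min(LOAD_W / 2, UPS_BATTERY_WH)  # normal: each UPS carries half the load
single = runtime_min(LOAD_W, UPS_BATTERY_WH)       # failure case: one UPS carries everything

print(f"Each UPS at half load: {per_ups:.0f} min (need >= {REQUIRED_MIN})")
print(f"One UPS carrying everything: {single:.0f} min (need >= {REQUIRED_MIN})")
```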
Data snapshots and backup. When working with data we may accidentally delete or overwrite files. To partially protect against these actions, the ZFS file system can take "snapshots" of data, recording changes made since the last snapshot. Any file in the ZFS file system can be "rolled back" to the previously available snapshot instantaneously. We used the FreeNAS system's automatic scheduler to set up daily snapshots that are kept for a month. The data backup is implemented as a daily transfer of snapshots to another server that also runs the ZFS file system (6 TB drives organized as 8-drive RAID6 stripes in a 36-drive box, two Xeon 2.2 GHz CPUs, and 64 GB of ECC RAM). The backup can also be stored automatically with one of the cloud providers.
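In spirit, the daily job takes a snapshot and then sends only the changes since the previous snapshot to the backup pool, as sketched below. FreeNAS provides this through its web interface, so the script is only an illustration; the dataset names, the "backup" host, and the 30-day retention window are hypothetical.

```python
# Sketch: daily snapshot plus incremental replication to a backup server over ssh.
# Dataset names, the "backup" host, and the retention window are hypothetical.
import datetime
import subprocess

DATASET = "tank/lightsheet"           # source dataset (hypothetical)
BACKUP_HOST = "backup"                # backup server reachable over ssh (hypothetical)
BACKUP_DATASET = "tankbak/lightsheet"

def snap(day: datetime.date) -> str:
    return f"{DATASET}@daily-{day.isoformat()}"

today = datetime.date.today()
yesterday = today - datetime.timedelta(days=1)

# 1. Take today's snapshot; ZFS records only blocks changed since the last snapshot.
subprocess.run(["zfs", "snapshot", snap(today)], check=True)

# 2. Send just the difference between yesterday's and today's snapshots.
#    "-F" lets the receiving side roll back stray changes before applying the increment.
send = subprocess.Popen(["zfs", "send", "-i", snap(yesterday), snap(today)],
                        stdout=subprocess.PIPE)
subprocess.run(["ssh", BACKUP_HOST, "zfs", "recv", "-F", BACKUP_DATASET],
               stdin=send.stdout, check=True)
send.stdout.close()
send.wait()

# 3. Snapshots older than the retention window (30 days here) would be removed with
#    "zfs destroy"; FreeNAS's scheduler handles this expiration automatically.
```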
Figure 3: Organization of servers. From the bottom up: (1) two redundant UPS units, (2) main storage server, (3) JBOD expansion, (4) analysis computer, (5) web server for https://fliptrap.org. All servers are mounted in a standard-sized 19-inch-wide server rack using supplied rails.
Conclusion
Current data storage methods lag far behind the development of modern microscopy tools. This makes data processing difficult and data sharing slow, while the amount of imaging data continues to increase. Whether a laboratory relies on USB hard drives, cloud storage providers, or a shared network-attached storage server, the approach should be carefully designed and its benefits and drawbacks weighed. Here we provided our implementation of a shared, expandable storage system as a starting point, along with an approximation of such a system's cost. Ideally, laboratories that work with significant amounts of data will consult and receive help from their institution, potentially creating a shared, pooled resource, since economies of scale make such services cheaper.
Acknowledgements
We would like to thank Tom Limoncelli, Dr. Francesco Cutrale, Dr. Jacqueline Campbell, Dr. Sandra Gesing, and Tom Morrell for valuable discussions.
References
[1] C Allan et al., Nat Methods 9 (2012) 245–53.
[2] SV Tuyl and G Michalek, J Librarianship Scholarly Commun 3 (2015) eP1258.
[3] GF Hughes and JF Murray, ACM Trans Storage 1 (2005) 95–107.
[4] B Schroeder and GA Gibson, IEEE Trans Dependable Secure Comput 7 (2010) 337–50.
[5] G Sims, Learning FreeNAS, Packt Publishing Ltd., Birmingham, UK, 2008.