Practical Guide to Storage of Large Amounts of Microscopy Data


Andrey Andreev1* and Daniel E.S. Koo2

1California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125
2University of Southern California, USC Michelson Center for Convergent Bioscience, 1002 West Childs Way, Los Angeles, CA 90089

*aandreev@caltech.edu


Abstract: Biological imaging tools continue to increase in speed, scale, and resolution, often resulting in the collection of gigabytes or even terabytes of data in a single experiment. In comparison, the ability of research laboratories to store and manage this data is lagging greatly. This leads to limits on the collection of valuable data and slows data analysis and research progress. Here we review common ways researchers store data and outline the drawbacks and benefits of each method. We also offer a blueprint and budget estimation for a currently deployed data server used to store large datasets from zebrafish brain activity experiments using light-sheet microscopy. Data storage strategy should be carefully considered and different options compared when designing imaging experiments.


Keywords: big data, data workflow, data management infrastructure, light-sheet microscopy, zebrafish brains


Introduction

Data acquisition rates in fluorescence microscopy are exploding due to the increasing size and sensitivity of detectors, the brightness and variety of available fluorophores, and the complexity of experiments and imaging equipment. An increasing number of laboratories now perform complex imaging experiments that rapidly generate gigabytes and even terabytes of data, but the practice of data storage continues to lag behind data acquisition capabilities. This slows data processing, collaboration, and quality control. Several large-scale database solutions [1] have been developed to manage imaging data, but the vast majority of laboratories rely on outdated methods of data transfer and management such as USB drives and personal computers. The success of high-performance cluster computing resources, when available, has yet to fully solve the data storage challenge. In this article, we compare common data storage solutions and discuss what can be implemented with different levels of financial and institutional resources, from cloud storage to institution-run storage servers.

Cutting-edge fluorescence and multiphoton microscopy
provide a unique challenge for data storage and management. In our work we use two-photon light-sheet fluorescence microscopy (Figure 1), collecting whole zebrafish brain structure and activity data (Figure 2). An experiment can contain up to 50 axial slices, each covering a 400×400 μm area, collected at a 3-second volumetric rate. This results in a 300–500 GB dataset per single sample. The data size increases further when two or three spectral channels are imaged. Here we provide a description of the 250 TB storage server we have built and currently share among more than a dozen researchers. The system provides automatic backup and data access management. This guide can be used as a starting point for transforming data management practices toward more resilient and efficient data pipelines in laboratories that use modern microscopy tools. While our experiments are based on light-sheet microscopy, the system described here can be used for any type of image data collection and storage.

doi:10.1017/S1551929520001091

Similar to other microscopy-oriented labs, we collect large
amounts of data while simultaneously developing new experiments and data processing pipelines. Much of our work is in constant flux, and it is nearly impossible to reliably lock in a single data acquisition process for a long period of time. This flexibility requires rapid progress in the development of our tools and puts additional pressure on the required data processing and storage infrastructure. The networks for data transfer from acquisition microscopes are often slow (less than 1 Gbps), and we rarely have centralized, cost-efficient institutional storage capabilities. To use microscopy facilities more efficiently, we need networks that can transfer and store data at a speed of at least 1 Gbps, as well as centralized storage.
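The figures above can be sanity-checked with a quick back-of-the-envelope calculation. The slice count, 3-second volumetric rate, and 1 Gbps link speed come from the text; the 2048×2048-pixel frame size, 16-bit depth, and one-hour recording length are illustrative assumptions, not values stated in the article:

```python
# Back-of-the-envelope sizing for a light-sheet experiment.
# 50 slices, 3 s per volume, and the 300-500 GB/sample range are from
# the article; frame dimensions, bit depth, and recording length are
# assumed for illustration only.

def dataset_size_bytes(slices=50, width_px=2048, height_px=2048,
                       bytes_per_px=2, volume_period_s=3,
                       duration_s=3600, channels=1):
    """Raw (uncompressed) size of one recording."""
    frame = width_px * height_px * bytes_per_px   # one axial slice
    volume = frame * slices * channels            # one brain volume
    n_volumes = duration_s // volume_period_s     # volumes in the recording
    return volume * n_volumes

def transfer_time_s(size_bytes, link_gbps=1.0):
    """Time to move a dataset over a network link, ignoring overhead."""
    return size_bytes * 8 / (link_gbps * 1e9)

size = dataset_size_bytes()
print(f"one-hour, single-channel dataset: {size / 1e9:.0f} GB")   # 503 GB
print(f"transfer over a 1 Gbps link: {transfer_time_s(size) / 60:.0f} min")  # 67 min
```

With these assumed parameters the estimate lands at the upper end of the 300–500 GB per-sample range quoted above, and moving a single such dataset over a 1 Gbps link takes over an hour, which is why faster networks and centralized storage matter.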


Current Practices

Today, research labs often use portable USB hard drives and
cloud-based storage [2]. These tools are poorly adapted to the data demands of modern microscopy. Here, we briefly review several solutions and provide recommendations on storage strategy depending on the size of the files requiring storage (Table 1).

USB external hard drives. Moving data using a USB hard
drive or flash drive is very common in research. We can purchase storage quickly and upload data immediately, since all research computers have USB ports. USB drives are very easy to use and allow drag-and-drop operations for moving files from the data acquisition computer to a destination drive and/or data analysis workstations. Files may also be opened directly from the USB drive itself. Some devices on the market also provide increased reliability, such as portable external storage enclosures with multiple drives and file duplication. An important advantage of USB drives is that they remove the need for network connectivity and allow "quarantine" of valuable research microscopes from the internet. The speed of data transfer can be relatively fast, ranging from approximately 1 Gbps to 5 Gbps with the newest flash-based storage and transfer protocols.

Disadvantages of USB hard drives include an increase in
the probability of data loss and difficulties related to data organization and sharing. A single USB hard drive may have a significant failure rate (1–5% per year) or unrecoverable read errors (UREs) (1 error per 12 TB of read data [3]). Furthermore, stress on the connectors due to frequent use may cause their failure within 1–2 years. USB drives can also serve as easy vectors for computer viruses, and drives should always be inspected prior to use. As the number and size of data folders and files increase, the need for increased cataloging and organization of data


www.microscopy-today.com • 2020 July

