Indiana University

KnowledgeBase

Data Capacitor

Note: Indiana University will soon replace its current Data Capacitor with Data Capacitor II, a high-speed, high-capacity storage facility for very large data sets. With 5 PB of storage, Data Capacitor II will support big data applications used in computational research. IU partnered with DataDirect Networks, Inc. (DDN) to develop Data Capacitor II, which is scheduled to be installed in the IU Data Center in spring 2013. For more about Data Capacitor II, see the November 8, 2012, press release. If you have questions about how the change to Data Capacitor II will affect your research, email the High Performance File Systems group.

System overview

The Indiana University Data Capacitor is a high-speed temporary data storage facility used for research computing. It serves all IU campuses and other research centers throughout the country. The Data Capacitor provides high read/write speeds for user data and supports very large data sets. Because it uses a wide area file system, you can access remote data as if the file system were mounted locally, and share large amounts of data with researchers at multiple remote sites.

System name: Data Capacitor and Data Capacitor-WAN

System provider: IU Research Technologies, Pervasive Technology Institute; this system is managed by High Performance File Systems

Login nodes: Big Red, Quarry, Mason

Usage policies:

  • Data Capacitor file space is divided into two categories:

    • Project: The Project directory is dedicated to long-term projects with storage and ongoing access requirements that cannot be met with other existing systems. Submit requests for project space to the Data Capacitor team; the Data Capacitor allocation committee evaluates them.
    • Scratch: The Scratch directory is a temporary workspace currently available to all users of Big Red, Quarry, and Mason. Scratch space is not allocated, and its total capacity fluctuates based on project space requirements. Files in scratch space may be purged if they have not been accessed for more than 60 days.

  • By default, each project receives a quota of 10 TB; you can request a larger quota if you need more space. Storing a large number of small files is discouraged because it degrades performance, but arrangements can be made if a need exists. Available scratch space is the portion of the Data Capacitor not allocated to projects, so it varies with project use.

  • The Data Capacitor is not intended for permanent storage of data, and is not backed up. You can archive data stored or created on the Data Capacitor on IU's Scholarly Data Archive (SDA). It is your responsibility to arrange for long-term storage of any data on the system as needed.

  • Lustre is not designed for storing a large number of small files. If you need such storage, use an archiving utility (e.g., tar, optionally combined with gzip compression) to bundle your files together into a smaller number of large files. If you don't do this, the performance of the Data Capacitor will suffer, and you will strain the Data Capacitor's file-count (inode) capacity.
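
The bundling recommended above can be sketched in a few lines of Python using the standard tarfile module. This is a minimal illustration rather than an IU-provided tool: the function name bundle_directory and the paths in the usage note are hypothetical, and the sketch assumes the archive is written outside the directory being bundled.

```python
import tarfile
from pathlib import Path


def bundle_directory(source_dir, archive_path):
    """Pack every file under source_dir into one gzip-compressed tar archive.

    Returns the number of regular files stored in the archive.
    Assumption: archive_path lies outside source_dir, so the archive
    does not try to include itself.
    """
    source = Path(source_dir)
    file_count = 0
    # Mode "w:gz" writes a tar archive and compresses it with gzip in one pass.
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the archive
                # unpacks without the original absolute prefix.
                archive.add(str(path), arcname=str(path.relative_to(source)))
                file_count += 1
    return file_count
```

For example, bundle_directory("run01_output", "run01_output.tar.gz") would replace many small files with a single large file, which you could then place on the Data Capacitor. Both names here are placeholders.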

System information


System configurations for Data Capacitor

Aggregate information
  • Machine type: Data storage
  • Operating system: Linux CentOS release 5.4, Kernel 2.6.18
  • Memory model: Distributed
  • Processor cores: 4-6
  • CPUs: 2
  • Nodes: 6
  • RAM: 48 GB - 96 GB DDR-2
  • Network: 10 Gb Ethernet
  • Storage: Connected via 4 Gb Fibre Channel to DataDirect Networks SFA10000 storage controllers

Storage information
  • File systems: Lustre
  • Total disk space: 427 TB
  • Total scratch space: Varies based on system usage
  • Aggregate I/O: 40 Gbps
  • Availability scope: All IU campuses
  • Quotas: 10 TB default; more upon request
  • Backup and purge policies: The Data Capacitor is not intended for permanent storage of data, and is not backed up. Files in project space may be purged if they have not been accessed for more than 180 days. Files in scratch space may be purged if they have not been accessed for more than 60 days.

System configurations for Data Capacitor-WAN

Aggregate information
  • Machine type: Data storage
  • Operating system: Red Hat Enterprise Linux v5, Kernel 2.6.18
  • Memory model: Distributed
  • Processor cores: 4
  • CPUs: 2
  • Nodes: 6
  • RAM: 8 GB - 32 GB DDR-2
  • Network: 10 Gb Ethernet
  • Storage: Connected via 4 Gb Fibre Channel to DataDirect Networks S2A9550 storage controllers

Storage information
  • File system: Lustre 1.8.1.1 patched with IU's UID/GID mapping code
  • Total disk space: 339 TB
  • Total scratch space: Varies based on system usage
  • Aggregate I/O: 40 Gbps
  • Availability scope: All IU campuses and other US sites; accessible through the Extreme Science and Engineering Discovery Environment (XSEDE)
  • Quotas: 10 TB default; more upon request
  • Backup and purge policies: The Data Capacitor is not intended for permanent storage of data, and is not backed up. Files in project space may be purged if they have not been accessed for more than 180 days. Files in scratch space may be purged if they have not been accessed for more than 60 days.

System access

  • The Data Capacitor is mounted on Big Red, Quarry, and Mason as /N/dc/..., and behaves like any other disk device on those machines. If you have an account on Big Red, Quarry, or Mason, you can access /N/dc/scratch.

  • Users at other institutions (including IU researchers with accounts on remote systems) can request Data Capacitor storage space, which can be mounted at the remote institution, as well as on Big Red, Quarry, and Mason. Access to the WAN space is available at /N/dcwan/.

Transferring your files to the Data Capacitor

The Data Capacitor is a parallel high-performance file system. Files are not "transferred" into the system; instead, the file system is mounted on computational resources and is accessible from those resources as a directory path (e.g., /N/dc/scratch/). To read or write a file on the Data Capacitor, use any standard Unix command for reading or writing a file in a directory.
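
Because the file system appears as an ordinary directory, the same holds for programs: standard file operations work without any transfer tool. The following Python sketch illustrates this under stated assumptions. The helper names stage_file and first_line are hypothetical, and the directory you pass in is whatever path the Data Capacitor is mounted at on your system.

```python
import shutil
from pathlib import Path


def stage_file(local_file, mounted_dir):
    """Copy local_file into mounted_dir with ordinary file I/O.

    Returns the path of the new copy.
    """
    destination_dir = Path(mounted_dir)
    # Create the target directory if it does not exist yet.
    destination_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 is a plain file copy that also preserves timestamps.
    return Path(shutil.copy2(str(local_file), str(destination_dir)))


def first_line(path):
    """Read the first line of a file exactly as on any local disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.readline().rstrip("\n")
```

A call such as stage_file("results.csv", "/N/dc/scratch/your_directory") would place the file on the Data Capacitor; the subdirectory name is a placeholder, not a documented path.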

Reference

For more about the Lustre file system, see the Lustre wiki.

Policies

  • XSEDE users can store data on the Data Capacitor.

  • The Data Capacitor team may purge files in scratch space if they have not been accessed for more than 60 days.

  • Unless special arrangements have been made, the Data Capacitor team may purge files in project space that have not been accessed for more than 180 days.

  • If the system's capacity is being strained, the Data Capacitor team may ask you to consolidate your files into a tar or gzip archive, or move them to a long-term backup facility.
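
To see which of your files might fall under the purge thresholds above, you can compare each file's last access time with a cutoff. The Python sketch below is one way to do that. It is not the Data Capacitor team's purge procedure, and it assumes that the access time reported by os.stat reflects what the purge considers; the function name files_not_accessed_since is hypothetical.

```python
import os
import time


def files_not_accessed_since(root, days, now=None):
    """Return sorted paths under root last accessed more than `days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                atime = os.stat(path).st_atime
            except OSError:
                # The file vanished or cannot be read; skip it.
                continue
            if atime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling the function with days=60 on a scratch directory, or days=180 on a project directory, lists candidates you may want to archive to long-term storage before they are purged.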

Support

For technical support or general information, email High Performance File Systems. For after-hours support, call Data Center Operations (812-855-9910) and ask to have High Performance File Systems contacted. To receive maintenance and downtime information, subscribe to the hpfs-maintenance-l@indiana.edu mailing list (see On IU List, how do I subscribe to a list?).

This is document avvh in domain all.
Last modified on April 12, 2013.