Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials Project

Date Published
11/2012
Publication Type
Conference Paper
Authors
DOI
10.1109/SC.Companion.2012.150
Abstract

Efforts such as the Human Genome Project provided a dramatic example of opening scientific datasets to the community. Making high-quality scientific data accessible through an online database allows scientists around the world to multiply the value of that data through scientific innovations. Similarly, the goal of the Materials Project is to calculate physical properties of all known inorganic materials and make this data freely available, with the aim of accelerating the invention of better materials. However, the complexity of scientific data, and the complexity of the simulations needed to generate and analyze it, pose challenges to the current software ecosystem. In this paper, we describe the approach we used in the Materials Project to overcome these challenges and to create and disseminate a high-quality database of materials properties computed by solving the basic laws of physics. Our infrastructure requires a novel combination of high-throughput approaches with broadly applicable and scalable approaches to data storage and dissemination.

Conference Name
High Performance Computing, Networking, Storage and Analysis (SCC) 2012
Year of Publication
2012
Pagination
1244-1251
Publisher
IEEE
Refereed Designation
Refereed