%0 Conference Paper
%A Dan Gunter
%A Shreyas Cholia
%A Anubhav Jain
%A Michael Kocher
%A Kristin A Persson
%A Lavanya Ramakrishnan
%A Shyue P Ong
%A Gerbrand Ceder
%B High Performance Computing, Networking, Storage and Analysis (SCC) 2012
%D 2012
%G eng
%I IEEE
%P 1244-1251
%R 10.1109/SC.Companion.2012.150
%T Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials Project
%8 11/2012
%X
Efforts such as the Human Genome Project provided a dramatic example of opening scientific datasets to the community. Making high-quality scientific data accessible through an online database allows scientists around the world to multiply the value of that data through scientific innovations. Similarly, the goal of the Materials Project is to calculate physical properties of all known inorganic materials and make this data freely available, in order to accelerate the invention of better materials. However, the complexity of scientific data, and the complexity of the simulations needed to generate and analyze it, pose challenges to the current software ecosystem. In this paper, we describe the approach we used in the Materials Project to overcome these challenges and to create and disseminate a high-quality database of materials properties computed by solving the basic laws of physics. Our infrastructure requires a novel combination of high-throughput approaches with broadly applicable and scalable approaches to data storage and dissemination.