Round Table Data Mining for Materials
Presentations from the event can be found via the following links or in the attachments at the bottom of this page.
Internationally, there is a trend towards accelerated material development and faster discovery of innovative, improved materials for a wide range of applications. Several approaches are followed and combined, including predictive modeling, advanced characterization and high-throughput experimentation strategies. The horizontal SIM programs are designed to contribute to this trend.
At the same time, the availability of data is constantly improving (cf. big data, consumer data), as are the tools to analyze and model these data efficiently, enabling a step change compared to the state of the art. For large datasets, this approach is often termed data mining. Domains such as biotechnology and the pharmaceutical industry are front runners in the use of data mining for product development.
We believe that there are as yet unseized opportunities for material development using data mining. This is strongly confirmed, for instance, by the ongoing efforts and strategy of the Materials Genome Initiative in the US.
To get a better view of the opportunities and challenges related to data mining for material development, we organized this Round Table on January 19th, 2015. The event started with presentations by experts in data mining for biotech, medical IT, process engineering and material intelligence, and was followed by a panel discussion and brainstorming session on the opportunities and challenges for data mining in materials R&D.
9:00 – 9:15 Introduction by Johan Paul – Flamac
9:15 – 9:45 Short presentation by each participant
9:20 – 10:20 Presentations by data experts
- 9:20: Dr. Alexander Botzki – VIB
- 9:40: Dr. Romain Elleboode – Granta Design
- 10:00: Prof. Bart De Moor – iMinds
10:20 – 12:00 Panel discussion
12:00 – 13:00 Sandwich lunch and networking
The majority of data repositories in the materials science and engineering community are not publicly accessible and are tied to specific projects or research groups. In particular, these repositories are established primarily to store and share the experimental and simulated data (from nano up to macro scale) generated within a specific project or program. They generally neither follow uniform standards for data and metadata nor provide for data discoverability and citation. As a result, only loose connections exist between these project data and public databases and models (see Figure 1).
Fig. 1 Today's approach to computational material design 
Reliance on shared digital data in scientific and engineering pursuits - whether the data are generated by computational or experimental efforts - is becoming more commonplace within the materials science and engineering community. Concurrently, government policies across the globe are embracing an 'open science' model which sets a requirement for sharing digital data generated from publicly funded research.
The European Union has been very proactive in studying the impacts of a digitally linked world on the scientific community. The EU's FP7 programme funded a project called Opportunities for Data Exchange, which has produced several relevant reports on publishing digital data in the scientific community. In June 2012, the Royal Society published 'Science as an open enterprise', which promotes free and open access to scientific results, including data. In the USA, the National Institutes of Health have long promoted a policy of open access to data generated from their grants. Additionally, and specific to the materials community, the sharing of digital data is a key strategic component of the US Materials Genome Initiative, and mechanisms to foster and enable sharing are actively under consideration.
Fig. 2 Repositories for standardized data exchange
As a result, there is a clear need for an integrated, collaborative workflow that draws simultaneously on experiments, computation, literature and theory. The vast range of length and time scales covered by materials research creates unique challenges for delivering quantitative, qualitative and predictive scientific and engineering tools. Two important components are advanced simulation tools that are validated against experimental data, and networks for sharing all useful data and tools.
Tackling these challenges will provide the basis for a structured data archive and mining approach, enabling the rapid development of novel materials with targeted properties.
(2014) The Materials Genome Initiative, Data, Open Science, and NIST, 2nd National Data Service Consortium Workshop
(2014) Opportunities for data exchange
(2012) Science as an open enterprise
(2003) Final NIH statement on sharing research data
(2014) The materials genome initiative