MATERIALS ONTOLOGY: AN INFRASTRUCTURE FOR EXCHANGING MATERIALS INFORMATION AND KNOWLEDGE

We have rich information resources for materials science and engineering: raw measurement data, computational simulation methods, digitized handbooks, and digital libraries. However, these resources have a wide variety of formats, terminologies, and concepts, which makes it difficult to find appropriate information for materials design, development, and evaluation. One solution to this problem is to integrate these resources into a computer-readable concept map, called a domain ontology, which describes concepts and relationships among the concepts in materials science and engineering. This paper describes a trial that constructs a standard of metadata description using an ontology language and demonstrates the validity of this construction through data exchange among heterogeneous material databases. "Materials Ontology," which consists of several sub-ontologies corresponding to substance, process, environment, and property, is developed using the ontology language of the Semantic Web, OWL, which enables the definition of a flexible and detailed structure of materials information. A versatile "materials data format" is built on the Materials Ontology as a component of the materials information platform and is applied to exchange data among three different thermal property databases, maintained by two major materials science research institutes in Japan.


INTRODUCTION
The scientific data and knowledge generated in organizations and institutes are available on the Internet as Wikis, e-Journals, and online databases. These resources are loosely linked with each other, but their representation, terminology, and formats are not standardized. In order to integrate such information resources distributed on the Web, constructing a domain ontology that represents a map of concepts and their relationships for a certain scientific realm is becoming a major concern of e-science. This ontology is needed to assist scientists/researchers in finding adequate information from these data resources and integrating scientific information for research and engineering (Rubin, Lewis, Mungall, Misra, et al., 2006; Nambiar, Ludaescher, Lin, & Baru, 2006).
For materials science in particular, there exist several trials to create a domain ontology: PLINIUS is a knowledge base that handles knowledge about ceramics research (van der Vet, Speel, & Mars, 1995); Ashino and Fujita (2006) offer a trial describing creep properties and a creep data analysis process with an ontology; and MatONT (Cheung, Drennan, & Hunter, 2008) is designed to support information integration for new materials research. Ashino and Oka (2007) constructed a prototype of an information platform for materials information. It consists of two functions: one is an RSS metadata aggregation for materials databases, and the other is a domain ontology for exchanging materials data among three heterogeneous materials data resources that have been developed independently: the databases of AIST (National Institute of Advanced Industrial Science and Technology, Japan) and NIMS (National Institute for Materials Science, Japan), and the materials data schema defined by MatDB, a fatigue and creep data schema designed by researchers in Japan. These databases consist mainly of the mechanical and physical properties of engineering materials.
Data Science Journal, Volume 9, 8 July 2010

Furthermore, as evaluation of materials from multiple viewpoints grows more important, we must evaluate materials not only for their cost performance but also for energy efficiency, environmental effects, and possible long-term toxicity. This makes it even more important to integrate heterogeneous material databases. Materials Ontology enables the exchange of materials data among heterogeneous materials databases and other information resources.

DESIGN OF MATERIALS ONTOLOGY
Current Materials Ontology consists of the following seven ontologies, which are organized into three groups. Materials Information (Figure 3) is an aggregation of other classes corresponding to a record of materials data. Class EngineeringMaterial is an aggregation of ChemicalFormula, CommonName, SubstanceClass, and other class instances and data that are required to specify a single specimen. Class MaterialProperty is an aggregation of Environment, MeasurementMethod, Property, Equation, and Specimen. The ObjectProperty Specimen takes an instance of EngineeringMaterial as its value.
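The aggregation described above can be pictured with a minimal Python sketch. It is an illustration only, not the OWL definition: the class names follow the text, but the attribute names, the `unit` field, the types, and the numerical value are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EngineeringMaterial:
    """Data required to specify a single specimen."""
    chemical_formula: str
    common_name: str
    substance_class: str


@dataclass
class MaterialProperty:
    """One property together with its measurement context."""
    property_name: str                      # stands in for the Property class
    value: float
    unit: str                               # assumed field, not named in the text
    environment: dict = field(default_factory=dict)
    measurement_method: Optional[str] = None
    equation: Optional[str] = None
    specimen: Optional[EngineeringMaterial] = None  # plays the role of Specimen


@dataclass
class MaterialInformation:
    """One record of materials data, i.e. a single database entry."""
    material: EngineeringMaterial
    properties: List[MaterialProperty] = field(default_factory=list)
    data_source: Optional[str] = None


aluminum = EngineeringMaterial("Al", "pure aluminum", "Metal")
conductivity = MaterialProperty(
    property_name="ThermalConductivity",
    value=237.0,                            # illustrative number only
    unit="W/(m K)",
    environment={"temperature_K": 300.0},
    specimen=aluminum,
)
record = MaterialInformation(material=aluminum, properties=[conductivity])
```

Here the `specimen` attribute holds a reference to an `EngineeringMaterial` instance rather than a copy, which mirrors an ObjectProperty taking an instance as its value.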

IMPLEMENTATION AND AN APPLICATION OF MATERIALS ONTOLOGY FOR MATERIAL DATA EXCHANGE
In order to utilize software tools and to maintain compatibility with other ontologies, Materials Ontology is implemented in OWL, the Web Ontology Language (Bechhofer, Harmelen, Hendler, Horrocks, McGuiness, Patel-Schneider, et al., 2004), which is recommended by W3C (the World Wide Web Consortium). OWL is a component of the so-called Semantic Web framework, a collection of standardized definition methods of data structures, ontologies, rules, and logic for sharing data and knowledge on the Internet. One application of the Semantic Web is the sharing of scientific knowledge and information (Marshall & Prud'hommeaux, 2008).
We used the Protégé ontology editor (Knublauch, Fergerson, Noy, & Musen, 2004) to edit and visualize the ontology and defined about 600 classes. From the huge amount of conceptual material related to materials properties, we have focused on thermal properties because they are found in all three materials databases (AIST, NIMS, and MatDB).
Data exchange among three heterogeneous databases is implemented with an intermediate data representation based on Materials Ontology. Figure 5 displays part of the thermal property data structures defined in the NIMS, Materials Ontology, and AIST databases. The gray arrows indicate corresponding concepts. For example, the value of the thermal conductivity is defined as a tensor (k11, k22, k33) in the AIST database; the "field" accepts both tensor and scalar; and the corresponding value class in Materials Ontology accepts a scalar value and an arbitrary-size matrix.
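One way to see why a value class that accepts both a scalar and an arbitrary-size matrix is convenient is to normalize every incoming value to a matrix. The sketch below is hypothetical and is not taken from the Materials Ontology implementation; in particular, reading (k11, k22, k33) as the diagonal components of a 3 x 3 tensor is an assumption, and the function name `to_matrix` is invented for the example.

```python
from typing import List, Sequence, Union

Number = Union[int, float]
Matrix = List[List[float]]


def to_matrix(value: Union[Number, Sequence]) -> Matrix:
    """Normalize a scalar, a list of diagonal components, or a matrix."""
    if isinstance(value, (int, float)):
        # A scalar becomes a 1 x 1 matrix.
        return [[float(value)]]
    rows = list(value)
    if not rows:
        raise ValueError("empty value")
    if all(isinstance(x, (int, float)) for x in rows):
        # Assumed reading: components such as (k11, k22, k33) form a diagonal.
        n = len(rows)
        return [[float(rows[i]) if i == j else 0.0 for j in range(n)]
                for i in range(n)]
    matrix = [[float(x) for x in row] for row in rows]
    if len({len(row) for row in matrix}) != 1:
        raise ValueError("rows have different lengths")
    return matrix
```

With this normalization, a scalar field and a tensor field from two different databases end up in the same intermediate shape, so a single comparison or conversion routine can serve both.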
These two schemas show typical styles of material data representation with relational data models. The NIMS data structure is a straightforward derivation of an experimental data structure; each data field has the name of its contents, e.g., thermalConductivity and chemicalFormula. In contrast, the AIST data schema is designed for extensibility. Its fields have generalized names, e.g., "property." In this field, the names of materials properties are stored as string data. This design enables the addition or modification of property names after the schema has been fixed.
The corresponding data fields are mapped into the Materials Ontology network, using an XSLT (Clark, 1999) template to perform the translation. Although the template synthesis can be automated, it is still done manually.
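The idea of mapping both schema styles onto one intermediate representation can be sketched in plain Python. This is not XSLT and does not reproduce the templates used in this work; it is a stand-in that uses dictionaries as mapping tables. Only the "matinfo:" prefix and the field names thermalConductivity, chemicalFormula, and "property" come from the text; the remaining field names, the term names after the prefix, and the sample values are assumptions.

```python
# Hypothetical intermediate vocabulary; only the "matinfo:" prefix is from the text.
FORMULA = "matinfo:ChemicalFormula"
PROPERTY_NAME = "matinfo:PropertyName"
VALUE = "matinfo:Value"

# Schema with one named field per property: field name -> ontology concept.
NAMED_PROPERTY_FIELDS = {"thermalConductivity": "ThermalConductivity"}

# Schema with a generic "property" field: stored string -> ontology concept.
GENERIC_PROPERTY_NAMES = {"thermal conductivity": "ThermalConductivity"}


def from_named_fields(record):
    """Translate a record whose field names identify the properties."""
    return [
        {FORMULA: record["chemicalFormula"],
         PROPERTY_NAME: concept,
         VALUE: record[name]}
        for name, concept in NAMED_PROPERTY_FIELDS.items()
        if name in record
    ]


def from_generic_fields(record):
    """Translate a record that stores the property name as string data."""
    concept = GENERIC_PROPERTY_NAMES[record["property"].strip().lower()]
    return [{FORMULA: record["material"],
             PROPERTY_NAME: concept,
             VALUE: record["value"]}]


# Illustrative records; "material" and "value" are assumed field names.
named = {"chemicalFormula": "Al", "thermalConductivity": 237.0}
generic = {"material": "Al", "property": "Thermal Conductivity", "value": 237.0}
```

Both records translate to the same intermediate entry, which is the point of routing the exchange through a shared ontology: each database needs one mapping to the common concepts instead of one mapping per partner database.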
Figure 5. Data schema structures for thermal conductivity of the NIMS, Materials Ontology, and AIST databases and correspondence of their data fields; "matinfo:" is the prefix of the material information ontology.

DISCUSSION
There are several trials that have developed a materials database format with XML, such as MatDB and MatML (Kaufman & Begley, 2003). However, MatML has no definition of concepts. In other words, it has no semantics.
MatML takes an approach similar to that of the AIST materials database discussed in the previous section, which achieves generality. Of course, it is very difficult to agree on definitions of concepts. In order to exchange data among heterogeneous materials databases, however, some kind of translation table for materials data is essential.
It is reasonable to start extracting common concepts from existing databases or handbooks and gradually extend this work. Modularity and extensibility should be considered in the basic structure. In OWL, they are provided mainly by namespaces and prefix URIs, which should be selected carefully. Ashino and Fujita (2006) provide an example of materials science descriptive knowledge using OWL. The Materials Ontology gives essential common terms or notations for this kind of knowledge base. Another possible component is a digital library of empirical and theoretical equations for materials science knowledge written in MathML, the mathematical markup language. These three possibilities use a common data format, XML, and can work together on the Web.
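The role of namespaces and prefixes in keeping sub-ontologies modular can be illustrated with a small sketch. The namespace URIs below are placeholders under example.org and the prefix names other than "matinfo" are invented; they are not the URIs actually allocated for Materials Ontology.

```python
# Placeholder namespaces, one per hypothetical sub-ontology module.
PREFIXES = {
    "matinfo": "http://example.org/materials-ontology/information#",
    "property": "http://example.org/materials-ontology/property#",
    "substance": "http://example.org/materials-ontology/substance#",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'property:Thermal' to a full URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


def compact(uri, prefixes=PREFIXES):
    """Shorten a full URI to a prefixed name when a namespace matches."""
    # Try longer namespaces first so nested namespaces resolve correctly.
    for prefix, namespace in sorted(
            prefixes.items(), key=lambda item: len(item[1]), reverse=True):
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri
```

Because every term is identified by a namespace plus a local name, a new sub-ontology can be added under its own namespace without renaming existing terms, which is why the choice of namespaces deserves care.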

CONCLUSION
Materials Ontology, which covers concepts of materials substance, property, environment, and process, is developed and applied for data exchange among heterogeneous materials databases. It serves as a base with which to integrate databases and other materials information resources, such as computational simulations, equation solvers, and documents. Materials Ontology is a structured common dictionary of concepts related to materials science. It is used to describe materials science knowledge within a Semantic Web framework, to map concepts among heterogeneous materials data resources, and to enable intelligent search of materials information, data, and knowledge.
Documentation of Materials Ontology is available from our materials information portal, http://musigny.rds.toyo.ac.jp:8080/. It is currently available only in Japanese and is in the process of translation into English.

Figure 3. Data structure of MaterialInformation ontology. The MaterialInformation class aggregates material, its property, and data source. It corresponds to a single data entry in a materials database.

Figure 4. Thermal conductivity data of pure aluminum specimen written in Materials Ontology.

Figure 1. Structure of the Property ontology. The Property class has the properties PropertyName, Value, and UnitSystem. The subclasses, Chemical, Thermal, and Mechanical, correspond to the taxonomy of materials property.

Figure 2. A part of the Substance ontology class network. The concept "Alloy" is a subclass of two classes, "CompoundSubstance" and "Mixture."

Figure 1 shows part of the structure of the Property ontology. The Property class, which corresponds to the fundamental data structure of materials property, has six properties (attributes). Three of them, PropertyName, UnitDimension, and UnitSystem, are ObjectProperties, which means they can take an instance of appropriate classes.
Process ontology describes not only material processing, such as heat treatment, but also measurement methods and manufacturing processes, such as welding.
Environment ontology describes the environmental parameters of materials processes, such as atmosphere, stress, and temperature. It defines concepts related to the environmental parameters of testing, measurement, and processing materials, such as pH, Atmosphere, and HoldingTime.
Substance ontology is a taxonomy of materials based on material types, as shown in Figure 2. The designers of MatOnt have decided to construct a structure ontology because microstructure is very important information for advanced materials design. In this paper, however, the target databases are collections of data for engineering materials. Engineering materials, in many cases, are mixtures of several microstructures, and separate structure ontologies have not been developed.

Table 1. Example of URIs allocated for units and dimensions