THE “GDSCLIENT” COLLECTING TOOL FOR NETWORKED SOLID EARTH SCIENCE DATA

The data center of our institute distributes solid earth science data obtained by the Ocean Hemisphere Project (OHP) network through the Pacific 21 website. We have developed Java-based software, “GDSClient”, which uses web service technology to collect not only the data of the OHP network but also data distributed by other data centers. Users can request data by specifying parameters such as data centers, observatories, a data period, and other auxiliary parameters. Users need not know the differences between data centers, because a WSDL (Web Services Description Language) file, in which the user interface information is described in XML format, is prepared for each data center. The latest versions of the GDSClient are released on the Pacific 21 website.


INTRODUCTION
The Institute for Research on Earth Evolution, Japan Agency for Marine-Earth Science and Technology (IFREE/JAMSTEC), together with the Earthquake Research Institute of the University of Tokyo, has carried out long-term geophysical observations in the western Pacific Ocean, called "the Ocean Hemisphere Project (OHP) network", for more than ten years. The main purpose of the OHP network is to fill the spatial gap in geophysical observatories and to obtain more precise models of the structure and activity of the Earth's interior (e.g., Kawakatsu et al., 1998). In this OHP network, we operate 25 seismic observatories, 7 geomagnetic observatories, and 8 GPS observatories. The observational data are distributed via "Pacific 21" (http://www.jamstec.go.jp/pacific21/), the website of the IFREE/JAMSTEC data center. We developed a geophysical data distribution system, "NINJA", which can collect not only the OHP data but also geophysical data distributed by other data centers (Takeuchi et al., 2002). This networked data collection works through a single user interface in which the user inputs parameters such as the observatories and the data period to retrieve. However, this function of NINJA requires communication with each data server, which has recently become difficult from the point of view of network security. Nevertheless, geophysicists increasingly request useful data collection software that frees them from accessing each data center separately.
We have been developing the NINJA portal and the GDSClient (Geophysical Data Service Client) to realize the collection of networked data despite tight network security. The NINJA portal enables us to collect seismic event and continuous data distributed across multiple data centers, and a prototype version is now available via the Pacific 21 website (Tsuboi et al., 2008). The GDSClient is PC-based software that uses web service technology, i.e., HTTP is used both for communication with data servers and for data transfer. The currently available data include not only seismic event and continuous data but also geomagnetic data (e.g., Nagao et al., 2008). We give an overview of the GDSClient in the next section.

OVERVIEW OF "GDSCLIENT"
The Role of "GDSClient" in the Flow of Geophysical Data
The seismic version of the GDSClient can download two types of seismic data, i.e., continuous data and event data. Continuous data are the sequential observational data themselves, while event data are parts of the continuous data filed for each earthquake event. The currently available data centers include the IFREE data center, the OHP data center of the University of Tokyo, IRIS (Incorporated Research Institutions for Seismology, USA), and ORFEUS (Observatories and Research Facilities for European Seismology). The electromagnetic version of the GDSClient can download continuous data. The currently available data centers are the IFREE and OHP data centers, and other data centers can be added if necessary.
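The relation between continuous data and event data can be illustrated with a minimal sketch. It assumes a record held in memory as an array of samples with a known start time and sampling rate, and it cuts out the window around an event time. The class name, the method signature, and the window lengths are hypothetical and are not taken from the GDSClient or from any data center format.

```java
import java.util.Arrays;

class EventWindowSketch {
    /**
     * Cuts the samples covering [eventSec - preSec, eventSec + postSec) out of a
     * continuous record that starts at startSec and is sampled at rateHz.
     * All times are seconds on a common clock (an assumption of this sketch).
     * The window is clipped to the extent of the record.
     */
    static double[] cut(double[] continuous, long startSec, int rateHz,
                        long eventSec, long preSec, long postSec) {
        long from = (eventSec - preSec - startSec) * rateHz;
        long to = (eventSec + postSec - startSec) * rateHz; // exclusive end index
        int lo = (int) Math.max(0, Math.min(continuous.length, from));
        int hi = (int) Math.max(lo, Math.min(continuous.length, to));
        return Arrays.copyOfRange(continuous, lo, hi);
    }

    public static void main(String[] args) {
        // A made-up continuous record: 100 s of data at 1 Hz.
        double[] record = new double[100];
        for (int i = 0; i < record.length; i++) {
            record[i] = i;
        }
        // Event at t = 50 s, keeping 10 s before and 20 s after it.
        double[] event = cut(record, 0, 1, 50, 10, 20);
        System.out.println(event.length + " samples, first = " + event[0]);
    }
}
```

In this sketch a window that extends beyond the record is clipped rather than rejected, so an event near the end of the record yields a shorter array.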
Even when geophysical data are available to anyone, it is usually prohibited to redistribute the collected data to a third party without the permission of the relevant data center. Software that first archives collected geophysical data on a server may therefore violate this rule. The GDSClient is free from this redistribution problem because users access and download the geophysical data directly from the data centers, although it is, of course, better to consult the relevant data center in advance.

Development of the "GDSClient"
Various operating systems, typified by Windows, MacOS, and UNIX, are used in geophysics, so data collecting tools should ideally run on any operating system. The GDSClient was therefore developed in the Java language and runs on the Java platform of a version greater than 1.4.
Although the input items needed to request geophysical data depend on the user interface design of each data center, some input items are usually similar for time series of solid earth science data, e.g., a data period, observatories, and channels. Collecting tools should therefore absorb the differences between user interfaces to be user-friendly. For each data center that distributes one, the GDSClient refers to a WSDL (Web Services Description Language) file, in which the user interface information and the data schema are described in XML format. Figure 2 shows the basic structure of a WSDL file for the geomagnetic database of the OHP network in NINJA. Roughly speaking, the WSDL file defines which parameters can or should be specified, how to translate input text commands into the internal commands used in the data server and vice versa, i.e., the XSLT (Extensible Stylesheet Language Transformations), and which URL the users' requests are sent to. Users need not know the differences between the user interfaces of the data centers, and the WSDL files allow the GDSClient to reduce its system size drastically, although a WSDL file is not always necessary.
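How a client might read such a file can be shown with a minimal sketch. It uses the standard Java XML parser to look up the location of the XSLT file in a WSDL-like document, assuming that the location is given by the href attribute of a translate tag with the gds prefix, as listed in Table 1. The namespace URI, the sample document, and the example.org address are made up for illustration, and the code is not the GDSClient's own implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class WsdlTranslateSketch {
    // Hypothetical namespace URI for the "gds" prefix; the real one is not given in the text.
    static final String GDS_NS = "urn:example:gds";

    // A made-up, heavily reduced stand-in for a data center's WSDL file.
    static final String SAMPLE =
        "<definitions xmlns:gds=\"" + GDS_NS + "\">"
        + "<gds:translate type=\"text/xsl\" href=\"http://example.org/translate.xsl\"/>"
        + "</definitions>";

    /** Returns the href of the first gds:translate element, or null if there is none. */
    static String translateHref(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to resolve the gds prefix
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagNameNS(GDS_NS, "translate");
            if (nodes.getLength() == 0) {
                return null;
            }
            return ((Element) nodes.item(0)).getAttribute("href");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("The input could not be parsed as XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(translateHref(SAMPLE));
    }
}
```

Because the sketch only reads what the file declares, the same client code would work for any data center that publishes the tag, which is the sense in which the WSDL file absorbs the differences between user interfaces.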

Detailed Functions of "GDSClient"
In both the seismic and electromagnetic versions of the GDSClient, the mandatory parameters to be input through the user interface are data centers, observatories, channels or components, and a data period. To avoid downloads of large amounts of data that would overload the data server, a limit is set for each parameter. When one of the input parameters or the file size of the requested data exceeds these limits, the GDSClient interrupts the download and shows a warning message. The GDSClient has an "advanced search" mode, in which auxiliary parameters can be specified to control the users' requests in more detail. In the seismic version, it is possible to specify, for example, earthquake magnitudes and hypocenter depths. In addition, an area of interest can be selected on the world map for seismic event data. In the electromagnetic version, the range of the Sigma Kp index, quiet or disturbed days, or the range of the missing rate can be specified in the advanced search. The candidate data that match the input search parameters are then displayed. Most data centers require users to input information such as name, affiliation, and e-mail address before a data download, and the GDSClient carries out this user information process in place of the data center (Figure 1). Finally, the requested geophysical data can be retrieved to the local PC via HTTP.
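The limit check described above can be sketched as follows. The sketch assumes three limits, on the number of observatories, the length of the data period, and the estimated file size. The limit values, the class name, and the warning texts are hypothetical, since the actual limits are not given in the text.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Optional;

class RequestLimitSketch {
    // Hypothetical limits; the real values are not stated in the text.
    static final int MAX_OBSERVATORIES = 10;
    static final long MAX_PERIOD_DAYS = 31;
    static final long MAX_BYTES = 100L * 1024 * 1024;

    /**
     * Returns a warning message if the request exceeds one of the limits,
     * or an empty Optional if the download may proceed.
     */
    static Optional<String> check(int observatories, LocalDate first, LocalDate last,
                                  long estimatedBytes) {
        if (last.isBefore(first)) {
            return Optional.of("The data period ends before it starts.");
        }
        long days = ChronoUnit.DAYS.between(first, last) + 1; // both end days included
        if (observatories > MAX_OBSERVATORIES) {
            return Optional.of("Too many observatories: " + observatories);
        }
        if (days > MAX_PERIOD_DAYS) {
            return Optional.of("The data period is too long: " + days + " days");
        }
        if (estimatedBytes > MAX_BYTES) {
            return Optional.of("The requested data are too large: " + estimatedBytes + " bytes");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> warning = check(
            3, LocalDate.of(2009, 1, 1), LocalDate.of(2009, 3, 1), 1024L);
        // A present value means the download should be interrupted with this message.
        System.out.println(warning.orElse("Request accepted"));
    }
}
```

Running the check on the client side, before anything is sent, keeps oversized requests away from the data server altogether.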

FUTURE WORKS AND CONCLUSIONS
The latest version of the GDSClient can be downloaded from the Pacific 21 website (Figure 3). The GDSClient is still being updated, e.g., to increase the number of accessible data centers, to speed up communication with HTTP servers when sending requests and retrieving data, and to make the user interface friendlier. We believe that the best solution for collecting networked geophysical data is to use web service technology, as the GDSClient does.

Figure 1. Schematic illustration of the roles of the NINJA and the GDSClient in the flows of geophysical data.

Figure 3. The left panel shows the top page of "Pacific 21" (http://www.jamstec.go.jp/pacific21/) operated by the data center of IFREE/JAMSTEC. The NINJA interfaces and the NINJA portal can be accessed from this page, and users can download each kind of geophysical data obtained in the OHP network, i.e., seismic continuous, seismic event, electromagnetic, and GPS data. The GDSClient website shown in the right panel can be reached from the link (red circle) on the top page, and the GDSClient packages of both the seismic and electromagnetic versions and their operation manuals are available there.

Figure 2. The basic structure of a WSDL file served from a data center. This example is the case of the GDSClient of electromagnetic version. Explanations of some significant XML tags are shown in Table 1.

Data Science Journal, Volume 9, 28 March 2010

Table 1. Some significant XML tags used in WSDL files that are referred to by the GDSClients
<gds:translate type="text/xsl" href="http:// ... " /> tag indicates the URL of an XSLT file in which translation rules between input text commands and internal commands are described.
<gds