SUPPORT VECTOR MACHINES FOR PHOTOMETRIC REDSHIFT ESTIMATION FROM BROADBAND PHOTOMETRY

Photometric redshifts are regarded as an efficient and effective means of studying the statistical properties of galaxies and their evolution. In this paper, we introduce SVM_Light, a freely available software package that uses support vector machines (SVMs), and apply it to photometric redshift estimation. The technique is both accurate and efficient, and it can be applied to very large datasets. When a large, representative training set is available, the results of this method are superior to the best ones obtained from template fitting. The method is applied to a sample of 73,899 galaxies from the Sloan Digital Sky Survey Data Release 5. When applied to processed data sets, the RMS error in estimating redshifts is less than 0.03. The performance of various kernel functions and different parameter sets is compared. Parameter selection and uniform data are also discussed. Finally, the strengths and weaknesses of the approach are summarized.


INTRODUCTION
With large and deep sky survey projects being carried out, studying the formation and evolution of galaxies has rapidly become a crucial goal of mainstream observational cosmology. To achieve this purpose, redshift, which is one of the most crucial factors, must be obtained. Most commonly, the redshifts of galaxies are determined spectroscopically. However, for large samples of faint galaxies, spectra are not easy to obtain. Rather than observing narrow spectral features of galaxy spectra, the photometric redshift technique concentrates on medium- or broad-band color features. Because the photometric redshift measurement relies only on colors, the approach can be extended to high redshifts (Stephen, 1995). Moreover, the photometric redshift method is also the only way to estimate redshift beyond the spectroscopic limit. The chief disadvantage of photometric redshifts is that they are less precise than spectroscopic ones. However, for determining the properties of large numbers of galaxies in a statistical way, the uncertainty of photometric redshifts can be tolerated.
Data Science Journal, Volume 6, Supplement, 18 August 2007
Two kinds of photometric redshift methods are available: the template fitting approach and the training set approach. In template fitting, a set of spectral templates spanning a range of galaxy types and redshifts is constructed in advance, and the redshift is estimated by minimizing the standard χ² between the observed photometric data and the templates. No spectroscopic information is required, and this method can be extended beyond the spectroscopic redshift limit. Commonly used templates are derived either from real observations, such as CWW (Coleman, Wu, & Weedman, 1980), or from population synthesis models (e.g., Bruzual & Charlot, 1993). Although it is easy to implement, the accuracy of this approach strongly depends on the templates.
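The χ² minimization used in template fitting can be illustrated with a minimal sketch. The template grid, fluxes, and errors below are invented for illustration, and the free flux scale factor is an assumption about how the fit is normalized; this is not the procedure of any particular template-fitting code.

```python
def chi_square(observed, errors, model):
    """Chi-square between observed and model fluxes.

    The model is first rescaled by the flux factor that minimizes
    chi-square, which has a closed-form solution.
    """
    num = sum(o * m / e ** 2 for o, m, e in zip(observed, model, errors))
    den = sum(m * m / e ** 2 for m, e in zip(model, errors))
    scale = num / den
    return sum(((o - scale * m) / e) ** 2
               for o, m, e in zip(observed, model, errors))


def best_redshift(observed, errors, grid):
    """Return the (redshift, template) key with the smallest chi-square."""
    return min(grid, key=lambda key: chi_square(observed, errors, grid[key]))


# Hypothetical grid: predicted five-band fluxes for each (z, template) pair.
grid = {
    (0.05, "elliptical"): [1.0, 2.0, 3.0, 3.5, 3.8],
    (0.10, "elliptical"): [0.8, 1.8, 3.0, 3.7, 4.1],
    (0.05, "spiral"): [2.0, 2.5, 3.0, 3.1, 3.2],
}
observed = [1.6, 3.6, 6.0, 7.4, 8.2]  # twice the second model, by construction
errors = [0.1] * 5

z_phot, template = best_redshift(observed, errors, grid)
```

In practice the grid would be much denser in redshift and the model fluxes would come from redshifting each template through the survey filter curves.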
The essence of the training set approach is to derive a function relating redshift to photometric data by using a large, representative training set of galaxies for which both photometry and redshifts are known, and then to use this function to estimate the redshifts of the remaining galaxies. In the past few years, a large number of training set methods have been developed and used (Way & Srivastava, 2006).
In this paper, we use support vector machines to estimate photometric redshifts using photometric data from the Sloan Digital Sky Survey and the Two-Micron All Sky Survey. The outline of the paper is as follows: Section 2 introduces support vector machines, Section 3 describes the data used in the study, and Section 4 presents and discusses the results. Our conclusions are summarized in Section 5.

SUPPORT VECTOR MACHINES
Support Vector Machines (SVMs) were developed by Vapnik (1995) and have been applied to solve classification and regression problems. The regression solution of SVMs is achieved by using an alternative loss function. The optimal regression function is given by the minimum of a regularized cost functional, which leads to a constrained optimization problem. To generalize to non-linear regression, we replace the dot product with a kernel function. More information can be found in Steve's tutorial (1998).
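For reference, the standard ε-insensitive formulation of SVM regression can be written as follows. The notation (w, b, C, ε, the slack variables ξ_i and ξ_i*, the multipliers α_i and α_i*, and the kernel parameter γ) is the conventional one and is an assumption here; it may differ from the symbols used by the authors.

```latex
% Linear regression function
f(x) = \langle w, x \rangle + b

% \varepsilon-insensitive loss: deviations smaller than \varepsilon cost nothing
L_{\varepsilon}\bigl(y, f(x)\bigr) = \max\bigl(0,\ \lvert y - f(x) \rvert - \varepsilon\bigr)

% Primal optimization problem for n training examples (x_i, y_i)
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad
  & \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \bigl(\xi_i + \xi_i^{*}\bigr) \\
\text{subject to} \quad
  & y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i, \\
  & \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0, \quad \xi_i^{*} \ge 0, \qquad i = 1, \dots, n.
\end{aligned}

% Non-linear regression: the dot product is replaced by a kernel K
f(x) = \sum_{i=1}^{n} \bigl(\alpha_i - \alpha_i^{*}\bigr) K(x_i, x) + b,
\qquad
K(x, x') = \exp\bigl(-\gamma \lVert x - x' \rVert^{2}\bigr)
```

The last expression uses the radial basis function kernel as an example; training examples with a non-zero coefficient α_i − α_i* are the support vectors.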
Because of their excellent generalization performance, SVMs have been widely applied in machine learning, for example in handwritten digit recognition and face detection. In astronomy, SVMs have been applied to identify red variables (Williams, Wozniak, Vestrand, & Gupta, 2004), cluster astronomical objects (Zhang & Zhao, 2004), and separate AGNs from stars and normal galaxies (Zhang, Cui, & Zhao, 2002).
Several software implementations of the SVM algorithm are accessible on the web. Considering its robustness, its ability to handle large amounts of data, and its regression time, we use SVM_Light in our case study. SVM_Light is a fast, optimized SVM implementation written in C. It can deal with many thousands of support vectors, handle hundreds of thousands of training examples, and provide several standard kernel functions. Details about SVM_Light can be found at http://www.cs.cornell.edu/People/tj/svm_light/.
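As an illustration of how photometric data might be passed to SVM_Light, the sketch below formats examples in the package's sparse text input format, where each line holds a target value followed by index:value feature pairs. The galaxies, file names, and γ value are hypothetical, and the paper does not state the exact commands the authors ran.

```python
def to_svmlight_line(target, features):
    """Format one example as '<target> 1:<v1> 2:<v2> ...'.

    Feature indices start at 1 and are written in increasing order.
    """
    pairs = " ".join(f"{i}:{v:.6g}" for i, v in enumerate(features, start=1))
    return f"{target:.6g} {pairs}"


# Hypothetical galaxies: spectroscopic redshift plus four colors.
examples = [
    (0.1234, [1.25, 0.5, 0.25, 0.125]),
    (0.0800, [1.10, 0.75, 0.40, 0.30]),
]
lines = [to_svmlight_line(z, feats) for z, feats in examples]

# The lines would be written to a file such as train.dat (hypothetical name)
# and passed to the SVM_Light command-line tools, for example:
#   svm_learn -z r -t 2 -g 1.0 train.dat model
#   svm_classify test.dat model predictions
# Here -z r selects regression, -t 2 the RBF kernel, and -g the kernel gamma;
# the value 1.0 is only a placeholder.
```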

DATA
The data used in this paper come from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All Sky Survey (2MASS). General information on SDSS and 2MASS is given below.

Sloan Digital Sky Survey
The Sloan Digital Sky Survey (SDSS) (York, Adelman, Anderson, Annis, Bahcall, et al., 2000) is an astronomical survey covering more than a quarter of the sky. It aims to construct the first comprehensive three-dimensional digital map of the universe, using a dedicated 2.5-meter telescope located at Apache Point, New Mexico.
In its first phase of operations, it imaged 8,000 square degrees in five bandpasses (u, g, r, i, z) and measured more than 675,000 galaxies, 90,000 quasars, and 185,000 stars. In its second stage, SDSS will carry out three new surveys in different research areas: the nature of the universe, the origin of galaxies and quasars, and the formation and evolution of the Milky Way.

Two-Micron All Sky Survey
The Two-Micron All Sky Survey (2MASS) uses two highly automated 1.3-m telescopes, one at Mt. Hopkins, Arizona, and the other at CTIO, Chile. Each telescope has three channels, which observe the sky simultaneously in three near-infrared bands (j, h, and k). Jarrett et al. (2000) give more detailed information on the extended source catalog.
We select all galaxies with known redshifts from SDSS Data Release Five and cross-match them with the 2MASS extended source catalog within a search radius of 3 times the SDSS positional errors. Cross-matching yields more than 150,000 galaxies. We then apply further restrictions. All data must satisfy the following criteria:
1) The spectroscopic redshift confidence must be equal to or greater than 0.95.
2) The redshift warning flag is 0.
These criteria produce a sample of 73,899 galaxies. Table 1 shows the broadband filters and their wavelength ranges.
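The cross-match and quality cuts described above can be sketched as follows. The record layout and the field names (`ra`, `dec`, `pos_err_arcsec`, `zConf`, `zWarning`) are assumptions made for illustration, as are the example coordinates; the paper does not describe how the selection was implemented.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds (haversine); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


def is_match(sdss, tmass, n_sigma=3.0):
    """True if the 2MASS source lies within n_sigma SDSS positional errors."""
    sep = angular_sep_arcsec(sdss["ra"], sdss["dec"], tmass["ra"], tmass["dec"])
    return sep <= n_sigma * sdss["pos_err_arcsec"]


def passes_cuts(sdss, min_conf=0.95):
    """Redshift confidence >= min_conf and redshift warning flag equal to 0."""
    return sdss["zConf"] >= min_conf and sdss["zWarning"] == 0


# Made-up records for illustration only.
sdss_obj = {"ra": 150.0, "dec": 2.0, "pos_err_arcsec": 0.1,
            "zConf": 0.98, "zWarning": 0}
near = {"ra": 150.0, "dec": 2.0 + 0.2 / 3600}  # about 0.2 arcsec away
far = {"ra": 150.0, "dec": 2.0 + 1.0 / 3600}   # about 1.0 arcsec away

keep = is_match(sdss_obj, near) and passes_cuts(sdss_obj)
```

A pairwise comparison like this is only practical for small samples; a catalog of this size would normally be matched with a spatial index or a database join.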

RESULT AND DISCUSSION
When implementing SVMs, we adopt the default soft-margin parameter (c) and a radial basis function (RBF) kernel, and tune the kernel parameter (γ) to obtain the optimal result. We randomly divide the sample into two parts: approximately two thirds for training and one third for testing. The training set has 50,000 samples and the test set has 23,899 samples.
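The random split and the RBF kernel can be sketched in a few lines. The seed and the helper names are assumptions for illustration; only the sample sizes come from the text.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def split_indices(n, n_train, seed=0):
    """Shuffle the indices 0..n-1 and split them into training and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]


# 73,899 galaxies: 50,000 for training, the remaining 23,899 for testing.
train_idx, test_idx = split_indices(73899, 50000)
```

A larger γ makes the kernel fall off faster with distance in the input space, so each support vector influences a smaller neighborhood; this is the parameter tuned in the text.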
Different input parameter sets are selected, including model magnitudes (u, g, r, i, z) from SDSS, dereddened magnitudes (u', g', r', i', z') from SDSS, magnitudes (j, h, k) from 2MASS, and colors composed of these magnitudes. Using the training set to train the SVMs and the test set to test the regression estimator, we obtain the performance of the various parameter sets. The RMS scatters of the photometric redshifts are listed in Table 2. As Table 2 shows, colors perform better than magnitudes; the results with input patterns based on dereddened magnitudes are superior to those based on model magnitudes; and the more parameters used, the higher the precision of the redshift estimation. The best RMS error is 0.028.
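The two quantities used in this comparison, colors built from magnitudes and the RMS scatter of the predicted redshifts, can be computed as in the sketch below. Taking adjacent-band differences as the colors is an assumption, since the text does not list which colors were used, and the magnitudes are invented.

```python
import math


def colors(mags):
    """Adjacent-band colors, e.g. [u, g, r, i, z] -> [u-g, g-r, r-i, i-z]."""
    return [a - b for a, b in zip(mags, mags[1:])]


def rms_error(z_spec, z_phot):
    """Root-mean-square difference between predicted and true redshifts."""
    squared = [(p - s) ** 2 for s, p in zip(z_spec, z_phot)]
    return math.sqrt(sum(squared) / len(squared))


# Invented magnitudes for one galaxy in the five SDSS bands.
ugriz = [19.0, 18.0, 17.5, 17.25, 17.0]
galaxy_colors = colors(ugriz)
```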
When using artificial neural networks (ANNs), one must be familiar with the network architecture and decide how many input nodes and hidden layers to use. More complex networks tend to give more accurate results. With SVMs, by contrast, one chooses among kernel functions rather than network architectures. As long as an appropriate kernel function and parameters are chosen, the RMS scatter decreases significantly. In this study, the Gaussian (RBF) kernel is adopted. Moreover, some classic problems of ANNs, such as multiple local minima, the curse of dimensionality, and overfitting, seldom occur in SVMs.

CONCLUSION
We use Support Vector Machines (SVMs) to estimate photometric redshifts from cross-matched SDSS DR5 and 2MASS data. The photometric redshift accuracy achieved by SVMs is comparable to that of ANNs, as good as linear or quadratic regression, and clearly much better than template fitting. In appropriate situations, SVMs will be highly competitive tools for determining photometric redshifts in terms of speed and applicability.
However, they do depend on the existence of a large and representative training sample. As with other empirical photometric redshift estimators, SVMs cannot be extrapolated to a region that is not well sampled by the training set. One potential way to increase the photometric redshift accuracy is to add input parameters, such as the r-band 50% and 90% Petrosian flux radii. This may improve the accuracy of redshift estimation by about 15% (Wadadekar, 2005). Another approach is to choose a more appropriate kernel function. In the future, we will consider feature selection and extraction methods in the process of parameter selection.
The loss function used in SVM regression is modified to include a distance measure. The task of SVMs usually involves training and testing sets that consist of data instances. Each instance in the training set contains one "target value" and several "attributes." The goal of SVMs is to produce a model that predicts the target values of the data instances in the testing set, for which only the attributes are given.

Table 1. Survey filters and characteristics

Table 2. Photometric redshift prediction RMS errors for different kernel parameters