KRIGING METAMODELING: HOUSING DATA SET¶
This example showcases how to perform Kriging metamodeling using existing data sets. A standard machine learning data set related to the housing prices in Boston is considered.
For more information, see: Harrison, D., Jr. and D. L. Rubinfeld (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81-102. DOI:10.1016/0095-0696(78)90006-290006-2)
Package imports¶
from uqpylab import sessions, display_util
import numpy as np
import matplotlib.pyplot as plt
import scipy.io
Initialize common plotting parameters¶
display_util.load_plt_defaults()
Start a remote UQCloud session¶
# Start the session
mySession = sessions.cloud()
# (Optional) Get a convenient handle to the command line interface
uq = mySession.cli
# Reset the session
mySession.reset()
Processing .
.
done!
uqpylab.sessions :: INFO :: This is UQ[py]Lab, version 1.00, running on https://uqcloud.ethz.ch.
UQ[py]Lab is free software, published under the open source BSD 3-clause license.
To request special permissions, please contact:
- Stefano Marelli (marelli@ibk.baug.ethz.ch).
A new session (e72ff2a606fa4c148fa6b984e63b3b13) started.
uqpylab.sessions :: INFO :: Reset successful.
Set random seed for reproducibility¶
uq.rng(0,'twister');
np.random.seed(1)
Processing the data set¶
# Read the data set and store the content in matrices
mat = scipy.io.loadmat('housing.mat')
X = mat['X']
Y = mat['Y']
# Get the size of the experimental design:
N, M = X.shape
Training and validation sets¶
Use $80\%$ of the data for training and the rest for validation:
N_train = int(np.floor(0.8*N))
N_val = N - N_train
Initialize the results matrices:
N_reps = 5
Y_val = np.zeros((N_val, N_reps))
Y_Krg = np.zeros((N_val, N_reps))
legend_text = []
Kriging metamodels¶
Several Kriging metamodels are created using different training sets of the same size (N_train) randomly sampled from the whole housing data set.
The steps (repeated five times) are as follows:
- Randomly split the available data into a traning and a validation sets
- Define and create a Kriging metamodel based on the training set
- Evaluate the Kriging metamodel at the validation set points
Select Kriging as the metamodeling tool:
MetaOpts = {
'Type': 'MetaModel',
'MetaType': 'Kriging'
}
Use a linear trend for the Kriging metamodel:
MetaOpts['Trend'] = {
'Type': 'linear'
}
Use the CMAES optimization algorithm to estimate the hyperparameters:
MetaOpts['Optim'] = {
'Method': 'CMAES'
}
legend_text = []
for iter in range(N_reps):
# Randomly split the data into a training and a validation sets
idx_train = np.random.choice(N, size=N_train, replace=False)
idx_val = np.setdiff1d(np.arange(N), idx_train, assume_unique=True)
X_train = X[idx_train,:]
Y_train = Y[idx_train]
X_val = X[idx_val,:]
Y_val[:,[iter]] = Y[idx_val]
# Assign the training data set as the experimental design
MetaOpts['ExpDesign'] = {
'X': X_train.tolist(),
'Y': Y_train.tolist()
}
# Create the metamodel object
myKriging = uq.createModel(MetaOpts)
# Evaluate the Kriging metamodel at the validation set points
Y_Krg[:,[iter]] = uq.evalModel(myKriging, X_val)
# Create a text for the plot legend
legend_text.append(f'Fold {iter+1}')
Processing .
. done!
Processing .
done!
Processing .
done!
Validation¶
Compare the Kriging predictions at the validation set points against the true values over the N_reps repetitions:
plt.plot(Y_val, Y_Krg, 'o',alpha=0.7, markersize=3)
plt.plot([0, 60], [0, 60], 'k')
plt.axis([0, 60, 0, 60])
plt.grid(True)
plt.legend(legend_text, loc='lower right')
plt.xlabel('$\\mathrm{Y_{true}}$')
plt.ylabel('$\\mathrm{Y_{Krg}}$');
plt.show()
Terminate the remote UQCloud session¶
mySession.quit()
uqpylab.sessions :: INFO :: Session e72ff2a606fa4c148fa6b984e63b3b13 terminated.
True