KRIGING METAMODELING: HOUSING DATA SET¶

This example showcases how to perform Kriging metamodeling using existing data sets. A standard machine learning data set related to the housing prices in Boston is considered.

For more information, see: Harrison, D., Jr. and D. L. Rubinfeld (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81-102. DOI:10.1016/0095-0696(78)90006-290006-2)

Package imports¶

In [1]:
from uqpylab import sessions, display_util
import numpy as np
import matplotlib.pyplot as plt
import scipy.io

Initialize common plotting parameters¶

In [2]:
display_util.load_plt_defaults()

Start a remote UQCloud session¶

In [3]:
# Start the session
mySession = sessions.cloud()
# (Optional) Get a convenient handle to the command line interface
uq = mySession.cli
# Reset the session
mySession.reset()
Processing .
.
 done!

 uqpylab.sessions :: INFO     :: This is UQ[py]Lab, version 1.00, running on https://uqcloud.ethz.ch. 
                                 UQ[py]Lab is free software, published under the open source BSD 3-clause license.
                                 To request special permissions, please contact:
                                  - Stefano Marelli (marelli@ibk.baug.ethz.ch).
                                 A new session (e72ff2a606fa4c148fa6b984e63b3b13) started.
 uqpylab.sessions :: INFO     :: Reset successful.

Set random seed for reproducibility¶

In [4]:
uq.rng(0,'twister');
np.random.seed(1)

Processing the data set¶

In [5]:
# Read the data set and store the content in matrices
mat = scipy.io.loadmat('housing.mat')
X = mat['X']
Y = mat['Y']

# Get the size of the experimental design:
N, M = X.shape

Training and validation sets¶

Use $80\%$ of the data for training and the rest for validation:

In [6]:
N_train = int(np.floor(0.8*N))
N_val = N - N_train

Initialize the results matrices:

In [7]:
N_reps = 5
Y_val = np.zeros((N_val, N_reps))
Y_Krg = np.zeros((N_val, N_reps))
legend_text = []

Kriging metamodels¶

Several Kriging metamodels are created using different training sets of the same size (N_train) randomly sampled from the whole housing data set. The steps (repeated five times) are as follows:

  • Randomly split the available data into a traning and a validation sets
  • Define and create a Kriging metamodel based on the training set
  • Evaluate the Kriging metamodel at the validation set points

Select Kriging as the metamodeling tool:

In [8]:
MetaOpts = {
    'Type': 'MetaModel',
    'MetaType': 'Kriging'
}

Use a linear trend for the Kriging metamodel:

In [9]:
MetaOpts['Trend'] = {
    'Type': 'linear'
}

Use the CMAES optimization algorithm to estimate the hyperparameters:

In [10]:
MetaOpts['Optim'] = {
    'Method': 'CMAES'
}
In [11]:
legend_text = []
for iter in range(N_reps):
    
    # Randomly split the data into a training and a validation sets
    idx_train = np.random.choice(N, size=N_train, replace=False)
    idx_val = np.setdiff1d(np.arange(N), idx_train, assume_unique=True)
    X_train = X[idx_train,:]
    Y_train = Y[idx_train]
    X_val = X[idx_val,:]
    Y_val[:,[iter]] = Y[idx_val]

    # Assign the training data set as the experimental design
    MetaOpts['ExpDesign'] = {
        'X': X_train.tolist(),
        'Y': Y_train.tolist()
    }
    
    # Create the metamodel object
    myKriging = uq.createModel(MetaOpts)
    
    # Evaluate the Kriging metamodel at the validation set points
    Y_Krg[:,[iter]] = uq.evalModel(myKriging, X_val)
    
    # Create a text for the plot legend
    legend_text.append(f'Fold {iter+1}')
Processing .
. done!

Processing .
 done!

Processing .
 done!

Validation¶

Compare the Kriging predictions at the validation set points against the true values over the N_reps repetitions:

In [12]:
plt.plot(Y_val, Y_Krg, 'o',alpha=0.7, markersize=3)
plt.plot([0, 60], [0, 60], 'k')
plt.axis([0, 60, 0, 60])
plt.grid(True)
plt.legend(legend_text, loc='lower right')
plt.xlabel('$\\mathrm{Y_{true}}$')
plt.ylabel('$\\mathrm{Y_{Krg}}$');
plt.show()
No description has been provided for this image

Terminate the remote UQCloud session¶

In [13]:
mySession.quit()
 uqpylab.sessions :: INFO     :: Session e72ff2a606fa4c148fa6b984e63b3b13 terminated.
Out[13]:
True