GPA layer and NLL loss functiona added#72
Conversation
sgoldenCS
left a comment
There was a problem hiding this comment.
I generally think things look fine although I haven't run the unittests myself (maybe someone else could go through that process?). I'm glad you're using pytest and pytest fixtures though! It looks good!
| initializer=tf.constant_initializer(self.initial_noise_scale), | ||
| trainable=self.train_noise_scale, | ||
| dtype=tf.float32, | ||
| constraint=ClipByValue(1e-6, 1e6), |
There was a problem hiding this comment.
I like the use of a function for the constraint, but would it be possible to continue using the noise_bounds parameter here? Otherwise it should be removed from the class definition as I don't believe it's used elsewhere.
There was a problem hiding this comment.
Removed noise_bound from the class definition.
| # self.noise_scale = tf.Variable( | ||
| # self.initial_noise_scale, | ||
| # dtype=tf.float32, | ||
| # trainable=self.train_noise_scale, | ||
| # constraint=lambda z: tf.clip_by_value(z, self.noise_bounds[0], self.noise_bounds[1]), | ||
| # name='noise_scale' | ||
| # ) |
There was a problem hiding this comment.
This looks to be essentially the same as the new version (other than the noise_bounds thing), so feel free to remove the commented section to clean things up.
| # def call(self, inputs, training=None, return_features=False): | ||
| # if training is None: | ||
| # training = tf.keras.backend.learning_phase() | ||
|
|
||
| # batch_size = tf.cast(tf.shape(inputs)[0], tf.float32) | ||
| # x = tf.convert_to_tensor(inputs, dtype=self.dtype) | ||
| # x = tf.cast(x, tf.float32) | ||
|
|
||
| # x = self.length_scale * x | ||
| # x = self.rff_map(x) | ||
| # x1 = tf.math.cos(x) | ||
| # x2 = tf.math.sin(x) | ||
|
|
||
| # ffs = layers.concatenate([x1, x2]) | ||
|
|
||
| # if self.scale_features: | ||
| # ffs = tf.math.sqrt(2.0 / self.n_fourier_features) * ffs | ||
|
|
||
| # ffs = tf.math.sqrt(self.constant_scale) * ffs | ||
| # output = self.rff_output(ffs) | ||
|
|
||
| # if training: | ||
| # if self.momentum > 0: | ||
| # update_prior_op = ( | ||
| # self.momentum * self.prior + (1 - self.momentum) * (tf.transpose(ffs) @ ffs / batch_size) | ||
| # ) | ||
| # else: | ||
| # update_prior_op = self.prior + tf.transpose(ffs) @ ffs | ||
| # self.prior.assign(update_prior_op) # Direct assignment | ||
|
|
||
| # variances = self.calc_variance(ffs) | ||
| # else: | ||
| # if not self.do_custom_cov_update: | ||
| # self.update_cov(self.prior) | ||
|
|
||
| # variances = self.calc_variance(ffs) | ||
|
|
||
| # stddevs = tf.math.sqrt(variances) | ||
| # out = [output, stddevs[:, None]] | ||
| # if return_features: | ||
| # out.append(ffs) | ||
|
|
||
| # return out |
There was a problem hiding this comment.
I don't see any changes here other than the default for the training flag, so I think this commented section could be removed too.
There was a problem hiding this comment.
GaussianProcessLayer --> We should rename since this is RFF approximation and not GP
There was a problem hiding this comment.
Name changed to GaussianProcessApproxiamtionLayer.
There was a problem hiding this comment.
please change file name --> gpa_layer.py
| tf.Tensor: The computed NLL loss (scalar). | ||
| """ | ||
|
|
||
| sigma_star = tf.square(std) + noise_scale + 1e-5 # Adding a small constant for numerical stability |
There was a problem hiding this comment.
is the noise scale additive?
There was a problem hiding this comment.
Removing the noise_scale from here because we are accounting the noise scale during the variance calculation within the gpa_layer class.
| import numpy as np | ||
| import tensorflow as tf | ||
| from tensorflow.keras import Model, Input | ||
| from jlab_datascience_toolkit.utils.keras_layers.GP_layer import GaussianProcessLayer |
There was a problem hiding this comment.
update with request change above.
Please don't call it a GaussianProcessLayer.
|
|
||
|
|
||
| def test_forward_pass(random_input): | ||
| layer = GaussianProcessLayer() |
There was a problem hiding this comment.
Please don't call it GaussianProcessLayer
|
@hasan2m - could you please move the layer and loss out from utils by following below directory structure
|
| # def call(self, inputs, training=None, return_features=False): | ||
| # if training is None: | ||
| # training = tf.keras.backend.learning_phase() | ||
|
|
||
| # batch_size = tf.cast(tf.shape(inputs)[0], tf.float32) | ||
| # x = tf.convert_to_tensor(inputs, dtype=self.dtype) | ||
| # x = tf.cast(x, tf.float32) | ||
|
|
||
| # x = self.length_scale * x | ||
| # x = self.rff_map(x) | ||
| # x1 = tf.math.cos(x) | ||
| # x2 = tf.math.sin(x) | ||
|
|
||
| # ffs = layers.concatenate([x1, x2]) | ||
|
|
||
| # if self.scale_features: | ||
| # ffs = tf.math.sqrt(2.0 / self.n_fourier_features) * ffs | ||
|
|
||
| # ffs = tf.math.sqrt(self.constant_scale) * ffs | ||
| # output = self.rff_output(ffs) | ||
|
|
||
| # if training: | ||
| # if self.momentum > 0: | ||
| # update_prior_op = ( | ||
| # self.momentum * self.prior + (1 - self.momentum) * (tf.transpose(ffs) @ ffs / batch_size) | ||
| # ) | ||
| # else: | ||
| # update_prior_op = self.prior + tf.transpose(ffs) @ ffs | ||
| # self.prior.assign(update_prior_op) # Direct assignment | ||
|
|
||
| # variances = self.calc_variance(ffs) | ||
| # else: | ||
| # if not self.do_custom_cov_update: | ||
| # self.update_cov(self.prior) | ||
|
|
||
| # variances = self.calc_variance(ffs) | ||
|
|
||
| # stddevs = tf.math.sqrt(variances) | ||
| # out = [output, stddevs[:, None]] | ||
| # if return_features: | ||
| # out.append(ffs) | ||
|
|
||
| # return out |
|
|
||
| if training: | ||
| update_prior_op = ( | ||
| self.momentum * self.prior + (1 - self.momentum) * (tf.transpose(ffs) @ ffs / batch_size) |
There was a problem hiding this comment.
check of the size is correct: (batch,batch)
| self.eigvals.assign(eigvals) | ||
| self.eigvecs.assign(eigvecs) | ||
|
|
||
| def calc_variance(self, ffs): |
There was a problem hiding this comment.
need to understand this better. this should be K_xx^{-1}
There was a problem hiding this comment.
Inverse of K_xx is determined using eigenvalue and eigenvectors.
| name='rff_map' | ||
| ) | ||
|
|
||
| self.rff_output = layers.Dense(self.n_out, use_bias=False, name='GP_mean_pred') |
There was a problem hiding this comment.
Output layer that provides the mean prediction using the fourier features created by rff_map.
| tf.Tensor: The computed NLL loss (scalar). | ||
| """ | ||
|
|
||
| sigma_star = tf.square(std) + 1e-5 # Adding a small constant for numerical stability |
There was a problem hiding this comment.
sigma_star should be sigma2_star
|
@hasan2m what is the status of the requested changes? |
| def update_cov(self, prior): | ||
| eigvals, eigvecs = tf.linalg.eigh(prior) #EigenDecomposition | ||
| eigvals = tf.where(eigvals > 0, eigvals, tf.zeros_like(eigvals)) | ||
|
|
||
| self.eigvals.assign(eigvals) | ||
| self.eigvecs.assign(eigvecs) |
There was a problem hiding this comment.
I wonder if this should be an internal version (_update_cov(self, prior) or something similar) that can be called with any prior, and implement the update_cov function without the extra input parameter like this:
def update_cov(self):
self._update_cov(self.prior)That could make it easier for a user to update the variance calculations outside of the model. Going along with this, it might also make sense to just remove the update_cov() call on line 146, and just rely on the user to properly update the prior when desired. At the moment, if do_custom_cov_update == False, the code will be slow for every non-training/validation call, which probably isn't desirable.
Kishanrajput
left a comment
There was a problem hiding this comment.
When a model that uses this layer is saved and loaded back - it does not save/load prior matrix which makes the saved models not useful. Please add prior saving and loading.
|
@hasan2m - please have a look at the comments and address/comment on them. |
|
Print statement for the RFF shape should be removed |


Unit test passed for both GPA layer and loss function.
GPA layer:
Loss function: