GPA layer and NLL loss function added #72

Open
hasan2m wants to merge 4 commits into main from 71-add-gaussian-process-approximation-gpa-layer

Conversation

@hasan2m hasan2m commented Mar 28, 2025

Unit tests passed for both the GPA layer and the loss function.

GPA layer:

image

Loss function:

image

@hasan2m hasan2m linked an issue Mar 28, 2025 that may be closed by this pull request
4 tasks
@hasan2m hasan2m added the enhancement New feature or request label Mar 28, 2025
@hasan2m hasan2m self-assigned this Mar 28, 2025
Contributor

@sgoldenCS sgoldenCS left a comment

I generally think things look fine, although I haven't run the unit tests myself (maybe someone else could go through that process?). I'm glad you're using pytest and pytest fixtures, though! It looks good!

initializer=tf.constant_initializer(self.initial_noise_scale),
trainable=self.train_noise_scale,
dtype=tf.float32,
constraint=ClipByValue(1e-6, 1e6),
Contributor

I like the use of a function for the constraint, but would it be possible to continue using the noise_bounds parameter here? Otherwise it should be removed from the class definition as I don't believe it's used elsewhere.

Author

Removed noise_bounds from the class definition.
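
For reference, the alternative raised above (keeping the bounds configurable instead of hard-coding them) could look roughly like the sketch below. It is a plain-NumPy stand-in, not the code in this PR: the `ClipByValue` here is a hypothetical callable, and in a Keras layer the same pattern would normally subclass `tf.keras.constraints.Constraint`. The default bounds simply mirror the values shown in the diff.

```python
import numpy as np


class ClipByValue:
    """Callable constraint that clamps a weight to [min_value, max_value]."""

    def __init__(self, min_value, max_value):
        if min_value > max_value:
            raise ValueError("min_value must not exceed max_value")
        self.min_value = min_value
        self.max_value = max_value

    def __call__(self, w):
        return np.clip(w, self.min_value, self.max_value)


# Hypothetical constructor argument, mirroring the noise_bounds parameter
# discussed above; the default matches the hard-coded values in the diff.
noise_bounds = (1e-6, 1e6)
constraint = ClipByValue(*noise_bounds)

# Values below/above the bounds are clamped; values inside pass through.
clipped = constraint(np.array([0.0, 0.5, 1e9]))
```

Passing the bounds through one parameter keeps the class signature and the constraint in sync, which is the concern the review comment raises.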

Comment on lines +112 to +118
# self.noise_scale = tf.Variable(
# self.initial_noise_scale,
# dtype=tf.float32,
# trainable=self.train_noise_scale,
# constraint=lambda z: tf.clip_by_value(z, self.noise_bounds[0], self.noise_bounds[1]),
# name='noise_scale'
# )
Contributor

This looks to be essentially the same as the new version (other than the noise_bounds thing), so feel free to remove the commented section to clean things up.

Comment on lines +132 to +174
# def call(self, inputs, training=None, return_features=False):
# if training is None:
# training = tf.keras.backend.learning_phase()

# batch_size = tf.cast(tf.shape(inputs)[0], tf.float32)
# x = tf.convert_to_tensor(inputs, dtype=self.dtype)
# x = tf.cast(x, tf.float32)

# x = self.length_scale * x
# x = self.rff_map(x)
# x1 = tf.math.cos(x)
# x2 = tf.math.sin(x)

# ffs = layers.concatenate([x1, x2])

# if self.scale_features:
# ffs = tf.math.sqrt(2.0 / self.n_fourier_features) * ffs

# ffs = tf.math.sqrt(self.constant_scale) * ffs
# output = self.rff_output(ffs)

# if training:
# if self.momentum > 0:
# update_prior_op = (
# self.momentum * self.prior + (1 - self.momentum) * (tf.transpose(ffs) @ ffs / batch_size)
# )
# else:
# update_prior_op = self.prior + tf.transpose(ffs) @ ffs
# self.prior.assign(update_prior_op) # Direct assignment

# variances = self.calc_variance(ffs)
# else:
# if not self.do_custom_cov_update:
# self.update_cov(self.prior)

# variances = self.calc_variance(ffs)

# stddevs = tf.math.sqrt(variances)
# out = [output, stddevs[:, None]]
# if return_features:
# out.append(ffs)

# return out
Contributor

I don't see any changes here other than the default for the training flag, so I think this commented section could be removed too.

Contributor

clean up code.

Contributor

GaussianProcessLayer: we should rename this, since it is an RFF approximation and not a GP.

Author

Name changed to GaussianProcessApproximationLayer.

Contributor

remove this from utils.

Contributor

Please change the file name to gpa_layer.py.

tf.Tensor: The computed NLL loss (scalar).
"""

sigma_star = tf.square(std) + noise_scale + 1e-5 # Adding a small constant for numerical stability
Contributor

is the noise scale additive?

Author

Removed noise_scale from here because we already account for the noise scale in the variance calculation within the gpa_layer class.
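
A note on the additivity question, under the assumption (not stated in this thread) that noise_scale is a variance and that the observation noise is independent of the latent function. In that case the two variances add, so the noise term only needs to enter once, either in the layer or in the loss, but not in both:

```latex
\sigma_*^2(x) = \sigma_f^2(x) + \sigma_n^2
```

Here $\sigma_f^2(x)$ stands for the predictive variance returned by the layer and $\sigma_n^2$ for the noise term; both symbols are introduced only for illustration.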

Comment thread utests/utest_gp_layer.py Outdated
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, Input
from jlab_datascience_toolkit.utils.keras_layers.GP_layer import GaussianProcessLayer
Contributor

Update this per the change requested above.
Please don't call it a GaussianProcessLayer.

Comment thread utests/utest_gp_layer.py Outdated


def test_forward_pass(random_input):
    layer = GaussianProcessLayer()
Contributor

Please don't call it GaussianProcessLayer

Author

hasan2m commented Apr 3, 2025

Addressed the comments and reran the unit test.

image

image

@Kishanrajput

@hasan2m - could you please move the layer and loss out of utils, following the directory structure below?
Toolkit

  • keras (dir)
    • layers
      • layer_v0.py (implementation)
    • losses (dir)
      • loss_v0.py (implementation)

if training:
update_prior_op = (
self.momentum * self.prior + (1 - self.momentum) * (tf.transpose(ffs) @ ffs / batch_size)
Contributor

Check if the size is correct: (batch, batch)
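
A quick way to settle the size question is to check the shapes directly. The NumPy sketch below uses illustrative sizes and assumes ffs has shape (batch, features), as in the call() code quoted earlier; it is not taken from the PR. With that layout, transpose(ffs) @ ffs contracts over the batch axis and is (features, features), while ffs @ transpose(ffs) would be (batch, batch).

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_features = 8, 6          # illustrative sizes only
ffs = rng.normal(size=(batch_size, n_features))

# ffs^T @ ffs contracts over the batch axis -> (n_features, n_features).
feature_gram = ffs.T @ ffs
# ffs @ ffs^T contracts over the feature axis -> (batch_size, batch_size).
sample_gram = ffs @ ffs.T

# Momentum update in the same form as the line under review.
momentum = 0.9
prior = np.eye(n_features)
prior = momentum * prior + (1.0 - momentum) * feature_gram / batch_size
```

Because the prior keeps the (features, features) shape, the update is independent of the batch size, which matters when the last batch of an epoch is smaller.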

self.eigvals.assign(eigvals)
self.eigvecs.assign(eigvecs)

def calc_variance(self, ffs):
Contributor

I need to understand this better. This should be K_xx^{-1}.

Author

The inverse of K_xx is computed from its eigenvalues and eigenvectors.
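
To make the eigenvalue route concrete, here is a minimal NumPy sketch of inverting a symmetric matrix through its eigendecomposition and using it for a per-sample variance. The body of calc_variance is not shown in this thread, so this is an assumption about the general technique rather than the PR's implementation. The matrix below is a feature-space stand-in for the prior; whether that is the K_xx meant in the comment above is not settled here.

```python
import numpy as np

rng = np.random.default_rng(1)
ffs = rng.normal(size=(32, 5))
# Symmetric positive-definite stand-in for the accumulated prior matrix.
prior = ffs.T @ ffs / ffs.shape[0] + 1e-3 * np.eye(5)

eigvals, eigvecs = np.linalg.eigh(prior)
eigvals = np.where(eigvals > 0, eigvals, 0.0)   # same clamp as update_cov

# Invert only strictly positive eigenvalues (pseudo-inverse otherwise).
inv_eigvals = np.zeros_like(eigvals)
positive = eigvals > 1e-12
inv_eigvals[positive] = 1.0 / eigvals[positive]

# prior^{-1} = V diag(1 / lambda) V^T
prior_inv = (eigvecs * inv_eigvals) @ eigvecs.T

# Per-sample quadratic form phi^T prior^{-1} phi, computed in the eigenbasis.
projected = ffs @ eigvecs
variances = np.sum(projected**2 * inv_eigvals, axis=1)
```

Working in the eigenbasis avoids forming the inverse explicitly for the variance, and the clamp on the eigenvalues guards against small negative values from round-off.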

name='rff_map'
)

self.rff_output = layers.Dense(self.n_out, use_bias=False, name='GP_mean_pred')
Contributor

What is this here?

Author

This is the output layer that produces the mean prediction from the Fourier features created by rff_map.
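
As a rough illustration of how the two pieces fit together, the NumPy sketch below builds random Fourier features and applies a bias-free linear map to them. The names rff_features and frequencies, the unit length scale, and the sqrt(1 / D) scaling are assumptions made for this example; the layer in the PR may scale its features differently.

```python
import numpy as np


def rff_features(x, frequencies):
    """Map x of shape (n, d) to (n, 2 * D) random Fourier features."""
    projection = x @ frequencies                      # (n, D)
    features = np.concatenate([np.cos(projection), np.sin(projection)], axis=1)
    return features / np.sqrt(frequencies.shape[1])   # scale by sqrt(1 / D)


rng = np.random.default_rng(42)
d, D = 3, 4000
frequencies = rng.normal(size=(d, D))   # standard normal -> unit length scale RBF

x = rng.normal(size=(5, d))
phi = rff_features(x, frequencies)

# Inner products of the features approximate the RBF kernel.
approx_kernel = phi @ phi.T
sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
exact_kernel = np.exp(-0.5 * sq_dists)

# The mean prediction is a bias-free linear map of the features,
# which is the role of the Dense(n_out, use_bias=False) layer.
weights = rng.normal(size=(2 * D, 1))
mean_pred = phi @ weights
```

In this view, rff_map supplies the fixed random projection and rff_output holds the only trained weights for the mean.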

tf.Tensor: The computed NLL loss (scalar).
"""

sigma_star = tf.square(std) + 1e-5 # Adding a small constant for numerical stability
Contributor

sigma_star should be sigma2_star

Author

corrected.
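
For readers following the naming discussion, a Gaussian NLL written with the sigma2_star name could look like the NumPy sketch below. The full loss in the PR is not shown in this thread, so the exact form (mean reduction, inclusion of the 2*pi constant) is an assumption; only the variance line follows the diff.

```python
import numpy as np


def gaussian_nll(y_true, mean, std, eps=1e-5):
    """Mean Gaussian negative log-likelihood over a batch."""
    sigma2_star = np.square(std) + eps   # a variance, hence the "2" in the name
    per_sample = 0.5 * (
        np.log(2.0 * np.pi * sigma2_star)
        + np.square(y_true - mean) / sigma2_star
    )
    return float(np.mean(per_sample))


y = np.array([0.0, 1.0, 2.0])
loss_good = gaussian_nll(y, mean=y, std=np.ones(3))        # perfect mean
loss_bad = gaussian_nll(y, mean=y + 3.0, std=np.ones(3))   # biased mean
```

Naming the quantity sigma2_star makes it clear that the log and the division both operate on a variance, not a standard deviation.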

Contributor

schr476 commented May 7, 2025

@hasan2m what is the status of the requested changes?

Comment on lines +157 to +162
def update_cov(self, prior):
    eigvals, eigvecs = tf.linalg.eigh(prior) #EigenDecomposition
    eigvals = tf.where(eigvals > 0, eigvals, tf.zeros_like(eigvals))

    self.eigvals.assign(eigvals)
    self.eigvecs.assign(eigvecs)
Contributor

I wonder if this should be an internal version (_update_cov(self, prior) or something similar) that can be called with any prior, and implement the update_cov function without the extra input parameter like this:

def update_cov(self):
    self._update_cov(self.prior)

That could make it easier for a user to update the variance calculations outside of the model. Going along with this, it might also make sense to just remove the update_cov() call on line 146, and just rely on the user to properly update the prior when desired. At the moment, if do_custom_cov_update == False, the code will be slow for every non-training/validation call, which probably isn't desirable.
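
A self-contained toy version of that split, written in NumPy so it runs without the layer, might look like this. CovarianceState is a hypothetical stand-in for the layer's state, and the attribute names follow the snippet above; it is only a sketch of the suggested interface.

```python
import numpy as np


class CovarianceState:
    """Toy stand-in for the layer's prior / eigendecomposition state."""

    def __init__(self, n_features):
        self.prior = np.eye(n_features)
        self.eigvals = np.ones(n_features)
        self.eigvecs = np.eye(n_features)

    def _update_cov(self, prior):
        # Internal helper: works on any symmetric matrix passed in.
        eigvals, eigvecs = np.linalg.eigh(prior)
        self.eigvals = np.where(eigvals > 0, eigvals, 0.0)
        self.eigvecs = eigvecs

    def update_cov(self):
        # Public entry point: refresh from the stored prior on demand.
        self._update_cov(self.prior)


state = CovarianceState(n_features=3)
state.prior = np.diag([4.0, 2.0, 1.0])
stale = state.eigvals.copy()     # still the initial values
state.update_cov()               # explicit refresh, e.g. once after training
```

With this interface the eigendecomposition runs only when update_cov() is called, so inference calls do not pay for it repeatedly.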

@Kishanrajput Kishanrajput left a comment

When a model that uses this layer is saved and loaded back, the prior matrix is not saved or restored, which makes the saved models not useful. Please add saving and loading of the prior.

@Kishanrajput

@hasan2m - please have a look at the comments and address/comment on them.

@Kishanrajput

The print statement for the RFF shape should be removed.

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Gaussian Process Approximation (GPA) layer

4 participants