Question about squared_distance function #18

Hi! 
First of all, thanks for providing this nice work! 

While looking into the code, I found the _squared_distance_ function a little confusing. If Y is not provided (so Y = X), this function computes X - X and then takes the sum. So isn't the return value zero?

https://github.com/KlugerLab/SpectralNet/blob/43b0fca784491f234489b860fc35832697ad20c2/src/core/costs.py#L11-L33

Another question, about the number of clusters K: can I use a relatively large number when my dataset contains about 1 million samples? For example, over 1000?

Thanks!
Fan


	def squared_distance(X, Y=None, W=None):
	    '''
	    Calculates the pairwise distance between points in X and Y

	    X: n x d matrix
	    Y: m x d matrix
	    W: affinity -- if provided, we normalize the distance

	    returns: n x m matrix of all pairwise squared Euclidean distances
	    '''
	    if Y is None:
	        Y = X
	    # distance = squaredDistance(X, Y)
	    sum_dimensions = list(range(2, K.ndim(X) + 1))
	    X = K.expand_dims(X, axis=1)
	    if W is not None:
	        # if W provided, we normalize X and Y by W
	        D_diag = K.expand_dims(K.sqrt(K.sum(W, axis=1)), axis=1)
	        X /= D_diag
	        Y /= D_diag
	    squared_difference = K.square(X - Y)
	    distance = K.sum(squared_difference, axis=sum_dimensions)
	    return distance
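
To make the X - X step concrete, here is a minimal NumPy sketch of what I understand the unweighted path (W = None) to be doing. It assumes the Keras backend broadcasts the same way NumPy does; `squared_distance_np` is a hypothetical name of mine, not something in the repository. Because X is expanded to n x 1 x d before the subtraction while Y stays n x d, the difference may broadcast to n x n x d rather than cancel elementwise, which would make only the diagonal zero.

```python
import numpy as np

def squared_distance_np(X, Y=None):
    """NumPy analogue (hypothetical helper) of squared_distance with W=None."""
    if Y is None:
        Y = X
    # For a 2-D input this is (2,): sum over the feature axis after broadcasting.
    sum_dimensions = tuple(range(2, X.ndim + 1))
    X = np.expand_dims(X, axis=1)           # shape: n x 1 x d
    squared_difference = np.square(X - Y)   # broadcasts to n x m x d
    return squared_difference.sum(axis=sum_dimensions)  # shape: n x m

X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = squared_distance_np(X)
print(D.shape)  # (3, 3)
print(D)        # zeros on the diagonal, squared distances elsewhere
```

If this reading is right, Y is still the original n x d array when X has already been reshaped, so `X - Y` is a pairwise difference and not a literal X - X.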
