Quantum K-medians

Distance calculation

pad_input(X)

Pads X with zeros if log2 of its dimension is not an integer (that is, if the dimension is not a power of two).

Parameters:

X (numpy.ndarray) – Input data

Returns:

Padded X

Return type:

numpy.ndarray
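
The padding rule can be sketched in plain NumPy. This is only an illustration: the helper name pad_to_power_of_two is hypothetical, and padding the last axis up to the next power of two is an assumption about what pad_input does, not a copy of its implementation.

```python
import numpy as np

def pad_to_power_of_two(x):
    """Zero-pad the last axis of x up to the next power of two (illustrative)."""
    x = np.asarray(x, dtype=float)
    dim = x.shape[-1]
    # Smallest power of two that is >= dim (dim = 1 maps to 1).
    target = 1 << (dim - 1).bit_length()
    if target == dim:
        return x  # already a power of two, nothing to add
    # Pad only at the end of the last axis; other axes are left untouched.
    pad_width = [(0, 0)] * (x.ndim - 1) + [(0, target - dim)]
    return np.pad(x, pad_width, mode="constant")

padded = pad_to_power_of_two(np.array([1.0, 2.0, 3.0]))
print(padded.shape)
```

A power-of-two length matters because a register of n qubits holds exactly 2**n amplitudes.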

DistCalc_DI(a, b, device_name='/GPU:0', shots_n=10000)

Distance calculation using destructive interference.

Parameters:
  • a (numpy.ndarray) – First point - shape = (latent space dimension,)

  • b (numpy.ndarray) – Second point - shape = (latent space dimension,)

  • device_name (str) – Name of the device on which the quantum circuit simulation is executed.

  • shots_n (int) – Number of shots used when executing the quantum circuit, from which measurement frequencies are obtained.

Returns:

(distance, quantum circuit)

Return type:

(float, qibo.models.Circuit)
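
The idea behind a destructive-interference distance estimate can be imitated classically. The sketch below is not the DistCalc_DI implementation and does not build a Qibo circuit: it assumes the common construction in which an ancilla qubit holds the two points in superposition, a Hadamard gate is applied to the ancilla, and the probability of measuring the ancilla in state 1 equals |a - b|^2 / (2 (|a|^2 + |b|^2)). The function name interference_distance and its rng argument are hypothetical.

```python
import numpy as np

def interference_distance(a, b, shots_n=None, rng=None):
    """Classical imitation of a destructive-interference distance estimate."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    z = float(np.dot(a, a) + np.dot(b, b))
    if z == 0.0:
        return 0.0  # both points are the zero vector
    # Assumed joint state (|0>|a> + |1>|b>) / sqrt(z), ancilla as leading qubit.
    # A real circuit would also need len(a) to be a power of two.
    state = np.concatenate([a, b]) / np.sqrt(z)
    half = a.size
    # Hadamard on the ancilla: the |1> branch carries (a - b) / sqrt(2 z).
    minus_branch = (state[:half] - state[half:]) / np.sqrt(2.0)
    p1 = float(np.clip(np.sum(minus_branch ** 2), 0.0, 1.0))
    if shots_n is not None:
        # Replace the exact probability by a frequency from shots_n samples.
        rng = np.random.default_rng() if rng is None else rng
        p1 = rng.binomial(shots_n, p1) / shots_n
    return float(np.sqrt(2.0 * z * p1))

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 0.0, 3.0, 1.0])
print(interference_distance(a, b))                  # exact probability
print(interference_distance(a, b, shots_n=10000))   # sampled estimate
```

With a finite number of shots the result fluctuates around the Euclidean distance, which is why shots_n controls the precision of the estimate.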

Algorithm

initialize_centroids(points, k)

Randomly initialize centroids of data points.

Parameters:
  • points (numpy.ndarray) – Points represented as an array of shape (N, X), where N = number of samples, X = dimension of latent space.

  • k (int) – Number of clusters.

Returns:

k centroids.

Return type:

numpy.ndarray
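
One simple way to initialize centroids randomly is to pick k distinct data points. The sketch below assumes sampling without replacement and adds a seed argument for reproducibility; neither detail is stated for initialize_centroids, and the name pick_random_centroids is hypothetical.

```python
import numpy as np

def pick_random_centroids(points, k, seed=None):
    """Return k distinct rows of points as starting centroids (illustrative)."""
    points = np.asarray(points)
    rng = np.random.default_rng(seed)
    # Assumption: sample row indices without replacement so centroids differ.
    idx = rng.choice(points.shape[0], size=k, replace=False)
    return points[idx]

points = np.arange(20.0).reshape(10, 2)   # N = 10 samples, X = 2 dimensions
print(pick_random_centroids(points, k=3, seed=0).shape)
```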

find_distance_matrix_quantum(points, centroid, device_name)

Modified version of scipy.spatial.distance.cdist() function.

Parameters:
  • points (numpy.ndarray) – Points represented as an array of shape (N, X), where N = number of samples, X = dimension of latent space.

  • centroid (numpy.ndarray) – Centroid of shape (1, X)

Returns:

Distance matrix - distance of each point to centroid

Return type:

numpy.ndarray

geometric_median(points, median, eps=1e-06, device_name='/GPU:0')

Computes the geometric median of the points. The implementation follows the reference DOI: 10.1007/s00180-011-0262-4

Parameters:
  • points (numpy.ndarray) – Points represented as an array of shape (N, X), where N = number of samples, X = dimension of latent space.

  • median (numpy.ndarray) – Initial median (centroid) of shape (1, X).

Returns:

Median

Return type:

numpy.ndarray
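
For orientation, a geometric median can be computed classically with a plain Weiszfeld iteration. The sketch below is a stand-in under stated assumptions: it uses Euclidean distances instead of quantum distance estimates, it is not necessarily the variant described in the reference, and the name weiszfeld_median and the max_iter argument are hypothetical.

```python
import numpy as np

def weiszfeld_median(points, median, eps=1e-6, max_iter=1000):
    """Plain Weiszfeld iteration for the geometric median (illustrative)."""
    points = np.asarray(points, dtype=float)
    median = np.asarray(median, dtype=float).reshape(-1)
    for _ in range(max_iter):
        dist = np.linalg.norm(points - median, axis=1)
        # Simplification: skip points that coincide with the current estimate
        # to avoid dividing by zero.
        nonzero = dist > 0
        if not np.any(nonzero):
            break  # every point coincides with the estimate
        weights = 1.0 / dist[nonzero]
        new_median = (weights[:, None] * points[nonzero]).sum(axis=0) / weights.sum()
        # Stop once the update moves the estimate by less than eps.
        converged = np.linalg.norm(new_median - median) < eps
        median = new_median
        if converged:
            break
    return median.reshape(1, -1)  # same (1, X) shape as the initial median

corners = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
print(weiszfeld_median(corners, np.array([[0.5, 0.3]])))
```

Each step is an average of the points weighted by inverse distance, so distant outliers pull the estimate less than they would pull a mean.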

find_centroids_GM(points, cluster_labels, start_centroids, clusters=2)

Finds cluster centroids.

Parameters:
  • points (numpy.ndarray) – Points represented as an array of shape (N, X), where N = number of samples, X = dimension of latent space.

  • cluster_labels (numpy.ndarray) – Cluster labels assigned to each data point - shape (N,)

  • clusters (int) – Number of clusters

Returns:

Centroids

Return type:

numpy.ndarray

find_nearest_neighbour_DI(points, centroids, device_name='/GPU:0')

Find cluster assignments for points.

Parameters:
  • points (numpy.ndarray) – Points represented as an array of shape (N, X), where N = number of samples, X = dimension of latent space.

  • centroids (numpy.ndarray) – Centroids of shape (k, X)

Returns:

  • numpy.ndarray – Cluster labels : array of shape (N,) specifying to which cluster each point is assigned.

  • numpy.ndarray – Distances: array of shape (N,) specifying distances to nearest cluster for each point.
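
The assignment step can be sketched with a pluggable distance function. This is an illustration only: the names assign_to_nearest and euclidean are hypothetical, and the Euclidean default stands in for the quantum distance estimate that find_nearest_neighbour_DI relies on.

```python
import numpy as np

def euclidean(p, c):
    """Classical stand-in for a quantum distance estimate."""
    return float(np.linalg.norm(p - c))

def assign_to_nearest(points, centroids, distance=euclidean):
    """Return (labels, distances) of the nearest centroid for each point."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Distance table of shape (N, k): one row per point, one column per centroid.
    dist = np.array([[distance(p, c) for c in centroids] for p in points])
    labels = np.argmin(dist, axis=1)
    nearest = dist[np.arange(len(points)), labels]
    return labels, nearest

points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
centroids = np.array([[0.0, 0.0], [5.0, 6.0]])
labels, nearest = assign_to_nearest(points, centroids)
print(labels, nearest)
```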

Train

train_qkmedians(latent_dim, train_size, read_file, device_name, seed=None, k=2, tolerance=0.001, save_dir=None)

Performs training of quantum k-medians.

Parameters:
  • latent_dim (int) – Latent dimension of input data.

  • train_size (int) – Number of training samples.

  • read_file (str) – Name of the file where training data is saved.

  • device_name (str) – Name of the device on which the quantum circuit simulation runs.

  • seed (int) – Seed for data shuffling.

  • k (int) – Number of classes in quantum k-medians.

  • tolerance (float) – Tolerance for algorithm convergence.

  • save_dir (str) – Name of the file for saving results.
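
How the pieces might fit together in a training loop can be sketched classically. Everything below is an assumption made for illustration: distances are Euclidean rather than quantum estimates, convergence is taken to mean that the total centroid shift falls below tolerance, and data loading, shuffling and saving are left out. The names k_medians and geometric_median_step_loop are hypothetical and do not describe the internals of train_qkmedians.

```python
import numpy as np

def geometric_median_step_loop(points, start, eps=1e-6, max_iter=200):
    """Weiszfeld iteration used as the centroid update (classical stand-in)."""
    median = start
    for _ in range(max_iter):
        dist = np.linalg.norm(points - median, axis=1)
        # Points equal to the estimate get zero weight instead of 1 / 0.
        weights = 1.0 / np.where(dist > 0, dist, np.inf)
        if weights.sum() == 0:
            return median
        new = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new - median) < eps:
            return new
        median = new
    return median

def nearest_labels(points, centroids):
    """Index of the closest centroid for each point (Euclidean stand-in)."""
    dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dist, axis=1)

def k_medians(points, k=2, tolerance=1e-3, seed=None, max_iter=100):
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        labels = nearest_labels(points, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[j] = geometric_median_step_loop(members, centroids[j])
        # Assumed convergence test: total centroid movement below tolerance.
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tolerance:
            break
    return centroids, nearest_labels(points, centroids)

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centroids, labels = k_medians(data, k=2, seed=1)
print(centroids.shape, labels.shape)
```

The loop alternates between assigning points to their nearest centroid and recomputing each centroid as the median of its members, which is the usual structure of k-medians clustering.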