A type of partitioning algorithm used to cluster data. We construct a partitioned of a database D of n objects into a set of k clusters.

In the **K-mediods** clustering method we find representative objects (called *medoids*) in clusters. To achieve this goal, only the definition of distance from any two objects is needed.

## Partitioning Around Medoids (PAM) [1]

Originally devised by Kaufman and Rousseeuw in 1987 and built using S+, **PAM** uses real objects to represent clusters:

- Select
**k**representative objects arbitrarily - For each pair of non-selected object
**h**and selected object**i**, calculate the total swapping cost**TC**._{ih}- If
*TC*< 0,_{ih}**i**is replaced by**h**.

- If
- Then assign each non-selected object to the most similar representative object
- Repeat steps 2-3 until there is no change.

## CLARA [2]

Draws multiple samples of the data set, applies PAM on each sample and gives the best clustering as the output.

It can deal with larger data sets than PAM, but a good clustering based on samples will not necessarily represent a good clustering of the whole data set if the sample is biased.

The efficiency of CLARA depends on the sample size;

O(kS^{2} + k(n - k))

## CLARANS: Randomized sampling [3]

A Clustering Algorithm based on Randomized Search (**CLARANS**) draws a sample of neighbours dynamically. The clustering process can be represented as searching a graph, where every node is a potential solution; that is, set of *k* medoids.

If the local optimum is found, CLARANS starts with a new randomly selected node in search for a new local optimum.

More efficient and scalable than both PAM and CLARA. Focusing techniques and spatial access structures may further its performance.

## Focusing and spatial data structure [4]

Ester et al. 1995