Smoothing Spline Clustering
What is Smoothing Spline Clustering?
Smoothing Spline Clustering is a statistical method for clustering time-series
gene expression data.
In particular, Smoothing Spline Clustering is useful for clustering genes in
microarray experiments performed over several time-points, for example, over the course
of development, a drug treatment, or other temporally based experiments.
What can Smoothing Spline Clustering tell me?
Smoothing Spline Clustering (SSC) provides clusters of similarly expressed genes using a statistically
rigorous, biologically based, data-driven method. Importantly, SSC provides the number of gene clusters
in a given dataset without an a priori specification, the genes that belong to each cluster, a mean curve
for each cluster describing its average expression profile, and the associated
95% confidence bands.
Example of an SSC cluster from D. melanogaster developmental data, showing the mean expression curve and 95% confidence bands
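To make the idea of a cluster mean curve with 95% confidence bands concrete, here is a minimal sketch in Python. It is not the SSClust implementation: it computes a simple pointwise mean and a normal-approximation band across the genes of one cluster, whereas SSC derives its curve and bands from a fitted smoothing spline model. The function name `cluster_summary` and the toy data are hypothetical and used only for illustration.

```python
import math
from statistics import mean, stdev


def cluster_summary(profiles, z=1.96):
    """Pointwise mean and approximate 95% band for one cluster.

    profiles: list of per-gene expression lists, one value per time point;
    None marks a missing measurement. Every time point is assumed to have
    at least one observed value.
    """
    n_times = len(profiles[0])
    summary = []
    for t in range(n_times):
        # Use only the genes actually observed at this time point.
        observed = [p[t] for p in profiles if p[t] is not None]
        m = mean(observed)
        # Standard error of the mean; set to zero when only one gene is observed.
        if len(observed) > 1:
            se = stdev(observed) / math.sqrt(len(observed))
        else:
            se = 0.0
        summary.append((m - z * se, m, m + z * se))
    return summary


# Hypothetical toy cluster: three genes, four time points, one missing value.
toy_cluster = [
    [0.1, 0.9, 1.8, 1.1],
    [0.3, 1.1, None, 0.9],
    [0.2, 1.0, 2.2, 1.0],
]

for lower, centre, upper in cluster_summary(toy_cluster):
    print(f"{lower:.2f}  {centre:.2f}  {upper:.2f}")
```

Each printed row gives the lower band, the mean, and the upper band at one time point. The missing value is skipped at its time point rather than imputed, which is a simplification of how a model-based method would handle missing data.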
Why Use Smoothing Spline Clustering?
The main advantage of Smoothing Spline Clustering over other clustering algorithms is that you do
not have to specify a priori the number of clusters in your dataset or the expected functional
forms (curves) of the genes in the data. SSC achieves this by modelling the natural properties of gene expression over time, taking into
account gene-specific differences in expression within a cluster of similarly expressed genes, the effects of
experimental measurement error, and missing data. Furthermore, SSC provides a visual summary of each cluster’s
gene expression function and goodness-of-fit, as shown above.
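As a rough illustration of what a model with these ingredients can look like, the sketch below writes a generic mixed-effects smoothing spline form. It is an assumption made for exposition, not a formulation taken from this page, and every symbol is introduced here: y_ij is the expression of gene i at its j-th observed time t_ij, mu_k is the mean curve of cluster k, b_i is a gene-specific shift, and epsilon_ij is measurement error.

```latex
y_{ij} = \mu_k(t_{ij}) + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2),
\quad \varepsilon_{ij} \sim N(0, \sigma^2)

\hat{\mu}_k = \arg\min_{\mu}
  \sum_{i \in k} \sum_{j} \bigl( y_{ij} - \mu(t_{ij}) - b_i \bigr)^2
  + \lambda \int \bigl( \mu''(t) \bigr)^2 \, dt
```

In a form like this, the penalty weight lambda controls how smooth the fitted curve is, so no functional form has to be chosen in advance, and a gene with missing time points simply contributes fewer terms to the inner sum.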
- SSClust is easy to install and to use.
- SSClust is free and open-source.
- You don’t have to sign up to get it.
- It’s published in a peer-reviewed journal.
Details on the smoothing spline clustering statistical model and algorithm are provided in
Arbeitman, M., Furlong, E., Imam, F., Johnson, E., Null, B. H., Baker, B. S., Krasnow, M. A., Scott, M. P., Davis, R. W., & White, K. P. (2002) Science 297, 2270-2275.
Smoothing Spline Clustering – for time course gene expression data.