Validating clustering for gene expression data


In particular, experiments show that, although simple in its data representation, the algorithm converges very rapidly to a local optimum, and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms.

In addition, it is well suited for use in conjunction with data-driven internal validation measures and, in particular, the figure of merit (FOM) methodology.
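To make the FOM idea concrete, the following is a minimal sketch of the usual leave-one-condition-out computation: cluster the genes on all conditions but one, then measure how tightly the held-out condition's values group within each cluster, and sum the result over all held-out conditions. Lower values suggest more predictive clusters. The function names `kmeans` and `figure_of_merit` are hypothetical, and the plain k-means routine is only a stand-in clustering step, not the genetic algorithm discussed here.

```python
import numpy as np


def kmeans(data, k, n_iter=50, seed=0):
    """Plain k-means, used here only as a stand-in clustering routine."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random (fancy indexing copies them).
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every gene to every center.
        dists = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def figure_of_merit(data, k, cluster_fn=kmeans):
    """Aggregate FOM: sum over held-out conditions of the root mean squared
    deviation of each gene from its cluster mean in the held-out condition."""
    n_genes, n_conditions = data.shape
    total = 0.0
    for e in range(n_conditions):
        reduced = np.delete(data, e, axis=1)   # cluster without condition e
        labels = cluster_fn(reduced, k)
        held_out = data[:, e]
        sq_err = 0.0
        for j in np.unique(labels):
            values = held_out[labels == j]
            sq_err += ((values - values.mean()) ** 2).sum()
        total += np.sqrt(sq_err / n_genes)
    return total
```

Passing any function with the signature `cluster_fn(data, k) -> labels` lets the same measure be applied to a different clustering algorithm, which is how the FOM is typically used to compare methods across a range of cluster counts.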

The synthetic data set contains 121 genes and their expression levels under 18 hypothetical time series.

The Budding Yeast Data Set

The microarray data were produced by Chu in 1998 in a study of sporulation in budding yeast.

The method is a new genetic algorithm for clustering gene expression data.

We randomly select the normalized expression data of 1000 genes from the data set and take all seven stages into consideration.
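The random selection of genes can be sketched as follows, assuming the normalized data are held in a NumPy array with one row per gene and one column per stage. The function name `sample_genes` and the fixed seed are illustrative assumptions; the normalization itself is not shown because the text does not specify it.

```python
import numpy as np


def sample_genes(expression, n_genes=1000, seed=0):
    """Pick n_genes distinct rows at random, keeping every column (stage)."""
    rng = np.random.default_rng(seed)
    # Sampling without replacement guarantees no gene is chosen twice.
    rows = rng.choice(expression.shape[0], size=n_genes, replace=False)
    return rows, expression[rows, :]
```

Returning the chosen row indices alongside the subset makes it possible to map cluster assignments back to the original genes.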

