In particular, experiments show that the algorithm, although simple in its data representation, converges very rapidly to a local optimum, and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms.
In addition, it is well suited for use in conjunction with data-driven internal validation measures and, in particular, the Figure of Merit (FOM) methodology.
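As a rough illustration of how a data-driven internal validation measure of the FOM kind can be computed, the sketch below uses the commonly described leave-one-condition-out form: cluster the genes with one condition hidden, measure the within-cluster root-mean-square deviation on the hidden condition, and sum over all conditions. Lower values suggest that the clustering predicts the hidden condition better. Everything here is an assumption made for illustration: the function names `kmeans` and `fom`, the toy matrix `toy`, and the use of plain k-means as a stand-in clusterer. It is not the genetic algorithm discussed in the text.

```python
import math
import random


def kmeans(rows, k, iters=50, seed=0):
    """Plain Lloyd's k-means (a stand-in clusterer); returns one label per row."""
    rng = random.Random(seed)
    # Use k distinct rows as the initial centers.
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])))
            for row in rows
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def fom(data, k, cluster=kmeans):
    """Aggregate figure of merit over all left-out conditions (columns)."""
    n, m = len(data), len(data[0])
    total = 0.0
    for e in range(m):
        # Cluster the genes with condition e hidden.
        reduced = [row[:e] + row[e + 1:] for row in data]
        labels = cluster(reduced, k)
        # Within-cluster squared deviation measured on the hidden condition.
        sq = 0.0
        for c in set(labels):
            vals = [data[i][e] for i in range(n) if labels[i] == c]
            mean = sum(vals) / len(vals)
            sq += sum((v - mean) ** 2 for v in vals)
        total += math.sqrt(sq / n)
    return total


# Hypothetical toy matrix: 6 genes x 3 conditions, two obvious groups.
toy = [
    [0.0, 0.2, 0.4], [0.1, 0.3, 0.5], [0.2, 0.1, 0.3],
    [5.0, 5.2, 5.4], [5.1, 5.3, 5.5], [5.2, 5.1, 5.3],
]
for k in (1, 2, 3):
    print(k, round(fom(toy, k), 3))
```

Because `fom` accepts any function with the signature `cluster(rows, k)`, a different clustering method can be evaluated the same way. In practice the FOM curve is inspected across a range of `k` values rather than read at a single point.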
The synthetic data contains 121 genes and their expression levels under 18 hypothetical time series.

The Budding Yeast Data Set

The microarray data was produced by Chu in 1998 in a study of sporulation in budding yeast.
We randomly pick the normalized expression data of 1000 genes from the data set and take all seven stages into consideration.

The method is a new genetic algorithm for clustering gene expression data.

This leads, in turn, to two well-established and rich research areas. One deals with the design of new clustering algorithms, and the other with the design of new validation techniques that assess the biological relevance of the clustering solutions found.

Proposal, project investigation, dataset search: all team members.
Hierarchical clustering: Siyu Bian. Obtained results for the two datasets, used the R Tree Cut package to determine the number of clusters in the different results, visualized the results, and processed them for evaluation.

"Algorithm AS 136: A k-means clustering algorithm." Journal of the Royal Statistical Society. Series C (Applied Statistics) 28.1 (1979): 100-108.
"Refining Initial Points for K-Means Clustering." ICML.
"Computational analysis of microarray data." Nature Reviews Genetics 2.6 (2001): 418-427.
"The transcriptional program of sporulation in budding yeast." Science 282.5389 (1998): 699-705.
"The analysis of a simple k-means clustering algorithm." Proceedings of the Sixteenth Annual Symposium on Computational Geometry.