
A Non-Parametric Bootstrap for Spectral Clustering (2209.05812v2)

Published 13 Sep 2022 in stat.ML and cs.LG

Abstract: Finite mixture modelling is a popular method in the field of clustering, largely because it yields soft cluster membership probabilities. A common approach to fitting finite mixture models employs spectral clustering, which can utilize the expectation-maximization (EM) algorithm. However, the EM algorithm suffers from a number of issues, including convergence to sub-optimal solutions. We address this issue by developing two novel algorithms that incorporate the spectral decomposition of the data matrix and a non-parametric bootstrap sampling scheme. Simulations demonstrate the validity of our algorithms as well as their flexibility, their computational efficiency, and their ability to avoid poor solutions when compared to other clustering algorithms for estimating finite mixture models. Our techniques also converge more consistently than other bootstrapped algorithms that fit finite mixture models.
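The abstract combines two ingredients: a spectral embedding of the data matrix and non-parametric bootstrap resampling of the observations. A minimal sketch of how those pieces fit together is below. This is not the authors' algorithm; the Gaussian-affinity embedding, the simple k-means step, the toy two-cluster data, and all parameter choices (`sigma`, `B`, the number of clusters) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a two-component Gaussian mixture (stand-in for real data).
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
               rng.normal(3.0, 0.3, (30, 2))])

def spectral_embedding(X, k=2, sigma=1.0):
    """Embed points via top eigenvectors of a normalized affinity matrix."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))        # Gaussian affinity
    D = W.sum(axis=1)
    L = W / np.sqrt(np.outer(D, D))             # symmetric normalization
    _, vecs = np.linalg.eigh(L)                 # eigenvalues ascending
    U = vecs[:, -k:]                            # top-k eigenvectors
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(E, k=2, iters=50):
    """Plain Lloyd's k-means with deterministic spread-out initialization."""
    centers = E[np.linspace(0, len(E) - 1, k).astype(int)].copy()
    labels = np.zeros(len(E), dtype=int)
    for _ in range(iters):
        labels = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = E[labels == j].mean(axis=0)
    return labels

# Non-parametric bootstrap: resample rows with replacement, re-embed and
# re-cluster each replicate; the spread of the resulting partitions gives a
# sense of clustering stability.
B = 10
bootstrap_labels = []
for _ in range(B):
    idx = rng.choice(len(X), size=len(X), replace=True)
    bootstrap_labels.append(kmeans(spectral_embedding(X[idx]), k=2))
```

On the full (unresampled) toy data this embedding separates the two components cleanly, so the k-means step recovers the mixture structure; the bootstrap loop simply repeats that pipeline on resampled rows.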

