Repository logo
  • English
  • العربية
  • বাংলা
  • Català
  • Čeština
  • Deutsch
  • Ελληνικά
  • Español
  • Suomi
  • Français
  • Gàidhlig
  • हिंदी
  • Magyar
  • Italiano
  • Қазақ
  • Latviešu
  • Nederlands
  • Polski
  • Português
  • Português do Brasil
  • Srpski (lat)
  • Српски
  • Svenska
  • Türkçe
  • Yкраї́нська
  • Tiếng Việt
Log In
New user? Click here to register.Have you forgotten your password?
  1. Home
  2. Scholalry Output
  3. Publications
  4. Online Coresets for Parametric and Non-Parametric Bregman Clustering
 
  • Details

Online Coresets for Parametric and Non-Parametric Bregman Clustering

Source
Transactions on Machine Learning Research
Date Issued
2022-01-01
Author(s)
Chhaya, Rachit
Choudhari, Jayesh
Dasgupta, Anirban  
Shit, Supratim
Volume
2022-July
Abstract
We present algorithms that create coresets in an online setting for clustering problems based on a wide subset of Bregman divergences. Notably, our coresets have a small additive error, similar in magnitude to the gap between expected and empirical loss Bachem et al. (2017a), and take update time O(d) for every incoming point where d is the dimension of the point. Our first algorithm gives online coresets of size Õ(poly(k, d, ϵ, µ)) for k-clusterings according to any µ-similar Bregman divergence. We further extend this algorithm to show the existence of non-parametric coresets, where the coreset size is independent of k, the number of clusters, for the same subclass of Bregman divergences. Our non-parametric coresets also function as coresets for non-parametric versions of the Bregman clustering like DP-Means. While these coresets provide additive error guarantees, they are significantly smaller for high dimensional data than the (relative-error) coresets obtained in Bachem et al. (2015) for DP-Means— for the input of size n our coresets grow as O(log n) while being independent of d as opposed to O(d<sup>d</sup>) for points in R<sup>d</sup> Bachem et al. (2015). We also present experiments to compare the performance of our algorithms with other sampling techniques.
URI
https://d8.irins.org/handle/IITG2025/28522
IITGN Knowledge Repository Developed and Managed by Library

Built with DSpace-CRIS software - Extension maintained and optimized by 4Science

  • Privacy policy
  • End User Agreement
  • Send Feedback
Repository logo COAR Notify