Repository logo
  • English
  • العربية
  • বাংলা
  • Català
  • Čeština
  • Deutsch
  • Ελληνικά
  • Español
  • Suomi
  • Français
  • Gàidhlig
  • हिंदी
  • Magyar
  • Italiano
  • Қазақ
  • Latviešu
  • Nederlands
  • Polski
  • Português
  • Português do Brasil
  • Srpski (lat)
  • Српски
  • Svenska
  • Türkçe
  • Yкраї́нська
  • Tiếng Việt
Log In
New user? Click here to register.Have you forgotten your password?
  1. Home
  2. Scholalry Output
  3. Publications
  4. Tab-Shapley: Identifying Top-k Tabular Data Quality Insights
 
  • Details

Tab-Shapley: Identifying Top-k Tabular Data Quality Insights

Source
Proceedings of the Aaai Conference on Artificial Intelligence
ISSN
21595399
Date Issued
2025-04-11
Author(s)
Padala, Manisha  
Nagalapatti, Lokesh
Tyagi, Atharv
Narayanam, Ramasuri
Saini, Shiv Kumar
DOI
10.1609/aaai.v39i12.33355
Volume
39
Issue
12
Abstract
We present an unsupervised method for aggregating anomalies in tabular datasets by identifying the top-k tabular data quality insights. Each insight consists of a set of anomalous attributes and the corresponding subsets of records that serve as evidence to the user. The process of identifying these insight blocks is challenging due to (i) the absence of labeled anomalies, (ii) the exponential size of the subset search space, and (iii) the complex dependencies among attributes, which obscure the true sources of anomalies. Simple frequency-based methods fail to capture these dependencies, leading to inaccurate results. To address this, we introduce Tab-Shapley, a cooperative game theory based framework that uses Shapley values to quantify the contribution of each attribute to the data's anomalous nature. While calculating Shapley values typically requires exponential time, we show that our game admits a closed-form solution, making the computation efficient. We validate the effectiveness of our approach through empirical analysis on real-world tabular datasets with ground-truth anomaly labels.
Publication link
https://ojs.aaai.org/index.php/AAAI/article/download/33355/35510
URI
https://d8.irins.org/handle/IITG2025/28184
IITGN Knowledge Repository Developed and Managed by Library

Built with DSpace-CRIS software - Extension maintained and optimized by 4Science

  • Privacy policy
  • End User Agreement
  • Send Feedback
Repository logo COAR Notify