Statistics Thesis Defense - CCRVAM, A Python Package for Model-Free Exploratory Analysis of Multivariate Discrete Data with an Ordinal Response Variable

Apr 30, 2025
Dhyey Mavani
Abstract
Understanding regression dependencies among discrete variables, especially when the response is ordinal, is a persistent challenge in exploratory data analysis (EDA). Classical EDA techniques and continuous copula models have proven effective for continuous data, but they often fail to capture the structure of categorical datasets or to provide the interpretability such data require. This thesis begins by critically evaluating these traditional approaches and highlighting their limitations in discrete settings. Motivated by these gaps, we explore the model-free dependence measures proposed by Wei and Kim (2021) and further advanced by Liao et al. (2024), which leverage the checkerboard copula framework to robustly characterize regression relationships in multidimensional contingency tables with both ordinal and nominal variables. To operationalize these measures, we present a modular and scalable Python package, ccrvam, designed to support efficient large-scale analysis. The package integrates with established scientific libraries such as NumPy, Pandas, SciPy, and Matplotlib, and uses Pytest for testing and Sphinx for documentation to support maintainability. Through extensive simulations and real-world case studies, we demonstrate that ccrvam offers a powerful and flexible toolset for uncovering complex dependence structures in categorical data. Our contributions provide both a theoretical exposition and a practical resource for researchers engaged in data-driven exploration of discrete regression phenomena.
Event
Location

SMUD207 Seelye Mudd Building

31 Quadrangle Dr, Amherst, MA 01002