Reproducing Dealing with Acronyms, Abbreviations, and Typos in Real-World Entity Matching

This reproduction benchmarks the 'Smash' string distance algorithm, as proposed in the paper, against other existing string distance algorithms. To reproduce the experiment, follow along in the reproduced.ipynb notebook.
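
To give a sense of the kind of baseline such a benchmark compares against, here is a minimal sketch of a classic edit-distance (Levenshtein) similarity. It is not the Smash algorithm and is not taken from either repository; the function names `levenshtein` and `similarity` and the sample pairs are hypothetical, chosen only to illustrate why typos are easy for edit distance while acronyms are not.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    # Keep the shorter string on the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    # Hypothetical sample pairs: one typo, one acronym.
    pairs = [
        ("database", "databse"),
        ("ibm", "international business machines"),
    ]
    for left, right in pairs:
        print(f"{left!r} vs {right!r}: similarity = {similarity(left, right):.2f}")
```

The typo pair differs by a single edit, so it scores close to 1.0, whereas the acronym pair scores close to 0.0 even though both strings name the same entity. Acronyms and abbreviations are the cases the paper's title singles out.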

Reproduction repo: github.com/HasanPalito/smash_reproduced
Original repo: github.com/dx-tang/smash

Jun. 14, 2025, 7:05 AM

Launch on Chameleon

Launching this artifact will open it within Chameleon’s shared Jupyter experiment environment, which is accessible to all Chameleon users with an active allocation.

Download Archive

Download an archive containing the files of this artifact.

Download with git

Clone the git repository for this artifact and check out the version's commit:

git clone https://github.com/HasanPalito/repro.git
# cd into the directory created by git clone (named "repro" by default)
cd repro
git checkout a270ac7804cee30955e3ba0786945cdbed460a18

Feedback

Submit feedback through GitHub issues
