fuzzystring: Fast Fuzzy String Joins for Data Frames
Perform fuzzy joins on data frames using approximate string matching.
Implements all standard join types (inner, left, right, full, semi, anti) with
support for multiple string distance metrics from the 'stringdist' package
including Levenshtein, Damerau-Levenshtein, Jaro-Winkler, and Soundex. Features
a high-performance 'data.table' backend with 'C++' row binding for efficient
processing of large datasets. Ideal for matching misspellings, inconsistent
labels, messy user input, or reconciling datasets with slight variations in
identifiers. Optionally returns the computed string distance alongside each matched record.
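The idea behind a fuzzy string join can be illustrated with a short, language-neutral sketch. The Python below is not the package's R interface: the names levenshtein, fuzzy_join, max_dist, how, and distance_col are hypothetical and chosen only for illustration. It assumes a plain Levenshtein distance and a simple nested-loop comparison, whereas the package itself relies on 'stringdist' and a 'data.table' backend. Only inner and left joins are covered.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_join(left, right, key, max_dist=2, how="inner", distance_col=None):
    """Join two lists of dicts whose `key` strings are within `max_dist` edits.

    Hypothetical helper for illustration; not the fuzzystring API.
    Columns are suffixed with .x (left) and .y (right) to avoid clashes.
    """
    if how not in ("inner", "left"):
        raise ValueError("this sketch only covers 'inner' and 'left' joins")
    out = []
    for lrow in left:
        matched = False
        for rrow in right:
            d = levenshtein(lrow[key], rrow[key])
            if d <= max_dist:
                matched = True
                row = {f"{k}.x": v for k, v in lrow.items()}
                row.update({f"{k}.y": v for k, v in rrow.items()})
                if distance_col is not None:
                    row[distance_col] = d  # optional distance column
                out.append(row)
        if how == "left" and not matched:
            # Left join keeps unmatched left rows, without right-hand columns.
            out.append({f"{k}.x": v for k, v in lrow.items()})
    return out


# Example data: one misspelling-style variant and one unrelated word.
left = [{"name": "colour"}, {"name": "zebra"}]
right = [{"name": "color", "id": 1}, {"name": "collar", "id": 2}]
rows = fuzzy_join(left, right, key="name", max_dist=1, distance_col="dist")
```

With max_dist=1, "colour" pairs with "color" (one deletion) but not with "collar" (two substitutions), and "zebra" matches nothing. The nested loop compares every pair of rows, which is fine for a sketch but is the cost a compiled, vectorised backend is meant to reduce on large inputs.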
| Version: | 0.0.1 |
| Depends: | R (≥ 4.1) |
| Imports: | data.table, Rcpp, stringdist |
| LinkingTo: | Rcpp |
| Suggests: | dplyr, ggplot2, knitr, qdapDictionaries, readr, rmarkdown, rvest, stringr, testthat (≥ 3.0.0), tidyr |
| Published: | 2026-02-08 |
| DOI: | 10.32614/CRAN.package.fuzzystring (may not be active yet) |
| Author: | Paul E. Santos Andrade [aut, cre], David Robinson [ctb] (aut of fuzzyjoin) |
| Maintainer: | Paul E. Santos Andrade <paulefrens at gmail.com> |
| BugReports: | https://github.com/PaulESantos/fuzzystring/issues |
| License: | MIT + file LICENSE |
| URL: | https://github.com/PaulESantos/fuzzystring, https://paulesantos.github.io/fuzzystring/ |
| NeedsCompilation: | yes |
| Materials: | README |
| CRAN checks: | fuzzystring results |
Linking:
Please use the canonical form https://CRAN.R-project.org/package=fuzzystring to link to this page.