pickmax: Split and Coalesce Duplicated Records
Deduplicates datasets by retaining the most complete and informative records. Identifies duplicated entries based on a specified key column, calculates a completeness score for each row, and compares values within each group of duplicates. When the differences between duplicates exceed a user-defined threshold, the records are split into separate entries with unique IDs; otherwise, they are coalesced into a single, most complete entry. Returns a list containing the original duplicates, the split entries, and the final coalesced dataset. Useful for cleaning survey or administrative data in which duplicated IDs may reflect minor data entry inconsistencies.
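The split-or-coalesce logic described above can be illustrated with a minimal base R sketch. This is not the package's API or implementation: the toy data, the helper names (`count_conflicts`, `coalesce_group`), the conflict measure (number of columns with disagreeing non-missing values), the `threshold` value, and the `_1`, `_2` ID suffixes are all assumptions made for illustration only.

```r
# Hypothetical toy data: `id` is the key column; NA marks a missing value.
records <- data.frame(
  id   = c("A1", "A1", "B2", "B2", "C3"),
  age  = c(34, 34, 51, 19, 40),
  sex  = c("F", NA, "M", "F", "M"),
  town = c(NA, "Durban", "Pretoria", "Pretoria", NA),
  stringsAsFactors = FALSE
)
value_cols <- setdiff(names(records), "id")

# Completeness score (assumed definition): non-missing values per row.
records$score <- rowSums(!is.na(records[value_cols]))

# Number of columns in which a group holds more than one distinct
# non-missing value (assumed measure of "difference").
count_conflicts <- function(group) {
  sum(vapply(group[value_cols],
             function(x) length(unique(x[!is.na(x)])) > 1,
             logical(1)))
}

# Coalesce a group: start from the most complete row and fill each
# column with the first non-missing value found in the group.
coalesce_group <- function(group) {
  group <- group[order(-group$score), , drop = FALSE]
  merged <- group[1, , drop = FALSE]
  for (col in value_cols) {
    present <- group[[col]][!is.na(group[[col]])]
    if (length(present) > 0) merged[[col]] <- present[1]
  }
  merged
}

threshold <- 1  # assumed: more conflicting columns than this triggers a split

groups    <- split(records, records$id)
is_dup    <- vapply(groups, nrow, integer(1)) > 1
conflicts <- vapply(groups, count_conflicts, integer(1))

to_split <- names(groups)[is_dup & conflicts > threshold]
to_merge <- names(groups)[is_dup & conflicts <= threshold]

# Split: keep every row, but give each one its own ID.
split_rows <- do.call(rbind, lapply(groups[to_split], function(g) {
  g$id <- paste0(g$id, "_", seq_len(nrow(g)))
  g
}))
# Coalesce: one merged row per group.
merged_rows <- do.call(rbind, lapply(groups[to_merge], coalesce_group))
unique_rows <- do.call(rbind, groups[!is_dup])

final <- rbind(unique_rows, merged_rows, split_rows)
final$score <- NULL
rownames(final) <- NULL

result <- list(
  duplicates = records[records$id %in% names(groups)[is_dup], ],
  split      = split_rows,
  coalesced  = final
)
```

In this toy example the two `A1` rows never disagree on a non-missing value, so they are coalesced into one row that combines `sex` from the first and `town` from the second. The two `B2` rows disagree on both `age` and `sex`, which exceeds the assumed threshold, so they are kept as separate records `B2_1` and `B2_2`.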
| Version: | 0.1.0 | 
| Imports: | dplyr, rlang, magrittr | 
| Published: | 2025-07-15 | 
| DOI: | 10.32614/CRAN.package.pickmax | 
| Author: | Sbonelo Chamane [aut, cre] (ORCID: 0000-0001-5350-5203), Musawenkosi Mabaso [aut], Ronel Sewpaul [aut], Sean Jooste [aut], Kutloano Skhosana [aut], Khangelani Zuma [aut] | 
| Maintainer: | Sbonelo Chamane <SChamane at hsrc.ac.za> | 
| License: | GPL-3 | 
| NeedsCompilation: | no | 
| CRAN checks: | pickmax results | 
Linking:
Please use the canonical form
https://CRAN.R-project.org/package=pickmax
to link to this page.