# --------------------------------------------
# CITATION file created with {cffr} R package
# See also: https://docs.ropensci.org/cffr/
# --------------------------------------------

cff-version: 1.2.0
message: 'To cite package "psHarmonize" in publications use:'
type: software
license: MIT
title: 'psHarmonize: Creates a Harmonized Dataset Based on a Set of Instructions'
version: 0.3.5
doi: 10.1016/j.patter.2024.101003
identifiers:
- type: doi
  value: 10.32614/CRAN.package.psHarmonize
abstract: Functions which facilitate harmonization of data from multiple different
  datasets. Data harmonization involves taking data sources with differing values,
  creating coding instructions to create a harmonized set of values, then making
  those data modifications. 'psHarmonize' will assist with data modification once
  the harmonization instructions are written. Coding instructions are written by
  the user to create a "harmonization sheet". This sheet catalogs variable names,
  domains (e.g. clinical, behavioral, outcomes), provides R code instructions for
  mapping or conversion of data, specifies the variable name in the harmonized data
  set, and tracks notes. The package will then harmonize the source datasets according
  to the harmonization sheet to create a harmonized dataset. Once harmonization is
  finished, the package also has functions that will create descriptive statistics
  using 'RMarkdown'. Data Harmonization guidelines have been described by Fortier
  I, Raina P, Van den Heuvel ER, et al. (2017). Additional details of our R package
  have been described by Stephen JJ, Carolan P, Krefman AE, et al. (2024).
authors:
- family-names: Stephen
  given-names: John
  email: John.Stephen@northwestern.edu
  orcid: https://orcid.org/0000-0001-7309-9193
preferred-citation:
  type: article
  title: 'psHarmonize: Facilitating reproducible large-scale pre-statistical data
    harmonization and documentation in R'
  authors:
  - family-names: Stephen
    given-names: John J.
  - family-names: Carolan
    given-names: Padraig
  - family-names: Krefman
    given-names: Amy E.
  - family-names: Sedaghat
    given-names: Sanaz
  - family-names: Mansolf
    given-names: Maxwell
  - family-names: Allen
    given-names: Norrina B.
  - family-names: Scholtens
    given-names: Denise M.
  volume: '5'
  copyright: All rights reserved
  issn: 2666-3899
  doi: 10.1016/j.patter.2024.101003
  abstract: Combining pertinent data from multiple studies can increase the robustness
    of epidemiological investigations. Effective 'pre-statistical' data harmonization
    is paramount to the streamlined conduct of collective, multi-study analysis.
    Harmonizing data and documenting decisions about the transformations of variables
    to a common set of categorical values and measurement scales are time consuming
    and can be error prone, particularly for numerous studies with large quantities
    of variables. The psHarmonize R package facilitates harmonization by combining
    multiple datasets, applying data transformation functions, and creating long
    and wide harmonized datasets. The user provides transformation instructions in
    a 'harmonization sheet' that includes dataset names, variable names, and coding
    instructions and centrally tracks all decisions. The package performs harmonization,
    generates error logs as necessary, and creates summary reports of harmonized
    data. psHarmonize is poised to serve as a central feature of data preparation
    for the joint analysis of multiple studies.
  issue: '8'
  journal: Patterns (New York, N.Y.)
  month: '8'
  year: '2024'
  pmcid: PMC11368672
  keywords:
  - data management
  - data harmonization
  - data integration
  - data pooling
  - R package
  start: '101003'
repository: https://nudacc.r-universe.dev
repository-code: https://github.com/NUDACC/psHarmonize
commit: e4ec9192014d4f4a6b961ed2bc60f6b3491777d2
url: https://github.com/NUDACC/psHarmonize
contact:
- family-names: Stephen
  given-names: John
  email: John.Stephen@northwestern.edu
  orcid: https://orcid.org/0000-0001-7309-9193