Package: piecemaker
Title: Tools for Preparing Text for Tokenizers
Version: 1.0.2
Authors@R: c(
    person("Jon", "Harmon", , "jonthegeek@gmail.com", role = c("aut", "cre"),
           comment = c(ORCID = "0000-0003-4781-4346")),
    person("Jonathan", "Bratt", , "jonathan.bratt@macmillan.com", role = "aut",
           comment = c(ORCID = "0000-0003-2859-0076")),
    person("Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning", role = "cph")
  )
Description: Tokenizers break text into pieces that are more usable by
    machine learning models. Many tokenizers share some preparation steps.
    This package provides those shared steps, along with a simple
    tokenizer.
License: Apache License (>= 2)
URL: https://github.com/macmillancontentscience/piecemaker,
        https://macmillancontentscience.github.io/piecemaker/
BugReports: https://github.com/macmillancontentscience/piecemaker/issues
Depends: R (>= 2.10)
Imports: cli, glue, rlang (>= 0.4.2), stringi, stringr
Suggests: covr, testthat (>= 3.0.0)
Config/testthat/edition: 3
Encoding: UTF-8
RoxygenNote: 7.2.3
NeedsCompilation: no
Packaged: 2023-06-02 18:40:35 UTC; jonth
Author: Jon Harmon [aut, cre] (<https://orcid.org/0000-0003-4781-4346>),
  Jonathan Bratt [aut] (<https://orcid.org/0000-0003-2859-0076>),
  Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jon Harmon <jonthegeek@gmail.com>
Repository: CRAN
Date/Publication: 2023-06-02 19:50:03 UTC
Built: R 4.3.3; ; 2025-04-07 01:58:00 UTC; windows
