1 Introduction

clusteR is an R package that helps epidemiologists, data scientists, and other analysts in local, regional, or state health departments manage a cluster-sampled cohort survey similar to CDC’s CASPER (Community Assessment for Public Health Emergency Response).

1.1 What clusteR can do

In short, clusteR gives epidemiologists a framework to manage and analyze a cluster-sampled cohort survey. It handles most data management tasks, so epidemiologists don’t need to rely on a difficult-to-maintain (and nearly impossible to share) set of custom scripts.

clusteR can:

  1. Given information about your state and county/counties of interest, randomly select U.S. Census blocks for participation and display simple maps.
  2. Standardize, manage, update, and export a cohort file with key data and status information on your participants.
  3. Export PDF and CSV lists to contact participants via mail, phone, and email.
  4. Filter groups by the aggregate status of their participants, group selected clusters by proximity, and produce customizable walk lists for door-to-door interviews.
  5. Produce reports on completion in your cohort.
  6. Establish a data connection, retrieve data, and standardize it.
  7. Clean and weight standardized data in customizable ways.
  8. Produce analytic reports from weighted or unweighted responses.
  9. Export cohort data, raw or cleaned data, and analytic products.

clusteR cannot:

  1. Replace a trained epidemiologist.
  2. Obtain a random sample of participants or addresses in clusters of interest, even when clusteR selects U.S. Census blocks for you.
  3. Build, maintain, or host a survey platform.
  4. Build, maintain, or host a dashboard or other web platform.
  5. Host cohort files or survey data for collaboration.
  6. Secure cohort files or survey data.