Simulates multinomial response data of the kind collected during a survey.

Usage

simulate(n, size, prob, categories = NULL)

Arguments

n

Integer specifying the number of random samples to draw.

size

Vector of integer values, say \(N_i\), specifying the sample sizes for \(i = 1, \ldots , I\) data collection periods.

prob

Numeric non-negative vector or matrix specifying the probabilities for the K response categories in the \(I\) data collection periods.

categories

Vector of values specifying the K category labels to be represented. Default is NULL.

Value

A data.frame with \(N = \sum_{i}N_{i}\) rows and n + 1 variables, which are described below.

period

Index of the data collection period, running from 1 to length(size).

responses1 ... n

Simulated multinomial responses with either the values defined by categories or Category k, for \(k = 1, \ldots , K\)

Details

simulate() uses stats::rmultinom() in the backend. See the documentation for stats::rmultinom() for more information on how the multinomial responses are generated.
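For orientation, the following base R sketch shows what stats::rmultinom() itself returns when called with the same n, size, and prob as the first example in the Examples section. It does not show how simulate() turns these counts into rows of responses, and the seed is an arbitrary choice made only for reproducibility.

```r
# Sketch only: a direct call to stats::rmultinom(), not the satpt code.
set.seed(123)  # arbitrary seed, assumed here for reproducibility

# 5 draws, each distributing 10 responses over 2 equally likely categories.
draws <- stats::rmultinom(n = 5, size = 10, prob = c(0.5, 0.5))

dim(draws)      # K rows (categories) by n columns (draws)
colSums(draws)  # every column sums to size
```

Each column holds the category counts for one draw, so the counts in a column always add up to size.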

The arguments size and prob are directly connected when generating multinomial responses for multiple data collection periods. prob may be either a vector or a matrix specifying the probabilities for the K response categories. When size is specified to generate data for multiple data collection periods, prob must be a numeric vector of length length(size) times K or a numeric matrix with length(size) rows and K columns, where each row of the matrix specifies the K probabilities for one data collection period.
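The matrix form can be checked with a short base R sketch. It builds a prob matrix for two periods and three categories, verifies that it has one row per period, and draws one set of category counts per period with stats::rmultinom(). The variable names, the seed, and the per-period loop are assumptions made for illustration; this is not the package's implementation.

```r
# Sketch only: base R, not the satpt implementation.
size <- c(20, 10)  # N_i for I = 2 data collection periods
prob <- matrix(
  data = c(0.4, 0.4, 0.2, 0.1, 0.1, 0.8),
  nrow = length(size), ncol = 3, byrow = TRUE
)

# One row per period, one column per category; each row sums to 1 here.
stopifnot(nrow(prob) == length(size))
stopifnot(isTRUE(all.equal(rowSums(prob), rep(1, nrow(prob)))))

set.seed(1)  # arbitrary seed, assumed here for reproducibility

# Draw one vector of K category counts for each period.
counts <- sapply(seq_along(size), function(i) {
  as.vector(stats::rmultinom(n = 1, size = size[i], prob = prob[i, ]))
})

dim(counts)      # K rows (categories) by I columns (periods)
colSums(counts)  # column i sums to size[i]
```

Row i of prob is paired with size[i], which is why the number of rows must equal length(size).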

When categories is not defined (categories = NULL), simulate() will create default labels of Category k, for \(k = 1, \ldots , K\).
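As an illustration of the default labels, the base R sketch below builds labels of the form Category k and expands a vector of category counts into individual responses. The names default_labels and counts are hypothetical, the counts are made up, and the package may construct its labels and rows differently.

```r
# Sketch only: one possible way to build default labels in base R.
K <- 3
default_labels <- paste("Category", seq_len(K))
default_labels

# Hypothetical counts for the K categories (not simulated output).
counts <- c(2, 0, 1)

# Expand the counts into one label per response.
responses <- rep(default_labels, times = counts)
responses
```

Supplying categories, such as LETTERS[1:3], would replace default_labels in this sketch.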

Examples

# Creating 5 simulated data sets with a sample size of 10, where there
# are two possible response categories, each with a 50% chance of being selected.
satpt::simulate(n = 5, size = 10, prob = c(0.5, 0.5))
#>    period responses1 responses2 responses3 responses4 responses5
#> 1       1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 2       1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 3       1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 4       1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 5       1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 6       1 Category 2 Category 2 Category 1 Category 2 Category 1
#> 7       1 Category 2 Category 2 Category 1 Category 2 Category 2
#> 8       1 Category 2 Category 2 Category 1 Category 2 Category 2
#> 9       1 Category 2 Category 2 Category 2 Category 2 Category 2
#> 10      1 Category 2 Category 2 Category 2 Category 2 Category 2

# Creating 1 simulated data set for two data collection periods, where there
# are three possible response categories that are labeled.
prob <- matrix(
  data = c(0.4, 0.4, 0.2, 0.1, 0.1, 0.8),
  nrow = 2, ncol = 3, byrow = TRUE
)
catg <- LETTERS[1:3]
satpt::simulate(n = 1, size = c(20, 10), prob = prob, categories = catg)
#>    period responses1
#> 1       1          A
#> 2       1          A
#> 3       1          A
#> 4       1          A
#> 5       1          A
#> 6       1          A
#> 7       1          A
#> 8       1          A
#> 9       1          B
#> 10      1          B
#> 11      1          B
#> 12      1          B
#> 13      1          B
#> 14      1          B
#> 15      1          B
#> 16      1          B
#> 17      1          B
#> 18      1          B
#> 19      1          C
#> 20      1          C
#> 21      2          A
#> 22      2          A
#> 23      2          C
#> 24      2          C
#> 25      2          C
#> 26      2          C
#> 27      2          C
#> 28      2          C
#> 29      2          C
#> 30      2          C