Simulates multinomial response data of the kind collected during a survey.
Arguments
- n
Integer specifying the number of random samples to draw.
- size
Vector of integer values, say \(N_i\), specifying the sample sizes for \(i = 1, \ldots , I\) data collection periods.
- prob
Numeric non-negative vector or matrix specifying the probabilities for the K response categories in the \(I\) data collection periods.
- categories
Vector of values specifying the K category labels to be represented. Default is
NULL.
Value
A data.frame with \(N = \sum_{i}N_{i}\) rows and n + 1
variables, which are described below.
- period
Data collection period defined by the length of size.
- responses1 ... n
Simulated multinomial responses with either the values defined by categories or Category k, for \(k = 1, \ldots , K\).
Details
The responses are generated in the backend by stats::rmultinom(). See the
documentation for stats::rmultinom() for more information on how the
multinomial responses are generated.
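As a rough illustration of what drawing from stats::rmultinom() involves, the sketch below draws category counts for a single data collection period and expands them into one response per respondent. It uses base R only. The expansion step with rep() and the variable names are assumptions made for illustration, not the package's confirmed internals.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

K <- 3                      # number of response categories
prob <- c(0.4, 0.4, 0.2)    # category probabilities for one period
size <- 20                  # sample size N_i for that period

# rmultinom() returns a K x n matrix of category counts; each column sums to size.
counts <- stats::rmultinom(n = 1, size = size, prob = prob)

# Assumed expansion step: repeat each label as many times as it was counted.
labels <- paste("Category", seq_len(K))
responses <- rep(labels, times = counts[, 1])

length(responses)  # one response per respondent, i.e. equal to size
```

Expanding counts this way yields responses grouped by category, which is consistent with the ordered output shown in the Examples section.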
The arguments size and prob are directly connected when generating
multinomial responses for multiple data collection periods. prob
may be either a vector or a matrix specifying probabilities for the K
response categories. When size is specified to generate data for multiple
data collection periods, prob must be specified as a numeric vector of
length length(size) times K or as a numeric matrix with length(size)
rows and K columns, where each row of the matrix specifies the K
probabilities for one data collection period.
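The relationship between size and a matrix-valued prob can be sketched with base R as follows. This is a minimal illustration, assuming one call to stats::rmultinom() per period; the checks and the sapply() loop are not taken from the package source.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

size <- c(20, 10)  # sample sizes for I = 2 data collection periods
K <- 3             # number of response categories

# One row per period, one column per category.
prob <- matrix(
  data = c(0.4, 0.4, 0.2,
           0.1, 0.1, 0.8),
  nrow = length(size), ncol = K, byrow = TRUE
)

# Sanity checks on the shape and values of prob.
stopifnot(
  nrow(prob) == length(size),
  ncol(prob) == K,
  all(prob >= 0),
  isTRUE(all.equal(rowSums(prob), rep(1, length(size))))
)

# Assumed per-period draw: column i holds the K category counts for period i.
counts <- sapply(
  seq_along(size),
  function(i) stats::rmultinom(n = 1, size = size[i], prob = prob[i, ])
)

colSums(counts)  # matches size: one total per period
```

Building the matrix with byrow = TRUE keeps each period's probabilities together on one row, which avoids any ambiguity about how a flat vector would be filled.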
When categories is not defined (categories = NULL), simulate will
create default labels of Category k, for \(k = 1, \ldots , K\).
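The default labels match the pattern seen in the example output. The helper below, default_labels(), is a hypothetical name used only to illustrate that pattern with base R; it is not a function exported by the package.

```r
# Hypothetical helper: fall back to "Category k" labels when none are supplied.
default_labels <- function(K, categories = NULL) {
  if (is.null(categories)) {
    paste("Category", seq_len(K))
  } else {
    categories
  }
}

default_labels(3)
# "Category 1" "Category 2" "Category 3"

default_labels(3, categories = LETTERS[1:3])
# "A" "B" "C"
```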
Examples
# Creating 5 simulated data sets with a sample size of 10, where there
# are two possible response categories, each with a 50% chance of being selected.
satpt::simulate(n = 5, size = 10, prob = c(0.5, 0.5))
#> period responses1 responses2 responses3 responses4 responses5
#> 1 1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 2 1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 3 1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 4 1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 5 1 Category 1 Category 1 Category 1 Category 1 Category 1
#> 6 1 Category 2 Category 2 Category 1 Category 2 Category 1
#> 7 1 Category 2 Category 2 Category 1 Category 2 Category 2
#> 8 1 Category 2 Category 2 Category 1 Category 2 Category 2
#> 9 1 Category 2 Category 2 Category 2 Category 2 Category 2
#> 10 1 Category 2 Category 2 Category 2 Category 2 Category 2
# Creating 1 simulated data set for two data collection periods, where there
# are three possible response categories that are labeled.
prob <- matrix(
data = c(0.4, 0.4, 0.2, 0.1, 0.1, 0.8),
nrow = 2, ncol = 3, byrow = TRUE
)
catg <- LETTERS[1:3]
satpt::simulate(n = 1, size = c(20, 10), prob = prob, categories = catg)
#> period responses1
#> 1 1 A
#> 2 1 A
#> 3 1 A
#> 4 1 A
#> 5 1 A
#> 6 1 A
#> 7 1 A
#> 8 1 A
#> 9 1 B
#> 10 1 B
#> 11 1 B
#> 12 1 B
#> 13 1 B
#> 14 1 B
#> 15 1 B
#> 16 1 B
#> 17 1 B
#> 18 1 B
#> 19 1 C
#> 20 1 C
#> 21 2 A
#> 22 2 A
#> 23 2 C
#> 24 2 C
#> 25 2 C
#> 26 2 C
#> 27 2 C
#> 28 2 C
#> 29 2 C
#> 30 2 C