Simulate most active brain areas for one experiment

Simulates the most active brain areas for one experiment. The simulation can be completely random, or it can be weighted by baseline expression levels from the Allen Brain Atlas and by increases due to the experimental manipulation. Returns a dataframe with the group ("group") and brain area ("my_grouping") of the brain areas simulated to be most active. The variable "batch" indicates the replicate; the number of independent batches equals samples_per_group as specified by the user.

sim_most_active(
  weights_df,
  samples_per_group = 1,
  n_exp = 1,
  weight_by_expression = TRUE,
  weight_by_group = TRUE,
  high_prob = 0.95,
  summary = FALSE
)

Arguments

weights_df	dataframe resulting from prepare_sim_weights(), in long format, with one row per brain area ("my_grouping") per group ("group"). It contains the Allen Brain Atlas expression levels ("mean_expression" and "sd_expression") as well as the group-dependent weight ("weight").
samples_per_group	number of samples per group. Defaults to 1.
n_exp	number of experiments to simulate. Defaults to 1.
weight_by_expression	can take values TRUE or FALSE. Defaults to TRUE. If FALSE, brain areas are sampled at random from a uniform distribution, and weight_by_group is ignored. In this case, weights_df requires only the variables "group" and "my_grouping".
weight_by_group	can take values TRUE or FALSE. Defaults to TRUE. Ignored when weight_by_expression is FALSE.
high_prob	number between 0 and 1 indicating the threshold for being a highly active region. Defaults to 0.95, which corresponds to the top 5 per cent.
summary	can be either TRUE or FALSE. Defaults to FALSE. If TRUE, returns a list whose first element is the data and whose second element is a summary of how many samples per group per experiment select a given brain area as most active. If FALSE, returns only the data.
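
To make the meaning of high_prob concrete, the following is a minimal base R sketch of a quantile threshold. It is an illustration under stated assumptions, not the internal code of sim_most_active(): the dataframe scores and its column activity are hypothetical names, and the toy values are random numbers rather than Allen Brain Atlas data.

```r
# Hypothetical illustration of a quantile threshold; not the package's own code.
set.seed(42)

# Toy activity scores for 100 made-up brain areas (assumed names and values)
scores <- data.frame(
  my_grouping = paste0("area_", seq_len(100)),
  activity = rnorm(100)
)

# high_prob = 0.95 means: keep only areas at or above the 95th percentile
high_prob <- 0.95
threshold <- quantile(scores$activity, probs = high_prob)

# Areas at or above the threshold are the "most active" ones (the top 5 per cent)
most_active <- scores[scores$activity >= threshold, ]
nrow(most_active)
```

Lowering high_prob (for example to 0.90) widens the selection, so more areas count as highly active.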

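Value

A dataframe with the variables "group", "my_grouping" and "batch" for the brain areas simulated to be most active. If summary = TRUE, a list whose first element is that dataframe and whose second element is the summary.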

Examples

x <- data.frame(
  group = rep(c("control", "experimental"), each = 5),
  my_grouping = rep(c("CA1", "CA2", "CA3", "DG", "BLA"), 2),
  mean_expression = c(rnorm(5, 10, 2), rnorm(5, 13, 2)),
  sd_expression = abs(rnorm(10)),
  weight = c(rep(1, 5), rnorm(5, 3, 1))
)

# sim_most_active() uses the pipe operator internally, so attach dplyr (which provides %>%) first;
# otherwise the call fails with: could not find function "%>%"
library(dplyr)

sim_most_active(x, samples_per_group = 3, weight_by_expression = FALSE, summary = FALSE)
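
As a further sketch, the uniform case can be run with a reduced input, since weights_df needs only "group" and "my_grouping" when weight_by_expression = FALSE. The name x_min is hypothetical, and the call is guarded so that the block also runs when the package providing sim_most_active() is not loaded; it assumes the package and its dependencies are attached otherwise.

```r
# Minimal input for the uniform case: only "group" and "my_grouping" are required
# (x_min is a hypothetical name used for illustration)
x_min <- data.frame(
  group = rep(c("control", "experimental"), each = 5),
  my_grouping = rep(c("CA1", "CA2", "CA3", "DG", "BLA"), 2)
)

# Run only if sim_most_active() is available in the current session
if (exists("sim_most_active")) {
  res <- sim_most_active(
    x_min,
    samples_per_group = 3,
    weight_by_expression = FALSE,
    summary = TRUE
  )
  sim_data <- res[[1]]     # first element: the simulated data
  sim_summary <- res[[2]]  # second element: the summary of selected brain areas
}
```

With summary = TRUE the result is a list, so the data and the summary are extracted by position as described under Arguments.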