find_most_active.Rd
Wrapper around dplyr functions that finds the most active brain areas by counting how often each area falls in the top 5 per cent (or another proportion set by the user through high_prob) of the per-batch distribution of cells_perthousand. The function returns a data frame that summarizes, per group, which brain areas were in the top of the distribution and in how many batches.
find_most_active(region_df, high_prob = 0.95)
Argument | Description |
---|---|
region_df | Region-based data frame. Each row is a brain area ("my_grouping") per sample ("sample_id"), where the corrected cell count ("cells_perthousand") has been summarized. It contains a variable "batch" that identifies the unit within which the calculation is performed. If the data come from a block design, "batch" identifies a unique set of control and experimental groups (variable "group"), with 1 sample each. It can be the output of summarize_per_region() or preprocess_per_region(). |
high_prob | Number between 0 and 1 indicating the threshold for being a highly active region. 0.95 corresponds to the top 5 per cent. |
x <- data.frame(
  batch = rep(c(1, 1, 2, 2), each = 5),
  group = rep(c("control", "exp", "exp", "control"), each = 5),
  sample_id = rep(c("a", "b", "c", "d"), each = 5),
  my_grouping = rep(c("CA1", "CA2", "CA3", "DG", "BLA"), 4),
  intensity_ave = sample(10000, 20, replace = TRUE),
  cells_perthousand = abs(rnorm(20))
)
find_most_active(x)
#> Error in region_df %>% dplyr::group_by(batch) %>% dplyr::summarize(high_count = quantile(cells_perthousand, probs = high_prob)) %>% dplyr::ungroup() %>% dplyr::right_join(region_df, by = "batch"): could not find function "%>%"
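The error message shows that the pipe operator `%>%` was not available when the example ran, so no output was produced. As a rough illustration of the calculation described above, the following sketch re-creates it with dplyr attached. The function name `sketch_most_active` and the output column `n_batches` are hypothetical, and only the per-batch quantile step is taken from the pipeline visible in the error message; the filtering and counting steps are assumptions based on the description, not the package's actual source.

```r
library(dplyr)  # attaching dplyr also makes %>% available

# Hypothetical re-implementation for illustration only.
sketch_most_active <- function(region_df, high_prob = 0.95) {
  region_df %>%
    group_by(batch) %>%
    # Per-batch threshold (the step visible in the error message); mutate()
    # is used here in place of summarize() followed by right_join().
    mutate(high_count = quantile(cells_perthousand, probs = high_prob,
                                 names = FALSE)) %>%
    ungroup() %>%
    # Assumed: keep rows at or above their batch threshold.
    filter(cells_perthousand >= high_count) %>%
    # Assumed: per group and brain area, count the batches in which it appears.
    group_by(group, my_grouping) %>%
    summarize(n_batches = n_distinct(batch), .groups = "drop")
}

set.seed(1)  # fixed seed so the simulated data are reproducible
x <- data.frame(
  batch = rep(c(1, 1, 2, 2), each = 5),
  group = rep(c("control", "exp", "exp", "control"), each = 5),
  sample_id = rep(c("a", "b", "c", "d"), each = 5),
  my_grouping = rep(c("CA1", "CA2", "CA3", "DG", "BLA"), 4),
  intensity_ave = sample(10000, 20, replace = TRUE),
  cells_perthousand = abs(rnorm(20))
)
res <- sketch_most_active(x)
print(res)
```

With only 10 rows per batch and high_prob = 0.95, the threshold lies between the two largest values, so each batch contributes a single row; a lower high_prob would keep more areas per batch.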