This function cleans the annotated cells of one sample in four steps: 1) removing the halo around the brain and ventricles, 2) keeping only the brain areas of interest, 3) removing damaged areas, and 4) imputing the damaged areas by mirroring the other hemisphere. For each sample processed, clean_counts() saves two .RDS files in the specified paths. The first file (*clean_counts.RDS) contains the xyz coordinates of the cells that met the cleaning criteria, together with the user-defined brain area of interest each cell is assigned to. The second file (*_removed_counts_summary.RDS) summarises the counts removed during the procedure.

clean_counts(
  sample_id,
  data,
  atlas,
  damaged_areas,
  dodgy_cells = NULL,
  out_mask = out_mask,
  vent_mask = vent_mask,
  warning_percentage = 0.2,
  path_cleaned,
  path_removed
)

Arguments

sample_id

String identifying the sample; it is used to name the saved output. It must not contain spaces.

data

Dataframe with the cell coordinates of one sample. Requires the variables "xPos", "yPos" and "zPos" for the x, y and z coordinates respectively, and "id" for the code of the brain area according to the Allen Brain Atlas (ABA).
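A minimal sketch of a data frame in the expected shape, with a check for the required columns. The coordinates and area codes are invented for illustration, and the check itself is a hypothetical pre-flight step, not something the function is documented to perform.

```r
# Toy input for one sample; all values are made up for illustration.
data <- data.frame(
  xPos = c(120, 245, 310),
  yPos = c(88, 132, 201),
  zPos = c(40, 41, 57),
  id   = c(101, 102, 103)  # illustrative area codes, not real ABA ids
)

# Hypothetical check: stop early if a required column is missing.
required_cols <- c("xPos", "yPos", "zPos", "id")
missing_cols <- setdiff(required_cols, names(data))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```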

atlas

Dataframe with the metadata of the Allen Brain Atlas (ABA) areas. It can be generated by running the preparation script "atlas_tree.R". Requires the variables "id" for the numerical code of the ABA area; "name" for its full name (character); "acronym" for its acronym; "parent_acronym" for the acronym of the parent ABA area it belongs to; "category" for the categorization of brain areas that are difficult to interpret, see XX for details; and "my_grouping" for your own categorization of brain areas.
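As an illustration of how the "id" and "my_grouping" variables relate cells to areas of interest, the sketch below builds a toy atlas and looks up each cell's grouping by its area code. The area names, codes and groupings are invented, "category" is left as NA because its values are not documented here, and the lookup is an assumption about how such a table would typically be used, not a description of the function's internals.

```r
# Toy atlas with the documented columns; all entries are invented.
atlas <- data.frame(
  id             = c(101, 102, 103),
  name           = c("Area A", "Area A, part 1", "Area A, part 2"),
  acronym        = c("A", "A1", "A2"),
  parent_acronym = c("root", "A", "A"),
  category       = NA_character_,       # values not documented here
  my_grouping    = c("A", "A", "A"),    # user-defined grouping
  stringsAsFactors = FALSE
)

# Cells annotated with area codes (toy values).
cells <- data.frame(xPos = c(1, 2), yPos = c(3, 4), zPos = c(5, 6),
                    id = c(102, 103))

# Look up the user-defined grouping of each cell by its area code.
cells$my_grouping <- atlas$my_grouping[match(cells$id, atlas$id)]
```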

damaged_areas

Dataframe listing the damaged brain areas of all samples. The damaged areas relevant to the sample are selected automatically with the function specify_damage(). Requires the variables "area", with the acronym of the brain area, which must match the "my_grouping" variable of the atlas dataframe (see above); and "hemisphere", which specifies the hemisphere where the damage occurs ("right" or "left").
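A minimal sketch of such a table, with a consistency check on its two documented columns. The area acronyms and the `groupings` vector are invented for illustration; a real table covering several samples would presumably also need a column identifying the sample, which is not documented here and is therefore omitted.

```r
# Groupings as they would appear in the atlas "my_grouping" column (toy values).
groupings <- c("A", "B", "C")

# Toy table of damaged areas with the two documented columns.
damaged_areas <- data.frame(
  area       = c("A", "C"),
  hemisphere = c("right", "left"),
  stringsAsFactors = FALSE
)

# Hypothetical checks: areas must be known groupings, hemispheres must be valid.
areas_ok      <- all(damaged_areas$area %in% groupings)
hemisphere_ok <- all(damaged_areas$hemisphere %in% c("right", "left"))
if (!areas_ok || !hemisphere_ok) {
  stop("damaged_areas contains unknown areas or hemispheres")
}
```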

dodgy_cells

Dataframe with information about cells of abnormally high intensity (likely unspecific binding, identified for example after checking the scans). It has three columns: "sample_id" (id of the sample), "my_grouping" (brain area) and "threshold" (maximum-intensity threshold above which cells are considered "spots" and filtered out). This dataframe can be NULL.
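The threshold filter described above could look roughly like the following sketch. The intensity column name "maxInt" is hypothetical (the documentation does not name it), as are all values; the join-then-filter approach is one plausible way to apply per-sample, per-area thresholds and is not necessarily how the function does it.

```r
# Toy cells with a hypothetical maximum-intensity column "maxInt".
cells <- data.frame(
  sample_id   = c("s01", "s01", "s01", "s02"),
  my_grouping = c("A", "A", "B", "A"),
  maxInt      = c(150, 900, 300, 950),
  stringsAsFactors = FALSE
)

# One threshold, valid only for area "A" of sample "s01".
dodgy_cells <- data.frame(sample_id = "s01", my_grouping = "A",
                          threshold = 500, stringsAsFactors = FALSE)

# Attach the threshold where one is defined; other rows get NA.
merged <- merge(cells, dodgy_cells,
                by = c("sample_id", "my_grouping"), all.x = TRUE)

# A cell is a "spot" only if a threshold exists and its intensity exceeds it.
is_spot <- !is.na(merged$threshold) & merged$maxInt > merged$threshold
kept <- merged[!is_spot, names(cells)]
```

Cells in areas or samples without a threshold are left untouched, which mirrors the fact that the dataframe may be NULL.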

out_mask

Mask identifying the outer part of the brain, which is removed because of the halo. It was created in Python as follows: find all zero-valued voxels that border the brain (i.e. that lie next to non-zero values); from each such voxel k, move 3 voxels in all directions; then create a new matrix with non-zero values for the 3 voxels around k. Ask Heike for more information.
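The mask construction described above can be sketched in R as follows. This is not the original Python script: the function name `make_halo_mask`, the use of a 26-voxel neighbourhood to decide what "borders" the brain, and the cubic (rather than spherical) 3-voxel neighbourhood are all assumptions made for illustration.

```r
# Sketch: mark every voxel within `radius` voxels of a background voxel
# that touches the brain. `vol` is a 3D array, 0 = background.
make_halo_mask <- function(vol, radius = 3) {
  d <- dim(vol)
  mask <- array(0L, dim = d)
  # Index range around i, clipped to the array bounds along one axis.
  rng <- function(i, r, n) max(1, i - r):min(n, i + r)
  for (x in seq_len(d[1])) {
    for (y in seq_len(d[2])) {
      for (z in seq_len(d[3])) {
        if (vol[x, y, z] != 0) next  # only background voxels can be border voxels
        nb <- vol[rng(x, 1, d[1]), rng(y, 1, d[2]), rng(z, 1, d[3])]
        if (any(nb != 0)) {
          # Border voxel k: flag the cube of `radius` voxels around it.
          mask[rng(x, radius, d[1]), rng(y, radius, d[2]),
               rng(z, radius, d[3])] <- 1L
        }
      }
    }
  }
  mask
}

# Toy volume: a 10 x 10 x 10 "brain" inside a 16 x 16 x 16 array.
vol <- array(0, dim = c(16, 16, 16))
vol[4:13, 4:13, 4:13] <- 1
halo <- make_halo_mask(vol)
```

In this toy volume the flagged region covers the outer voxels of the cube and the background next to it, while the centre of the cube stays unflagged.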

vent_mask

Mask identifying the outer part of the ventricles, which is removed because of the halo and unspecific binding of the antibody. Ask Heike for more information.

warning_percentage

Number between 0 and 1 that sets the threshold above which a warning message is issued. The warning reports whether brain areas removed during the cleaning procedure contained an abnormally high number of cells. 0 will never return a warning, 1 always will.

path_cleaned

String specifying the path where the cleaned annotated cells will be saved.

path_removed

String specifying the path where the summary of removed cells will be saved.
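Because the outputs are written as .RDS files, they can be read back with readRDS(). The sketch below shows the round trip using a temporary directory; how clean_counts() combines the path and sample_id into a file name is not documented here, so the file name built below is an assumption, and the saved data frame is a stand-in for the real output.

```r
sample_id    <- "sample01"
path_cleaned <- tempdir()  # stand-in for a real output directory

# Stand-in for the cleaned output: coordinates plus the user-defined grouping.
cleaned <- data.frame(xPos = 1, yPos = 2, zPos = 3, my_grouping = "A",
                      stringsAsFactors = FALSE)

# Assumed naming pattern; the actual pattern may differ.
out_file <- file.path(path_cleaned, paste0(sample_id, "_clean_counts.RDS"))
saveRDS(cleaned, out_file)

# Read the file back for downstream analysis.
restored <- readRDS(out_file)
```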

Value

Examples