clean_counts.Rd
This function cleans the annotated cells by: 1) removing the halo around the brain and ventricles, 2) keeping only the brain areas of interest, 3) removing damaged areas, and 4) imputing the damaged areas by mirroring the other hemisphere. For each sample processed, clean_counts() saves two .RDS files in the specified paths. The first file (*clean_counts.RDS) contains the xyz coordinates of the cells that met the cleaning criteria, together with their assignment to the brain areas of interest defined by the user. The second file (*_removed_counts_summary.RDS) summarizes the counts that were removed during the procedure.
clean_counts( sample_id, data, atlas, damaged_areas, dodgy_cells = NULL, out_mask = out_mask, vent_mask = vent_mask, warning_percentage = 0.2, path_cleaned, path_removed )
sample_id | String used to save output. Please do not use spaces. |
---|---|
data | Data frame with the cell coordinates of one sample. Requires the variables "xPos", "yPos", and "zPos" for the x, y, and z coordinates, respectively, and "id" for the code of the brain area according to the Allen Brain Atlas. |
atlas | Data frame with metadata of the Allen Brain Atlas (ABA) areas. It can be generated by running the preparation script "atlas_tree.R". Requires the variables "id" (numerical code of the ABA area), "name" (name of the area, as a character value), "acronym" (ABA acronym), "parent_acronym" (acronym of the parent ABA area it belongs to), "category" (categorization of brain areas that are difficult to interpret, see XX for details), and "my_grouping" (your own categorization of brain areas). |
damaged_areas | Data frame listing the damaged brain areas of all samples. When it is created with the function specify_damage(), the relevant damaged areas are selected automatically. Requires the variables "area", the acronym of the brain area, which must match the "my_grouping" variable of the atlas data frame (see above), and "hemisphere", which specifies the hemisphere where the damage occurs ("right" or "left"). |
dodgy_cells | Data frame with information about cells with abnormally high intensity (likely due to unspecific binding, identified for example after checking the scans). Requires three columns: "sample_id" (id of the sample), "my_grouping" (brain area), and "threshold" (value above which cells are considered "spots"; the filter is based on maximum intensity). This data frame can be NULL. |
out_mask | Mask identifying the outer part of the brain to be removed because of the halo. It was created in Python by finding all zero-valued voxels that border the brain (i.e., that lie next to non-zero values), moving 3 voxels from there in all directions (k), and creating a new matrix with non-zero values for the 3 voxels around k. Ask Heike for more information. |
vent_mask | Mask identifying the outer part of the ventricles to be removed because of the halo and unspecific binding of the antibody. Ask Heike for more information. |
warning_percentage | Number between 0 and 1. It defines the threshold beyond which a warning message is issued. The warning reports whether the brain areas removed during the cleaning procedure contained an (abnormally) high number of cells. 0 will never return a warning, 1 will always. |
path_cleaned | String specifying the path where the cleaned annotated cells will be saved. |
path_removed | String specifying the path where the summary of removed cells will be saved. |
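
The argument descriptions above can be illustrated with a minimal sketch of the expected inputs. All values below are made-up toy data, the area acronyms ("A1", "A2") and the paths are hypothetical, and the two masks are not constructed here because their format is not specified on this page. The call to clean_counts() is therefore shown only as a comment.

```r
# Toy cell coordinates for one sample (hypothetical values).
data <- data.frame(
  xPos = c(10.5, 20.1, 30.7),
  yPos = c(5.2, 6.8, 7.4),
  zPos = c(1, 2, 3),
  id   = c(1L, 2L, 2L)            # area codes, made up for illustration
)

# Toy atlas metadata with the six required variables (hypothetical content).
atlas <- data.frame(
  id             = c(1L, 2L),
  name           = c("Area one", "Area two"),
  acronym        = c("A1", "A2"),
  parent_acronym = c("root", "root"),
  category       = c("placeholder", "placeholder"),
  my_grouping    = c("A1", "A2"),
  stringsAsFactors = FALSE
)

# Damaged areas: "area" is assumed to match atlas$my_grouping.
damaged_areas <- data.frame(
  area       = "A2",
  hemisphere = "right",
  stringsAsFactors = FALSE
)

# No cells flagged as abnormally intense in this toy example.
dodgy_cells <- NULL

# Helper (not part of the package): check that required columns are present.
has_columns <- function(df, cols) all(cols %in% names(df))

inputs_ok <- has_columns(data, c("xPos", "yPos", "zPos", "id")) &&
  has_columns(atlas, c("id", "name", "acronym", "parent_acronym",
                       "category", "my_grouping")) &&
  has_columns(damaged_areas, c("area", "hemisphere")) &&
  all(damaged_areas$area %in% atlas$my_grouping) &&
  all(damaged_areas$hemisphere %in% c("right", "left"))

# With out_mask and vent_mask available, the call would look roughly like:
# clean_counts(
#   sample_id = "sample_01",
#   data = data, atlas = atlas,
#   damaged_areas = damaged_areas, dodgy_cells = dodgy_cells,
#   out_mask = out_mask, vent_mask = vent_mask,
#   warning_percentage = 0.2,
#   path_cleaned = "output/cleaned/", path_removed = "output/removed/"
# )
```

Checking the column names and the match between "area" and "my_grouping" before calling the function is a design choice of this sketch, not something the function is documented to require. The saved .RDS files can afterwards be loaded with readRDS().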