forked from swbioinf/scRNAseq_Workshop
# (PART) Seurat Object {-}
# Structure {#structure}
## Load an existing Seurat object
The data we're working with today is a small dataset of about 5000 PBMCs (peripheral blood mononuclear cells) from a healthy donor.
First, load the Seurat package.
```{r libloader, results='hide', message=FALSE, warning=FALSE}
library(Seurat)
```
And here's the one we prepared earlier. Seurat objects are usually saved as '.rds' files, an R format that stores a single R object in binary form (not plain text, so not human-readable). The function `readRDS()` loads it.
```{r}
seurat_object <- readRDS("data/kang2018.rds")
```
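As a small illustration of how '.rds' files behave, the sketch below saves a toy object with `saveRDS()` and reads it back with `readRDS()`. The object `toy_result` and the temporary file are made up for this example and are not part of the workshop data.

```{r}
# Toy stand-in for a Seurat object: any single R object can be stored as .rds
toy_result <- list(
  counts = matrix(1:6, nrow = 2,
                  dimnames = list(c("GeneA", "GeneB"),
                                  c("cell1", "cell2", "cell3"))),
  note = "toy example"
)

# Write to a temporary file so nothing in the workshop folder is touched
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy_result, rds_path)

# readRDS() returns the stored object; assign it to any variable name you like
restored <- readRDS(rds_path)
identical(toy_result, restored)
```

Unlike `load()`, `readRDS()` does not restore a variable name, which is why the result is assigned explicitly.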
#### Discussion: The Seurat Object in R { - .rational}
Let's take a look at the Seurat object we have just loaded into R, `seurat_object`.
To accommodate the complexity of the data arising from a single-cell RNA-seq experiment, the Seurat object acts as a container for multiple linked data tables.
The functions in Seurat can access parts of the data object for analysis and visualization. We will cover this later on.
There are a couple of concepts to discuss here.
<details>
<summary>**Class**</summary>
A class defines a type of data container in R. An object of that class can be accessed as a variable in the R environment.
Classes are pre-defined and can contain multiple data tables and metadata. For Seurat, there are three main types.
* Seurat - the main data class, which contains all the data.
* Assay - found within the Seurat object. Depending on the experiment, a cell could have several kinds of measurement (RNA, ATAC, etc.), each stored in its own assay.
* DimReduc - holds dimensionality reductions such as PCA and UMAP.
</details>
<details>
<summary>**Slots**</summary>
Slots are parts within a class that contain specific data. These can be lists, data tables and vectors and can be accessed with conventional R methods.
</details>
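To make the idea of classes and slots concrete, here is a minimal sketch using a made-up S4 class. `ToyContainer` and its slots are hypothetical and only loosely mimic how a Seurat object bundles an expression table with per-cell metadata.

```{r}
library(methods)

# Hypothetical toy class with two slots (not a Seurat class)
setClass(
  "ToyContainer",
  representation(counts = "matrix", meta = "data.frame")
)

toy <- new(
  "ToyContainer",
  counts = matrix(c(0, 2, 1, 0), nrow = 2,
                  dimnames = list(c("GeneA", "GeneB"), c("cell1", "cell2"))),
  meta = data.frame(sample = c("ctrl", "stim"),
                    row.names = c("cell1", "cell2"))
)

slotNames(toy)      # names of the slots defined for the class
toy@counts          # `@` pulls out one slot
slot(toy, "meta")   # slot() is the functional equivalent of `@`
```

The same calls apply to the workshop object, for example `slotNames(seurat_object)`.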
<details>
<summary>**Data Access**</summary>
Many of the functions in Seurat operate seamlessly on the data class and the slots within it. There may be occasions when you need to access the slots directly to `hack` them, but this is an advanced analysis method.
Slots can be accessed either through methods for the class (functions) or with standard R accessor notation.
</details>
**Examples of accessing a Seurat object**
The `assays` slot in `seurat_object` can be accessed with `seurat_object@assays`.
The `RNA` assay can be accessed from this with `seurat_object@assays$RNA`.
We often want to access assays, so Seurat gives us a shortcut: `seurat_object$RNA`. You may sometimes see the alternative notation `seurat_object[["RNA"]]`.
In general, slots that are always in an object are accessed with `@` and things that may be different in different data sets are accessed with `$`.
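The access patterns above can be tried on a tiny object. This is a sketch only: `toy_counts` and `toy_object` are hypothetical, built from made-up numbers rather than the workshop data, and it assumes the default assay name `RNA` that `CreateSeuratObject()` uses.

```{r}
library(Seurat)
library(Matrix)

# Hypothetical toy counts: 3 genes x 4 cells (not the workshop data)
toy_counts <- Matrix(
  c(0, 3, 1,
    2, 0, 0,
    0, 1, 4,
    5, 0, 2),
  nrow = 3, sparse = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), paste0("cell", 1:4))
)
toy_object <- CreateSeuratObject(counts = toy_counts)

# `@` reaches a slot that every Seurat object has
class(toy_object@assays)   # the assays slot is a list
names(toy_object@assays)   # which assays this particular object holds

# `$` then picks one assay out of that list; `[[ ]]` reaches the same assay
identical(toy_object@assays$RNA, toy_object[["RNA"]])
```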
**Have a go**
Use `str` to look at the structure of the Seurat object `seurat_object`.
What is in the `meta.data` slot within your Seurat object currently? What type of data is contained here?
Where is our count data within the Seurat object?
## What's in there?
Some of the most important information for working with Seurat objects is in the metadata.
This is cell level information - each row is one cell, identified by its barcode.
Extra information gets added to this table as analysis progresses.
```{r}
head(seurat_object@meta.data)
```
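As a sketch of how the metadata table grows, the example below adds a per-cell column to a tiny object. `toy_object` and the `condition` labels are hypothetical and invented for illustration.

```{r}
library(Seurat)
library(Matrix)

# Hypothetical toy counts: 3 genes x 4 cells (not the workshop data)
toy_counts <- Matrix(
  c(0, 3, 1,
    2, 0, 0,
    0, 1, 4,
    5, 0, 2),
  nrow = 3, sparse = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), paste0("cell", 1:4))
)
toy_object <- CreateSeuratObject(counts = toy_counts)

# One row per cell; Seurat fills in a few columns when the object is created
colnames(toy_object@meta.data)

# Assigning with `$` writes a new metadata column (one value per cell)
toy_object$condition <- c("ctrl", "ctrl", "stim", "stim")
head(toy_object@meta.data)
```

Many Seurat functions record their per-cell results in this same table.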
That table doesn't contain any gene expression values, though. Those are stored in an 'Assay'.
The Assay structure has some nuances (see discussion below), but there are functions that get the assay data out for you.
By default this function returns the normalised data (from the only assay in this object, called RNA). The matrix is printed in sparse format, so every '.' is a zero.
```{r}
GetAssayData(seurat_object)[1:15,1:2]
```
But the raw counts data is accessible too.
```{r}
GetAssayData(seurat_object, layer='counts')[1:15,1:2]
```
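The relationship between the two layers can be checked on a toy object. This is a sketch that assumes a Seurat version supporting the `layer` argument, as used above. `toy_counts` and `toy_object` are hypothetical, and `NormalizeData()` is called here only so that the toy object has normalised data to compare against.

```{r}
library(Seurat)
library(Matrix)

# Hypothetical toy counts: 3 genes x 4 cells (not the workshop data)
toy_counts <- Matrix(
  c(0, 3, 1,
    2, 0, 0,
    0, 1, 4,
    5, 0, 2),
  nrow = 3, sparse = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), paste0("cell", 1:4))
)
toy_object <- CreateSeuratObject(counts = toy_counts)
toy_object <- NormalizeData(toy_object, verbose = FALSE)

raw  <- GetAssayData(toy_object, layer = "counts")
norm <- GetAssayData(toy_object, layer = "data")

raw                      # sparse printout: zeros appear as '.'
all(raw == toy_counts)   # the counts layer holds the matrix we supplied
sum(raw == 0)            # number of zero entries in the raw counts
sum(norm == 0)           # number of zero entries after normalisation
```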