A First Look at ArchR, a Powerful Tool for scATAC Analysis: Identifying Marker Peaks with ArchR (1

2020-05-19 六博说

11 - Identifying Marker Peaks with ArchR

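We introduced the identification of marker features in the earlier chapter on gene scores. The same function (getMarkerFeatures()) can be used to identify marker features from any matrix stored in an ArchRProject. Marker features are features that are unique to a specific cell grouping. They are very useful for understanding cluster-specific or cell type-specific biology. In this chapter, we discuss how to use this function to identify marker peaks.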

11.1 Identifying Marker Peaks with ArchR

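Often, we want to know which peaks are unique to an individual cluster or to a small group of clusters. We can do this in an unsupervised fashion in ArchR by calling the getMarkerFeatures() function with useMatrix = "PeakMatrix".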

First, let's remind ourselves of the cell types we are working with in projHeme5 and their relative proportions.

#Our scRNA labels
table(projHeme5$Clusters2)

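Now we are ready to identify marker peaks by calling the getMarkerFeatures() function with useMatrix = "PeakMatrix". In addition, we tell ArchR to account for differences in data quality among the cell groups by setting the bias parameter to include TSS enrichment and the number of unique fragments per cell.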

markersPeaks <- getMarkerFeatures(
    ArchRProj = projHeme5,
    useMatrix = "PeakMatrix",
    groupBy = "Clusters2",
    bias = c("TSSEnrichment", "log10(nFrags)"),
    testMethod = "wilcoxon"
)
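To give a feel for what a Wilcoxon-based marker test does, here is a minimal base-R sketch on simulated counts. It is not ArchR's implementation: the matrix, group sizes, and the Benjamini-Hochberg adjustment are assumptions made for illustration, and the bias-matched selection of background cells that the bias parameter controls is not reproduced here.

```r
# Toy one-vs-rest marker test; all data below are simulated for illustration.
set.seed(1)
n_in <- 20   # cells in the group of interest
n_bg <- 60   # all remaining cells
group <- rep(c("Erythroid", "Other"), times = c(n_in, n_bg))
is_in <- group == "Erythroid"

# 5 peaks x 80 cells of Poisson counts; peak1 is made more accessible in Erythroid
mat <- matrix(rpois(5 * (n_in + n_bg), lambda = 1), nrow = 5,
              dimnames = list(paste0("peak", 1:5), NULL))
mat["peak1", is_in] <- rpois(n_in, lambda = 5)

# Wilcoxon rank-sum test per peak (tied counts trigger a warning, suppressed here)
pvals <- apply(mat, 1, function(x) {
  suppressWarnings(wilcox.test(x[is_in], x[!is_in])$p.value)
})

# Multiple-testing correction (Benjamini-Hochberg is an assumption of this sketch)
fdr <- p.adjust(pvals, method = "BH")

# Peaks ordered from most to least significant
sort(fdr)
```

In this toy setup only peak1 was simulated with a group difference, so it should come out at the top of the sorted list.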

The object returned by the getMarkerFeatures() function is a SummarizedExperiment that contains several different assays.

markersPeaks

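We can use the getMarkers() function to retrieve the particular slices of this SummarizedExperiment that we are interested in. By default, this function returns a list of DataFrame objects, one for each cell group.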

markerList <- getMarkers(markersPeaks, cutOff = "FDR <= 0.01 & Log2FC >= 1")
markerList
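The cutOff argument is a string that reads like a logical expression over the test statistics. The sketch below shows one way such a string can be applied to a plain data frame in base R. The table values are invented, and evaluating the string with eval(parse()) is an illustrative assumption rather than a description of ArchR's internals.

```r
# Toy marker table; the values are invented for illustration
markers <- data.frame(
  peak   = paste0("peak", 1:5),
  Log2FC = c(2.3, 0.4, 1.1, -1.8, 3.0),
  FDR    = c(0.001, 0.0005, 0.2, 0.004, 0.009)
)

cutOff <- "FDR <= 0.01 & Log2FC >= 1"

# Evaluate the cutoff string as a logical expression over the table's columns
keep <- eval(parse(text = cutOff), envir = markers)

# Only rows that are both significant and enriched remain
markers[keep, ]
```

Both conditions must hold: a peak with a very small FDR but a Log2FC below 1 (peak2 here) is dropped, as is a strongly depleted peak (peak4).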

If we are interested in the marker peaks for a specific cell group, we can access them from the list via the $ accessor.

markerList$Erythroid

Instead of a list of DataFrame objects, we can have getMarkers() return a GRangesList object by setting returnGR = TRUE.

markerList <- getMarkers(markersPeaks, cutOff = "FDR <= 0.01 & Log2FC >= 1", returnGR = TRUE)
markerList

This GRangesList object can similarly be subset to the GRanges object for a particular cell group using the $ accessor.

markerList$Erythroid

11.2 Plotting Marker Peaks in ArchR

ArchR provides multiple plotting functions to interact with the SummarizedExperiment objects returned by getMarkerFeatures().

11.2.1 Marker Peak Heatmaps

We can visualize these marker peaks (or any features output by getMarkerFeatures()) as a heatmap using the markerHeatmap() function.

heatmapPeaks <- markerHeatmap(
  seMarker = markersPeaks, 
  cutOff = "FDR <= 0.1 & Log2FC >= 0.5",
  transpose = TRUE
)

We can plot this heatmap using draw().

draw(heatmapPeaks, heatmap_legend_side = "bottom", annotation_legend_side = "bottom")

To save an editable vectorized version of this plot, we use the plotPDF() function.

plotPDF(heatmapPeaks, name = "Peak-Marker-Heatmap", width = 8, height = 6, ArchRProj = projHeme5, addDOC = FALSE)

11.2.2 Marker Peak MA and Volcano Plots

Instead of plotting a heatmap, we can also plot an MA or Volcano plot for any individual cell group. To do this, we use the markerPlot() function. For an MA plot we specify plotAs = "MA". Here we specify the “Erythroid” cell group via the name parameter.

pma <- markerPlot(seMarker = markersPeaks, name = "Erythroid", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "MA")
pma

Similarly, for a Volcano plot, we specify plotAs = "Volcano".

pv <- markerPlot(seMarker = markersPeaks, name = "Erythroid", cutOff = "FDR <= 0.1 & Log2FC >= 1", plotAs = "Volcano")
pv

To save an editable vectorized version of these plots, we use the plotPDF() function.

plotPDF(pma, pv, name = "Erythroid-Markers-MA-Volcano", width = 5, height = 5, ArchRProj = projHeme5, addDOC = FALSE)
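As a reminder of what the MA and Volcano plots above display, the following base-R sketch computes the usual coordinates for each plot type from a toy table. The column names, values, and the exact axis definitions are generic conventions assumed for illustration; the source does not state which transformations markerPlot() applies internally.

```r
# Toy per-peak statistics for one cell group (invented values)
stats <- data.frame(
  mean_in = c(8, 2, 1, 16),   # mean signal in the group of interest
  mean_bg = c(2, 2, 4, 1),    # mean signal in the background
  FDR     = c(0.001, 0.9, 0.05, 1e-6)
)
stats$Log2FC <- log2(stats$mean_in / stats$mean_bg)

# MA plot: average signal (log2 scale) on x, log2 fold change on y
ma <- data.frame(
  x = log2((stats$mean_in + stats$mean_bg) / 2),
  y = stats$Log2FC
)

# Volcano plot: log2 fold change on x, -log10(FDR) on y
volcano <- data.frame(
  x = stats$Log2FC,
  y = -log10(stats$FDR)
)

# Points passing the same kind of cutoff used above would be highlighted
sig <- stats$FDR <= 0.1 & stats$Log2FC >= 1
cbind(stats, significant = sig)
```

The third peak has an FDR below 0.1 but a negative fold change, so a one-sided cutoff of Log2FC >= 1 leaves it unhighlighted.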

11.2.3 Marker Peaks in Browser Tracks

Additionally, we can see these peak regions overlaid on our browser tracks by passing the relevant peak regions to the features parameter in the plotBrowserTrack() function. This adds a BED-style track of marker peak regions to the bottom of our ArchR track plot. Here we specify plotting the GATA1 gene via the geneSymbol parameter.

p <- plotBrowserTrack(
    ArchRProj = projHeme5, 
    groupBy = "Clusters2", 
    geneSymbol = c("GATA1"),
    features = getMarkers(markersPeaks, cutOff = "FDR <= 0.1 & Log2FC >= 1", returnGR = TRUE)["Erythroid"],
    upstream = 50000,
    downstream = 50000
)

We can plot this using grid::grid.draw().

grid::grid.newpage()
grid::grid.draw(p$GATA1)

To save an editable vectorized version of this plot, we use the plotPDF() function.

plotPDF(p, name = "Plot-Tracks-With-Features", width = 5, height = 5, ArchRProj = projHeme5, addDOC = FALSE)

11.3 Pairwise Testing Between Groups

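Marker feature identification is a very specific type of differential test. However, ArchR also enables standard differential testing using the same getMarkerFeatures() function. The trick is to set useGroups to one of the two cell groups and bgdGroups to the other. This performs a differential test between the two provided groups. In all of these differential tests, peaks that are higher in the group passed to useGroups have positive fold change values, while peaks that are higher in the group passed to bgdGroups have negative fold change values.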
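The sign convention can be checked with a few lines of base R. This is a sketch on invented group means, with the fold change taken as a plain ratio of means; any pseudocount or normalization that ArchR applies is not modeled here.

```r
# Toy mean accessibility per peak in the two groups (invented values)
mean_use <- c(peakA = 12, peakB = 3,  peakC = 5)  # group passed to useGroups
mean_bgd <- c(peakA = 3,  peakB = 12, peakC = 5)  # group passed to bgdGroups

# Fold change is always "use" over "background"
log2fc <- log2(mean_use / mean_bgd)

# Positive values point to useGroups, negative values to bgdGroups
direction <- ifelse(log2fc > 0, "higher in useGroups",
             ifelse(log2fc < 0, "higher in bgdGroups", "no change"))

data.frame(log2fc, direction)
```

Swapping the two groups flips the sign of every fold change but leaves the set of differential peaks unchanged.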

Here we perform a pairwise test between the "Erythroid" cell group and the "Progenitor" cell group.

markerTest <- getMarkerFeatures(
  ArchRProj = projHeme5, 
  useMatrix = "PeakMatrix",
  groupBy = "Clusters2",
  testMethod = "wilcoxon",
  bias = c("TSSEnrichment", "log10(nFrags)"),
  useGroups = "Erythroid",
  bgdGroups = "Progenitor"
)

We can then plot an MA or Volcano plot using the markerPlot() function. We request an MA plot with plotAs = "MA".

pma <- markerPlot(seMarker = markerTest, name = "Erythroid", cutOff = "FDR <= 0.1 & abs(Log2FC) >= 1", plotAs = "MA")
pma

Similarly, we plot a Volcano plot with plotAs = "Volcano".

pv <- markerPlot(seMarker = markerTest, name = "Erythroid", cutOff = "FDR <= 0.1 & abs(Log2FC) >= 1", plotAs = "Volcano")
pv

To save an editable vectorized version of these plots, we use the plotPDF() function.

plotPDF(pma, pv, name = "Erythroid-vs-Progenitor-Markers-MA-Volcano", width = 5, height = 5, ArchRProj = projHeme5, addDOC = FALSE)
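The pairwise plots above use abs(Log2FC) in the cutoff, which keeps peaks that change in either direction. The base-R sketch below contrasts that two-sided filter with the one-sided filter used for marker peaks; the table is invented for illustration.

```r
# Toy pairwise test results (invented values)
pairwise <- data.frame(
  peak   = paste0("peak", 1:6),
  Log2FC = c(2.5, -1.7, 0.3, -3.1, 1.0, -0.8),
  FDR    = c(0.01, 0.02, 0.001, 0.5, 0.1, 0.03)
)

# Two-sided: significant peaks that are higher in either group
two_sided <- pairwise$FDR <= 0.1 & abs(pairwise$Log2FC) >= 1
# One-sided: significant peaks that are higher in the useGroups group only
one_sided <- pairwise$FDR <= 0.1 & pairwise$Log2FC >= 1

# Peaks recovered only by the two-sided filter are higher in the bgdGroups group
only_two_sided <- setdiff(pairwise$peak[two_sided], pairwise$peak[one_sided])
only_two_sided
```

Here the two-sided filter keeps one extra peak, the one that is significantly higher in the background group.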

In the next chapter, we continue this differential analysis by looking for motif enrichments in the differentially accessible peaks.

Reference:

https://www.archrproject.com/
