RPM(CPM)/RPKM/FPKM/TPM
RPM/CPM
RPM/CPM: Reads/Counts of exon model per million mapped reads
Calculate Formula:
RPM = Total exon reads / Mapped reads (millions)
One consequence follows directly: at the same expression level, a longer gene collects more reads.
So we calculate RPKM to remove the effect of gene length.
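The RPM formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `rpm`, the gene names, and the counts are all hypothetical, and the sample's total mapped reads is assumed to equal the sum of the listed counts.

```python
def rpm(counts, per=1_000_000):
    """Scale raw read counts of one sample to reads per `per` mapped reads."""
    # Assumption: every mapped read of the sample falls in one of the listed genes.
    total_mapped = sum(counts.values())
    return {gene: n * per / total_mapped for gene, n in counts.items()}

# Hypothetical counts for one sample (illustration only).
sample = {"geneA": 500, "geneB": 1000, "geneC": 2500}
sample_rpm = rpm(sample)
```

Because RPM only divides by sequencing depth, the values of one sample always add up to the scaling constant (one million here); gene length plays no role yet.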
RPKM
RPKM: Reads Per Kilobase of exon model per Million mapped reads
Range of Use: Single-end RNA-seq
Calculate Formula:
RPKM = Total exon reads / [Mapped reads (millions) * Exon length (kb)]
Example of Calculating RPKM
Gene B is twice as long as gene A, and that might explain why it always gets twice as many reads, regardless of replicate.
Sample 3 has far more reads than the other replicates, regardless of the gene.
RPKM-Step1: Normalize for read depth
For this 4-gene example, we scale the total read counts by 10 instead of 1,000,000.
Originally, 1,000,000 was picked just because it made the numbers look nice (i.e. they didn't require too many decimal places).
RPM: read counts scaled using the 'per million' factors.
RPKM-Step2: Normalize for gene length
RPKM: reads scaled for depth (M) and gene length (K).
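The two RPKM steps can be sketched as follows. The function name `rpkm`, the gene lengths, and the counts are hypothetical values chosen for illustration; they are not the table from the original example. As in the text, the demo scales by 10 rather than 1,000,000, and total mapped reads is assumed to be the sum of the listed counts.

```python
def rpkm(counts, lengths_kb, per=1_000_000):
    """Return RPKM-style values for one sample.

    counts: gene -> raw read count
    lengths_kb: gene -> exon length in kilobases
    """
    # Step 1: normalize for read depth (the "per million" factor).
    depth_factor = sum(counts.values()) / per
    rpm = {gene: n / depth_factor for gene, n in counts.items()}
    # Step 2: normalize for gene length.
    return {gene: rpm[gene] / lengths_kb[gene] for gene in counts}

# Hypothetical values for illustration only.
lengths_kb = {"A": 2, "B": 4, "C": 1, "D": 10}
rep1 = {"A": 10, "B": 20, "C": 5, "D": 5}
rep3 = {"A": 30, "B": 60, "C": 15, "D": 15}  # same proportions, 3x the depth

rpkm_rep1 = rpkm(rep1, lengths_kb, per=10)
rpkm_rep3 = rpkm(rep3, lengths_kb, per=10)
```

In this toy data gene B is twice as long as gene A and has twice the reads, so both end up with the same value after step 2, and the deeper replicate gives the same values as the shallow one after step 1.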
FPKM
RPKM and FPKM are two very closely related terms.
RPKM = Reads Per Kilobase Million
FPKM = Fragments Per Kilobase Million
RPKM is for single-end RNA-seq.
FPKM is for paired-end RNA-seq.
Differences
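For single-end data, RPKM and FPKM are essentially identical.
For paired-end data, if both reads of a pair are mapped, FPKM counts the pair as one fragment (RPKM would count it as 2); if only one read of the pair is mapped, that mapped read is counted as one fragment.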
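The counting rule for paired-end data can be made concrete with a small sketch. The representation is an assumption made for illustration: each read pair is a tuple of two booleans saying whether mate 1 and mate 2 mapped, and `count_reads_and_fragments` is a hypothetical helper, not a function from any aligner or counting tool.

```python
def count_reads_and_fragments(pairs):
    """Count mapped reads (RPKM-style) and mapped fragments (FPKM-style).

    pairs: iterable of (mate1_mapped, mate2_mapped) booleans.
    """
    # Every mapped mate counts as one read.
    reads = sum(int(m1) + int(m2) for m1, m2 in pairs)
    # A pair counts as one fragment if at least one mate mapped.
    fragments = sum(1 for m1, m2 in pairs if m1 or m2)
    return reads, fragments

# Hypothetical pairs: both mapped, one mapped, none mapped, both mapped.
pairs = [(True, True), (True, False), (False, False), (True, True)]
reads, fragments = count_reads_and_fragments(pairs)
```

Here the four pairs yield 5 mapped reads but only 3 fragments, which is the whole difference between the numerators of RPKM and FPKM.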
TPM
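TPM is like RPKM and FPKM, except the order of operations is switched.
Comparing the TPM and FPKM formulas therefore shows that the FPKM denominator does not account for gene length, so TPM better matches our definition of relative expression.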
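For reference, the two formulas being compared can be written side by side. The symbols are assumptions introduced here, not notation from the original: X_i is the number of fragments assigned to gene i, L_i is its exon length in kb, and N is the total number of mapped fragments in the sample.

```latex
\mathrm{FPKM}_i = \frac{X_i}{\dfrac{N}{10^{6}} \cdot L_i}
\qquad
\mathrm{TPM}_i = \frac{X_i / L_i}{\sum_j X_j / L_j} \cdot 10^{6}
```

In FPKM the sample-wide term N is a plain fragment count, so gene length enters only through L_i. In TPM the sample-wide term is the sum of length-scaled counts, so the values of every sample add up to the same total of 10^6.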
Example of Calculating TPM
TPM-Step1: Normalize for gene length
RPK: reads scaled by gene length (K).
TPM-Step2: Normalize for sequencing depth
TPM: reads scaled by gene length (K) and sequencing depth (M).
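The two TPM steps can be sketched in the same style. The function name `tpm`, the gene lengths, and the counts are hypothetical and serve only as an illustration; the demo again scales by 10 instead of 1,000,000 to keep the numbers small.

```python
def tpm(counts, lengths_kb, per=1_000_000):
    """Return TPM-style values for one sample.

    counts: gene -> raw read count
    lengths_kb: gene -> exon length in kilobases
    """
    # Step 1: normalize for gene length (reads per kilobase, RPK).
    rpk = {gene: n / lengths_kb[gene] for gene, n in counts.items()}
    # Step 2: normalize for sequencing depth, using the sum of RPK values.
    depth_factor = sum(rpk.values()) / per
    return {gene: value / depth_factor for gene, value in rpk.items()}

# Hypothetical values for illustration only.
lengths_kb = {"A": 2, "B": 4, "C": 1, "D": 10}
rep1 = {"A": 10, "B": 20, "C": 5, "D": 5}
rep2 = {"A": 12, "B": 24, "C": 8, "D": 2}

tpm_rep1 = tpm(rep1, lengths_kb, per=10)
tpm_rep2 = tpm(rep2, lengths_kb, per=10)
```

Because the depth factor is computed after length scaling, the values of each sample add up to the same total (10 in this demo, one million with the default), which makes proportions directly comparable between samples.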