R code for drawing a heatmap from RNA-seq data

yjt2004us 2018-06-22


By 伍鴻榮

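To draw a heatmap, we first need to prepare the data, for example TCGA RNA-seq data or data you have generated yourself. A differential expression analysis can then be run with the DESeq package. The example used here is the author's gene expression profiling data on the effect of azacitidine on AML3 cells.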

Data filtering: plotting all 5704 genes with an FDR-adjusted p-value <… on the heatmap

Read the count matrix and DESeq table into R and merge into one table
Sort based on p-value with most significant genes on top
Select the columns containing gene name and raw counts
Scale the data per row
Select the top 100 genes by significance
Generate the heatmap with mostly default values
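Of these steps, per-row scaling is probably the least obvious in R, because `scale()` centers and scales columns rather than rows. The following is a minimal sketch on a made-up count matrix (the gene names, the Poisson counts and the variable names `counts` and `scaled` are assumptions for illustration, not the real data): transpose, scale, then transpose back.

```r
# Toy count matrix: 4 hypothetical genes (rows) x 5 samples (columns).
# The values are simulated and are NOT taken from the real data set.
set.seed(1)
counts <- matrix(rpois(20, lambda = 50), nrow = 4,
                 dimnames = list(paste0("gene", 1:4),
                                 c("untr1", "untr2", "untr3", "aza1", "aza2")))

# scale() works column-wise, so transpose, scale, and transpose back
scaled <- t(scale(t(counts)))

# After scaling, every row has mean 0 and standard deviation 1
print(round(rowMeans(scaled), 10))
print(apply(scaled, 1, sd))
```

One caveat with this approach: a gene whose counts are identical in every sample has zero standard deviation and scales to `NaN`, so such rows would likely need to be dropped before clustering.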

Here is the R script I will use:

#read in the count matrix

mx<-read.table('aza_aml3_countmatrix.xls', row.names=1, header=TRUE)

#read in the DESeq DGE spreadsheet

dge<-read.table('deb_deseq.xls', row.names=1, header=TRUE)

#merge the counts onto the DGE spreadsheet

mg<-merge(dge, mx, by='row.names')

#sort the merged table by p-value

smg<-mg[order(mg$pval), ]

#select only the columns containing the gene names and count data

#note: the column list is cut off after 'aza2' in the source; append any remaining sample columns

x<-subset(smg, select=c('Row.names', 'untr1', 'untr2', 'untr3', 'aza1', 'aza2'))

#make the table a data frame with gene names then remove duplicate gene name column

y<-as.data.frame(x, row.names=x$Row.names)

x<-subset(y, select=-Row.names)

#scale rows

xt<-t(x)

xts<-scale(xt)

xtst<-t(xts)

#only grab top 100 by p-value

h<-head(xtst, n=100)

#set layout options - adjust if labels get cut off

pdf('heatmap.pdf',width=7, height=8)

#draw heatmap allowing larger margins and adjusting row label font size

heatmap(h, margins = c(4,10), cexRow=.4)

#output plot to file

dev.off()

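As you can see, the heatmap shows strong expression changes for these 100 most significantly differentially expressed genes. Note also that most of the top 100 genes are down-regulated.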
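The share of down-regulated genes can also be checked numerically rather than by eye. The sketch below is only one possible way to do it, and it is not part of the original script: it uses a small made-up matrix `h_toy` in place of the real `h`, and it assumes that treated and untreated samples can be told apart by the `aza` and `untr` prefixes of the column names.

```r
# Hypothetical row-scaled matrix standing in for `h` (3 genes x 5 samples).
# The values are invented for illustration only.
h_toy <- rbind(
  geneA = c( 0.9,  1.0,  0.8, -1.3, -1.4),
  geneB = c( 1.1,  0.7,  0.9, -1.2, -1.5),
  geneC = c(-0.8, -1.0, -0.9,  1.3,  1.4)
)
colnames(h_toy) <- c("untr1", "untr2", "untr3", "aza1", "aza2")

# Locate untreated and treated columns by their name prefix (an assumption)
untr_cols <- grep("^untr", colnames(h_toy))
aza_cols  <- grep("^aza",  colnames(h_toy))

# A negative difference means lower expression in treated than in untreated samples
diff_means <- rowMeans(h_toy[, aza_cols, drop = FALSE]) -
  rowMeans(h_toy[, untr_cols, drop = FALSE])
direction <- ifelse(diff_means < 0, "down", "up")

# Tally genes per direction
print(table(direction))
```

If the DESeq table already contains a fold-change column, tallying its sign for the top 100 rows would be a more direct alternative.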
