[Original] STRUCTURE 2.3.4 Software User Guide

Breeding Data Analysis 2021-11-18

1. Software download

structure

On Windows, install the desktop version (graphical front end); on Linux, install the terminal version (without front end).

2. Installation guide

Windows

Double-click the installer and follow the prompts. After installation, a shortcut appears on the desktop.

Mac OS X
The package is distributed as Structure.dmg. Double-click it to open the software, then create a shortcut to it.

Unix/Linux

Download the package, extract it, and change into the extracted folder. It contains the structure executable, which you run with ./structure.

wget https://web./group/pritchardlab/structure_software/release_versions/v2.3.4/release/structure_linux_console.tar.gz
tar zxvf structure_linux_console.tar.gz
cd console/
./structure

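Output like the following means the program itself runs. The error at the end only says that no data file named infile has been supplied yet: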

(base) [dengfei@localhost console]$ ./structure

----------------------------------------------------
STRUCTURE by Pritchard, Stephens and Donnelly (2000)
and Falush, Stephens and Pritchard (2003)
Code by Pritchard, Falush and Hubisz
Version 2.3.4 (Jul 2012)
----------------------------------------------------

Reading file "mainparams".
datafile is
infile
Reading file "extraparams".

Note: RANDOMIZE is set to 1. The random number generator will be initialized using the system clock, ignoring any specified value of SEED.

Unable to open the file infile.

Exiting the program due to error(s) listed above.
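
The run above stops because the default mainparams points at a data file called infile. As a minimal sketch, the snippet below assembles a console command that passes the data file and its dimensions explicitly. It assumes the -K, -L, -N, -i and -o flags described in the STRUCTURE manual; the file names, counts and K value are hypothetical, so check everything against your own copy of the documentation before relying on it.

```python
import shlex


def structure_command(infile, outfile, num_inds, num_loci, k, exe="./structure"):
    """Build (but do not run) a STRUCTURE console command line.

    Assumption: -K, -L, -N, -i and -o override MAXPOPS, NUMLOCI, NUMINDS,
    INFILE and OUTFILE from mainparams, as described in the STRUCTURE manual.
    """
    return [
        exe,
        "-K", str(k),
        "-L", str(num_loci),
        "-N", str(num_inds),
        "-i", infile,
        "-o", outfile,
    ]


# Hypothetical file names and values; the counts must match your own data file.
cmd = structure_command("my_data.strct_in", "my_run_k3", num_inds=10, num_loci=50, k=3)
print(shlex.join(cmd))
```

The command is only printed here. Running it still requires a mainparams and extraparams file in the working directory, as the console output above shows.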

3. Example data

For the example data we use the admixture data set, which is in SNP format. For details see:

Admixture user documentation cookbook

The five types of data that Structure can handle:

Simulated microsatellite test data
AFLP data from whitefish.
SNP and microsatellite data from the HGDP.
Thrush data from original Structure paper
Simulated microsatellite data with location information

4. Converting plink data to structure format

.recode.strct_in (Structure format)
Produced by "--recode structure", for use by Structure. This format cannot be loaded by PLINK.

A text file with two header lines: the first header line lists all V variant IDs, while each entry in the second line is the difference between the current variant's base-pair coordinate and the previous variant's bp coordinate (or -1 when the current variant starts a new chromosome). This is followed by one line per sample with the following 2V+2 fields:

1. Within-family ID
2. Positive integer, unique for each FID
3-(2V+2). Genotype calls, with the A1 allele coded as '1', A2 = '2', and missing = '0'
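
To make the second header line concrete, here is a small sketch that derives it from a list of (chromosome, bp) positions, following the rule quoted above: the distance to the previous variant, or -1 at the start of a new chromosome. The positions are made up for illustration and the function name is hypothetical.

```python
def map_distance_header(variants):
    """Return the second header line values for (chromosome, bp) pairs.

    Each value is the bp distance to the previous variant, or -1 when the
    variant is the first one on its chromosome.
    """
    distances = []
    prev_chrom, prev_bp = None, None
    for chrom, bp in variants:
        if chrom != prev_chrom:
            distances.append(-1)
        else:
            distances.append(bp - prev_bp)
        prev_chrom, prev_bp = chrom, bp
    return distances


# Hypothetical positions: three variants on chromosome 1, two on chromosome 2.
variants = [("1", 1000), ("1", 1500), ("1", 4000), ("2", 200), ("2", 900)]
print(map_distance_header(variants))  # [-1, 500, 2500, -1, 700]

# Each sample line then carries 2V+2 fields: ID, integer, and two calls per variant.
V = len(variants)
print(2 * V + 2)  # 12
```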

Usage:

Pass the --recode structure option. The result is a file with the .recode.strct_in suffix.
plink --file name --recode structure --out result
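
If the conversion is part of a script, the same call can be assembled in Python. This is only a sketch: the helper name is hypothetical, it assumes plink is on the PATH, and it merely builds the command together with the output file name that the suffix rule above implies.

```python
from pathlib import Path


def plink_recode_structure(prefix, out, binary=False):
    """Return the PLINK command and the file it is expected to write."""
    # --bfile reads .bed/.bim/.fam, --file reads .ped/.map
    input_flag = "--bfile" if binary else "--file"
    cmd = ["plink", input_flag, prefix, "--recode", "structure", "--out", out]
    expected = Path(out + ".recode.strct_in")
    return cmd, expected


cmd, expected = plink_recode_structure("name", "result")
print(" ".join(cmd))
print(expected)
# To actually run it (requires plink to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```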

Mega2 can also do the format conversion. The procedure is described here:

https://watson.hgen./docs/conversions/frame_ext_structure.html

5. Testing with the admixture data

Inspect the data:

(base) [dengfei@localhost test]$ ls
hapmap3.bed hapmap3.bim hapmap3.fam hapmap3.map

Convert the format with plink:

plink --bfile hapmap3 --recode structure --out test_structure

This produces the file test_structure.recode.strct_in, which is used in the steps below.

The data format looks like this (only the beginning of each line is shown):

rs10458597 rs12562034 rs2710875 rs11260566 rs1312568 rs35154105 rs16824508 rs2678939 rs7553178 rs133763
-1 203827 209332 200465 206966 213697 200280 201401 204163 202132 226411 200445 201484 200329 205708 20
NA19916 1 2 2 2 2 1 1 2 2 2 2 2 2 1 2 1 2 2 2 1 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 1 2 1 1 1 2 1 2 1 2 1 2 1
NA19835 2 2 2 1 2 1 2 1 2 2 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 2 2 2 2 1 2 1
NA20282 3 2 2 2 2 1 2 1 2 1 2 2 2 2 2 1 1 1 2 2 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 1 2 1 1 2 2 1
NA19703 4 2 2 2 2 1 2 2 2 1 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 2 1 2 2 2 1 2 1 2 2 2 2
NA19901 5 2 2 2 2 1 2 1 2 2 2 2 2 2 2 1 1 2 2 1 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 1 2 1 1 1 2 1 1 2
NA19908 6 2 2 1 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1
NA19914 7 2 2 2 2 2 2 2 2 1 1 2 2 2 2 1 1 1 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 1
NA20287 8 2 2 2 2 1 1 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 1 2 1 2 1 2 1 1 2 2 2 2 2
NA19713 9 2 2 2 2 1 2 1 2 2 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 1 1 2 2 1 2 2 2 2 2 1 2 1 1 2 2 1 2 1 2 2 2 2
NA19904 10 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 1 2 2 2 2 2 2 2 2 2 0 0
NA19917 1 2 2 2 2 1 2 2 2 2 2 2 2 2 2 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1
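
To read such a sample line programmatically, a sketch like the following may help. It assumes that the two allele calls of each variant sit next to each other on the line; the function name is hypothetical. It raises an error for an odd number of calls, which is what a line cut off mid-genotype (as in the display above) would produce.

```python
def parse_sample_line(line):
    """Split one sample line into (sample_id, group_number, genotypes).

    genotypes is a list of (allele1, allele2) tuples, one per variant,
    using the codes 1 = A1, 2 = A2, 0 = missing.
    """
    fields = line.split()
    sample_id, group = fields[0], int(fields[1])
    calls = [int(x) for x in fields[2:]]
    if len(calls) % 2 != 0:
        raise ValueError(f"{sample_id}: odd number of allele calls (line truncated?)")
    genotypes = list(zip(calls[0::2], calls[1::2]))
    return sample_id, group, genotypes


# First eight allele calls of sample NA19916 from the listing above.
sample_id, group, genotypes = parse_sample_line("NA19916 1 2 2 2 2 1 1 2 2")
print(sample_id, group, genotypes)  # NA19916 1 [(2, 2), (2, 2), (1, 1), (2, 2)]
```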

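6. Importing the data into the structure software

Double-click to open the software.

Click File, then New Project.

Enter a project name: name1 (any name you like)
Select the folder that contains the file
Select the file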

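The data can be previewed before import. Here it shows:
Number of individuals: 324
Number of SNPs: 13928
Missing-data value: -999 (the PLINK export codes missing genotypes as 0, so check that this setting matches your file)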
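
The numbers in the preview can be cross-checked against the converted file before importing it. The sketch below counts individuals, loci and missing allele calls in a .recode.strct_in stream. It assumes the layout described in section 4 (two header lines, then 2V+2 fields per sample, missing coded as 0) and a complete, untruncated file; the function name and the tiny demo data are made up.

```python
import io


def summarize_strct_in(handle, missing_code="0"):
    """Count individuals, loci and missing allele calls in a .recode.strct_in stream."""
    marker_ids = handle.readline().split()   # header line 1: variant IDs
    handle.readline()                        # header line 2: map distances (unused here)
    n_loci = len(marker_ids)
    n_inds = 0
    n_missing = 0
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2 * n_loci + 2:
            raise ValueError(
                f"{fields[0]}: expected {2 * n_loci + 2} fields, got {len(fields)}"
            )
        n_inds += 1
        n_missing += fields[2:].count(missing_code)
    return {"individuals": n_inds, "loci": n_loci, "missing_calls": n_missing}


# Tiny made-up file with 3 variants and 2 samples, for illustration only.
demo = io.StringIO(
    "rsA rsB rsC\n"
    "-1 100 250\n"
    "S1 1 2 2 1 2 0 0\n"
    "S2 2 1 1 2 2 2 2\n"
)
print(summarize_strct_in(demo))
# {'individuals': 2, 'loci': 3, 'missing_calls': 2}
```

For a real file, open it with open("test_structure.recode.strct_in") and pass the handle instead of the StringIO object; the reported individuals and loci should then agree with the values shown in the preview.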