This shows you the differences between two versions of the page.
genetica:pre_1kg [2013/05/02 14:26] osotolongo [DUPLICATE MARKERS FOUND] |
genetica:pre_1kg [2020/08/04 10:58] (current) |
||
---|---|---|---|
Line 161: | Line 161: | ||
First the alleles must be recoded to ACGT, and along the way we convert them to binary (only necessary if we converted using the first method) | First the alleles must be recoded to ACGT, and along the way we convert them to binary (only necessary if we converted using the first method) | ||
+ | |||
+ | **Caution:** The files must be listed in the correct order so that the resulting file starts with chromosome 1 (the //-v// option of //ls//) | ||
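The ordering issue in the caution above can be checked on a few empty files before touching real data. This is only a sketch: the `chrN.bed` names are hypothetical stand-ins for the per-chromosome files, and `-v` is assumed to be the GNU `ls` version sort.

```bash
# Hypothetical per-chromosome file names, created empty in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/chr1.bed" "$demo_dir/chr2.bed" "$demo_dir/chr10.bed" "$demo_dir/chr22.bed"

# Plain ls compares names character by character, so chr10 sorts before chr2.
plain_order=$(cd "$demo_dir" && ls *.bed | paste -sd ' ' -)

# GNU ls -v compares embedded numbers numerically, so chr2 sorts before chr10.
natural_order=$(cd "$demo_dir" && ls -v *.bed | paste -sd ' ' -)

echo "ls:    $plain_order"
echo "ls -v: $natural_order"
rm -rf "$demo_dir"
```

If the second line does not start with chromosome 1 followed by 2, the merge list built from it will be out of order as well.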
<code bash> | <code bash> | ||
Line 172: | Line 174: | ||
<code bash> | <code bash> | ||
- | afr=(`ls *.bed`); | + | afr=(`ls |
for (( i=1; i< | for (( i=1; i< | ||
do | do | ||
Line 179: | Line 181: | ||
done; | done; | ||
x=${afr[0]}; | x=${afr[0]}; | ||
- | plink --bfile ${x%.bed} --merge-list allfiles.txt --make-bed --out 1000genome_CEU_merged | + | plink --bfile ${x%.bed} --merge-list allfiles.txt --make-bed --out 1000genome_all_merged |
</ | </ | ||
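The loop above is cut off in this diff, so here is one way the merge list could be put together; it is a guess at the idea, not the original code. It assumes filesets named `chrN.bed/.bim/.fam` (hypothetical names) and relies on the fact that `--merge-list` expects one fileset per line, with the `.bed`, `.bim` and `.fam` names for a binary fileset. The first fileset is left out of the list because it is the one passed to `--bfile`.

```bash
# Scratch directory with empty, hypothetical filesets for chromosomes 1, 2 and 10.
work_dir=$(mktemp -d)
for c in 1 2 10; do
    touch "$work_dir/chr$c.bed" "$work_dir/chr$c.bim" "$work_dir/chr$c.fam"
done

# First file in natural order: this becomes the --bfile base.
first=$(cd "$work_dir" && ls -v *.bed | head -n 1)
base=${first%.bed}

# Every remaining fileset goes into the merge list, one per line.
(cd "$work_dir" && ls -v *.bed | tail -n +2) | while read -r bed; do
    prefix=${bed%.bed}
    echo "$prefix.bed $prefix.bim $prefix.fam"
done > "$work_dir/allfiles.txt"

echo "base fileset: $base"
cat "$work_dir/allfiles.txt"
# The merge itself would then be run from inside the data directory, e.g.:
# plink --bfile "$base" --merge-list allfiles.txt --make-bed --out 1000genome_all_merged
```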
+ | ===== Selecting only the Europeans ====== | ||
+ | |||
+ | What I want is to select the [[http:// | ||
+ | |||
+ | < | ||
+ | $ awk {'if ($3==" | ||
+ | $ grep -f individuals.txt ../ | ||
+ | $ plink --bfile ../ | ||
+ | |||
+ | @----------------------------------------------------------@ | ||
+ | | PLINK! | ||
+ | |----------------------------------------------------------| | ||
+ | | (C) 2009 Shaun Purcell, GNU General Public License, v2 | | ||
+ | |----------------------------------------------------------| | ||
+ | | For documentation, | ||
+ | | http:// | ||
+ | @----------------------------------------------------------@ | ||
+ | |||
+ | Web-based version check ( --noweb to skip ) | ||
+ | Connecting to web... | ||
+ | |||
+ | Writing this text to log file [ 1000genome_eur.log ] | ||
+ | Analysis started: Mon May 6 16:38:21 2013 | ||
+ | |||
+ | Options in effect: | ||
+ | --bfile ../ | ||
+ | --keep eur_pop.txt | ||
+ | --make-bed | ||
+ | --out 1000genome_eur | ||
+ | |||
+ | Reading map (extended format) from [ ../ | ||
+ | 39706712 markers to be included from [ ../ | ||
+ | Reading pedigree information from [ ../ | ||
+ | 1092 individuals read from [ ../ | ||
+ | 0 individuals with nonmissing phenotypes | ||
+ | Assuming a disease phenotype (1=unaff, 2=aff, 0=miss) | ||
+ | Missing phenotype value is also -9 | ||
+ | 0 cases, 0 controls and 1092 missing | ||
+ | 0 males, 0 females, and 1092 of unspecified sex | ||
+ | Warning, found 1092 individuals with ambiguous sex codes | ||
+ | Writing list of these individuals to [ 1000genome_eur.nosex ] | ||
+ | Reading genotype bitfile from [ ../ | ||
+ | Detected that binary PED file is v1.00 SNP-major mode | ||
+ | Reading individuals to keep [ eur_pop.txt ] ... 379 read | ||
+ | 713 individuals removed with --keep option | ||
+ | Before frequency and genotyping pruning, there are 39706712 SNPs | ||
+ | 379 founders and 0 non-founders found | ||
+ | Total genotyping rate in remaining individuals is 1 | ||
+ | 0 SNPs failed missingness test ( GENO > 1 ) | ||
+ | 0 SNPs failed frequency test ( MAF < 0 ) | ||
+ | After frequency and genotyping pruning, there are 39706712 SNPs | ||
+ | After filtering, 0 cases, 0 controls and 379 missing | ||
+ | After filtering, 0 males, 0 females, and 379 of unspecified sex | ||
+ | Writing pedigree information to [ 1000genome_eur.fam ] | ||
+ | Writing map (extended format) information to [ 1000genome_eur.bim ] | ||
+ | Writing genotype bitfile to [ 1000genome_eur.bed ] | ||
+ | Using (default) SNP-major mode | ||
+ | |||
+ | Analysis finished: Mon May 6 16:59:16 2013 | ||
+ | |||
+ | </ | ||
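The commands at the top of this block are truncated here, so the following is a small stand-alone sketch of the same kind of filtering on toy data. The panel layout (sample ID, population, super-population in columns 1 to 3), the `EUR` label, the sample IDs and the `panel.txt` / `toy.fam` names are all assumptions made for illustration; only `individuals.txt` and `eur_pop.txt` come from the commands and log above. The `-w` flag is an addition so that an ID such as `S1` does not also match `S10`.

```bash
keep_dir=$(mktemp -d)

# Toy panel file: sample ID, population, super-population (made-up rows).
printf '%s\n' 'S1 GBR EUR' 'S2 YRI AFR' 'S3 TSI EUR' 'S10 CHB ASN' > "$keep_dir/panel.txt"

# Toy .fam file: FID IID father mother sex phenotype (made-up rows).
printf '%s\n' 'S1 S1 0 0 0 -9' 'S2 S2 0 0 0 -9' 'S3 S3 0 0 0 -9' 'S10 S10 0 0 0 -9' > "$keep_dir/toy.fam"

# Step 1: IDs of the individuals whose third column is EUR.
awk '$3 == "EUR" { print $1 }' "$keep_dir/panel.txt" > "$keep_dir/individuals.txt"

# Step 2: matching .fam lines; --keep only reads the first two columns (FID, IID).
grep -w -f "$keep_dir/individuals.txt" "$keep_dir/toy.fam" > "$keep_dir/eur_pop.txt"

n_kept=$(wc -l < "$keep_dir/eur_pop.txt" | tr -d ' ')
echo "individuals to keep: $n_kept"
cat "$keep_dir/eur_pop.txt"
```

The count printed here should agree with the "individuals to keep" figure that PLINK reports when the file is passed to `--keep`.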
===== and that's that ===== | ===== and that's that ===== | ||
Next we have to keep merging the DB we have with our own data: [[plink_1kg_impute| so that we can impute]] | Next we have to keep merging the DB we have with our own data: [[plink_1kg_impute| so that we can impute]] | ||
- | ===== nasty stuff that can happen ====== | + | ====== nasty stuff that can happen ======= |
==== DUPLICATE MARKERS FOUND ==== | ==== DUPLICATE MARKERS FOUND ==== | ||
Line 264: | Line 327: | ||
</ | </ | ||
+ | <del> | ||
so I go ahead and convert it to ASCII. | so I go ahead and convert it to ASCII. | ||
Line 269: | Line 333: | ||
$ plink --bfile all_chr22 --recode --out all_chr22 | $ plink --bfile all_chr22 --recode --out all_chr22 | ||
</ | </ | ||
+ | </ | ||
- | and then I edit the duplicate markers by hand, naming one //a// and the other //b//. | + | I edit the duplicate markers by hand |
Example: | Example: | ||
< | < | ||
- | 22 rs11457237a 0 34030843 | + | 22 rs11457237 0 34030843 |
22 rs11457237b 0 34030846 | 22 rs11457237b 0 34030846 | ||
</ | </ | ||
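Before editing by hand, it can help to list which marker IDs actually occur more than once. The sketch below does that on a toy file; the `toy.bim` name, the allele columns and the `rs_unique` row are invented for illustration, while the duplicated ID and the two positions are the ones from the example above.

```bash
dup_dir=$(mktemp -d)

# Toy .bim-style file: chromosome, marker ID, genetic distance, position, allele 1, allele 2.
printf '22\trs11457237\t0\t34030843\tA\tG\n'  > "$dup_dir/toy.bim"
printf '22\trs11457237\t0\t34030846\tA\tG\n' >> "$dup_dir/toy.bim"
printf '22\trs_unique\t0\t34030900\tC\tT\n'  >> "$dup_dir/toy.bim"

# Marker IDs (column 2) that appear on more than one line.
dup_ids=$(awk '{ print $2 }' "$dup_dir/toy.bim" | sort | uniq -d)
echo "duplicated marker IDs: $dup_ids"
```

Each ID printed needs one of its copies renamed (or removed) before the merge will go through.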
- | Now let's see whether they are the same, | + | < |
< | < | ||
$ plink --file all_chr22 --twolocus rs11457237a rs11457237b --allow-no-sex | $ plink --file all_chr22 --twolocus rs11457237a rs11457237b --allow-no-sex | ||
+ | </ | ||
+ | |||
+ | Well, I did not understand this part, so I am going to delete one of each pair of duplicates using the L'Oréal criterion (//Because I'm worth it//). | ||
+ | |||
+ | < | ||
+ | $ cat rmsnps.txt | ||
+ | rs11457237b | ||
+ | rs113940759b | ||
+ | rs71904485b | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | $ plink --bfile all_chr1 --merge-list / | ||
</ | </ |