genetica:pre_1kg

Differences

This shows you the differences between two versions of the page.

genetica:pre_1kg [2013/05/03 08:32]
osotolongo [DUPLICATE MARKERS FOUND]
genetica:pre_1kg [2020/08/04 10:58] (current)
Line 161: Line 161:
  
 First, the alleles must be recoded to ACGT, and in the same step we convert them to binary (only needed if we converted using the first method)
 +
 +** Caution: ** The files must be listed in the right order so that the resulting file starts with chromosome 1 (the //-v// option of //ls//)
 
 <code bash>
Line 172: Line 174:
  
 <code bash>
-afr=(`ls *.bed`);
+afr=(`ls -v all_chr*.bed`);
 for (( i=1; i<${#afr[@]}; i++ ));
 do
Line 179: Line 181:
 done;
 x=${afr[0]};
-plink --bfile ${x%.bed} --merge-list allfiles.txt --make-bed --out 1000genome_CEU_merged
+plink --bfile ${x%.bed} --merge-list allfiles.txt --make-bed --out 1000genome_all_merged
 </code>
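
The diff does not show the body of the loop, so what follows is only a sketch of how the merge list could be built in natural chromosome order. It assumes per-chromosome filesets named all_chr<N>.bed/.bim/.fam (as the `ls -v all_chr*.bed` call suggests), a GNU `ls` that supports `-v`, and the three-column `bed bim fam` line format that `--merge-list` accepts for binary filesets. The function name `build_merge_list` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes every fileset except the first to the merge list,
# because the first one is handed to plink through --bfile.
build_merge_list() {
    local dir=$1 out=$2
    local -a beds
    local i prefix
    # ls -v sorts naturally, so all_chr2 comes before all_chr10
    mapfile -t beds < <(cd "$dir" && ls -v all_chr*.bed)
    : > "$out"
    for (( i = 1; i < ${#beds[@]}; i++ )); do
        prefix=${beds[i]%.bed}
        echo "$prefix.bed $prefix.bim $prefix.fam" >> "$out"
    done
    # Print the prefix of the first fileset (expected to be chromosome 1)
    echo "${beds[0]%.bed}"
}
```

Run from the directory that holds the filesets, `first=$(build_merge_list . allfiles.txt)` should leave `all_chr1` in `$first`, which can then be passed as `plink --bfile "$first" --merge-list allfiles.txt --make-bed --out 1000genome_all_merged`. It is worth checking `$first` before starting the merge, since a wrong first file is exactly what the caution is about.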
  
 +===== Selecting only the Europeans =====
 +
 +What I want is to select the [[http://www.1000genomes.org/category/frequently-asked-questions/samples|European population]] out of the whole study.
 +
 +<code>
 +$ awk {'if ($3=="EUR") print $1'} /media/1000Genome/phase1_integrated_calls.20101123.ALL.panel > individuals.txt
 +$ grep -f individuals.txt ../ALL/all_chr1.fam > eur_pop.txt
 +$ plink --bfile ../ALL/1000genome_all_merged --keep eur_pop.txt --make-bed --out 1000genome_eur
 +
 +@----------------------------------------------------------@
 +|        PLINK!           v1.07      |   10/Aug/2009     |
 +|----------------------------------------------------------|
 +|  (C) 2009 Shaun Purcell, GNU General Public License, v2  |
 +|----------------------------------------------------------|
 +|  For documentation, citation & bug-report instructions:  |
 +|        http://pngu.mgh.harvard.edu/purcell/plink/        |
 +@----------------------------------------------------------@
 +
 +Web-based version check ( --noweb to skip )
 +Connecting to web...  OK, v1.07 is current
 +
 +Writing this text to log file [ 1000genome_eur.log ]
 +Analysis started: Mon May  6 16:38:21 2013
 +
 +Options in effect:
 +        --bfile ../ALL/1000genome_all_merged
 +        --keep eur_pop.txt
 +        --make-bed
 +        --out 1000genome_eur
 +
 +Reading map (extended format) from [ ../ALL/1000genome_all_merged.bim ]
 +39706712 markers to be included from [ ../ALL/1000genome_all_merged.bim ]
 +Reading pedigree information from [ ../ALL/1000genome_all_merged.fam ]
 +1092 individuals read from [ ../ALL/1000genome_all_merged.fam ]
 +0 individuals with nonmissing phenotypes
 +Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
 +Missing phenotype value is also -9
 +0 cases, 0 controls and 1092 missing
 +0 males, 0 females, and 1092 of unspecified sex
 +Warning, found 1092 individuals with ambiguous sex codes
 +Writing list of these individuals to [ 1000genome_eur.nosex ]
 +Reading genotype bitfile from [ ../ALL/1000genome_all_merged.bed ]
 +Detected that binary PED file is v1.00 SNP-major mode
 +Reading individuals to keep [ eur_pop.txt ] ... 379 read
 +713 individuals removed with --keep option
 +Before frequency and genotyping pruning, there are 39706712 SNPs
 +379 founders and 0 non-founders found
 +Total genotyping rate in remaining individuals is 1
 +0 SNPs failed missingness test ( GENO > 1 )
 +0 SNPs failed frequency test ( MAF < 0 )
 +After frequency and genotyping pruning, there are 39706712 SNPs
 +After filtering, 0 cases, 0 controls and 379 missing
 +After filtering, 0 males, 0 females, and 379 of unspecified sex
 +Writing pedigree information to [ 1000genome_eur.fam ]
 +Writing map (extended format) information to [ 1000genome_eur.bim ]
 +Writing genotype bitfile to [ 1000genome_eur.bed ] 
 +Using (default) SNP-major mode
 +
 +Analysis finished: Mon May  6 16:59:16 2013
 +
 +</code>
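
Since `grep -f` matches substrings anywhere on the line, a sample ID that is a prefix of another one could select extra rows. The sketch below is a stricter alternative that compares whole fields. It takes the panel layout (sample ID in column 1, super-population in column 3) from the awk command above and assumes that column 2 of the .fam file holds that same sample ID. `--keep` only reads the first two columns (family ID and individual ID), so that is all the helper prints. The name `make_keep_list` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "FID IID" for every .fam row whose individual ID
# is labelled with the requested super-population in the panel file.
make_keep_list() {
    local pop=$1 panel=$2 fam=$3
    awk -v pop="$pop" '
        NR == FNR { if ($3 == pop) keep[$1] = 1; next }  # first file: panel
        ($2 in keep) { print $1, $2 }                    # second file: .fam
    ' "$panel" "$fam"
}
```

Something like `make_keep_list EUR /media/1000Genome/phase1_integrated_calls.20101123.ALL.panel ../ALL/all_chr1.fam > eur_pop.txt` would then replace the awk and grep steps. Counting the lines of eur_pop.txt before running plink gives a quick check against the 379 individuals reported in the log.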
 ===== and that's it =====
 
 Next, the DB we now have still has to be merged with our own data: [[plink_1kg_impute| so that we can impute]]
 
-===== nasty things that can happen ======
+====== nasty things that can happen =======
 
 ==== DUPLICATE MARKERS FOUND ====
Line 264: Line 327:
 </code>
 
 +<del>
 so I go ahead and convert it to ASCII.
 
Line 269: Line 333:
 $ plink --bfile all_chr22 --recode --out all_chr22
 </code>
 +</del>
  
-and then I edit the duplicate markers by hand, appending //a// to one and //b// to the other.
+I edit the duplicate markers by hand (** in the .bim itself **) and append a //b// to the second one.
 Example:
 <code>
-22 rs11457237a 0 34030843
+22 rs11457237 0 34030843
 22 rs11457237b 0 34030846
 </code>
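
With more than a handful of duplicates, the hand edit could probably be scripted. This is a minimal sketch and not what was actually done here: it assumes the marker ID is in column 2 (true for both .bim and .map files) and that a letter suffix is acceptable for repeated IDs. The name `rename_duplicate_markers` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a .bim/.map on stdin and appends a letter
# (b, c, ...) to the 2nd, 3rd, ... occurrence of a marker ID in column 2.
rename_duplicate_markers() {
    awk 'BEGIN { OFS = "\t" }
         {
             n = seen[$2]++            # how many times this ID appeared before
             if (n > 0)
                 $2 = $2 substr("bcdefghij", n, 1)
             else
                 $1 = $1               # rebuild the line so every row is tab-separated
             print
         }'
}
```

A call such as `rename_duplicate_markers < all_chr22.bim > all_chr22.fixed.bim` leaves the first occurrence untouched, matching the example above. The suffix string only covers ten occurrences of the same ID, and the result should be reviewed before it replaces the original .bim.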
Line 292: Line 357:
  
 <code>
-$ plink --file all_chr22 --exclude rmsnps.txt --out all_chr22_tmp --allow-no-sex
+$ plink --bfile all_chr1 --merge-list /home/osotolongo/data/test_impute/mkgendb/allfiles.txt --make-bed --out 1000genome_all_merged --exclude /home/osotolongo/data/test_impute/mkgendb/rmsnps.txt --allow-no-sex
 </code>
genetica/pre_1kg.1367569921.txt.gz · Last modified: 2020/08/04 10:48 (external edit)