genetica:pre_1kg

Differences

This shows you the differences between two versions of the page.

genetica:pre_1kg [2013/05/06 09:56]
osotolongo [Preprocessing with plink]
genetica:pre_1kg [2013/05/08 11:40]
osotolongo [Selecting only the Europeans]
Line 162: Line 162:
 First, the alleles must be recoded to ACGT, and while we are at it we convert them to binary format (this is only necessary if we converted with the first method)
 
-** Warning: ** The order of the files must be correct so that the resulting file starts with chromosome 1
+** Warning: ** The order of the files must be correct so that the resulting file starts with chromosome 1 (option //-v// of //ls//)
 
 <code bash>
Line 174: Line 174:
  
 <code bash>
-afr=(`ls *.bed`);
+afr=(`ls -v all_chr*.bed`);
 for (( i=1; i<${#afr[@]}; i++ ));
 do
Line 181: Line 181:
 done;
 x=${afr[0]};
-plink --bfile ${x%.bed} --merge-list allfiles.txt --make-bed --out 1000genome_CEU_merged
+plink --bfile ${x%.bed} --merge-list allfiles.txt --make-bed --out 1000genome_all_merged
 </code>
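
The body of the loop above is not shown in this comparison, so the following is only a minimal sketch of how the merge list could be built in natural chromosome order. It assumes the `all_chr*.bed` naming used on this page and that each line of the merge list names the `.bed`, `.bim` and `.fam` of one fileset; `build_merge_list` is a hypothetical helper, and `sort -V` (GNU version sort) stands in for `ls -v`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a PLINK merge list for every fileset except the
# first one, and print the base name of the first fileset (for --bfile).
build_merge_list() {
    local dir=$1 out=$2
    local files=() i base
    # Version sort puts all_chr2 before all_chr10, so all_chr1 comes first.
    mapfile -t files < <(printf '%s\n' "$dir"/all_chr*.bed | sort -V)
    : > "$out"
    # files[0] is passed to --bfile, so the list starts at index 1.
    for (( i=1; i<${#files[@]}; i++ )); do
        base=${files[i]%.bed}
        printf '%s.bed %s.bim %s.fam\n' "$base" "$base" "$base" >> "$out"
    done
    printf '%s\n' "${files[0]%.bed}"
}

# Demo with empty placeholder files (no real genotype data).
demo=$(mktemp -d)
touch "$demo"/all_chr{1,2,10}.bed
first=$(build_merge_list "$demo" "$demo/allfiles.txt")
echo "base fileset: $first"
cat "$demo/allfiles.txt"
```

With a plain alphabetical listing, `all_chr10` would sort before `all_chr2`, which is the ordering problem the warning above refers to.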
  
 +===== Selecting only the Europeans ======
 +
 +What I want is to select the [[http://www.1000genomes.org/category/frequently-asked-questions/samples|European population]] out of the whole study.
 +
 +<code>
 +$ awk '$3 == "EUR" {print $1}' /media/1000Genome/phase1_integrated_calls.20101123.ALL.panel > individuals.txt
 +$ grep -wFf individuals.txt ../ALL/all_chr1.fam > eur_pop.txt
 +$ plink --bfile ../ALL/1000genome_all_merged --keep eur_pop.txt --make-bed --out 1000genome_eur
 +
 +@----------------------------------------------------------@
 +|        PLINK!           v1.07      |   10/Aug/2009     |
 +|----------------------------------------------------------|
 +|  (C) 2009 Shaun Purcell, GNU General Public License, v2  |
 +|----------------------------------------------------------|
 +|  For documentation, citation & bug-report instructions:  |
 +|        http://pngu.mgh.harvard.edu/purcell/plink/        |
 +@----------------------------------------------------------@
 +
 +Web-based version check ( --noweb to skip )
 +Connecting to web...  OK, v1.07 is current
 +
 +Writing this text to log file [ 1000genome_eur.log ]
 +Analysis started: Mon May  6 16:38:21 2013
 +
 +Options in effect:
 +        --bfile ../ALL/1000genome_all_merged
 +        --keep eur_pop.txt
 +        --make-bed
 +        --out 1000genome_eur
 +
 +Reading map (extended format) from [ ../ALL/1000genome_all_merged.bim ]
 +39706712 markers to be included from [ ../ALL/1000genome_all_merged.bim ]
 +Reading pedigree information from [ ../ALL/1000genome_all_merged.fam ]
 +1092 individuals read from [ ../ALL/1000genome_all_merged.fam ]
 +0 individuals with nonmissing phenotypes
 +Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
 +Missing phenotype value is also -9
 +0 cases, 0 controls and 1092 missing
 +0 males, 0 females, and 1092 of unspecified sex
 +Warning, found 1092 individuals with ambiguous sex codes
 +Writing list of these individuals to [ 1000genome_eur.nosex ]
 +Reading genotype bitfile from [ ../ALL/1000genome_all_merged.bed ]
 +Detected that binary PED file is v1.00 SNP-major mode
 +Reading individuals to keep [ eur_pop.txt ] ... 379 read
 +713 individuals removed with --keep option
 +Before frequency and genotyping pruning, there are 39706712 SNPs
 +379 founders and 0 non-founders found
 +Total genotyping rate in remaining individuals is 1
 +0 SNPs failed missingness test ( GENO > 1 )
 +0 SNPs failed frequency test ( MAF < 0 )
 +After frequency and genotyping pruning, there are 39706712 SNPs
 +After filtering, 0 cases, 0 controls and 379 missing
 +After filtering, 0 males, 0 females, and 379 of unspecified sex
 +Writing pedigree information to [ 1000genome_eur.fam ]
 +Writing map (extended format) information to [ 1000genome_eur.bim ]
 +Writing genotype bitfile to [ 1000genome_eur.bed ] 
 +Using (default) SNP-major mode
 +
 +Analysis finished: Mon May  6 16:59:16 2013
 +
 +</code>
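
As a possible alternative to the two-step awk and grep selection, the keep file could be produced in a single awk pass that matches sample IDs exactly. This is only a sketch under assumptions: the sample ID is taken to be in column 1 of the panel file and in column 2 (IID) of the `.fam` file, `make_keep_file` is a hypothetical helper, and the IDs in the demo are invented toy values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "FID IID" for every .fam row whose IID belongs
# to the requested super-population in the panel file.
make_keep_file() {
    local panel=$1 fam=$2 superpop=$3
    awk -v pop="$superpop" '
        # First file (panel): remember the sample IDs of the wanted group.
        NR == FNR { if ($3 == pop) want[$1] = 1; next }
        # Second file (.fam): keep rows whose IID is an exact match.
        ($2 in want) { print $1, $2 }
    ' "$panel" "$fam"
}

# Demo with made-up IDs; S1 must not drag in S10 or S100 by partial match.
demo=$(mktemp -d)
printf '%s\n' 'S1 GBR EUR ILLUMINA' 'S2 YRI AFR ILLUMINA' \
              'S10 CEU EUR ILLUMINA' 'S100 CHB ASN ILLUMINA' > "$demo/toy.panel"
printf '%s\n' 'S1 S1 0 0 0 -9' 'S2 S2 0 0 0 -9' \
              'S10 S10 0 0 0 -9' 'S100 S100 0 0 0 -9' > "$demo/toy.fam"
make_keep_file "$demo/toy.panel" "$demo/toy.fam" EUR > "$demo/eur_pop.txt"
cat "$demo/eur_pop.txt"
```

The resulting file has the two columns (family ID, individual ID) that `--keep` reads, and the exact lookup guards against one ID matching as a substring of another.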
 ===== and that's it =====
 
 Now we have to go on and merge the DB we have with our own data: [[plink_1kg_impute| so that we can impute]]
  
-===== crap that can happen ======
+====== crap that can happen =======
 
 ==== DUPLICATE MARKERS FOUND ====
Line 296: Line 357:
  
 <code>
-$ plink --bfile all_chr10 --merge-list allfiles.txt --make-bed --out 1000genome_all_merged --exclude rmsnps.txt --allow-no-sex
+$ plink --bfile all_chr1 --merge-list /home/osotolongo/data/test_impute/mkgendb/allfiles.txt --make-bed --out 1000genome_all_merged --exclude /home/osotolongo/data/test_impute/mkgendb/rmsnps.txt --allow-no-sex
-
-@----------------------------------------------------------@
-|        PLINK!           v1.07      |   10/Aug/2009     |
-|----------------------------------------------------------|
-|  (C) 2009 Shaun Purcell, GNU General Public License, v2  |
-|----------------------------------------------------------|
-|  For documentation, citation & bug-report instructions:  |
-|        http://pngu.mgh.harvard.edu/purcell/plink/        |
-@----------------------------------------------------------@
-
-Web-based version check ( --noweb to skip )
-Recent cached web-check found... OK, v1.07 is current
-
-Writing this text to log file [ 1000genome_all_merged.log ]
-Analysis started: Fri May  3 12:56:36 2013
-
-Options in effect:
-        --bfile all_chr10
-        --merge-list allfiles.txt
-        --make-bed
-        --out 1000genome_all_merged
-        --exclude rmsnps.txt
-        --allow-no-sex
-
-Reading map (extended format) from [ all_chr10.bim ]
-1882663 markers to be included from [ all_chr10.bim ]
-Reading pedigree information from [ all_chr10.fam ]
-1092 individuals read from [ all_chr10.fam ]
-0 individuals with nonmissing phenotypes
-Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
-Missing phenotype value is also -9
-0 cases, 0 controls and 1092 missing
-0 males, 0 females, and 1092 of unspecified sex
-Warning, found 1092 individuals with ambiguous sex codes
-Writing list of these individuals to [ 1000genome_all_merged.nosex ]
-Reading genotype bitfile from [ all_chr10.bed ]
-Detected that binary PED file is v1.00 SNP-major mode
-Using merge mode 1 : consensus call (default)
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-Detected that binary PED file is v1.00 SNP-major mode
-
-Merging 22 samples, final sample contains 1092 individuals and 39706715 markers
-Reading list of SNPs to exclude [ rmsnps.txt ] ... 3 read
-Before frequency and genotyping pruning, there are 39706712 SNPs
-1092 founders and 0 non-founders found
-Total genotyping rate in remaining individuals is 1
-0 SNPs failed missingness test ( GENO > 1 )
-0 SNPs failed frequency test ( MAF < 0 )
-After frequency and genotyping pruning, there are 39706712 SNPs
-After filtering, 0 cases, 0 controls and 1092 missing
-After filtering, 0 males, 0 females, and 1092 of unspecified sex
-Writing pedigree information to [ 1000genome_all_merged.fam ]
-Writing map (extended format) information to [ 1000genome_all_merged.bim ]
-Writing genotype bitfile to [ 1000genome_all_merged.bed ]
-Using (default) SNP-major mode
-
-Analysis finished: Fri May  3 17:11:00 2013
 </code>
genetica/pre_1kg.txt · Last modified: 2020/08/04 10:58 (external edit)