

GAAIN

As part of the GAAIN project, we publish a portion of our data through this consortium. The output format Sergi gives me is odd, so I am going to convert it according to this legend.

The starting format looks something like this:

case;born_date;sex;age_diag_bassal;race_ethnic;AF_Demencia_A;APOE;dateNeuropsyc.1;dateNeuropsyc.2;dateNeuropsyc.3;dateNeuropsyc.4;dateNeuropsyc.5;dateNeuropsyc.6;dateNeuropsyc.7;dateNeuropsyc.8;dateNeuropsyc.9;dateNeuropsyc.10;dateNeuropsyc.11;dateNeuropsyc.12;M_RECOG_NP.1;M_RECOG_NP.2;M_RECOG_NP.3;M_RECOG_NP.4;M_RECOG_NP.5;M_RECOG_NP.6;M_RECOG_NP.7;M_RECOG_NP.8;M_RECOG_NP.9;M_RECOG_NP.10;M_RECOG_NP.11;M_RECOG_NP.12;M_RETENC_NP.1;M_RETENC_NP.2;M_RETENC_NP.3;M_RETENC_NP.4;M_RETENC_NP.5;M_RETENC_NP.6;M_RETENC_NP.7;M_RETENC_NP.8;M_RETENC_NP.9;M_RETENC_NP.10;M_RETENC_NP.11;M_RETENC_NP.12;date_followup.0;date_followup.1;date_followup.2;date_followup.3;date_followup.4;date_followup.5;date_followup.6;date_followup.7;date_followup.8;date_followup.9;date_followup.10;date_followup.11;date_followup.12;date_followup.13;date_followup.14;date_followup.15;date_followup.16;date_followup.17;date_followup.18;date_followup.19;cdr_followup.0;cdr_followup.1;cdr_followup.2;cdr_followup.3;cdr_followup.4;cdr_followup.5;cdr_followup.6;cdr_followup.7;cdr_followup.8;cdr_followup.9;cdr_followup.10;cdr_followup.11;cdr_followup.12;cdr_followup.13;cdr_followup.14;cdr_followup.15;cdr_followup.16;cdr_followup.17;cdr_followup.18;cdr_followup.19;mmse_followup.0;mmse_followup.1;mmse_followup.2;mmse_followup.3;mmse_followup.4;mmse_followup.5;mmse_followup.6;mmse_followup.7;mmse_followup.8;mmse_followup.9;mmse_followup.10;mmse_followup.11;mmse_followup.12;mmse_followup.13;mmse_followup.14;mmse_followup.15;mmse_followup.16;mmse_followup.17;mmse_followup.18;mmse_followup.19;diag_followup.0;diag_followup.1;diag_followup.2;diag_followup.3;diag_followup.4;diag_followup.5;diag_followup.6;diag_followup.7;diag_followup.8;diag_followup.9;diag_followup.10;diag_followup.11;diag_followup.12;diag_followup.13;diag_followup.14;diag_followup.15;diag_followup.16;diag_followup.17;diag_followup.18;diag_followup.19
4556;1931-01-10;2;74,4;0;0;;2005-05-13;2006-02-24;;;;;;;;;;;;2;;;;;;;;;;;;0;;;;;;;;;;;2005-05-24;2006-02-24;2006-12-01;2007-06-15;2008-01-24;2008-10-09;2009-05-11;2009-11-09;2010-03-29;2010-12-20;2011-06-27;2012-01-09;2012-04-11;2012-06-27;2012-09-19;2012-11-07;2013-05-06;2014-03-06;2014-11-17;;;1;1;2;2;;2;2;3;3;3;3;3;3;3;3;3;3;3;;18;17;18;15;16;16;16;14;12;8;9;6;1;1;1;1;;1;;;4;4;4;4;4;4;4;4;4;4;4;4;4;4;4;4;4;4;4;

So first I convert the decimal commas to periods, then the semicolons to commas, and after that I tackle the APOE values.

$ sed 's/,/./g' gaain_oscar2.csv | sed 's/;/,/g' > tmp_gaain.csv
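A quick sanity check after this kind of delimiter swap is to confirm that every row still has the same number of fields. The sketch below is only an illustration: it runs the same two substitutions on a small made-up sample (the file names `sample.csv` and `converted.csv` and the shortened header are hypothetical, not part of the real dataset) and lists the distinct field counts it finds.

```shell
# Hypothetical two-line sample standing in for gaain_oscar2.csv
tmpdir=$(mktemp -d)
printf 'case;born_date;sex;age_diag_bassal\n4556;1931-01-10;2;74,4\n' > "$tmpdir/sample.csv"

# Same conversion as above: decimal commas to periods, then semicolons to commas
sed -e 's/,/./g' -e 's/;/,/g' "$tmpdir/sample.csv" > "$tmpdir/converted.csv"

# List the distinct field counts; a consistent file reports a single value
field_counts=$(awk -F ',' '{print NF}' "$tmpdir/converted.csv" | sort -u)
echo "$field_counts"
```

If more than one number comes back, some row has stray separators and is worth inspecting before going further.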

and now let's recode APOE

$ sed 's/e2e2/1/;s/e2e3/2/;s/e2e4/3/;s/e3e3/4/;s/e3e4/5/;s/e4e4/6/' tmp_gaain.csv > gaain_matrix_try.csv
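The sed command above replaces the first match anywhere on the line, which works here because genotype strings only appear in the APOE column. A more defensive variant, sketched below with awk, touches only field 7 and skips the header. The sample file, its shortened column names, and the output name `recoded.csv` are made up for illustration; the mapping itself is the one used above.

```shell
# Hypothetical sample with APOE in column 7, as in the converted file
tmpdir=$(mktemp -d)
printf 'case,born_date,sex,age,race,af,APOE\n1,1931-01-10,2,74.4,0,0,e3e4\n2,1940-05-02,1,70.1,0,1,\n' > "$tmpdir/tmp_sample.csv"

awk 'BEGIN {
    FS = OFS = ","
    # Same genotype-to-code mapping as the sed command above
    code["e2e2"] = 1; code["e2e3"] = 2; code["e2e4"] = 3
    code["e3e3"] = 4; code["e3e4"] = 5; code["e4e4"] = 6
}
# Skip the header; leave empty or unrecognized values untouched
NR > 1 && ($7 in code) { $7 = code[$7] }
{ print }' "$tmpdir/tmp_sample.csv" > "$tmpdir/recoded.csv"

cat "$tmpdir/recoded.csv"
```

Rows with an empty APOE field pass through unchanged, so missing genotypes stay missing rather than being silently coded.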

and now I check how the APOE values came out:

$  awk -F "," '{if(!$7) print}' gaain_matrix_try.csv | wc -l
9071
$  awk -F "," '{if($7==0) print}' gaain_matrix_try.csv | wc -l
0
$  awk -F "," '{if($7==1) print}' gaain_matrix_try.csv | wc -l
4
$  awk -F "," '{if($7==2) print}' gaain_matrix_try.csv | wc -l
150
$  awk -F "," '{if($7==3) print}' gaain_matrix_try.csv | wc -l
53
$  awk -F "," '{if($7==4) print}' gaain_matrix_try.csv | wc -l
1683
$  awk -F "," '{if($7==5) print}' gaain_matrix_try.csv | wc -l
1016
$  awk -F "," '{if($7==6) print}' gaain_matrix_try.csv | wc -l
167
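The per-value checks above can also be collapsed into a single pass that tallies every distinct value in column 7. This is a minimal sketch on a made-up sample (the file `matrix_sample.csv` and its placeholder columns are hypothetical); against the real file one would point awk at gaain_matrix_try.csv instead.

```shell
# Hypothetical sample: header plus four rows, APOE code in column 7
tmpdir=$(mktemp -d)
printf 'case,a,b,c,d,e,APOE\n1,x,x,x,x,x,5\n2,x,x,x,x,x,\n3,x,x,x,x,x,5\n4,x,x,x,x,x,4\n' > "$tmpdir/matrix_sample.csv"

# Count each distinct value of field 7, labelling empty fields explicitly
tally=$(awk -F ',' 'NR > 1 {
    key = ($7 == "") ? "empty" : $7
    count[key]++
}
END { for (k in count) print k, count[k] }' "$tmpdir/matrix_sample.csv" | sort)
echo "$tally"
```

Any genotype string that escaped the recoding would show up here as its own row, which the fixed list of `$7==N` checks would not reveal.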
gaain/convert.txt · Last modified: 2020/08/04 10:58 by 127.0.0.1