Data Processing

Data Editing

Data editing took place at a number of stages throughout the processing (see Other Processing), including:
a) Office editing and coding
b) During data entry
c) Structure checking and completeness
d) Secondary editing
e) Structural checking of SPSS data files

Detailed documentation of the editing of data can be found in the data processing guidelines.

Other Processing

Data were processed in clusters, with each cluster being processed as a complete unit through each stage of data processing. Each cluster went through the following steps:
1) Questionnaire reception
2) Office editing and coding
3) Data entry
4) Structure and completeness checking
5) Verification entry
6) Comparison of verification data
7) Back up of raw data
8) Secondary editing
9) Edited data back up
After all clusters were processed, all data were concatenated and the following steps were completed for all data files:
10) Export to SPSS in 4 files (hh - household, hl - household members, wm - women, ch - children under 5)
11) Recoding of variables needed for analysis
12) Adding of sample weights
13) Calculation of wealth quintiles and merging into data
14) Structural checking of SPSS files
15) Data quality tabulations
16) Production of analysis tabulations

Details of each of these steps can be found in the data processing documentation, data editing guidelines, data processing programs in CSPro and SPSS, and tabulation guidelines.

The data processing was centralized. The field editors checked, cleared and packed the questionnaires by cluster; the questionnaires were then delivered to the central office of the National Statistical Committee for further processing. Each incoming pack was registered and, at the same time, the database was created.

Data were entered on twenty computers using CSPro software version 2.6.007. In order to ensure quality control, all questionnaires were double entered and internal consistency checks were performed. Procedures and standard programs developed under the global MICS3 project and adapted to the Kyrgyz questionnaire were used throughout. Data processing began simultaneously with data collection in December 2005, and was finished in spring of 2006. Data were analysed using the Statistical Package for Social Sciences (SPSS) software program, version 14, and the model syntax and tabulation plans developed by UNICEF for this purpose.

All data entry was conducted at the NSC head office using manual data entry. For data entry, CSPro version 2.6.007 was used with a highly structured data entry program, using a system-controlled approach that controlled the entry of each variable. All range checks and skips were controlled by the program, and operators could not override these. A limited set of consistency checks was also included in the data entry program. In addition, the calculation of anthropometric Z-scores was included in the data entry programs for use during analysis. Open-ended responses ("Other" answers) were not entered or coded, except in rare circumstances where the response matched an existing code in the questionnaire.
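The idea of program-controlled range checks and skips can be illustrated with a minimal sketch. It is written in Python rather than CSPro, and it is not the survey's entry program: the field names, valid ranges and the skip rule below are all hypothetical.

```python
# Hypothetical field definitions: name -> (lowest, highest) valid code.
RANGES = {
    "sex": (1, 2),
    "age": (0, 97),
    "attended_school": (1, 2),   # 1 = yes, 2 = no (illustrative coding)
    "school_level": (0, 5),
}

# Hypothetical skip rule: (trigger field, trigger value, fields to skip).
# If attended_school == 2, school_level must be left blank.
SKIPS = [("attended_school", 2, ["school_level"])]


def check_record(record):
    """Return a list of error messages for one keyed record (a dict)."""
    errors = []

    # Work out which fields the skip rules say must be blank.
    skipped = set()
    for trigger, value, targets in SKIPS:
        if record.get(trigger) == value:
            skipped.update(targets)

    # Apply the range check to every field that was not skipped.
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if field in skipped:
            if value is not None:
                errors.append(f"{field}: must be blank (skipped)")
            continue
        if value is None:
            errors.append(f"{field}: missing")
        elif not low <= value <= high:
            errors.append(f"{field}: {value} outside {low}-{high}")
    return errors
```

In a system-controlled program the operator cannot move past a field until a check like this returns no errors. Here the errors are only collected so that the rule logic is visible.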

Structure and completeness checking ensured that all questionnaires for the cluster had been entered, were structurally sound, and that women's and children's questionnaires existed for each eligible woman and child. 
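The completeness part of this check can be sketched as a comparison between the individual questionnaires expected from the household listing and those actually entered. The Python below is an illustration only. The record layout (`hh`, `line`, `eligible_woman`, `eligible_child`) is assumed for the example and is not taken from the MICS3 data dictionary.

```python
def completeness_problems(household_members, women_keys, children_keys):
    """List missing or unexpected individual questionnaires in one cluster.

    household_members: iterable of dicts with the assumed keys 'hh', 'line',
        'eligible_woman' (bool) and 'eligible_child' (bool).
    women_keys, children_keys: sets of (hh, line) pairs identifying the
        women's and children's questionnaires that were entered.
    """
    expected_women = {
        (m["hh"], m["line"]) for m in household_members if m["eligible_woman"]
    }
    expected_children = {
        (m["hh"], m["line"]) for m in household_members if m["eligible_child"]
    }

    problems = []
    for key in sorted(expected_women - women_keys):
        problems.append(("missing woman questionnaire", key))
    for key in sorted(women_keys - expected_women):
        problems.append(("unexpected woman questionnaire", key))
    for key in sorted(expected_children - children_keys):
        problems.append(("missing child questionnaire", key))
    for key in sorted(children_keys - expected_children):
        problems.append(("unexpected child questionnaire", key))
    return problems
```

An empty result means every eligible woman and child in the cluster has exactly one matching questionnaire, and no questionnaire refers to a person who was not listed as eligible.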

All variables were 100% verified using independent verification, i.e. double entry of data. The two entries were compared separately, and the original operators who first keyed the files then modified one or both datasets to correct keying errors.
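The comparison step of independent verification amounts to a field-by-field difference between the two keyed versions of a cluster. The following Python sketch shows the idea. The survey itself used CSPro for this, and the data layout here (records keyed by a household and line number pair) is an assumption made for the example.

```python
def compare_entries(first, second):
    """Compare two independently keyed versions of the same cluster.

    first, second: dicts mapping a record key, e.g. (household, line),
        to a dict of field name -> keyed value.
    Returns a sorted list of (key, field, first_value, second_value)
    tuples, one for each disagreement between the two entries.
    """
    diffs = []
    for key in sorted(set(first) | set(second)):
        a = first.get(key)
        b = second.get(key)
        if a is None or b is None:
            # The record exists in only one of the two entries.
            diffs.append((key, "<record>", a is not None, b is not None))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                diffs.append((key, field, a.get(field), b.get(field)))
    return diffs
```

Each reported difference would then be checked against the paper questionnaire, and whichever entry was wrong (possibly both) corrected, until the comparison returns an empty list.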

After completion of all processing in CSPro, all individual cluster files were backed up before concatenating data together using the CSPro file concatenate utility.

For tabulation and analysis, SPSS versions 10.0 and 14.0 were used. Version 10.0 was originally used for all tabulation programs, except for child mortality. Later, version 14.0 was used for child mortality, data quality tabulations and other analysis activities.

After transferring all files to SPSS, certain variables were recoded for use as background characteristics in the tabulation of the data, including grouping of age, education and geographic areas as needed for analysis. In the process of recoding ages and dates, some random imputation of dates (within calculated constraints) was performed to handle missing or "don't know" ages or dates. Additionally, a wealth (asset) index of household members was calculated using principal components analysis, based on household assets, and both the score and quintiles were included in the datasets for use in tabulations.
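Random imputation "within calculated constraints" can be illustrated with a short Python sketch. It is not the SPSS recoding program used for the survey. It assumes dates are held as month counts since the start of 1900, works at month resolution only (ignoring the day of the month), and uses hypothetical function and parameter names.

```python
import random


def cmc(year, month):
    """Month count since the start of 1900 (January 1900 = 1)."""
    return (year - 1900) * 12 + month


def impute_birth_cmc(interview_cmc, age_years=None, birth_year=None, rng=random):
    """Draw a birth month consistent with whatever is already known.

    Each known item narrows the allowed range [low, high]; the missing
    month is then drawn uniformly at random from what remains.
    """
    low, high = 1, interview_cmc          # born no later than the interview
    if age_years is not None:
        # A completed age of a years means 12a to 12a + 11 months elapsed.
        low = max(low, interview_cmc - 12 * age_years - 11)
        high = min(high, interview_cmc - 12 * age_years)
    if birth_year is not None:
        low = max(low, cmc(birth_year, 1))
        high = min(high, cmc(birth_year, 12))
    if low > high:
        raise ValueError("reported age and year of birth are inconsistent")
    return rng.randint(low, high)
```

For example, `impute_birth_cmc(cmc(2006, 1), age_years=3, birth_year=2002)` draws a month between February and December 2002, the only months that satisfy both constraints. Passing a seeded `random.Random` instance as `rng` makes the imputation reproducible.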

Scripts/Programs

Primary data processing programs - CSPro, Strategic Information Section, Division of Policy and Planning (DPP), UNICEF NYHQ, English [eng], Kyrgyz Republic [kgz]
Contributor(s): Strategic Information Section, Division of Policy and Planning (DPP), UNICEF NYHQ, National Statistical Committee of the Kyrgyz Republic
Publisher(s): Strategic Information Section, Division of Policy and Planning (DPP), UNICEF NYHQ
Programs\CSPro.zip

Secondary data processing programs - SPSS, Strategic Information Section, Division of Policy and Planning (DPP), UNICEF NYHQ, 2006-10-26, English [eng], Kyrgyz Republic [kgz]
Contributor(s): Strategic Information Section, Division of Policy and Planning (DPP), UNICEF NYHQ, National Statistical Committee of the Kyrgyz Republic
Publisher(s): Strategic Information Section, Division of Policy and Planning (DPP), UNICEF NYHQ
Programs\SPSS Syntax.zip

Generated: MAR-12-2008 using the IHSN Microdata Management Toolkit