Sampling

Sampling Procedure

The principal objective of the sample design was to provide current and reliable estimates on a set of indicators covering the four major areas of the World Fit for Children declaration: promoting healthy lives; providing quality education; protecting against abuse, exploitation and violence; and combating HIV/AIDS. The population covered by the 2006 MICS/PAPFAM is defined as the universe of all women aged 15-49 and all children aged under 5. A sample of households was selected and all women aged 15-49 identified as usual residents of these households were interviewed. In addition, the mother or caretaker of each child aged under 5 who was a usual resident of the household was interviewed about the child.

The 2006 MICS/PAPFAM collected data from a nationally representative sample of households, women and children. The primary focus of the 2006 MICS/PAPFAM was to provide estimates of key population and health, education, child protection and HIV-related indicators for the country as a whole, for the North West, North East and Central South Zones, and for urban and rural areas separately. Somalia is divided into 18 regions. Each region is subdivided into districts, and each district into settlements and towns. The sample frame for this survey was based on the list of settlements developed from the 2005-2006 UNDP Settlement Survey and WHO vaccination campaign data.
 
The sampling design follows a four-stage approach. The first stage is the selection of districts within each of the 18 regions of the country, using probability proportional to size (PPS). The second stage is the selection of the secondary sampling units, which are defined as permanent and temporary settlements. The third stage is the selection of the cluster(s) within each settlement, and the fourth stage is the selection of the households to be interviewed.
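
As an illustration of the first stage, the following is a minimal sketch of one common way to draw units with probability proportional to size, namely systematic PPS selection from a cumulative list. The source does not state which PPS variant was used, and the district names and household counts below are hypothetical.

```python
import random


def pps_systematic_sample(sizes, n, rng=None):
    """Draw n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit name -> measure of size (e.g. estimated households).
    Returns a list of n unit names; a very large unit can be drawn more than once.
    """
    rng = rng or random.Random()
    total = sum(sizes.values())
    interval = total / n                  # sampling interval
    start = rng.random() * interval       # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0.0
    position = 0                          # index of the next target to place
    for unit, size in sizes.items():
        cumulative += size
        # A unit is selected once for every target that falls in its range.
        while position < n and targets[position] < cumulative:
            selected.append(unit)
            position += 1
    return selected


# Hypothetical districts and sizes, for illustration only.
district_sizes = {"District A": 500, "District B": 1500,
                  "District C": 1000, "District D": 3000}
chosen = pps_systematic_sample(district_sizes, n=3, rng=random.Random(1))
```

Under this scheme a district holding half of the total size, such as the hypothetical District D, is certain to be drawn at least once when three units are selected.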

Once the districts had been selected, great efforts went into compiling a complete list of permanent and temporary settlements within these districts. The main source was the WHO immunisation campaign data, which was later backed up by the UNDP settlement survey for at least two of the three zones. Other sources also contributed, such as FAO data on water points, which could act as a proxy for surrounding nomadic areas and temporary settlements. Finally, the lists were shown to the NGO partners implementing the survey and to UNICEF staff on the ground for additional information on recent movements of internally displaced persons and nomads. The settlement lists were then sorted into urban and non-urban. The first two stages of sampling were thus completed by selecting the required number of clusters from each of the 3 zones, by urban and rural areas separately.

Mapping and Listing Activities

For settlements with an estimated size of more than 150 households, some form of segmentation through sketch mapping was necessary. For several district capitals it was possible to use maps from UN Habitat to assist the personnel deployed in sketch mapping. However, for most of the larger non-urban settlements there were no maps available. The most important aspect of the sketch mapping was to divide the settlements into segments of roughly equal size by estimating the number of households, and to clearly delineate the segments using identifiable boundaries.

Once sketch maps were prepared, survey coordinators were in a position to randomly select the cluster(s) in which households would be selected. It must be added at this point that people trained in cartographic techniques are rare in Somalia. Thus the quality of the maps varied significantly across the country, and resources and time did not allow for a full household count.

Selection of Households

For the final stage of sampling, the Somali MICS/PAPFAM had no option other than to use the Expanded Program for Immunization (EPI) random walk method, as used in MICS 2; the expense of household/dwelling listing would simply have been too great.

Whilst the EPI method is quick and approximately self-weighting, it is recognised that it does not yield a probability sample, and so cannot ensure objectivity of household selection. To reduce the subjectivity involved in selecting households, some measures were put in place. For example, instead of relying on an arbitrary decision regarding the central point of a cluster, supervisors selected at least three or four possible starting points and then randomly chose one of them. Moreover, only supervisors, not interviewers, were able to select and number the households. Significant time was spent training supervisors on how to select households in order to avoid some of the criticisms typically directed towards this method.

For clusters falling in nomadic areas (the temporary settlements), the survey teams were instructed to interview the first 24 households that they came across. Nomads typically do not move in large numbers; therefore, in order to ensure representation of nomads in the sample, it was necessary to adopt a more purposive method of sampling for this group.

Deviation from Sample Design

No major deviations from the original sample design were made.  All clusters were accessed and successfully interviewed with good response rates.

Response Rates

Of the 6000 households selected for the sample, 5969 were successfully interviewed, for a household response rate of 99.5 percent. In the interviewed households, 7277 women (age 15-49) were identified. Of these, 6764 were successfully interviewed, yielding a response rate of 93 percent. In addition, 6373 children under age five were listed in the household questionnaire. Of these, questionnaires were completed for 6305, which corresponds to a response rate of 98.9 percent. Overall response rates of 92.5 percent and 98.4 percent are calculated for the women's and under-5's interviews respectively.
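
The arithmetic behind these figures can be checked with a short sketch. It assumes that each overall rate is the product of the household response rate and the corresponding individual response rate; the text does not state this formula, but it reproduces the reported values. The function and variable names are illustrative.

```python
def response_rate(completed, eligible):
    """Response rate as a percentage."""
    return 100.0 * completed / eligible


# Counts taken from the paragraph above.
household_rate = response_rate(5969, 6000)
women_rate = response_rate(6764, 7277)
children_rate = response_rate(6305, 6373)

# Assumed definition: overall rate = household rate x individual rate.
overall_women = household_rate * women_rate / 100.0
overall_children = household_rate * children_rate / 100.0

for label, value in [
    ("Households", household_rate),
    ("Women", women_rate),
    ("Children under 5", children_rate),
    ("Women, overall", overall_women),
    ("Children under 5, overall", overall_children),
]:
    print(f"{label}: {value:.1f} percent")
```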

Weighting

Sample weights were calculated for each of the data files.

Sample weights for the household data were computed as the inverse of the probability of selection of the household, computed at the sampling domain level (urban/rural within each region).  The household weights were adjusted for non-response at the domain level, and were then normalized by a constant factor so that the total weighted number of households equals the total unweighted number of households.  The household weight variable is called HHWEIGHT and is used with the HH data and the HL data.
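
The three steps described above (inverse selection probability, non-response adjustment, normalization) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used to produce HHWEIGHT: the domain names, selection probabilities and counts are hypothetical, and the non-response adjustment is assumed to be the ratio of selected to interviewed households within each domain.

```python
def household_weights(domains):
    """Compute raw and normalized household weights per sampling domain.

    domains: dict mapping domain name -> dict with keys
      'prob'        : household selection probability in the domain
      'selected'    : number of households selected
      'interviewed' : number of households interviewed
    Returns (raw, normalized), each a dict of domain name -> weight.
    """
    raw = {}
    for name, d in domains.items():
        design_weight = 1.0 / d["prob"]                     # inverse probability
        nonresponse_adj = d["selected"] / d["interviewed"]  # assumed adjustment
        raw[name] = design_weight * nonresponse_adj

    # Normalize so that weighted and unweighted household totals agree.
    weighted_total = sum(raw[n] * domains[n]["interviewed"] for n in domains)
    unweighted_total = sum(d["interviewed"] for d in domains.values())
    factor = unweighted_total / weighted_total
    normalized = {n: w * factor for n, w in raw.items()}
    return raw, normalized


# Hypothetical domains, for illustration only.
example_domains = {
    "domain_1_urban": {"prob": 0.02, "selected": 100, "interviewed": 98},
    "domain_1_rural": {"prob": 0.01, "selected": 200, "interviewed": 199},
}
raw_hh, norm_hh = household_weights(example_domains)
```

The raw (un-normalized) weights are returned as well because the women's and children's weights are built from them.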

Sample weights for the women's data used the un-normalized household weights, adjusted for non-response for the women's questionnaire, and were then normalized by a constant factor so that the total weighted number of women's cases equals the total unweighted number of women's cases.

Sample weights for the children's data followed the same approach as the women's and used the un-normalized household weights, adjusted for non-response for the children's questionnaire, and were then normalized by a constant factor so that the total weighted number of children's cases equals the total unweighted number of children's cases.
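
A minimal sketch of how the women's and children's weights could be derived from the un-normalized household weights is given below. The domain names and all figures are hypothetical, and the non-response adjustment is assumed to be the ratio of eligible to interviewed individuals within each domain.

```python
def individual_weights(raw_hh_weight, eligible, interviewed):
    """Women's or children's weights from un-normalized household weights.

    raw_hh_weight: dict of domain -> un-normalized household weight
    eligible     : dict of domain -> eligible individuals identified
    interviewed  : dict of domain -> individuals with completed questionnaires
    Returns a dict of domain -> normalized individual weight.
    """
    # Adjust the household weight for individual-level non-response.
    adjusted = {
        d: raw_hh_weight[d] * eligible[d] / interviewed[d]
        for d in raw_hh_weight
    }
    # Normalize so that weighted and unweighted case totals agree.
    weighted_total = sum(adjusted[d] * interviewed[d] for d in adjusted)
    unweighted_total = sum(interviewed[d] for d in adjusted)
    factor = unweighted_total / weighted_total
    return {d: w * factor for d, w in adjusted.items()}


# Hypothetical inputs, for illustration only.
raw_weights = {"domain_1_urban": 51.0, "domain_1_rural": 100.5}
women_eligible = {"domain_1_urban": 120, "domain_1_rural": 250}
women_interviewed = {"domain_1_urban": 110, "domain_1_rural": 235}
women_weights = individual_weights(raw_weights, women_eligible, women_interviewed)
```

The same function would apply to the children's file by passing the counts of listed children and completed under-5 questionnaires.
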
Generated: MAR-20-2008 using the IHSN Microdata Management Toolkit