Sampling Procedure

The Yemen MICS sample design was a two-stage stratified cluster sample.  The following parameters were accounted for in designing the sample:

1 - The sample is to provide estimates with reasonable precision at the national and urban/rural levels.
2 - Residents of the Yemeni islands and the nomadic population are excluded from survey coverage.
3 - The ultimate cluster size is 20 households.
4 - The design is approximately self-weighting.

Sample Allocation

The sample is allocated proportionally between the urban and rural strata, using the share of households in each stratum from the 2004 Census. With an ultimate cluster size of 20 households, the number of sample clusters is 200, which corresponds to a target of 200 x 20 = 4,000 households. Proportional allocation assigns 142 clusters to the rural stratum and 58 to the urban stratum.
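
The allocation arithmetic can be sketched as follows. The urban share of 0.29 is an assumption back-calculated from the reported 58/200 split, not a figure quoted from the 2004 Census, and the function name is illustrative.

```python
def allocate_clusters(total_clusters, urban_share):
    """Split clusters between strata in proportion to the urban household share."""
    urban = round(total_clusters * urban_share)
    rural = total_clusters - urban
    return {"urban": urban, "rural": rural}


CLUSTER_SIZE = 20      # households per ultimate cluster (from the design)
TOTAL_CLUSTERS = 200   # number of sample clusters (from the design)

# Assumption: 0.29 is inferred from 58/200, not taken from the census tables.
allocation = allocate_clusters(TOTAL_CLUSTERS, urban_share=0.29)
target_households = TOTAL_CLUSTERS * CLUSTER_SIZE

print(allocation, target_households)
```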

Sample Selection

The sample is selected in two stages. The Primary Sampling Unit (PSU) is a village (or a group of villages) in rural areas and a lane (hara) in urban areas. The 2004 Census microdata at these administrative levels were used to build the frames for the first-stage sample. The two stages are described below.

First Stage Sample

The 2004 Census data (numbers of households and population) for all urban and rural agglomerations were used to create frames for the first-stage sample in the urban and rural strata. PSUs were intended to contain approximately 150-300 households. Building the rural frame entailed grouping neighboring small villages into PSUs of this size; hence a rural PSU is in most cases a group of small villages. A single village is treated as a PSU when its size falls within the 150-300 household range.

The situation in urban areas is quite different, since most lanes (haras) are much larger than the desired PSU size. For this reason, a second (dummy) sampling stage is necessary to reduce the burden of field listing whenever the lane size exceeds 300 households. In the first-stage urban sample, 41 PSUs required division into equally sized parts, whereas only 4 PSUs in the rural sample needed to be divided.

Implicit stratification was introduced in both the rural and urban PSU frames. Governorates were ordered geographically in a serpentine fashion, starting from the northwest corner, moving to the northeast corner, then back to the west, then to the east, and so on until the last governorate. Moreover, as governorates are further divided into a number of directorates (modyriate), a second level of implicit stratification was applied within each governorate by ordering the directorates geographically in the same way. Implicit stratification is expected to yield more precise sample estimates at both the national and urban/rural levels.
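
As a rough illustration of serpentine ordering, the sketch below assumes each unit has been assigned to a north-to-south band (row) and a west-to-east position (col). This grid coding and the unit labels are hypothetical; the source does not say how the geographic order was recorded.

```python
def serpentine_order(units):
    """Order units band by band, reversing direction on every other band.

    units: list of (name, row, col), where row 0 is the northernmost band
    and col 0 is the westernmost position (hypothetical coding).
    """
    ordered = []
    for i, row in enumerate(sorted({r for _, r, _ in units})):
        band = [u for u in units if u[1] == row]
        # Even bands run west to east, odd bands run east to west.
        band.sort(key=lambda u: u[2], reverse=(i % 2 == 1))
        ordered.extend(band)
    return [name for name, _, _ in ordered]


# Placeholder labels, not real governorates.
grid = [("A", 0, 0), ("B", 0, 1), ("C", 0, 2),
        ("D", 1, 0), ("E", 1, 1), ("F", 1, 2)]
order = serpentine_order(grid)
print(order)
```

The same function could be reused within each governorate to order its directorates.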

The rural and urban first-stage samples were selected using the Probability Proportionate to Size (PPS) method. The measure of size (MOS) is the number of households in each PSU as recorded in the 2004 Census.
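
A minimal sketch of PPS selection is given below. It assumes systematic PPS over the implicitly stratified frame, which is the usual companion to implicit stratification; the source does not spell out the selection mechanics, and the frame sizes used here are made up.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(sizes, n, rng=None):
    """Return indices of n units chosen by systematic PPS.

    sizes: measures of size, listed in frame (implicit stratification) order.
    """
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n
    start = rng.uniform(0, interval)
    picks = []
    for k in range(n):
        point = start + k * interval
        # First unit whose cumulative size reaches the selection point.
        idx = bisect_left(cumulative, point)
        picks.append(min(idx, len(sizes) - 1))  # guard against float overshoot
    return picks


# Hypothetical frame of 40 PSUs with 150-300 households each.
frame_sizes = [150 + (7 * i) % 151 for i in range(40)]
selected = pps_systematic(frame_sizes, n=5, rng=random.Random(2006))
print(selected)
```

Because the sampling interval is much larger than any single PSU in this example, no PSU can be selected twice.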

Second Stage Sample

Each PSU selected in the first stage (or the selected part of it) was updated in the field: a listing operation produced an up-to-date list of households for every sample PSU. These lists were used as the sampling frames for the second-stage sample.

The selection method was designed to create compact ultimate clusters of 20 households in the rural sample and non-compact ultimate clusters of the same size in the urban sample. Compact clusters were chosen for the rural sample because most rural sample PSUs are composed of several small villages which are, in most cases, located at the tops of adjacent mountains. Systematic selection would spread the household sample over several small villages within the same PSU and make the main survey fieldwork much more difficult. Hence it was deemed operationally efficient to treat the household list for each rural sample PSU as a circle. A single random number is drawn from 1 to the total number of households in the list, and this number determines the entire household sample for the PSU: the household it indicates and the subsequent 19 households in the list constitute the sample, wrapping around to the start of the list when the end is reached.
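
The circular selection for rural PSUs can be sketched as follows. The function name, the handling of lists shorter than 20 households, and the example list are assumptions made for illustration.

```python
import random


def compact_cluster(households, size=20, rng=None):
    """Pick a random start and take `size` consecutive households, wrapping around."""
    rng = rng or random.Random()
    n = len(households)
    if n <= size:
        # Assumption: if the list is short, take every household.
        return list(households)
    start = rng.randint(1, n)  # random number from 1 to n, as in the design
    return [households[(start - 1 + k) % n] for k in range(size)]


# Hypothetical updated listing of 170 households, numbered in list order.
listing = list(range(1, 171))
cluster = compact_cluster(listing, rng=random.Random(2006))
print(cluster)
```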

In the urban sample, however, ordinary systematic random selection was proposed, so as to produce a non-compact cluster of 20 households. The households forming an urban PSU (or a part of it) are not dispersed over a large area, so a compact cluster is not warranted.
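
For comparison, a sketch of systematic random selection from an urban household list is shown below. It assumes a fractional sampling interval with a random start, which is one common variant; the source does not specify the exact procedure, and the listing is hypothetical.

```python
import random


def systematic_sample(households, size=20, rng=None):
    """Select `size` households spread evenly through the list."""
    rng = rng or random.Random()
    n = len(households)
    if n <= size:
        # Assumption: if the list is short, take every household.
        return list(households)
    interval = n / size
    start = rng.random() * interval  # random start in [0, interval)
    # min() guards against a floating-point overshoot at the end of the list.
    return [households[min(int(start + k * interval), n - 1)] for k in range(size)]


# Hypothetical updated listing of 287 households in one urban segment.
urban_listing = list(range(1, 288))
urban_sample = systematic_sample(urban_listing, rng=random.Random(2006))
print(urban_sample)
```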

Deviation from Sample Design

No major deviations from the original sample design were made.  All sample enumeration areas were accessed and successfully interviewed with good response rates.

Response Rates

Of the 3979 households selected for the sample, 3972 were found to be occupied. Of these, 3586 were successfully interviewed, for a household response rate of 90.3 percent. In the interviewed households, 3912 ever-married women (age 15-49) were identified. Of these, 3742 were successfully interviewed, yielding a response rate of 95.7 percent. In addition, 3918 children under age five were listed in the household questionnaire. Questionnaires were completed for 3783 of these children, which corresponds to a response rate of 96.6 percent. The overall response rates, obtained by multiplying the household response rate by the corresponding individual response rate, are 86.4 percent for the women's interviews and 87.2 percent for the under-five interviews. Response rates were similar across urban and rural areas.
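
The reported rates can be reproduced from the counts above. The sketch assumes the overall rates are the product of the household rate and the individual rate, which is consistent with the published figures.

```python
def rate(completed, eligible):
    """Response rate as a proportion."""
    return completed / eligible


household = rate(3586, 3972)   # interviewed / occupied households
women = rate(3742, 3912)       # interviewed / eligible ever-married women
children = rate(3783, 3918)    # completed / listed children under five

overall_women = household * women
overall_children = household * children

for label, value in [("household", household), ("women", women),
                     ("children", children), ("overall women", overall_women),
                     ("overall children", overall_children)]:
    print(f"{label}: {100 * value:.1f}%")
```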


Sample weights were calculated for each of the datafiles.  

Weights were used in deriving survey estimates to account for the expected differences between the updated household lists of the sample PSUs and the measure of size (the number of households in the 2004 Census), as well as for non-response, which is inevitable in surveys of this nature. If non-response varies substantially across the sample PSUs, weights are needed to adjust the data. The final weight is the product of the design weight and the non-response weight, where the design weight is the inverse of the overall selection probability and the non-response weight is the inverse of the response rate.
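
A minimal sketch of the weight calculation follows. The decomposition of the selection probability into a PPS first stage and a fixed take of 20 households, and every number in the example, are assumptions for illustration; any additional factor for PSUs that were divided into parts is omitted.

```python
def final_weight(p_first, p_second, response_rate):
    """Final weight = design weight x non-response weight."""
    design_weight = 1.0 / (p_first * p_second)   # inverse of overall selection probability
    nonresponse_weight = 1.0 / response_rate     # inverse of response rate
    return design_weight * nonresponse_weight


# Hypothetical rural PSU: all figures below are made up.
clusters_in_stratum = 142      # rural clusters (from the design)
stratum_households = 2_000_000 # assumed census total for the stratum
psu_census_size = 200          # assumed MOS of this PSU
psu_listed_size = 250          # assumed size after field listing

p_first = clusters_in_stratum * psu_census_size / stratum_households
p_second = 20 / psu_listed_size
weight = final_weight(p_first, p_second, response_rate=0.9)
print(round(weight, 2))
```

When the listed size equals the census size, the PSU size cancels out of the overall selection probability, which is why the design is approximately self-weighting.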
Generated: MAY-27-2009 using the IHSN Microdata Management Toolkit