
The Collection and Processing of National Household Survey Data

Collection

The objective of the NHS, according to Statistics Canada’s NHS User’s Guide: National Household Survey, 2011, is “to provide data for small geographic areas and small population groups.” The survey, conducted between May 2011 and August 2011, used mixed modes: an online questionnaire, a paper questionnaire mailed to some households, and interviewer-administered questionnaires, mainly in remote areas. The NHS was administered to a random sample of 4.5 million households, representing slightly less than 30% of all households.

The data collection was done in three waves:

• Wave 1 (May and June) focused on online collection;

• Wave 2 (June to mid-July) focused on printed questionnaires mailed to households that did not respond online; and

• Wave 3 (mid-July to mid-August) was conducted for households that did not respond to Waves 1 and 2.

An important aspect of the non-response follow-up was that, in mid-July, a subsample of 400,000 of the 1.2 million dwellings that had not yet responded to the NHS was selected for re-contacting. This subsample was carefully chosen on the basis of the geographic distribution of non-response and the heterogeneity of the population. The latter was an attempt to minimize the non-response bias that might arise when hard-to-enumerate population groups have lower response rates.

The overall response rate to the NHS was 68.9%, similar to what Statistics Canada achieves in many voluntary surveys. Nearly two-thirds of all responses were online. The response rate varied across the country, with lower rates reported in less populated areas, mainly in rural and small-town Canada. These lower rates were probably due in part to there being much less follow-up in rural areas, which are much more homogeneous. (Statistics Canada has not released detailed information about which geographic areas were included in the special sample for non-response follow-up, but the general description of the follow-up suggests most was done in highly urbanized areas that have a very diverse population.)
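The collection figures quoted above can be combined in a quick back-of-the-envelope check. The sketch below uses only numbers given in the text; the variable names are illustrative, and treating the sampling fraction as exactly 30% is an assumption, since the text says "slightly less than 30%".

```python
# Back-of-the-envelope arithmetic using only figures quoted in the text.
sampled_households = 4_500_000   # households selected for the NHS
sampling_fraction = 0.30         # "slightly less than 30%"; treated as exact here (assumption)
response_rate = 0.689            # overall NHS response rate

nonrespondents_mid_july = 1_200_000  # dwellings that had not responded by mid-July
followup_subsample = 400_000         # dwellings selected for re-contacting

# Share of non-responding dwellings chosen for follow-up.
followup_fraction = followup_subsample / nonrespondents_mid_july

# Approximate number of responding households, and the share of all
# households that the respondents represent.
responding_households = sampled_households * response_rate
effective_fraction = sampling_fraction * response_rate

print(f"Follow-up fraction: {followup_fraction:.1%}")
print(f"Responding households: {responding_households:,.0f}")
print(f"Effective share of all households: {effective_fraction:.1%}")
```

Under these assumptions, about one in three non-responding dwellings was re-contacted, and respondents amount to roughly one-fifth of all households.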

Data Processing

Although respondents are asked to fill in NHS questionnaires completely, some questionnaires may be returned incomplete, some questions may be left blank and some answers may be inconsistent. Partial missing data, or “item non-response” (as distinct from “complete non-response”), varies in magnitude by content area. Overall, the level of item non-response in the NHS was similar to the level in the 2006 Census.

Weighting

The final step in developing the NHS database is the weighting of the sample to make the final NHS estimates as representative as possible of the total population. Since the response rates varied by geography and there was a concern about non-response bias caused by differential non-response, the strategy was to develop weights that made the weighted sample match, as closely as possible, the characteristics available from the full 2011 Census. It is important to note that, after the calibration and final weighting, some differences will remain between the NHS estimates and the Census counts.
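To illustrate the general idea of weighting a sample toward known Census totals, here is a minimal sketch of simple post-stratification. It is not Statistics Canada's actual calibration procedure, which the text does not describe in detail; the groups, counts and initial weights are all invented for illustration.

```python
# Hypothetical post-stratification: scale respondent weights within each group
# so that weighted totals match known Census counts. All numbers are invented.
census_counts = {"urban": 8000, "rural": 2000}  # known population totals by group

# Each respondent carries a group label and an initial design weight.
respondents = [
    {"group": "urban", "weight": 4.0},
    {"group": "urban", "weight": 4.0},
    {"group": "urban", "weight": 4.0},
    {"group": "rural", "weight": 4.0},
]

def totals_by_group(records):
    """Sum the weights of the records in each group."""
    totals = {}
    for r in records:
        totals[r["group"]] = totals.get(r["group"], 0.0) + r["weight"]
    return totals

weighted_totals = totals_by_group(respondents)

# Adjustment factor = Census count / weighted respondent total for the group.
factors = {g: census_counts[g] / weighted_totals[g] for g in census_counts}

# Apply each group's factor to its respondents' weights.
calibrated = [
    {**r, "weight": r["weight"] * factors[r["group"]]} for r in respondents
]

calibrated_totals = totals_by_group(calibrated)
print(calibrated_totals)
```

In this sketch the calibrated totals match the Census counts only for the grouping variable used in the adjustment. Characteristics that were not part of the adjustment can still differ, which is in line with the caution that some differences remain after weighting.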

The first results from the NHS became available in May 2013, with more data to follow during the summer of 2013. An early analysis of the first release points to some of the implications of the switch to a voluntary NHS.

Indicators of Data Quality

The NHS, like all surveys, is subject to both sampling and non-sampling error. With a final sample of about 21% of the total population, sampling error is an issue only for very small populations. Statistics Canada will be releasing estimates of the coefficients of variation. However, these measures provide no information on non-sampling errors that might result from non-response bias. In other words, they do not measure overall data quality.

Statistics Canada’s “Global Non-Response Rate (GNR)” combines both total non-response and item non-response. Essentially, one can think of this as the percentage of data not provided by respondents for a particular geographic area. The GNR has been calculated for all geographic areas for which data are published, and Statistics Canada will report this as an integral part of all published tables. Furthermore, Statistics Canada has decided that no data will be published for geographic areas having a GNR of 50% or higher.

Differences between the NHS estimate and the Census count (discussed in more detail in the NHS User Guide) are relatively small for areas with a population of 5,000 or more but increase as the areas become less populated, and may be particularly large for areas with a total population of less than 1,000. According to Statistics Canada, “Whether there is a discrepancy or not is an indication of the quality of the NHS estimates.” Data users should compare the data from the NHS with the Census.
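The publication rule and the NHS-versus-Census comparison described above can be sketched as a simple screening step. The 50% threshold comes from the text; the areas, their figures, and the percent-difference measure are assumptions made for illustration and are not Statistics Canada's formal discrepancy measure.

```python
GNR_SUPPRESSION_THRESHOLD = 50.0  # per the text: no data published at 50% or higher

# Hypothetical geographic areas with invented figures.
areas = [
    {"name": "Area A", "gnr": 22.5, "nhs_estimate": 5150, "census_count": 5000},
    {"name": "Area B", "gnr": 48.0, "nhs_estimate": 880, "census_count": 950},
    {"name": "Area C", "gnr": 50.0, "nhs_estimate": 400, "census_count": 520},
]

def is_publishable(area):
    """An area is published only if its GNR is below the threshold."""
    return area["gnr"] < GNR_SUPPRESSION_THRESHOLD

def relative_discrepancy(area):
    """Percent difference between the NHS estimate and the Census count."""
    diff = area["nhs_estimate"] - area["census_count"]
    return 100.0 * diff / area["census_count"]

published = [a for a in areas if is_publishable(a)]
for a in published:
    print(f'{a["name"]}: GNR {a["gnr"]:.1f}%, '
          f'discrepancy {relative_discrepancy(a):+.1f}%')
```

A data user could run a check of this kind before relying on estimates for small areas, paying closest attention to areas where both the GNR and the discrepancy are large.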

