letter for survey participation (see Chapter 7) and monitoring the data collection process, for example, by sending solicitations and, eventually, applying responsive design (see Chapter 8) to improve the participation of the sampled units.
Data processing begins at the fifth step. The first phase (Processing data—step 1) concerns database creation, where many error risks must be identified and corrected. For instance, item nonresponse must be considered, its causes evaluated, and, when necessary, imputation methods applied (sub-step: Data imputation). Coding should also be considered (sub-step: Code open questions), since erroneous coding could cause misinterpretation of the survey results. In web surveys, completion of the questionnaire automatically generates the database, avoiding the significant risk of errors connected with data transcription. This is a great advantage of web surveys: data errors arise only from incoherent answers or respondents' inattention. A data quality check is therefore still necessary in web surveys, even if the risk of errors is smaller than in traditional paper-and-pencil interviews.
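As an illustration of the Data imputation sub-step, the following sketch fills item nonresponse on a numeric variable with the mean of the observed values. The records, field names, and the choice of mean imputation are all hypothetical; they are not part of the PAADEL survey and serve only to show the mechanics.

```python
from statistics import mean

# Hypothetical respondent records; None marks item nonresponse.
records = [
    {"id": 1, "age": 34, "income": 41000},
    {"id": 2, "age": 51, "income": None},   # item nonresponse on income
    {"id": 3, "age": 27, "income": 28000},
]

def impute_mean(records, field):
    """Replace missing values of `field` with the mean of the observed values."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = mean(observed)
    for r in records:
        if r[field] is None:
            r[field] = fill
    return records

impute_mean(records, "income")
print(records[1]["income"])  # imputed as the mean of 41000 and 28000: 34500
```

In practice, more refined methods (e.g., regression or donor-based imputation) are usually preferred; mean imputation is shown only for brevity.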
Once the checked database has been created, the second phase of data processing takes place (Processing data—step 2). Note that several computations and activities carried out in this phase follow decisions made at the Designing the web survey step. Application of estimation and weighting techniques allows the survey results to be extended to the target population and presented (sub-steps: Estimation technique choice, Calculation of weights and estimators, and Process data and produce tables; see Chapters 12 and 13). Estimation procedures are similar to those applied in the traditional survey modes. However, several aspects offer advantages specific to the web. In many cases, it is possible to link the survey microdata to another database that provides individual data for auxiliary variables, provided that a key variable is available to connect the two databases at the individual level. Through database integration, it becomes possible to model survey results, understand participation behavior, and better profile the respondents. Integration with administrative databases, paradata databases, or other survey databases can be undertaken.
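As a minimal sketch of the Calculation of weights sub-step, the example below computes post-stratification weights, one common weighting technique: each respondent is weighted so that the sample composition by stratum matches known population shares. The strata and shares are invented for illustration.

```python
# Hypothetical post-stratification: assumed known population shares
# and the stratum of each respondent in the sample.
population_share = {"urban": 0.6, "rural": 0.4}
respondents = ["urban", "urban", "urban", "rural"]

n = len(respondents)
sample_share = {s: respondents.count(s) / n for s in population_share}

# Weight = population share / sample share for the respondent's stratum.
weights = [population_share[s] / sample_share[s] for s in respondents]
print(weights)  # urban respondents are down-weighted, the rural one up-weighted
```

The weights sum to the sample size, so weighted estimates reproduce the population composition on the stratification variable.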
Thus, in selecting the estimation techniques, the researcher may need to integrate the survey database (sub-step: Integrate datasets) with another database (e.g., an administrative database or a paradata database). This is useful when the researcher has planned to use auxiliary variables in the data processing, to compute weights, or to improve estimates. For instance, the propensity score technique (Chapter 13) is one approach that uses auxiliary variables for estimation purposes.
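The Integrate datasets sub-step amounts to joining the two databases on the key variable available at the individual level. The sketch below uses invented records and a hypothetical person identifier (`pid`) to show the idea.

```python
# Hypothetical integration of survey microdata with an administrative
# database through a common key variable (here, a person identifier).
survey = [
    {"pid": "A1", "answer": 4},
    {"pid": "B2", "answer": 2},
]
admin = {
    "A1": {"education": "tertiary"},
    "B2": {"education": "secondary"},
}

# Attach the auxiliary variables to each survey record via the key.
integrated = [{**row, **admin.get(row["pid"], {})} for row in survey]
print(integrated[0]["education"])  # -> tertiary
```

With real data the same operation is typically a database join or a `merge` in a statistical package; the essential requirement is the shared key variable.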
After deciding on possible integration and applying the appropriate estimation and weighting techniques, data processing ends; tables and figures are then produced to synthesize the survey results. Presenting the survey results also includes offering quality indicators. AAPOR (2016) provides a detailed description of rates and indicators and emphasizes that all survey researchers should adopt the standardized final dispositions for the outcome of the survey. For example, the response rate can be calculated in many different ways. It is important not to simply state "the response rate is X": the researcher should name exactly which rate is being reported. In particular, there are two types of response rate:
1 The response rate type 1 (RR1), i.e., the number of complete interviews divided by the number of interviews plus the number of non-interviews (i.e., refusals and break-offs, plus non-contacts and others, plus all cases of unknown eligibility);
2 The response rate type 2 (RR2), where partial interviews are also counted in the numerator of the rate.
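The two rates can be sketched as follows, using invented case counts. The disposition categories follow the description above; both rates share the same denominator and differ only in whether partial interviews enter the numerator.

```python
# Illustrative AAPOR-style response rates RR1 and RR2.
# All case counts below are invented for the example.
complete = 600              # complete interviews
partial = 50                # partial interviews
refusal_breakoff = 200      # refusals and break-offs
non_contact = 100           # non-contacts
other = 20                  # other non-interviews
unknown_eligibility = 30    # cases of unknown eligibility

denominator = (complete + partial + refusal_breakoff
               + non_contact + other + unknown_eligibility)

rr1 = complete / denominator              # only completes in the numerator
rr2 = (complete + partial) / denominator  # partials counted as well

print(rr1, rr2)  # 0.6 0.65
```

Reporting which rate is used matters: with the same fieldwork outcome, RR2 here is five percentage points higher than RR1.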
In web surveys, as discussed in Chapter 5 with regard to the errors that might occur throughout the process, quality assessment also relies on specific indicators that take advantage of the paradata collected by the survey system during the survey process. For example, it is possible to compute the time spent completing the questionnaire.
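A minimal sketch of such a paradata-based indicator: given hypothetical start and submit timestamps logged by the survey system (field names and format are assumptions, not a real system's schema), the completion time is a simple difference.

```python
from datetime import datetime

# Hypothetical paradata record for one completed questionnaire.
paradata = {
    "start": "2016-05-10 14:02:11",
    "submit": "2016-05-10 14:19:41",
}

fmt = "%Y-%m-%d %H:%M:%S"
elapsed = (datetime.strptime(paradata["submit"], fmt)
           - datetime.strptime(paradata["start"], fmt))

print(elapsed.total_seconds() / 60)  # completion time in minutes: 17.5
```

Aggregated over respondents, such durations help flag both speeders (possible careless answering) and unusually slow completions.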
The final step, the End of the survey process, involves writing the conclusions, usually in a final report. A clear and consistent reporting of the methods complements the comments on the substantive research results and is a key component of a methodologically reliable survey. AAPOR (2016) stresses this idea: a standardized set of outcome details and outcome rates has been proposed, and these should be made available as part of the survey documentation.
3.3 Application
In the construction of the panel termed PAADEL (Agro-Food and Demographic Panel in Lombardy), the steps described in the flowchart (Figures 3.1 and 3.2) were followed. This section examines some problems that arose at the different steps and the interrelationships between errors. In particular, the focus is on the choice of the mode and on the use of adaptive design during the data collection step.
Skipping the first steps (Deciding the objective, Metadata description, and Designing the survey), which relate to the survey's subject matter and are beyond the present study's scope, the focus shifts to the mode choice. The panel's objective was a probability-based web data collection. The target population was the inhabitants of the Lombardy region. Internet penetration in the population was not high, and an exhaustive list of e-mail addresses did not exist; thus, a proxy for the population list was needed. The available proxy list contained postal addresses and phone numbers. Therefore, a probability-based survey was not possible without a preliminary step to select a probability-based sample; the solution was to adopt a mixed-mode approach to cover the part of the population not on the Internet. A contact mode then had to be decided, considering that postal addresses and telephone numbers were available. The sampled units were contacted through a mixed contact mode, partially by telephone and partially by mail. The survey mode was mixed as well: only part of the sampled units had an e-mail address and accepted a web survey, while for the others the interview was conducted by telephone or mail.
Table 3.2 Responses to the survey by mode and percentage composition
| Mode | Response rate (%) | Mode | Composition (%) |
|---|---|---|---|
| Web | 68.5 | Web | 45.5 |
| Phone | 71.2 | Phone | 30.2 |
| Mail | 53.2 | Mail | 24.3 |
| Total | 63.5 | Total | 100.0 |
The data collected by mode are shown in Table 3.2. The response rate was satisfactory, and the web mode turned out to be the most important component of the mixed-mode approach.
To address the web component of the mixed‐mode approach, the steps in Figure