separately for each item across the respondents. It is then possible, for example, to compare profiles of two or more organizations in a ‘snake’ diagram as in Figure 1.5.
Figure 1.4 Profiling: a semantic differential
Figure 1.5 A snake diagram
An alternative to profiling is to locate each case as a single point in multidimensional space. What is known as multidimensional scaling (often referred to as MDS for short, or as perceptual mapping) refers in fact to a series of techniques that help the researcher to identify key characteristics underlying respondents’ evaluations. Such techniques attempt to deduce the underlying dimensions from a series of similarity or preference judgements of objects, products, services, organizations, and so on made by respondents. MDS is explained in more detail in Chapter 6.
Table 1.1 The variables used in the alcohol marketing study
Key points and wider issues
Properties are the characteristics of cases that the researcher has chosen to observe or measure and then record. They may be demographic, behavioural or cognitive and they may play one or more roles in a research project as descriptors, as causes or as effects. Properties are, to different degrees, researcher constructs: they may be generated directly or indirectly, derived from two or more other properties, or treated multidimensionally. Properties do not always constitute ‘reality’ as such, but rather tend to reflect researcher attempts to locate traces of complex systems such as the degree of bureaucracy within an organization.
Most researchers will use the term ‘variable’ rather than ‘property’, to refer to characteristics of cases. However, the next section makes a distinction between properties as variables and properties as set memberships. The distinction is crucial to an understanding of the difference between variable-based and case-based approaches to data analysis. In the alcohol marketing study, the properties were originally used as variables, which are listed in Table 1.1. These are a mixture of demographic, behavioural and cognitive variables. Some are used purely as descriptors while others may in addition play a role as dependent or independent variables. Drink status (Have they ever had a proper alcoholic drink?) is an interesting variable because it could be seen either as an outcome (e.g. What factors are associated with whether or not pupils have already had a proper alcoholic drink?), or it could be interpreted as an independent variable (How does it impact on the likelihood that respondents think they will have an alcoholic drink in the next year?). Some of the variables are measured directly (like watched television in the last seven days), some indirectly (like social class) and some are derived (total importance of brands).
Values
Values are what researchers actually record as a result of the process of assessing properties. Such records may relate either to variables or to set memberships. The values recorded for variables arise from one or more of the activities of classifying, ordering, ranking, counting or calibrating the characteristics of cases. All variables at a minimum classify cases into one of two values, but there may be many or even an infinite number of possible values. The range of values deployed to record a case property may either consist of a defined number of categories that are mutually exclusive – they do not overlap – and are exhaustive of all the possibilities, or be the result of applying a metric. The former may be called ‘categorical’ or ‘non-metric’ variables, the simplest of which are binary variables. These consist of a record of the presence or absence of a property. Thus an organization may be commercial or non-commercial; a nation-state may be democratic or non-democratic; an individual may be married or not married. Some variables are naturally binary, for example a product is either on the shelf in a supermarket or it is not. Some variables with a limited number of possible values may be readily converted into binary sets, for example employment status. If the possibilities are ‘employed full time’, ‘employed part time’ or ‘unemployed’ then these can become either ‘employed full time/not employed full time’ or ‘employed, either full or part time/unemployed’. However, there are nearly always complicating issues. Even the apparently simple distinction between married and not married becomes complicated by decisions about whether ‘not married’ includes divorced, separated, widowed or in a civil partnership. Despite these issues, much of our thinking is binary in nature. We often think that people are ‘right’ or ‘wrong’, that propositions are ‘true’ or ‘false’, that countries are ‘democratic’ or ‘undemocratic’. Binary logic, furthermore, has been at the forefront of developments in electronic circuits, computer science and computer engineering, which are all based on binary language.
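The conversion of employment status into binary sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three category labels are those given in the text, while the function names and the small list of cases are hypothetical.

```python
# Category labels as given in the text; everything else is illustrative.
FULL_TIME = "employed full time"
PART_TIME = "employed part time"
UNEMPLOYED = "unemployed"


def is_employed_full_time(status: str) -> bool:
    """Binary set 1: employed full time / not employed full time."""
    return status == FULL_TIME


def is_employed(status: str) -> bool:
    """Binary set 2: employed (full or part time) / unemployed."""
    return status in {FULL_TIME, PART_TIME}


# Hypothetical cases, invented for this sketch.
statuses = [FULL_TIME, PART_TIME, UNEMPLOYED, PART_TIME]

full_time_flags = [is_employed_full_time(s) for s in statuses]
employed_flags = [is_employed(s) for s in statuses]

print(full_time_flags)
print(employed_flags)
```

The same three-value variable yields two different binary variables, depending on where the researcher chooses to draw the line. A part-time employee falls on the ‘absent’ side of the first and the ‘present’ side of the second.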
Where cases are classified not into the presence or absence of a characteristic, but into contrasting groups, then we have a nominal variable. Dichotomies, for example, consist of two categories that represent two contrasting groups like black/white, male/female, or they may be polar opposites like rich/poor, fast/slow, instrumental/expressive. Such variables may be better treated as two binary variables, ‘rich/not rich’ and ‘poor/not poor’, or better still as fuzzy sets with degrees of membership of these categories. Fuzzy sets are explained below. Strictly speaking, dichotomies are not binary in the sense that white, for example, is not the absence of black, and female is not the absence of male. Cases are either type A or type B rather than A or not A. If there really are only two categories, as with gender, then it does no harm to treat such variables as if they are binary. However, if a yes/no answer in a questionnaire is treated as binary, then ‘no’ really means ‘not yes’ and may, for example, include those who refused to answer, those who did not have an answer, or those for whom the question was inapplicable. In a dichotomy, ‘no’ means that the answer ‘no’ was given and these other possibilities are excluded.
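The difference between reading a yes/no question as binary and reading it as a dichotomy can be made concrete with a short Python sketch. The response labels and the six responses are invented for illustration and do not come from any study discussed here.

```python
# Hypothetical questionnaire responses, invented for this sketch.
responses = ["yes", "no", "refused", "yes", "not applicable", "no"]

# Binary reading: 'yes' versus 'not yes'. Refusals and inapplicable
# cases fall into the 'not yes' side along with explicit 'no' answers.
binary = [r == "yes" for r in responses]
not_yes_count = binary.count(False)

# Dichotomy reading: only explicit 'yes' and 'no' answers are cases;
# every other response is excluded.
dichotomy = [r for r in responses if r in ("yes", "no")]
explicit_no_count = dichotomy.count("no")

print(not_yes_count, explicit_no_count)
```

With these six responses the binary reading records four cases as ‘not yes’, whereas the dichotomy records only two cases as ‘no’ and works from a base of four rather than six.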
Nominal variables are sometimes converted into binary variables so that, for example, the dichotomy A/B becomes A/not A and B/not B. A trichotomy becomes A/not A, B/not B and C/not C. Statisticians sometimes call these dummy variables and they are useful because they have particular properties and can be used in some statistical procedures where nominal variables are inappropriate.
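As a minimal sketch of this conversion, the following Python function turns a nominal variable into one 0/1 indicator per category. The function name `dummy_code` and the four example cases are hypothetical; statistical packages provide their own routines for the same task.

```python
def dummy_code(values, categories):
    """Return a dict mapping each category to a list of 0/1 indicators.

    A 1 means the case belongs to that category (A); a 0 means it does
    not (not A).
    """
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical trichotomy with categories A, B and C.
cases = ["A", "C", "B", "A"]
dummies = dummy_code(cases, ["A", "B", "C"])

for category, indicators in dummies.items():
    print(category, indicators)
```

Each case scores 1 on exactly one of the three dummy variables, which reflects the requirement that the original categories are mutually exclusive and exhaustive.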
A key feature of nominal variables is that where there are three or more categories, the order in which the values appear in a table makes no difference to any statistical calculations that may appropriately be applied to the data. The values do need to be listed in some sequence (which might, for example, be alphabetical), but this is not a graduated series from ‘high’ to ‘low’ or ‘large’ to ‘small’. In some variables, however, the categories are not only exhaustive and mutually exclusive but are also arranged in relationships of greater than or less than, although there is no metric that will indicate by how much. Thus product usage can be classified into ‘Heavy’, ‘Medium’, ‘Light’ and ‘Non-user’; there is an implied order, but no measure of the actual usage involved. The various social classes, social grades or socio-economic groups used in various European countries are good examples of such ordered category variables. The individual items used to generate summated rating scales such as the Likert scale, which were explained in the previous section, are also common examples of ordered categories.
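The product usage example can be sketched in Python to show what an ordered category variable does and does not allow. The category labels come from the text; the rank numbers, the helper function and the five respondents are assumptions made for illustration.

```python
# Categories from the text, listed from lowest to highest usage.
USAGE_ORDER = ["Non-user", "Light", "Medium", "Heavy"]

# The rank records position in the order only. The gaps between ranks
# carry no meaning, because there is no metric for actual usage.
RANK = {label: position for position, label in enumerate(USAGE_ORDER)}


def uses_more(a: str, b: str) -> bool:
    """True if category a implies greater usage than category b."""
    return RANK[a] > RANK[b]


# Hypothetical respondents, invented for this sketch.
respondents = ["Medium", "Heavy", "Non-user", "Light", "Medium"]

# Ordering the cases is legitimate, and so is picking the middle case.
sorted_usage = sorted(respondents, key=lambda label: RANK[label])
median_category = sorted_usage[len(sorted_usage) // 2]

print(sorted_usage)
print(median_category)
```

Comparisons such as `uses_more("Heavy", "Light")` are meaningful, but subtracting one rank from another is not, since nothing says that the step from ‘Light’ to ‘Medium’ equals the step from ‘Medium’ to ‘Heavy’.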
In ordered category variables there is usually a limited number of categories into which researchers may map a large number of cases. So, 200 students might be mapped onto five degrees of the extent to which they say they have been bullied at school. However, in other situations it may be possible to rank-order each respondent.