Curt Hinrichs

JMP Essentials



are treated the same way.

      Table 2.2 Data Types Appropriate for Modeling Types

Modeling Type    Data Type: Numeric    Data Type: Character
Continuous       Yes                   No
Nominal          Yes                   Yes
Ordinal          Yes                   Yes
Note
Numeric data is right-justified in the data table, whereas character data is left-justified. This can be useful to check whether data contains errors.

      In our example, Big Class contains five variables (or columns) representing each of these modeling types. (See Figure 2.24.) Let’s briefly explain why they are classified by their data and modeling types:

      ● Name is nominal because it is a character data type and the student’s name is arbitrary.

      ● Age is ordinal because the values are rounded down and we want to retain the six ordered age groups (12 to 17) in our analysis.

      ● Sex is nominal because its data type is character (M or F) and it has no order.

      ● Height and weight are continuous because they are both numeric and represent a measurement.

Note
Age could also be considered continuous because the values are numeric, but this would treat age differently and yield different results.
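
      To see why the choice matters, here is a small Python sketch (outside JMP, and not JMP's own implementation). It assumes that a continuous column is summarized with a statistic such as the mean, while an ordinal column is summarized with counts for each ordered level. The ages below are made-up values, not the Big Class data.

```python
from collections import Counter
from statistics import mean

# Made-up ages for illustration only (not taken from Big Class).
ages = [12, 12, 13, 14, 14, 14, 15, 16, 17]

# Treated as continuous: one numeric summary for the whole column.
continuous_summary = mean(ages)

# Treated as ordinal: a count for each age group, kept in order.
ordinal_summary = sorted(Counter(ages).items())

print(continuous_summary)
print(ordinal_summary)
```

      The same column produces a single average in one case and six ordered groups in the other, which is the sense in which the results differ.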

      Figure 2.24 Understanding Modeling Types in the Data Table

Note
Let’s briefly review some of the more specialized data and modeling types and what they are used for.

● Row State is a data type that enables you to store and manage information about a row of data. (See Section 2.5.)

● Expression is a data type that enables you to store images or matrices in a column.

● Multiple Response is a modeling type that is commonly used in surveys where a respondent may give more than one answer.

● Unstructured Text is a modeling type that is used for documents such as customer reviews, wine tasting notes, or an entire book. This modeling type is used in text mining with JMP’s Text Explorer.

● None is a modeling type that tells JMP not to use that column in an analysis. This might be used for a column that identifies each row of data, for example, a student ID number or a patient number.

For more information, select Help > JMP Documentation Library > Using JMP > Ch. 5, About Modeling Types.

      Changing the Modeling Type

      When you import data, JMP assigns one of two modeling types by default, based on whether the data is numeric or character. Numeric data becomes continuous and character data becomes nominal. Sometimes you might want to change the default modeling type of your data to generate results that are more meaningful.
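
      The default rule, together with the combinations allowed in Table 2.2, can be sketched in Python. This is only an illustration of the logic described above, not how JMP is implemented; the names default_modeling_type, ALLOWED, and can_assign are hypothetical.

```python
from numbers import Number


def default_modeling_type(values):
    """Numeric columns map to 'Continuous'; anything else maps to 'Nominal'."""
    present = [v for v in values if v is not None]  # ignore missing values
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in present
    )
    return "Continuous" if is_numeric else "Nominal"


# Data types permitted for each modeling type, as listed in Table 2.2.
ALLOWED = {
    "Continuous": {"Numeric"},
    "Nominal": {"Numeric", "Character"},
    "Ordinal": {"Numeric", "Character"},
}


def can_assign(modeling_type, data_type):
    """Return True if the modeling type is valid for the data type."""
    return data_type in ALLOWED[modeling_type]


print(default_modeling_type([12, 13, 14]))   # Continuous
print(default_modeling_type(["M", "F"]))     # Nominal
print(can_assign("Ordinal", "Numeric"))      # True
print(can_assign("Continuous", "Character")) # False
```

      The last two lines mirror the situation with age: a numeric column can be switched to ordinal, but a character column cannot be made continuous.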

      For example, if we imported the Big Class data from Excel, age as numeric data would be imported as a continuous column. We might want to change that to ordinal. Changing the modeling type is simple in JMP. Click the column’s corresponding icon in the Columns panel in the data table and select the correct type. (See Figure 2.25.)

      Figure 2.25 Changing the Modeling Type


      If the Continuous option is grayed out, the column’s data type is character. To change the data type, double-click the column heading and change the data type to numeric. (See Figure 2.26.) In this window, you can also change the modeling type along with a host of other formatting options, which are described in the next section.

      Figure 2.26 Changing the Data Type


      For more information, select Help > Books > Using JMP > Chapter 5, Set Column Properties > About Data and Modeling Types.

      Sometimes data is not in the best shape or in the right form when it is imported. Fortunately, JMP has extensive column formatting abilities. This section focuses on the most common features, including:

      ● Cleaning up your data format, such as decimal places, dates, times, and currency. We will use the Column Info window to accomplish these tasks.

      ● Introducing the Formula Editor, which enables you to create new columns from old ones, add IF statements, and transform data using basic or more advanced functions. We will introduce a basic example in this section. For more information, select Help > JMP Documentation Library.

      ● Learning to use the RECODE command, which is a handy way to merge similar categorical responses into a single category. For example, if you have Woman, Female, and Girl as responses, you can merge these into a single response: Female.
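
      The idea behind recoding can be sketched in Python as a simple lookup. This is an illustration of the concept only, not JMP’s Recode feature; RECODE_MAP and recode are hypothetical names, and the responses are made up.

```python
# Each original response maps to the category it should be merged into.
RECODE_MAP = {"Woman": "Female", "Girl": "Female", "Female": "Female"}


def recode(values, mapping):
    """Replace each value found in mapping; leave other values unchanged."""
    return [mapping.get(v, v) for v in values]


responses = ["Woman", "Female", "Girl", "Male"]  # made-up survey answers
print(recode(responses, RECODE_MAP))
# ['Female', 'Female', 'Female', 'Male']
```

      Values that are not listed in the mapping, such as Male here, pass through untouched.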

      Example 2.2 Movie Rentals

      We will use the Movies.jmp data table to illustrate the concepts in this section. This data table consists of the 277 top-grossing movies released between 1937 and 2003. The columns are:

      ● Movie: name of movie

      ● Type: genre/category of movie (for example, comedy, family)

      ● Rating: US movie rating system (for example, general audience [G], adult [R])

      ● Year: year of movie release (for example, 1937)

      ● Domestic $: US domestic revenue in $ earned by the movie in that year

      ● Worldwide $: Worldwide revenue in $ earned by the movie in that year

      ● Director: director of movie

      You can access this data table in the Sample Data folder that is installed with JMP by selecting File > Open > C: > Program Files > SAS > JMP > 15 > Samples > Data > Movies.

      Getting your data into a standard format is done through the Column Info window, which is accessed from the Cols menu. Options to format your data are driven by the data and modeling types specified for that column of data. You can change these types, if necessary, to meet the requirements of your analysis. Recall that changing these types affects the graphs or statistics that you can generate from that column. (See the previous section.) Let’s begin by opening the Movies.jmp data table:

      1. Open the Movies.jmp data table.

      2. Select the Domestic $ column, and then select Cols > Column Info.

      3. Because Domestic $ is a numeric column, you see the Format drop-down menu (see Figure 2.27), which leads to several options. It is also our starting point for the next items we will discuss. Note that if you select a character column, the Format menu does not appear in the Column Info window.

      Figure 2.27 Column Info Format Menu – Continuous Variables

Note
You can also either double-click on the column name as mentioned in the previous section, or right-click on the column header and select Column Info from the menu.

      Formatting Decimal Places

      To change the number of decimal places displayed in a column of data, do the following:

      1. Click the column of interest. In our example, it is Domestic $.

      2. Select Cols > Column Info. JMP makes a best guess at the format of the data; in our example, Currency was correctly specified. (See Figure 2.27.) You can change this format by selecting another one from the menu.

      3. To the right of the Format menu are two boxes, Width and Dec. Width is the number of characters displayed in the column, and Dec is the number of decimal places shown to the right of the decimal point. In our example, type “0” in the Dec box, and then select Apply and OK. (See Figure 2.28.)

      Figure 2.28 Formatting Decimal Places
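
      The effect of the Dec setting can be sketched in Python. The sketch assumes that the setting changes only how a value is displayed, not the value that is stored; format_currency is a hypothetical helper and the revenue figure is made up, not taken from Movies.jmp.

```python
def format_currency(value, decimals=0):
    """Format a number as US currency with the given number of decimal places."""
    return f"${value:,.{decimals}f}"


revenue = 184925485.75  # made-up value for illustration

print(format_currency(revenue, 2))  # $184,925,485.75
print(format_currency(revenue, 0))  # $184,925,486
print(revenue)                      # the underlying number is unchanged
```

      With zero decimal places the displayed value is rounded to the nearest dollar, while the number held in the variable keeps its cents.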