Dean Allemang

Semantic Web for the Working Ontologist



the division, and so on. We want to represent these data in RDF.

      Since each row represents a distinct entity, each row will have a distinct URI. Fortunately, the need for unique identifiers is just as present in the database as it is in the Semantic Web, so there is a (locally) unique identifier available: the primary table key, in this case the column called ID. For the Semantic Web, we need a globally unique identifier. The simplest way to form such an identifier is by having a single URI for the database itself (perhaps even a URL if the database is on the Web). We use that URI as the namespace for all the identifiers in the database. We will discuss the minting of URIs in more detail in Chapter 5. Since this is a database for a manufacturing company, let’s call that namespace mfg:.

      Then we can create an identifier for each row by concatenating the table name “Product” with the row's unique key and expressing this identifier in the mfg: namespace, resulting in identifiers mfg:Product1, mfg:Product2, and so on.
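
      The identifier scheme just described can be sketched in a few lines of Python. This is only an illustration: the text names the prefix mfg: but not the URI behind it, so the namespace URI below is a placeholder, and the helper names row_id and expand are hypothetical.

```python
# Placeholder namespace URI (an assumption; the text only gives the prefix "mfg:").
MFG = "http://example.org/mfg#"


def row_id(table: str, key) -> str:
    """Return the CURIE for a row: the table name followed by its primary key."""
    return f"mfg:{table}{key}"


def expand(curie: str, namespace: str = MFG) -> str:
    """Expand an mfg: CURIE into a full URI by substituting the namespace."""
    prefix, local = curie.split(":", 1)
    if prefix != "mfg":
        raise ValueError(f"unknown prefix: {prefix}")
    return namespace + local


print(row_id("Product", 1))          # mfg:Product1
print(expand(row_id("Product", 2)))  # http://example.org/mfg#Product2
```

      The primary key is only unique inside one table; prefixing it with the table name and placing it in the database's namespace is what makes the resulting identifier globally unique.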

      Each row in the table says several things about that item: its model number, its division, and so on. To represent this in RDF, each of these will be a property that describes the Products. But just as is the case for the unique identifiers for the rows, we need globally unique identifiers for these properties. We can use the same namespace as we did for the individuals, but since two tables could have the same column name (but they aren’t the same properties!), we need to combine the table name and the column name. This results in properties like mfg:Product_ModelNo, mfg:Product_Division, and so on.
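
      A minimal sketch of the property-naming convention follows. The helper name property_id is hypothetical, the replacement of spaces by underscores is an assumption (it is consistent with names such as mfg:Product_Product_Line), and the Customer table is invented purely to show why the table name is needed.

```python
def property_id(table: str, column: str) -> str:
    """Return the CURIE for a column: table name, underscore, column name."""
    # Assumption: spaces in a column header become underscores.
    return f"mfg:{table}_{column.replace(' ', '_')}"


# The same column name in two tables yields two distinct properties.
print(property_id("Product", "ID"))            # mfg:Product_ID
print(property_id("Customer", "ID"))           # mfg:Customer_ID (hypothetical table)
print(property_id("Product", "Product Line"))  # mfg:Product_Product_Line
```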


      With these conventions in place, we can now express all the information in the table as triples. There will be one triple per cell in the table—that is, for n rows and c columns, there will be n × c triples. The data shown in Table 3.12 have 7 columns and 9 rows, so there are 63 triples, as shown in Table 3.13.
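
      The one-triple-per-cell rule can be illustrated with a short Python sketch. The function name table_to_triples and the dictionary layout of the rows are assumptions made for the example; the two sample rows use the values listed for mfg:Product1 and mfg:Product2.

```python
def table_to_triples(table, columns, key_column, rows):
    """Emit one (subject, predicate, object) triple per cell of the table."""
    triples = []
    for row in rows:
        subject = f"mfg:{table}{row[key_column]}"
        for column in columns:
            triples.append((subject, f"mfg:{table}_{column}", row[column]))
    return triples


columns = ["ID", "ModelNo", "Division", "Product_Line",
           "Manufacture_Location", "SKU", "Available"]
rows = [
    {"ID": 1, "ModelNo": "ZX-3", "Division": "Manufacturing support",
     "Product_Line": "Paper machine", "Manufacture_Location": "Sacramento",
     "SKU": "FB3524", "Available": 23},
    {"ID": 2, "ModelNo": "ZX-3P", "Division": "Manufacturing support",
     "Product_Line": "Paper machine", "Manufacture_Location": "Sacramento",
     "SKU": "KD5243", "Available": 4},
]

triples = table_to_triples("Product", columns, "ID", rows)
print(len(triples))  # 14, i.e. 2 rows x 7 columns
```

      With all 9 rows supplied, the same loop would produce 9 × 7 = 63 triples.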

      The triples in the table are a bit different from the triples we have seen so far. Although the subject and predicate of these triples are RDF resources (complete with CURIE namespaces!), the objects are not resources but literal data, that is, strings, integers, and so forth. This should come as no surprise, since, after all, RDF is a data representation system. RDF borrows its literal data types from XML Schema as possible values for the object of a triple; in this case, all the data are strings or integers.
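
      To make the distinction between resources and literals concrete, here is a hedged sketch that renders a Python value as a typed literal in a Turtle-like notation. The function name literal is hypothetical, and only the two types that occur in this table (strings and integers) are handled; xsd:integer and xsd:string are the corresponding XML Schema datatypes.

```python
def literal(value) -> str:
    """Render a string or integer as an RDF literal in Turtle-like syntax."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, int):
        return f'"{value}"^^xsd:integer'
    if isinstance(value, str):
        # Escape backslashes and double quotes inside the lexical form.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"^^xsd:string'
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


print("mfg:Product1", "mfg:Product_Available", literal(23))
# mfg:Product1 mfg:Product_Available "23"^^xsd:integer
print("mfg:Product1", "mfg:Product_ModelNo", literal("ZX-3"))
# mfg:Product1 mfg:Product_ModelNo "ZX-3"^^xsd:string
```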

      The usual interpretation of a table is that each row in the table corresponds to one individual and that the type of these individuals corresponds to the name of the table. In Table 3.12, each row corresponds to a Product. We can represent this in RDF by adding one triple per row that specifies the type of the individual described by each row, as shown in Table 3.14.

Table 3.13
Subject       Predicate                         Object
mfg:Product1  mfg:Product_ID                    1
mfg:Product1  mfg:Product_ModelNo               ZX-3
mfg:Product1  mfg:Product_Division              Manufacturing support
mfg:Product1  mfg:Product_Product_Line          Paper machine
mfg:Product1  mfg:Product_Manufacture_Location  Sacramento
mfg:Product1  mfg:Product_SKU                   FB3524
mfg:Product1  mfg:Product_Available             23
mfg:Product2  mfg:Product_ID                    2
mfg:Product2  mfg:Product_ModelNo               ZX-3P
mfg:Product2  mfg:Product_Division              Manufacturing support
mfg:Product2  mfg:Product_Product_Line          Paper machine
mfg:Product2  mfg:Product_Manufacture_Location  Sacramento
mfg:Product2  mfg:Product_SKU                   KD5243
mfg:Product2  mfg:Product_Available             4

Table 3.14
Subject       Predicate  Object
mfg:Product1  rdf:type   mfg:Product
mfg:Product2  rdf:type   mfg:Product
mfg:Product3  rdf:type   mfg:Product
mfg:Product4  rdf:type   mfg:Product
mfg:Product5  rdf:type   mfg:Product
mfg:Product6  rdf:type   mfg:Product
mfg:Product7  rdf:type   mfg:Product
mfg:Product8  rdf:type   mfg:Product
mfg:Product9  rdf:type   mfg:Product
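
      The type triples follow mechanically from the table name and the primary keys. A minimal sketch, assuming the keys run from 1 to 9 as the identifiers mfg:Product1 through mfg:Product9 suggest (the function name type_triples is hypothetical):

```python
def type_triples(table, keys):
    """Emit one rdf:type triple per row, typing each row by its table name."""
    return [(f"mfg:{table}{key}", "rdf:type", f"mfg:{table}") for key in keys]


typed = type_triples("Product", range(1, 10))
for subject, predicate, obj in typed:
    print(subject, predicate, obj)
```

      Unlike the cell triples, the object here is a resource (the class mfg:Product) rather than a literal.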

      The full complement of triples from the translation of the information in Table 3.12 is shown in Figure 3.7. The type triples (i.e., those in which the predicate is rdf:type and the object is the class mfg:Product) are shown as links in the graph; triples in which the object is a literal datum are shown (for the sake of compactness in the figure) within a box labeled by their common subject.

      It is not unusual for someone who is building a model in RDF for the first time to feel a bit limited by the simple subject/predicate/object form of the RDF triple.

Image

      Figure