Dean Allemang

Semantic Web for the Working Ontologist



Figure 3.7 Graphical version of the tabular data from Table 3.12.

      They don’t want to say just that Shakespeare wrote Hamlet; they want to qualify this statement and say that Shakespeare wrote Hamlet in 1604 or that Wikipedia states that Shakespeare wrote Hamlet in 1604. In general, these are cases in which it is, or at least seems, desirable to make a statement about another statement. This process is called reification. Reification is not a problem specific to Semantic Web modeling; the same issue arises in other data modeling contexts like relational databases and object systems. In fact, one approach to reification in the Semantic Web is to simply borrow the standard solution that is commonly used in relational database schemas, using the conventional mapping from relational tables to RDF given in the preceding challenge. In a relational database, additional information about a statement can be recorded simply by adding more columns to the table. So the statement Shakespeare wrote Hamlet is expressed (as in Table 3.1) in a single row of a table, where there is a column for the author of a work and another column for its title. Any further information about this event is recorded in another column (again, just as in Table 3.1). When this is converted to RDF according to the example in Challenge 1, the row is represented by a number of triples, one triple per column in the database. The subject of all of these triples is the same: a single resource that corresponds to the row in the table.

      An example of this can be seen in Table 3.13, where several triples share the same subject, with one triple for each column in the table. This approach to reification has a strong pedigree in relational modeling, and it has worked well for a wide range of modeling applications. It can be applied in RDF even when the data have not been imported from tabular form. That is, the statement Shakespeare wrote Hamlet in 1601 (disagreeing with the statement in Table 3.2) can be expressed with these three triples:

Subject   Predicate             Object
bio:n1    bio:author            lit:Shakespeare
bio:n1    bio:title             “Hamlet”
bio:n1    bio:publicationDate   1601
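      The mapping from a table row to triples can be sketched in a few lines of Python. This is only an illustration, not a standard conversion tool: the helper name row_to_triples and the choice to store triples as plain tuples are assumptions made for this sketch, while the bio: and lit: names come from the table above.

```python
# Hypothetical helper (not part of any RDF library): turn one table row
# into triples that all share a single subject, one triple per column.
def row_to_triples(row_id, row, prefix="bio:"):
    """Return a list of (subject, predicate, object) tuples for one row."""
    subject = f"{prefix}{row_id}"
    return [(subject, f"{prefix}{column}", value) for column, value in row.items()]


# One row of the table: column name -> cell value.
row = {
    "author": "lit:Shakespeare",
    "title": "Hamlet",
    "publicationDate": 1601,
}

triples = row_to_triples("n1", row)
for triple in triples:
    print(triple)
```

      Every column contributes exactly one triple, and the row identifier n1 plays the role of the shared subject.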

      This approach works well for examples like Shakespeare wrote Hamlet in 1601, in which we want to express more information about some event or statement. It doesn’t work so well in cases like Wikipedia says Shakespeare wrote Hamlet, in which we are expressing information about the statement itself, Shakespeare wrote Hamlet. This kind of metadata about statements often takes the form of provenance (information about the source of a statement, as in this example), likelihood (expressed in some quantitative form like probability, such as It is 90 percent probable that Shakespeare wrote Hamlet), context (specific information about a project setting in which a statement holds, such as Kenneth Branagh played Hamlet in the movie), or time frame (Hamlet plays on Broadway January 11 through March 12). In such cases, it is useful to explicitly make a statement about a statement. This process, called explicit reification, is supported by the W3C RDF standard with three resources called rdf:subject, rdf:predicate, and rdf:object.

      Let’s take the example of Wikipedia says Shakespeare wrote Hamlet. Using the RDF standard, we can refer to a triple as follows:

Subject   Predicate       Object
q:n1      rdf:subject     lit:Shakespeare
q:n1      rdf:predicate   lit:wrote
q:n1      rdf:object      lit:Hamlet

      Then we can express the relation of Wikipedia to this statement as follows:

Subject         Predicate   Object
web:Wikipedia   m:says      q:n1

      Notice that just because we have asserted the reification triples about q:n1, it is not necessarily the case that we have also asserted the triple itself:

Subject           Predicate   Object
lit:Shakespeare   lit:wrote   lit:Hamlet

      This is as it should be; after all, if an application does not trust information from Wikipedia, then it should not behave as though that triple has been asserted. An application that does trust Wikipedia will want to behave as though it had.
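      The difference between describing a triple and asserting it can be made concrete with a small Python sketch. It assumes a graph is stored as a set of (subject, predicate, object) tuples; the functions reified_statement and statements_from, and the idea of a set of trusted sources, are hypothetical names introduced here for illustration and are not part of the RDF standard.

```python
# A graph is modeled here as a set of (subject, predicate, object) tuples.
graph = {
    ("q:n1", "rdf:subject", "lit:Shakespeare"),
    ("q:n1", "rdf:predicate", "lit:wrote"),
    ("q:n1", "rdf:object", "lit:Hamlet"),
    ("web:Wikipedia", "m:says", "q:n1"),
}


def reified_statement(graph, node):
    """Rebuild the triple that a reification node describes, or None if incomplete."""
    parts = {p: o for s, p, o in graph if s == node}
    try:
        return (parts["rdf:subject"], parts["rdf:predicate"], parts["rdf:object"])
    except KeyError:
        return None


def statements_from(graph, source, trusted_sources):
    """Hypothetical policy: accept what a source says only if the source is trusted."""
    if source not in trusted_sources:
        return set()
    nodes = [o for s, p, o in graph if s == source and p == "m:says"]
    found = {reified_statement(graph, node) for node in nodes}
    found.discard(None)
    return found


statement = ("lit:Shakespeare", "lit:wrote", "lit:Hamlet")

# The reification triples describe the statement without asserting it.
asserted_directly = statement in graph

# An application decides for itself whether to act as though it were asserted.
accepted_if_trusting = statements_from(graph, "web:Wikipedia", {"web:Wikipedia"})
accepted_if_not_trusting = statements_from(graph, "web:Wikipedia", set())
```

      Here asserted_directly is False, because the graph holds only the description of the triple. Whether the application then treats Shakespeare wrote Hamlet as a fact depends entirely on the trust policy it applies.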

      So far, we have seen how a collection of triples can be considered as a graph, either for display purposes (as in many of the figures in this chapter), or as we will see in Chapter 6, for querying. But we haven’t been very specific about what exactly we mean by a graph.

      Informally, a graph is a diagram with nodes and edges. In RDF, this corresponds directly to a set of triples. When the same URI is used in many triples (as in, for example, Figure 3.7), the drawing of the graph is highly connected.

      From a more formal point of view in RDF, a graph is simply a set of triples. The triples might be highly connected or not connected at all; either way, a graph is just a set of triples.
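      The set-of-triples view is easy to mimic in Python. The sketch below is an assumption-laden illustration: it uses a built-in set of tuples to stand in for a graph, and the resources lit:Macbeth, geo:Scotland, geo:partOf, and geo:UK are made up for the example.

```python
# A graph as a plain set of (subject, predicate, object) tuples.
graph = set()
graph.add(("lit:Shakespeare", "lit:wrote", "lit:Hamlet"))
graph.add(("lit:Shakespeare", "lit:wrote", "lit:Macbeth"))
graph.add(("lit:Shakespeare", "lit:wrote", "lit:Hamlet"))  # duplicate: the set is unchanged
graph.add(("geo:Scotland", "geo:partOf", "geo:UK"))        # shares no node with the others


def nodes(graph):
    """Collect every resource that appears as a subject or an object."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


print(len(graph))           # number of distinct triples
print(sorted(nodes(graph)))
```

      Adding the same triple twice has no effect, and the last triple belongs to the graph even though it is not connected to anything else in it.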

      When we manage data sets, we might just refer to all the triples in our data, as we have done with all the examples in this chapter so far. For most situations, this is fine. But we might want to single out a set of triples (i.e., a graph) and give that a name. Since this is the Web, that name will be in the form of a URI. The RDF standard provides a means for doing this, called the named graph.

      The idea of a named graph is quite simple; we refer to a set of triples with a name, which itself is a URI.
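      One simple way to picture this is a lookup table from graph names to sets of triples. The following Python sketch makes that assumption explicit; the example.org URIs, the add_triple and all_triples helpers, and the dictionary layout are hypothetical choices for illustration, not how any particular RDF store is implemented.

```python
from collections import defaultdict

# Graph name (a URI string) -> the set of triples in that named graph.
dataset = defaultdict(set)


def add_triple(dataset, graph_name, triple):
    """Put a (subject, predicate, object) tuple into the graph with the given name."""
    dataset[graph_name].add(triple)


def all_triples(dataset):
    """Ignore the graph names and return every triple in the data set."""
    return set().union(*dataset.values())


# Hypothetical graph names; any URI would do.
WIKIPEDIA_GRAPH = "http://example.org/graphs/wikipedia"
CATALOG_GRAPH = "http://example.org/graphs/catalog"

add_triple(dataset, WIKIPEDIA_GRAPH, ("lit:Shakespeare", "lit:wrote", "lit:Hamlet"))
add_triple(dataset, CATALOG_GRAPH, ("bio:n1", "bio:author", "lit:Shakespeare"))
add_triple(dataset, CATALOG_GRAPH, ("bio:n1", "bio:title", "Hamlet"))

print(dataset[CATALOG_GRAPH])     # only the triples in one named graph
print(len(all_triples(dataset)))  # all triples, regardless of graph
```

      The name lets us pick out one set of triples when we need it, while the union of all the graphs still behaves like the single undifferentiated collection of triples used in the earlier examples.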

      Why would we want to name a graph? There are a few basic use cases:

      • One file, one graph. So far, we have seen examples of how we can extract RDF data from spreadsheets. We can extract RDF data from other sources as well, and indeed, we can create data natively as RDF. In the next section, we’ll see how to write RDF data to a plain text file. When we load this data into an RDF data store, we might want to keep data from different sources separate. A convenient way to do this is to put all the data from one source into a single named graph. The name of the graph (as a URI) can even indicate where we can find that source.

      • Reification. In Section 3.6, we saw the need for higher-order relationships, in which we want to make statements about statements. Named graphs provide another way to accomplish this. We put a set of triples about which we want to make some statement into a named graph, and make the statement about that graph.

      • Context. Sometimes when we have a set of triples, we would like to consider them in some context; for example, earlier we considered the fact Kenneth Branagh