Dean Allemang

Semantic Web for the Working Ontologist



this example, in the movie (where by the movie we are referring specifically to https://www.imdb.com/title/tt0116477/) represents a context for the assertion Kenneth Branagh played Hamlet.

      As an example of reification with named graphs, let’s return to the statement, Wikipedia says Shakespeare wrote Hamlet. Suppose we start with the single triple stating that fact:

Subject Predicate Object
lit:Shakespeare lit:wrote lit:Hamlet

      Now, let’s add a column to this table to specify which named graph this is in. Furthermore, we’ll just use the URI https://www.wikipedia.org/ for Wikipedia (since that’s the URL for Wikipedia itself). Then we have

Subject Predicate Object Graph
lit:Shakespeare lit:wrote lit:Hamlet https://www.wikipedia.org/

      This is a bit of a degenerate example, since we have a graph that contains a single triple, but there is no reason not to have graphs this small. Of course, there are a lot of other facts that are in the Wikipedia graph. In fact, there is a resource on the Web called DBpedia that does just this: it makes structured data extracted from Wikipedia available as RDF. We describe it in detail in Chapter 5.

      Named graphs are a simple extension to the RDF formalism, and really don’t change any of the basics; RDF still links one named resource to another, where each name is global in scope (i.e., on the Web). Named graphs simply allow us to manage sets of these links, and to name them as well. Sometimes when we are using named graphs, we refer to quads instead of triples; this is because it is possible to represent a triple and its graph as a four-tuple (as shown in the table above). The fourth entry in the quad is usually called the graph (as it is here), but it is sometimes referred to as the context, anticipating a particular use for the named graph.

      So far, we have expressed RDF triples in subject/predicate/object tabular form or as graphs of boxes and arrows. Although these are simple and apparent forms to display triples, they aren’t always the most compact or even the most human-friendly forms for seeing the relations between entities.

      The issue of representing RDF in text doesn’t only arise in books and documents about RDF; it also arises when we want to publish data in RDF on the Web. In response to this need, there are multiple ways of expressing RDF in textual form.

      One might wonder why we have so many different ways to express RDF, and how they differ. It is useful to compare different serializations to different ways of writing the same language; in English and other European languages, the same sentence can be printed or written in cursive script. These don’t look at all alike, and there are good reasons why we might use one instead of the other in any particular situation. But we can copy a message from cursive to print without any loss of content. The same is true of the serializations; we can express the same triples in one serialization or another, depending on taste, expediency, availability of tools, and so on.

       N-Triples

      The simplest form is called N-Triples and corresponds most directly to the raw RDF triples. It refers to resources using their fully unabbreviated URIs. Each URI is written between angle brackets (< and >). Three resources are expressed in subject/predicate/object order, followed by a period (.). For example, if the namespace mfg corresponds to http://www.WorkingOntologist.org/Examples/Chapter3/Manufacture#, then the first triple from Table 3.14 is written in N-Triples as follows:

Image

      It is difficult to print N-Triples on a page in a book: the serialization does not allow line breaks within a triple, even though we had to break the triple above to fit it on the page. An actual N-Triples file has the whole triple on a single line. The advantage of N-Triples is that it is easy to read from a file (parse) and to write to a file, which suits importing and exporting.
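
      As a rough illustration of the form just described, the statement that Product1 is a Product might be written in N-Triples as shown below. This is a sketch, not the book's own listing: the namespace is the one given in the text, while the choice of this particular triple and the use of Product1 and Product as local names in that namespace are assumptions. The triple occupies one long line, and the file is also valid Turtle.

```turtle
# One triple per line: <subject> <predicate> <object> followed by a period.
<http://www.WorkingOntologist.org/Examples/Chapter3/Manufacture#Product1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.WorkingOntologist.org/Examples/Chapter3/Manufacture#Product> .
```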

       Turtle/N3

      In this book, we use a more compact serialization of RDF called Turtle, which is itself a subset of a syntax called N3. Turtle combines the apparent display of triples from N-Triples with the terseness of CURIEs. We will introduce Turtle in this section and describe just the subset required for the current examples. We will describe more of the language as needed for later examples. For a full description of Turtle, see the W3C Recommendation [Carothers and Prud’hommeaux 2014].

      Since Turtle uses CURIEs, there must be a binding between the (local) CURIEs and the (global) URIs. Hence, Turtle begins with a preamble in which these bindings are defined; for example, we can define the CURIEs needed in the Challenge example with the following preamble:

Image
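
      A preamble of this kind might look like the following sketch. The mfg namespace is the one given earlier in the text and rdf is the standard RDF namespace; which prefixes the Challenge example actually declares is an assumption here.

```turtle
# Bind each local prefix to its global namespace URI.
# Each @prefix directive ends with a period.
@prefix mfg: <http://www.WorkingOntologist.org/Examples/Chapter3/Manufacture#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
```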

      Once the local CURIEs have been defined, Turtle provides a simple way to express a triple by listing three resources, using CURIE abbreviations, in subject/predicate/object order, followed by a period, such as the following:

Image

      The final period can come directly after the resource for the object, but we often put a space in front of it, to make it stand out visually. This space is optional.
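
      A minimal sketch of a single Turtle triple, with and without the optional space before the period, could read as follows. The triple itself (Product1 is a Product) is chosen for illustration and is assumed rather than taken from the original listing.

```turtle
@prefix mfg: <http://www.WorkingOntologist.org/Examples/Chapter3/Manufacture#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Subject, predicate, object, then a period.
mfg:Product1 rdf:type mfg:Product .

# The same triple with the period directly after the object.
mfg:Product1 rdf:type mfg:Product.
```

      Both statements denote the same triple, so a parser that reads this file ends up with one triple in the graph, not two.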

      It is quite common (especially after importing tabular data) to have several triples that share a common subject. Turtle provides for a compact representation of such data. It begins with the first triple in subject/predicate/object order, as before; but instead of terminating with a period, it uses a semicolon (;) to indicate that another triple with the same subject follows. For that triple, only the predicate and object need to be specified (since it is the same subject from before). The information in Tables 3.13 and 3.14 about Product1 and Product2 appears in Turtle as follows:

Image Image
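
      The semicolon form can be sketched as follows. Because Tables 3.13 and 3.14 are not reproduced here, the predicates mfg:madeBy and mfg:availableIn and the resources mfg:Division1 and mfg:Region1 are hypothetical names invented for this illustration; only the prefix and Product1 come from the text.

```turtle
@prefix mfg: <http://www.WorkingOntologist.org/Examples/Chapter3/Manufacture#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Three triples that share the subject mfg:Product1.
# A semicolon says "same subject, new predicate and object";
# the period closes the whole group.
mfg:Product1 rdf:type mfg:Product ;
    mfg:madeBy mfg:Division1 ;        # hypothetical predicate and object
    mfg:availableIn mfg:Region1 .     # hypothetical predicate and object
```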

      When there are several triples that share both subject and predicate, Turtle provides a compact way to express this as well so that neither the subject nor the predicate needs to be repeated. Turtle uses a comma (,) to separate the objects. So the fact that Shakespeare had three children named Susanna, Judith, and Hamnet can be expressed as follows:

Image

      There are actually three triples represented here—namely:

Image
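
      The comma form and its expansion might be sketched like this. The namespace bound to lit and the predicate lit:hasChild are assumptions made for the illustration; the original listing may use different names.

```turtle
@prefix lit: <http://example.org/lit#> .   # hypothetical namespace

# Compact form: one subject, one predicate, three objects separated by commas.
lit:Shakespeare lit:hasChild lit:Susanna, lit:Judith, lit:Hamnet .

# Equivalent long form: the same three triples written out in full.
lit:Shakespeare lit:hasChild lit:Susanna .
lit:Shakespeare lit:hasChild lit:Judith .
lit:Shakespeare lit:hasChild lit:Hamnet .
```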

      Turtle provides some abbreviations to improve terseness and readability; in this book, we use just a few of these. One of the most widely used abbreviations is to use the word a to mean rdf:type. The motivation for this is that in common speech, we are likely to say, “Product1 is a Product” or “Shakespeare is a playwright” for the triples,

Image

      respectively. Thus we will usually write instead:

Image
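
      As a sketch of this abbreviation, the two statements quoted above can be written both ways. The lit namespace URI and the class name lit:Playwright are assumptions made for this example. Note that a is only an abbreviation in predicate position.

```turtle
@prefix mfg: <http://www.WorkingOntologist.org/Examples/Chapter3/Manufacture#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix lit: <http://example.org/lit#> .   # hypothetical namespace

# Long form, with rdf:type written out.
mfg:Product1 rdf:type mfg:Product .
lit:Shakespeare rdf:type lit:Playwright .

# The same two triples, using a in place of rdf:type.
mfg:Product1 a mfg:Product .
lit:Shakespeare a lit:Playwright .
```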

       RDF/XML

      While Turtle is convenient for human consumption and is more compact for the printed page, many Web infrastructures are accustomed to representing information in HTML or, more generally, XML. For this reason, the