What is the Semantic Web

Four key concepts are covered in this chapter: the World Wide Web, the Semantic Web, ontologies, and thesauri. All four are crucial to understanding the idea of a Web with machine-readable content.

What is the World Wide Web

Few people distinguish between the Internet and the World Wide Web; most use the two terms interchangeably. This is erroneous. Although the World Wide Web and the Internet are related, they are not the same system. Abbreviated as WWW, the World Wide Web is a network of online content that is formatted in HTML and whose resources are identified by URLs and interconnected via hypertext links. The Internet is the underlying network through which the Web is accessed (The Editors of Encyclopaedia Britannica, 2019).

Uniform Resource Locators (URLs) are used to identify web resources. A URL, also called a web address, acts as a point of reference on the Internet: it specifies where a given resource is located on a computer network and the mechanism by which that resource is retrieved. URLs most commonly reference web pages (http), but they are also used for file transfer (ftp), email (mailto), and database access (JDBC). A web browser typically displays the URL of the current page in its address bar.
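To make the "where" and "how" parts of a URL concrete, the sketch below splits two URLs with Python's standard urllib.parse module. The addresses use the reserved example.org domain and are made up for illustration, and describe_url is a hypothetical helper name rather than anything from the sources cited here.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts that say how and where to find a resource."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # retrieval mechanism, e.g. https, ftp, mailto
        "host": parts.hostname,      # machine holding the resource (None if absent)
        "path": parts.path,          # location of the resource
        "query": parts.query,        # optional parameters after '?'
        "fragment": parts.fragment,  # optional position inside the resource
    }


# Illustrative addresses only; example.org is reserved for documentation.
web = describe_url("https://www.example.org/menu/drinks.html?lang=en#wine")
mail = describe_url("mailto:info@example.org")

print(web["scheme"], web["host"], web["path"])
print(mail["scheme"], mail["host"], mail["path"])
```

Note that the mailto address has a scheme and a path but no host, which shows that not every URL scheme follows the layout of a web page address.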

The World Wide Web has been central to the development of the Information Age and is the primary tool that billions of people across the world rely on to interact via the Internet. Web pages are largely text documents, but they can also include other content such as images, audio, and video (W3C, n.d.). Hyperlinks embedded in web pages let users move easily from one page to another and back again. Several web pages that share a common theme and domain name make up a website. A website's content may come solely from its publisher, or it may be contributed by its users, either directly or as a by-product of their actions.

To view a web page on the World Wide Web, one either types the page’s URL into a web browser or follows a hyperlink to it. The browser then issues a series of requests to fetch the page and the resources it needs. Moving from page to page by following hyperlinks gave rise to the terms ‘web surfing’, ‘browsing’, and ‘navigating the web’ (Computer Hope, 2018). As you browse, you are literally accessing a web of information that could have been published from any location on the globe.

What is the Semantic Web

Put simply, the Semantic Web is an extension of the World Wide Web as we know it now, in which information is given well-defined meaning, better enabling computers and people to work in cooperation (Rouse, 2015). A discussion of the Semantic Web is incomplete without mentioning the Semantic Web Tower, which is shown in the diagram below.

[Figure: Semantic Web Tower]

The essence of the Semantic Web is to advance the current state of the Web with the aid of semantics. More specifically, semantic annotations are proposed as the means of describing the meaning of different parts of Web information. For instance, a restaurant’s website could be annotated to identify the restaurant’s name, category, meals, drinks, and additional services offered, among others (Millman & Hopping, 2018). This kind of metadata makes the website’s information processable by machines, as opposed to the current state of affairs, where only humans can make sense of it.
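The restaurant example can be sketched as a small JSON-LD annotation built in Python. This is only an illustration under stated assumptions: the cited source does not name an annotation format, so the schema.org vocabulary is swapped in here as one widely used option, and the restaurant, its values, and the helper names to_json_ld and read_field are all hypothetical.

```python
import json

# Hypothetical restaurant data, labelled with schema.org terms
# (an assumed vocabulary choice; the article does not prescribe one).
annotation = {
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "Example Bistro",
    "servesCuisine": "Italian",
    "hasMenu": "https://www.example.org/menu",
}


def to_json_ld(data: dict) -> str:
    """Serialize the annotation as text that could be embedded in a web page."""
    return json.dumps(data, indent=2)


def read_field(json_ld: str, field: str):
    """What a program, rather than a person, does: look up a labelled field."""
    return json.loads(json_ld).get(field)


doc = to_json_ld(annotation)
print(read_field(doc, "name"))           # Example Bistro
print(read_field(doc, "servesCuisine"))  # Italian
```

The point of the sketch is that a program never has to guess which string on the page is the restaurant's name; the annotation states it explicitly.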

The Semantic Web provides a common framework through which data can be shared and reused. It is essentially an integrator of content, systems, and information applications. The term was coined by Tim Berners-Lee, whose main agenda is to develop a web of information that is machine-readable. Whereas skeptics do not believe that the goal is feasible, proponents are of the opinion that, once rolled out successfully, it will have enormous applications in a wide range of areas. Already, progress made in human sciences research and biological studies suggests that the concept can be fruitful. According to Berners-Lee, once the original concept has been achieved, we would have machines talking to machines as they conduct operations such as trade and bureaucracy. What people working on intelligent agents have pursued for decades will finally come to fruition (McGhee, 2018).

But even as experts seek to achieve that, a glaring problem arises: how can these semantic annotations be combined if each individual uses their own terminology? To solve this issue, it is proposed that vocabularies be organized in so-called ontologies. Because shared vocabularies can be easily referenced, web resources and applications that use them gain a degree of interoperability. Ontologies are discussed further in the next section.
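The terminology problem and its proposed fix can be sketched in a few lines of Python. Everything here is hypothetical: the example.org vocabulary URIs, the two sites, their field names, and the normalize helper are invented to show the idea of mapping local terms onto shared identifiers, not taken from any real vocabulary.

```python
# Hypothetical shared vocabulary: each concept gets one URI anyone can reference.
SHARED = {
    "name": "http://example.org/vocab#name",
    "cuisine": "http://example.org/vocab#cuisine",
}

# Each publisher declares how its own terms map onto the shared vocabulary.
SITE_A_TERMS = {"title": SHARED["name"], "food_type": SHARED["cuisine"]}
SITE_B_TERMS = {"restaurant": SHARED["name"], "kitchen": SHARED["cuisine"]}


def normalize(record: dict, term_map: dict) -> dict:
    """Rewrite a record's local keys as shared URIs; unmapped keys are dropped."""
    return {term_map[key]: value for key, value in record.items() if key in term_map}


a = normalize({"title": "Example Bistro", "food_type": "Italian"}, SITE_A_TERMS)
b = normalize(
    {"restaurant": "Sample Grill", "kitchen": "Italian", "seats": 40}, SITE_B_TERMS
)

# Once both records use the same identifiers, a single query covers both sources.
italian = [r[SHARED["name"]] for r in (a, b) if r.get(SHARED["cuisine"]) == "Italian"]
print(italian)  # ['Example Bistro', 'Sample Grill']
```

Without the shared identifiers, the query would need to know that "food_type" on one site and "kitchen" on the other mean the same thing.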

Ontology

Ontologies are central to the success of the Semantic Web. Although there is no universal definition of the term, a Semantic Web vocabulary can be regarded as a special form of ontology. Informally, it is ‘a collection of Uniform Resource Identifiers, each having some kind of defined meaning’ (Lickels, 2012).

The emergence of the Semantic Web brought with it the ideas of both ontologies and vocabularies. Vocabularies are the driving fuel of the Semantic Web: they define the terms used to describe and represent an area of concern. With their aid, it becomes possible to classify the terms that can be used in a given application, characterize the possible relationships among those terms, and define the constraints on their use. Vocabularies can be very simple (describing only one or two concepts) or very complex (involving thousands of terms) (Lickels, 2012). There is no clear division between vocabularies and ontologies. However, the trend is to use ‘ontology’ for a more complex and possibly quite formal collection of terms, and ‘vocabulary’ when such strict formalism is not required. Any kind of inference on the Semantic Web relies largely on vocabularies.
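How a vocabulary supports inference can be illustrated with a minimal sketch. It stores a tiny, made-up vocabulary as (subject, predicate, object) triples and follows a "subClassOf" relationship transitively. The predicate names loosely echo RDF Schema, but the data, the function names, and the plain-Python approach are assumptions for illustration, not how any particular Semantic Web toolkit works.

```python
# Hypothetical mini-vocabulary expressed as (subject, predicate, object) triples.
TRIPLES = {
    ("Espresso", "subClassOf", "Coffee"),
    ("Coffee", "subClassOf", "Drink"),
    ("house_espresso", "type", "Espresso"),
}


def superclasses(cls: str, triples: set) -> set:
    """Collect every class reachable from cls by following subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def inferred_types(item: str, triples: set) -> set:
    """Stated types of an item plus the types implied by the class hierarchy."""
    direct = {o for s, p, o in triples if s == item and p == "type"}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, triples)
    return result


print(sorted(inferred_types("house_espresso", TRIPLES)))
# ['Coffee', 'Drink', 'Espresso']
```

Only one fact about house_espresso was stated, yet the vocabulary lets a program conclude that it is also a Coffee and a Drink. That derived knowledge is the kind of inference the paragraph above refers to.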

Vocabularies play a crucial role in integrating data, for instance where the terms used in different data sets are ambiguous, or where a little additional knowledge is needed to discover new relationships. Taking the healthcare sector as a case study, medical professionals rely on ontologies to represent knowledge about symptoms, diseases, and treatments. Pharmaceutical firms rely on them to represent information about drugs, dosages, and allergies. Combining this knowledge from the medical and pharmaceutical communities yields a host of applications, such as decision support systems and tools for medical research. Vocabularies can also be used to organize knowledge: social networking platforms, government portals, museums, libraries, and newspapers can all make use of linked data to manage and connect their collections (Ontotext, n.d.).

An ontology is the specification of a conceptualization that describes a domain. Ontologies can be applied in a wide range of areas, and how extensively they are used depends on the nature of the application. Some applications need only very simple vocabularies, while others require agreement on common terminology (Obitko, 2014).

What is a Thesaurus

A thesaurus is a kind of controlled vocabulary that aims to dictate how the semantics of metadata are expressed when content objects are indexed. It minimizes semantic ambiguity by introducing uniformity and consistency in how content objects are stored and retrieved. A content object is any item that is described for inclusion in a website, an information retrieval system, or another source of information. A thesaurus makes it possible to assign preferred terms so that semantic metadata is linked to the content object (PoolParty, n.d.).

In information retrieval, a thesaurus guides both the indexer and the searcher to select the same preferred term, or combination of preferred terms, to represent a given subject. A thesaurus is made up of at least three elements: (1) a list of words, (2) the relationships among those words, and (3) rules for using the thesaurus (Wikipedia, n.d.). By making the way subjects are expressed consistent, a thesaurus improves precision and recall. It can also be used to maintain a hierarchical listing of terms and to limit semantic ambiguity.
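The three elements listed above can be sketched as a small data structure. In this hypothetical example the dictionary keys and "use_for" lists are the list of words, the "use_for", "broader", and "related" fields are the relationships, and the two functions encode one simple usage rule: always index and search under the preferred term. The entries and all names are invented for illustration.

```python
# Hypothetical thesaurus: each preferred term lists its relationships.
THESAURUS = {
    "automobile": {
        "use_for": ["car", "motorcar"],  # non-preferred synonyms
        "broader": ["vehicle"],          # hierarchical relationship
        "related": ["road transport"],   # non-hierarchical relationship
    },
    "vehicle": {"use_for": [], "broader": [], "related": []},
    "road transport": {"use_for": [], "broader": [], "related": ["automobile"]},
}


def preferred_term(word: str):
    """Rule: map any known word to its preferred term; return None if unknown."""
    word = word.strip().lower()
    if word in THESAURUS:
        return word
    for term, entry in THESAURUS.items():
        if word in entry["use_for"]:
            return term
    return None


def expand_query(word: str) -> set:
    """Terms a search could match: the preferred term plus its synonyms."""
    term = preferred_term(word)
    if term is None:
        return {word.strip().lower()}
    return {term, *THESAURUS[term]["use_for"]}


print(preferred_term("Car"))               # automobile
print(sorted(expand_query("motorcar")))    # ['automobile', 'car', 'motorcar']
```

Because the indexer and the searcher both pass through preferred_term, a document indexed under "automobile" is found by a search for "car", which is how a thesaurus helps recall without hurting precision.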

A thesaurus is expressive enough to improve most enterprise applications, yet it is not so complex that creating and maintaining it becomes unsustainable. Ontologies, thesauri, and taxonomies are all types of controlled vocabularies, in which synonyms and alternative spellings are combined to form concepts. Whereas a taxonomy contains only terms arranged in a hierarchy, a thesaurus adds non-hierarchical relationships between concepts as well as various facets of individual concepts (PoolParty, n.d.).
