Doctoral Consortium Poster, Twelfth International Conference on the Principles of Knowledge Representation and Reasoning, Toronto, Canada, May 9-13, 2010.
Discovering Connections in Historical Domains: an Approach Based on Semantic Trajectories
Ilaria Corda
School of Computing, University of Leeds, United Kingdom ilaria@comp.leeds.ac.uk
Abstract
When browsing information resources people often face the problem of identifying relevant connections between entities and facts. In historical domains, tasks typically involve discovering temporal and causal relations between events. The aim of this research is to assist users’ navigation in historical domains by generating Semantic Trajectories, which are derived from an appropriate event ontology and causal model. Our approach identifies semantic trajectories to help users discover connections relevant to their information seeking goals. The approach is illustrated in a case study from the History of Science domain.
Motivation and Description of the Problem
The remarkable growth of digital content available over the web has brought forth the need to develop technologies that provide intelligent ways of organising and accessing information. Finding information in digital environments raises a number of user-oriented issues related to dealing with an abundance of unstructured content. When people look for information, it is usually to resolve problems or accomplish tasks for which their current state of knowledge is not adequate. Although users are aware of their lack of knowledge, they are often unable to identify what can fulfil their purposes (Belkin 2000). In addition, the nature of the domain influences information seeking activities and the tasks users want to accomplish. For instance, in a history domain users may be mainly interested in investigating the dependency between historical events and eliciting causal connections between them. The need to provide effective support for accessing historical content is the main driver for the research in this PhD project.
Research Focus and Key Challenges. Our work focuses on the use of an ontology to provide structured access to information in historical domains. Specifically, we investigate how ontological structures can enhance users’ experiences of searching through digital content by generating semantic trajectories. Particular emphasis is on historical domains and temporal aspects of information. We illustrate the approach in the domain of History of Science. To support and guide users when interacting with information from historical domains, this PhD will investigate the following key challenges:
• Formal Representation of an Event Ontology. Formal syntax and semantics of an event ontology, including reasoning rules, used for constructing semantic trajectories.
• Formal Representation of a Causal Model. A generic formal approach for discovering connections between historical events, which further specifies semantic trajectories on the basis of causal reasoning.
• Task Representation. A formal representation of the user’s goals and tasks in the History of Science.
• Adapting Semantic Trajectories. Task-based semantic trajectories that generate only information relevant to the user’s tasks.
• Implementation. An implementation of a web-based interface illustrating semantic trajectories based on the event ontology, causal model, and task representation.
Plan of Research
Our methodology consists of three stages: a) generating semantic trajectories to explore the contextualised reference space, b) generating semantic trajectories on the basis of the causal model, and c) generating goal-driven semantic trajectories on the basis of the user’s tasks.
Stage 1: Generating semantic trajectories for exploring the contextualised reference space
• Developing a logical model of an event ontology by generalising our previous work (Corda 2007).
• Developing a logical approach for representing our rules.
• Developing a formal representation of semantic trajectories based on the ontology model.
• Applying the ontology model for constructing semantic trajectories from encoded facts and entities.
Stage 2: Causal model for semantic trajectories
• Developing a logical model for discovering possible causal connections between historical events.
• Developing a logical approach for representing causal rules.
• Developing a formal representation of semantic trajectories based on the causal model.
• Applying the causal model and reasoning for generating semantic trajectories.
Stage 3: Task model for adapting semantic trajectories
• Developing a formal representation of goals and tasks in the History of Science.
• Developing a formal representation of semantic trajectories based on the user’s tasks.
• Developing a logical approach for representing goal-driven reasoning.
• Applying the task-based reasoning for generating semantic trajectories.
Vr is the set of binary relation symbols; Vv is the set of event-verb symbols.
The domain D specifies the objects from the real world: D = ⟨I, E, P⟩ where I is the set of all individuals; E is the set of all event tokens; P is the set of all time points. ⪯ is a total order relationship over P. We consider two functions over the domain: begin: E → P and end: E → P where begin(e) ⪯ end(e) for every event token e ∈ E.
The interpretation structure δ = ⟨δc, δn, δt, δh, δr, δv⟩ interprets the non-logical symbols from the vocabulary by mapping them to the semantics:
• δc : Vc → 2^I assigns to each concept symbol a subset of individuals in I;
• δn : Vn → I assigns to each name symbol an individual from I;
Progress to date
This PhD builds on a completed Master’s research project (Corda 2007) which developed an approach for conceptualising the History of Science domain, focusing on the Astronomical Revolution. Our approach adapted Davidson’s theory of events (Davidson 1967) to associate properties with History of Science events, and exploited Allen’s interval algebra (Allen 1991) to reason about temporal connections between events. The ontology was defined in Prolog and gave an initial exploration of the possible connections which can be discovered between History of Science events (Corda, Bennett, and Dimitrova 2008). We have taken this research further to define a generic approach applicable across historical domains, which will guide a user in discovering connections between entities and facts. We have developed a logical model of an event ontology that enables us to define semantic trajectories showing connections in a historical domain. In this section, we outline the ontology model and our initial definition of semantic trajectories. We will illustrate the logical model with examples from our History of Science ontology.
• δt : Vt → P assigns to each time point symbol a time point from P;
• δh : Vh → E assigns to each event token symbol an event token from E;
• δr : Vr → 2^(I×I) assigns to each binary relation a subset of pairs from I;
• δv : Vv → ((I×I) → 2^E) assigns to each event-verb symbol a mapping from the set of pairs of individuals I × I to a subset of event tokens from E.
Example We illustrate δc, δr and δv:
δc(astronomer) = {galileo, ptolemy, brahe, . . . }
δr(influence) = {⟨copernicus, kepler⟩, ⟨kepler, brahe⟩, . . . }
δv(observe) = {⟨⟨galileo, sunspot⟩, {GSun1, GSun2}⟩, ⟨⟨brahe, supernova⟩, {BrSup1, BrSup2}⟩, ⟨⟨brahe, starspot⟩, {}⟩, . . . }
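The interpretation functions in the example can be mirrored with plain dictionaries and sets. The following is only a minimal sketch and not the Prolog encoding used in the project: the names `delta_c`, `delta_r`, `delta_v` and the three `holds_*` helpers are hypothetical, individuals are identified with their name strings (that is, δn is assumed to be the identity), and only the entries listed explicitly in the example are encoded, with the elided ". . ." entries left out.

```python
# Hypothetical encoding of the example entries of delta_c, delta_r and delta_v.
# Assumption: each individual is represented by its name string.

delta_c = {"astronomer": {"galileo", "ptolemy", "brahe"}}

delta_r = {"influence": {("copernicus", "kepler"), ("kepler", "brahe")}}

# Each event verb maps a pair of individuals to a set of event tokens.
delta_v = {
    "observe": {
        ("galileo", "sunspot"): {"GSun1", "GSun2"},
        ("brahe", "supernova"): {"BrSup1", "BrSup2"},
        ("brahe", "starspot"): set(),
    }
}


def holds_concept(concept, individual):
    """C(a): the individual belongs to the interpretation of the concept."""
    return individual in delta_c.get(concept, set())


def holds_relation(relation, a, b):
    """R(a, b): the pair belongs to the interpretation of the relation."""
    return (a, b) in delta_r.get(relation, set())


def holds_token(event, verb, a, b):
    """token(e, V(a, b)): e is one of the tokens assigned to V for (a, b)."""
    return event in delta_v.get(verb, {}).get((a, b), set())
```

With this encoding, `holds_token("GSun1", "observe", "galileo", "sunspot")` looks up the pair ⟨galileo, sunspot⟩ under the verb observe and checks membership of the token in the resulting set, which is the same lookup the definition of δv describes.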
The syntax of our object language consists of terms and propositions. The terms include:
• Individuals Vn = {a, b, c, . . . };
• Time points Vt = {t1, t2, t3, . . . };
• Concepts Vc = {C1, C2, C3, . . . };
• Event tokens Vh = {e1, e2, e3, . . . }.
The propositions are either atomic propositions or propositional constructs. The atomic propositions include:
• C1 ⊑ C2 where C1, C2 ∈ Vc
• C1(a) where C1 ∈ Vc and a ∈ Vn
• R(a, b) where R ∈ Vr and a, b ∈ Vn
• a = b where a, b ∈ Vn
• t1 = t2 where t1, t2 ∈ Vt
• e1 = e2 where e1, e2 ∈ Vh
• t1 ≤ t2 where t1, t2 ∈ Vt
• token(e, V(a, b)) where e ∈ Vh; V ∈ Vv and a, b ∈ Vn
Event Ontology Model
An Event Ontology is a structure Ω = ⟨V, D, Φ, ⪯, begin, end, δ⟩ where V is a vocabulary of symbols; D is a domain representing all entities in the real world; Φ is the set of all asserted and inferred formulae; ⪯ is a total order relationship over the domain; begin and end are functions over the domain; δ is an interpretation structure.
The vocabulary V specifies the sets of non-logical symbols: V = ⟨Vc, Vn, Vt, Vh, Vr, Vv⟩ where Vc is the set of concept symbols; Vn is the set of name symbols; Vt is the set of time point symbols; Vh is the set of symbols associated with event tokens (happenings);
• begin(e, t) and end(e, t) are functional relationships where e ∈ Vh and t ∈ Vt.
In addition, we use the propositional constructs:
• participate(a, e) where a ∈ Vn and e ∈ Vh
• before(e1, e2) where e1, e2 ∈ Vh
• during(e1, e2) where e1, e2 ∈ Vh
• overlap(e1, e2) where e1, e2 ∈ Vh
The semantic evaluation of each proposition is defined using the interpretation structure δ and standard set theory. For instance, C1 ⊑ C2 and before(e1, e2) are evaluated as:
C1 ⊑ C2 = true if δc(C1) ⊆ δc(C2), otherwise = false
before(e1, e2) = true if end(δh(e1)) ⪯ begin(δh(e2)), otherwise = false
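The two evaluation clauses can be read as simple set and order checks. The sketch below is illustrative only. It assumes that before(e1, e2) holds when the end of e1 does not come after the beginning of e2 (swap `<=` for `<` if a strict order is intended), it represents time points as integers whose values are arbitrary ordinal placeholders and not historical dates, and the extension given to scientist is a hypothetical one chosen for the example. The names `subsumed`, `before` and `span` are not from the paper.

```python
# Hypothetical interpretation data; the extension of "scientist" is illustrative only.
delta_c = {
    "astronomer": {"galileo", "ptolemy", "brahe"},
    "scientist": {"galileo", "ptolemy", "brahe", "newton"},
}

# begin/end of event tokens as (begin, end) pairs.
# The integers are placeholder ordinal positions, NOT historical dates.
span = {
    "GTel": (1, 2),
    "GSid": (3, 4),
}


def subsumed(c1, c2):
    """C1 is subsumed by C2 iff delta_c(C1) is a subset of delta_c(C2)."""
    return delta_c[c1] <= delta_c[c2]  # set inclusion


def before(e1, e2):
    """before(e1, e2) iff end(e1) precedes or equals begin(e2) (assumed non-strict)."""
    end_e1 = span[e1][1]
    begin_e2 = span[e2][0]
    return end_e1 <= begin_e2


# Sanity check of the constraint begin(e) <= end(e) for every event token.
assert all(b <= e for b, e in span.values())
```

Evaluating `subsumed("astronomer", "scientist")` and `before("GTel", "GSid")` then amounts to one subset test and one comparison of time points, respectively.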
which can guide the user’s exploration of the corresponding domain object.
To define trajectories, we first define a sequence of semantically connected formulae ⟨ϕ1, ϕ2, . . . , ϕn⟩ where every ϕi (1 < i ≤ n) is either an atomic proposition which has a common term with ϕi−1 or can be derived from a sequence of formulae ϕi1, ϕi2, . . . , ϕim such that each of ϕi1, ϕi2, . . . , ϕim belongs to ϕ1, . . . , ϕi−1.
Given a term f, where f ∈ Vn ∪ Vh, a semantic trajectory with a focus f is a sequence of semantically connected formulae τ[f] = ⟨ϕ1, ϕ2, . . . , ϕn⟩ where ϕ1 includes the term f.
Given two terms f1 and f2, where f1, f2 ∈ Vn ∪ Vh, a semantic trajectory connecting f1 and f2 is a sequence of semantically connected formulae τ[f1, f2] = ⟨ϕ1, ϕ2, . . . , ϕn⟩ where ϕ1 includes the term f1 and ϕn includes the term f2.
Each term f or pair of terms f1, f2 can be the focus of more than one trajectory, and the set of all possible trajectories is denoted as T[f] or T[f1, f2], respectively. Level 1 semantic trajectories will return the whole sets T[f] or T[f1, f2]. Level 2 semantic trajectories will expand T[f] and T[f1, f2] to include trajectories derived by applying causal reasoning, which will infer sequences of events that might be causally connected. Level 3 will select subsets of T[f] and T[f1, f2] associated with particular user goals.
Example We will illustrate three level 1 semantic trajectories derived from the History of Science ontology.
τ1[galileo] = ⟨astronomer(galileo), astronomer ⊑ scientist, scientist(galileo)⟩
We use a set of rules of the form ϕ1, ϕ2 ⇒ ϕ, classified into three main modes:
• Concept-based mode includes rules that determine direct and indirect concept-individual inheritance. For instance: C1(a), (C1 ⊑ C2) ⇒ C2(a)
• Relation-based mode includes rules which define transitive, symmetrical and inverse relationship closures. For instance: R(a, b), R(b, c) ⇒ R(a, c) where R is a transitive relation (e.g. influence).
• Event-based mode includes rules which define reasoning upon events. For instance: before(e1, e2), during(e3, e2) ⇒ before(e1, e3)
The rules allow new formulae to be derived from a sequence of existing formulae; the derived formulae are used for generating semantic trajectories.
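The three example rules can be applied repeatedly until no new formula is produced. The following is a minimal forward-chaining sketch under stated assumptions, not the project's Prolog rule base: formulae are encoded as tuples with the hypothetical tags `"isa"` for C(a), `"sub"` for C1 ⊑ C2, `"rel"` for R(a, b), `"before"` and `"during"`; only the one example rule per mode quoted above is implemented; and the event tokens `e1`, `e2`, `e3` are generic placeholders.

```python
# Relations assumed to be transitive (the text names "influence" as an example).
TRANSITIVE = {"influence"}

FACTS = {
    ("isa", "astronomer", "galileo"),          # astronomer(galileo)
    ("sub", "astronomer", "scientist"),        # astronomer is subsumed by scientist
    ("rel", "influence", "copernicus", "kepler"),
    ("rel", "influence", "kepler", "brahe"),
    ("before", "e1", "e2"),                    # placeholder event tokens
    ("during", "e3", "e2"),
}


def step(facts):
    """Apply each example rule once to every ordered pair of formulae."""
    new = set()
    for f in facts:
        for g in facts:
            # Concept-based: C1(a), C1 sub C2 => C2(a)
            if f[0] == "isa" and g[0] == "sub" and f[1] == g[1]:
                new.add(("isa", g[2], f[2]))
            # Relation-based: R(a, b), R(b, c) => R(a, c) for transitive R
            if (f[0] == "rel" and g[0] == "rel" and f[1] == g[1]
                    and f[1] in TRANSITIVE and f[3] == g[2]):
                new.add(("rel", f[1], f[2], g[3]))
            # Event-based: before(e1, e2), during(e3, e2) => before(e1, e3)
            if f[0] == "before" and g[0] == "during" and f[2] == g[2]:
                new.add(("before", f[1], g[1]))
    return new - facts


def closure(facts):
    """Iterate the rules until a fixed point is reached."""
    facts = set(facts)
    while True:
        new = step(facts)
        if not new:
            return facts
        facts |= new


derived = closure(FACTS) - FACTS
```

On this input the fixed point adds one formula per mode: scientist(galileo), influence(copernicus, brahe) and before(e1, e3). A pairwise scan is quadratic in the number of formulae, which is acceptable for a small illustration but would need indexing for a realistic ontology.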
Semantic Trajectories
Semantic Trajectories enable discovering connections between individuals and events from the domain based on the ontology and the reasoning rules. In this work, we consider three levels of Semantic Trajectories:
• Level 1: facilitating the exploration of contextualised reference spaces of individuals and events from the domain. These trajectories are generated using the ontology and the concept-based, relation-based, and event-based modes;
• Level 2: pointing at potential causal connections between domain events by employing a causal reasoning mode;
• Level 3: supporting goal-driven navigation by selecting only those trajectories (from the previous types) which are relevant to the user’s tasks.
We will provide our initial definition of level 1 semantic trajectories. Each trajectory has a focus which is either a term f or a pair of terms f1, f2 corresponding to individuals or event tokens from the domain. The trajectory defines relevant semantic connections about the focus term (or pair of terms)
τ2[galileo, newton] = ⟨formulate(galileo, tides theory), explain(tides theory, tides), subphenomenon(tides, gravity), investigate(newton, gravity)⟩
τ3[token(GTel, invent(galileo, telescope))] = ⟨before(GTel, GSid), token(GSid, publish(galileo, sidereus)), before(GTel, GSun1), token(GSun1, observe(galileo, sunspot))⟩
τ1 is derived by applying a Concept-based rule, τ2 is derived using atomic propositions, and τ3 illustrates the application of an Event-based rule.
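The connectedness condition behind τ2 can be checked mechanically. The sketch below covers only the first case of the definition, where each formula is an atomic proposition sharing a term with its predecessor; formulae obtained by rule application (as in τ1 and τ3) would need an additional derivation check that is not shown. The tuple encoding `(predicate, term, term)` and the function names are assumptions made for this illustration.

```python
# tau2 from the example, encoded as (predicate, term, term) tuples.
tau2 = [
    ("formulate", "galileo", "tides theory"),
    ("explain", "tides theory", "tides"),
    ("subphenomenon", "tides", "gravity"),
    ("investigate", "newton", "gravity"),
]


def terms(formula):
    """Return the set of terms of an atomic proposition (predicate excluded)."""
    return set(formula[1:])


def is_connected(sequence):
    """Each formula after the first shares at least one term with its predecessor."""
    return all(terms(prev) & terms(cur)
               for prev, cur in zip(sequence, sequence[1:]))


def connects(sequence, f1, f2):
    """Check that the sequence is a trajectory from focus f1 to focus f2."""
    return (bool(sequence)
            and f1 in terms(sequence[0])
            and f2 in terms(sequence[-1])
            and is_connected(sequence))
```

For τ2 the shared terms along the chain are tides theory, tides and gravity, so `connects(tau2, "galileo", "newton")` succeeds, while removing the middle formula about tides breaks the chain.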
Current and Future Work
We have identified the theoretical assumptions for our causal model and are developing formal definitions for type-level and token-level causation. We are considering necessary and sufficient conditions and are investigating how to represent deterministic and likely causation between historical events. At the moment, we are defining patterns under which event tokens are causally connected and exemplifying them with facts and entities from the History of Science. The logical definitions of level 2 and 3 semantic trajectories will be expanded together with our work on the causal and task models.
Related Work
Our work relates to semantic approaches to information browsing, event modelling, and formal models of causation. Concept mapping (Novak and Cañas 2006) and Topic Map based interfaces (Biezunski and Newcomb 2001) provide subject-centric approaches for structuring resources. However, Concept Maps lack formal semantics, and Topic Maps only provide a lightweight representation for organising information in web-based applications. Most critically in our case, Topic Maps do not support an effective way of representing and exploring temporal relations between events. In contrast, our approach supports structured access to web resources in historical domains based on an event-based representation.
The modelling of events is gaining widespread attention for a number of reasons. Lately, there has been a growing number of event-based systems which require a formal representation of events (Szabolcs Rozsnyai and Schatten 2007). There are mainly two kinds of event models: those which allow interoperability between diverse event-based components and systems (Scherp et al. 2009), and those developed for specific applications (Ram Nevatia and Bolles 2004) or domains (Raimond and Abdallah 2007). In particular, domain-centred event representations often do not include formal syntax and semantics for representing objects other than events (Fernando 2008). On the other hand, some existing models of events show a complexity which is beyond our requirements (Glenn Shafer and Scherl 2000). In contrast, our logical model is grounded in Davidson’s theory of events (Davidson 1967), but it also provides a general syntax and formal semantics for representing relationships between entities and facts.
In the context of causation, the issue of discovering causal connections is still appealing and challenging (Halpern and Pearl 2005), especially in historical domains (Masterman and Sharples 2002). Many authors have argued that there are two forms of causation: one related to tokens and the other related to types (Hausman 1998; Eells and Sober 1983). The type and token distinction is one of the theoretical assumptions of our causal model and allows us to distinguish causal dependencies between event types and corresponding event tokens.
Acknowledgement
Many thanks to my supervisors Vania Dimitrova and Brandon Bennett from the University of Leeds. I would also like to thank all members of the Knowledge Representation and Reasoning group at the University of Leeds for their comments and feedback.
References
[Allen 1991] Allen, J. F. 1991. Time and time again: the many ways to represent time. International Journal of Intelligent Systems 6:341–355.
[Belkin 2000] Belkin, N. J. 2000. Helping people find what they don’t know. Communications of the ACM 43(8):58–61.
[Biezunski and Newcomb 2001] Biezunski, M., and Newcomb, S. R. 2001. XML topic maps: Finding aids for the web. IEEE MultiMedia 8(2):104–108.
[Corda, Bennett, and Dimitrova 2008] Corda, I.; Bennett, B.; and Dimitrova, V. 2008. Interacting with an ontology to explore historical domains. In ONTORACT ’08: Proceedings of the 2008 First International Workshop on Ontologies in Interactive Systems, 65–74. Washington, DC, US: IEEE Computer Society.
[Corda 2007] Corda, I. 2007. Ontology-based representation and reasoning about the history of science. Master’s thesis, School of Computing, University of Leeds.
[Davidson 1967] Davidson, D. 1967. The logical form of action sentences. In Rescher, N., ed., The Logic of Decision and Action. University of Pittsburgh Press.
[Eells and Sober 1983] Eells, E., and Sober, E. 1983. Probabilistic causality and the question of transitivity. Philosophy of Science 50:35–57.
[Fernando 2008] Fernando, T. 2008. Observing events and situations in time. Linguistic and Philosophy Journal 30:527–550.
[Glenn Shafer and Scherl 2000] Glenn Shafer, P. R. G., and Scherl, R. 2000. The logic of events. Annals of Mathematics and Artificial Intelligence 28:315–389.
[Halpern and Pearl 2005] Halpern, J. Y., and Pearl, J. 2005. Causes and explanations: A structural-model approach part i: Causes. British Journal of Philosophy of Science 56:843–887.
[Hausman 1998] Hausman, D. M. 1998. Causal Asymmetries. Cambridge University Press.
[Masterman and Sharples 2002] Masterman, L., and Sharples, M. 2002. A theory-informed framework for designing software to support reasoning about causation in history. Comput. Educ. 38(1-3):165–185.
[Novak and Cañas 2006] Novak, J. D., and Cañas, A. J. 2006. The theory underlying concept maps and how to construct them. Technical report, Institute for Human and Machine Cognition.
[Raimond and Abdallah 2007] Raimond, Y., and Abdallah, S. 2007. The event ontology. Available online at: http://motools.sf.net/event.
[Ram Nevatia and Bolles 2004] Ram Nevatia, J. H., and Bolles, B. 2004. An ontology for video event representation. In Computer Vision and Pattern Recognition Workshop.
[Scherp et al. 2009] Scherp, A.; Franz, T.; Saathoff, C.; and Staab, S. 2009. A model of events based on a foundational ontology. Technical report, Technical Report of the Department of Computer Science.
[Szabolcs Rozsnyai and Schatten 2007] Szabolcs Rozsnyai, J. S., and Schatten, A. 2007. Concepts and models for typing events for event-based systems. In Proceedings of the 2007 inaugural international conference on Distributed event-based systems, volume 233 of ACM International Conference Proceedings Series, 62–70. ACM.