
Asset Categorization

Asawin Rajakrom

Course Syllabus
This course describes how power distribution network assets are modeled and categorized into classes, and how relationships are drawn among those classes. The class attributes represent the network data used to derive asset conditions, costs, and the probability of network failure, as well as the social and environmental factors that influence asset investment decisions. The modeling approach is based on the Common Information Model (CIM), a prominent modeling method used to represent real-world objects and information entities exchanged within the value chain of the electric power industry. Underpinning the CIM knowledge representation are several methods and technologies such as UML, XML, and RDF. The course provides the necessary background on these technologies. In addition, engineering disciplines such as knowledge engineering and ontological engineering, which emphasize knowledge acquisition and ontology development, are also explained. Taken together, these topics equip attendees with the knowledge needed to model not only power distribution system assets but also other domains of knowledge.

Course Outline
Categorization principle & terminologies
Unified Modeling Language
eXtensible Markup Language
Resource Description Framework
Common Information Model
Knowledge engineering
Ontological development
Power distribution network asset categorization

Asset Categorization
Categorization Principle & Terminologies

Categorization Overview
The basic cognitive process of arranging things into classes or categories
The process in which ideas and objects are recognized, differentiated, and understood
Categorization implies that objects are grouped into categories, usually for some specific purpose
Ideally, a category illuminates a relationship between the subjects and objects of knowledge
Function of category systems: the task of a category system is to provide maximum information with the least cognitive effort
Structure of the information so provided: the perceived world comes as structured information rather than as arbitrary or unpredictable attributes

Controlled Vocabulary
A way of describing a concept under a single word or phrase
A term may vary in its definition and usage when used in different domains
An established list of standardized terms used for both indexing and retrieval of information
The list of terms should be controlled by, and be available from, a controlled vocabulary registration authority in order to keep it unambiguous and non-redundant

Controlled Vocabulary
At a minimum, the following two rules should be enforced in practice:
If the same term is commonly used to mean different concepts in different contexts, then its name is explicitly qualified to resolve this ambiguity.
If multiple terms are used to mean the same thing, one of the terms is identified as the preferred term in the controlled vocabulary and the other terms are listed as synonyms or aliases.
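The two rules can be illustrated with a minimal Python sketch. The terms, the dictionary layout, and the function names are hypothetical and chosen only for illustration; they are not taken from any standard vocabulary.

```python
# Hypothetical controlled vocabulary: preferred term -> synonyms/aliases.
# Rule 1: the ambiguous word "transformer" is qualified in the preferred terms.
# Rule 2: every synonym points to exactly one preferred term.
PREFERRED_TERMS = {
    "transformer (power)": ["power transformer"],
    "transformer (instrument)": ["instrument transformer"],
    "circuit breaker": ["breaker", "CB"],
}


def build_index(vocabulary):
    """Map every preferred term and synonym (case-insensitive) to its preferred term."""
    index = {}
    for preferred, synonyms in vocabulary.items():
        for term in [preferred, *synonyms]:
            key = term.lower()
            if key in index and index[key] != preferred:
                # The same word would name two concepts: it must be qualified.
                raise ValueError(f"ambiguous term: {term!r}")
            index[key] = preferred
    return index


def normalize(term, index):
    """Return the preferred term, or None if the term is not in the vocabulary."""
    return index.get(term.lower())


index = build_index(PREFERRED_TERMS)
print(normalize("Breaker", index))  # circuit breaker
```

Keeping the synonyms in one index is one possible design; it makes indexing and retrieval use the same lookup.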

Classification
Systematic arrangement in groups or categories according to established criteria
The act or process of putting people or things into a group or class
Establishing the correct class (or category) for an object, where the object needs to be characterized in terms of the class to which it belongs

Classification
Classification is an approach to systematically arranging objects into categories according to established criteria. Objects are the physical and conceptual things we find in the universe around us: hardware, software, documents, animals, human beings, and even concepts. Classification allows us to manage things easily by grouping them into categories under specific criteria and then handling each group according to established conditions.

Taxonomy
An orderly classification of plants and animals according to their presumed natural relationships
A hierarchy created according to data internal to the items in that hierarchy
An orderly classification of objects into a hierarchical structure using parent-child relationships
Parent-child relationships used in a taxonomy include whole-part, genus-species, type-instance, and class-subclass
Differs from classification in that a taxonomy arranges entities in a structure according to some relation between them, whereas a classification uses more arbitrary (or external) grounds
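A parent-child hierarchy of this kind can be sketched in a few lines of Python. The asset names below are hypothetical examples, not the categories used later in the course.

```python
# Hypothetical taxonomy stored as child -> parent (class-subclass) links.
PARENT = {
    "asset": None,
    "line": "asset",
    "overhead line": "line",
    "underground cable": "line",
    "transformer": "asset",
}


def ancestors(term):
    """Walk the parent links from a term up to the root of the hierarchy."""
    chain = []
    parent = PARENT[term]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def children(term):
    """Return the direct children of a term."""
    return [child for child, parent in PARENT.items() if parent == term]


print(ancestors("overhead line"))  # ['line', 'asset']
print(children("line"))            # ['overhead line', 'underground cable']
```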

Taxonomy

Ontology
A branch of metaphysics concerned with the nature and relations of being
A system of concepts used as building blocks of an information processing system
Consists of concepts, a hierarchical (is-a) organization of them, relations among them in addition to is-a and part-of, and axioms that formalize the definitions and relations
An explicit specification of a conceptualization

Ontology
Taxonomy and ontology are often used interchangeably; however, they are fundamentally different.
Taxonomy:
classifies objects in a domain in a hierarchical structure
gives exact names for everything in a specified domain
shows which things are parts of other things
Ontology:
offers more by expressing meaningful content within a specified domain of interest
has strict, formal rules (a "grammar") about relationships that let us make meaningful, precise statements about our entities and relationships
A formal ontology is hence a controlled vocabulary expressed in an ontology representation language

Meta-model
Metadata: data about data
Facilitates the understanding, characteristics, and management usage of data
Meta-model: an explicit model of the constructs and rules needed to build specific models within a domain of interest
A valid meta-model is an ontology, but not all ontologies are modeled explicitly as meta-models
A schema is metadata

Power Distribution System Asset Categorization


Provide all key attributes of network assets, whether concrete or abstract, together with operational stresses and external environments, for determining asset conditions and failure probability
Provide all key attributes needed to deduce asset costs
Provide all associated social and environmental factors that influence asset investment decisions
This information is modeled as classes, attributes, and class relationships using the Common Information Model (CIM) specification

UML
Unified Modeling Language

Origins of UML
Evolution of object-oriented technology:
Development and adoption of OOP languages
Use of OOAD in business process modeling, requirements analysis, and software systems design
UML was designed to bring together the best features of a number of analysis and design techniques and notations to produce an industry standard.

Emergence of UML

What is UML?
UML is a visual language that was originally applied to developing software systems. It has since been extended to other areas such as knowledge modeling. It is a specification language: it has a set of elements and a set of rules that determine how they can be used. Most UML elements are graphical (lines, rectangles, ovals, and other shapes), and many of these graphical elements are labelled with words that provide additional information.

Why use UML?


The need for modeling: modeling can be as straightforward as drawing a flowchart listing the steps carried out in a business process.
Readability brings clarity and ease of understanding. This involves knowing what a system is made up of, how it behaves, and so forth.
Reusability is the byproduct of making a system readable. After a system has been modeled to make it easy to understand, we tend to identify similarities or redundancy, be they in terms of functionality, features, or structure.
The underlying theme is standardization.

UML Concepts
UML is used to:
Show the main functions and boundaries of a system using use cases and actors.
Illustrate use case realizations using interaction diagrams.
Represent the static structure of a system using class diagrams.
Model object behaviour using state diagrams.
Show the implementation of the physical architecture using component and deployment diagrams.
Extend the functionality using stereotypes.

UML Diagrams and Elements


Use case diagrams
Static structural diagrams: class, object
Interaction diagrams: sequence, collaboration
State diagrams
Activity diagrams
Implementation diagrams: packages, components, deployment

Use Cases Diagram


Use case diagrams describe the behavior of the target system from an external point of view. Use cases describe "the meat" of the actual requirements.
Use cases: A use case describes a sequence of actions that provide something of measurable value to an actor and is drawn as a horizontal ellipse.
Actors: An actor is a person, organization, or external system that plays a role in one or more interactions with your system. Actors are drawn as stick figures.
Associations: Associations between actors and use cases are indicated by solid lines. An association exists whenever an actor is involved in an interaction described by a use case.

Use Cases Diagram

Class Diagram
Class diagrams show the classes of the system, their inter-relationships, and the operations and attributes of the classes
Explore domain concepts in the form of a domain model
Analyze requirements in the form of a conceptual/analysis model
Depict the detailed design of object-oriented or object-based software

Class Diagram
General form:
Class name
Attributes: attribute name : type
Operations: operation name(parameter : type) : result type

Example:
Person
- TaxIDNo : String
- Name : String
+ Income : double
+ TaxPaid : Boolean
+ calcTax()
+ calcTaxBal()
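One way to read the Person class box is as the Python class sketched below. The diagram gives only names and visibilities, so the bodies of the two operations are assumptions: FLAT_RATE and the rule "the balance is zero once the tax is paid" are hypothetical and serve only to make the example runnable.

```python
class Person:
    """Rough Python counterpart of the Person class box."""

    FLAT_RATE = 0.25  # hypothetical tax rate, not given in the diagram

    def __init__(self, tax_id_no: str, name: str, income: float, tax_paid: bool = False):
        self._tax_id_no = tax_id_no  # "-" in the diagram: private attribute
        self._name = name            # "-" in the diagram: private attribute
        self.income = income         # "+" in the diagram: public attribute
        self.tax_paid = tax_paid     # "+" in the diagram: public attribute

    def calc_tax(self) -> float:
        """calcTax(): assumed here to be a flat share of income."""
        return self.income * self.FLAT_RATE

    def calc_tax_bal(self) -> float:
        """calcTaxBal(): assumed here to be the tax still owed."""
        return 0.0 if self.tax_paid else self.calc_tax()


person = Person("T-001", "Tom", 1000.0)
print(person.calc_tax(), person.calc_tax_bal())  # 250.0 250.0
```

Python has no enforced private members; the leading underscore is only the usual convention for the "-" visibility marker.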

Object Diagram
Object diagrams (also called instance diagrams) are useful for exploring real-world examples of objects and the relationships between them. They show instances instead of classes. They are useful for explaining small pieces with complicated relationships, especially recursive relationships.

Class and Objects


City
Name : String = default
Country : String = default
Population : integer = default
setName(s : String = default)
setPopulation(p : integer = default)

<<instanceOf>>
London : City
Name = London
Country = UK
Population = 2,324,320

<<instanceOf>>
New York : City
Name = New York
Country = USA
Population = 5,734,012

<<instanceOf>>
Sydney : City
Name = Sydney
Country = Australia
Population = 3,536,000
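The class/instance distinction in this figure can be sketched in Python. The empty-string and zero defaults are assumptions standing in for the unspecified "default" values, and only the London object is instantiated.

```python
from dataclasses import dataclass


@dataclass
class City:
    """The class: describes what every City object looks like."""
    name: str = ""       # assumed default
    country: str = ""    # assumed default
    population: int = 0  # assumed default

    def set_name(self, s: str = "") -> None:
        self.name = s

    def set_population(self, p: int = 0) -> None:
        self.population = p


# An object (instance) of the class, as in the <<instanceOf>> box for London.
london = City(name="London", country="UK", population=2_324_320)
print(london)
print(isinstance(london, City))  # True
```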

Sequence Diagram
Sequence diagrams model the collaboration of objects based on a time sequence. They show how objects interact with one another in a particular scenario of a use case.

Sequence Diagram

Collaboration Diagram
Collaboration (communication) diagrams are used to model the dynamic behavior of a use case. Compared with a sequence diagram, a communication diagram focuses more on showing the collaboration of objects than on the time sequence.

Collaboration Diagram

State Diagram
State diagrams show the different states of an entity and how the entity responds to various events by changing from one state to another. The history of an entity can best be modeled by a finite state diagram.
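A state diagram maps naturally onto a transition table. The sketch below is a minimal illustration; the states and events (a switching device that can trip, reclose, and lock out) are hypothetical and not taken from the slides.

```python
# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("closed", "trip"): "open",
    ("open", "reclose"): "closed",
    ("open", "lock_out"): "locked out",
}


def run(initial, events):
    """Return the history of states visited, starting from the initial state."""
    history = [initial]
    for event in events:
        current = history[-1]
        # An event with no transition from the current state leaves it unchanged.
        history.append(TRANSITIONS.get((current, event), current))
    return history


print(run("closed", ["trip", "reclose", "trip", "lock_out"]))
# ['closed', 'open', 'closed', 'open', 'locked out']
```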

State Diagram

Activity Diagram
Activity diagrams help describe the flow of control of the target system, for example when exploring complex business rules and operations or describing a use case or business process. They are the object-oriented equivalent of flow charts and data-flow diagrams (DFDs).

Activity Diagram

Packages Diagram
Package diagrams simplify complex class diagrams by grouping classes into packages. A package is a collection of logically related UML elements. Packages are depicted as file folders and can be used on any of the UML diagrams.

Packages Diagram

Components Diagram
Component diagrams show the dependencies among software components, including the classifiers that specify them (for example, implementation classes) and the artifacts that implement them, such as source code files, binary code files, executable files, scripts, and tables.

Components Diagram

Deployment Diagram
A deployment diagram depicts a static view of the run-time configuration of hardware nodes and the software components that run on those nodes. Deployment diagrams show the hardware for your system, the software that is installed on that hardware, and the middleware used to connect the disparate machines to one another.

Deployment Diagram

UML Class Diagrams and Relationships


How would you draw a family tree? The steps you would take would be:
Identify the main members of the family
Determine how they are related to each other
Identify the characteristics of each family member
Find relations among family members
Decide the inheritance of personal traits and characters

UML Class Diagrams and Relationships


By definition, a class diagram is a diagram showing a collection of classes and interfaces, along with the collaborations and relationships among classes and interfaces. A class diagram consists of a group of classes and interfaces reflecting important entities of the business domain of the system being modeled, and the relationships between these classes and interfaces. A class diagram is a pictorial representation of the detailed system design.

Elements of a Class Diagram


Name

Attributes

Methods

UML Class Relationships


Relation: Association
Description: When two classes are connected to each other in any way, an association relation is established. For example, a "student studies in a college" association can be shown as:

UML Class Relationships


Relation: Multiplicity
Description: An example of this kind of association is many students belonging to the same college. Hence, the relation shows a star sign near the student class (for one-to-many, many-to-many, and similar relations).

UML Class Relationships


Relation: Directed Association
Description: Association between classes is bi-directional by default. You can define the flow of the association by using a directed association. The arrowhead identifies the container-contained relationship.

UML Class Relationships


Relation: Reflexive Association
Symbol: No separate symbol.
Description: An example of this kind of relation is when a class has a variety of responsibilities. For example, an employee of a college can be a professor, a housekeeper, or an administrative assistant.

UML Class Relationships


Relation: Aggregation
Description: When a class is formed as a collection of other classes, it is called an aggregation relationship between these classes. It is also called a "has a" relationship.

UML Class Relationships


Relation: Composition
Description: Composition is a variation of the aggregation relationship. Composition connotes that a strong life cycle dependency exists between the classes.

UML Class Relationships


Relation: Inheritance / Generalization
Description: Also called an "is a" relationship, because the child class is a type of the parent class. Generalization is the basic type of relationship used to define reusable elements in the class diagram. Literally, the child classes "inherit" the common functionality defined in the parent class.

UML Class Relationships


Relation: Realization
Description: In a realization relationship, one entity (normally an interface) defines a set of functionalities as a contract and the other entity (normally a class) "realizes" the contract by implementing the functionality defined in the contract.
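Several of these relationships have direct counterparts in code. The following Python sketch shows realization, generalization, and aggregation together; the class names (Describable, Employee, Professor, College) are hypothetical and only echo the college examples used in the descriptions.

```python
from abc import ABC, abstractmethod


class Describable(ABC):
    """Interface: the contract that realizing classes must fulfil."""

    @abstractmethod
    def describe(self) -> str:
        ...


class Employee(Describable):
    """Realization: Employee implements the Describable contract."""

    def __init__(self, name: str):
        self.name = name

    def describe(self) -> str:
        return f"Employee {self.name}"


class Professor(Employee):
    """Generalization: a Professor "is an" Employee and inherits from it."""

    def describe(self) -> str:
        return f"Professor {self.name}"


class College:
    """Aggregation: a College "has" employees, which can exist without it."""

    def __init__(self, employees):
        self.employees = list(employees)


college = College([Employee("Tom"), Professor("Jane")])
descriptions = [e.describe() for e in college.employees]
print(descriptions)  # ['Employee Tom', 'Professor Jane']
```

Composition would differ from the aggregation shown here in that the parts would be created and destroyed together with the whole.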

Other Terms for Annotations of Class Diagrams


Responsibility of a class: the statement defining what the class is expected to provide.
Stereotypes: an extension of the existing UML elements; a stereotype allows you to define new elements modeled on the existing UML elements. Only one stereotype per element in a system is allowed.
Vocabulary: the scope of a system is defined as its vocabulary.
Analysis class: a kind of stereotype.
Boundary class: the first type of analysis class. In a system that has boundary classes, the users interact with the system through them.
Control class: the second type of analysis class. A control class typically does not perform any business functions, but only redirects to the appropriate business function class depending on the function requested by the boundary class or the user.
Entity class: the third type of analysis class. An entity class consists of all the business logic and interactions with databases.

Put Them Together

XML
eXtensible Markup Language

Evolution
SGML (Standard Generalized Markup Language)
ISO standard, 1986, for data storage and exchange
Metalanguage for defining languages (through DTDs)
A famous SGML language: HTML
Separation of content and display
Used by the U.S. government and its contractors, large manufacturing companies, technical information publishers, ...
SGML reference is 600 pages long
XML (eXtensible Markup Language)
W3C (World Wide Web Consortium, http://www.w3.org/XML/) recommendation in 1998
Simple subset (80/20 rule) of SGML: "ASCII of the Web", Semantic Web
XML specification is 26 pages long

Evolution
Canonical XML
normalization, equivalence testing of XML documents
SML (Simple Markup Language)
Reduce to the max: no attributes / no processing instructions (PI) / no DTD / no non-character entity references / no CDATA marked sections / support for only UTF-8 character encoding / no optional features
XML Schema
XML Schema definition language
Back to complex: Part I (Structures), Part II (Data Types), Part III, er, 0 (Primer)

What is XML?
XML is a universal format for structured documents and data
Can be read with any (even an archaic CP/M) editor
Can be parsed easily
Contains its own structure (= parse tree) in the data
Allows separation of marked-up content from presentation (style sheets)
As a self-describing format, good for archival into the past, not bad for archival into the future
XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
XML with a DTD or XML Schema is designed to be self-descriptive

Simple XML Example

<?xml version="1.0" encoding="windows-874"?>
<note>
  <to> Tom </to>
  <from> Jane </from>
  <heading> Reminder </heading>
  <body> Meeting at 9.00 AM</body>
</note>
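To show that such a document "can be parsed easily", here is a minimal sketch using Python's standard xml.etree.ElementTree module. The XML declaration is left out of the string on purpose, as an assumption of this sketch, so that the example does not depend on how the parser treats the declared encoding.

```python
import xml.etree.ElementTree as ET

# The note document from the slide, without the XML declaration.
NOTE = """<note>
  <to> Tom </to>
  <from> Jane </from>
  <heading> Reminder </heading>
  <body> Meeting at 9.00 AM</body>
</note>"""

root = ET.fromstring(NOTE)

# Each child element becomes one entry: tag name -> text with blanks trimmed.
fields = {child.tag: child.text.strip() for child in root}

print(root.tag)        # note
print(fields["to"])    # Tom
print(fields["body"])  # Meeting at 9.00 AM
```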

Why Is XML Important?


Plain Text
Easy to edit Useful for storing small amounts of data Possible to efficiently store large amounts of XML data through an XML front end to a database

Data Identification
Tell you what kind of data you have Can be used in different ways by different applications

Why Is XML Important?


Stylability
Inherently style-free XSL---Extensible Stylesheet Language Different XSL formats can then be used to display the same data in different ways

Inline Reusability
Can be composed from separate entities Modularize your documents without resorting to links

Why Is XML Important?


Linkability -- XLink and XPointer
Simple unidirectional hyperlinks Two-way links Multiple-target links Expanding links

Easily Processed
Regular and consistent notation Vendor-neutral standard

Hierarchical
Faster to access Easier to rearrange

XML Building Blocks


Element
Delimited by angle brackets
Identify the nature of the content they surround
General format: <element> </element>
Empty element: <empty-element/>

Attribute
Name-value pairs that occur inside start-tags after the element name, like: <element attribute="value">

XML Building blocks--Prolog


The part of an XML document that precedes the XML data Includes
A declaration: version [, encoding, standalone] An optional DTD (Document Type Definition )

Example
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>

XML Syntax
All XML elements must have a closing tag
XML tags are case sensitive
All XML elements must be properly nested
All XML documents must have a root tag
Attribute values must always be quoted
With XML, white space is preserved
With XML, a new line is always stored as LF
Comments in XML:
<!-- This is a comment -->
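These syntax rules are what a parser checks when it decides whether a document is well formed. A minimal sketch with Python's standard xml.etree.ElementTree follows; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<a><b>text</b></a>"))  # True:  closed and properly nested
print(is_well_formed("<a><b>text</a></b>"))  # False: improperly nested
print(is_well_formed("<a>text</A>"))         # False: tags are case sensitive
print(is_well_formed("<a x=1/>"))            # False: attribute value not quoted
print(is_well_formed("<a/><b/>"))            # False: more than one root element
```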

XML is Based on Markup

Markup indicates structure and semantics
Decoupled from presentation

<bibliography>
  <paper ID="object-fusion">
    <authors>
      <author>Y.Papakonstantinou</author>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>

XML Elements
XML Elements are Extensible
XML documents can be extended to carry more information

XML Elements have Relationships


Elements are related as parents and children

Elements have Content


Elements can have different content types: element content, mixed content, simple content, or empty content and attributes

XML elements must follow the naming rules

XML as Labeled Ordered Trees


<bibliography>
  <paper ...>
    <authors>
      <author>Yannis</author>
      <author>Serge</author>
      ...
    </authors>
    <title>Object Fusion</title>
    ...
  </paper>
</bibliography>

Tree view (figure): bibliography at the root; paper nodes below it; each paper has authors, fullPaper, and title children; the author leaves hold "Yannis" and "Serge", and the title leaf holds "Object Fusion"
Can also represent relational and object-oriented data
Semistructured data: labeled trees/graphs
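The labeled ordered tree can be made visible by walking a parsed document. This is a minimal sketch using Python's standard xml.etree.ElementTree; the elided parts of the slide's document ("...") are simply left out, and the walk helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<bibliography><paper><authors>"
    "<author>Yannis</author><author>Serge</author>"
    "</authors><title>Object Fusion</title></paper></bibliography>"
)


def walk(element, depth=0):
    """Yield (depth, label, text) for each node in document order."""
    yield depth, element.tag, (element.text or "").strip()
    for child in element:  # children are kept in document order
        yield from walk(child, depth + 1)


rows = list(walk(ET.fromstring(DOC)))
for depth, label, text in rows:
    print("  " * depth + label, text)
```

The indentation of the printed labels mirrors the parent-child structure of the tree.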

Elements and their Content


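<bibliography>
  <paper ID="object-fusion">
    <authors>
      <author>Y.Papakonstantinou</author>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>

Element name: the name inside the tags, e.g. paper
Element: a start tag, its content, and the matching end tag, e.g. <title>...</title>
Element content: child elements, e.g. the <author> elements inside <authors>
Empty element: <fullPaper source="fusion"/>
Character content: text inside an element, e.g. Y.Papakonstantinou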

XML Attributes
Located in the start tag of elements
Provide additional information about elements
Often provide information that is not part of the data
Values must be enclosed in quotes
Should I use an element or an attribute? A common rule of thumb: metadata (data about data) should be stored as attributes, and the data itself should be stored as elements

Element Attributes
<bibliography>
  <paper ID="object-fusion">
    <authors>
      <author>Y.Papakonstantinou</author>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>

Attribute name: ID (on paper), source (on fullPaper)
Attribute value: "object-fusion", "fusion"
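The difference between attributes and element content shows up directly in a parser's API. The sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the bibliography example.

```python
import xml.etree.ElementTree as ET

DOC = """<bibliography>
  <paper ID="object-fusion">
    <authors><author>Y.Papakonstantinou</author></authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
  </paper>
</bibliography>"""

root = ET.fromstring(DOC)
paper = root.find("paper")
full_paper = paper.find("fullPaper")

print(paper.attrib)               # {'ID': 'object-fusion'}
print(full_paper.get("source"))   # fusion
print(full_paper.text)            # None: an empty element has no character content
print(len(full_paper))            # 0: and no child elements
print(paper.findtext("title"))    # Object Fusion in Mediator Systems
```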

XML Validation
"Well formed" XML document: correct XML syntax
"Valid" XML document: well formed, and conforms to the rules of a DTD (Document Type Definition)
XML DTD: defines the legal building blocks of an XML document; can be inline in the XML or an external reference
XML Schema: an XML-based alternative to DTD, more powerful; supports namespaces and data types

Displaying XML
XML documents do not carry information about how to display the data We can add display information to XML with
CSS (Cascading Style Sheets) XSL (eXtensible Stylesheet Language) -- preferred

XML Specification
XML Document Type Definitions (DTDs):
define the structure of "allowed" documents (i.e., valid with respect to a DTD)
comparable to a database schema: improve query formulation, execution, ...

XML Schema
defines structure and data types allows developers to build their own libraries of interchanged data types

XML Namespaces
identify your vocabulary

Document Type Definitions (DTD)


Define and Constrain Element Names & Structure

Element type declarations:
<!ELEMENT bibliography (paper*)>
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
<!ELEMENT authors (author+)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT fullPaper EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>

Attribute list declarations:
<!ATTLIST fullPaper source ENTITY #REQUIRED>
<!ATTLIST paper ID ID #REQUIRED>

Document Type Definitions (DTD)


Sequence of 0 or more paper:
<!ELEMENT bibliography (paper*)>
Authors followed by optional fullPaper, followed by title, followed by booktitle:
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
Sequence of 1 or more author:
<!ELEMENT authors (author+)>
Character content:
<!ELEMENT author (#PCDATA)>

<!ELEMENT fullPaper EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>
<!ATTLIST fullPaper source ENTITY #REQUIRED>
<!ATTLIST paper ID ID #REQUIRED>

Document Type Definitions (DTD)

<person ID="yannis"> Yannis info </person>
<bibliography>
  <paper ID="object-fusion" ROLE="publication">
    <authors>
      <author authorRef="yannis">Y.Papakonstantinou</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <related papers="semistructured-data mediators"/>
  </paper>
</bibliography>

ID="object-fusion": object identity attribute (ID)
ROLE="publication": CDATA (character data)
authorRef="yannis": IDREF, an intradocument reference
source="fusion": reference to an external ENTITY

XML Namespaces
A namespace is a mapping between an element prefix and a URI
cars is the prefix in this example:
<cars:part xmlns:cars="URI">

URIs are not pointers to information about the namespace; they are just unique identifiers. A namespace URI does not have to resolve to anything.

XML Namespaces
An XML document may reference more than one schema
A namespace specifies which schema defines a given tag
XML, like Java, uses qualified names
This helps to avoid collisions between names
Java: myObject.myVariable
XML: myDTD:myTag
Note that XML uses a colon (:) rather than a dot (.)

If an XML processor is not namespace-aware, the colon is just part of the name

Namespaces and URIs


A namespace is defined as a unique string
To guarantee uniqueness, typically a URI (Uniform Resource Identifier) is used, because the author owns the domain
It doesn't have to be a real URI; it just has to be a unique string
Example: http://www.matuszek.org/ns

There are two ways to use namespaces:


Declare a default namespace Associate a prefix with a namespace, then use the prefix in the XML to refer to the namespace

Namespace Syntax
In any start tag you can use the reserved attribute name xmlns: <book xmlns="http://www.matuszek.org/ns">
This namespace will be used as the default for all elements up to the corresponding end tag You can override it with a specific prefix

You can use almost this same form to declare a prefix: <book xmlns:dave="http://www.matuszek.org/ns">

Use this prefix on every tag and attribute you want to use from this namespace, including end tags--it is not a default prefix <dave:chapter dave:number="1">To Begin</dave:chapter>

You can use the prefix in the start tag in which it is defined: <dave:book xmlns:dave="http://www.matuszek.org/ns">
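A namespace-aware parser replaces each prefix by the URI it is bound to. The sketch below uses Python's standard xml.etree.ElementTree, which reports qualified names in the form {URI}localname; the document strings are built from the examples above.

```python
import xml.etree.ElementTree as ET

NS = "http://www.matuszek.org/ns"

# Prefix form: every tag and attribute from the namespace carries the prefix.
PREFIXED = (
    f'<dave:book xmlns:dave="{NS}">'
    '<dave:chapter dave:number="1">To Begin</dave:chapter>'
    "</dave:book>"
)
# Default-namespace form: unprefixed elements belong to the namespace.
DEFAULT = f'<book xmlns="{NS}"><chapter>To Begin</chapter></book>'

root = ET.fromstring(PREFIXED)
print(root.tag)  # {http://www.matuszek.org/ns}book

# The prefix used for searching is local to this call; only the URI matters.
chapter = root.find("d:chapter", {"d": NS})
print(chapter.get(f"{{{NS}}}number"))  # 1

# Both forms yield the same qualified element name.
print(ET.fromstring(DEFAULT).tag == root.tag)  # True
```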

Namespaces and DTD


Here is a sample namespace specification within a DTD.
<!ELEMENT title ...>
<!ATTLIST title xmlns CDATA #FIXED "http://www.person.com">
<!ELEMENT person:title ...>
<!ATTLIST person:title xmlns:person CDATA #FIXED "http://www.person.com">

XML Schema
People are dissatisfied with DTDs because:
DTDs use a different syntax: you write your XML (instance) document using one syntax and the DTD using another, which is inconsistent
Limited datatype capability: DTDs support a very limited capability for specifying datatypes. You can't, for example, express "I want the <elevation> element to hold an integer with a range of 0 to 12,000"
There is a desire for a set of datatypes compatible with those found in databases
DTDs support 10 datatypes; XML Schema supports 44+ datatypes

What is XML Schema?


A grammar definition language
Like DTDs but better
Uses XML syntax
Defined by W3C

Primary features
Datatypes, e.g. integer, float, date, etc.
More powerful content models, e.g. namespace-aware, type derivation, etc.

A schema is a collection of:
type definitions: simple type, complex type (contains elements or attributes)
element declarations


Schema Terminology
Schema: a formal description of the structure and allowed content of a set of data (esp. in databases)
"XML Schema" is often used for each of:
1. XML Schema, the W3C Recommendation that defines
2. XML Schema Definition Language (XSDL), an XML-based markup language for expressing ...
3. schema documents, each of which describes a schema (comparable to a DTD) for a set of XML document instances

Advantages of XSDL
XML syntax
schema documents easier to manipulate by programs (than the special DTD syntax)

Compatibility with namespaces


can validate documents using declarations from multiple sources

Content datatypes
44 built-in datatypes (including primitive Java datatypes, datatypes of SQL, and XML attribute types) mechanisms to derive user-defined datatypes

Advantages of XSDL
Independence of element names and content types; Compare with
DTDs: 1-to-1 correspondence btw. element type names and their content models CFGs: 1-to-1 correspondence btw. nonterminals and their productions

For example, could define titles of people as Mr./Mrs./Ms. and titles of chapters as strings

Advantages of XSDL
Support for schema documentation
element annotation with sub-elements documentation (for human readers) and appInfo (for applications)

Ability to specify uniqueness and keys within selected parts of document for example, that titles of chapters should be unique

Disadvantages of XSDL
Complexity of XSDL (esp. of Rec. Part 1), hence a long learning curve
Possible immaturity of implementations (?)
The W3C XML Schema Web site mentions a dozen tools or processors (http://www.w3.org/XML/Schema#Tools, March 2002)
Open-source Apache XML parsers (Xerces C++ 1.7.0 and Xerces Java 1.4.4) seem to be reasonable implementations, but also document limitations/problems in their XML Schema support

Highlights of XML Schemas


XML Schemas are a tremendous advancement over DTDs:
Enhanced datatypes: 44+ versus 10
Can create your own datatypes. Example: "This is a new type based on the string type and elements of this type must follow this pattern: ddd-dddd, where 'd' represents a digit".
Written in the same syntax as instance documents: less syntax to remember
Object-oriented'ish: can extend or restrict a type (derive new type definitions on the basis of old ones)
Can express sets, i.e., can define the child elements to occur in any order
Can specify element content as being unique (keys on content) and uniqueness within a region
Can define multiple elements with the same name but different content
Can define elements with nil content
Can define substitutable elements, e.g., the "Book" element is substitutable for the "Publication" element.
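The "ddd-dddd" example of a user-defined datatype amounts to restricting the string type with a pattern. The Python sketch below only imitates that check with a regular expression; it is not an XML Schema validator, and the names DDD_DDDD and matches_pattern are hypothetical.

```python
import re

# Three digits, a hyphen, four digits. fullmatch requires the whole value to
# fit the pattern, which is how a pattern restriction on a type is applied.
DDD_DDDD = re.compile(r"\d{3}-\d{4}")


def matches_pattern(value):
    """Return True if the whole value has the form ddd-dddd."""
    return DDD_DDDD.fullmatch(value) is not None


print(matches_pattern("555-1234"))   # True
print(matches_pattern("5551234"))    # False: hyphen missing
print(matches_pattern("555-12345"))  # False: one digit too many
```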

Example: DTD
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

note.dtd

Example: XML with DTD
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

note.xml

Example: XML Schema

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

note.xsd

Example: XML with XML Schema

<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

note.xml

RDF
Resource Description Framework

Motivation for RDF


RDF and Metadata
Scenario 1: The library
Lookup system search properties include author, title, subject etc.

Scenario 2: The video store


Lookup system search properties include directors, actors, etc.

The common thread:


Metadata: information about information

Motivation for RDF


What about the Web?
One big library, need call number to get things without a search
HTML: has hardly any metadata
Yahoo: has a metadata-based lookup facility; uses human-generated subject categories and site labels

Library example to illustrate need for metadata

What is RDF?
RDF stands for Resource Description Framework
RDF is a framework for describing resources on the web
RDF provides a model for data, and a syntax so that independent parties can exchange and use it
RDF is designed to be read and understood by computers
RDF is not designed for being displayed to people
RDF is written in XML
RDF is a part of the W3C's Semantic Web Activity
RDF is a W3C Recommendation

What is RDF?
Describes relationships and attributes of (Internet) resources, i.e. advanced metadata
Based on Directed Labelled Graphs (DLG) and classical Information Analysis
Can be represented in XML, N3 and N-Triples
Attributes and relation types may be defined by XML Namespaces, e.g. Dublin Core
A general method to decompose knowledge into small pieces, with some rules about the semantics, or meaning, of those pieces
Designed for knowledge, not data, which means RDF is particularly concerned with meaning

RDF and XML


RDF is an application of XML
Why not just use XML?
  XML falls apart on the scalability design goal. There are two problems:
    The order of elements is significant in XML, which is unnatural for metadata and expensive in practice
    XML documents are represented in memory as trees, which are difficult to manage when large
XML is unequalled as an exchange format on the Web, but it doesn't provide a metadata framework

Uses of RDF
Resource discovery: to provide better search engine capabilities
Cataloging: for describing the content and content relationships
Intelligent software agents: to facilitate knowledge sharing and exchange
Content rating
Describing collections of pages that represent a single logical document

Uses of RDF
Describing intellectual property rights
Privacy preferences: expressing the preferences of a user as well as the privacy policies of a Web site
Web of Trust: RDF with digital signatures will be key to building the Web of Trust for electronic commerce, collaboration, and other applications

RDF Components
Formal data model
Syntax for interchange of data
Schema
  Type system (schema model)
  Syntax for machine-understandable schemas
Query and profile protocols

RDF Data Model


Imposes structural constraints on the expression of application data models
  for consistent encoding, exchange and processing of metadata
Enables resource description communities to define their own semantics
Provides for structural interoperability

RDF Data Model


Directed labelled graphs
Model elements
  Statement: Resource (Subject) + Property (Predicate) + Value (Object)
  Resource: anything that can be identified; it is identified by a URI
  Property: a specific aspect, characteristic, attribute, or relation used to describe a resource
  URI: verbose name for a resource; can be of http, urn or tag type
  Value

RDF Elements
Subject: source of the relationship
  Always a resource
Predicate: labelled arc
  Always a resource
Object: the relationship's destination
  Resource or literal
Subjects and predicates are first-class objects
  Which means they can be used as subjects or objects of other statements
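The subject, predicate and object structure described above can be sketched as plain tuples. This is only an illustration in Python: the `Statement` type, the `values_of` helper and the sample data (a Dublin Core style title and creator for a resource named `URI:R`) are assumptions made for the sketch, not an RDF library API.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement: subject (resource), predicate (resource), object."""
    subject: str
    predicate: str
    obj: str  # a resource identifier or a literal value


# Hypothetical namespace prefix and data, used only for illustration.
DC = "http://purl.org/dc/elements/1.0/"

graph = [
    Statement("URI:R", DC + "Title", "RDF Presentation"),
    Statement("URI:R", DC + "Creator", "Paul Miller"),
]


def values_of(statements, subject, predicate):
    """Follow the labelled arc `predicate` out of the node `subject`."""
    return [s.obj for s in statements
            if s.subject == subject and s.predicate == predicate]


print(values_of(graph, "URI:R", DC + "Creator"))
```

Each tuple corresponds to one arc of the directed labelled graph, so a set of statements and a graph are two views of the same model.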

RDF Model Primitives

[Figure: a Resource is linked by a Property arc to a Value (a resource or a literal); together they form a Statement]

RDF Model

[Figure: Resource --Author--> Paul]

RDF Syntax
The RDF model defines formal relationships among resources, properties and values
A syntax is required to...
  Store instances of the model in files
  Communicate files from one application to another
W3C XML: eXtensible Markup Language
  http://www.w3.org/XML

RDF Model Example

[Figure: URI:R --dc:Title--> "RDF Presentation"; URI:R --dc:Creator--> "Paul Miller"]

RDF Syntax Example


<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/">
  <Description about="URI:R">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>Paul Miller</dc:Creator>
  </Description>
</RDF>

RDF Model Example


[Figure: URI:R --dc:Title--> "RDF Presentation"; URI:R --dc:Creator--> URI:PAUL; URI:PAUL --bib:Name--> "Paul Miller"; URI:PAUL --bib:Email--> "p.miller@ukoln.ac.uk"; URI:PAUL --bib:Aff--> URI:UKOLN]

RDF Syntax Example


<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons#">
  <Description about="URI:R">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Paul Miller</bib:Name>
        <bib:Email>p.miller@ukoln.ac.uk</bib:Email>
        <bib:Aff resource="http://www.ukoln.ac.uk/"/>
      </Description>
    </dc:Creator>
  </Description>
</RDF>

RDF Schema
RDFS or RDF Schema is an extensible knowledge representation language, providing basic elements for the description of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources. The first version was published by W3C in April 1998, and the final W3C recommendation was released in February 2004. Main RDFS components are included in the more expressive language OWL. RDFS is also written in XML.

RDF Schema
RDF describes resources with classes, properties, and values
In addition, RDF needs a way to define application-specific classes and properties
Application-specific classes and properties must be defined using an extension to RDF: RDF Schema
RDF Schema does not provide actual application-specific classes and properties. Instead, it provides the framework to describe them
Classes in RDF Schema are much like classes in object-oriented programming languages. This allows resources to be defined as instances of classes, and as subclasses of classes

RDF Schema
Basic vocabulary to describe RDF vocabularies
Defines properties of the resources (e.g. title, author, subject)
Defines the kinds of resources being described (books, Web pages, people, etc.)
XML Schema gives specific constraints on the structure of an XML document
RDF Schema provides information about the interpretation of RDF statements

RDFS / RDF Classes


[Figure: RDFS / RDF class hierarchy showing Class, Resource, Datatype, Container, Literal, Property, List, Statement, Alt, Bag, Seq, XMLLiteral and ContainerMembershipProperty]

RDFS / RDF Properties


Element             Domain     Range      Description
rdfs:domain         Property   Class      The domain of the resource
rdfs:range          Property   Class      The range of the resource
rdfs:subPropertyOf  Property   Property   The property is a sub property of a property
rdfs:subClassOf     Class      Class      The resource is a subclass of a class
rdfs:comment        Resource   Literal    The human readable description of the resource
rdfs:label          Resource   Literal    The human readable label (name) of the resource
rdfs:isDefinedBy    Resource   Resource   The definition of the resource
rdfs:seeAlso        Resource   Resource   The additional information about the resource
rdfs:member         Resource   Resource   The member of the resource
rdf:first           List       Resource
rdf:rest            List       List
rdf:subject         Statement  Resource   The subject of the resource in an RDF Statement
rdf:predicate       Statement  Resource   The predicate of the resource in an RDF Statement
rdf:object          Statement  Resource   The object of the resource in an RDF Statement
rdf:value           Resource   Resource   The property used for values
rdf:type            Resource   Class      The resource is an instance of a class

RDFS / RDF Attributes


Element          Description
rdf:about        Defines the resource being described
rdf:Description  Container for the description of a resource
rdf:resource     Defines a resource to identify a property
rdf:datatype     Defines the data type of an element
rdf:ID           Defines the ID of an element
rdf:li           Defines a list
rdf:_n           Defines a node
rdf:nodeID       Defines the ID of an element node
rdf:parseType    Defines how an element should be parsed
rdf:RDF          The root of an RDF document
xml:base         Defines the XML base
xml:lang         Defines the language of the element content

RDF Schema Example


<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="Person">
    <rdfs:comment>Person Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Student">
    <rdfs:comment>Student Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="#Person"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Teacher">
    <rdfs:comment>Teacher Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="#Person"/>
  </rdfs:Class>

RDF Schema Example (cont.)


  <rdfs:Class rdf:ID="Course">
    <rdfs:comment>Course Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="teacher">
    <rdfs:comment>Teacher of a course</rdfs:comment>
    <rdfs:domain rdf:resource="#Course"/>
    <rdfs:range rdf:resource="#Teacher"/>
  </rdf:Property>
  <rdf:Property rdf:ID="students">
    <rdfs:comment>List of Students of a course in alphabetical order</rdfs:comment>
    <rdfs:domain rdf:resource="#Course"/>
    <rdfs:range rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq"/>
  </rdf:Property>
  <rdf:Property rdf:ID="name">
    <rdfs:comment>Name of a Person or Course</rdfs:comment>
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:domain rdf:resource="#Course"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
  </rdf:Property>
</rdf:RDF>
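To illustrate what the `rdfs:subClassOf` declarations in this schema buy an application, here is a small Python sketch that hard-codes the class hierarchy and walks it transitively. The dictionary and function names are assumptions made for the example, and the single-parent dictionary is a simplification: RDF Schema itself allows a class to have several superclasses.

```python
# Direct rdfs:subClassOf links from the course schema (one parent per class,
# which is a simplification adopted for this sketch).
SUBCLASS_OF = {
    "Student": "Person",
    "Teacher": "Person",
    "Person": "Resource",
    "Course": "Resource",
}


def superclasses(cls):
    """Return all superclasses of `cls`, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_instance_of(declared_type, target):
    """True if a resource typed `declared_type` also counts as a `target`."""
    return declared_type == target or target in superclasses(declared_type)


print(superclasses("Student"))
print(is_instance_of("Teacher", "Person"))
```

This is why a property whose domain is `Person`, such as `name`, can be applied to a `Student` or a `Teacher` without being redeclared for each subclass.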

RDF (corresponding to the previous schema)

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.cs.rpi.edu/~puninj/XMLJ/course_schema.rdf#">
  <Course rdf:ID="csci_2962">
    <name>Programming XML in Java</name>
    <teacher>
      <Teacher rdf:ID="jp">
        <name>John Punin</name>
      </Teacher>
    </teacher>
    <students>
      <rdf:Seq>
        <rdf:li>
          <Student rdf:ID="er">
            <name>Elizabeth Roberts</name>
          </Student>
        </rdf:li>
        <rdf:li>
          <Student rdf:ID="gl">
            <name>George Lucas</name>
          </Student>
        </rdf:li>
        <rdf:li>
          <Student rdf:ID="js">
            <name>John Smith</name>
          </Student>
        </rdf:li>
      </rdf:Seq>
    </students>
  </Course>
</rdf:RDF>

CIM
Common Information Model

CIM Motivation
Deregulation of the power industry worldwide requires utility companies to share power system data, such as Energy Management System (EMS) data
Exchanging power system data has always been problematic because of proprietary formats
An open standard is needed for representing power system components
CIM defines a common model for describing the components of power systems for use in a common EMS

CIM Overview
CIM is an object-oriented information model representing real-world objects found in transmission and distribution operation and management
Enables integration of applications/systems
  Provides a common model behind all messages exchanged between systems
  Basis for defining information exchange models
CIM provides a comprehensive, logical view of EMS information for:
  Transmission network analysis
  Generation control
  SCADA
  Operator training simulation

CIM Overview
Enables data access in a standard way
  Common language to navigate and access complex data structures in any database
  Provides a hierarchical view of data for browsing and access with no knowledge of the actual logical schema
Inspiration for logical data schemas (e.g. for an operational data store)
Not tied to a particular application's view of the world
  But permits the same model to be used by all applications to facilitate information sharing between applications
  Also provides a consistent view of the world to operators regardless of which application user interface they are using

CIM Overview
A data model to enable data transfer or integration in any domain where a common power system model is needed
Model includes Classes, their Attributes, and Relationships to represent utility objects
The Classes (Objects) are abstract and may be used in a wide variety of applications
Useful:
  As a foundation for a logical database schema
  To define component interfaces
  As a common language for data exchange

Sample Power System Model

Role of CIM in Utility Enterprise


Data preparation
  Provides a common set of semantics and data representation regardless of the source of data
  Improves data quality and enables data validation
Data exchange
  Provides a common language and format
  Provides a common set of services for sharing data
System integration
  Provides the basis for a standards-based integration framework
  Web services payloads and Service Oriented Architecture (SOA)
Enterprise Information Management
  Part of the overall Enterprise Information Model relating to business processes/automation/management

Benefits of Using CIM Approach


Data-model-driven solutions lead to interoperability
Provides common semantics for information exchange between heterogeneous systems
Used for CA to CA communications
  NERC mandated use of CIM and RDF Schema version for power system model exchange
Provides for automatic generation of message payloads in XML
  Ensures a common language for all messages defined
  Avoids proprietary message formats from vendors (based on internal schemas)
  Eliminates the work of creating a DTD for each message
  Alternative to EDI or CSV file formats

Benefits of Using CIM Approach


Uses industry standard modeling notation
  UML, XML, RDF
Permits software tool use for:
  Defining and maintaining data models
  Single point of maintenance for changes
  Documenting data models
  Automatic generation of information payloads
    Automatically generate IDL, Java, C code

CIM Related Standards


EPRI CCAPI: The Electric Power Research Institute (EPRI) proposed an integration framework, called the Control Center Application Program Interface, for EMS data sharing
IEC 61970-301: Common Information Model (CIM) base. A semantic model describing the components of a power system at an electrical level and the relationships between each component
IEC 61970-501: Common Information Model Resource Description Framework (CIM RDF) schema
IEC 61968-4: Interfaces for records and asset management
IEC 61968-11: Extends the model to cover the other aspects of power system software data exchange, such as asset tracking, work scheduling and customer billing

CIM Representation
CIM is documented as a set of class diagrams using the Unified Modeling Language (UML)
UML specifies CIM in an abstract manner that allows for open implementation:
  There is no restriction to relational, object-oriented or other modeling technologies
UML is the diagramming tool for CIM

An Example of CIM in UML

CIM Packages
CIM consists of a number of packages
  Needed to make the model easier to design, understand and review
  Packages are grouped to be handled as a single standard document
CIM - Common Information Model
  CIM Base in UML - IEC 61970 Part 301
  CIM Energy Scheduling, Reservations & Financial - IEC 61970 Part 302
  CIM SCADA - IEC 61970 Part 303
CIS - Component Interface Specifications
  GID - Generic Interface Definition
CIM Model Exchange Format
  CIM RDF Schema (UML->RDF) - IEC 61970 Part 501
  CIM XML Model data Exchange Format - IEC 61970 Part 552-4

CIM Base Part 301


CIM Base in UML
  Package used for the Project
Dashed lines indicate a dependency relationship between packages
The arrow points from the dependent package to the package on which it depends
The Generation package is divided into two sub-packages:
  Production
  GenerationDynamics

Components of Part 301


Core
  This package contains the core Naming, PowerSystemResource, EquipmentContainer, and ConductingEquipment entities shared by all applications, plus common collections of those entities
  Not all applications require all the Core entities
  This package does not depend on any other package, but most of the other packages have associations and generalizations that depend on it

Topology
  This package is an extension to the Core package
  Specifies the physical definition of how equipment is connected together
  In addition it models Topology, that is, the logical definition of how equipment is connected via closed switches
  The Topology definition is independent of the other electrical characteristics

Components of Part 301


Wires
  The Wires package is an extension to the Core and Topology packages
  Models information on the electrical characteristics of Transmission and Distribution networks
  This package is used by network applications such as State Estimation, Load Flow and Optimal Power Flow

Outage
  This package is an extension to the Core and Wires packages
  Models information on the current and planned network configuration

Components of Part 301


Protection
  This package is an extension to the Core and Wires packages
  Models information for protection equipment such as relays

Meas
  Describes dynamic measurement data exchanged between applications

Components of Part 301


LoadModel
  Provides models for the system load as curves and associated curve data
  Used for Load Forecasting and Load Management

Production
  Provides models for various types of generators
  Models production costing information, which is used to economically allocate demand among committed units and calculate reserve quantities
  This information is used by Unit Commitment, Economic Dispatch, Load Forecasting and Automatic Generation Control applications

Components of Part 301


GenerationDynamics
  Provides models for prime movers
  This information is used by Unit Modeling for Dynamic Training Simulator applications

Domain
  Data dictionary of quantities and units
  This package contains the definition of datatypes, including units of measure and permissible values

Core Package

Topology Package

Wires Package

Outage Package

Protection Package

Meas Package

LoadModel Package

Production Package

GenerationDynamics Package

Domain Package

CIM XML
A common model exchange format based on the CIM data definition and XML was developed
  Proposed to NERC and subsequently adopted by their Data Exchange Working Group (DEWG)
  All major vendors of energy management systems have voiced their support for the format
CIM/XML is a language for expressing CIM models in XML
NERC has adopted CIM/XML as the standard for exchanging models between power transmission system operators
The CIM/XML format is also going through an IEC international standardization process

CIM XML
Resource Description Framework (RDF) defines a mechanism for describing resources
RDF is a general-purpose language for representing information in the Web
RDF integrates a variety of applications using XML as an interchange syntax
RDF Schema is a standard which describes how to use CIM XML
CIM/XML is an RDF application, using RDF and RDF Schema to organize its XML structures

CIM XML RDF Example


The base class of the CIM is the PowerSystemResource class
Other more specialized classes such as Substation, Switch, and Breaker are defined as subclasses
CIM/XML uses RDF as the language for exchanging specific system models
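The subclassing idea can be sketched in Python as follows. This is an illustration only, not the CIM definition: the fields `rdf_id` and `name`, the property label `cim:name`, the `cim:` prefix, and the choice to place Breaker under Switch are all assumptions made for the example. The `to_triples` helper shows, in spirit, how an object of such a class could be flattened into RDF-style statements for exchange.

```python
from dataclasses import dataclass


@dataclass
class PowerSystemResource:
    """Base class; the two fields are hypothetical, chosen for the sketch."""
    rdf_id: str
    name: str


class Substation(PowerSystemResource):
    pass


class Switch(PowerSystemResource):
    pass


class Breaker(Switch):  # assumption: a breaker is treated as a kind of switch
    pass


def to_triples(resource):
    """Flatten one object into (subject, predicate, object) statements."""
    class_name = type(resource).__name__
    return [
        (resource.rdf_id, "rdf:type", f"cim:{class_name}"),
        (resource.rdf_id, "cim:name", resource.name),  # hypothetical label
    ]


breaker = Breaker(rdf_id="brk_1", name="Feeder breaker 1")
for triple in to_triples(breaker):
    print(triple)
```

Because `Breaker` inherits from `PowerSystemResource`, any application written against the base class can handle it without knowing the specialized type.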

CIM XML RDF Example

CIM XML RDF Example

KE
knowledge engineering

What is Knowledge?
Data: raw; it simply exists and is picked up by sensory devices or organs
Information: meaning interpreted from data
Knowledge: a collection of information that people use when solving problems
[Figure: Data -> Information -> Knowledge]

Where Knowledge Resides?

The problem with knowledge, however, is that, unlike information, it typically doesn't reside on paper. Instead, it lives inside people's heads.

Knowledge Management (1)


A strategy, framework or system designed to help organisations create, capture, analyse, apply, and reuse knowledge to achieve competitive advantage.

A key aspect is that knowledge within an organisation is treated as a key asset.


A core aspect is "getting the right knowledge to the right people at the right time in the right format".

Knowledge Management (2)

Knowledge Management (3)

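Nonaka SECI Model (conversion between types of knowledge)
  Tacit knowledge to tacit knowledge: Socialization
  Tacit knowledge to explicit knowledge: Externalization
  Explicit knowledge to tacit knowledge: Internalization
  Explicit knowledge to explicit knowledge: Combination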

Knowledge Engineering (1)


A field within artificial intelligence that develops knowledge-based systems
  Computer programs that contain large amounts of knowledge, rules and reasoning mechanisms to provide solutions to real-world problems
  An expert system is one that is designed to emulate the reasoning processes of an expert practitioner

Knowledge Engineering (2)


Key KE principles:
  Different types of knowledge
  Different types of experts and expertise
  Different ways of representing knowledge
  Different ways of using knowledge
  The right approach and technique must be employed to acquire, validate and reuse knowledge

Knowledge Engineering (3)


Three types of experts
Academic:
  Theoretical understanding is prized
  Their job is to explicate, clarify and teach others
  May be far from day-to-day problem solving
Practitioner:
  Engaged in constant day-to-day problem solving
  Knowledge is implicit
  Difficult for them to articulate
Samurai:
  Pure performance expert
  Equipped with theoretical knowledge, which they put into real problem solving
  Comfortable articulating it

Knowledge Engineering (4)


Need a way to relate different types of knowledge, experts, representations and tasks to one another in order to perform a knowledge-oriented activity
The aim is not to interview experts about knowledge they cannot articulate, represent it in a form no one understands, and eventually find it was not really needed
Use structured methods

Knowledge Roles
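[Figure: knowledge roles and the relationships between them]
knowledge manager: defines knowledge strategy, initiates knowledge development projects, facilitates knowledge distribution
knowledge engineer/analyst: elicits knowledge from the knowledge provider/specialist, elicits requirements from the knowledge user, delivers analysis models to the knowledge system developer
knowledge provider/specialist: validates
knowledge system developer: designs & implements the knowledge system (KS)
knowledge user: uses the KS
project manager: manages the knowledge engineer/analyst and the knowledge system developer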

Classification of Knowledge
Declarative and Procedural Knowledge: knowing what vs. knowing how
Tacit and Explicit Knowledge: hard to articulate vs. easy to articulate
Generic and Specific Knowledge: applying across many situations vs. applying across a few situations

Knowledge Modeling
A way of structuring projects, acquiring and validating knowledge and storing knowledge for future use.
  Symbolic character-based languages, such as logic
  Diagrammatic representations, such as networks and ladders
  Tabular representations, such as matrices
  Structured text, such as hypertext

Knowledge Object
The field of logic has also inspired important knowledge types, notably concepts, attributes, values, rules and relationships
Concepts are the things (physical objects, information, people, etc.) that constitute a domain. Each concept is described by its relationships to other concepts in the domain (e.g. in a hierarchy), and by its attributes and values.
An instance is an instantiated class. For example, "my car" is an instance of the concept "car".
Attributes are the generic properties, qualities or features belonging to a class of concepts, e.g. weight, cost, age and ability.
Values are the specific qualities of a concept, such as its actual weight or age. Values are associated with a particular attribute and can be numerical (e.g. 120 kg, 6 years old) or categorical (e.g. heavy, young).
Rules are statements of the form "IF... THEN...".
Relationships represent the way knowledge objects (such as concepts and tasks) are related to one another. Important examples include "is a" to show classification and "part of" to show composition.

Structured Modeling Techniques


Relational database (RDB)
Object-oriented database (OODB)
eXtensible markup language (XML)
Unified modeling language (UML)

Uses of Knowledge Models


Knowledge elicitation (from an expert)
Validation (with the same expert)
Cross-validation (with another expert)
Knowledge publication
Maintenance and updating of the knowledge system or publication

Knowledge Acquisition (1)


Generic process
1. Conduct an initial interview with the expert to
   a) scope what knowledge should be acquired,
   b) determine to what purpose the knowledge should be put,
   c) gain some understanding of key terminology, and
   d) build a rapport with the expert
2. Transcribe the initial interview and analyze the resulting document (called a protocol) to produce a set of questions that cover the essential issues across the domain and that serve the goals of the knowledge acquisition exercise

Knowledge Acquisition (2)


Generic process (continued)
3. Conduct a second interview with the expert using the pre-prepared questions to provide structure and focus. (This is called a semi-structured interview.)
4. Transcribe the semi-structured interview and analyse the resulting protocol, looking for knowledge types: concepts, attributes, values, classes of concepts, relationships between concepts, tasks and rules.
5. Represent these knowledge elements in a number of formats, for example, hierarchies of classes (taxonomies), hierarchies of constitutional elements, grids of concepts and attributes, diagrams, and flow charts. In addition, document, in a structured manner, anecdotes (war stories) and explanations that the expert gives.

Knowledge Acquisition (3)


Generic process (continued)
6. Use the resulting representations and structured documentation with contrived techniques to allow the expert to modify and expand on the knowledge you have already captured.
7. Repeat the analysis, representation-building and acquisition sessions until the expert is happy that the goals of the project have been realised.
8. Validate the knowledge acquired with other experts, and make modifications where necessary.

Knowledge Acquisition (4)


Issues in Knowledge Acquisition:
Most knowledge is in the heads of experts
Experts have vast amounts of knowledge
Experts have a lot of tacit knowledge
  They don't know all that they know and use
  Tacit knowledge is hard (impossible) to describe
Experts are very busy and valuable people
Each expert doesn't know everything
Knowledge has a "shelf life"

Knowledge Acquisition (5)


Requirements for knowledge acquisition:
Take experts off the job only for short time periods
Allow non-experts to understand the knowledge
Focus on the essential knowledge
Be able to capture tacit knowledge
Allow knowledge to be collated from different experts
Allow knowledge to be validated and maintained

Knowledge Acquisition Techniques (1)


Interviewing
Work observation
Commentary
Protocol analysis
Laddering
Concept sorting
Repertory grid

Interviewing (1)
Commonly used for knowledge acquisition
Ranges from completely unstructured to formally planned, structured interviews
Audio-visual recording is required

Interviewing (2)
Probe code  Question template                                                  Effect
P1          Why would you do that?                                             Converts an assertion into a rule
P2          How would you do that?                                             Generates lower-order rules
P3          When would you do that? Is <the rule> always the case?             Reveals the generality of the rule and may generate other rules
P4          What alternatives to <the prescribed action/decision> are there?   Generates more rules
P5          What if it were not the case that <currently true condition>?      Generates rules for when current condition does not apply
P6          Can you tell me more about <any subject already mentioned>?        Used to generate further dialogue if expert dries up

Interviewing (3)
EX: I actually checked the port of the computer.
KE: Why did you check the port? (P1)
EX: If it's been lightning recently then it's good to check the port, because lightning tends to damage the ports.
KE: Are there any alternatives to that problem? (P4)
EX: Yes, that ought to be prefaced by saying that if it was several keys with odd effects, not necessarily all of them, but two or more.
KE: Why does it have to be more than two?
EX: Well, if it was only one or two keys doing funny things then the thing to do is check they're closing properly; speed would affect all keys, parity would affect about half the keys.

Interviewing (4)
IF    there has been recent lightning
THEN  check port for damage

IF    there are two or fewer malfunctioning keys
THEN  check the key contacts

IF    about half the keyboard is malfunctioning
THEN  check the parity

IF    the whole keyboard is malfunctioning
THEN  check the speed
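The four rules elicited from the interview can be encoded directly as condition/action pairs. The following Python sketch is one possible encoding; the fact names (`recent_lightning`, `malfunctioning_keys`, `keyboard_fraction`) and the `advise` function are hypothetical and chosen only to mirror the wording of the rules.

```python
# Each rule is a (condition, action) pair; conditions read a dict of facts.
RULES = [
    (lambda f: f.get("recent_lightning", False),
     "check port for damage"),
    (lambda f: 0 < f.get("malfunctioning_keys", 0) <= 2,
     "check the key contacts"),
    (lambda f: f.get("keyboard_fraction") == "half",
     "check the parity"),
    (lambda f: f.get("keyboard_fraction") == "all",
     "check the speed"),
]


def advise(facts):
    """Return the actions of every rule whose IF part holds for the facts."""
    return [action for condition, action in RULES if condition(facts)]


print(advise({"recent_lightning": True}))
print(advise({"keyboard_fraction": "half"}))
```

Keeping rules as data rather than as nested `if` statements makes it easy to add the further rules that later probes (P3 to P5) are meant to generate.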

Work observation
Simply observing and making notes as the expert performs their daily activities
Videotaping task performance can be useful, especially if combined with retrospective reporting techniques

Commentary
Think-aloud problem solving
  The expert provides a running commentary on their thought processes as they solve a problem
Retrospective commentary
  The expert is shown a video protocol of their task behaviour and asked to provide a running commentary on what they were thinking and doing

Protocol Analysis
Used to identify basic knowledge objects within a protocol (a transcript)
An interview transcript would be analyzed by highlighting all the concepts that are relevant to the task
Categories of fundamental knowledge such as concepts, attributes, values, tasks and relationships would be extracted
For example, if the transcript concerns the task of diagnosis, then such categories as symptoms, hypotheses and diagnostic techniques would be used for the analysis

Laddering
Involves the creation, reviewing and modification of hierarchical knowledge, often in the form of ladders, i.e. tree diagrams
See example

Knowledge intensive Task Hierarchy


knowledge-intensive task
  analytic task
    classification
    diagnosis
    prediction
    assessment
    monitoring
  synthetic task
    design
      configuration design
    modelling
    planning
    scheduling
    assignment

Analytic versus synthetic tasks


analytic tasks
  system pre-exists
    it is typically not completely "known"
  input: some data about the system
  output: some characterization of the system
synthetic tasks
  system does not yet exist
  input: requirements about system to be constructed
  output: constructed system description

Structure of template description in catalog

General characterization
  typical features of a task
Default method
  roles, sub-functions, control structure, inference structure
Typical variations
  frequently occurring refinements/changes
Typical domain-knowledge schema
  assumptions about underlying domain-knowledge structure

Classification
establish correct class for an object
object should be available for inspection
  "natural" objects
examples: rock classification, apple classification
terminology: object, class, attribute, feature
one of the simplest analytic tasks; many methods
other analytic tasks: sometimes reduced to classification problem
  especially diagnosis

Classification: Pruning method


generate all classes to which the object may belong
specify an object attribute
obtain the value of the attribute
remove all classes that are inconsistent with this value

Classification: inference structure
[Figure: inference structure. object -> generate -> class; class -> specify -> attribute; attribute -> obtain -> feature; class + feature -> match -> truth value]

Classification: method control


while new-solution generate(object -> candidate) do
    candidate-classes := candidate union candidate-classes;
end while

while new-solution specify(candidate-classes -> attribute)
      and length candidate-classes > 1 do
    obtain(attribute -> new-feature);
    current-feature-set := new-feature union current-feature-set;
    for-each candidate in candidate-classes do
        match(candidate + current-feature-set -> truth-value);
        if truth-value = false
        then candidate-classes := candidate-classes subtract candidate;
        end if
    end for-each
end while
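The pruning method can be sketched in Python as below. This is an illustrative reading of the control structure, not a reference implementation: the class table, the attribute names and the `obtain` callback are invented for the example, attributes are specified in plain alphabetical order rather than by any selection heuristic, and a class that does not define an obtained attribute is treated as inconsistent.

```python
# Hypothetical domain knowledge: expected attribute values for each class.
CLASS_FEATURES = {
    "apple":  {"colour": "red",    "shape": "round"},
    "banana": {"colour": "yellow", "shape": "long"},
    "lemon":  {"colour": "yellow", "shape": "round"},
}


def classify(obtain, class_features=CLASS_FEATURES):
    """Prune candidate classes until at most one remains.

    `obtain` is a callable mapping an attribute name to the observed value.
    """
    # generate: start from every class the object may belong to
    candidates = set(class_features)
    features = {}
    # specify: here simply every known attribute, in alphabetical order
    attributes = sorted({a for f in class_features.values() for a in f})
    for attribute in attributes:
        if len(candidates) <= 1:
            break
        # obtain: ask for the value of the specified attribute
        features[attribute] = obtain(attribute)
        # match: keep only classes consistent with all features so far
        candidates = {
            c for c in candidates
            if all(class_features[c].get(a) == v for a, v in features.items())
        }
    return candidates


observed = {"colour": "yellow", "shape": "round"}
print(classify(observed.get))
```

Replacing the alphabetical ordering with a decision tree or an information-theoretic choice of the next attribute would correspond to the attribute-selection variations of the method.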

Classification: method variations


Limited candidate generation
Different forms of attribute selection
  decision tree
  information theory
  user control
Hierarchical search through class structure

Classification: domain schema


[Figure: domain schema with the concepts object type, object class (2+) and attribute (1+; value: universal), the relations has-attribute, class-of and requires, and the rule type class constraint]

Rock classification
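[Figure: rock classification domain model. rock (texture, grain size, colour) has a mineral content (percentage, presence) referring to 1+ mineral from the minerals ontology. Rock types shown: igneous rock, volcanic rock, plutonic rock, syenite, diorite, peridotite, dunite. Mineral types shown: silicate, neso silicate, tecto silicate, olivine, quartz. Rule type: mineral content constraint]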

Nested classification
[Figure: nested classification. Rock classification contains the sub-task "obtain: quartz percentage", which requires identifying the minerals the rock contains; this is done by mineral classification (e.g. quartz, olivine)]

Rock classification prototype

Assessment
find decision category for a case based on domain-specific norms
typical domains: financial applications (loan application), community service
terminology: case, decision, norms
some similarities with monitoring
  differences:
    timing: assessment is more static
    different output: decision versus discrepancy

Assessment: abstract & match method


Abstract the case data
Specify the norms applicable to the case
  e.g. rent-fits-income, correct-household-size
Select a single norm
Compute a truth value for the norm with respect to the case
See whether this leads to a decision
Repeat norm selection and evaluation until a decision is reached

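Assessment: inference structure
[Figure: inference structure. case -> abstract -> abstracted case; abstracted case -> specify -> norms; norms -> select -> norm; abstracted case + norm -> evaluate -> norm value; norm value -> match -> decision]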

Assessment: method control


while new-solution abstract(case-description -> abstracted-case) do
    case-description := abstracted-case;
end while
specify(abstracted-case -> norms);
repeat
    select(norms -> norm);
    evaluate(abstracted-case + norm -> norm-value);
    evaluation-results := norm-value union evaluation-results;
until has-solution match(evaluation-results -> decision);
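A possible Python reading of the abstract-and-match control structure is sketched below. Everything domain-specific is an assumption made for illustration: the housing-style case fields, the 0.3 rent ratio and two-persons-per-room thresholds, and the rule that a single failed norm yields a negative decision. Norms are selected in dictionary order rather than by any heuristic.

```python
def assess(case, abstractions, norms, decide):
    """Abstract the case, then evaluate norms one by one until a decision."""
    abstracted = dict(case)
    # abstract: keep applying abstraction rules until nothing new is derived
    changed = True
    while changed:
        changed = False
        for name, rule in abstractions.items():
            if name not in abstracted:
                abstracted[name] = rule(abstracted)
                changed = True
    results = {}
    # specify + select: all norms apply here, taken in a fixed order
    for norm_name, norm in norms.items():
        results[norm_name] = norm(abstracted)   # evaluate
        decision = decide(results)              # match
        if decision is not None:
            return decision
    return None


# Hypothetical knowledge for a housing-style application.
ABSTRACTIONS = {"rent_ratio": lambda c: c["rent"] / c["income"]}
NORMS = {
    "rent-fits-income": lambda c: c["rent_ratio"] <= 0.3,
    "correct-household-size": lambda c: c["household_size"] <= 2 * c["rooms"],
}


def decide(results):
    """One failed norm is decisive; success needs every norm evaluated."""
    if any(value is False for value in results.values()):
        return "not eligible"
    if len(results) == len(NORMS):
        return "eligible"
    return None


case = {"rent": 500, "income": 2000, "household_size": 3, "rooms": 2}
print(assess(case, ABSTRACTIONS, NORMS, decide))
```

Because `decide` is called after every evaluation, the loop stops as soon as one norm fails, which is the efficiency argument for choosing the norm order carefully.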

Assessment control: UML notation


[Activity diagram: abstract, repeated while [more abstractions]; on [no more abstractions] specify norms, then select norm, evaluate norm, and match decision; [match fails: no decision] returns to select norm; [match succeeds: decision found] ends the activity]

Assessment: method variations


norms might be case-specific
  cf. housing application
case abstraction may not be needed
knowledge-intensive norm selection
  random, heuristic, statistical
  can be key to efficiency
  sometimes dictated by human expertise
    only acceptable if done in a way understandable to experts

Assessment: domain schema


[Figure: domain schema with three rule types. A case abstraction rule: 1+ case datum (value: universal) has abstraction case datum. A requirement: 1+ case datum implies norm (truth-value: boolean). A decision rule: 1+ norm indicates decision.]

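Claim handling for unemployment benefits
[Activity diagram with the parts data entry, claim handling, and financial department. Activities shown: collect data, decide about claim, compute benefit, send notification, prepare payment. Object shown: :claim. Guards on the decision: [no right], [right].]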

Decision rules for claim handling


<norm> WW benefit requirement DEFINES <decision> WW benefit right

<decision rule> benefit decision rule

insured = false
  DEFINES WW-benefit-right.value = no-right
unemployed = false
  DEFINES WW-benefit-right.value = no-right
weeks-worked-requirement = false
  DEFINES WW-benefit-right.value = no-right

insured = true AND unemployed = true AND weeks-worked-requirement = true AND years-worked-requirement = false
  DEFINES WW-benefit-right.value = short-benefit

insured = true AND unemployed = true AND weeks-worked-requirement = true AND years-worked-requirement = true
  DEFINES WW-benefit-right.value = long-benefit
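The five decision rules translate directly into a small function. The sketch below is one possible encoding; the function name `ww_benefit_right` and the use of plain booleans for the norm values are assumptions.

```python
def ww_benefit_right(insured, unemployed, weeks_worked_requirement, years_worked_requirement):
    """Return the value of WW-benefit-right for one claim."""
    # Any of the first three norms being false means there is no right at all.
    if not (insured and unemployed and weeks_worked_requirement):
        return "no-right"
    # All three hold: the years-worked norm separates long from short benefit.
    return "long-benefit" if years_worked_requirement else "short-benefit"


example = ww_benefit_right(True, True, True, False)
```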

Diagnosis
find fault that causes system to malfunction
example: diagnosis of a copier
terminology: complaint/symptom, hypothesis, differential, finding(s)/evidence, fault
nature of fault varies
  state, chain, component
should have some model of system behavior
  default method: simple causal model
sometimes reduced to classification task
  direct associations between symptoms and faults
automation feasible in technical domains

Diagnosis: causal covering method


Find candidate causes (hypotheses) for the complaint using a causal network
Select a hypothesis
Specify an observable for this hypothesis and obtain its value
Verify each hypothesis to see whether it is consistent with the new finding
Continue this process until a single hypothesis is left or no more observables are available

Diagnosis: inference structure
[Figure: inference structure. complaint -> cover -> hypothesis; select -> hypothesis -> specify -> observable -> obtain -> finding; hypothesis + finding -> verify -> result]

Diagnosis: method control


while new-solution cover(complaint -> hypothesis) do
  differential := hypothesis add differential;
end while
repeat
  select(differential -> hypothesis);
  specify(hypothesis -> observable);
  obtain(observable -> finding);
  evidence := finding add evidence;
  foreach hypothesis in differential do
    verify(hypothesis + evidence -> result);
    if result = false
    then differential := differential subtract hypothesis
until length differential =< 1 or no observables left
faults := hypothesis;
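A compact Python sketch of causal covering is given here. The copier faults, observables, and the function name `diagnose` are invented for illustration; the causal network is reduced to a lookup table, which is a simplification of a real causal model.

```python
def diagnose(complaint, causes, predictions, observe):
    """Shrink the differential by testing observables until at most one hypothesis is left."""
    differential = set(causes.get(complaint, ()))        # cover: candidate causes
    evidence = {}
    observables = sorted({o for h in differential for o in predictions[h]})
    for observable in observables:                       # specify the next observable
        if len(differential) <= 1:
            break
        evidence[observable] = observe(observable)       # obtain a finding
        # verify: drop hypotheses that predict a different value for any finding;
        # a hypothesis that predicts nothing for an observable stays consistent
        differential = {
            h for h in differential
            if all(predictions[h].get(o, f) == f for o, f in evidence.items())
        }
    return differential


# Hypothetical copier knowledge.
CAUSES = {"blank-pages": ["toner-empty", "drum-worn", "lamp-broken"]}
PREDICTIONS = {
    "toner-empty": {"toner-light": "on"},
    "drum-worn": {"toner-light": "off", "drum-scratched": "yes"},
    "lamp-broken": {"toner-light": "off", "drum-scratched": "no"},
}
findings = {"toner-light": "off", "drum-scratched": "yes"}
faults = diagnose("blank-pages", CAUSES, PREDICTIONS, findings.get)
```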

Diagnosis: method variations


inclusion of abstractions
simulation methods
see literature on model-based diagnosis
  library of Benjamins

Diagnosis: domain schema


[Figure: domain schema. A system feature is either a system state (status: universal) or a system observable (value: universal). A causal dependency states that a system state can cause a system feature. A fault is a system state with prevalence: number[0..1].]

Monitoring
analyze ongoing process to find out whether it behaves according to expectations
terminology: parameter, norm, discrepancy, historical data
main features:
  dynamic nature of the system
  cyclic task execution
output "just" discrepancy => no explanation
often: coupling monitoring and diagnosis
  output monitoring is input diagnosis

Monitoring: data-driven method
Starts when new findings are received
For a finding, a parameter and a norm value are specified
Comparison of the finding with the norm generates a difference description
This difference is classified as a discrepancy using data from previous monitoring cycles

Monitoring: inference structure


[Figure: inference structure. receive -> new finding; system model -> select -> parameter -> specify -> norm; new finding + norm -> compare -> difference; difference + historical data -> classify -> discrepancy]

Monitoring: method control


receive(new-finding);
select(new-finding -> parameter);
specify(parameter -> norm);
compare(norm + finding -> difference);
classify(difference + historical-data -> discrepancy);
historical-data := finding add historical-data;
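One monitoring cycle can be sketched as a single Python function. The 10% tolerance, the three discrepancy labels, and the feeder-voltage parameter are assumptions for the example, not part of the method.

```python
def monitor(finding, norms, history, tolerance=0.1):
    """Run one data-driven monitoring cycle and return a discrepancy label."""
    parameter, value = finding                 # select: which parameter is this about?
    norm = norms[parameter]                    # specify: the expected value
    difference = value - norm                  # compare
    limit = tolerance * abs(norm)
    previous = [v for p, v in history if p == parameter]
    # classify: use the previous cycle to tell a new deviation from a lasting one
    if abs(difference) <= limit:
        discrepancy = "none"
    elif previous and abs(previous[-1] - norm) > limit:
        discrepancy = "persistent"
    else:
        discrepancy = "new"
    history.append(finding)                    # historical-data := finding add historical-data
    return discrepancy


NORMS = {"feeder-voltage-kv": 22.0}            # hypothetical norm value
history = []
labels = [
    monitor(("feeder-voltage-kv", reading), NORMS, history)
    for reading in (22.5, 25.0, 25.5)
]
```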

Monitoring: method variations


model-driven monitoring
  system has the initiative
  typically executed at regular points in time
  example: software project management
classification function treated as task in its own right
  apply classification method
add data abstraction inference

Prediction
analytic task with some synthetic features
analyses current system behavior to construct description of a system state at a future point in time
example: weather forecasting
often sub-task in diagnosis
also found in knowledge-intensive modules of teaching systems, e.g. for physics
inverse: retrodiction (example: big-bang theory)

Synthesis
Given a set of requirements, construct a system description that fulfills these requirements
requirements (external)
  soft requirement: "fast system"
  hard requirement: "price lower than $2,000"
constraints & preferences (internal)
  preference: "prefer cheapest component"
  constraint: "P166 processor requires 16Mb"

Ideal synthesis method


Operationalize requirements
  preferences and constraints
Generate all possible system structures
Select sub-set of valid system structures
  obey constraints
Order valid system structures
  based on preferences
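The four steps can be sketched as generate, filter, and sort. The sketch below reuses the slide's own examples (price below $2,000, the P166 needing 16 Mb, preferring the cheapest); the component prices, the `base_price` figure, and the function name are made up for illustration.

```python
from itertools import product

# Hypothetical component catalogue: name -> price in dollars.
CPUS = {"P133": 300, "P166": 450}
MEMORY_MB = {8: 100, 16: 180}


def synthesize(max_price=2000, base_price=1000):
    """Generate every PC structure, keep the valid ones, order them by preference."""
    # generate: all possible system structures
    structures = [
        {"cpu": cpu, "memory": mem, "price": base_price + CPUS[cpu] + MEMORY_MB[mem]}
        for cpu, mem in product(CPUS, MEMORY_MB)
    ]
    # select subset: hard requirement (price) and constraint (P166 requires 16 Mb)
    valid = [
        s for s in structures
        if s["price"] < max_price and not (s["cpu"] == "P166" and s["memory"] < 16)
    ]
    # sort: preference "prefer cheapest"
    return sorted(valid, key=lambda s: s["price"])


preferred = synthesize()
```

Exhaustive generation is only workable for tiny catalogues like this one, which is why the method is called ideal rather than practical.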

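Synthesis: inference structure
[Figure: inference structure. requirements -> operationalize -> hard requirements (constraints) and soft requirements (preferences); system composition knowledge -> generate -> possible system structures; possible system structures + constraints -> select subset -> valid system structures; valid system structures + preferences + preference ordering knowledge -> sort -> list of preferred system structures]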

Design
synthetic task
system to be constructed is physical artifact
example: design of a car
can include creative design of components
creative design is too hard a nut to crack for current knowledge technology
sub-type of design which excludes creative design => configuration design

Configuration design
given predefined components, find assembly that satisfies requirements + obeys constraints
example: configuration of an elevator; or PC
terminology: component, parameter, constraint, preference, requirement (hard & soft)
form of design that is well suited for automation
computationally demanding

Elevator configuration: knowledge base reuse

Configuration: propose & revise method

Simple basic loop:
  Propose a design extension
  Verify the new design
  If verification fails, revise the design
Specific domain-knowledge requirements
  revise strategies
Method can also be used for other synthetic tasks
  assignment with backtracking
  skeletal planning

Configuration: method decomposition


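[Figure: method decomposition. requirements -> operationalize -> soft requirements and hard requirements; requirements -> specify -> skeletal design; propose -> extension -> design; design + hard requirements -> verify -> truth value and violation; violation -> critique -> action list -> select -> action -> modify -> design]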

Configuration: method control


operationalize(requirements -> hard-reqs + soft-reqs);
specify(requirements -> skeletal-design);
while new-solution propose(skeletal-design + design + soft-reqs -> extension) do
  design := extension union design;
  verify(design + hard-reqs -> truth-value + violation);
  if truth-value = false
  then
    critique(violation + design -> action-list);
    repeat
      select(action-list -> action);
      modify(design + action -> design);
      verify(design + hard-reqs -> truth-value + violation);
    until truth-value = true;
end while
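A stripped-down Python sketch of propose & revise follows. It assumes a parametric design held in a dictionary, one fix per constraint, and a cap on the number of fixes; the elevator figures, the cable models, and all names are hypothetical.

```python
def propose_and_revise(proposals, constraints, max_fixes=10):
    """Extend the design one element at a time and repair it after each extension."""
    design = {}
    for element, propose in proposals:
        design[element] = propose(design)                # propose an extension
        fixes = 0
        while True:
            violated = [                                 # verify the applicable constraints
                c for c in constraints
                if c["needs"].issubset(design) and not c["holds"](design)
            ]
            if not violated:
                break
            if fixes == max_fixes:
                raise ValueError("design could not be repaired")
            violated[0]["fix"](design)                   # critique, select, and modify
            fixes += 1
    return design


CABLE_MODELS = [800, 1200, 1600]        # hypothetical load capacities in kg


def upgrade_cable(design):
    """Fix action: switch to the next stronger cable model."""
    larger = [m for m in CABLE_MODELS if m > design["cable_capacity"]]
    if not larger:
        raise ValueError("no stronger cable model available")
    design["cable_capacity"] = larger[0]


PROPOSALS = [
    ("car_load", lambda d: 1000),
    ("cable_capacity", lambda d: CABLE_MODELS[0]),       # start with the cheapest model
]
CONSTRAINTS = [{
    "needs": {"car_load", "cable_capacity"},
    "holds": lambda d: d["cable_capacity"] >= d["car_load"],
    "fix": upgrade_cable,
}]
design = propose_and_revise(PROPOSALS, CONSTRAINTS)
```

Because each constraint carries exactly one fix, this sketch never backtracks; a fuller version would rank several candidate fixes per violation.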

Configuration: method variations


Perform verification plus revision only when for all design elements a value has been proposed
  can have a large impact on the competence of the method
Avoid the use of fix knowledge
  Fixes are search heuristics to navigate the potentially extensive space of alternative designs
  alternative: chronological backtracking

Configuration: domain schema


[Figure: domain schema. Elements shown: action type, fix action, fix, constraint, constraint expression, calculation expression, preference (rating: universal), preference expression, design element, component (model list: list), parameter (value: universal). Relations shown: implies, computes, defines preference, has-parameter (0+), with 1+ multiplicities on the design elements used in expressions.]

Types of configuration may require different methods


Parametric design
  Assembly is largely fixed
  Emphasis on finding parameter values that obey global constraints and adhere to preferences
  Example: elevator design
Layout
  Component parameters are fixed
  Emphasis on constructing assembly (topological relations)
  Example: mould configuration
Literature: Motta (1999), Chandrasekaran (1992)

Assignment
create mapping between two sets of objects
  allocation of offices to employees
  allocation of airplanes to gates
mapping has to satisfy requirements and be consistent with constraints
terminology: subject, resource, allocation
can be seen as a degenerate form of configuration design

Assignment: method without backtracking


Order subject allocation to resources by first selecting a sub-set of subjects
If necessary: group the subjects into subject groups for joint resource assignment
  requires special type of constraints and preferences
Take a subject(-group) and assign a resource to it
Repeat this process until all subjects have a resource

Assignment: inference structure
[Figure: inference structure. subjects -> select subset -> subject set -> group -> subject group; subject group + resources + current allocations -> assign -> resource]

Assignment: method control
while not empty subjects do
  select-subset(subjects -> subject-set);
  while not empty subject-set do
    group(subject-set -> subject-group);
    assign(subject-group + resources + current-allocations -> resource);
    current-allocations := < subject-group, resource > union current-allocations;
    subject-set := subject-set/subject-group;
    resources := resources/resource;
  end while
  subjects := subjects/subject-set;
end while
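A minimal Python sketch of assignment without backtracking is shown here. It leaves out the grouping step, takes subjects in the order given, and uses an invented office-allocation example; `assign_all` and `fits` are hypothetical names.

```python
def assign_all(subjects, resources, fits):
    """Give each subject the first free resource that fits; never revisit a choice."""
    allocations = {}
    free = list(resources)
    for subject in subjects:                   # subjects arrive in priority order
        resource = next((r for r in free if fits(subject, r)), None)
        if resource is None:
            raise ValueError(f"no resource left for {subject}")
        allocations[subject] = resource        # extend current-allocations
        free.remove(resource)                  # resources := resources/resource
    return allocations


# Hypothetical data: required and available floor area in square metres.
NEEDS = {"director": 20, "engineer": 10, "clerk": 8}
OFFICES = {"A": 12, "B": 25, "C": 9}
allocation = assign_all(
    ["director", "engineer", "clerk"],
    ["A", "B", "C"],
    lambda employee, office: OFFICES[office] >= NEEDS[employee],
)
```

Since no choice is ever undone, the order of the subjects decides whether a complete allocation is found at all.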

Assignment: method variations


Existing allocations
  additional input
subject-specific constraints and preferences
  see synthesis and configuration-design

Planning
shares many features with design
main difference: "system" consists of activities plus time dependencies
examples: travel planning; planning of building activities
automation only feasible if the basic plan elements are predefined
consider use of the general synthesis method (e.g. therapy planning) or the configuration-design method

Planning method
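[Figure: inference structure. requirements + plan goal -> operationalize -> hard requirements (constraints) and soft requirements (preferences); plan composition knowledge -> generate -> possible plans; possible plans + constraints -> select subset -> valid plans; valid plans + preferences + preference ordering knowledge -> sort -> list of preferred plans]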

Scheduling
Given a set of predefined jobs, each of which consists of temporally sequenced activities called units, assign all the units to resources at time slots
  production scheduling in plant floors
Terminology: job, unit, resource, schedule
Often done after planning (= specification of jobs)
Take care: use of terms planning and scheduling differs

Scheduling: temporal dispatching method

Specify an initial schedule
Select a candidate unit to be assigned
Select a target resource for this unit
Assign unit to the target resource
Evaluate the current schedule
Modify the schedule, if needed

Scheduling: inference structure


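[Figure: inference structure. job -> specify -> schedule; schedule -> select -> candidate unit; candidate unit -> select -> target resource; candidate unit + target resource -> assign -> schedule; schedule -> verify -> truth value; modify -> schedule]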

Scheduling: method control


specify(jobs -> schedule);
while new-solution select(schedule -> candidate-unit) do
  select(candidate-unit + schedule -> target-resource);
  assign(candidate-unit + target-resource -> schedule);
  evaluate(schedule -> truth-value);
  if truth-value = false
  then modify(schedule -> schedule);
end while
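The select and assign steps of temporal dispatching can be sketched in Python as below. This is a constructive sketch only: the evaluate and modify steps are omitted, any unit may run on any resource, and the jobs, machines, and durations are invented.

```python
def dispatch(jobs, resources):
    """Assign each unit, in job order, to the resource that becomes free earliest."""
    free_at = {r: 0 for r in resources}        # resource -> time it becomes free
    schedule = []
    for job, units in jobs.items():
        ready = 0                              # units of a job are temporally ordered
        for unit, duration in units:           # select the candidate unit
            resource = min(free_at, key=free_at.get)    # select the target resource
            start = max(ready, free_at[resource])
            end = start + duration
            schedule.append((job, unit, resource, start, end))   # assign
            free_at[resource] = end
            ready = end
    return schedule


# Hypothetical plant-floor jobs: (unit name, duration).
JOBS = {"J1": [("cut", 2), ("weld", 3)], "J2": [("cut", 1)]}
schedule = dispatch(JOBS, ["M1", "M2"])
```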

Scheduling: method variations


Constructive versus repair method
Refinement often necessary
  see scheduling literature
  catalog of Hori (IBM Japan)

Scheduling: typical domain schema


[Figure: typical domain schema. Elements shown: schedule, job (release-date: time, due-date: time), job unit, unit, resource, preference constraint, resource capacity constraint. A job includes {temporally ordered} job units; a unit is performed at a resource {dynamically linked}. Further attribute groups shown: type: string, start-time: time, end-time: time; and start: time, end: time, resource-type: string.]

Modeling
included for completeness
"construction of an abstract description of a system in order to explain or predict certain system properties or phenomena"
examples:
  construction of a simulation model of nuclear accident
  knowledge modeling itself
seldom automated => creative steps
  exception: chip modeling

In applications: typical task combinations


monitoring + diagnosis
  Production process
monitoring + assessment
  Nursing task
diagnosis + planning
  Troubleshooting devices
classification + planning
  Military applications

Example: apple-pest management


[Activity diagram: monitor crop; on [possible threat] or [possible pest], identify pest, then plan measure, then execute plan]

Comparison with O-O analysis


Reuse of functional descriptions is not common in O-O analysis
  notion of functional object
But: see work on design patterns
  strategy patterns
  templates are patterns of knowledge-intensive tasks
Only real leverage from reuse if the patterns are limited to restricted task types

Ontology Engineering
Ontology Development

What Is An Ontology?
An ontology is an explicit description of a domain:
  concepts
  properties and attributes of concepts
  constraints on properties and attributes
  individuals (often, but not always)

An ontology defines
  a common vocabulary
  a shared understanding

Ontology Examples
Taxonomies on the Web
  Yahoo! categories
Catalogs for on-line shopping
  Amazon.com product catalog
Domain-specific standard terminology
  Unified Medical Language System (UMLS)
  UNSPSC: terminology for products and services
  Common Information Model (CIM): a semantic model describing the components of a power system at an electrical level and the relationships between each component

What Is Ontology Engineering?


Defining terms in the domain and relations among them
  Defining concepts in the domain (classes)
  Arranging the concepts in a hierarchy (subclass-superclass hierarchy)
  Defining which attributes and properties (slots) classes can have and constraints on their values
  Defining individuals and filling in slot values

Why Develop an Ontology?


To share common understanding of the structure of information
  among people
  among software agents
To enable reuse of domain knowledge
  to avoid re-inventing the wheel
  to introduce standards to allow interoperability

Why Develop an Ontology?


To make domain assumptions explicit
  easier to change domain assumptions (consider a genetics knowledge base)
  easier to understand and update legacy data
To separate domain knowledge from the operational knowledge
  re-use domain and operational knowledge separately (e.g., configuration based on constraints)

Backbone of Other systems


[Figure: ontologies at the centre. They declare structure for databases and knowledge bases, and provide the domain description for software agents, problem-solving methods, and domain-independent applications.]

Ontology Development Process


Determine the domain and scope of the ontology
Consider reusing existing ontologies
Enumerate important terms in the ontology
Define the classes and the class hierarchy
Define the properties of classes (slots)
Define the facets of the slots
Create instances
Ontology Development 101: A Guide to Creating Your First Ontology

Pizza Domain
[Figure: pizza domain example with the nodes DMRs, The Special, Olive Oil, Onion, and Provolone, connected by the relations Made by, Offers, and Contains]

Competency Questions
Which styles should I consider when choosing a pizza?
Is a Sicilian pizza a tomato or olive oil base?
Does tuna go well with pepperoni?
What is the best choice of pizza for a vegetarian?
Which characteristics of a pizza affect its appropriateness for a party?
Does the flavor of an ingredient change with the base?
What were good toppings for a thick crust?

Consider Reuse
Why reuse other ontologies?
  to save the effort
  to interact with the tools that use other ontologies
  to use ontologies that have been validated through use in applications

What to Reuse?
Ontology libraries
  DAML ontology library (www.daml.org/ontologies)
  Ontolingua ontology library (www.ksl.stanford.edu/software/ontolingua/)
  Protégé ontology library (protege.stanford.edu/plugins.html)
Upper ontologies
  IEEE Standard Upper Ontology (suo.ieee.org)
  Cyc (www.cyc.com)
General ontologies
  DMOZ (www.dmoz.org)
  WordNet (www.cogsci.princeton.edu/~wn/)
Domain-specific ontologies
  UMLS Semantic Net
  GO (Gene Ontology) (www.geneontology.org)
  CIM

Enumerate Important Terms


What are the terms we need to talk about?
What are the properties of these terms?
What do we want to say about the terms?

Define Classes and the Class Hierarchy


A class is a concept in the domain
  a class of pizzas
  a class of pizza shops
  a class of ingredients
A class is a collection of elements with similar properties
Instances of classes
  the pizza you will have for lunch

Class Inheritance
Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy)
A class hierarchy is usually an IS-A hierarchy:
  an instance of a subclass is an instance of a superclass
If you think of a class as a set of elements, a subclass is a subset

Class Inheritance - Example


Mushroom is a subclass of Topping
  Every Mushroom is a Topping
Green-pepper is a subclass of Vegetable
  Every Green-pepper is a Vegetable
Provolone is a subclass of Cheese
  Every Provolone is a Cheese
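The IS-A reading of the hierarchy maps onto class inheritance in a programming language. In the Python sketch below, placing Vegetable and Cheese under Topping is an assumption (the slide only states the three subclass pairs), and the `name` and `cost` slots are illustrative.

```python
class Topping:
    """Superclass: slots defined here are inherited by every subclass."""

    def __init__(self, name, cost):
        self.name = name
        self.cost = cost


class Vegetable(Topping):      # assumption: vegetables are toppings
    pass


class Cheese(Topping):         # assumption: cheeses are toppings
    pass


class Mushroom(Topping):
    pass


class GreenPepper(Vegetable):
    pass


class Provolone(Cheese):
    pass


provolone = Provolone("provolone", 1.5)
# An instance of a subclass is also an instance of each superclass.
is_cheese = isinstance(provolone, Cheese)
is_topping = isinstance(provolone, Topping)
```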

What should be the specification? The Kind? The hunk-of?

Modes of Development
top-down: define the most general concepts first and then specialize them
bottom-up: define the most specific concepts and then organize them in more general classes
combination: define the more salient concepts first and then generalize and specialize them

Documentation
Classes (and slots) usually have documentation
  Describing the class in natural language
  Listing domain assumptions relevant to the class definition
  Listing synonyms
Documenting classes and slots is as important as documenting computer code

Define Properties of Classes: Slots


Slots in a class definition describe attributes of instances of the class and relations to other instances
Each Pizza will have crust, sauce, and toppings.

Necessary conditions? Necessary and sufficient? Sufficient?

Properties (Slots)
Types of properties
  intrinsic properties: crust, sauce
  extrinsic properties: name, price
  parts: ingredients for a pizza
  relations to other objects: pizza store, customer
Simple and complex properties
  simple properties (attributes): contain primitive values (strings, numbers)
  complex properties: contain (or point to) other objects (e.g., a pizza instance)

Slot and Class Inheritance


A subclass inherits all the slots from the superclass
  If a topping has a name and a cost, a cheese also has a name and a cost
If a class has multiple superclasses, it inherits slots from all of them
  Use great care!!

Property Constraints
Property constraints (facets) describe or limit the set of possible values for a slot
  The name of a pizza is a string
  The pizza producer is an instance of PizzaShop
  A PizzaShop has exactly one location

Common Facets
Slot cardinality: the number of values a slot has
Slot value type: the type of values a slot has
Minimum and maximum value: a range of values for a numeric slot
Default value: the value a slot has unless explicitly specified otherwise

Common Facets: Slot Cardinality


Cardinality
  Cardinality N means that the slot must have N values
Minimum cardinality
  Minimum cardinality 1 means that the slot must have a value (required)
  Minimum cardinality 0 means that the slot value is optional
Maximum cardinality
  Maximum cardinality 1 means that the slot can have at most one value (single-valued slot)
  Maximum cardinality greater than 1 means that the slot can have more than one value (multiple-valued slot)

Common Facets: Value Type


String: a string of characters ("The Special")
Number: an integer or a float (15, 4.5)
Boolean: a true/false flag
Enumerated type: a list of allowed values (high, medium, low)
Complex type: an instance of another class
  Specify the class to which the instances belong
  The Pizza class is the value type for the slot "produces" at the PizzaShop class
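Cardinality and value-type facets can be checked mechanically, which is what knowledge-acquisition tools do when slot values are entered. The Python sketch below is one hypothetical way to do it; the `Slot` layout and `check_slot` function are assumptions, not the API of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    """A slot with a value-type facet and minimum/maximum cardinality facets."""
    name: str
    value_type: type
    min_cardinality: int = 0
    max_cardinality: Optional[int] = None      # None means unbounded


def check_slot(slot, values):
    """Return a list of facet violations for the values filled into a slot."""
    errors = []
    if len(values) < slot.min_cardinality:
        errors.append(f"{slot.name}: needs at least {slot.min_cardinality} value(s)")
    if slot.max_cardinality is not None and len(values) > slot.max_cardinality:
        errors.append(f"{slot.name}: allows at most {slot.max_cardinality} value(s)")
    for value in values:
        if not isinstance(value, slot.value_type):
            errors.append(f"{slot.name}: {value!r} is not a {slot.value_type.__name__}")
    return errors


# "The name of a pizza is a string"; "A PizzaShop has exactly one location".
name = Slot("name", str, min_cardinality=1, max_cardinality=1)
location = Slot("location", str, min_cardinality=1, max_cardinality=1)

ok = check_slot(name, ["The Special"])
too_many = check_slot(location, ["Main Street", "Harbour Road"])
wrong_type = check_slot(name, [15])
```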

Domain and Range of Slot


Domain of a slot: the class (or classes) that have the slot
  More precisely: class (or classes) instances of which can have the slot
Range of a slot: the class (or classes) to which slot values belong

Facets and Class Inheritance


A subclass inherits all the slots from the superclass
A subclass can override the facets to narrow the list of allowed values
  Make the cardinality range smaller
  Replace a class in the range with a subclass
[Figure: Pizza has the slot producer with range PizzaShop; The Special is-a Pizza, DMRs is-a PizzaShop, and the producer slot of The Special points to DMRs]

Create Instances
Create an instance of a class
  The class becomes a direct type of the instance
  Any superclass of the direct type is a type of the instance
Assign slot values for the instance frame
  Slot values should conform to the facet constraints
  Knowledge-acquisition tools often check that

Asset Categorization
Power Distribution Network Asset Categorization

Development Process
Define purpose, domain and scope
Pose competency questions and describe the domain knowledge informally
Analyze the description to capture concepts and properties
Consider reuse of an existing ontology, i.e. CIM, and map concepts into CIM
Model asset classes and relationships
Verify interchangeability, expressivity, reusability, extensibility and integrability

Purpose, Domain and Scope


The purpose is to facilitate the determination of risks, costs and social factors associated with the implementation of the power distribution network
The domain encompasses the medium voltage (MV) distribution feeder, including network components, network operation, and operational environment
The scope is limited to capturing information that aids in determining the risks, costs and social factors involved with the distribution feeder

Elicitation of Domain Knowledge


Competency questions are formed and asked
Human experts are then interviewed and relevant documents are researched to elaborate an informal description of the domain, i.e. the MV distribution feeder
The example that follows shows some of the informal domain description elicited from the experts

Domain Informal Description: Example


What is a power distribution network?

It is a part of the power system. It distributes electric energy from the main substation to distribution substations and transformers. It is situated in diverse landscapes and environments. It runs along public roads. It also runs through fields and forests. It can be of overhead or underground construction, or a combination of both. An overhead power line is placed above ground with appropriate clearance from nearby structures and trees. An underground power line is placed underground with some kind of protection. An underground power line can also be put above ground, inside a type of structure, e.g. buildings, bridges, etc.

Informal Description Analysis


An annotation technique is used to capture the keywords that represent the concepts in the domain. A concept is characterized by features that differentiate it from other concepts. For example, the concepts in the power distribution system domain include distribution feeder, overhead line, underground line, and location.

Classes and Relationships


Transform concepts into asset classes; their characteristics turn into class properties
Employ the CIM specification to constrain the modeling work
Reuse existing CIM models where applicable
Extend or develop new models to suit the application
Verify interchangeability, expressivity, reusability, extensibility and integrability

Asset Classes and Their Relationships

[Figure: class diagram of the asset model. Classes shown: PSR, Equipment, EquipmentContainer, ConductingEquipment, Substation, Conductor, WireType, Feeder, Jumper, Fuse, Insulator, OHLine, Location, Switch, Pole, DS, LBS, Cable, Hanger, OHConductor, Joint, Termination, UGLine, Duct]
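A fragment of such an asset model can be sketched as Python classes. The inheritance edges of the figure did not survive in this text, so the sketch assumes the usual CIM backbone (PSR standing for PowerSystemResource, with Equipment, ConductingEquipment, Conductor, Switch, and EquipmentContainer beneath it) and guesses where OHConductor, Cable, Fuse, and Feeder attach; treat every edge as an assumption.

```python
class PowerSystemResource:                     # "PSR" in the figure
    def __init__(self, name):
        self.name = name


class Equipment(PowerSystemResource):
    pass


class EquipmentContainer(PowerSystemResource):
    """A resource that groups pieces of equipment."""

    def __init__(self, name):
        super().__init__(name)
        self.equipment = []

    def add(self, item):
        self.equipment.append(item)
        return item


class ConductingEquipment(Equipment):
    pass


class Conductor(ConductingEquipment):
    pass


class Switch(ConductingEquipment):
    pass


class Substation(EquipmentContainer):
    pass


class Feeder(EquipmentContainer):              # assumed placement
    pass


class OHConductor(Conductor):                  # assumed placement
    pass


class Cable(Conductor):                        # assumed placement
    pass


class Fuse(Switch):                            # assumed placement
    pass


feeder = Feeder("feeder-1")
feeder.add(OHConductor("span-1"))
feeder.add(Fuse("fuse-1"))
# Query the container by superclass, which is what the class hierarchy buys us.
conductors = [e.name for e in feeder.equipment if isinstance(e, Conductor)]
```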

Thank you
