Linked Open Data

Linking Open Data
Linking the world of data (from the LOD mailing list)
Acknowledgement: Tom Heath (Talis)
Ying Ding
http://info.slis.indiana.edu/~dingying/

What is happening now
  • User-generated content is growing tremendously
  • Isolated content badly needs to get connected
  • The world is connected, and so are data, information and knowledge

Old terms
  • Data: sensing the world
      What you sense (see, hear, smell, touch)
  • Information: perceiving the world
      Perceive the sensed data
  • Knowledge: contextualizing information
      Comprehend the perceived information
      Add context
      Context ultimately determines what is actually what

What is our daily life
  • Access data
  • Manipulate data (add, delete, change)
  • Process data
  • Generate information (tables, forms)
  • Create knowledge (reports, papers, ...)
  • Data is our life; data is our daily bread

Do we have identifiers for data?
  • Not really important if data is small and individual
  • Really important if data is huge and connected

Do we need identifiers for our data?
  • Why do we need a name or a social security number?
  • Can you refer to someone without an identifier? ("a person with a good heart")
  • Identifiers make our busy life less messy; we only get 24 hours per day, not more

Add identifiers to our data
  • Give each piece of data a unique identifier that everyone agrees on: the perfect world of our dreamland
      We would not have any integration problems, and most IT departments could be closed
  • Different groups give different identifiers to the same data: this is closer to our daily life
      We can live with that; standardization bodies and IT people are helping us
  • We are happy that we can refer to data

Where are our data
  • In computers
  • On the Web
  • In my paper notes
  • In printed books
  • Data are being digitized and made available online

Web data
  • Data on the Web
  • Online journals
  • Blogs
  • Wikis

Data in the physical world
  • Yourself
  • A table
  • A book in the library
  • The computer you are using

The boundary is blurring
  • A paper is both in your hand and on the Web

How to refer to data
  • Web data
      DOI (Digital Object Identifier)
      OpenID (people, ...)
      URI (blog, wiki, homepage, ...)

URI (Uniform Resource Identifier)
  • Identifies or names a resource on the Internet
  • Main purpose: to enable interaction with representations of the resource over a network, typically the WWW, using specific protocols (from Wikipedia)
  • URN: like a person's name
      urn:isbn:0-486-27557-4 (the book Romeo and Juliet)
  • URL: like a street address
      http://www.slis.indiana.edu

Linked Data
  • A term coined by Tim Berners-Lee
  • It describes HTTP-based data access by reference for the Web
  • The current Web is changing from hypertext links (linking documents) to hyperdata links (linking data)
  • Data are small components of the resources
  • It drills deep into the details of the resources
  • Linked Data provides a powerful mechanism for meshing disparate and heterogeneous data
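
The difference between the URN and URL examples above (a name versus an address) shows up when the two identifiers are split into parts. The following is a minimal sketch using Python's standard `urllib.parse`; the helper name `describe_uri` is an assumption made for this illustration, not something from the slides.

```python
from urllib.parse import urlparse


def describe_uri(uri: str) -> dict:
    """Split a URI into its main parts and flag HTTP(S) URIs."""
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,
        # A URN carries no network location; a URL does.
        "host": parts.netloc,
        "path": parts.path,
        "is_http": parts.scheme in ("http", "https"),
    }


# The two example identifiers from the URI slide.
urn = describe_uri("urn:isbn:0-486-27557-4")
url = describe_uri("http://www.slis.indiana.edu")

print(urn)
print(url)
```

Only the URL names a host that a client could contact, which is why Linked Data relies on HTTP URIs for lookups.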

Vision from Sir Tim Berners-Lee
  • "The Semantic Web isn't just about putting data on the web. It is about making links."

Four rules for linking data
  • Use URIs as names for things
  • Use HTTP URIs so that people can look up those names
  • When someone looks up a URI, provide useful information (URI dereferencing)
  • Include links to other URIs, so that they can discover more things
Breaking the rules does not destroy anything, but it misses an opportunity to make data interconnected. This in turn limits the ways the data can later be reused in unexpected ways. It is the unexpected reuse of information that is the value added by the Web.

W3C SWEO Linking Open Data Project
The project aims to:
  • Publish existing open-license datasets as Linked Data on the Web
  • Interlink things between different data sources
  • Develop clients and applications that consume Linked Data from the Web
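
Three of the four rules listed earlier can be checked without touching the network; rule 3 (returning useful information on lookup) needs a real HTTP request and is left out here. This is only a sketch: the function names are invented for the illustration, the `example.org` URI is hypothetical, and treating "links to other URIs" as "at least one link to a different host" is an assumption, not part of the rules themselves.

```python
from urllib.parse import urlparse


def is_http_uri(uri: str) -> bool:
    """Rules 1 and 2: the name is a URI, and specifically an HTTP(S) URI."""
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_rules(subject: str, links: list[str]) -> dict:
    """Check a thing's URI and its outgoing links against rules 1, 2 and 4."""
    subject_host = urlparse(subject).netloc
    return {
        "http_uri_name": is_http_uri(subject),
        # Assumption: a link counts as "other" if it points to another host.
        "links_to_other_uris": any(
            is_http_uri(link) and urlparse(link).netloc != subject_host
            for link in links
        ),
    }


good = check_rules(
    "http://dbpedia.org/resource/Berlin",
    ["http://example.org/places/Berlin"],  # hypothetical external URI
)
bad = check_rules("urn:isbn:0-486-27557-4", [])

print(good)
print(bad)
```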

Bubbles in May 2007
  • Over 500M RDF triples
  • Around 120K RDF links between data sources

Bubbles in April 2008
  • >2B RDF triples
  • Around 3M RDF links

Bubbles now

Organizations participating in the LOD community
  • Academic: MIT, Univ Southampton, DERI, Open Univ, Univ London, Univ Hannover, Penn State Univ, Univ Leipzig, Univ Karlsruhe, Joanneum (AT), Free Univ Berlin, Cyc, SouthEast Univ (CN)
  • Commercial: BBC, OpenLink, Talis, Zitgist, Garlik, Mondeca, Renault, Boad Interactive

What are Linked Data?
  • Linked Data require RDF
  • Why not XML? Different model theory
  • But not all RDF data are Linked Data
  • Your RDF data have to comply with the four rules stated by Berners-Lee

What is RDF?
Basic ideas behind RDF
  • RDF uses Web identifiers (URIs) to identify resources
  • RDF describes resources with properties and property values
  • Everything can be represented as triples
  • The essence of RDF is the (s, p, o) triple: Resource (subject), Property (predicate), Value (object)
  • The subject has a property whose value is the object

RDF triples
  • A Resource (subject) is anything that can have a URI: URIs or blank nodes
  • A Property (predicate) is one of the features of the Resource: URIs
  • A Property value (object) is the value of a Property, which can be a literal or another resource: URIs, literals, blank nodes
  • Literals can be the object of an RDF statement, but cannot be the subject or the predicate

Do you have Linked Data?
  • Linked Data are just RDF triples
  • How can I get RDF triples?
      Relational databases: D2R tools can convert them for you
      RDFizers from SIMILE: can convert JPEG, MARC/MODS, OAI-PMH, OCW (MIT Open Course), email, BibTeX, Java, Javadoc, etc. to RDF
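
The (s, p, o) model and its position rules described above can be made concrete with a few small types. This is a minimal sketch, not an RDF library: the class names, `make_triple`, and the `example.org` subject URI are assumptions introduced for the illustration. The FOAF property URI and the name "Ying Ding" follow the slides.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identified by a URI."""


class BlankNode(str):
    """A resource without a URI, known only inside one graph."""


class Literal(str):
    """A plain value such as a name or a number."""


class Triple(NamedTuple):
    subject: Union[URI, BlankNode]
    predicate: URI
    obj: Union[URI, BlankNode, Literal]


def make_triple(subject, predicate, obj) -> Triple:
    """Build a triple, enforcing which kinds of term may appear where."""
    if not isinstance(subject, (URI, BlankNode)):
        raise TypeError("subject must be a URI or a blank node")
    if not isinstance(predicate, URI):
        raise TypeError("predicate must be a URI")
    if not isinstance(obj, (URI, BlankNode, Literal)):
        raise TypeError("object must be a URI, a blank node or a literal")
    return Triple(subject, predicate, obj)


# Hypothetical subject URI; a literal is allowed in the object position.
name_triple = make_triple(
    URI("http://example.org/people/ying"),
    URI("http://xmlns.com/foaf/0.1/name"),
    Literal("Ying Ding"),
)

# A literal in the subject position is rejected.
try:
    make_triple(Literal("Ying Ding"), URI("http://xmlns.com/foaf/0.1/name"), Literal("x"))
    literal_subject_rejected = False
except TypeError:
    literal_subject_rejected = True

print(name_triple)
```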

Rules of thumb
  • Understand your data
  • Decide what you want to have in your data
  • Do not reinvent: REUSE!
  • Potential ontologies/vocabularies: FOAF, SIOC, Geo

URI aliases
  • Different URIs for the same non-information resource (Berlin, etc.)
  • Use owl:sameAs to link these URI aliases

More principles
  • Linked Data is simply about using the Web to create typed links between data from different sources
  • The principles of Linked Data are to:
      Use the RDF data model to publish structured data on the Web
      Use RDF links to interlink data from different data sources
      Use HTTP URIs to identify resources
      Avoid other URI schemes (URNs or DOIs)

Power of Linked Data
  [Figure: RDF graph. Recoverable node and edge labels: ying, rdf:type, foaf:Person, foaf:name, Ying Ding, dblp:publications, foaf:publication, foaf:knows, Stefan, foaf:based_near, db:Galway, dp:population, 72K, dp:Dublin, skos:subject (twice), dp:Cities_in_Ireland]

How to become a bubble
Publishing your bubble
  • Are you ready?

Dereferencing HTTP URIs
  • Information resources (resources available on the Web): HTTP GET returns response code 200 OK
  • Non-information resources (real-world objects that exist outside of the Web): HTTP GET returns 303 See Other (303 redirect)
  • You are not your homepage, but you can be dereferenced by your homepage

Publish your bubble
Step 1: Choosing URIs
  • Use HTTP URIs for everything (http://)
  • Make them dereferenceable
  • Try to use existing dereferenceable URIs to represent common things (city, music, artist, etc.): http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies
  • For instance: Geonames, DBpedia, Musicbrainz, dbtune, RDF Book Mashup
  • Keep implementation info out of your URIs
  • Keep your URIs stable and persistent

Publish your bubble
Step 1: Choosing URIs
  • http://dbpedia.org/resource/Berlin
  • http://dbpedia.org/page/Berlin
  • http://dbpedia.org/data/Berlin
  • http://id.dbpedia.org/Berlin
  • http://pages.dbpedia.org/Berlin
  • http://data.dbpedia.org/Berlin
  • http://dbpedia.org/Berlin
  • http://dbpedia.org/Berlin.html
  • http://dbpedia.org/Berlin.rdf
  • Reference: Sauermann et al.: Cool URIs for the Semantic Web (tutorial on URI dereferencing and content negotiation)

Publish your bubble
Step 2: Choose the vocabularies to represent information
  • Reuse terms from well-known vocabularies wherever possible
      Friend of a Friend (FOAF)
      Dublin Core (DC)
      Semantically-Interlinked Online Communities (SIOC)
      Description of a Project (DOAP)
      Simple Knowledge Organization System (SKOS)
      Creative Commons (CC)
      More: http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies
  • Only define new terms yourself if you cannot find the required terms in existing vocabularies

Publish your bubble
Step 2: Choose the vocabularies to represent information
  • If you really have to define your own vocabulary:
      Do not define new vocabularies from scratch
      Provide for both humans and machines (rdfs:comment, rdfs:label)
      Make term URIs dereferenceable
      Make use of other people's terms
      State all important information explicitly
      Do not create over-constrained, brittle models; leave some flexibility for growth

Publish your bubble
Step 3: Link your bubble with other bubbles
  • RDF links enable browsers and crawlers to navigate between data sources and to discover additional data
  • foaf:knows, foaf:based_near, foaf:topic_interest
  • owl:sameAs (maps different URI aliases)

Publish your bubble
Step 3: Link your bubble with other bubbles
  • Auto-generating RDF links: owl:sameAs
      ISBN for books (e.g., RDF Book Mashup)
      More complex property-based algorithms
      Interlinking DBpedia and Geonames
      Interlinking Jamendo and MusicBrainz

Publish your bubble
Recipes for publishing different information as Linked Data on the Web
  • Things must be identified with dereferenceable HTTP URIs
  • If such a URI is dereferenced asking for the MIME type application/rdf+xml, the data source must return an RDF/XML description of the identified resource
  • URIs that identify non-information resources should return an HTTP 303 redirect
  • Besides RDF links to resources within the same data source, RDF descriptions should also contain RDF links to resources in other data sources, so that you can browse the web of data

Test your bubble
Step 4: Test and debug Linked Data
  • Vapour, a Linked Data validation service (http://vapour.sourceforge.net/)
  • Use Linked Data browsers to see whether your information displays correctly and your RDF links work: Tabulator, Marbles, OpenLink RDF Browser, Disco

Welcome to the bubble world
  • Very excited! Then what is my contribution and benefit?
  • Add more data to RDF data
  • Increase semantic content
  • Bring the Web to its full potential!
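
The publishing recipe above (dereferenceable HTTP URIs, a 303 redirect for non-information resources, RDF/XML when application/rdf+xml is requested) can be sketched as a small routing function without any network access. Everything here is an assumption for illustration: the `example.org` base, the `/resource/`, `/page/`, `/data/` layout (modelled on the DBpedia URIs shown earlier), and the simplified Accept handling, which only looks for the MIME type as a substring and ignores q-values.

```python
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: int
    location: Optional[str] = None
    content_type: Optional[str] = None


BASE = "http://example.org"  # hypothetical data source


def dereference(path: str, accept: str = "text/html") -> Response:
    """Decide how a server might answer an HTTP GET for `path`."""
    if path.startswith("/resource/"):
        # Non-information resource: redirect with 303 See Other to a
        # document about the thing, chosen by the requested MIME type.
        name = path[len("/resource/"):]
        wants_rdf = "application/rdf+xml" in accept
        target = "/data/" if wants_rdf else "/page/"
        return Response(303, location=BASE + target + name)
    if path.startswith("/data/"):
        # Information resource: the RDF/XML description itself.
        return Response(200, content_type="application/rdf+xml")
    if path.startswith("/page/"):
        # Information resource: the human-readable page.
        return Response(200, content_type="text/html")
    return Response(404)


rdf_lookup = dereference("/resource/Berlin", accept="application/rdf+xml")
html_lookup = dereference("/resource/Berlin")
document = dereference("/data/Berlin", accept="application/rdf+xml")

print(rdf_lookup)
print(html_lookup)
print(document)
```

A Linked Data browser would follow the 303 to `/data/Berlin` and parse the RDF, while a Web browser asking for HTML would land on `/page/Berlin`.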

What can LOD bring?
  • It will lift the current document Web up to a data Web
  • LOD browsers let you navigate between different data sources by following RDF links
  • It can drill down to a finer granularity of information, allowing:
      Finer-grained search on the Web
      Question-answering search on the Web
      Meshing up different data through RDF links
  • It makes applications built on top easier to develop
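
Navigating by following RDF links can be illustrated with the labels from the "Power of Linked Data" figure earlier. The triples below are an assumed reading of that figure (the figure itself did not survive extraction, and the dblp:publications and foaf:publication edges are left out because their endpoints are unclear); `follow_links` is a hypothetical helper, not part of any LOD tool.

```python
from collections import deque

# Assumed reading of the figure: (subject, predicate, object).
TRIPLES = [
    ("ying", "rdf:type", "foaf:Person"),
    ("ying", "foaf:name", "Ying Ding"),
    ("ying", "foaf:knows", "Stefan"),
    ("Stefan", "foaf:based_near", "db:Galway"),
    ("db:Galway", "dp:population", "72K"),
    ("db:Galway", "skos:subject", "dp:Cities_in_Ireland"),
    ("dp:Dublin", "skos:subject", "dp:Cities_in_Ireland"),
]


def follow_links(start, triples):
    """Breadth-first walk that follows every outgoing link from `start`."""
    outgoing = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))

    seen = {start}
    queue = deque([start])
    discovered = []
    while queue:
        node = queue.popleft()
        for p, o in outgoing.get(node, []):
            discovered.append((node, p, o))
            if o not in seen:
                seen.add(o)
                queue.append(o)
    return discovered


facts = follow_links("ying", TRIPLES)
for fact in facts:
    print(fact)
```

Starting from one person, the walk reaches the population of a city held in a different data source. Facts about dp:Dublin are not reached, because links are only followed in the forward direction.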

Document Web vs. Data Web

Document Web
  • Glued by hyperlinks
  • Data are HTML pages
  • Query results are HTML pages, which cannot be further processed
  • Data are just interlinked, but not integrated
  • Data access through different APIs

Data Web
  • Glued by RDF links
  • Data are RDF triples
  • Query results are RDF triples, which can easily be further processed (e.g., by web services)
  • Data are interlinked and integrated, and links are typed
  • Data access through a single, standardized access mechanism (maybe in the future it will be called the LOD API?)

More about LOD
  • LOD Wiki: http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
  • Tutorial on how to publish LOD data: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

Further readings and tools
  • W3C Track LOD at WWW2008: http://www.w3.org/2008/Talks/WWW2008-W3CTrack-LOD.pdf
  • Linked Data Planet in New York 2008: http://linkeddata.org/slides/2008-06-nyc-ldp.pdf
  • LDOW2008 workshop at WWW2008: http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol369/
  • ISWC 2008 LOD tutorial: http://events.linkeddata.org/iswc2008tutorial
  • LOD mailing list
