-  W3CHINA.ORG Forum - Semantic Web · Description Logic · Ontology · RDF · OWL  (http://bbs.xml.org.cn/index.asp)
--  『 Semantic Web / Description Logic / Ontology 』  (http://bbs.xml.org.cn/list.asp?boardid=2)
----  Which reasonably mature domain ontology examples are currently available?  (http://bbs.xml.org.cn/dispbbs.asp?boardid=2&rootid=&id=56965)


--  Author: hoyou
--  Posted: 12/19/2007 4:01:00 PM

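--  Which reasonably mature domain ontology examples are currently available?
I have read a lot of articles, but they all stay at the theoretical level. Some concrete examples would help me understand the material better.
Which domain ontologies can currently be downloaded from the web? I recall a biomedical resource called UMLS, but I don't know whether it can be downloaded. Has anyone in China built even the simplest example? Thanks.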

--  Author: whfcarter
--  Posted: 12/19/2007 4:22:00 PM

--  
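You can browse and download the currently publicly available Semantic Web datasets from this page:
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets
You can also use a Semantic Web gateway such as Swoogle or Watson to find domain ontologies.
Finally, on the bioinformatics question: the dedicated document repositories include PubMed; MeSH is a thesaurus; and UMLS is a semantic network that annotates the relations between genes and other substances. I believe they all require registration, after which you send an email and they should mail you a disc.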
--  Author: yyandrea
--  Posted: 1/18/2008 5:26:00 PM

--  
I still haven't found anything.
Could you explain in a bit more detail?
I'm a beginner and don't quite follow.
--  Author: jpz6311whu
--  Posted: 1/18/2008 5:45:00 PM

--  

SWEO Community Project: Linking Open Data on the Semantic Web

Datasets
This page collects RDF datasets that are part of the Semantic Web.

To be part of the Semantic Web, data has to be accessible as RDF over the HTTP protocol through at least one of the access methods listed below. The more methods the better (but avoid aliases).

The page is part of the SWEO Interest Group Community Projects effort.
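
As a rough illustration of what "accessible as RDF over the HTTP protocol" can look like from the client side, the sketch below builds (but does not send) a request that asks a server for RDF/XML through the Accept header. The helper name build_rdf_request and the example.org URI are placeholders and not part of the wiki page; whether a given dataset honours the header depends on that server.

```python
from urllib.request import Request

RDF_XML = "application/rdf+xml"

def build_rdf_request(uri: str) -> Request:
    """Return a GET request asking for an RDF/XML representation of `uri`.

    Hypothetical helper for illustration; nothing is sent until the
    request is passed to urllib.request.urlopen().
    """
    return Request(uri, headers={"Accept": RDF_XML}, method="GET")

# Placeholder URI (assumption); substitute a real dereferenceable URI.
req = build_rdf_request("http://example.org/resource/Berlin")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing req to urllib.request.urlopen() would perform the actual fetch; urlopen follows HTTP redirects by default.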


Datasets available with dereferenceable URIs - LinkedData
Example things are starting points for use of RDF browsers.

ISWC and ASWC 2007 Conference Data The data set contains data about tracks, papers, sessions, talks, workshops, tutorials, invited talks, panels, organisers, people, organisations and topics. The data is available as Linked Data, SPARQL endpoint and as RDF dumps.

RKB Explorer Data 25 different domains, each with a separate dataset. The data sets are focused on scientific research, and the larger ones include DBLP, Citeseer, CORDIS, NSF, EPSRC, RAE2001 as sources. The data is available as Linked Data, SPARQL endpoint and RDF dumps, and a simple browser is provided. Semantic Web Sitemaps provided.

lingvoj.org provides URIs and multilingual labels for hundreds of human languages. Example entries:French language, Chinese language.

Wikicompany is a free, worldwide business directory that anyone can edit. OpenLink Software hosts a Linked Data version of the directory, extracted by the DBpedia team using DBpedia's software. Example entries: Northwest Airlines, Apple Computer, OpenLink Software.

Christian Becker's flickr wrappr pulls photos related to DBpedia resources from flickr and serves them as RDF. Example: Paris

Joshua Tauberer's GovTrack.us publishes linked data about members of the U.S. Congress, as well as bills, committees and votes. 12M triples. Example resources, announcement

US Census RDF version of the 2000 US census dataset. Consists of around 1 billion triples. Served as linked data and via a SPARQL endpoint. Example things: USA New Jersey

WordNet is a large lexical database of English. Currently being RDFized by a W3C Best Practices Task Force. Details ... Example thing: the verb "read" in the first sense

DBpedia: Linked Data version of Wikipedia. The DBpedia dataset currently provides information about more than 1.95 million “things”, including at least 80,000 persons, 70,000 places, 35,000 music albums, 12,000 films. Provides descriptions in 12 different languages. Altogether, the DBpedia dataset consists of 103 million RDF triples. The dataset is interlinked with various other data sources. Example things: Paul McCartney, Berlin, Tetris

Open Cyc Semantic Web version of the Open Cyc ontology. Supports content negotiation on concept URIs. Example things: RetailStore, Dog. Concept Browser

DBLP Bibliography Server Berlin: Provides bibliographic information about scientific papers. Size of the dataset: 800,000 articles and 400,000 authors, approx. 15 million triples. Example thing: Tim Berners-Lee in the bibliography. The server provides the November 2006 version of the DBLP dataset. As the Hannover DBLP Bibliography server is updated weekly, you should set RDF links to that server and not the Berlin one.

DBLP Bibliography Server Hannover: Derived from the FU Berlin server, but with more links between the publications (e.g., to conference series) and updated weekly. Unfortunately, no backward compatibility with regard to URIs (URIs for persons do not include numbers anymore). Example thing: Tim Berners-Lee

RDF Book Mashup: Provides bibliographic information, reviews and sales offers for most books that have an ISBN. Maps data from Amazon and Google Base to RDF. Size of the dataset: Unknown, billions of triples. Example thing: "Weaving the Web", the book

Project Gutenberg Catalog Linked data version of and SPARQL endpoint over the Project Gutenberg catalog. Interlinked with DBpedia. Example author: Ed Krol

Gene Ontology Annotations Chris Mungall (Berkeley Drosophila Genome Project) serves 6 million annotations from Gene Ontology database

Gene Fruitfly Embryogenesis Images Chris Mungall (Berkeley Drosophila Genome Project) serves a database containing annotated images of gene expression in fruitfly embryogenesis.

IS-Group@Freie Universität Berlin RDF data about the activities and members of the IS-Group at Freie Universität Berlin is available. Example thing: DOAP description of D2R Server project

ECS School Southampton Serves data about members, projects and seminars on the Web as Linked Data. Example person: Marcus Cobden

MindSwap There is RDF data about the activities and members of the Mindswap group at Maryland available.

Revyu has reviews and ratings in RDF/XML available via dereferenceable URIs and a SPARQL endpoint. FOAF and tag information is also available by the same mechanism.

ESWC2006 Conference Dataset describes many aspects of ESWC2006, according to the ESWC2006 Conference Ontology describing authors, papers, session and workshops. Mostly available via dereferenceable URIs. The data might need checking over, and it's not a huge number of triples, but is also well complemented by similar data sets from ISWC2006.

ESWC2007 Conference Dataset describing authors, papers, session and workshops. Available as Linked Data, HTML and via a SPARQL endpoint.

GeoNames Information about over 6 million places and geographic features. Example thing: Berlin

Several community sites with FOAF-enabled profiles: see the table at the FOAF wiki

UniProt provides a large life sciences data set with 300M+ triples (contact Eric Jain for a login)

OpenGuides are a network of wiki-based city guides. Example: Open Guide to Milton Keynes. Each node has RDF/XML describing the thing the node is about, in addition to wiki versioning information. URIs might need tidying up, and don't currently support 303 redirects.

Advogato is exporting its users' profiles using FOAF.

Robots.net is exporting its users' profiles using FOAF.

TalkDigger is exporting its users' profiles using FOAF and the conversation data using SIOC (note: some problems between the SIOC users and the FOAF profiles still need to be resolved).

Locationary provides geographic information from different information sources. Still prototypical.

dbtune provides an RDF version of the Magnatune music database using the Music Ontology and D2R server.

SemanticWebCentral is a software development site for Open Source Semantic Web tools (think SourceForge for the Semantic Web). It publishes information about its projects and developers in RDF, using the GForge ontology.

Semantic Web School - Vienna: The Semantic Web School provides the latest information on issues about the Semantic Web in the form of its D2R-mapped press collection with glossary, wikilinks and so forth, using the D2R Server and RSS features.

SKOS Data Zone

Jamendo Music server exposing artists, albums, tracks, covers, lyrics, tags, P2P links (BitTorrent, ed2k)

CIA Factbook D2R Server publishing the CIA Factbook. Example thing: Botswana

Bio2RDF Semantic web atlas of postgenomic knowledge about human and mouse.

Eurostat Countries and Regions D2R Server publishing statistical information about European countries and regions. Example thing: Leipzig. See also LOD Eurostat page

News about the Semantic Web provided by the Semantic Web School Austria.

Freshmeat DOAP 43000 DOAP profiles of Freshmeat projects.

Open Archives Demo showing how an OAI-PMH endpoint is exposed as Linked Data with the OAI2LOD server.

BBC Later and Top of the Pops Data about episodes and tracklists. Interlinked with MusicBrainz and DBpedia.

See also http://esw.w3.org/topic/AnRdfHarvesterStartingPoint


Datasets available as RDF Dumps
QB's Quotes RDF contains at least 42,000 famous quotations with author and subject, from Quotations Book

SIMILE Data Collection containing various datasets including CIA's World Factbook, Library of Congress' Thesaurus of Graphic Materials, National Cancer Institute's cancer thesaurus, Web Consortium's Technical Reports.

dbpedia: Dataset containing extracted data from Wikipedia. About 1.6 million concepts described by 91 million triples, including abstracts in 10 different languages.

DMOZ RDF Dump

GovTrack.us RDF data about the U.S. congress

U.S. Census data comprises population statistics at various geographic levels, from the U.S. as a whole down through states, counties and sub-counties (roughly, cities and incorporated towns). More than 700 million triples.

UniProt provides a large life sciences data set with 300M+ triples

SwetoDblp ontology focused on bibliography data of publications from DBLP with additions that include affiliations, universities, and publishers

Wikipedia³: 47 million triples containing extracted metadata from Wikipedia.

Chef Moz: 290344 restaurants - 104856 reviews - 59243 links to reviews - 2402 editors available as RDF under a free license.

DOAP Store provides daily generated dumps with all its DOAP project descriptions. RDF/XML, N3

Rpm Find - This is freely downloadable from http://rpmfind.net/linux/rpm2html/mirror.html. The RDF data expands to about 1.3GB - not sure what that equates to in numbers of triples.

Open Directory - this is the classic RDF source but historically has had some problems with RDF correctness. http://rdf.dmoz.org/

Music Brainz - this service dumps its data as RDF fairly frequently at ftp://ftp.musicbrainz.org/pub/musicbrainz/data/. Currently the zipped version of this data is 102MB

Bitzi - a collaborative file describing service. Dumps data as RDF here: http://bitzi.com/openbits/datadump. The data consists of 330,026 discrete files, 270MB uncompressed.

Texai Lexicon - This is a machine-readable dictionary derived from WordNet 2.1, Wiktionary, the CMU Pronouncing Dictionary and the OpenCyc lexicon. Each lexicon word sense entry contains links back to the source dictionary entry, and also to OpenCyc if the entry has been mapped to the Cyc ontology.

Lots of others. Please feel free to add plenty  
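
Several entries in this list give a dump size but no triple count. One rough way to estimate the count is sketched below, assuming the dump has first been converted to N-Triples (one statement per line). That conversion is an assumption here and not how most of these dumps ship. count_ntriples is a hypothetical helper that only counts lines and does no real parsing or validation.

```python
def count_ntriples(lines):
    """Count statements in an iterable of N-Triples lines.

    Blank lines and '#' comment lines are skipped; every other line
    ending in '.' is counted as one triple. No syntax checking is done.
    """
    count = 0
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("."):
            count += 1
    return count

# Made-up sample data, not taken from any of the dumps listed here.
sample = [
    "# made-up sample data",
    '<http://example.org/a> <http://example.org/p> "x" .',
    "",
    "<http://example.org/a> <http://example.org/q> <http://example.org/b> .",
]
print(count_ntriples(sample))
```

Because the function accepts any iterable of strings, an open file object can be passed directly, so a large dump does not have to be loaded into memory at once.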


Datasets available via SPARQL Endpoints
See Collection of SPARQL Endpoints
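
For readers who have not queried a SPARQL endpoint before, here is a minimal sketch of how a query is typically sent: as an HTTP GET with the query text in a `query` parameter. The endpoint URL is a placeholder and build_sparql_url is a hypothetical helper; the code only builds the URL and does not contact any server.

```python
from urllib.parse import urlencode

def build_sparql_url(endpoint: str, query: str) -> str:
    """Return `endpoint` with the SPARQL query URL-encoded as `query=`."""
    # Append with '&' if the endpoint URL already carries parameters.
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode({"query": query})

# Placeholder endpoint (assumption); use one from the endpoint collection.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
url = build_sparql_url("http://example.org/sparql", QUERY)
print(url)
```

The resulting URL can be opened with urllib.request.urlopen() or pasted into a browser; the result format returned depends on the endpoint.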


Datasets you can RDFize yourself
If you have some data that needs to be RDFized, and wonder how, look also here:

RDFImportersAndAdapters lists software projects that convert data to RDF

Datasets currently being RDFized
MusicBrainz. Please ask Frederick Giasson for details.

GEMET. GEMET is the GEneral Multilingual Environmental Thesaurus of the European Environment Agency. Please ask Bernard Vatant for details.


Datasets that would be nice to have on the Web of Data
Lots. Please feel free to add plenty  

Open Library project that builds an open, digital library that is supposed to contain all books that have been published. Simple data model so wrapping it should be easy. See also Frederick's post on the open library and the BIBO ontology

U.S. Census Tiger/Line data on roads, zip code geography, places, etc. See also LOD Eurostat page (there is some overlap with Geonames)

IMDB Data. Not sure of the licensing terms. Source. Can be converted to MySQL using JMDB. Source

Open University Course Units. See LabSpace for an idea of what is available, currently in OU-specific XML wrapped in a zip file  

GCIDE_XML. The GNU version of The Collaborative International Dictionary of English (Webster's). Available now as XML. Source

Internet Archive. Provides multiple interesting datasets.

Library of Congress Catalog. Provides information about books and millions of other digital assets.

FreeDB. A database to look up CD information using the internet. Source

See also list containing lots of economic datasets.

List of US government repositories


Papers and Web Resources on serving Data on the Semantic Web
Tim Berners-Lee: Linked Data

Tim Berners-Lee: Browsable Data

Alistair Miles et al.: Best Practice Recipes for Publishing RDF Vocabularies

Christian Bizer, Richard Cyganiak, Tom Heath: How to publish Linked Data on the Web (Tutorial)

Ding, Finin: Characterizing the Semantic Web on the Web

Frederick Giasson: Distribution of semantic web data

Richard Cyganiak: Debugging Semantic Web sites with cURL

Frederick Giasson: RDF dump vs. dereferencable URIs

Henry Story: I have a web 2.0 name ! together with Foaf enabling an enterprise and a discussion of the posts by Richard Cyganiak.

ESW Wiki: DereferenceURI

ESW Wiki: SparqlEndpointDescription

Francois Belleau: Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge System

Frederick Giasson: Content negotiation: bad use cases I recently observed


No longer available
Roller Blog Entries: There was a D2R Server running at http://roller.blogdns.net:2020/ which exported blog posts from a Roller Blog Server using the AtomOWL vocabulary. See SPARQLing Roller for details. The D2RQ mapping file should still be useful.


Related Weblogs
Kingsley Idehen's Blog



--  Author: wanggou
--  Posted: 1/19/2008 9:38:00 AM

--  
The simplest and easiest one to find is the pizza example from the Protégé tutorial. Study it carefully.
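
To give a feel for the kind of class hierarchy such a tutorial ontology contains, here is a toy sketch in plain Python. It is not OWL and not the tutorial's actual file: the class names only loosely follow the pizza example, and the SUBCLASS_OF table and is_subclass helper are made up for illustration. A real ontology built in Protégé also carries property restrictions that this sketch ignores.

```python
# Toy subclass table (child -> parent); illustrative names only.
SUBCLASS_OF = {
    "MargheritaPizza": "NamedPizza",
    "NamedPizza": "Pizza",
    "MozzarellaTopping": "CheeseTopping",
    "CheeseTopping": "PizzaTopping",
}

def is_subclass(cls, ancestor):
    """True if `cls` equals `ancestor` or reaches it via SUBCLASS_OF."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)  # None once the top of the chain is passed
    return False

print(is_subclass("MargheritaPizza", "Pizza"))    # True
print(is_subclass("MozzarellaTopping", "Pizza"))  # False
```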