Archive for the ‘Data Mining’ Category

What’s Semantic Web

Saturday, September 1st, 2007

The Semantic Web is an effort to turn web data into web knowledge more effectively. The basic idea is to add formal descriptive material to each web page. This material is invisible to human readers, but it makes the page’s content easy for computers to interpret.

According to the W3C’s definition: “The Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources, where on the original Web mainly concentrated on the interchange of documents. It is also about language for recording how the data relates to real world objects.”

The approach to the Semantic Web rests on two technologies recommended by the W3C:

  1. RDF (Resource Description Framework): It is used to represent information and share knowledge on the Web. The best-known uses of RDF include RSS feeds, Apple podcasts, and the Open Directory Project’s data.
  2. OWL (Web Ontology Language): It is used to publish and share sets of terms called ontologies, supporting advanced Web search, software agents, and knowledge management. Again, an example is the Open Directory Project’s category information.
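To make the RDF idea concrete, here is a minimal sketch in plain Python of how RDF-style statements can be held as subject-predicate-object triples and queried by pattern. It is only an illustration, not the W3C data model in full: the `example.org` URIs, the `dc:`-prefixed property names, and the `match` helper are all assumptions made up for this example.

```python
from typing import Iterator, Optional, Set, Tuple

# An RDF statement is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]

# Hypothetical data: the URIs and property names below are invented
# for illustration and do not come from any real vocabulary file.
TRIPLES: Set[Triple] = {
    ("http://example.org/post/1", "dc:title", "What's Semantic Web"),
    ("http://example.org/post/1", "dc:date", "2007-09-01"),
    ("http://example.org/post/1", "dc:subject", "Semantic Web"),
    ("http://example.org/post/2", "dc:title", "Categories of Web Data Mining"),
    ("http://example.org/post/2", "dc:date", "2007-08-31"),
}


def match(
    triples: Set[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


# Ask a machine-readable question: what are the titles of all resources?
titles = sorted(o for _, _, o in match(TRIPLES, predicate="dc:title"))
print(titles)
```

Because every fact has the same three-part shape, a program can answer questions about a page without parsing its visible text, which is the point of the invisible descriptive material mentioned above.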

Categories of Web Data Mining

Friday, August 31st, 2007

We have talked about data mining for quite a while. From now on, we will focus on web data mining. Current research in this area falls into three categories:

  1. Web Structure Mining, which includes information retrieval and web search, and hyperlink-based ranking.
  2. Web Content Mining, which includes clustering and classification.
  3. Web Usage Mining.
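As an illustration of the first category, the sketch below shows one well-known form of hyperlink-based ranking, a PageRank-style power iteration. It is a simplified sketch under stated assumptions, not necessarily the method the book presents: the four-page link graph is hypothetical, and the damping factor of 0.85 and the fixed iteration count are conventional choices assumed here.

```python
from typing import Dict, List


def pagerank(
    links: Dict[str, List[str]],
    damping: float = 0.85,   # assumed conventional value
    iterations: int = 50,    # fixed count instead of a convergence test
) -> Dict[str, float]:
    """Rank pages by repeatedly passing each page's score along its out-links.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
GRAPH = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

ranks = pagerank(GRAPH)
best = max(ranks, key=ranks.get)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page C collects links from three other pages and ends up ranked highest, while page D, which nothing links to, ends up lowest. The ranking uses only the link structure and never looks at page content, which is what separates structure mining from content mining.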

This categorization comes from Z. Markov and D. Larose’s excellent book, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage.
