Unlocking Data

Data storage, administration and publication are generally the three facets of a traditional Content Management System, but key enterprise data assets are no longer confined to predictably structured, transactional databases and data warehouses. Stores of information and knowledge are buried in spreadsheets, in emails, in Access databases and in documents in various formats.

  • Scalability: semantic repositories need to scale to the large amount of Semantic Data that is available and adapt to inevitable growth. Greater schema flexibility than the relational model offers would enhance performance, though exploiting Semantic Data on the Web requires managing a scale that so far only the major search engine providers can handle.
  • Source: the very nature of the Semantic Web is defined by interaction with data sources outside and beyond any proprietary system, which will demand effective implementation of trust mechanisms and policies for privacy and rights management. Though this will be primarily realised by the Content Management System front end, the choice of storage, and in particular the use of a triple-store, also plays a part.
  • Dynamicity: an important property of Semantic Data is its dynamicity. While some data, such as public administration archives or collections of text documents, might not change too frequently, other data coming from remote connections such as RSS feeds and microblogging may update on a per-millisecond basis. The effects of such changes have to be addressed through a combination of stream processing, mining, and semantics-based techniques.
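
The three concerns above can be made concrete with a small sketch. The following is a minimal in-memory triple store in Python, not a production repository. The names `Statement` and `TripleStore`, the `source` and `updated` fields, and the rule of keeping one value per subject and predicate are all assumptions made for illustration. It shows schema flexibility (any predicate can be added without a schema change), source tracking (each statement records where it came from) and dynamicity (a newer statement replaces an older one, and stale updates are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    subject: str
    predicate: str
    obj: str
    source: str     # hypothetical provenance label, e.g. "email" or "rss"
    updated: float  # hypothetical timestamp, in seconds


class TripleStore:
    """Illustrative store that keeps one statement per (subject, predicate)."""

    def __init__(self):
        self._data = {}

    def add(self, stmt):
        """Store stmt unless a strictly newer statement already holds its key."""
        key = (stmt.subject, stmt.predicate)
        current = self._data.get(key)
        if current is not None and current.updated > stmt.updated:
            return False  # stale update: keep the newer statement
        self._data[key] = stmt
        return True

    def match(self, subject=None, predicate=None, source=None):
        """Return statements matching every field that is not None."""
        return [
            s for s in self._data.values()
            if (subject is None or s.subject == subject)
            and (predicate is None or s.predicate == predicate)
            and (source is None or s.source == source)
        ]


store = TripleStore()
store.add(Statement("report-7", "author", "J. Smith", "spreadsheet", 100.0))
store.add(Statement("report-7", "status", "draft", "email", 100.0))
store.add(Statement("report-7", "status", "published", "rss", 200.0))
# An older update arriving late is rejected rather than overwriting newer data.
late_accepted = store.add(Statement("report-7", "status", "review", "email", 150.0))
```

Filtering by `source` in `match` is the hook where a trust policy could be applied, for example by querying only the sources an organisation has chosen to rely on.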

Search and Disambiguation

Disambiguation and relevance are key filters when importing or exporting data. Effective data awareness and contextual information can help to deliver and rank pertinent results, and help users to focus on the part of the data that is relevant.
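
One simple way to picture disambiguation by context is to score each candidate meaning by how many words it shares with the text the user is working with. The sketch below is a toy heuristic in Python, assuming a hypothetical function `rank_candidates` and made-up candidate descriptions. A real system would draw on richer semantic signals than plain word overlap.

```python
def rank_candidates(context, candidates):
    """Rank candidate meanings by word overlap with the context.

    context: free-text string describing what the user is working on.
    candidates: dict mapping a candidate label to a description string.
    Returns labels sorted from most to least overlap; ties keep the
    insertion order of `candidates`.
    """
    context_words = set(context.lower().split())
    scores = {
        label: len(context_words & set(description.lower().split()))
        for label, description in candidates.items()
    }
    return sorted(scores, key=lambda label: scores[label], reverse=True)


# Hypothetical candidates for the ambiguous term "Jaguar".
candidates = {
    "Jaguar (animal)": "large cat native to the americas",
    "Jaguar (car)": "british manufacturer of luxury car models",
}
ranked = rank_candidates("buying a used luxury car", candidates)
```

Here the car sense shares the words "luxury" and "car" with the context, so it is ranked ahead of the animal sense, which shares none.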

The large and growing amount of Semantic Data enables new kinds of applications. At the same time, more data means that ultimately there might be more results produced from it than one needs or wants. This creates an attractive proposition: to focus investment on the ICT assets which currently exist in an organisation in order to leverage greater value from them.