Data Model

The Australian National Corpus Meta Data Model (hereafter AusNC_MD_Model) is intended as an integration format for collecting, connecting and enriching the descriptions provided by AusNC corpus providers.

As such it may be said to include any element (i.e., class or property) found in a content provider's description. Giving an account of all these elements is clearly an impossible task, since they form an open set, i.e. a set that can be extended as new providers join the AusNC information space.

There is however a well-identified set of elements that AusNC_MD_Model uses in order to carry out its task. These elements can be divided into two main categories:

  1. The elements re-used from other namespaces, and
  2. The elements introduced by AusNC_MD_Model

A .owl version of the data model is available for download for those using ontology software such as Protégé.

The AusNC_MD_Model re-uses elements from the following namespaces:


Namespace name:Web address:
The Resource Description Framework (RDF) and the RDF Schema (RDFS) namespaces.
The OAI Object Reuse and Exchange (ORE) namespace
The Simple Knowledge Organization System (SKOS) namespace.
The Dublin Core namespaces for elements, terms and types, abbreviated as DC, DCTERMS and DCMITYPE respectively.

The WGS84 Namespace. A vocabulary for representing latitude, longitude and altitude information in the WGS84 geodetic reference datum.
The GeoNames Ontology makes it possible to add geospatial semantic information to the World Wide Web.
The FRBR Core ontology is a conceptual entity-relationship model developed by the International Federation of Library Associations and Institutions (IFLA) that relates user tasks of retrieval and access in online library catalogues and bibliographic databases from a user’s perspective.
The MADS vocabulary. The purpose of this list of relator terms and associated codes is to allow the relationship between a name and a resource to be designated in bibliographic record &
DOLCE is an upper ontology (also called a top-level ontology or foundation ontology), that is, an ontology which describes very general concepts that are the same across all knowledge domains.
TriX (Triples in XML) is a serialization format for RDF (Resource Description Framework) graphs. It is an XML format for serializing Named Graphs and RDF Datasets which offers a compact and readable alternative to the XML-based RDF/XML syntax.
The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.
The Bibliographic Ontology describes bibliographic things on the Semantic Web in RDF. This ontology can be used as a citation ontology, as a document classification ontology, or simply as a way to describe any kind of document in RDF.
The OLAC metadata set is based on the Dublin Core (DC) metadata set and uses all fifteen elements defined in that standard. To provide greater precision in resource description, OLAC follows the DC recommendation for qualifying elements by means of element refinements or encoding schemes.


FOAF (an acronym of Friend of a friend) is a machine-readable ontology describing persons, their activities and their relations to other people and objects. Anyone can use FOAF to describe him or herself.
Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources. These aggregations, sometimes called compound digital objects, may combine distributed resources with multiple media types including text, images, data, and video.
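
As a rough illustration of how the two categories of elements can be told apart by namespace, the sketch below expands prefixed names (such as dc:title) into full URIs and classifies a URI as re-used, introduced, or provider-specific. The prefix map uses the commonly published namespace URIs for a few of the vocabularies listed above, which is an assumption and may differ from the addresses given for the model. The AUSNC_NS value and the function names expand and category are hypothetical, introduced only for this example.

```python
# Prefix-to-namespace map for some of the re-used vocabularies.
# Assumption: these are the commonly published namespace URIs; the
# AusNC_MD_Model documentation may list different web addresses.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ore": "http://www.openarchives.org/ore/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Hypothetical placeholder for the namespace of elements that
# AusNC_MD_Model introduces itself; not the real address.
AUSNC_NS = "http://example.org/ausnc_md_model/"


def expand(curie, namespaces=NAMESPACES):
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in namespaces:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return namespaces[prefix] + local


def category(uri):
    """Classify a full URI by the namespace it belongs to."""
    if uri.startswith(AUSNC_NS):
        return "introduced"
    if any(uri.startswith(ns) for ns in NAMESPACES.values()):
        return "re-used"
    # Anything else comes from the open set of provider elements.
    return "provider-specific"
```

For example, category(expand("foaf:Person")) yields "re-used", while a URI from a namespace outside the map falls into the open set of provider-specific elements described earlier.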

In the following segment, the elements of AusNC_MD_Model are presented in a formal way. Classes are introduced first, properties subsequently, both in alphabetical order, giving priority to re-used elements.
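
The presentation order just described can be sketched as a sort key. This is only one reading of the rule: classes before properties, re-used elements before introduced ones within each group, then alphabetical by name. The Element type, its fields, and the function names are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class Element(NamedTuple):
    name: str     # local name of the class or property
    kind: str     # "class" or "property"
    reused: bool  # True if the element comes from another namespace


def presentation_key(element):
    # False sorts before True, so classes precede properties and
    # re-used elements precede introduced ones; names break ties A to Z.
    return (element.kind != "class", not element.reused, element.name.lower())


def presentation_order(elements):
    """Return the elements in the assumed presentation order."""
    return sorted(elements, key=presentation_key)
```

Encoding the priorities as a tuple keeps the rule in one place, so a different reading (for example, alphabetical order taking precedence over re-use) only requires swapping two tuple entries.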
