Preprints of the Metadiversity Conference Proceedings

  Session 3: The Challenge in Earth Observation, Ecosystem Monitoring, and Environmental Information

Environmental Metainformation in the Work Program of the European Environment Agency

STEFAN JENSEN, Project Leader, European Topic Centre/Catalogue of Data Sources (ETC/CDS), European Environment Agency

ABSTRACT

In 1996 ETC/CDS started operation with the task and vision of building a metainformation system on the environment at the European scale on behalf of the European Environment Agency (EEA). Following the guidance of the G7 metadata initiative, the ETC/CDS advisory committee, and the EEA representatives involved, and building on thorough experience in developing metainformation systems on the environment for Austria and Germany, the CDS fields and data model were developed and agreed upon. Technically, the model is based on the GELOS standard, with some optional fields added in response to national demand. To meet the need for a multilingual environmental thesaurus, GEMET was built by merging existing European thesauri and adding translations for missing languages. The main purpose of the thesaurus was defined as indexing metainformation. Customers of the systems are seen to be the EIONET, the general public, and national initiatives. On this basis, software development has resulted in a flexible input tool (WinCDS) and a state-of-the-art retrieval tool (WebCDS). For thesaurus purposes, a maintenance tool as well as a simple "thesaurus browser" (ThesShow) are available. Metainformation collection started in 1997 with other ETCs and the EEA, following a supply-driven approach (registering available and used sources). Only after the selection criteria were agreed on early in 1998 could they be used to follow a demand-driven collection approach. Interpreting the vision as a call to supply, sooner or later, seamless access to all kinds of environmental information through a catalogue, the GELOS+ fields are shaped according to recommendations from American and European standardizing bodies (both currently merging into the international ISO 15046-15). This enables appropriate description of spatial environmental data and goes in parallel with the implementation of spatial visualization and query of metadata in the CDS software.
Collection has so far resulted in a database with 1280 data sources and 550 addresses. Taking for granted that filling and updating the catalogue will be a core activity of the ETC/CDS in the years to come, several crucial decisions need to be made. The maintenance of these sources is currently hampered by the lack of binding updating obligations and by some changing policies.

The EEA's strategy of a "European Reference Centre on Environmental Information" must be used to overcome these shortcomings by: clearly defining the role of CDS as the entry point (catalogue) for retrieval of quality-assured environmental information; identifying the integration with the EEA data warehouse and the EEA GELOS server; and involving the ETCs in a constant process that makes intensive use of the selection criteria for national data collection. The interest of the majority of the member states in using the CDS or a similar approach shows the opportunity to use it as a harmonizing approach for managing both their own business case in metainformation and their European reporting obligations. To further support this, reporting obligations from EU legislation are currently being added to the database. Beyond this, the CDS system will develop into a distributed environmental information system that forms an entry point to various environmentally related sources located at distributed providers, whether the information comes from space or from earth science, from mapping authorities or from monitoring networks. It will bridge the gap between public science and administration and between Europe and its regions, as well as linking to global services.

My name is Stefan Jensen. I am the project leader of one of the nine European Topic Centres set up by the European Environment Agency. The European Environment Agency (EEA) was established in 1994 to report on the state of the environment.

The Organization

The European Environment Information and Observation Network (EIONET) was established through the work of the EEA. As I mentioned, the network includes nine Topic Centres. Most of them are subject-oriented. For example, one deals with nature conservation (this is where the biodiversity aspect would fit in). One deals with air pollution, one with soil, and so on. All the major topics are covered. Our Topic Centre–the Catalogue of Data Sources (CDS) Topic Centre–deals with the information aspects of the network. One of our principal tasks is to gather metainformation, or metadata, in order to facilitate access to information collected by other partners in the network. This Topic Centre is an organization consisting of nine active partners from four European countries–Austria, Germany, Italy, and Sweden.

The work functions like this: There are 15 member countries in the European Union, but at the moment we are working with 18 countries (the 15 plus Iceland, Liechtenstein, and Norway). These countries have named 18 National Focal Points. In addition, for each of the Topic Centres there is a so-called National Reference Centre (NRC) in each member state. The NRCs carry out the work in individual topic areas. So the core of the network consists of about 200 contacts involved in the work.

Then there are other institutions, such as scientific organizations, that are named by the member countries and that play an important role in environmental reporting. These other institutions are also part of the network and, to different degrees, they are involved in the current work.

The Task

Our task is, speaking on the meta-level, to create a European-wide metainformation system on the environment.

We began this task in 1996 by conceptualizing and implementing a common data model, a common language. The next thing was to promote the new data model to institutions that were not involved in this process. This initial effort also addressed the issue of existing national environmental information systems–metainformation systems–within the member states (which are, to a certain degree, already available, although the vast majority of the member states do not have them yet).

Next came the development of software, first for data collection and second for the retrieval of data. An important issue in Europe is that a total of 13 languages need to be addressed, so building a multilingual environment was made part of the work.

The development of selection criteria was another issue we had to address. When we started data collection, we had not yet set selection criteria. As a result, we had to define some selection criteria based on the kinds of sources we were using.

We are continuing to collect data. We also have to maintain the meta-database, and we have to supply access to distributed systems, which we are only now starting to do. So at the moment, we have no distributed system yet. But like the Global Change Master Directory in the U.S., we have one database that is currently used.

The User Groups

Who are the user groups? Some of the user groups, including the EEA, the Topic Centres, and the National Focal Points, are pretty obvious as core users. But other users are not so clear. They include institutions running national systems (national metainformation initiatives), other institutions working in the field, and the "general public." There certainly are various other institutions that might be interested in these kinds of data, but we are still in a learning process about them and other potential users.

The Data Models

What is our data model, and what are we building on? We are building on the Global Environmental Information Locator System (GELOS), described earlier in this conference by Eliot Christian. We took the GELOS element set and had member states add certain fields, which were not made mandatory; nor are the fields we added ourselves. However, we do have certain mandatory fields. We encourage our users to fill them in, and the software is also designed to encourage their use. It is still possible to register entries without the mandatory fields, but I have seen this in only a couple of applications. If the information is really too thin–if, say, you have only three or four fields filled in–then it might not be very useful, and what you get out of the system may not be what you thought you were going to get.
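The distinction between mandatory fields that are encouraged but not enforced, and entries that are too thin to be useful, can be illustrated with a minimal Python sketch. This is not WinCDS code: the field names, the `CatalogueEntry` class, the `review` function, and the threshold of five filled fields are all assumptions made for illustration, since the talk does not list the actual CDS/GELOS element names.

```python
from dataclasses import dataclass, field

# Hypothetical field names; the real CDS/GELOS element names are not given in the talk.
MANDATORY_FIELDS = ("title", "abstract", "originator", "contact")

# Assumed threshold: the talk calls "three or four fields" too thin.
MIN_USEFUL_FIELDS = 5


@dataclass
class CatalogueEntry:
    """A metainformation record held as a simple name -> value mapping."""
    fields: dict = field(default_factory=dict)

    def filled(self):
        """Names of the fields that carry a non-empty value."""
        return [k for k, v in self.fields.items() if v not in (None, "", [])]

    def missing_mandatory(self):
        """Mandatory fields that have not been filled in."""
        filled = set(self.filled())
        return [name for name in MANDATORY_FIELDS if name not in filled]

    def is_thin(self):
        """True if too few fields are filled for the entry to be useful."""
        return len(self.filled()) < MIN_USEFUL_FIELDS


def review(entry):
    """Return warnings for an entry; the entry is never rejected outright."""
    warnings = [f"mandatory field not filled in: {name}"
                for name in entry.missing_mandatory()]
    if entry.is_thin():
        warnings.append("entry is thin and may not be useful for retrieval")
    return warnings


entry = CatalogueEntry({"title": "Water quality database", "abstract": "", "keywords": ["water"]})
for warning in review(entry):
    print(warning)
```

Returning warnings instead of raising an error mirrors the policy described above: registration without mandatory fields remains possible, but the user is nudged toward a complete record.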

We are also conducting various standardization initiatives, which allow us to at least meet Level I requirements (Level I requirements mean that you only cover first entries). We use other standards to build a thesaurus.

As with GELOS, Z39.50 will also be our protocol for accessing distributed systems. We are running profiles such as GELOS on it. But at the moment we are using GELOS not for a distributed system but for a description of elements we use in one database. SGML is the current data exchange format. I can see from the discussion here that XML is probably the next-generation format for such things, but SGML is a good start in moving toward XML.
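To make the idea of a tagged exchange format concrete, here is a small Python sketch that writes a catalogue entry as XML and reads it back using the standard library. It is only an illustration of the general approach: the element names (`entry`, `title`, `keyword`) and the functions `entry_to_xml` and `xml_to_entry` are hypothetical and do not reproduce the SGML format that CDS actually uses.

```python
import xml.etree.ElementTree as ET


def entry_to_xml(entry):
    """Serialize a name -> value (or list of values) mapping to an XML string."""
    root = ET.Element("entry")  # hypothetical root element name
    for name, value in entry.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            child = ET.SubElement(root, name)
            child.text = str(v)
    return ET.tostring(root, encoding="unicode")


def xml_to_entry(text):
    """Parse the XML back into a mapping of field name -> list of values."""
    root = ET.fromstring(text)
    entry = {}
    for child in root:
        entry.setdefault(child.tag, []).append(child.text or "")
    return entry


record = {"title": "Air & water monitoring", "keyword": ["air", "water"]}
xml_text = entry_to_xml(record)
print(xml_text)
print(xml_to_entry(xml_text))
```

Repeating an element for each value, rather than packing values into one delimited string, keeps multi-valued fields such as keywords unambiguous when records are exchanged between systems.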

Metainformation

I would like to reflect briefly on our experiences with metadata and why we introduced this kind of metainformation. I think that metainformation is relevant to a pyramid full of sources, including databases, stations, documents, maps, images, tools, and projects. The top of the pyramid is the locator system–the entry to this information–the tip of the iceberg. The bottom of the pyramid is formed by access to the data themselves.

CDS-Based Harmonization

Our system is called CDS–Catalogue of Data Sources. There are some national metainformation systems in Europe that have a very high degree of detail but still do not cover everything. This is why we concentrate on a subset common to all of them.

Harmonization is, therefore, an issue. What we have achieved in this area is that member states are adopting the data model for the design of their national systems. You can imagine that each country has its own specialties and, like some people working with biodiversity, I imagine each country will have its own specific ideas. For example, some would like to have specific fields about plants or specific fields about beetles, and so on, just to describe the individuality of the source. Something like this is happening here as member states build on the CDS, including GELOS.

However, there are some countries that stick very closely to what we are doing. They are building their national metainformation systems on our software, which is based on MS Access. They can easily change the software and adapt it to their needs.

The CDS also is used in some supranational projects. One example is the Alpine Convention, which can be described as a biodiversity convention for the Alpine region.

Tools

We are also building various tools. One tool I mentioned already is a thesaurus, which can be used for indexing and retrieving metainformation. The one we are building is just a general thesaurus, so it is not a thesaurus on biodiversity. If you look into it you might find some terms from your field, but it will probably not meet all the specific needs of individuals and scientific domains. However, the general thesaurus is a starting point and includes quite a number of terms (5,400!) in–at the moment–11 languages. (Greek and Icelandic are currently missing but should be included by April 1999, by which time we want to finish the thesaurus.)
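A multilingual thesaurus of this kind can be thought of as a set of language-independent concepts, each carrying one label per language. The following Python sketch shows that structure under stated assumptions: the `Thesaurus` class, its method names, the numeric concept ids, and the sample terms are invented for illustration and are not taken from the actual thesaurus or its tools.

```python
class Thesaurus:
    """Minimal multilingual thesaurus: one concept, one label per language."""

    def __init__(self):
        self._labels = {}  # concept id -> {language code: term}
        self._index = {}   # (language code, case-folded term) -> concept id

    def add(self, concept_id, labels):
        """Register labels (language code -> term) for a concept."""
        self._labels.setdefault(concept_id, {}).update(labels)
        for lang, term in labels.items():
            self._index[(lang, term.casefold())] = concept_id

    def concept_for(self, term, lang):
        """Find the concept a term denotes in the given language, or None."""
        return self._index.get((lang, term.casefold()))

    def translate(self, term, source_lang, target_lang):
        """Translate a term via its concept; None if unknown or untranslated."""
        concept = self.concept_for(term, source_lang)
        if concept is None:
            return None
        return self._labels[concept].get(target_lang)

    def missing_languages(self, concept_id, required):
        """Languages from `required` that have no label yet for the concept."""
        return sorted(set(required) - set(self._labels.get(concept_id, {})))


thesaurus = Thesaurus()
thesaurus.add(1, {"en": "water", "de": "Wasser", "it": "acqua"})
print(thesaurus.translate("Wasser", "de", "it"))
print(thesaurus.missing_languages(1, ["en", "de", "it", "el"]))
```

Because metainformation is indexed by concept and not by a term in one language, a record indexed in German can be retrieved with an Italian query, and gaps such as a missing Greek label are easy to list.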

Software development has resulted in a flexible input tool, WinCDS, and a state-of-the-art retrieval tool, WebCDS. WebCDS is based on Java, which many of our clients are not able to use effectively because of firewall problems. WinCDS allows the use of Structured Query Language (SQL) databases, and it has an easy search interface for HTML customers.

Criteria and Priorities for Collection

Now about the data that is in such a system: It was decided by the EIONET group, by the member states, and by the European Environment Agency that we should have at least a small central catalogue with core information where a certain level of quality control can be applied. This catalogue will include the following information:

The Directory of EIONET partners
Items produced by the EEA/EIONET
Data requested by the EEA/EIONET on a regular and scheduled basis
Data deliveries to the EU as a result of legislative reporting
Data requested by several international bodies
Environmental databases operated by international organizations and environmental conventions
National State-of-the-Environment Reports
National Environmental Monitoring Programs
National Environmental Resource Libraries
National meta-databases or reference databases on the environment

WebCDS Content

What is at the moment contained in this catalogue system? Here are the environmental themes and their percentages:
 

environmental policy (20%)
information (8%)
water (8%)
pollution (7%)
general (6%)
legislation (5%)
biology (5%)
air (4%)
administration (4%)
natural areas, landscape, ecosystems (4%)
rest (29%)

Themes like those listed above, which are the most popular themes, are a part of the general thesaurus. The 5,400 thesaurus terms are assigned to 40 themes. The terms are used for the indexing; then you can assign them to the themes. The themes identified here also show that at this stage there is some focus on an administrative catalogue, since it includes quite a bit of environmental policy information.
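The step from indexing terms to theme percentages can be sketched in a few lines of Python. The mapping, the sample records, and the function `theme_shares` are hypothetical, and the counting rule (each record counts once per theme it touches, with shares taken over all theme assignments) is an assumption, since the talk does not say how the percentages were computed.

```python
from collections import Counter

# Hypothetical excerpt of a term -> theme assignment.
TERM_TO_THEME = {
    "river": "water",
    "lake": "water",
    "smog": "air",
    "directive": "environmental policy",
}


def theme_shares(records, term_to_theme):
    """Percentage of theme assignments per theme.

    `records` is a list of term lists, one per catalogue entry. A record
    counts once for every distinct theme its terms belong to.
    """
    counts = Counter()
    for terms in records:
        themes = {term_to_theme[t] for t in terms if t in term_to_theme}
        counts.update(themes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {theme: round(100 * n / total) for theme, n in counts.most_common()}


records = [["river", "lake"], ["river", "smog"], ["directive"], ["smog"]]
print(theme_shares(records, TERM_TO_THEME))
```

Note that rounded shares need not add up to exactly 100, which is one reason a residual category such as "rest" is useful in a summary.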


Copyright © 2003 NFAIS. All rights reserved. No part of this product or service may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without prior written consent.
