Preprints of the
Metadiversity
Conference
Proceedings
Session 3: The Challenge in Earth Observation, Ecosystem
Monitoring, and Environmental Information
The Challenge in Earth Observation, Ecosystem Monitoring,
and Environmental Information
ROBERTA BALSTAD MILLER, Director, the Center for
International Earth Science Information Network (CIESIN)
at Columbia University
ABSTRACT
This presentation will
discuss the challenges inherent in developing
information systems that encompass the vast
quantities of data and information from diverse
sources necessary for research and policy on
biodiversity and ecosystems. These challenges fall
into three broad areas: data integration; data
access and dissemination; and public information.
Examples will be drawn from environmental data
systems and tools in use or being developed.
It has been a very
interesting day thus far. We have had people discussing what
should be done, people discussing what will be done, and
people discussing what is currently being done. It is
that last category that I think we will take further today
with a truly outstanding panel reporting on
systems efforts at both the national and
international levels.
But first what I would like
to do is look at the report of the PCAST Panel on
Biodiversity and Ecosystems in historical perspective. Then
I will address three major challenges that are raised by
that report and that we need to deal with in responding to
the report. Those challenges, I would argue, are the
challenge of data integration, the challenge of data access
and dissemination, and, finally, the challenge of public
information.
Background
The PCAST Panel on
Biodiversity and Ecosystems emphasizes the need to use data
and information resources for monitoring ecosystems and for
integrating data. What the panel members had in mind when
they wrote their report was economic data and biodiversity
data for providing policymakers and the public with
information for resource management. All of this is
important. But the desire to build links among scientific
research, data management, and public policy is not new.
National Income Accounts
In the 1930s, the economist Simon Kuznets, working at the
National Bureau of Economic Research,
developed the National Income Accounts, which were time
series data of economic activity (in terms of productivity)
in all sectors of the economy. His goal in doing this was to
improve our understanding of the economy and to provide a
means of measuring change in productivity in various
sectors. National Income Accounts are the economic
indicators that still affect economic policy and, some would
argue, are responsible for our economic prosperity in the
period since then. I think this can be seen as a very
successful attempt to link the scientific research in
economics with information series and public policy.
Social Indicators
There was also a second
attempt, which took place from the late 1960s through the
1970s. Taking off from the idea of economic indicators,
there was in this country–and in a number of other
countries–something called the Social Indicators Movement.
The Social Indicators Movement was based on the recognition
that much of what is important in national policy is not
economic in nature and that, while economic indicators are
themselves useful, they are not enough. National policy
needs to be informed by statistical data on both tangible
and intangible changes in the society.
Tangible kinds of indicators
that were discussed included education test scores, infant
mortality rates, literacy levels, and unemployment
statistics. Intangible indicators included topics such as
health and well-being and confidence in government. (If you
think that confidence in government is not very important,
look at the United States and Russia today and see how
important confidence in government is for the smooth working
of a country and, in particular, an economy.)
Economics vs. Social
Indicators
There are significant
differences between economic and social indicators. The
economic indicators have a common metric: They use dollars
and cents. Everything is expressed in money. As a result,
the economic indicators are additive. You could add these
indicators together and get a gross national product.
Social indicators are
different. They use diverse metrics. You don't measure
literacy, unemployment, and confidence in government along
the same metric. They are very, very different. As a result,
social indicators are not additive. They are all separate.
There have been attempts to combine various kinds of
indicators into a composite indicator. (One of the more
well-known attempts resulted in the Indicator of
Development, which I believe was composed of literacy,
infant mortality, and education of women.) But in almost
every case, the composite indicators left out
something. When you combine social indicators, you lose
information and detail. Consequently, social policy based on
these composite indicators did not work terribly well.
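To make the contrast concrete, here is a minimal sketch. All figures, the two chosen indicators, and the equal-weight rescaling are invented for illustration; they do not reproduce any published index.

```python
# Hypothetical figures, for illustration only.

# Economic indicators share one metric (dollars), so they can simply be summed.
sector_output_usd = {"agriculture": 120.0, "manufacturing": 340.0, "services": 540.0}
total_output = sum(sector_output_usd.values())


def rescale(value, worst, best):
    """Map a raw value onto a 0-1 scale, where 1 is the best outcome."""
    return (value - worst) / (best - worst)


def composite(literacy_pct, infant_deaths_per_1000):
    """Average two rescaled social indicators into a single score."""
    scores = [
        rescale(literacy_pct, worst=0.0, best=100.0),
        rescale(infant_deaths_per_1000, worst=200.0, best=0.0),
    ]
    return sum(scores) / len(scores)


# Two quite different (made-up) societies end up with the same composite
# score, so the detail that distinguishes them is lost.
region_a = composite(literacy_pct=90.0, infant_deaths_per_1000=100.0)
region_b = composite(literacy_pct=60.0, infant_deaths_per_1000=40.0)
```

The dollar figures add up without any further decisions. The social indicators need a rescaling rule and a weighting rule before they can be combined, and both choices hide information that the separate series still carry.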
The Current Situation
Today, we are facing similar
problems in some respects and different problems in other
respects. We need data series that will help us in research,
in policy, and in resource management. But the circumstances
today are very different than they were in the ’30s or in
the ’60s or in the ’70s. Biodiversity and ecosystems are
complex systems. They are not closed systems. They are
affected by economic activities. They are affected by
social, demographic, and cultural activities and phenomena.
They are affected by politics and by public policy,
including treaties, regulations, war, foreign policy, and
transportation policy. They are affected by physical and
environmental change at regional and global levels. Then, of
course, there are the biological functions that take place
within this broad, shifting framework of many other types of
change.
A second difference in the
situation today is related to advances in information
technologies. We have the capability to obtain and save vast
quantities of information–probably much more information
than any one person could use. So, part of the problem is
that we have an embarrassment of riches. The problem
addressed by this meeting is to bring order to all this
information. But policymakers are not going to be able to
deal with the vast quantities of information that scientists
and data managers can handle. The translation from science
and data management requirements to the policy framework has
to be a matter of imposing order on that chaos.
A third difference from
earlier attempts to create data series for public policy is
the growing emphasis upon public as well as policy
information. In the 1930s, Simon Kuznets did not worry about
informing the public about economic indicators. This
was–even then–a public policy issue but not something about
which the public was concerned.
Things are different today,
and such issues must be addressed by this group and by
others who wish to respond to the PCAST Panel on
Biodiversity and Ecosystems.
The Challenge of
Integrating Data
As I noted at the start,
there are three challenges for us to accept. First of all is
the challenge of data integration. We have already heard a
great deal about the data problems of biodiversity. But it
is a lot more complicated than data. In a very real sense,
biodiversity is about people. Biodiversity is about economic
markets, and biodiversity is about global environmental
change. In order to understand biodiversity, you have to
have data integration. You have to pull data on all of these
together into a single data series or a single database or a
single type of data. This is true whether you approach the
topic from a scientific, a policy, or a public information
perspective. Because the science of biodiversity involves so
many fields, the data series themselves have to involve
multidisciplinary data.
Integrating multidisciplinary
data is not an easy task. For example, you may have to
compare or combine remote sensing data with population data,
with transportation data, with in situ data in order to have
the background to deal with certain kinds of land-management
issues. We don't have a common metric. We are not like the
economists in the 1930s and thereafter. We don't have
dollars and cents that we can use for all of these data
series. Space–spatial representation–frequently becomes the
framework for integrating the data.
The Problem of Space
But space also creates a
problem in data integration, because the unit of analysis or
the basic spatial unit differs for the three major types of
data that I am talking about. Remote sensing data are
provided to us as images, and each image is then
overlaid with an imaginary grid. Scientists use that
grid to analyze images.
But socioeconomic data are
collected in political jurisdictions, and those political
jurisdictions never approach a grid (except for a few places
in the Midwest). Jurisdictions are determined by historical
forces, historical practice, or historical agreements.
Jurisdictions also can be determined by rivers or by, in
some cases, the shifting boundaries created by war,
politics, and treaties. All socioeconomic data that are
collected by the government are collected for political
jurisdictions. So you have to break out of the tyranny of
those political jurisdictions in order to put the data in a
framework where you can use them with the gridded data that
are available through remote sensing.
To complicate things further,
you have ecosystem data. Now, ecosystems don't translate
easily to a grid or to a political jurisdiction. So, you
still have a third kind of geographical unit to bring
together when you are integrating data. Therefore the data
themselves often need to be transformed before they can be
integrated and used together. This is a really difficult
task.
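One common way to move counts from jurisdictions onto a grid is simple area weighting. The sketch below assumes rectangular jurisdictions and an even spread of people within each one, both of which are simplifications made only for illustration; the function names and inputs are hypothetical.

```python
def overlap_area(a, b):
    """Area shared by two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def grid_population(jurisdictions, cells):
    """Spread each jurisdiction's count over cells in proportion to shared area."""
    gridded = {cell_id: 0.0 for cell_id in cells}
    for box, population in jurisdictions.values():
        for cell_id, cell_box in cells.items():
            share = overlap_area(box, cell_box) / box_area(box)
            gridded[cell_id] += population * share
    return gridded


# Hypothetical inputs: two rectangular "counties" and a grid of two cells.
jurisdictions = {
    "county_1": ((0.0, 0.0, 3.0, 1.0), 300.0),
    "county_2": ((3.0, 0.0, 4.0, 1.0), 100.0),
}
cells = {
    "cell_west": (0.0, 0.0, 2.0, 1.0),
    "cell_east": (2.0, 0.0, 4.0, 1.0),
}
gridded = grid_population(jurisdictions, cells)
```

The total count is preserved, but the boundary between the two counties no longer appears in the result. Real jurisdictions are irregular polygons, so a production version would need a proper geometry library for the overlap step.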
Examples of Integrated
Data
What I want to do now is give
you a couple of examples of integrated data. One example is
the newly created Gridded Population of the World map. It is
roughly a five-arc-minute-by-five-arc-minute grid, and, obviously,
the cell areas shrink toward the poles. The map is imperfect,
but it is being corrected right now. In addition, parts of
the map are better than others. However, this map marks the
first time we have ever been able to produce population data
that were organized not by national boundaries but by
a grid. Of course, you don't want to lose the national
boundaries, because that is where the laws and the
regulations are enforced. So, you need to move between those
two means of representation. In fact, at CIESIN, we have
also gridded the Mexican population on a
one-kilometer-square grid. These data are all available
online.
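The way cell area changes with latitude on a grid defined in arc-minutes can be sketched with the standard formula for a patch on a sphere. A spherical Earth with a mean radius of 6371 km is assumed here, and the function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; a spherical Earth is assumed


def cell_area_km2(south_lat_deg, cell_size_arcmin=5.0):
    """Area of a grid cell whose southern edge sits at south_lat_deg."""
    size_deg = cell_size_arcmin / 60.0
    south = math.radians(south_lat_deg)
    north = math.radians(south_lat_deg + size_deg)
    width = math.radians(size_deg)
    # Area of a latitude/longitude patch on a sphere:
    # R^2 * (longitude span) * (sin(north) - sin(south))
    return EARTH_RADIUS_KM ** 2 * width * (math.sin(north) - math.sin(south))


for latitude in (0.0, 30.0, 60.0, 85.0):
    print(f"{latitude:4.0f} deg: {cell_area_km2(latitude):6.1f} km^2")
```

Under these assumptions a five-arc-minute cell covers roughly 86 square kilometers at the equator and about half that at 60 degrees latitude, so equal counts in two cells do not imply equal densities.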
The next example is a program
that we maintain online called the Demographic Data Viewer,
or DDViewer. This provides you with mapping capability for
the U.S. Census. You can map the entire country, a state, a
county or group of counties, and you can even map counties
across state boundaries. And since census databases are
created from massive state databases, to be able to cross
these state boundaries is quite a feat. You are able to go
in and select the unit you want to map. You select the
variables that you would like to have on the map, and the
detail goes down to the census block group. You can specify
the parameters of what you are doing through the program.
Then you press the Map-It button, and you get your data
instantly. Again, it is a way of visualizing demographic
data, population data, and census data in a different way so
that you can integrate them with various kinds of
information. It enables you to get away from the tyranny of
the political jurisdictions in the display of socioeconomic
data.
Still another product that we
maintain online is something we call DDCarto that translates
census data to other kinds of units. From the counties and
the states, you can translate data to zip codes, you can
translate data to congressional districts, and you can
translate data to ecoregions. Therefore, you can move from
one kind of geography to another kind of geography.
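A translation of this kind is often driven by a crosswalk table that records what share of each source unit falls in each target unit. The sketch below is not a description of how DDCarto works internally; the unit names, shares, and counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical crosswalk: the share of each county's population that falls
# in each target unit. The shares for a county sum to 1.
crosswalk = {
    "county_A": {"ecoregion_1": 0.75, "ecoregion_2": 0.25},
    "county_B": {"ecoregion_2": 1.0},
}
county_population = {"county_A": 8000, "county_B": 5000}


def translate(source_values, crosswalk):
    """Reallocate counts from source units to target units using the shares."""
    target_values = defaultdict(float)
    for source_unit, value in source_values.items():
        for target_unit, share in crosswalk[source_unit].items():
            target_values[target_unit] += value * share
    return dict(target_values)


ecoregion_population = translate(county_population, crosswalk)
```

The same function serves for zip codes or congressional districts once a crosswalk for that pair of geographies exists. Building the crosswalk is where most of the data preparation effort goes.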
It is this kind of work that
needs to be done for data integration. It is as much data
preparation and data development as it is data analysis. And
yet, if you are going to integrate disparate types of data
and if you are going to provide data for public policy and
public information, you have got to go through these
exercises and you have to transform your data.
The Challenge of Data
Access and Dissemination
The second challenge that
needs to be addressed is the challenge of data access and
dissemination. This is something that we have talked about a
number of times already today. People have talked about the
need for interoperable metadata. People have talked about
the need for common metadata standards. People have talked
about the need for distributed information management
systems. These topics are going to be addressed in
detail by the very experienced and able panel today, so I am
not going to go into too much detail now. I would emphasize
though that distributed information systems for biodiversity
must link with multiple kinds of data. It is not enough
simply to have biodiversity or biological data. You have got
to link into the socioeconomic data and the global-change
data.
Let’s look at the data and
information system from the Socioeconomic Data and
Applications Center (SEDAC), which is part of NASA’s Earth
Observing System Data and Information System (EOSDIS).
Because many of the data on socioeconomic
factors have to be pulled from many different sources, the
SEDAC search system provides a means of searching multiple
data catalogues, either singly or all at once. You can
search it through a structured system, through a keyword,
or through a geographical interface. This is a metadata
search tool. You are not searching the data themselves–you
are searching the metadata to identify data that might
be of interest. We do have a version of this that allows
people, particularly in developing countries, to do an
e-mail search of this catalogue, because although many of
the people who use it do not have the bandwidth for full
connectivity, they do have e-mail available.
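The idea of searching metadata rather than data, across several catalogues and by keyword or by geography, can be sketched as follows. The record fields, catalogue names, and function names are hypothetical and do not reflect the actual SEDAC interface.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One metadata entry describing a dataset (fields are illustrative)."""
    catalogue: str
    title: str
    keywords: tuple
    bbox: tuple  # (west, south, east, north) in degrees


def boxes_intersect(a, b):
    """True when two (west, south, east, north) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records, keyword=None, region=None, catalogues=None):
    """Return the records that match every criterion supplied."""
    hits = []
    for record in records:
        if catalogues is not None and record.catalogue not in catalogues:
            continue
        if keyword is not None and keyword.lower() not in [k.lower() for k in record.keywords]:
            continue
        if region is not None and not boxes_intersect(record.bbox, region):
            continue
        hits.append(record)
    return hits


records = [
    Record("catalogue_x", "Gridded population", ("population", "grid"), (-180, -90, 180, 90)),
    Record("catalogue_y", "Mexico census tables", ("population", "census"), (-118, 14, -86, 33)),
    Record("catalogue_y", "Land cover, East Africa", ("land cover",), (28, -12, 52, 15)),
]

mexico = (-118, 14, -86, 33)
hits = search(records, keyword="population", region=mexico)
```

Only the small metadata records are examined, which is why a search of this kind stays practical even over a low-bandwidth channel such as e-mail. The datasets themselves are fetched only after a record of interest has been identified.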
The Challenge of Public
Information
The third challenge that is
raised by the Report of the PCAST Biodiversity and
Ecosystems Panel is the challenge of public information.
There are a number of reasons to focus on public
information: The panel recommends it. The Convention on
Biological Diversity recommends it. Agenda 21 recommends it.
A host of other public reports recommend that a data
management strategy include a public information component
as well.
Another reason for doing
so is that the technology has a bias toward public
dissemination. The days when information was placed in
libraries or provided only to those who had a "need to know"
are fast disappearing.
A third reason for
emphasizing public access, as well as policy access and
scientific access, is related to democratic traditions. This
is an argument that comes out of the earlier Social
Indicators Movement. One of the leaders of that movement,
Sten Johansson, a Swedish sociologist who was in charge of
the Level of Living study in Sweden, argued that in a
democratic society, a government had a duty to inform (to
give data to) its citizenry on public policy, and,
furthermore, the citizens had a right to the kind of data
that would enable them to evaluate how well they were being
governed. For the first time in history, we have the
technological capability to make this happen.
Biodiversity and ecosystem
research can have indicators, can have a steady stream of
public data because, again, the technology has changed. We
have moved from the book to the electronic medium. But there
are some requirements that this technology lays on us.
First, it requires having a
user interface that is very friendly. Not everyone is going
to be technologically sophisticated. Computer data programs
must be easy to use.
Second, I would argue that
there should also be multiple means of dissemination of this
information. There should be computers and electronic
information systems. There should also be computers with
printing capabilities available in public information
centers or public information places. I am thinking of
libraries, civic buildings, and the contemporary market
place: shopping centers, convenience stores. If there were
computers there that had data available and a printing
capability, even someone who wasn't able to run the computer
could get the information he or she needed and walk away
with a printout.
Still another aspect of
providing indicators for public policy is providing
training. It is going to be easier, obviously, for younger
people than older people, but some kind of training program
is a valuable and logical part of the public information
program.
In Summary
In summary then, one of the
central recommendations of the PCAST Panel on Biodiversity
and Ecosystems is to translate scientific research into data
that can be used in monitoring ecosystems, in managing
biodiversity and ecological resources, and in forming public
policy. This will only happen, I would argue, if we are able
to improve our capacity to integrate data, if we are able to
document and disseminate the data, and if we are able to
make the resulting data and information available to both
policymakers and the general public.
Copyright © 2003 NFAIS. All rights
reserved. No part of this product or service may be
reproduced, stored in a retrieval system or transmitted in any
form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without prior written consent.