Preprints of the Metadiversity Conference Proceedings
Session 2: The Challenge in Species Discovery and Taxonomic Information
ITIS, The Integrated Taxonomic Information System
BRUCE B. COLLETTE, Senior Scientist, National
Marine Fisheries Service Systematics Laboratory
ABSTRACT
The Integrated
Taxonomic Information System (ITIS) is a relational
database of scientific and common names for plants
and animals. The use of consistent names of species
is fundamental to successful management of
biological systems. ITIS provides a standardized
vocabulary for this purpose and integrates the
scientific results of the world taxonomic community
into a coherent list of biological names. ITIS was
designed to replace the flat file of scientific
names maintained by the U.S. National Oceanographic
Data Center (NODC). ITIS currently contains about
266,000 names of plants and animals and is
accessible on the Web at http://www.itis.usda.gov/itis.
ITIS is up-to-date for North American vertebrates,
vascular plants, and crustaceans. ITIS staff are
reviewing and editing names transferred from NODC
and have added high-priority names such as fish
species covered in FAO world catalogs. Through the
continued cooperation of its partners, ITIS will
make a significant contribution to the scientific
infrastructure that is fundamental to the
conservation and management of the world’s
biodiversity.
First I would like to say a
few words about the Integrated Taxonomic Information
System, or ITIS. ITIS is a unique government organization
because it is non-bureaucratic. It has no director and
no direct funding. It exists only because the people
involved had a vision: that we need a standard, yes a
standard, name–not as an expression of taxonomy, but as a way
to get from one database to another database. So, for the
purpose of communicating between databases, we have to
agree on a name. It would be nice if it were the "right"
name, but the important thing is to have the same name so
that someone can get data about a species from all your
databases. And that is simply what ITIS is all about–a way
of doing this.
It is not the same way that Frank Bisby spoke of. We are
coming from different directions and hopefully we will
meet in the middle. One of the reasons that ours is
different is because we started with an existing database
of hundreds of thousands of names that had already been
entered. This was the National Oceanographic Data Center (NODC)
database, which we loaded and now need to modify.
An Introduction to ITIS
What I want to do this morning
is tell you a little about the ITIS project, including the
key components of interrelationships with other projects,
where we are now, and where we are going. ITIS is an
online database (see http://www.itis.usda.gov/itis) built
through partnerships with the world taxonomic community,
sponsoring agencies, and organizations. Our goal is to
provide on the Web quality taxonomic information about all
organisms from both aquatic and terrestrial habitats.
(There are, according to various accounts, 300,000 to
400,000 names available now.) Our initial focus is North
America because the governmental agencies involved are
North American. But where databases are available, as
for mammals of the world, we go global immediately.
Creating a Standardized List
We believe that informed
decisions for managing our biological heritage can best be
made with easy access to the wealth of information already
existing about plants and animals. The problem is
communicating between databases. As I said before, we have
to agree on the same name so that we can get into each
other's databases. In fact, we started as an organization
because several federal agencies responsible for managing
the nation's biological heritage found that they had
information stored under different names about the same
organisms, so they could not communicate with each other.
This was on both the federal and the state level. We
needed interoperability.
As a result, our goal is to
standardize credible lists of species names, which have
unique identifiers. This falls within the recommendation
of the National Research Council and other agencies that
have said that taxonomy is important if we are going to
manage biodiversity. I once spent 10 minutes trying to
explain to Jim Baker, the head of the National Oceanic and
Atmospheric Administration (NOAA), why there is not a
list. There are, of course, lists of lots of things–but we
don't even have a list of commercial species within the
United States. Thousands of taxonomists have been working
on this for hundreds of years, and we still have a long
way to go. Depending on the source, there are from 6
million to 40 million entries we need to make. The names
are in Latin. The original descriptions are dispersed
among thousands of biological journals, from obscure
societies and little museums, and in all languages. The
only thing you can read in some of them is the Latin.
Rules to Follow
We have rules of nomenclature,
with separate codes for zoology, botany, and
bacteria. In constructing the infrastructure of ITIS, it
was necessary to integrate the business rules for each of
the codes into the structure so that the names
would not violate those codes. Yes, classification is
constantly changing, and it has to be constantly updated.
But that is the beauty of the Internet–it can be updated.
The History of ITIS
ITIS actually began in 1972 in
the Chesapeake Bay region, when the Virginia Institute of
Marine Science started one of many local codes. This
was then taken on by the National Oceanographic Data
Center. NODC is responsible for archiving physical and
chemical data, as well as biological data.
This was in the 1970s, and
computer fields were limited, so it was not a good option
to put in the entire species name. A code was better. So,
a flat file, a so-called intelligent number system, was
devised. The first number is the phylum, the second number
is the class, and the third number is the order. This
works okay up to a certain point, but then it collapses
because you have too much data. Further, the emphasis was
on quickly putting the names in the database because the
data were available. Employees were charged with getting
names in quickly, and this meant that many of the names
entered were unreviewed.
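A minimal sketch of how such an "intelligent number" might be read, assuming two digits per rank. The talk does not give the field widths, so WIDTH, the function names, and the sample code value are all hypothetical.

```python
from typing import Dict

# Ranks in the order the talk lists them: phylum, then class, then order.
RANKS = ("phylum", "class", "order")
WIDTH = 2  # assumed digits per rank; not stated in the talk


def split_code(code: str) -> Dict[str, str]:
    """Split a hierarchical code into one fixed-width field per rank."""
    expected = WIDTH * len(RANKS)
    if len(code) != expected or not code.isdigit():
        raise ValueError(f"expected {expected} digits, got {code!r}")
    return {
        rank: code[i * WIDTH:(i + 1) * WIDTH]
        for i, rank in enumerate(RANKS)
    }


def capacity() -> int:
    """How many sibling groups one field can number before it runs out."""
    return 10 ** WIDTH


# Invented sample value, for illustration only.
fields = split_code("880302")
```

Because the classification is baked into the digits, a field that fills up, or a group that moves to a different class, forces the code itself to change. That is one plausible reading of why the scheme "collapses" as data grow.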
In 1985, EPA entered into a
partnership with NOAA. This then broadened to include what
is now the United States Geological Survey (USGS). In 1992
there was formal commitment to replace the code. As a
result, in 1993, the real effort started to develop a
relational database based on a system of classification,
so that, for example, if you wanted data on a given
species and there were taxonomic problems with
identification, you might be able to go up a level in the
relational database structure and get the data on a
generic level.
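The idea of going up a level can be sketched with parent links between records. This is only an illustration: the Taxon class, the ancestor_at function, and the serial numbers are hypothetical, not the actual ITIS schema.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Taxon:
    tsn: int                # serial number (values here are invented)
    name: str
    rank: str
    parent: Optional[int]   # tsn of the next level up, None at the top


TAXA: Dict[int, Taxon] = {
    1: Taxon(1, "Scombridae", "family", None),
    2: Taxon(2, "Thunnus", "genus", 1),
    3: Taxon(3, "Thunnus thynnus", "species", 2),
}


def ancestor_at(tsn: int, rank: str, taxa: Dict[int, Taxon] = TAXA) -> Optional[Taxon]:
    """Walk up the parent links until a record of the requested rank is found."""
    current = taxa.get(tsn)
    while current is not None:
        if current.rank == rank:
            return current
        current = taxa.get(current.parent) if current.parent is not None else None
    return None


genus = ancestor_at(3, "genus")
```

If the species-level identification is in doubt, a query can fall back to the record returned by ancestor_at and collect data at the genus level instead.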
Where We Are Now
We went online in 1996, and it
became necessary to migrate the old National Oceanographic
Data Center database into ITIS. This means we inherited
some good data and some bad data. This also means we now
have a big cleanup job. So when you go to our Web site and
you see bad data, it is because we have not yet cleaned it
up. We have kept the old names and codes because some people
were using them, and they can continue to use them. We will
create links so you can type in the old name, the
scientific name, or the common name and the system will
tell you the currently accepted scientific name. There is
also a "change link" so that you can get back to the
original taxonomic serial number. So, where we are right
now is getting new data and cleaning up our old data. In
this sense we need help from the systematics community.
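A rough sketch of the lookup described here, assuming each stored name keeps its own serial number plus a link to the accepted name. The table layout, the resolve function, and the serial numbers are hypothetical; the names are used only as an example.

```python
from typing import Dict, Optional, Tuple

# tsn -> (name, tsn of the accepted name); an accepted name points at itself.
RECORDS: Dict[int, Tuple[str, int]] = {
    9001: ("Thunnus thynnus", 9001),
    9002: ("Scomber thynnus", 9001),  # older name, kept for existing users
}

# common name -> tsn of the accepted scientific name
COMMON: Dict[str, int] = {"bluefin tuna": 9001}


def resolve(query: str) -> Optional[Tuple[int, str, int]]:
    """Return (matched tsn, accepted name, accepted tsn), or None if unknown."""
    key = query.strip().lower()
    matched = COMMON.get(key)
    if matched is None:
        for tsn, (name, _) in RECORDS.items():
            if name.lower() == key:
                matched = tsn
                break
    if matched is None:
        return None
    accepted_tsn = RECORDS[matched][1]
    return matched, RECORDS[accepted_tsn][0], accepted_tsn
```

Returning the matched serial number alongside the accepted one keeps the path back to the original record, so nobody who relied on the old name loses it.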
Partnerships
Our partners include the
Department of Agriculture, NOAA from the Department of
Commerce, the USGS, EPA, and the National Museum of
Natural History. We are directly linked to the KDI project
at the University of Kansas, and recently we have added
Canadian participation. These are active partners that are
participating as we develop our program.
ITIS as a Relational Database
The system is built around a
relational database. The database includes scientific
names, the authors, and the dates in a single classification,
each with a unique identifier, the taxonomic serial number (TSN).
In addition, there are associated data, some of which are
obligatory and some of which are "nice-to-know" data that
we put in when available. There is an online system that
can be queried. Reports can be asked for. There is a
system that allows you to compare two lists of names if
they are in the proper format. In addition, you can
download the data. In fact, if you don't like our
particular system of classification, download it and
change it. It is there for people to use.
Taxonomic Workbench
We also have a taxonomic
workbench, which is designed to enter data. We are still
working out the best procedures to make this more
interoperable, to make it more accessible to people in the
field, and to make corrections. Right now all updates and
changes funnel through us.
This is a simplified database,
consisting of a scientific name with a number for computer
purposes–the taxonomic serial number–behind it. Synonyms
are linked to it. There is a record of change-tracking
from the time a name is originally entered. Every time
there is a change there is a reason for the change. There
are publications cited. There is a series of vernacular
names linked, and it is indicated if these are approved by
some organization or other authority. There is a lookup
table of authors of publications and of species. There is
a comment field for anything else that is not required but
about which we have information (for example, is it an
endangered species?).
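One way to picture such a record, as a sketch only: the field names, the rename method, and the serial number are assumptions for illustration, not the ITIS table design.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Change:
    old_name: str
    new_name: str
    reason: str  # every change carries a reason


@dataclass
class NameRecord:
    tsn: int
    scientific_name: str
    synonyms: List[str] = field(default_factory=list)
    # vernacular name -> whether some organization or authority approved it
    vernaculars: Dict[str, bool] = field(default_factory=dict)
    history: List[Change] = field(default_factory=list)
    comment: str = ""  # optional extra information

    def rename(self, new_name: str, reason: str) -> None:
        """Change the name, log why, and keep the old name as a synonym."""
        if not reason.strip():
            raise ValueError("a change must state its reason")
        self.history.append(Change(self.scientific_name, new_name, reason))
        self.synonyms.append(self.scientific_name)
        self.scientific_name = new_name


record = NameRecord(tsn=1, scientific_name="Scomber thynnus")
record.rename("Thunnus thynnus", "moved to the genus Thunnus")
```

The serial number stays fixed while the name attached to it changes, and the history list preserves each step from the time the name was first entered.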
Our Homepage
Visit our homepage to see what
we have to offer. You’ll find you can query the database
by typing in a common name or a scientific name. You can
generate reports. You can extract the scientific name and
other data from ITIS. You can download anything you want
and modify it any way you want. You can compare different
databases because when you have local lists from different
places, there are going to be a lot of matches (and some
non-matches–you actually might want to focus on those
non-matches and figure out why they are not matching).
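A comparison of two local lists can be sketched with plain set operations. The function names and sample lists below are hypothetical, and the normalization (ignoring case and extra spaces) is an assumption about what "proper format" might involve.

```python
from typing import Dict, Iterable, List


def normalize(name: str) -> str:
    """Collapse whitespace and ignore case so trivial differences still match."""
    return " ".join(name.split()).lower()


def compare_lists(first: Iterable[str], second: Iterable[str]) -> Dict[str, List[str]]:
    """Report names found in both lists and names found in only one."""
    a = {normalize(n) for n in first}
    b = {normalize(n) for n in second}
    return {
        "matches": sorted(a & b),
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
    }


result = compare_lists(
    ["Thunnus thynnus", "Scomber  scombrus"],
    ["thunnus thynnus", "Sarda sarda"],
)
```

The two "only in" lists are the non-matches worth a closer look: they may be genuinely different species, or the same species stored under different names.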
You also will see on the
homepage places to pull up publications, experts, and
names. A credibility rating can be found in the right-hand corner.
It also will alert you regarding whether the data have
been reviewed. If they have not, be cautious–we haven’t
gotten to that record yet. Remember, we don't have
millions of dollars and hundreds of people, so we have not
yet achieved the goal of updating all the data. But we
will get there. In fact, you can help us by visiting our
site, then telling us how you think it could be improved.
Reaching Our Goal
One way we’re trying to reach
our goal is to take some of the small amount of money that
is available and contract with systematists to produce
lists for groups that are important. For example, we have
a contractor working on a beetle list for North America.
And we have an algae contract.
In addition, we are
particularly interested in finding old-time taxonomists
who still have lists of species on 3x5 cards. We want to
make sure that we can get that information captured
electronically while these people are still alive.
Taxonomy is itself an endangered discipline. In many
groups there is only one expert in the entire world. If we
don't get that information now, we will have to do it all
over again, which is not very cost effective.
We also have people to
evaluate the data. Many of these are systematists based at
the Smithsonian's National Museum of Natural History. Many others
are, and will increasingly be, in other parts of the
country and from other parts of the world. Experts will
review the data and make a rational decision on which name
to use.
We have a data-development
team that is trying to obtain new sources of data, new
lists, just as Frank Bisby does for his distributed
system. We acquire the data from whatever source we can
get, but then it has to be developed. It has to be
formatted. It then must undergo peer review. Hopefully, it
is certified. Then it is loaded and managed online.
All of this is accessible on
the Internet. We realize we cannot make changes as fast as
we would like. There are some glitches and problems in
development, as there are in any large system. But we
still have lots of names up there, and names for many
important groups are in very good condition.
We also plan to interact with
different groups. Information presumably will flow out
from ITIS to groups like the public and science writers,
people who want the correct name for an organism for a
high-school report or a story they are writing. This is
the place to get that correct name.
We are trying to be
representative of good systematics, but we also have to be
practical and results have to be immediate. We have to
manage our resources now–not with tomorrow's taxonomy, but
with the information we have right now. We cannot wait for
perfect taxonomy. We have to make some decisions to give
you some standard names to enable you to move back and
forth between the databases right now.
Interaction with the
systematics community is essential. The data stewards come
from the systematics community. The data sources come from
the systematics community. And peer review is provided by
systematists.
We cooperate and coordinate
with a long list of organizations, including
Species 2000, CONABIO in Mexico, and FAO, which produces
aquatic species catalogues on taxa of importance. FAO
covers mostly food fishes, but they include turtles and
other organisms. All the names of all the organisms of the
FAO catalogues are in, updated, and correct in the list.
Bit by bit we will move through and get the rest of the
names updated.
In addition, we have been
endorsed by the National Performance Review and Access
America. We have space at the National Museum, where we
are in direct contact with the largest group of
systematists in this country. We have a
data-standardization process. We have hundreds of
thousands of names in there. We have been recognized by
Vice President Gore with a Hammer Award for Interagency
Cooperation for making information available at a really
low cost. There is very little direct funding. As
I said before, it is a volunteer effort.
Where We’re Going
So, where do we go from here?
We have to finish cleaning up the data. We have to expand
the geographic and taxonomic coverage. We have to redo the
Web page to make it easier to retrieve information. We
have to become a bureaucracy, because at some stage the
buck has to stop someplace and somebody has to be in
charge.
We also need somebody to
encourage the sponsoring and cooperating agencies to
contribute more money so we can get the job done. We need
to expand partnerships with various organizations. We have
to expand our relationships with the systematics
community. When I say these things, some of my taxonomic
colleagues won't talk to me because they think I am trying
to make the ITIS system a standard. But it is not a
standard–it is just a means for helping people move among
and use various databases. It is just a method of
communication.
Copyright © 2003 NFAIS. All rights
reserved. No part of this product or service may be
reproduced, stored in a retrieval system or transmitted in any
form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without prior written consent.