2004 Miles Conrad Memorial Lecture
46th NFAIS Annual Conference
February 23, 2004
The Battle for Mindshare: A battle beyond access and
retrieval
John J. Regazzi, Managing Director, Market Development, Elsevier
I am indeed
grateful and honored to accept the 2004 Miles Conrad
Award. NFAIS has been a large part of my professional life,
and receiving this award from NFAIS members while standing
in the company of previous Miles Conrad award winners is
indeed a genuine honor for me. I have simply aspired to be
a player in this great industry - nothing more and nothing
less. As you know well, no player can be effective in this
business alone - you need a team. I have had the great
fortune to work with some of the finest companies and groups
in our business. In honoring me with this award, without
question you honor us all, and for this I am deeply and
doubly grateful.
For the
last four or five years, I have been away from NFAIS and the
Abstracting and Indexing (A&I) industry, but I am truly
delighted to be back with you today. In preparing for this
presentation, I tried to use that absence to allow myself to
revisit this industry - to see what is still the same and
what has changed. I must confess I was somewhat surprised by
not only how much has changed in such a short time, but also
how many new and mounting challenges our industry now seems
to face. I will endeavor in my comments today to respond
to what I have discovered after this brief absence.
As you well
know, our industry finds itself in the midst of significant
changes. I would like to focus mostly on these changes
today, on what I have called our ‘shifting sands’. To put
these changes in perspective, I will first offer some
historical context. In line with the conference’s
theme, I have also attempted to look at what might be a way
forward following our present-day shifts. So my comments
are simply organized into three parts: Context; Shifting
Sands; and What Way Forward.
The Context
When I
first entered the publishing industry, we spoke of the
‘publishing chain’. It was, in fact, a simple
yet elegant supply chain. In its most basic form, this
chain was efficient, and all of its links were clearly
defined, consisting of authors, publishers, libraries, and
readers. Authors would do research and write articles and
submit their work to publishers. The publishers were then
responsible for organizing a peer review network, editing
these articles, publishing them in journals, and
distributing these journals to libraries. Of course, the
libraries would then make the journals available to readers,
who were often themselves researchers and authors. In
the 1970s and 1980s, as the industry became more electronic,
A&I services combined with electronic vendors to provide
even faster and more efficient access to and retrieval of
scientific and scholarly articles.
This simple
supply chain, however, has been transformed today into a
complex, some might argue ‘too complex,’ information
network.
Today,
researchers and authors have a wide variety of means with
which to communicate their research findings, from
traditional publishers (both commercial and society-based)
to many other channels, including preprint
servers, institutional depositories, content aggregators,
and syndicators, among others. These publishing vehicles
are further organized by frequently overlapping services,
such as secondary databases, web portals, search engines,
online vendors, local system providers, institutional
library vendors, and so forth. Additionally, the user is
confronted with what may be a bewildering array of services
to choose among for accessing this material, again including
local web services and portals, online and/or local system
services, database services, primary publishing search
services, and knowledge management systems. Finally, the
recent development of ‘author pays’-based journals -
known as ‘open access’ - adds still another alternative for
authors in our modern information supply chain.
How did
this complexity happen? And is this an advance for
scientific communications or not?
In brief, I believe the context for
these changes has developed in three parts: 1) an explosion
of technology that 2) drives significant growth in directory
databases, and that 3) leads in turn to incredible growth in
discovery, access, and search-and-retrieval systems. You
would expect that these changes would lead to significant
growth in the A&I industry - but, oddly, they did not. We’ll
look at this unexpected situation later, but before we get
to that question, let me briefly illustrate the scale of
the growth in technology, directory data, and search
systems.
The Explosion of Technology
In order to
illustrate the changes in technology that are now affecting
us, I have compared the costs of the key elements of
information technology - computing, storage, and
transmission - for the years 1972 and 2003. I used 1972
because it represents the infancy of the “online
information” industry. In making the comparison, I used
three measures: 1) the cost of executing one million
instructions per second (a measure of CPU capacity); 2) the
cost of storing one million characters; and 3) the cost of
transmitting one million characters over high speed lines -
in this illustration, from New York City to Los Angeles. In
1972, it would have cost nearly $5,000 to execute one
million instructions per second, and that figure does not
include the fact that the computers needed to do so would
fill a good-sized auditorium and need to be kept cool at
significant additional cost. In 1972, it would have cost
roughly $1,000 to store 1 million characters on magnetic
media, again not including the costs of housing those
storage units. And finally in 1972, it would have cost
nearly $2,500 to transfer 1 million characters from New York
to LA via the highest speed line available - which was about
9600 baud. Today, all of these functions can be performed
for less than one-tenth of a cent.
What is the
real significance of this comparison? For me it is not in
the technology, but in the fact that the technology is today
in the hands of virtually anyone who wants to use it. When
the ‘online industry’ was launched in 1972, it was in the
hands of fewer than six companies and government agencies.
Today, anyone can be a publisher, an online vendor, a
library, and an information system. The barriers to entry in
access and retrieval have never been lower, and in fact, one
can hardly imagine them getting any easier to overcome. As a
result, the industry has never been more competitive.
The Growth of Data and Databases
We often think of what we call the
“information explosion” as the rapid, exponential growth of
scientific and scholarly articles and journals. This is not
the case; in fact journals have grown at a steady rate of
3.3% per annum since the beginning of the 20th
century, except for a brief period after World War II, when
the growth rate was 4.7%. In contrast, since 1972 the
number of scientific and scholarly (directory) databases and
the number of records in those databases have grown
exponentially, averaging ten-year growth rates of 150% and
122% respectively, or 12-15% per annum.
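The per-annum range quoted here appears to be a simple average of the ten-year rates (150/10 and 122/10). Read as compound growth, the same ten-year rates correspond to lower annual rates. A minimal Python sketch of both readings follows; the function names are illustrative and are not taken from the lecture.

```python
def simple_annual_rate(period_growth_pct: float, years: int = 10) -> float:
    """Average growth per year with no compounding (period rate / years)."""
    return period_growth_pct / years


def compound_annual_rate(period_growth_pct: float, years: int = 10) -> float:
    """Annual rate that compounds to the given growth over the period."""
    factor = 1 + period_growth_pct / 100
    return (factor ** (1 / years) - 1) * 100


# Ten-year growth rates quoted in the lecture: databases 150%, records 122%.
for label, ten_year_pct in (("databases", 150.0), ("records", 122.0)):
    print(label,
          round(simple_annual_rate(ten_year_pct), 1),
          round(compound_annual_rate(ten_year_pct), 1))
```

On the simple reading the rates are 15% and 12.2% per annum; on the compound reading they come to roughly 9.6% and 8.3%. Either way, they sit well above the 3.3% growth rate of journals.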
Growth of the A&I Market
Despite the
growth of technology, data, and databases, the A&I industry
has not grown beyond the rate of inflation (i.e., 0% growth
in constant dollars) in the period 1972 to 1999. Perhaps
more puzzling, the industry has shown a decline of nearly 5%
per annum from 2000 to today.
Thus
the question emerges: with the clear expansion of the data
now provided to the science community, and with the
information ‘revolution’ fully engaged, why are we
experiencing decline or, at best, no growth?
The Shifting Sands
In thinking about these trends and
trying to understand what might be constraining our
industry, I have focused on three areas that I would like to
suggest to you:
-- University infrastructure spending;
-- A&I production and coverage; and
-- Scientists’ and researchers’ search patterns and their
   ‘mindshare’ today.
University Infrastructure Spending
The early
1970s was a time when, for the most part, research libraries
could buy all new research material, thus keeping up with
virtually all R&D developments. But for the 20-year
period from 1975 to 1995, university library expenditures
increased only at the rate of 2.2% (which is actually a
decline in real buying power if set in constant dollars)
while research and development spending increased by 4.6%,
more than double the library’s rate. The result is a
huge gap in the university library’s ability to keep up with
the production of research and development.
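To see how such a gap opens, the two rates can be compounded over the 20 years. This is a rough sketch that assumes both figures are average annual growth rates applied in every year from 1975 to 1995; the lecture does not state the compounding basis, and the names below are illustrative.

```python
def growth_factor(annual_rate_pct: float, years: int) -> float:
    """Cumulative multiple after compounding an annual rate for `years`."""
    return (1 + annual_rate_pct / 100) ** years


YEARS = 20  # 1975 to 1995

library = growth_factor(2.2, YEARS)   # assumed annual library growth
research = growth_factor(4.6, YEARS)  # assumed annual R&D growth

# Library spending relative to R&D spending, with 1975 indexed to 1.0
relative_position = library / research

print(round(library, 2), round(research, 2), round(relative_position, 2))
```

Under these assumptions library spending grows to roughly 1.5 times its 1975 level while R&D spending grows to roughly 2.5 times, which leaves the library at about 63% of its 1975 position relative to research spending.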
Perhaps
equally telling is that if you look at the forty largest
Association of Research Libraries institutions in the US
during the period from 1982 to the present, library
expenditures as a share of total university spending have
decreased from 3.7% to 2.8%, a decline of about 25%.
A typical private university in the U.S. will
spend 1.3% on the library and 0.2% on serials and A&I
services.
In our older, simpler
supply chain, publishers and libraries worked closely
together in order to provide for the access, retrieval,
distribution, and delivery of scientific and scholarly
information.
Now,
despite the productivity gains being realized by both
publishers and libraries, universities seem to be not only
taking those value gains out of the library’s materials and
resources budgets, but also demanding further value
reductions.
A&I Production and Coverage
In an
effort to gain competitive differentiation and advantage
and, perhaps, to increase value to library subscribers, A&I
services have invested heavily in expanding the scope and
coverage of their databases. As noted earlier, while
scientific and scholarly journal production increased at
less than 4% annually, records in A&I databases increased at
three to six times that rate, leading to a great deal of
overlap and redundancy among these services. This
redundancy is significant, as is illustrated by the 2001
study.
The problem
here is not really the overlap per se. In fact, some might
argue that this overlap is valuable because the indexing is
customized for each discipline. Yet much of each database
record is the same, so a library is faced with paying for a
record three or more times for each search conducted or
database purchased. This, coupled with the fact that A&I
databases are often available through aggregators with
increased distribution mark-ups, leads to inefficient
purchases and diminishing value for institutional buyers - a
condition clearly not conducive to further buying by
libraries.
Scientists’ and Researchers’ Search Patterns and Their Mindshare
Today, the
computerized technologies of search and retrieval are
ubiquitous, and their technology-driven use among
scientists, researchers, and professionals continues to rise
dramatically. In 1972 there were an estimated one
million online searches, while today there are an estimated
80 billion. Similar growth rates have occurred for the
number of personal computer units available in the science
community as well as the number of web hosts, with the
latter growing from about 130 in 1992 to 172 million today.
The
patterns of searching for scientific, technical and medical
information among these professionals are longstanding and
seem to be deeply rooted. Seventy percent of these professionals have
used Internet search routinely in their work for over three
years, and nearly 80% of these use this method of access and
retrieval between four and seven days a week.
Recent
developments even suggest that a new supply chain could be
emerging, one in which scientists rely as much on search
engines in the future as they did on libraries and A&I
services in the past. Some publishers have begun to explore
partnerships with search engine providers, allowing them to
index full-text articles and build access and retrieval services
around these indexes. Similarly the “author pays” (or open
access) business model relies specifically on free access
provided by search engines such as Google, Yahoo, Overture
(Fast), and so forth. Of course there is no guarantee that
the ‘free’ search engines of today will be free in the
future, but for now this shift is significant indeed and can
be best illustrated by the mindshare gains made by search
engines.
In a survey
for this lecture, librarians and scientists were asked to
name the top scientific and medical search resources that
they use or are aware of. The difference is startling.
Librarians named ScienceDirect, ISI Web of Science, and
Medline, while scientists named Google, Yahoo, and PubMed
(librarians also named PubMed).
The search
engine ‘mindshare’ translates to clear economic gains.
Total annual sales for the A&I industry are approximately
$800 million, with a total estimated market value of
approximately $2 billion. In contrast, Internet search
engines in their last five years of development have reached
sales of $3 billion and a market capitalization and
estimated value of nearly $30 billion. Though the scope of
these services is different, the effect of search-engine
‘mindshare’ among scientists is indeed significant.
Is there a
future for A&I, and if so what is it? Will search engines
be able to deliver what researchers need in the form and
format that they require?
What Way Forward?
In trying to answer this question, it may be
helpful to note first that researchers themselves are under
increasing pressures to do more ‘applied’ research and thus
are not immune to greater and greater competitive factors.
Researchers
are becoming more pragmatic in their approach to research,
with academic researchers working more in teams and
corporate researchers moving down the R&D cycle to
development with compressed product cycle times. Recent
studies have also illustrated that researchers are employing
an increasing number of information sources to meet their
information needs. The top five include: 1) trade journals
and publications (94%); 2) regulations (83%); 3) technical
training (79%); 4) scientific and technical journals (77%);
5) reference books and textbooks (74%). Many researchers
are involved in a broad range of research activities, and
require not only traditional ‘science’, but also business,
legal, and regulatory information.
What seems
to be emerging - even as search increases its mindshare
among scientists - is a much more fundamental need that goes
well beyond search. For me, that need is best described as
“data mining,” in which information services are designed to
deliver diverse content so as to inform specific problems
that researchers address at different points in their
specific research cycles. Let me provide three examples
from three different research communities: Biotech,
Medicine, and Agricultural Engineering.
Biotech
The Biotech
area has most directly addressed the need for data mining,
and a concern for data integration has been stated clearly
and strongly in this community. At a recent BASF
conference, one element of this need was described as
joining internal and external data with scientific and
business information in a federated search. The Boston
Consulting Group estimated that 33%, or $282 million, of
costs could be saved for successful new drugs if an
integrated information platform could be built. Similarly
McKinsey finds the number one obstacle to improved biotech
productivity to be a lack of integrated data. The top three
barriers identified by Outsell in a study of this industry
are: inability to compare data across information sources;
determination of the quality, credibility and accuracy of
data; and knowledge of what information is available for
specific problems.
Medicine
Physicians
are more and more being called upon to deliver their
services within tight constraints of time and money. In a
typical interaction with a patient, following the gathering
of the patient history, performing the physical exam, coming
to a primary evaluation, and completing a diagnosis, the
physician is often called upon to make several decisions
that will have a significant impact on the health of the
patient as well as the costs of treatment.
Studies
have shown that this whole cycle now takes on average six
minutes from history to plan. In order for treatment to be
successful within this short period of time, many healthcare
providers have recognized that they must provide physicians
with modern information handling tools that will assist them
in making these decisions and judgments quickly and
effectively.
Agricultural Engineering
Recently I
came across a truly unique information service from CABI.
This information service attempts to provide answers to
farmers and agricultural engineers who are having a problem
with crop disease. This service first helps in identifying
what pests or diseases may be affecting the crop, depending
on a wide range of factors, such as climate, geography, soil
conditions, and so forth. It then assists the researcher in
identifying, for example, the specific pest through
photographs, characteristics of the pest’s effects on the
plant, and other factors, as needed. Once the problem is
identified, the service provides treatment options and
associated costs. Finally the compendium also helps in
identifying ways by which the pest or diseased plant can be
controlled, for example through better importation and
exportation regulations and/or through tests. This service
not only makes efficient and productive use of researchers’ time,
it also can create greater wealth for a company or country
relying on a particular crop for its economic value.
Conclusion
Who will
build and provide these types of services in the future? It
is impossible to say, but what is clear is that neither
content nor search is king here. Rather we need both.
Mostly we need organizations that can filter and select the
right information; that can structure content so that it can
be used for specific purposes across a wide range of
information problems - organizations that have the
capability to provide essential information at the right
time. We need organizations that can create ‘good
sense’-making tools: that is, tools that help us understand
the problems we face and that inform our decisions around
the options we have in solving these problems.
I am
doubtful, despite their current mindshare, that search
technologies alone will fulfill this need. Rather we require
the traditional skills of secondary services and primary
publishers who are willing to apply those skills
specifically to particular professionals in a new way
through understanding the detailed, highly complex problems
that professionals face every day.
In short,
the future belongs not to those who merely navigate us
through cyberspace, nor those who populate it with data.
Rather it belongs to those who help us make sense of all the
data that is available to us.