Bibliometrics – What’s That?

I’ve been getting quite a number of independent projects lately – mostly doing some bibliometric analysis for a major scientific (medical) publisher in the area.  But I find that explaining this to family earns me blank stares unless I go into a little more depth, so I thought I might do a little of that here.

A senior executive at this publishing company used to be my direct boss over at Thomson and has been trying to get me as much work on the side as she can.  At first, she basically explained her situation in about 10 minutes, said “do what you can” and sat me down at a desk.  The situation was thus:

The way the scientific publishing world works is this: when a scholarly society (The American Society of Something) decides to publish a journal, it goes to one of these publishers and arranges a contract.  These contracts do expire, and my particular friend was going into a touchy renewal meeting in a matter of days.  She just wanted whatever intelligence could be found about the journal, using available datasets, including internal usage data and ISI bibliometrics.

So, I pulled together what I could and it seemed to satisfy her, since I started getting more and more projects.  I may go into this more in the future, but I’m really writing today to explain some basics about bibliometrics, and how I got involved with the field.


During my tenure at Thomson Reuters, I was really working as a part of the organization known as ISI, the Institute for Scientific Information.  ISI was originally founded by Dr. Eugene Garfield, in the dark days before the Internet, as a way to deliver journals’ tables of contents to fellow scientists (so that they would not have to subscribe to all those journals just to find out what was being published).

Within a few years, Dr. Garfield made a huge leap in the field of scholarly research – he determined to track the citations offered in the bibliographies of those journals’ articles.  All scholarly articles contain a bibliography – a list of cited sources used to build the hypothesis presented in the paper.  Dr. Garfield aimed to record these citations so that the next time an article cited the same source, the two citations would be recognized as pointing to the same work.

Each journal has its own format, its own structure for bibliographies, but the majority do contain the same key elements: article title, publication name, publication date, author and so on.  Once the different formats were sorted out, it was not too big a leap to start capturing the relevant fields in a very early-model database.
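For the more technically inclined, here is a minimal sketch of that idea in Python.  It is not how ISI actually did it; the field names, the `citation_key` helper and the sample citations are all made up for illustration.  The point is only that once two differently formatted citations are reduced to the same fields, they can be matched.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    # Hypothetical field set, based on the key elements listed above.
    author: str
    title: str
    journal: str
    year: int


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def citation_key(c: Citation) -> tuple:
    """Build a key that ignores case, punctuation, and spacing differences."""
    return (normalize(c.author), normalize(c.title), normalize(c.journal), c.year)


# Two invented citations to the same invented article, formatted the way
# two different journals might print them.
a = Citation("Doe, J.", "An Example Article", "Journal of Examples", 2001)
b = Citation("DOE J", "An example article.", "JOURNAL OF EXAMPLES", 2001)

print(citation_key(a) == citation_key(b))
```

Real bibliographies are far messier than this (abbreviated journal names, misspelled authors, missing years), which is part of why the standardizing work was such a big job.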

But why would anyone do this?  Well, Dr. Garfield’s reasoning was this: by looking at the later articles that cite a source article, one can see the influence that article has had in the scientific community.

So let’s imagine Einstein.  (Or Newton, or von Neumann, or Freud.)  How many future physicists cited Albert Einstein’s Theory of Relativity when writing their own papers?  How many astronomers, how many nuclear engineers?  Dr. Garfield was finally able to answer questions like these.  Well, to a point.

Dr. Garfield, and consequently ISI, would really track the citations to a particular journal.  The fact is, it was difficult to index and calculate for every individual author in the database.  Standardizing the data to that level would have been prohibitive, when all Dr. Garfield really needed was to track citations at the journal level.  But in the end, he could very confidently say several things: this journal is cited more than that journal, this article was cited that many times in this many years, or even that this subject usually got more citations than that one.

This revolutionized the scholarly publishing world in several ways.  First, as ISI would still have people believe, using common cited references is a great search and discovery tool.  The assumption here is that two articles citing the same resources would likely be more similar than two articles citing independent resources. 
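This shared-references idea is often called bibliographic coupling.  As a rough sketch (not ISI’s actual method), one could score two articles by how much their reference lists overlap.  The Jaccard ratio used here is just one reasonable choice, and the article and reference names are invented.

```python
def coupling_similarity(refs_a: set, refs_b: set) -> float:
    """Share of references two articles have in common (Jaccard ratio).

    This is an illustrative measure, not the one any product is known to use.
    """
    union = refs_a | refs_b
    if not union:
        return 0.0
    return len(refs_a & refs_b) / len(union)


# Hypothetical reference lists, identified by made-up labels.
article_x = {"ref1", "ref2", "ref3"}
article_y = {"ref2", "ref3", "ref4"}
article_z = {"ref5"}

print(coupling_similarity(article_x, article_y))  # 2 shared out of 4 distinct: 0.5
print(coupling_similarity(article_x, article_z))  # nothing shared: 0.0
```

Under the assumption in the paragraph above, articles X and Y are more likely to be about related work than X and Z.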

Measurements started to come into place.  And as Dr. Garfield/ISI began to use citation data to select journals for inclusion into their products, standards started to form in the industry.  A journal included in the ISI data was deemed better than one not included.  Once Dr. Garfield created the Impact Factor, a measurement designed to show how well cited a journal has been over a given year, that metric took off in the marketplace – every journal wanted an Impact Factor, both to track itself over time and to compare against competitors’ Impact Factors.
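To make that concrete, here is a small sketch of the Impact Factor in its commonly described two-year form: citations received in a given year to what the journal published in the two previous years, divided by the number of citable items it published in those two years.  The function name, the dictionary layout and the numbers are assumptions for illustration, not real journal data.

```python
def impact_factor(citations_by_pub_year: dict, items_by_pub_year: dict, year: int) -> float:
    """Two-year Impact Factor for `year`.

    citations_by_pub_year: citations received during `year`, keyed by the
        publication year of the cited item (hypothetical layout).
    items_by_pub_year: number of citable items the journal published each year.
    """
    window = (year - 1, year - 2)
    cites = sum(citations_by_pub_year.get(y, 0) for y in window)
    items = sum(items_by_pub_year.get(y, 0) for y in window)
    if items == 0:
        raise ValueError("no citable items in the two-year window")
    return cites / items


# Invented numbers: in 2010 the journal's 2009 articles were cited 150 times
# and its 2008 articles 250 times; it published 100 articles in each year.
cites_in_2010 = {2009: 150, 2008: 250}
published = {2009: 100, 2008: 100}

print(impact_factor(cites_in_2010, published, 2010))  # (150 + 250) / (100 + 100) = 2.0
```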

Many other metrics took shape around this field too – Cited Half-Life, Immediacy Index, H Index, Eigenfactor and more now flood the marketplace, each measuring something specific, and each useful only in how the numbers are viewed: in what context, alongside what other information, for which purpose.
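As one example of how simple some of these metrics are to compute, here is a sketch of the H Index as it is usually defined: the largest number h such that h of an author’s (or journal’s) papers have each been cited at least h times.  The citation counts below are invented.

```python
def h_index(citation_counts: list) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Five hypothetical papers with these citation counts.
print(h_index([10, 8, 5, 4, 3]))  # four papers have at least 4 citations each: 4
```

Note that the single number hides a lot: two very different publication records can share the same H Index, which is exactly why context matters.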

ISI (now under the Thomson Reuters umbrella) does sell several solutions for libraries, governments, universities, corporations and more, which offer various ways of looking at these data.  Thomson’s major competitor Elsevier has even sexier solutions – although they don’t have the history ISI does in the field, Elsevier is the Pink Elephant in ANY scientific publishing house.  They quickly put together a solution which, while not as definitive as that of ISI, is much more user-focused and therefore more sellable.  (ISI’s new Web platform does not seem to have made the impact on the marketplace that they had been hoping for.)

In my various roles at ISI/Thomson, I was exposed to these data frequently, and given my natural abilities with data analysis, I was drawn to this piece of the business.  So when my friend at the medical publishing house asked for help, I was the person she thought of.

In digging through a combination of (primarily) internal usage data (which articles were viewed most frequently, by platform) and citation data (which articles were being cited more, as well as how overall citation patterns might drive strategy) I was able to demonstrate definitively that the society absolutely does benefit from being handled by this particular publishing house.

As I mentioned above, bibliometrics can be used in many different ways.  Universities assess themselves against other universities, or assess themselves internally, whether by department, individual or some other slice of information.  Governments use the data in these ways, but also to evaluate at national levels – e.g. tracking the rise and fall of Japanese scientific discovery.  Publishers themselves have dozens of ways to use these metrics. 

As it is so hard to talk in generalities, I will likely expound on this in the future.  Stay tuned to learn more!
