Predicting h-index

What is your future impact?

Researchers Acuna, Allesina, and Kording decided to use machine learning to find out. They recently published a Nature article, “Future impact: Predicting scientific success,” that describes their method and findings.

Their goal was to predict a scientist’s future h-index given his or her current bibliographic data. I wrote about discovering the h-index two years ago. Nowadays, Google Scholar will calculate this value for you. It’s a measure of research impact, defined as the largest number h such that h of your papers have at least h citations each.
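
To make the definition concrete, here is a minimal Python sketch that computes an h-index from a list of per-paper citation counts. The function name h_index and the sample counts are my own, for illustration only; I'm not claiming this is how Google Scholar does it.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Made-up citation counts for five papers.
print(h_index([10, 8, 5, 4, 3]))  # prints 4
```

Four of those five papers have at least four citations, but there are not five papers with at least five, so the h-index is 4.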

Acuna et al. collected data on 3,085 neuroscientists and performed a linear regression on these features:

  • n: number of papers written
  • h: current h-index
  • y: years since publishing first article
  • j: number of distinct journals published in
  • q: number of articles in Nature, Science, Nature Neuroscience, Proceedings of the National Academy of Sciences, and Neuron
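
For readers who want to see what a linear regression on these five features looks like mechanically, here is a rough Python sketch of an ordinary least-squares fit. Everything in it is an assumption on my part: the data is randomly generated, the "true" weights are invented, and the helper names (solve_linear_system, fit_least_squares, predict) are hypothetical. None of it comes from Acuna et al., and the synthetic features are drawn independently, which real careers certainly are not.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(size):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, size))
        x[i] = (m[i][size] - tail) / m[i][i]
    return x


def fit_least_squares(rows, targets):
    """Return [intercept, w_n, w_h, w_y, w_j, w_q] minimizing squared error."""
    # Prepend a constant 1.0 so the first weight acts as the intercept.
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, row):
    """Apply the linear model to one scientist's (n, h, y, j, q) features."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Invented weights, used only to generate synthetic targets.
true_weights = [1.0, 0.05, 0.9, -0.1, 0.02, 0.3]

# Synthetic (n, h, y, j, q) rows for 50 imaginary scientists.
rng = random.Random(0)
rows = [
    [rng.randint(5, 100), rng.randint(1, 30), rng.randint(1, 25),
     rng.randint(1, 20), rng.randint(0, 10)]
    for _ in range(50)
]
targets = [predict(true_weights, row) for row in rows]

fitted = fit_least_squares(rows, targets)
print([round(w, 4) for w in fitted])
```

Because the synthetic targets are an exact linear function of the features, the fit should recover the invented weights up to floating-point error. Real data would be noisy, and in practice you would reach for a statistics library rather than hand-rolled elimination.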

They found that this five-factor model predicted the future h-index better than the current h-index alone. Their R² value for predicting h-index one year into the future was 0.92; five years out, 0.67; and ten years out, 0.48. Their conclusion was that the raw h-index was less predictive on its own than when combined with the scientist’s “breadth” (in j) and the quality of the publication venues (in q).
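
As a reminder of what those numbers measure, here is a small Python sketch of R², assuming the standard coefficient of determination (one minus the ratio of residual to total sum of squares). I haven't checked exactly how the authors computed theirs, and the function name r_squared and the sample values are mine.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    # Squared error left over after using the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of the baseline that always predicts the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up future h-index values and made-up predictions for four people.
actual = [10, 12, 15, 20]
predicted = [11, 12, 14, 19]
print(r_squared(actual, predicted))
```

A value of 1.0 means the predictions match perfectly, while 0.0 means they do no better than guessing the average for everyone. Under this definition, 0.48 at ten years leaves about half of the variance unexplained.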

You can try out their model on your own data, although they note that it is “probably reasonably precise for life scientists, but likely to be less meaningful for the other sciences.” Also, you’ll have to wait the specified number of years to see if the prediction comes true. Or you can plug in your data from a few years ago and see how the predictions match the present. Using my data from two years ago (h-index 12), their system predicts that my h-index this year should reach 19. Google Scholar pegs it at 17 right now, so either I am not reaching my proper potential, or their model is wrong. ;)

There’s more than recreational fun going on here. The authors note that h-index values may be used in tenure decisions. In that context, the ability to predict a candidate’s h-index five years into the future could have even more impact—if it were sufficiently reliable. As usual, we can hope that such decisions are made with more than just these impoverished metrics in mind!
