H/T for the title to James Cagney in White Heat.

Note: for background on methods, start reading this thread on Amazon sales data with Scraping the Gorilla – Picking data off of Amazon


Amazon publishes a list of the top 100 authors, which is updated daily – mostly. I sampled this list daily for a few weeks to get a feel for how it changes and, with a little sampling of individual authors, perhaps to learn how it is constructed.

First, let’s look at how that list churns. The graph below (I apologize for the clutter) shows the authors who appeared in the top 20 at some point during the 24 days of the sample.

[Figure: top 100 authors churn]

Note the pause in updates during 6/8 and 6/9, a weekend. From this I might speculate that the list is updated manually rather than by an automatic algorithm, and that whoever was responsible was off. In the legend, authors are ordered by the number of days they spent in the top 20.
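As a rough illustration of how the legend ordering could be computed from daily samples, here is a minimal Python sketch. It is not the script used for this post: the function names, the data layout (one ranked list of author names per day), and the sample authors are all assumptions made up for the example.

```python
from collections import Counter


def days_in_top_n(snapshots, n=20):
    """Count, for each author, the number of sampled days spent in the top n.

    snapshots: dict mapping a day label to a list of author names ordered
    by rank (index 0 is rank 1). This layout is an assumption.
    """
    counts = Counter()
    for ranked_authors in snapshots.values():
        # set() guards against an author being listed twice on one day
        counts.update(set(ranked_authors[:n]))
    return counts


def legend_order(snapshots, n=20):
    """Authors sorted by days in the top n (most first), ties by name."""
    counts = days_in_top_n(snapshots, n)
    return sorted(counts, key=lambda author: (-counts[author], author))


# Invented sample data: three days, three authors per day.
sample = {
    "day 1": ["Author A", "Author B", "Author C"],
    "day 2": ["Author A", "Author C", "Author D"],
    "day 3": ["Author A", "Author D", "Author B"],
}

# With n=2, only the first two names of each day count.
print(legend_order(sample, n=2))
```

With the real data, `n` would stay at 20 and each daily list would hold 100 names.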

While the top authors plug along well, we can see many authors quickly climb from lower in the list, and almost as quickly descend. I presume this is the result of a sudden and temporary increase in sales. This was not practical for me to verify retroactively: I can evaluate present sales but not past sales, unless I were to catalog the daily rank of every one of the thousands of books belonging to all the authors in the top 100 list. This would be impolite to Amazon.

Book unit sales, as estimated from the ranks of each author’s books, do loosely correlate with author rank, at least better than other measures such as gross sales dollars, author royalties, or Amazon’s share of revenue. Below is a single-day snapshot.

[Figure: author rank correlation]

The correlation coefficients are listed along the side. Some of the weakness in the correlation with estimated unit sales may result from book rank (from which I estimate sales) being a running average, from the length of time it took my script to pull down all those book pages, and from other factors possibly used to establish the list, such as the rate of change in sales (“Movers and Shakers”). But another factor may be seen below.
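For readers who want to try a comparison like this, here is a minimal Python sketch of a correlation coefficient between author rank and one candidate measure. It assumes a plain Pearson coefficient, which may not be the statistic used for the chart, and the numbers are invented purely to exercise the function. Because rank 1 is the best rank, a strong relationship shows up as a coefficient near -1.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Invented figures, not measurements: five authors by list rank and a
# hypothetical estimate of their daily unit sales.
author_rank = [1, 2, 3, 4, 5]
estimated_units = [500, 400, 300, 200, 100]

r = pearson(author_rank, estimated_units)
print(round(r, 3))
```

The same function can be run against each measure in turn (units, gross dollars, royalties) to see which tracks author rank most closely.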

Here I have stripped away the authors who began the sample in the top 100, instead focusing on those who suddenly appeared in the top 20 without previously even being in the top 100.

[Figure: top 100 authors surprises]

It is easy to understand why Maya Angelou should suddenly appear in the #1 spot the day after her passing. Likewise, Aldous Huxley, whose classic Brave New World is required reading for nearly every student at one time or another, could easily be catapulted up from obscurity when summer reading lists are issued.
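The selection behind this subset can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the daily lists are held in chronological order as ranked lists of names, and `surprise_entrants` and the sample authors are hypothetical, not taken from my actual script or data.

```python
def surprise_entrants(daily_lists, top_n=20):
    """Find authors whose first appearance on the list is inside the top_n.

    daily_lists: ranked lists of author names in chronological order
    (index 0 of each list is rank 1). Authors present on the first day
    are excluded, since they began the sample already on the list.
    Returns (author, day_index, rank) tuples.
    """
    if not daily_lists:
        return []
    seen = set(daily_lists[0])
    surprises = []
    for day_index, ranked in enumerate(daily_lists[1:], start=1):
        for rank, author in enumerate(ranked, start=1):
            if author not in seen and rank <= top_n:
                surprises.append((author, day_index, rank))
        # Everyone listed today counts as "previously on the list" tomorrow.
        seen.update(ranked)
    return surprises


# Invented sample: lists of three names, with top_n shrunk to 2.
days = [
    ["Author A", "Author B", "Author C"],
    ["Author D", "Author A", "Author B"],  # D debuts at rank 1
    ["Author A", "Author D", "Author E"],  # E debuts at rank 3, outside top 2
    ["Author E", "Author A", "Author D"],  # E was already seen
]

print(surprise_entrants(days, top_n=2))
```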

Other authors are more puzzling. In many cases, when I investigated an author from this subset, I discovered that their books were selling rather modestly – certainly nowhere near enough to justify being a top 100 author, let alone top 20. The common thread I found was that many of these authors were signed with one of Amazon’s publishing arms, which include: “Amazon Encore”, “Amazon Crossing”, “Montlake Romance”, “Little A”, “Thomas & Mercer”, “47North”, “The Domino Project”, “New Harvest”, “Grand Harbor Press”, “Amazon Children’s Publishing”, “Jet City Comics”, “Day One”, “Skyscape”, “Two Lions”, “Lake Union Publishing”, “StoryFront”, “Waterfall Press”, and “Kindle Worlds.”

This may be one of the ways that Amazon promotes its own authors, as alluded to in a previous post. I personally do not see any ethical problem with this. Every retailer tries to promote the brands from which they make the most profit, whether it be your local grocery store or your car dealer. Amazon makes few or no warranties about the significance or origin of its various lists. More power to them.

