Note: this is the first in a series of posts on Amazon sales data as gleaned from the Amazon website.
There is little doubt that indie* publishing has given up-and-coming writers new options. They can now choose not to battle gatekeepers at publishing houses or accept onerous publisher contracts with their low royalties and rigid publishing schedules. Whether indie publishing is inconveniencing the Big 6** publishing houses is debated. That debate requires data, which publishing companies are loath to reveal. Some of this data can, however, be gleaned from the 500-pound gorilla in book retail: Amazon.
Some questions that such data might answer:
- How often do indie published titles earn royalties comparable to those of Big 6 titles?
- How many indie authors earn royalty incomes rivaling those of Big 6 authors?
- How many authors can make a living from their writing?
- What are the relative sales in each genre/subgenre?
- How important are online customer reviews/ratings to midlist sales?
- Do series outperform standalone titles?
To answer these and other questions, I set out to gather data from Amazon’s website. There had been discussions on the internet (e.g., kboards.com) about what Amazon’s book rank means in terms of sales. Many authors have posted their rank vs. sales figures, which helps to roughly deduce sales from rank.
Early on, I discovered authorearnings.com, a project led by the much-admired and successful indie author Hugh Howey. This project sought many of the same answers and published not only reports on its findings but also the raw (although anonymized) data on individual titles. Data mining has always fascinated me, and one thing I’ve learned from many years of analysis is how readily data can be tortured into confessing, which is confirmation bias at work. I wanted to analyze the data myself, so I carried on writing scripts to pull and parse pages on around 60,000 titles. This included the top 100 titles in the various fiction categories (there are over 400 categories), as well as the complete corpus of around 200 authors that included the 100 top-ranked Amazon authors, authors on NPR’s 100 best SF list, and around 20 indie authors known to have enjoyed success. Although I am an indie myself, I did not set out to make a point about indie publishing.
Before I post data, I must warn of some important caveats.
- Motion blur: My data is not a snapshot. Acquiring the data took weeks, so it doesn’t reflect a single point in time. Rankings changed as data was being collected. This shouldn’t affect the overall picture the data renders, but it does make the data somewhat internally inconsistent.
- Rank to sales algorithm: The algorithm used assumes that book rank is determined by unit sales (this metric has the strongest correlation with author rank; more on that in future posts) and is likely a moving average. The approximation I use involves seven equations (linear, power series, and exponentials) that span various regimes from #1 to #2,500,000. The estimate of gross sales integrated over all ranks does not compare unfavorably with published Amazon book sales figures (more on that later). Sales estimates will be least accurate for the top-ranked books, since #1 could mean anything from 1,000 to 10,000 sales a day, depending on buyer whim. My algorithm arbitrarily assumes that #1 in eBooks means sales of 4,000 per day, and assumes that eBooks outsell paper 1.5:1. Amazon does not rank audio books, so I cannot estimate audio sales.
- Behind the curtain: One hopes that Amazon calculates the various rankings based on simple metrics without manipulating those numbers for their own purposes. Who knows?
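To make the rank-to-sales caveat concrete, here is a minimal sketch of the general idea in Python. It is not the seven-equation approximation described above, which is not reproduced in this post; it swaps in plain log-log interpolation between a few anchor points. Only the first anchor (#1 means 4,000 eBook sales per day) and the 1.5:1 eBook-to-paper ratio come from the text. The other anchor values, the function name `estimate_daily_sales`, and the reading of the 1.5:1 ratio as a per-rank divisor for paper are all assumptions made for illustration.

```python
import math

# Hypothetical (rank, daily eBook sales) anchor points.
# Only the first pair (rank 1 -> 4000/day) comes from the post;
# every other value is an invented placeholder.
ANCHORS = [
    (1, 4000.0),
    (100, 500.0),
    (10_000, 15.0),
    (2_500_000, 0.01),
]

# From the post: eBooks are assumed to outsell paper 1.5:1.
# Applying it as a per-rank divisor is an assumption of this sketch.
EBOOK_TO_PAPER_RATIO = 1.5


def estimate_daily_sales(rank, fmt="ebook"):
    """Estimate daily unit sales from rank by log-log interpolation."""
    if rank < 1:
        raise ValueError("rank must be >= 1")
    # Clamp ranks beyond the last anchor to the last anchor.
    rank = min(rank, ANCHORS[-1][0])
    for (r0, s0), (r1, s1) in zip(ANCHORS, ANCHORS[1:]):
        if r0 <= rank <= r1:
            # A straight line in log-log space is a power law
            # within each regime between two anchors.
            t = (math.log(rank) - math.log(r0)) / (math.log(r1) - math.log(r0))
            sales = math.exp(math.log(s0) + t * (math.log(s1) - math.log(s0)))
            break
    if fmt == "paper":
        sales /= EBOOK_TO_PAPER_RATIO
    return sales
```

Calling `estimate_daily_sales(1)` returns roughly 4,000, matching the assumption stated above. Anything between anchors depends entirely on the placeholder values, so real use would require anchors fitted to the rank vs. sales figures that authors have reported.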
In subsequent posts I will try to guess at the answers to the questions I posed above and to others that occur to me.
For perspective, consider the following estimates of daily sales and royalty for the top fiction titles:
| Title | Author | Publisher | Est. daily unit sales | Est. daily royalty | Customer rating |
|---|---|---|---|---|---|
| The Fault in Our Stars | John Green | Big 6 | 6562 | $7,653 | 4.8 |
| Divergent | Veronica Roth | Big 6 | 5834 | $4,345 | 4.6 |
| Allegiant | Veronica Roth | Big 6 | 5493 | $6,601 | 3.3 |
| Insurgent | Veronica Roth | Big 6 | 5424 | $6,224 | 4.6 |
| Divergent Series Box Set | Veronica Roth | Big 6 | 5045 | $14,547 | 4.3 |
| I Am Livia | Phyllis T. Smith | Amazon | 4030 | $7,038 | 4.6 |
| The Goldfinch | Donna Tartt | Big 6 | 4001 | $6,470 | 3.8 |
| Bloodline | James Rollins | Big 6 | 3904 | $3,080 | 4.5 |
| Home to Stay | Terri Osburn | Amazon | 3779 | $6,600 | 4.1 |
| The Fixed Trilogy | Laurelin Paige | Self | 3709 | $1,378 | 4.7 |
| Plaster City | Johnny Shaw | Amazon | 3278 | $5,725 | 4.1 |
| The Way Life Should Be | Christina Baker Kline | Big 6 | 2906 | $4,390 | 4.4 |
| The Boleyn Inheritance | Philippa Gregory | Big 6 | 2902 | $3,369 | 4.4 |
| Missing You | Harlan Coben | Big 6 | 2824 | $5,723 | 4.3 |
| Beach Road | James Patterson | Big 6 | 2753 | $5,504 | 3.0 |
| NYPD Red 2 | James Patterson | Big 6 | 2671 | $4,119 | 4.5 |
| The Husband's Secret | Liane Moriarty | Big 6 | 2669 | $2,978 | 4.3 |
| Orphan Train | Christina Baker Kline | Big 6 | 2331 | $2,609 | 4.6 |
| I've Got You Under My Skin | Mary Higgins Clark | Big 6 | 2324 | $4,309 | 4.4 |
| Killing Ruby Rose | Jessie Humphries | Amazon | 2275 | $1,930 | 4.3 |
| The Alchemist | Paulo Coelho | Big 6 | 2109 | $1,966 | 4.2 |
| The Book Thief | Markus Zusak | Big 6 | 1833 | $1,759 | 4.6 |
| Shadow Spell | Nora Roberts | Large | 1760 | $2,313 | 4.5 |
| A Game of Thrones 5-Book Boxed Set | George R. R. Martin | Big 6 | 1684 | $5,362 | 4.5 |
| Modern Wicked Fairy Tales: Collection | Selena Kitt | Big 6 | 1649 | $279 | 4.2 |
| The Headmistress of Rosemere | Sarah E. Ladd | Big 6 | 1649 | $2,718 | 4.6 |
| Gone Girl | Gillian Flynn | Big 6 | 1437 | $2,037 | 3.8 |
| Tall, Dark, and Deadly 3 book box set | Lisa Renee Jones | Self | 1398 | $484 | 4.3 |
| The Collector | Nora Roberts | Big 6 | 1310 | $2,546 | 4.5 |
| Hidden | Catherine McKenzie | Big 6 | 1277 | $2,231 | 3.8 |
| The Invention of Wings | Sue Monk Kidd | Big 6 | 1244 | $2,479 | 4.6 |
| Little Girl Lost | Brian McGilloway | Big 6 | 1150 | $403 | 4.3 |
| Too Many Crooks Spoil the Broth | Tamar Myers | Small | 1148 | $193 | 4.0 |
| The Maze Runner | James Dashner | Big 6 | 1131 | $861 | 4.3 |
Note: Table data excludes audio. A “title” includes all text formats. “Self” means a publisher that has only one author; “Large” means a publisher with 20 or more. Royalties are based on estimated unit sales, price, and typical royalty percentage for publisher type and format. Again, keep in mind the caveat that unit sales figures (and thus estimated royalties) for the top ranks are quite speculative.
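As a rough illustration of how a royalty figure can follow from unit sales, price, and a royalty percentage, consider the sketch below. The rates in `ROYALTY_RATE`, the function name `estimate_daily_royalty`, and the simplification that royalty is a flat fraction of list price are all placeholders for illustration, not the percentages or method behind the table.

```python
# Hypothetical royalty fractions of list price, keyed by
# (publisher type, format). These are placeholder values,
# not the rates used to produce the table above.
ROYALTY_RATE = {
    ("Self", "ebook"): 0.70,
    ("Big 6", "ebook"): 0.175,
    ("Big 6", "paper"): 0.10,
}


def estimate_daily_royalty(units_by_format, price_by_format, publisher_type):
    """Sum units * price * royalty rate over every format of one title."""
    total = 0.0
    for fmt, units in units_by_format.items():
        rate = ROYALTY_RATE[(publisher_type, fmt)]
        total += units * price_by_format[fmt] * rate
    return total


# Usage with made-up numbers: 1,000 eBook units/day at $3.99, self-published.
example = estimate_daily_royalty({"ebook": 1000}, {"ebook": 3.99}, "Self")
```

Because a “title” here includes all text formats, the function sums over formats; any error in the unit sales estimate carries straight through to the royalty figure.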
What did I learn from this? I was frankly surprised at the revenue a single title can generate, at least when it’s hot. I was also surprised that fiction outsold non-fiction handily, about 5:1. While the above table shows only fiction, there were seven non-fiction titles interspersed, the hottest being in the #5 spot. I was also surprised at the dominance of genre fiction.
Two indie titles make this list. That could be misleadingly low if others on the list started as indie but were picked up by a publishing company along the way.
I shouldn’t have been surprised by the four Amazon-published titles. Whether these were indies picked up by Amazon, or whether Amazon was one of the publishers queried, Amazon’s marketing clout on its own site may ensure the popularity of its own titles.
*As the term is used here, “indie” refers to writers who undertake to publish their work (ebook and/or print on demand) using online retailers without benefit of agents or publishing companies. They typically hire their own cover artists, editors, and formatters, and conduct their own marketing.
**I define Big 6 as the various publishing arms of the following six corporations: Hachette, Holtzbrinck/Macmillan, Penguin, HarperCollins, Random House, and Simon & Schuster. These companies comprise over 300 imprints. For purposes of calculation, I assume royalty arrangements are roughly the same across all imprints and their authors.
If the reader has suggestions for other interesting analyses, please leave them in the comments.