Piracy raw data update

Here's a big data dump of stats (followed by analysis), for those who care about this sort of thing, from a March 2009 Ars Technica article:

- 17M people stopped buying CDs in 2008

- 8M people started buying digital music in 2008
- There are now 36M digital music customers
- 1.5B songs were sold "digitally" (i.e., online) in 2008
- 33% "of all music tracks" purchased in the US were digital

- Pandora use doubled in 2008, to "18 percent of Internet users"
- "Social network music streaming" rose from 15 to 19 percent usage

A January 2009 Ars Technica article rounds out these stats with:

- "unit purchases" increased by 10.5% in 2008
- 428M albums (LPs + CDs + online) were sold in 2008, down 14%
- 65.6M online albums sold in 2008, up 32% over 2007
- 1.5B songs sold online in 2008, up 27% over 2007
- 1.88M vinyl sales in 2008, up 89% over 2007

So all that looks pretty rosy for the music industry, in absolute terms.  But how did it do relative to piracy?  According to this slightly more pessimistic January 2009 IFPI report:

- Digital music sales grew 25% in 2008 to $3.7B worldwide
- Digital music sales account for 20% of recorded music sales, up from 15% in 2007

- 40B songs were "illegally file-shared" in 2008
- 72% of UK music consumers would stop pirating if told to do so by their ISP
- 74% of French consumers agree internet disconnection is preferable to fines

A linked "key facts" PDF has a boatload of additional statistics, including:

- 16% of European internet users "regularly swapped infringing music" in 2008
- 13.7M films were distributed via P2P in France in May 2008, compared to 12.2M cinema tickets
- "free music" was given as the primary reason for piracy
- P2P file sharing accounts for up to 80% of traffic on ISP networks

So pirated downloads still utterly dominate legit downloads, to the tune of roughly 26:1 (40B file-shared songs against 1.5B songs sold online).  If anything, it seems like piracy is accelerating, even faster than legal download services.
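As a quick sanity check on that ratio, here is a minimal Python sketch using only the two 2008 figures quoted above.  The variable names are mine, and it assumes the IFPI's "illegally file-shared" songs and the songs sold online are comparable units.

```python
# 2008 figures quoted in the lists above.
pirated_songs = 40e9   # songs "illegally file-shared" (IFPI report)
legit_songs = 1.5e9    # songs sold online

ratio = pirated_songs / legit_songs
print(f"pirated : legit = {ratio:.1f} : 1")
```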

What about legit streaming?  In July 2008 I estimated that MySpace users legally streamed about 110M songs per day.  Turns out I was off by a lot: they streamed 1B songs after "only a few days", and this September 2008 TechCrunch article tosses out 20B streams initiated *per day*.  That's an amazing number.

But it's also an incredibly vague number, as stream initiation isn't nearly as interesting as stream completion.  For example, the average user spends under 10 minutes on the site per visit, meaning there's barely time for two full-length songs.  I'm having a surprisingly hard time finding recent data, but this 2007 article shows MySpace had like 29M daily visitors, so even doubling that for 60M daily visitors today suggests at most time for 120M full-length songs per day -- roughly 43B per year -- and this ignores the large subset of international users (who can't get newly-released music).
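The back-of-envelope ceiling above can be written out as a short Python sketch.  Both inputs are the guesses from the paragraph (60M daily visitors, time for two full songs per visit), not measured data, and the names are mine.

```python
# Assumptions taken from the paragraph above, not measurements.
daily_visitors = 60e6     # the 2007 figure of 29M, roughly doubled
songs_per_visit = 2       # under 10 minutes per visit ~ two full-length songs

songs_per_day = daily_visitors * songs_per_visit
songs_per_year = songs_per_day * 365

print(f"{songs_per_day / 1e6:.0f}M full-length songs per day at most")
print(f"{songs_per_year / 1e9:.1f}B per year at most")
```

This is an upper bound: it assumes every visitor spends the whole visit listening to music.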

Similarly, YouTube had 5B views in July 2008, and 6B views in December 2008, so let's just assume something like 66B total videos in 2008.  As for what fraction of those equate to "songs" I have no idea; I'd say this is more about "intent" than anything (i.e., people who play the video in the background like a radio, rather than watching it like a music video), and I have no data at all on that.  But I wager it's not the common case, so let's say 25% of YouTube videos are actually just played as songs -- and even that seems high.  (Also, this assumes all YouTube music is licensed, when in fact the opposite is probably more often true.  Details, details...)

Adding to MySpace's 43B and YouTube's 16.5B would be all of Pandora's streams, which should be considerable given the claim that 18% of all Internet users use it, but I can't find any data on it.  One reason is probably that Pandora actually has nowhere near that userbase: this Dec 19, 2008 TechCrunch article reports they only just hit 20M users, while in that same month the internet was estimated to comprise 248M North-American users (1.4B global).  This puts Pandora's penetration at a much more conservative 8% of North-American users (assuming 100% are North American), or 1% global.  Still significant, but 20M *total* users is nowhere near MySpace's 100M *active* users.
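For anyone who wants to check the penetration figures, here is a small Python sketch using the three numbers cited above.  The names are mine, and the North-American share assumes (as the text does) that every Pandora user is North American.

```python
pandora_users = 20e6          # TechCrunch, Dec 2008
north_america_users = 248e6   # estimated North-American internet users
global_users = 1.4e9          # estimated global internet users

# Assumes all Pandora users are North American, as in the text.
na_share = pandora_users / north_america_users
global_share = pandora_users / global_users

print(f"North America: {na_share:.1%}")
print(f"Global:        {global_share:.1%}")
```

Either way the result is far below the claimed "18 percent of Internet users".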

So for the sake of argument, let's say there are about 60B legit streams, against 40B pirated downloads -- meaning piracy utterly dominates in the download market, whereas legit streaming utterly dominates in the streaming market.  Indeed, there is essentially no such thing as a meaningful "legitimate" download market, or a meaningful "pirate" streaming market.

As for which accounts for more total "listens" and thus ultimately controls more users' ears, that's an open question: on the one hand, streamed songs are only heard at most once, whereas downloaded songs can be listened to multiple times.  But streamed songs are probably more likely to be heard at all, with a lot of pirated songs probably just going into vast personal libraries having never been played.

Who's winning?  Who knows, and as piracy goes dark, it's harder and harder to tell.  Personally, I'd still put my money on piracy having a strong lead on users' ears, both right now and for the foreseeable future.  If the average pirated song is listened to just 1.5 times (which seems reasonable), then piracy is still winning.
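To make the break-even point explicit, here is a rough Python sketch using the rounded totals from above (about 60B legit streams, 40B pirated downloads).  It assumes each stream counts as exactly one listen, which is the most a stream can be; the function name is mine.

```python
legit_streams = 60e9        # rounded estimate from above; one listen each at most
pirated_downloads = 40e9    # IFPI figure for 2008

# Average plays per pirated song at which total listens are equal.
break_even_plays = legit_streams / pirated_downloads

def pirate_listens(avg_plays_per_download):
    """Total listens from pirated downloads for a given average play count."""
    return pirated_downloads * avg_plays_per_download

print(f"break-even: {break_even_plays} plays per pirated song")
for plays in (1.0, 1.5, 2.0):
    print(plays, pirate_listens(plays) >= legit_streams)
```

With these rounded numbers 1.5 plays is the tipping point, so any average above that puts piracy ahead, and any streams that are abandoned before completion push the threshold lower.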

So in conclusion, it seems to me that the battle for downloads is utterly and irretrievably lost to piracy, but the battle for pirate streaming is only just beginning.

As it stands, streaming is overwhelmingly in favor of legitimate content owners.  But I really wonder how long that will last. 

After all, the list of streaming P2P applications is long and always growing (now over encrypted onionskin darknets).  Basically, P2P streaming is a hard problem, but it's also largely a solved problem.  So if there's no technical reason why pirates don't stream, maybe they don't simply because they don't want to? 

The most obvious reason why this might be true is that people turn to piracy primarily to avoid paying.  (Please excuse the alliteration.)  So long as MySpace and YouTube continue to give it out for free, there's little incentive to build a pirate streaming site.  But the real test will come if something in that calculation changes, by one or more of the major parties.

For example, let's say MySpace decides they don't like paying to stream content from central servers, and then paying again for licensing fees.  Maybe they find their ad revenue sagging and decide to integrate a streaming P2P plugin (I'm betting on Littleshoot for now) to offer the same exact experience as today -- but by tapping into the pirate networks.  So no bandwidth costs, no licensing fees.

Alternatively, let's say the powers that be do something incredibly stupid like pulling their music from MySpace, or jacking up the price such that MySpace is forced to charge for it.  At this point there's an opening for someone like The Pirate Bay to offer a first-class pirate station, and then it's game on.

Either party would use an argument like "we don't host any data, we just enable user sharing.  Any illegal behavior they do is their business and we don't encourage it (we merely profit from it)." 

And unlike the small P2P outfits who have tried this in the past, the next wave of defendants will have substantial legal resources and astonishing revenue incentive.  And unlike the tiny, outgunned P2P outfits of yore, MySpace's or The Pirate Bay's victory won't be quite so Pyrrhic.

Anyway, just wanted to do a quick review of the available data and update my predictions.  Can anyone provide more recent or accurate data to correct the above analysis, or see holes in the logic?  I'm as eager as anyone to get a firm grasp on reality; let me know if you think my grip is slipping.

Fun times, I can't wait to see where this goes.  Thankfully, it's going there really fast, so there's little time to wait.

- David Barrett
Follow me at http://twitter.com/quinthar

Also, download text/HTML/PDF receipts in PDF form

Also, I should note that when you upload receipts as HTML or text emails (or even upload them as original PDFs), we store them securely on our server as both a high-resolution PDF and a low-resolution JPG thumbnail.  We typically only show the JPG, but you're always welcome to go back and download the full PDF in all its glory.  Just click on the receipt on the home or receipt page, then click "Download as PDF" in the upper-right corner:

Download as PDF

Naturally, this option is only available for HTML, text, and PDF receipts -- receipts uploaded as a photograph are kept in their original format.

- David Barrett
Twitter: @expensify

Better uploaded HTML receipts: now with embedded images!

So we've been absolutely flooded with users and that's a great problem to have!  One of the (surprisingly few) problem areas was receipts: you'd be amazed how many formats email receipts can come in.  But we're steadily learning how to handle them all, and we have a major new trick up our sleeve: embedded images!

That's right, now if you forward us an HTML receipt that has images in it, we'll render the images in full glorious color.  For example, here's what it looks like when I upload an Orbitz reminder that Athens was in the midst of riots when we recently visited:

Beautiful Orbitz receipt

Pretty slick, eh?  So send your receipts to receipts@expensify.com and your expense reports can look this good too!

- David Barrett
Twitter: @expensify

More pirate innovation: scan barcode at the store, downloaded at home

Just one more example of how all the best innovation is happening
outside the law.

Torrent Droid: Scan Barcodes, Get Torrents
Written by enigmax on March 11, 2009

You are standing in a store looking for a new DVD to buy. Rather than
buying it, you photograph the barcode with your phone and press a couple
of buttons. By the time you make it home, the movie is waiting for you
in your torrent client. You can with Torrent Droid.

Around a month ago, Android-orientated website Androidandme
launched 'Android Bounty', a new initiative which has led to the
creation of a nice little torrent app. To find out more, we spoke to
Taylor Wimberly from the site.

"Android Bounty is a new kind of developers challenge we started for
creating applications on Google Android," he told TorrentFreak. "Users
submit ideas which can be voted up by others who pledge money to the
bounty. The first developer who delivers a working application is
rewarded with the bounty." Taylor explained the idea is similar to how
users promote stories on Digg, except people vote with cash.

To start things rolling, a few days later Androidandme set a challenge
to its readers - create an Android-compatible BitTorrent application to
scan UPC barcodes and find related torrents on the larger BitTorrent
search engines. Users would be able to find and start torrents remotely,
and the music album or movie would be fully downloaded by the time they
got home.

There were some terms and conditions to the challenge. The software
would use the G1 cellphone's inbuilt camera to scan a retail DVD UPC
barcode, and use the capture to identify the official details of the
product from a database.

Once the product is positively identified, the software should be able
to send the results directly to a BitTorrent search engine, such as The
Pirate Bay or Mininova. After the search results appear, the user could
then choose which torrent to start.

Once selected, the .torrent file would be downloaded and sent to the
webUI of uTorrent and the download would begin, hopefully ready for when
the user reaches his or her home machine. No typing input would be
required for the above.

Just a few weeks later, Alec Holmes of Zerofate had stepped up to the
challenge, created the app and collected the modest bounty of $90.00.

"This version of Torrent Droid is a work in progress but the video shows
the core features work," said Alec.

The full version of Torrent Droid will be released within a month but in
the meantime, here is a video of it in action.

It's official: Expensify is open and ready for business!

After months of hundreds of beta testers poring over every nook and cranny, it's time: as of Wednesday, March 11th at 8am PST, Expensify has opened its doors to all comers!  That's right, as of now, we are in "open beta", so sign up now and encourage everyone you know to follow!

If you already know what Expensify is, then don't bother reading this blog: just go to http://expensify.com and get started.  Or, if you're looking for a three-minute refresher, watch this video.  Otherwise if you're looking for a bit more detail on what we're all about, here goes:

Expensify does expense reports that don't suck. 

Backing up a bit, let me say this: I hate expense reports, as does nearly everyone I know.  They take forever to prepare, there are always missing receipts, my boss is always slow to reimburse.  In short, they suck.

That's why what we do is so amazing: Expensify does expense reports that don't suck.  Such a simple goal.  Can it really be possible?  Ultimately you can judge for yourself, but here's how we try to make this bold vision a reality.  We call it the "Expensify Way".

1) Import your credit card; no more data entry

If you already have a credit card, great!  Expensify imports expenses from 94% of US credit cards.  That means no typing into Excel or ancient web forms: just enter your credit card details and we'll import straight from your banking website into our PCI-compliant datacenter.

Alternatively, if you don't have a credit card, or have one but don't like mixing business and personal expenses on the same card, we can help you get a corporate card that imports its expenses straight into Expensify.

2) Import your receipts; no more paper receipts

Not only does Expensify import your expenses, we also create Guaranteed eReceipts for purchases under $75.  Guaranteed eReceipts comply with all IRS regulations for documenting purchases (we guarantee it), so you can literally throw away 80% of your paper receipts.

Of the remaining 20%, most are online purchases such as plane tickets or hotel reservations -- just forward the email receipt to receipts@expensify.com and we'll take care of it.

For those few paper receipts that remain, just use your cameraphone and send a picture of your receipt to receipts@expensify.com.  (Or use our iPhone App, once Apple gets around to approving it...)

The upshot is we can literally do away with paper receipts.

3) Submit in one click; no more printing and stapling

In just one click we'll take all your expenses, subtotal them by expense category, attach your uploaded receipts and eReceipts, and construct a full, ready-to-send expense report.  Enter any email address and we'll send a PDF containing the completed expense report, receipts and all, along with a form to reimburse it.

4) Reimburse online; no more trips to the bank

Pay or get paid from your checking account or credit card, online.  Or mark it approved and get paid through regular channels (payroll, wire transfer, etc), or even reject it and send it back with comments.

The Expensify Way is the fast and easy way.  Import your expenses and receipts, submit in one click, reimburse online.  Before today, it was the way expense reports should work.  But now, it's the way they do work.  If you're still stuck in expense report hell, why not find salvation today?

Sign up today.  It only takes seconds, and it's completely free.

Expensify.  Expense reports that don't suck. 

- David Barrett, Founder
Twitter: @expensify

Here's why backbone sampling will *never* be accurate:

Every once in a while someone gets a brilliant idea for dealing with piracy: why not just assemble a big pool of money and then distribute it in proportion to how often content is pirated?

Both parts of that (filling the pool, and then selectively emptying it) are atrociously bad ideas for a huge number of reasons, but let me zero in on the latter half here.  In essence:

Under no circumstance proposed or envisioned will backbone measurement ever estimate volume to even the barest degree of accuracy, darknet or otherwise.

Consider what is ostensibly the most widely viewed image on the internet, the Google logo:

It's unprotected, unencrypted, no darknet, no P2P file sharing, no copying to an iPod for offline consumption.  In short, if backbone measurement could ever estimate *anything* then surely this would be the ideal use case, right?

But the Google image is cached locally -- in my case (according to about:cache in Firefox) until 2038.  No matter how many times I visit Google.com, I won't redownload it.  So estimating visits to Google.com by sampling the number of times the logo is downloaded is completely and irreparably flawed.

(And the most common caching solution is LRU so content that is accessed *more* often is actually re-downloaded *less*.)
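To illustrate the LRU point, here is a toy Python simulation.  It is only a sketch: the request pattern, the item names, and the tiny cache size are all made up, and real browser caches also honor expiry headers.  It counts how often each item is viewed versus how often it actually crosses the network, which is all a backbone sampler could ever see.

```python
from collections import OrderedDict

def simulate(requests, capacity):
    """Replay requests through an LRU cache; return (views, downloads) per item."""
    cache = OrderedDict()
    views, downloads = {}, {}
    for item in requests:
        views[item] = views.get(item, 0) + 1
        if item in cache:
            cache.move_to_end(item)          # hit: served locally, invisible to the backbone
        else:
            downloads[item] = downloads.get(item, 0) + 1   # miss: a real download
            cache[item] = True
            if len(cache) > capacity:
                cache.popitem(last=False)    # evict the least recently used item
    return views, downloads

# Hypothetical pattern: "logo" is viewed constantly, three other items only occasionally.
requests = []
for i in range(6):
    requests += ["logo", f"rare{i % 3}"]

views, downloads = simulate(requests, capacity=2)
for item in sorted(views):
    print(f"{item}: viewed {views[item]}x, downloaded {downloads[item]}x")
```

In this run the most-viewed item is downloaded once, while each rarely viewed item is re-downloaded on every view, so download counts rank popularity backwards.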

Thus estimating the number of times a song is listened to by measuring how often it is downloaded is even more flawed -- as all the reasons I gave for why Google is the ideal case are precisely inverted for music.

Even if we can't agree on anything else, we should all at least agree that backbone sampling is a patently absurd notion for estimating popularity, and thus is intrinsically unsuitable for redistributing some big pool of money -- regardless of how it's filled.

- David Barrett
Twitter: Follow @quinthar

Testing, take 2

Ok, that didn't really work, how about *this* one...  Ain't technology great?

- David Barrett
Twitter: @quinthar

Testing blog uplink...

So we've moved the Expensify blog from quinthar.com over to Facebook.  To simplify the transition, I'm just going to cross-post the subset of my Quinthar blog that relates to Expensify to the new Expensify blog on Facebook.  This is a test to see if that works...

- David Barrett
Twitter: @quinthar

Twitter is sitting on a goldmine

So I've been doing the twitter stuff for a while and I've been liking it, but it doesn't really scale up by the orders of magnitude I'd like.  It brings in dozens of clicks a day, but I want thousands.

(Incidentally, I finally filtered out all the Twitter bots -- conversion is still incredibly high.  The technique works amazingly well.)

Naturally, for those thousands of clicks a day I'd go to AdWords.  I know I can't afford that, but I'm curious what it would cost.  So I set up a series of keywords and set a small test ad budget, with the thought that I'd instantly be flooded with clicks and my ad budget depleted within minutes, but at least I'd get the data I need.

Days pass, and not a single ad was shown.  I check everything, add some more keywords, verify my billing is set up, remove my $2.00/click maximum, and try again.  Still nothing.

I'm thinking WTF.  So I dig around a bit more and find this keyword estimator and I find some really surprising results:

Even if I threw unlimited money at the problem, I would only get between 86 and 111 clicks a day, at a cost of $230-380/day.  That's $2.67 - $3.42 per click on average, and still it's such an insignificant flow of users it's not even worth the effort.
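The per-click figures follow directly from the estimator's ranges.  A minimal Python sketch, assuming the low cost pairs with the low click count and the high cost with the high click count (that pairing is my assumption, though it reproduces the numbers above):

```python
clicks_low, clicks_high = 86, 111      # estimated clicks per day
cost_low, cost_high = 230.0, 380.0     # estimated cost per day, in dollars

cpc_low = cost_low / clicks_low
cpc_high = cost_high / clicks_high

print(f"${cpc_low:.2f} - ${cpc_high:.2f} per click")
```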

In other words, the technique I'm using with Twitter not only converts far better than AdWords, it does it way cheaper and is in fact *easier* to use.

And on top of all that, let me throw another datapoint at those readers who are concerned that my Twitter technique is spam: we get about 4x more "thanks for the link!" responses than we do complaints.  And given the general adage that people are 10x more likely to complain than thank you, that means between 4x and 40x more people are actually appreciative of our contact than upset.

Given that you can't please everyone, pleasing 40x more people than you upset is about as good as you can do.

The upshot is this: Twitter is sitting on a massive goldmine.  Indeed, my data suggests that Tweets are far more monetizable than searches, and users will actually thank you for it.

Will that scale?  Unknown.  We adhere zealously to the Twitter Promotion Code of Conduct I outlined earlier, and I imagine there will be a flood of people who aren't so kind who will in all probability ruin it for the rest of us.

Until then, there's gold in them thar' hills, so go on out and grab it!

- David Barrett
Twitter: @quinthar

Another example of the impossibility of going legit

At AngelConf today I sat next to a man who had founded (and was apparently failing at) a business that taught how to play the guitar using super detailed 3d motion capture of the specific hand motions.  He was able to get all the incredibly advanced technology -- forty 16-megapixel cameras filming at 360 frames per second -- no sweat.  What was the problem?


He did manage to get sync licenses so he could show his 3d models in sync with the music.  But he couldn't get equivalent rights for the tablature.  The result?  After all this work he couldn't show the fingering position notation in sync with the 3d models and audio.

Now I'm sure someone will say "why didn't he do X or Y or Z?"  And I have no idea.  He was apparently smart enough to get the audio and sync licenses; I don't know why he couldn't get tablature licenses.

But for some reason he didn't or couldn't and ultimately that's all that matters.  Another law-abiding entrepreneur bites the dust.

What I found especially interesting was how he brought up the topic entirely unprompted; I was pitching expense reports when he suddenly launched into a tirade against music licensing.  Little did he know he had such an eager audience!

