What we should build for the Egyptian (and other) protesters

Egypt appears to have cut all internet connectivity with the rest of the
world in an attempt to quell its use in organizing protests. The only
reason this makes any sense is if the tools used to organize the
protests (Twitter, Facebook, Gmail, etc) are hosted outside Egypt.

To this you might say "Let's just host protest-organizing tools on
servers inside protest-likely nations in anticipation of them using this
strategy again." But that won't work because odds are the government
would just seize all protest-organizing servers within their borders.

So the only protest-tools that will continue to work reliably are those
that continue to work without access to the outside world, without
relying on locally-hosted servers, and *without even relying on the
internet at all*. It's a tall order, but here's how I'd do it.

1) Recognize that this service needs to be used in the good days, such
that there is adequate distribution already in place when the bad days
happen. THIS IS THE HARDEST PART. I say this in all caps because this
is why no meaningful system like this exists today: the people most
likely to build it are more obsessed with esoteric technical problems
than with solving the issues that actually matter in the real world.
Asymmetric, anonymized, mesh-distributed, onionskin-routed communication
doesn't mean anything if nobody uses it. So before even thinking about
the technology, we need to think how to make it relevant to users who
*aren't* protesting (yet).

2) At an absolute minimum, it needs to be no worse than the existing
alternatives. So if it's going to replicate Twitter, it needs to be at
*least* as good as Twitter, otherwise everybody will use the *real*
Twitter (until it's turned off by their local neighborhood dictator).
One way to be better than Twitter is to actually be better than Twitter.
Good luck with that. Another way is to just make your tool post to
Twitter. I think that's a much better idea: if this tool (let's call it
"anoninet" just for kicks) offers some Twitter-like functionality, it
should be completely compatible with the real Twitter in the
99.99999999999% of situations where the real Twitter is actually
available. Same goes for Facebook, Flickr, etc.

3) Ok, so anoninet's primary value in "good times" is starting to take
shape: it's a one-stop-shop to post to all your social networks. So you
install this thing, type in all your passwords (You could store them
locally in some encrypted keychain decrypted by a master password, but
that's the sort of technomasturbation thinking that obscures real-world
requirements; in reality just store it unencrypted because those who
don't care don't care, and those who do should really just encrypt their
whole hard drive), then you can post status updates, photos, videos, and
everything will automatically go to the right place. Indeed, before you
even think about making this into some sort of resilient
protest-enabling tool, you should make this the best possible
social-network posting tool. (Because if it's not that, then nobody
will have it installed when they want it most.) I'd suggest emphasizing
how this thing works even with unreliable internet, essentially letting
you queue up everything locally and it does background uploading as the
network becomes available. Similarly, it downloads everything locally
for offline reading. Odds are your protest-likely environment has
shitty internet to start, so this feature will likely have immediate
value. Add in really good support for USB-connected devices (cameras,
videocams), and basically present it as the single best way to do social
networking in a nation with shitty internet.

4) Step 4 is to succeed with step (3). Don't even think of anything
else until you've done that. Seriously, it's a waste of your time and a
disservice to your users. (3) needs to be totally nailed and immensely
popular before anything else matters. I'd say something like 10% of
your target population needs to be using it before you consider continuing.

5) Once you've got huge distribution of your client-side
social-network-optimizer, then you can start to raise the bar. Because
it's targeted to environments that have expensive and/or unreliable
internet, P2P starts to sound interesting. Throw in a network-localized
DHT and build out a distribution network that "rides" on these other
networks. So every time they post to Twitter, Facebook, Flickr,
YouTube, or whatever -- they're also posting to anoninet. And when
another anoninet is reading your Twitter stream, somehow they detect
each other and rather than getting the data from Twitter (for example),
they get it directly via some localized P2P connection. Present this to
the user as faster, more reliable, and cheaper than getting it from the
*real* YouTube.

6) Quietly encrypt everything and tunnel over commonly-used ports.
Don't talk about this, just do it. Users don't care until they do, and
by then it's too late.

7) Ok, so at this point we have wide distribution of a very popular
social networking tool that uses a localized P2P mesh as an optimized
fallback to the major global tools. Its major advantage is it works
over networks that are slow, unreliable, or expensive. This'll save you
in the Egypt case; these users would continue using the tools they
already use, to talk to the people they already talk with, and
everything will continue functioning as normal. They won't be able to
talk with the rest of the world, but they *will* be able to talk amongst
themselves, which is the important thing. Furthermore, because it's all
P2P, there are no servers to seize, and because it's all encrypted over
common ports, it's indistinguishable from all other encrypted traffic.

8) However, if this had existed in Egypt, odds are Egypt would have just
shut down the internet, period. If a dictator is willing to kill you, odds
are they wouldn't blink at turning off your email. So how to make this
work without internet? The answer is: make it incredibly easy to batch
and retransmit data like Fidonet back in the day. So when shit is
*really* going down, you whip out your favorite 4GB, 32GB, or 640GB USB
drive and just sync your local repository (remember how everything was
conveniently cached locally for fast offline access?) with the device.
Optimize it to sync the most popular content first, basically ensuring
that the most interesting/important message is also the most widely and
redundantly distributed.

9) Finally, this needs to spit out an installable copy of itself to
whatever removable media is available. This way when the shit starts to
*really* go down, as people realize the true value of this system it can
spread fast to the people who need it.
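
As a rough illustration of the "sync the most popular content first" idea
in step 8, here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the CachedPost fields, the use of
follower count as the popularity signal, and the greedy "skip what no
longer fits" rule. It only decides what to copy; the actual file copying
is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedPost:
    post_id: str
    author: str
    size_bytes: int
    follower_count: int  # hypothetical popularity signal


def pick_for_drive(cache, free_bytes):
    """Choose posts to copy, most-followed first, skipping any that do not fit."""
    chosen = []
    remaining = free_bytes
    # Sort by popularity; break ties by id so the result is deterministic.
    for post in sorted(cache, key=lambda p: (-p.follower_count, p.post_id)):
        if post.size_bytes <= remaining:
            chosen.append(post)
            remaining -= post.size_bytes
    return chosen


cache = [
    CachedPost("p1", "alice", 400, 12000),
    CachedPost("p2", "bob", 900, 50),
    CachedPost("p3", "carol", 700, 3000),
    CachedPost("p4", "dave", 300, 800),
]
selection = pick_for_drive(cache, free_bytes=1500)
print([p.post_id for p in selection])  # ['p1', 'p3', 'p4']
```

Because every drive gets filled in the same popularity order, the most
followed content ends up on the most drives, which is the redundancy the
step asks for.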

Voila. A tool that supports communication amongst protesters even in
the face of total internet blackout. Some other random thoughts:

- Ideally it'd piggyback on existing credentials. So when you install
this thing you don't need to think "I'm creating a new account".
Rather, you just install this thing, type in your Twitter username and
password, and whatever giant asymmetric keypair it creates internally is
just some nameless thing associated with that Twitter account. (And you
might have multiple.)

- This thing needs to broadcast itself via existing networks in a
totally transparent way, so if we're both users and I read your Twitter
stream, I should know you're also a user without you ever telling me.
The first way that comes to mind is this thing could watermark your
profile image with maybe a digital signature (or perhaps just jam it
into some sort of extra field in the image). Then when I follow you, my
client sees the watermark, reaches out to the DHT, sees that you're
signed in (or not), and establishes a NAT-tunneled P2P connection directly.

- Social networks are particularly good for this sort of architecture as
they map well to the "publish/subscribe" model. This works easily on a
P2P network (you register yourself with the DHT by name and
keyword/hashtag, and then when you post there everybody who is
"following" you or a particular hashtag gets your data), and it also
creates an implicit "value" metric for use when synchronizing data in
"sneakernet mode" (publishers/hashtags with a high follower count are
assumed to be more valuable and thus beat out less-popular content).

- This sort of system actually isn't that useful to terrorists,
criminals, drug-dealers, and so on because it's designed for mass public
communication (not individual private communications). Granted, nothing
in this protects the individual from being targeted, but that's an
entirely different problem. (And I wager one that could be layered on
top of this in a straightforward manner.)
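
To make the publish/subscribe point a bit more concrete, here is a small
Python sketch. It is not a DHT: ToyDirectory is a made-up in-memory
stand-in for one, and the peer names and the "#protest" hashtag are
invented for the example. It only shows the shape of the idea: peers
register interest by name or hashtag, a post reaches everybody following
either, and the follower count doubles as the "value" metric.

```python
from collections import defaultdict


class ToyDirectory:
    """In-memory stand-in for the DHT: maps a topic to its followers."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # topic -> set of peer names
        self.inboxes = defaultdict(list)     # peer name -> list of (author, text)

    def follow(self, peer, topic):
        self.subscribers[topic].add(peer)

    def publish(self, author, text, hashtags=()):
        # Deliver once per peer, even if they follow both the author
        # and one of the hashtags.
        recipients = set()
        for topic in [author, *hashtags]:
            recipients |= self.subscribers.get(topic, set())
        recipients.discard(author)
        for peer in sorted(recipients):
            self.inboxes[peer].append((author, text))
        return recipients

    def value(self, topic):
        """Implicit value metric: follower count of a publisher or hashtag."""
        return len(self.subscribers.get(topic, ()))


d = ToyDirectory()
d.follow("bob", "alice")
d.follow("bob", "#protest")
d.follow("carol", "#protest")
reached = d.publish("alice", "Meet at the square at noon", hashtags=["#protest"])
print(sorted(reached))      # ['bob', 'carol']
print(d.value("#protest"))  # 2
```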

In all honesty, this isn't that hard a thing to build. One dude could
do it. I could personally do it, and know several others who could as
well. But I'm busy. Hopefully a better person than me with more time
on their hands will pick up on this and do what needs to be done. The
world will thank them for it, though its dictators won't.

-david
My blog (including this post) is at http://quinthar.com
Follow me at http://twitter.com/quinthar

From the archive: David's Voluntary Payment Plan

This one is from 2008. I was asked something along the lines of "Well, if you're so smart, how would you fix the music industry?" Here's my answer:

http://quinthar.com/DavidsVoluntaryPaymentPlan.html

David's Voluntary Payment Plan

David Barrett

dbarrett@quinthar.com

2008/3/20

Abstract

This plan recommends creating “music registrars” to authoritatively manage song metadata in a fashion similar to how domain registrars authoritatively do the same for domain names. Artists (or their representatives) upload songs to registrars, who in turn check their waveform fingerprints against a master database of all known songs. If the song has already been registered by another owner, a conflict resolution process is started. Otherwise, the song is transcoded to an MP3 and tagged with a variety of metadata (artist and song name, artist website, etc), including “payment protocols” that enable fans to support the artist in a standardized way. iPods and other MP3 players are gradually outfitted with integrated support for various payment protocols, as well as methods for receiving artist communication or learning of and purchasing artist merchandise, concert tickets, and so forth.

I. Example of Operation

First, here's a quick walkthrough of how the system would be used in common operation:

A. Adding a new song

Alice, an independent musician, selects from one of several music registrars, creates a free account, uploads her track in the FLAC format, assigns it a name, optionally organizes it in one or more albums, and is done. The entire operation is free, takes less than 10 minutes, and requires no personal information beyond an email address.

B. Downloading a song

Bob, a music aficionado, browses a variety of free music outlets for new songs. One of those locations has an active online community around indie music, and the forum is buzzing around a new musician, Alice. The forum links to a page where Alice's music can be downloaded -- he clicks the link, chooses the format and bitrate, and downloads the MP3 for free. Though the website allows low-quality 128Kbps versions of the song to be downloaded or streamed straight from the server, for cost reasons it only allows 256Kbps and FLAC versions to be downloaded via a P2P network. He's all about quality, so he whips out his favorite P2P application and downloads the FLAC.

C. Listening to a song

When the download completes, Bob copies the file to several places -- his laptop, his home stereo, his iPod, his phone -- all of which support the completely standard, unprotected audio format.

D. Supporting Alice

Bob decides that he really likes Alice's music and wants to see more of it get played. He has several ways to help that happen:

One way is to go back to the website where he downloaded the music in the first place. There he finds a small (but growing) forum where Alice fans discuss her music, links to other music by Alice, recommendations of other music by Alice, and so on. Furthermore, there's a quick note by Alice herself saying "Hi, I'm trying to raise $1000 to fund my next album, please help me out!" Bob sees she's up to $950 right now. He's got a few options of how to help. One is to just do a simple cash contribution. Another is to help raise up to $1000 (at $950 so far) with the caveat that if she doesn't raise the full amount within a set timeframe, the money is given back. Another is a subscription of $1/mo that gets his name put on a list of True Fans. Yet another is to buy the last limited-edition autographed copy of Alice's first vinyl album for $50. All of these options can be paid with PayPal or a credit card.

Another way is to use a feature built into iTunes and his iPod to auto-support any song he listens to more than 5 times, at the default (but adjustable) amount of $0.05/listen. Similarly, whenever he looks at the face of his iPod to remember who he's listening to, he sees Alice's message that she's trying to raise $1000 and is up to $950. Likewise, he sees there's one more copy of the limited edition vinyl available.

Ultimately, he decides to go for the vinyl recommended by his iPod. He goes to iTunes, chooses "open musician's website", and buys the vinyl online.

E. Getting paid

When Alice signed up, she had no idea her music would be such a hit. But her inbox is full of messages, donations, and all her vinyl copies (which she hasn't even made yet) have already been sold.

Getting to work, she uploads the cover art design and asks her registrar to press the given number of vinyl records and FedEx them to her for signing. When she sends them back, the company redistributes them to the customers who purchased them, and the money is deposited into her account.

As for how to get her money, she has a couple options. The classic approach is to just give her direct deposit information and it's deposited via the ACH network (automated clearing house). Another is to give her PayPal information. She doesn't like either of those options, so she goes with a third option of just having a reloadable prepaid Visa card sent her way -- any money added to her account is instantly available for use at any merchant, or even to be withdrawn from any ATM.

II. Music Registrars

Core to this plan is the notion of "music registrars". Like DNS registrars (from which this draws inspiration), there are many and all provide compatible functionality while competing aggressively on price and value-added services. Musicians are free at any time to sign up with any number of registrars, or move tracks between registrars at a later date. But each track ultimately maps back to a single registrar that manages (at least) standardized metadata operations around that track. In essence, a registrar provides at least the following:

Account creation. Generally with a username/password, though optionally with more secure mechanisms (multi-factor authentication, PKI, etc).
FLAC storage. For every track managed, permanently store a master FLAC version.
Metadata hosting. For a given track, host its authoritative name, artist, album, etc. (essentially, ID3 tags) in one or more languages.

Though not strictly required, in general a registrar will offer a wide variety of additional services, including some subset of:

Transcoding and hosting. Generates a variety of file formats from the master FLAC, including MP3, Flash, etc. and hosts them on the web and P2P networks.
Payment gateway. Accepts payments from fans according to a variety of payment protocols and securely deposits into the artist's account.
Fan management. Forums, blogs, RSS feeds, and all the accouterments of web 2.0.
eCommerce. Anything ranging from a Yahoo Store-like checkout system to a CafePress-style product generation assistant.
Recommendation engines, playlist management, webcasting radio stations, promotion services, gig management, tour assistance, discount music equipment, etc. Basically, each registrar will attempt to provide artists with a complete one-stop-shop of all things they could possibly need to be a happy, successful musician.

A service exists that lets anybody look up the latest metadata on any track. (Typically you would just download the metadata straight from its registrar, but there would be a mechanism to determine who the registrar is -- if any -- for an unknown piece of music.) This service uses a combination of servers hosted by the registrars, as well as servers hosted by an independent organization that manages the registrars themselves. This organization is focused exclusively on the operation of enabling transfers of music between registrars, resolving disputes between registrars (and between users and registrars), and authoritatively stating which registrar is currently managing which track. This organization is funded through annual re-certification fees paid to the organization by registrars.

One operation that is particularly interesting is: how does this organization uniquely identify each track in order to guarantee that each is only being represented by a single registrar? The answer is by using waveform fingerprints. Each registrar holds onto the master FLAC for every song in its management. Upon adding a new song, it uploads a "fingerprint" of the song to the master organization, which then confirms no other song has the same signature. (If there is a conflict, the organization investigates and resolves it.) The organization will make the choice as to which signature function to use (and it needn't be perfect, it's just a tool in helping proactively identify and resolve conflicts), and it can at any point decide to use a new function by simply having all registrars re-fingerprint all FLACs with the new function. Again, the fingerprinting doesn't need to be (and won't be) perfect -- it's just a flag that triggers manual corrective action. The better the function, the less wasted work.
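
To illustrate the conflict check described above, here is a minimal Python sketch of the master database. It is only a sketch under stated assumptions: the fingerprint is treated as an opaque string (the actual signature function is left to the organization, as the text says), matching is exact rather than fuzzy, and the class, method, registrar, and track names are all made up for the example.

```python
class FingerprintRegistry:
    """Toy master database: fingerprint -> (registrar, track id)."""

    def __init__(self):
        self._owners = {}

    def register(self, fingerprint, registrar, track_id):
        """Record a claim; return the existing claim if there is a conflict."""
        existing = self._owners.get(fingerprint)
        if existing is not None and existing != (registrar, track_id):
            # Conflict: leave the record alone and flag it for manual review.
            return existing
        self._owners[fingerprint] = (registrar, track_id)
        return None

    def lookup(self, fingerprint):
        """Answer 'which registrar, if any, manages this track?'"""
        return self._owners.get(fingerprint)


registry = FingerprintRegistry()
first = registry.register("fp-0001", "RegistrarA", "track-1")
second = registry.register("fp-0001", "RegistrarB", "track-9")
print(first)   # None
print(second)  # ('RegistrarA', 'track-1')
```

A returned claim is just the flag that triggers the manual corrective action; nothing here tries to decide who the rightful owner is.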

III. MP3, ID3, and Metadata

In general practice, a musician would upload a track's master FLAC to her music registrar, and the registrar would generate a series of MP3s that have all the ID3 tags correctly set. The musician could then do whatever she liked with those MP3s -- email them, post them to P2P networks, post them on forums, burn them to CDs, etc -- and the ID3 tags would just be carried along with them.

However, the metadata can be indexed, distributed, and used in any way, even outside of MP3s -- the same information can be downloaded from the registrar at any time.

IV. Music Metadata and Player Support

In general, the metadata associated with a particular song can be any arbitrary name/value pair that the owner sees fit to associate with the song. There are no strict requirements or limitations on what sort of metadata must be associated. Similarly, players can choose to support all, none, or any subset of the metadata contained within a file. Any metadata not understood should be simply ignored. Some types of metadata include:

The standard ID3 tags: The obvious metadata includes artist name, song name, album, genre, and everything else you typically see in MP3 players. Example:
Name: Before Today
Artist: Everything but the Girl
Album: Walking Wounded
Track: 1
Unique song GUID: A globally unique identifier assigned by the registrar to this song. A given song would have the same GUID across all bitrates and encodings, for example, but different mixes of this song would have different GUIDs. In general, all MP3s with the same GUID should have the same waveform fingerprint; similarly, in general, no two tracks with different GUIDs should have the same waveform fingerprint. This GUID can be used by the player, website, or other service for whatever purpose it likes (it's handy to have a key by which to index the song). Example:
GUID: s8d9fgfud6s6d6f8ds8sys6s65
Metadata URL: A new tag would be an HTTP URL from which the latest authoritative metadata can always be downloaded in some standard format (I'd propose JSON, others might argue XML, but the specific choice is TBD). Any player or service can download the latest metadata for this track at any time, possibly rewriting the MP3 itself with the new information. Example:
MetadataURL: http://mytunes.com/meta/s8d9fgfud6s6d6f8ds8sys6s65

Payment protocols: A series of descriptions through which this artist can be automatically compensated according to some predetermined protocol. There will be many different payment protocols (and new ones all the time), some of which might include direct deposits into bank accounts, charging to phone bills, reverse charges to prepaid credit cards, PayPal transfers, eGold transfers, or whatever. It's likely each registrar would offer one or more of the most well-known payment protocols by default, but there is no restriction on somebody coming out with a new payment protocol and then associating it with their song. (More details on this below.) Example:
Payment: ach://<bankaccount>,<institution ID>
Payment: paypal://<email address>
Payment: http://mytunes.com/s8d9fgfud6s6d6f8ds8sys6s65
Payment: raise://amount=$1000&current=$950&by=2008/4/1
Hash: Though there's no strict requirement that a given song be distributed universally as a binary-identical MP3 for each given bitrate, it's reasonable to assume that this convergence would occur. Thus a valid piece of metadata would be the hash of a given encoding, which can be used by the player to verify that the file hasn't been corrupted. Example:
Hash: MP3/256/SHA1(3da3f0afc0d772825c43e310fe34eacf0dea204b)
Message of the day: A general message that the artist wants to associate with this song. Can be anything from a simple hello, a description of the song, a request for help, an advertisement, or anything. This could appear on the face of an MP3 player, or in a bubble on your desktop, or however the player feels fit to show it. Example:
MoTD: Only 1 copy left of my limited edition vinyl album, $50!
MoTD: Don't forget, I'm playing the Fillmore tonight at 8pm!
Lyrics: The lyrics of the song itself could be easily included in the song, or perhaps a URL where the lyrics can be downloaded.
Other songs by this artist / recommended by this artist: Links to other songs by this artist. A player could be configured to poll this at some frequency to be automatically notified when new music by an artist becomes available.
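
As an illustration of how a player might use the Hash tag from the list above, here is a minimal Python sketch. It assumes the tag value follows the "FORMAT/BITRATE/SHA1(hex digest)" shape of the example, and it assumes the hash covers whatever bytes the player passes in (the plan doesn't say whether the embedded tags themselves are included). The function names are made up.

```python
import hashlib
import re

# Matches values shaped like: MP3/256/SHA1(<40 hex characters>)
HASH_TAG = re.compile(
    r"^(?P<fmt>\w+)/(?P<bitrate>\d+)/SHA1\((?P<digest>[0-9a-fA-F]{40})\)$"
)


def parse_hash_tag(value):
    """Split a Hash tag value into (format, bitrate, lowercase digest)."""
    m = HASH_TAG.match(value.strip())
    if m is None:
        raise ValueError(f"unrecognized Hash tag: {value!r}")
    return m["fmt"], int(m["bitrate"]), m["digest"].lower()


def verify(file_bytes, hash_tag_value):
    """Return True if the bytes match the digest carried in the tag."""
    _, _, expected = parse_hash_tag(hash_tag_value)
    return hashlib.sha1(file_bytes).hexdigest() == expected


audio = b"not really an MP3, just example bytes"
tag = f"MP3/256/SHA1({hashlib.sha1(audio).hexdigest()})"
print(verify(audio, tag))         # True
print(verify(audio + b"!", tag))  # False
```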

The important thing to take away is that metadata can contain anything, and registrars merely record and host it -- it might or might not have any awareness of what the various name/value pairs actually mean. You needn't ask anybody's permission or get the approval of any standards body to create new metadata: just add it to your song, and any player that doesn't expect it will ignore it.
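
Here is a minimal Python sketch of the "ignore what you don't understand" rule, assuming tags arrive as simple "Name: value" lines like the examples above. The set of tags this pretend player understands, the "X-Hologram" tag, and the email address are all invented for the example; a real player would pick its own subset.

```python
from collections import defaultdict

# Tags this hypothetical player understands; everything else is ignored.
KNOWN_TAGS = {"Name", "Artist", "GUID", "MetadataURL", "Payment", "MoTD"}


def parse_tags(lines):
    """Split 'Name: value' lines into understood tags and ignored tag names."""
    known, ignored = defaultdict(list), []
    for line in lines:
        name, sep, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if not sep or not name:
            continue  # not a tag line at all
        if name in KNOWN_TAGS:
            known[name].append(value)  # tags such as Payment may repeat
        else:
            ignored.append(name)
    return dict(known), ignored


def payment_schemes(known):
    """Return the protocol name (the part before '://') of each Payment tag."""
    return [value.partition("://")[0] for value in known.get("Payment", [])]


lines = [
    "Name: Before Today",
    "Artist: Everything but the Girl",
    "Payment: paypal://alice@example.com",
    "Payment: raise://amount=$1000&current=$950&by=2008/4/1",
    "MetadataURL: http://mytunes.com/meta/s8d9fgfud6s6d6f8ds8sys6s65",
    "X-Hologram: 3d-cover-art",
]
known, ignored = parse_tags(lines)
print(payment_schemes(known))  # ['paypal', 'raise']
print(ignored)                 # ['X-Hologram']
```

A player that only speaks PayPal would simply pick the schemes it recognizes from that list and skip the rest.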

V. Artist Compensation via Player Integration

The basis of this system is to enable fans who want to compensate artists whenever and wherever the mood strikes them, in whatever amount, for whatever reason they come up with. This is enabled through integration with the players themselves, as this reduces the latency between hearing the song, making the decision to support the artist, and actually conducting the transaction.

The specific method of the integration is up to the designer of the player or service. But some examples that could be applied to any general MP3 player include a "thumbs-up button" where $0.50 is sent to the artist when pressed, or an "auto-tip" option where $0.05 is sent to the artist each time his song is played in its entirety, etc. All of this would be opt-in and configurable by the user in regards to the amount being paid and the frequency of payment.
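
As a rough sketch of what those two examples might look like inside a player, here is some Python. The amounts come from the paragraph above; the event names, the TipSettings fields, and the "played to the end" check are assumptions made for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class TipSettings:
    enabled: bool = False                 # opt-in: nothing is paid by default
    thumbs_up: Decimal = Decimal("0.50")  # user-adjustable
    auto_tip: Decimal = Decimal("0.05")   # user-adjustable


def tip_for_event(settings, event, played_seconds=0, track_seconds=0):
    """Return the amount owed to the artist for one player event."""
    if not settings.enabled:
        return Decimal("0")
    if event == "thumbs_up":
        return settings.thumbs_up
    if event == "play_finished" and track_seconds > 0 and played_seconds >= track_seconds:
        return settings.auto_tip  # only plays that ran to the end count
    return Decimal("0")


opted_in = TipSettings(enabled=True)
print(tip_for_event(opted_in, "thumbs_up"))                # 0.50
print(tip_for_event(opted_in, "play_finished", 200, 200))  # 0.05
print(tip_for_event(opted_in, "play_finished", 90, 200))   # 0
print(tip_for_event(TipSettings(), "thumbs_up"))           # 0
```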

Similarly, metadata and players could generally conform to standard ways of advertising merchandise and concert tickets related to the music. Depending on the player's form factor, it could even provide basic storefronts, one-click additions of tour dates to Google Calendars, or whatever type of interaction the device feels is appropriate to facilitate between artists and fans (perhaps even with a commission for the transaction paid to the device manufacturer). Ultimately, this is left up to the artists, fans, and player manufacturers to decide – the music registrar just manages the metadata without being aware of what it means or how it's used.

As for how the payment would be technically conducted, this would depend on the payment protocol and would likely be decided by a period of competition ultimately leading to a few widely supported "de facto" standards. For example, a phone-integrated player might use a payment protocol that puts song contributions straight onto your phone bill. An iPod might keep an internal count of what payouts are left to be done, and then upload the transactions to an iTunes-integrated micropayment engine when synchronized. WinAMP might accumulate transactions until they exceed some threshold where paying the artist directly via PayPal makes sense. And so on. Payment providers will compete vigorously for adoption by players and registrars alike, but the ultimate decision for who to pay, how, and how much rests with the listener.
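
The "accumulate until it's worth paying" approach can be sketched in a few lines of Python. This is an illustration only: the $5.00 threshold, the class name, and the pay callback are assumptions, and a real player would hand the payout to whichever payment protocol it supports.

```python
from collections import defaultdict
from decimal import Decimal


class PendingPayouts:
    """Accumulates small tips per artist and pays out in batches."""

    def __init__(self, threshold=Decimal("5.00")):
        self.threshold = threshold        # hypothetical cutoff
        self.owed = defaultdict(Decimal)  # artist -> unpaid total

    def add(self, artist, amount):
        self.owed[artist] += amount

    def flush(self, pay):
        """Call pay(artist, amount) for every balance that exceeds the threshold."""
        paid = {}
        for artist, amount in list(self.owed.items()):
            if amount > self.threshold:
                pay(artist, amount)
                paid[artist] = amount
                del self.owed[artist]
        return paid


ledger = PendingPayouts()
for _ in range(101):
    ledger.add("alice", Decimal("0.05"))  # 101 full plays at $0.05
ledger.add("bob", Decimal("0.50"))        # one thumbs-up

sent = []
paid = ledger.flush(lambda artist, amount: sent.append((artist, amount)))
print(paid)               # {'alice': Decimal('5.05')}
print(dict(ledger.owed))  # {'bob': Decimal('0.50')}
```

Batching like this keeps per-transaction fees from eating nickel-sized tips, which is presumably why a PayPal-style protocol would want a threshold at all.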

VI. Conclusion

In summary, the above proposal outlines a global framework where fans can voluntarily support artists through a competitive ecosystem of compatible service providers. The design separates functionality along clear layers of accountability and enables competition between multiple parties within the layers. The goal is to create a flexible, powerful system that enables a degree of innovation yet unseen in the music industry (at least, in the legal music industry). Much like the web and internet itself have transitioned from small, non-profit research projects into engines of global commerce, music -- both its creation and consumption -- has the capability to be a similarly innovative and powerful force. It just needs a framework that encourages it.

VII. FAQ

Here are the questions I've heard asked on this list before, and some quick answers to each:

What if nobody decides to pay?
The base assumption of the entire music industry is that music is valuable, and that fans actually do exist. If fans -- people who value art and wish to support their artists -- do not in fact exist, then this system won't create them.
What if no music players decide to support payment options?
The system works best if the payment protocols are implemented in the players themselves. In the meantime, until these are widespread, music registrars can offer web-based gateways that help fans support artists using today's technology.
What's to prevent me from uploading the Beatles as my own?
The standard solution to this problem is to have a "sunrise period" where prominent trademark and copyright owners are given early access to submit their own songs to the database. The expectation is each of the labels would run its own "private" registrar to manage its songs, and thus they would simply upload a complete list of fingerprints for all their songs to the registrar-management agency. In the event anybody uploads one of the label's songs to a different registrar, a flag would be raised when the fingerprint conflicts with the existing database, and would be resolved through manual action.
So... where's the big pool of money? Where's the sampling?
That's right, this system doesn't need to globally sample listening demographics in order to disperse a central pool of money according to some arbitrary measure of value. Rather, the money is never pooled -- it goes straight from the fan to the artists (via one of many competing payment gateways). The samples are never taken -- it's not really practical in the first place, and it's just not needed. And no arbitrary measure of value is selected -- it's left up to every fan to decide how much to give his artists.
What about piracy?
What about it? It already happens today in vast amounts, and no plan on the books even claims to have a chance of doing anything about it. Piracy *is* online music -- everything else is just an aberration. This plan seeks to capitalize on the real world as it exists today, tapping into the vast sums of money that fans currently aren't giving to music labels.
What about privacy?
This system gives exceptional privacy protections to all involved because there is no one entity that sees all activity. As such, it doesn't centrally aggregate sampling data, demographic profiling, historical traffic, personally identifiable information, or any of the problems that people are generally skittish about. The centermost entity of this plan is an organization that just has anonymous fingerprints of unnamed songs, and knows absolutely nothing about the songs themselves, the artists who make them, the users who listen to them, or the interactions in between.

X got paid $Y before, will he still be?
Possibly. Maybe he'll get paid more. Or maybe less. The same can be said about every other solution on the table.

But it's not fair! How will X get paid for Y?
This plan recognizes that every fan has a different idea of what is or is not fair, and fully empowers him to act upon that notion. Even the old system that is rapidly dying wasn't "fair", it's merely "what was". This plan does not attempt to blindly copy what was, nor invent some new notion of "fair" and mandate that all fans obey it under threat of force. So in this sense, it is arguably the most fair of all.
Hasn't this been tried before?
Everything's been tried before, and everything has failed – all plans have failed – due to lack of support and outright opposition by the “old guard” music industry. Virtually every innovative plan, both voluntary and compulsory, has been crippled through lawsuit, squeezed through impossible pricing, or bypassed through refusal to participate. There's very little in this plan that's new, and without action by the existing industry, this plan to create a feasible commercial alternative to raw, uncompensated piracy will fail just like all the others have and are failing. But this proposal isn't intended as a panacea. It's intended as a review of what's possible should the music industry decide to begin acting reasonably and in the interests of artists, fans, and society at large. There are signs that the industry is starting to have reason forced upon it by investors, artists, and even a gradual awakening of common sense after a decade of complete destruction of shareholder value. One day, they will either become irrelevant or will sign up to one of the many, many plans proposed and nurtured over the years. Maybe they'll choose this one. Maybe not. The point of you reading this is to be aware that the vision presented herein is in fact possible, and to either encourage the industry to adopt this proposal, or to encourage Congress to strip the industry of its abused and overzealous tools of copyright enforcement such that we can continue on without them. How many more decades are we willing to wait?

So that's all well and good, but seriously... Where's the sampling?
Seriously, it's not needed. Take it in reverse.

Q: Why sample?

A: Well, we know how to at least try to sample music fingerprints transferred over the backbone, and we think that samples are somehow related to how often songs are listened to, so by sampling we can get a sense of which songs are most often listened to.

Q: Why do we care how often songs are listened to?

A: Well, we're assuming that the number of times a song is listened to is representative of how valuable it is to fans.

Q: Why do we care if a song is valuable to fans?

A: Because artists must be paid in proportion to value, obviously!

Q: Paid by whom?

A: Well... by fans, I guess... obviously.

Q: Why don't fans pay artists directly?

A: Well they *were*, through CD sales, until piracy ruined everything.

Q: I thought CD sales largely didn't go to artists.

A: Well... if you want to get *technical*, no, but they sorta "trickled down to artists"... It's complicated.

Q: Ok, again, why don't they pay artists directly?

A: Because that's impossible! What, are they supposed to track down every artist in their playlist and give them a nickel each time they play the song?

Q: Sure, why not?

A: Because... because you just can't. It's complicated. Fans can't be trusted to support their artists directly. They need help.

Q: Help from whom?

A: Well, help from me, of course. And my friends. Only we can get artists compensated.

Q: But I thought your CD sales largely didn't go to artists?

A: Yes they do! They trickle!

Q: So let me get this straight: the goal is to help artists get paid by fans in proportion to how much fans like them. But fans can't be trusted to do it directly, and instead artists need the help of organizations that historically take the lion's share of the profit and leave a trickle for the artists themselves? And the best way to do this is to force everyone to pay you a bunch of money that you distribute based on relative estimated value to fans calculated by sampling backbone traffic for a small set of music fingerprints, extrapolating global traffic, inferring total music listens from that, and then converting that sampled/extrapolated/inferred number into "value to fans" with an arbitrary formula selected by... by whom again?

A: By me.

Q: Got it.

A: That's right! Now you're getting it.

Q: And why not just let fans give artists money directly?

A: You just... you just can't! And... it's different, and therefore scary. Artists talking to fans? Fans talking to artists? What an absurd thought. Fans can't be trusted! Artists don't want to talk to fans! There needs to be a middleman. Lots and lots of middlemen. And formulas! And sampling! And most importantly -- a huge, enormous pool of money. That I control. Trust the trickle. It worked for your grandpa. Why can't it work for you?

NAT penetration algorithm from iGlance, circa 2005

How's this for a blast from the past. I posted this to the iGlance
Yahoo group. You can read the original here, which has a couple
follow-up replies:

http://tech.groups.yahoo.com/group/iglance/message/52

But here's the text itself for your reading pleasure:
-----------------------------------------------------------------------
Hi, thanks for writing. NAT penetration is a very tricky subject, so
let me first give an overview of what the obstacles are, and then I'll
explain my approach for circumventing them.

(Note, the 'STUN' protocol I'm using is home-brewed -- it's not truly
compliant with RFC3489, for reasons I can get into if you care to hear.
However, it accomplishes the same thing.)

First, assume the following network:

+--------+      +-----+      +--------+
| Client | ---> | NAT | ---> | Server |
+--------+      +-----+      +--------+

The client is connected to the NAT, and the NAT is connected (via the
internet) to the server. The client is generally on some LAN, and thus
has a "private" IP address. However, the NAT is generally on the
internet, and thus has a "public" internet IP address. Thus while the
client cannot send packets directly to the server (because the client
isn't on the internet), the client can send it "through" the NAT.

Now, UDP packets indicate from which address they originated. But which
address does the packet appear to be from when the server receives it:
the client, or the NAT? The answer is the NAT -- NAT stands for
"Network Address Translator" because it translates "private" addresses
(such as on a LAN) to "public" addresses (such as on the internet).

So the client sends a packet from the LAN address (call it privateIP)
but the server thinks it's coming from an internet address (call it
publicIP) due to the NAT's translation. So long as the client is
simply sending to the server, there's no problem -- if the
server is only receiving, it doesn't care what address the packet comes
from. But the moment the server wants to reply, then things get tricky.

In the easy case when a server is replying to a client request, the
server just sends back to the address the request packet appeared to
come from (ie, the publicIP). And when the NAT receives it, it forwards
it back to the client. In this way, when a client establishes a
connection with a server, the client and server can talk back and forth
without trouble.

However, the reverse is not so easy. Now, when the client initiates a
connection with the server, it 'punches a hole' through the NAT. This
hole (also called a 'mapping') is what the server uses to talk back with
the client. However, if the client doesn't punch the hole to the server
first, the server can't contact the client. Indeed, if the server sends
a packet to 'publicIP' before the client punches the hole through the
NAT, the NAT will just silently disregard the message and it'll never
arrive.
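
To make the mapping behavior concrete, here is a minimal sketch of a toy
NAT in Python. The ToyNat class, its port-allocation scheme, and the
example addresses are all illustrative assumptions (none of this is
iGlance code), and it models only a NAT that admits packets from
endpoints the client has already sent to.

```python
class ToyNat:
    """Toy model of a NAT: rewrites source addresses, drops unsolicited inbound."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.port_for = {}      # private (ip, port) -> public port
        self.client_for = {}    # public port -> private (ip, port)
        self.holes = set()      # (public port, remote (ip, port)) pairs

    def outbound(self, private_addr, remote_addr):
        """Client sends to remote: punch a hole, return the public source address."""
        if private_addr not in self.port_for:
            self.port_for[private_addr] = self.next_port
            self.client_for[self.next_port] = private_addr
            self.next_port += 1
        public_port = self.port_for[private_addr]
        self.holes.add((public_port, remote_addr))
        return (self.public_ip, public_port)

    def inbound(self, remote_addr, public_port):
        """Remote sends to the public address: return the private target, or None if dropped."""
        if (public_port, remote_addr) not in self.holes:
            return None
        return self.client_for[public_port]


nat = ToyNat("203.0.113.7")             # hypothetical publicIP
client = ("192.168.1.10", 5000)         # hypothetical privateIP:port
server = ("198.51.100.1", 3478)         # hypothetical server address

# Before the client sends anything, the server's packet is silently dropped.
assert nat.inbound(server, 40000) is None

# The client's first packet punches the hole; the server sees the public address.
seen_by_server = nat.outbound(client, server)
assert seen_by_server == ("203.0.113.7", 40000)

# Now the server's reply to that public address is forwarded to the client.
assert nat.inbound(server, seen_by_server[1]) == client
```

Real NATs also expire mappings after a period of silence, which this
sketch ignores.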

Thus a NAT is a bit like a one-way mirror: a client behind a NAT can
contact servers without restriction, but servers can't do the same.
Many people like this behavior for security reasons. But obviously, in
a P2P network this is less desirable because if you're behind a NAT, a
remote client can't contact you until you contact it. But if it's also
behind a NAT, you can't contact it until it contacts you. A seemingly
intractable problem.

To solve this problem, iGlance uses a directory server that acts as an
intermediary to help clients behind NATs and firewalls connect directly.
The process works as follows:

1) Client A connects to the global server and registers its IP
2) Client B connects to the global server and asks for the IP for A
3) The server informs A that B is trying to contact it
4) Client A begins trying to contact B
5) Client B begins trying to contact A
6) Eventually a direct connection is established

As mentioned before, whether A tries to contact B or B tries to contact
A, both will fail independently. But when they both try to contact each
other simultaneously, they both "punch holes" through their NATs and
firewalls, and thus both let the other's communications through. This
technique of simultaneous hole punching is the essence of NAT-to-NAT
traversal.
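
The effect of both sides trying at once can be sketched with a small
simulation. This is a toy model under stated assumptions, not real
networking code: FilteringNat and deliver are hypothetical names, peers
are identified by plain strings, and each NAT simply remembers which
remote peers its client has sent to.

```python
class FilteringNat:
    """Toy NAT that admits inbound packets only from peers its client has sent to."""

    def __init__(self):
        self.holes = set()

    def send(self, remote):
        self.holes.add(remote)          # outbound traffic punches a hole

    def admits(self, remote):
        return remote in self.holes


def deliver(sender, sender_nat, receiver, receiver_nat):
    """Send one packet from sender to receiver; return True if it gets through."""
    sender_nat.send(receiver)               # leaving our own NAT opens a hole for replies
    return receiver_nat.admits(sender)      # the far NAT drops it unless a hole exists


nat_a, nat_b = FilteringNat(), FilteringNat()

# B tries alone: A's NAT has no hole for B, so the packet is dropped...
first_try = deliver("B", nat_b, "A", nat_a)
# ...but the attempt opened a hole in B's own NAT, so A's packet now arrives.
a_to_b = deliver("A", nat_a, "B", nat_b)
# A's send opened A's NAT as well, so B's retry succeeds too.
b_retry = deliver("B", nat_b, "A", nat_a)

print(first_try, a_to_b, b_retry)   # False True True
```

The first packet in each direction is expected to be lost; it exists
only to open the sender's own NAT, which is why both peers keep retrying
until something gets through.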

However, recall that each client typically only knows its "private" IP
address -- ie, the IP address on its private LAN. But just as the
server sees only a client's "public" IP address, so do peers only see
other peers' public IPs. Thus before client A can attempt to contact
client B, A needs to learn B's public IP.

This process of a client determining whether or not it is behind a NAT
(and if so, finding its public IP address) is called the 'STUN' process
-- after the protocol defined in IETF RFC3489. (iGlance doesn't use this
protocol, but is heavily influenced by it.) The precise technique
iGlance uses is as follows:

1) STUN server is assigned 3 IP addresses -- STUN0-2

2) Client sends STUN request to STUN0

3) Client punches hole to STUN1

4) The STUN server attempts to contact the client *from* STUN0-2

Thus the STUN server sends *three* responses from *three* different
IP:port combinations, to the *same* IP:port from which the client
request originated. Depending on the NAT and firewall in place, the
client might successfully receive up to 3 responses, each from a
different IP:port on the STUN server. Based on which responses arrive,
we can guess which type of NAT is between the client and the STUN
server. This is used to set the 'Connection_Class' as follows:

FIREWALL: (0 responses)
Something is blocking either all outbound or inbound UDP traffic.

SYMMETRIC: (1 response from STUN0)
The client can receive UDP only from the exact IP it sends to.

RESTRICTED: (2 responses, from STUN0 and STUN1)
The client can receive UDP only from remote IP:ports for which holes
have explicitly been punched.

UNRESTRICTED: (3 responses)
Once a hole is punched through the NAT, any remote IP:port can use it to
contact the client.

PUBLIC: (3 responses)
The client is not behind a NAT and thus can receive from any IP:port.

Furthermore, the server returns in the STUN response the apparent
IP:port from which the client's request appeared to originate. Recall,
the client sends from its 'private' address, while the server receives
from the client's 'public' address. If these are different, we know a
NAT must be in place. But if they are the same, then we can assume
there is no NAT in place and thus the client is connected to the
internet directly. (This is how iGlance distinguishes between the
UNRESTRICTED and PUBLIC states.)
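
The classification table above can be sketched as a single function. This
is an illustration in Python, not the logic from GDispatchService.cpp:
the function name, the argument shapes, and the "UNKNOWN" fallback for
combinations the table doesn't list are all assumptions.

```python
def classify(responses, private_addr, apparent_addr):
    """Map STUN results onto a Connection_Class name.

    responses:     set of names ("STUN0", "STUN1", "STUN2") whose replies arrived
    private_addr:  the (ip, port) the client sent from
    apparent_addr: the (ip, port) the server reported seeing, or None if no reply
    """
    if not responses:
        return "FIREWALL"
    if responses == {"STUN0"}:
        return "SYMMETRIC"
    if responses == {"STUN0", "STUN1"}:
        return "RESTRICTED"
    if responses == {"STUN0", "STUN1", "STUN2"}:
        # Same address on both sides means no translation happened.
        return "PUBLIC" if private_addr == apparent_addr else "UNRESTRICTED"
    return "UNKNOWN"    # assumed fallback; the table doesn't cover this case


# Hypothetical addresses: the server saw a different address than the client used.
print(classify({"STUN0", "STUN1", "STUN2"},
               ("192.168.1.10", 5000), ("203.0.113.7", 40000)))   # UNRESTRICTED
```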

(All this logic is contained in the file GDispatchService.cpp. The STUN
request is sent in the function GDispatchService::_requestStun( ), and
the responses are processed by GDispatchService::_onInput( ) in the
GDSS_STUN state.)

So clients classed as PUBLIC, UNRESTRICTED, or RESTRICTED know they can
receive UDP directly from another peer. And clients behind SYMMETRIC
NATs or a UDP-blocking FIREWALL know they can't (they must establish a
'TURN' connection with the server, which simply listens for UDP traffic
and sends it back over HTTP). Armed with this information, clients can
ensure they are able to be contacted by remote peers, whether behind a
NAT or FIREWALL, or directly on the internet.
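
As a small sketch of that last decision, here is how a client might pick
its transport from its Connection_Class. The function name and the
"direct-udp" / "turn-relay" labels are made up for illustration; only
the grouping of classes comes from the text above.

```python
DIRECT_CLASSES = {"PUBLIC", "UNRESTRICTED", "RESTRICTED"}
RELAY_CLASSES = {"SYMMETRIC", "FIREWALL"}


def choose_transport(connection_class):
    """Return how remote peers should reach a client of the given class."""
    if connection_class in DIRECT_CLASSES:
        return "direct-udp"     # peers can send UDP straight to the client
    if connection_class in RELAY_CLASSES:
        return "turn-relay"     # the server receives UDP and relays it over HTTP
    raise ValueError("unknown Connection_Class: " + connection_class)


print(choose_transport("RESTRICTED"))   # direct-udp
print(choose_transport("SYMMETRIC"))    # turn-relay
```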

Does this answer your question?

-david

Why hasn't anybody built FreePandora yet?

Anybody know anything about this? Care to take any guesses?

http://torrentfreak.com/the-music-bay-pirate-bay-110122/

TPB has never really been a coding organization, so my bet is it's not
built on some amazing new P2P service, but rather just a retooled
version of the TPB website that is optimized for music content. In other
words, the basic foundation will still be a standard Torrent client.

I'm still amazed nobody has built a really good pirate music outfit -- a
*true* Pandora whose box when opened can't ever be closed. Music as a
product only has two real features:

1) Play this song (or list of songs) right now
- Search MusicBrainz
- Find the most popular album containing the song
- Search TPB for the highest seeded version of that album
- Download it with libtorrent
- Fish out the song you actually wanted
- Play
2) Play songs around this theme until I tell you to stop
- Given an artist name
- Look for similar artists on MusicBrainz
- Assemble a big playlist
- Download albums one at a time like (1)
- Play a random mix of whatever subset is available
- Keep expanding that subset

It's the simplest possible product to conceive. All the hard problems
have already been solved: the content is readily available, the metadata
is already there. All the pieces are in place and are just waiting for
someone to assemble it into a user-friendly package. The only "work"
involved is:

1) Build a UI with three input elements:
. A search box
. A "Play exact" button
. A "Play like" button

2) When "Play exact" is pressed it goes out, downloads, and plays that
exact song, artist, album, etc. Furthermore, if it's already downloaded
it just plays from its cache.

3) When "Play like" is pressed, it instead goes and finds a range of
songs/artists/albums like it, and plays those instead.

The only challenge is dealing with mapping the fuzzy input from the user
into MusicBrainz, and then mapping its output into ThePirateBay, and
then figuring out which song downloaded is the one you want. But again,
that's a solved problem. I don't personally know the best solution, but
if you convert everything into Soundex codes and just match based on
how many codes two titles have in common, I bet you'd get pretty close.
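
Here's a minimal sketch of that matching idea in Python. It implements
standard American Soundex by hand (Python's standard library doesn't
ship one) and scores two titles by how many word codes they share. The
function names and the sample titles are just for illustration, and a
real matcher would want something smarter for ties and short words.

```python
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character American Soundex code for one word ('' if no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        digit = CODES.get(letter, "")
        if digit and digit != previous:
            code += digit
        if letter not in "HW":      # H and W do not separate repeated codes
            previous = digit
    return (code + "000")[:4]


def phonetic_score(query, candidate):
    """Count the Soundex codes two titles share (word order ignored)."""
    query_codes = {soundex(w) for w in query.split()} - {""}
    candidate_codes = {soundex(w) for w in candidate.split()} - {""}
    return len(query_codes & candidate_codes)


def best_match(query, candidates):
    """Return the candidate sharing the most codes with the query (first wins ties)."""
    return max(candidates, key=lambda c: phonetic_score(query, c))


titles = ["Yellow Submarine", "Hello, Goodbye", "Hey Jude"]
print(best_match("helo goodby", titles))    # Hello, Goodbye
```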

Anybody on this list could build it. Seriously, it is a one-person job,
and there are probably dozens of people on the list with the time,
energy, and inclination to do it. Why study some esoteric P2P mesh
problem that odds are won't ever matter, when in the same (or less) time
you could build a world-shaking music service, single-handedly? You
could be *the guy* to take down the music industry.

Especially if you're in a non-US jurisdiction, this seems a no-brainer.

Anyway, maybe ThePirateBay will do this now, but I doubt it. I expect
we'll need to wait for some nameless individual on the other side of the
world to step up. It really, truly, only takes one person to change the
world, forever.

-david

More mesh thoughts

I think the key question, as has always been the question when it comes
to P2P networks, is usability. Skype "just worked" so well that it took
off like mad. Same for the major pirate networks (though even those are
surprisingly unwieldy). A wireless mesh will only take off if it's
absolutely dead simple. In fact... I could see it leveraging some of
SocialVPN and the P2P social network concepts. Imagine:

1) You buy this USB device from WalMart, and plug it in for the first time.

2) An app launches, whether you're on Mac, Windows, Linux, iPad, whatever.

3) It asks "Welcome to the Mesh! Do you have an account, or do you want
to create a new one?" You choose "Create a new one, named Quinthar"; it
generates a huge public/private key pair.

4) It asks "There are 23 nodes in range named Alice, Bob, Cathy, etc.
Which are your friends?" You choose "Alice".

5) It shows you Alice's public profile, which is available to anyone.
It's up to Alice to decide how much to show. It asks "What password
would you like to use to friend Alice?" You say "Wonderland"

6) On Alice's computer it says "Quinthar would like to be friends, what
is the password?" She asks you, then types it in "Wonderland". It says
"Great, now you and Alice are friends, and will stay connected so long
as you are directly in range, share an intermediate mesh node, or are
both connected to the internet." [Eg, it works just like SocialVPN and
if it can't directly connect, establishes a NAT-penetrated connection
over the internet. After the initial setup, you never need to think
about it again.]

6) Once connected, you can see Alice's "Friend" profile, which is shown
to anybody who is friends with Alice. It might have additional
information, such as online status, more photos and such, as she
chooses. She sees the same for you.

7) It says "Now that you're friends with Alice, what do you want to do?"
You say "Share these songs, photos, and videos, but not these other
ones." [Perhaps by folder.] When she looks at your profile, she sees
all these items. She can set offline preferences to optionally sync
your data to her computer for access if you get separated. You might
have a variety of access levels that you choose to share or not with
different people.

8) It says "Great, it's shared with Alice. Do you want to share with
any of Alice's friends -- including those you don't know?" You directly
set how many levels of indirection you'll allow, perhaps just defaulting
to 3 (Alice, Alice's friends, Alice's friends' friends).
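
The "levels of indirection" setting amounts to a depth-limited walk of
the friend graph. Here is a rough sketch in Python: the function name,
the dictionary-of-lists graph, and everyone other than Quinthar and
Alice are made up for illustration.

```python
from collections import deque


def visible_to(owner, friends, max_levels=3):
    """Return {person: level} for everyone within max_levels friend hops of owner."""
    levels = {}
    seen = {owner}
    queue = deque([(owner, 0)])
    while queue:
        person, level = queue.popleft()
        if level == max_levels:
            continue                    # don't look past the allowed depth
        for friend in friends.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                levels[friend] = level + 1
                queue.append((friend, level + 1))
    return levels


# Hypothetical friend graph: each person maps to their direct friends.
friends = {
    "Quinthar": ["Alice"],
    "Alice": ["Quinthar", "Bob"],
    "Bob": ["Alice", "Cathy"],
    "Cathy": ["Bob", "Dave"],
    "Dave": ["Cathy"],
}

print(visible_to("Quinthar", friends))
# {'Alice': 1, 'Bob': 2, 'Cathy': 3}
```

With the default of 3, Dave (four hops away) sees nothing; raising
max_levels to 4 would include him.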

... fast forward until you have many connections, some of which are
physically in range, others are connected via a VPN over the internet,
others are offline ...

9) You have a vast interface to browse the photos, videos, songs,
updates, profile information, and basically a lot of stuff about
everybody around you. The USB dongle is used to install on a new
computer, and connect directly without the internet, but even without
the dongle an installed computer can continue to participate in the mesh
via the internet.

10) If any particular computer gets lost or compromised, you can
unfriend them (or remove just that device) immediately. Furthermore,
your node is configured to monitor unfriending to automatically
"quarantine" any node that has become suspect. (For example, one of my
friends lost his iPhone; he'd remove that device from his profile and my
devices would stop talking with it, without any involvement from me.)

11) And because your USB dongle is owned by you, it can store data such
as your private key so you can easily move it between computers -- or
even quickly access your mesh using someone else's computer, without
leaving any trace on the computer itself.

Anyway, ultimately I think mesh technology will be far less important
than mesh *usability*. It needs to be packaged up with really simple,
excellent software that enables the most basic peer activities --
especially file transfer -- to be done in a totally seamless way.

-david

Strains within China's leadership

This is the sort of thing that makes me wonder if China can maintain its
current path:

http://www.nytimes.com/2011/01/17/world/asia/17china.html

Basically: does authoritarianism scale, or inevitably succumb to
internal power struggles? This article would suggest some of the
latter, that the Chinese leadership is losing control as the military,
finance, and industrial sectors act with increasing autonomy --
sometimes in defiance of the central leaders' will.

It's interesting how the conversations about the Chinese rise to power
rarely discuss the possibility of true internal dissent, or desperate
actions to contain it. That's actually what frightens me most: there's
no legitimate reason for the Chinese to go to war with the US or its
neighbors, but war is a fantastic distraction from internal dissent.

Ultimately I continue to bet on the US because out of all nations, I
think we're the best at managing internal discontent, which is ultimately
the greatest threat to any power -- imperial or otherwise.

-david

How Piracy Will Hyperlocalize with Mesh Networks

While I don't think a pirate mesh is on the near horizon, I do think
it's entirely feasible -- and the easiest way to accelerate its arrival
will be to inconvenience piracy on the internet. Regardless, as a fun
thought exercise I imagine it'll happen like this:

1) Somebody packages up a software radio onto a convenient USB stick:
http://en.wikipedia.org/wiki/Software-defined_radio

2) Because it's just software, the same hardware can be used for
essentially any wireless activity. First it'll probably just be a
universal wireless broadband card, with modules to connect to every
cellular network in the world. It'll start out targeted to travelers.
It'll also include GPS, AM/FM, wifi, and pretty much everything you
could want because, well, why not. It's software; it can do it all.

3) It'll be instantly "unlocked" by the open source community, assuming
it's not in fact built by said community in the first place:
http://gnuradio.org/redmine/wiki/gnuradio

4) A whole new generation of wireless protocol research will be
unleashed by universities and individuals alike, with a clear focus on
mesh technology merely because that's the new hotness.

5) To start, mesh software will just be run by a few crazy hackers as a
background process while using their universal wireless cards, which
they use because it's easier than dealing with wifi. Node density will
be low and largely limited to toy apps like chat, single file transfer,
etc. More in the vein of being a proof of concept.

6) There will be some place where a critical mass of node density
occurs: probably a university with a combination of a strong engineering
school and an overzealous network administrator. It'll always be possible
for one person to get a torrent off the real internet, but then the rest
of the dorm will get it via the mesh.

7) The next semester, students who don't really have any idea about the
mesh or have any interest in a universal wireless device will realize
"if I just buy this thing and let some dude install software on my
laptop, I can get a ton of great content without risk of detection by my
university." The device's main use will gradually drift away from its
advertised and intended purpose as it is repurposed by pirates.

8) This will slowly, quietly grow. The hardware manufacturers will
initially be totally unaware, but gradually adopt a policy of "don't ask
don't tell". More and more students will sign up. As people go home,
students who got their content from the device won't even know how to
share it without the device; they'll convince their friends to buy it
just so they can easily share the content while home on break.

9) The software will get better and better. Torrent apps will
auto-detect if the device is there, and will try to pull from it first.
The torrent protocol itself will adjust to pull from nearby mesh
neighbors. Gradually, piracy will go hyperlocal.

10) The hardware will get better and better. All laptops will come with
this built in, because why have a dedicated wifi card (or Sprint card)
when you can have a single universal card? Why have a card at all when
it can be done as part of the main chip? After all, it's just software
-- CPUs run software too.

11) At some point Apple really starts to take notice. Apple products
will recognize a "neighborhood" network that operates across the mesh --
like Bonjour on steroids. It advertises its security and speed
advantages over "the internet", which gradually becomes used exclusively
for what it's good at -- moving data over incredible distances under
the watchful eye of the state -- versus the mesh, which is for small
distances with anonymity.

Something like the above *will* happen. It's inevitable. It's not even
that creative. And it'll probably happen sooner than we expect. Sound
unlikely? Remember those researchers that cracked GSM at the CCC 2
weeks ago? They did it with a "Universal Software Radio Peripheral":

http://www.zdnet.co.uk/blogs/security-bullet-in-10000166/gsm-crack-inexpensive-says-researcher-10021405/

You can buy one at http://www.ettus.com/ -- and yes, it plugs in via USB.

-david

100M monthly users of BitTorrent/uTorrent

Saw this today:

http://www.dmwmedia.com/news/2011/01/03/bittorrent-filesharing-apps-hit-100-million-monthly-users

That's 100M people who fire up BitTorrent/uTorrent to download
*something*. And that doesn't include all the other torrent
applications out there. Or IRC/Newsgroup piracy. Or sneakernets.

For comparison, Hulu has about 30M monthly users, and Netflix is up to
16.9M users last quarter (not sure about monthly):

http://mashable.com/2010/11/10/hulu-stats/
http://www.investmentu.com/2010/December/netflix-creates-multibillion-dollar-industry.html

Also interesting to check out the growth: Vuze.com up 81% over the last
year. uTorrent up 106%. Bittorrent.com up 144%. As opposed to
hulu.com up 61%. Netflix up 8.23%.

http://siteanalytics.compete.com/bittorrent.com+utorrent.com+vuze.com+azureus.sourceforge.net/

So, piracy still bigger and still growing faster. As has been pretty
much consistently the case since the dawn of the internet. I'd wager
the trend will hold for another decade or so, if not forever.

-david

The increasing price of progress (and how to get a discount)

Was reading this fascinating article here:

http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer

And it occurred to me that science and math are typically viewed as
having the same goals -- to prove theories about the world around us.
But perhaps they should be viewed as opposites? After all, there's no
real way to prove *anything* using the scientific method. But you *can*
disprove something. Why isn't that the focus? Perhaps math should be
about proof, and science should be about *disproof*?

Because proof (rather than disproof) is the focus of most scientific
research today, we end up with a ton of research that rides the thin
and ambiguous line between statistical significance and insignificance.
Indeed, the above article suggests that most scientific "conclusions"
are irreproducible nonsense. Literal nonsense. That was created at
enormous cost to society.

Now you might say "well that's just the price of progress". And that's
probably true. But we should be focused on driving that price *down*,
when in fact it seems to me that we're driving the price of progress up
through irresponsible public policy.

There are a lot of reasons why this could be the case. Probably the
most direct contributor is the largely well-intentioned but ineffectual
policy of promoting amazingly expensive formal education to people who
don't want or need it. This fills our research labs and journals with
nonsense (nonscience?) studies done in the pointless pursuit of
meaningless, debt-inducing degrees. But I think a more damaging and
insidious reason is, yes, intellectual property.

I think the reason I'm so opposed to copyright and patent** is that
those policies actually damage the world. Meaning, they irresponsibly
encourage "quantity" over "quality", creating more options of lower
quality when fewer high-quality options would have been
faster/cheaper/better.

** Trademark has a completely different aim: helping consumers correctly
differentiate between similar alternatives. Trademark is primarily
aimed at increasing "quality".

Here you might say "but who will innovate without IP protections?" And
I guess I'd say "those who need to". They say "necessity is the mother
of invention", not patent protection. In fact, I wonder if IP has done
anything *at all* to improve the quality of innovation (or, rather, the
quantity of high-quality innovation) on a "per-capita" basis.

Sure, we have more innovation today than at any point in human history.
But we also have more *people*. Furthermore, the rate of new people
coming into the world is higher than at most points in history. Even if
innovation-per-person is constant, today will be more innovative than
the past, in aggregate. So even if IP is a total failure and does
absolutely nothing of value, today will still seem very innovative (and
those policies still seem a success).
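
To put rough numbers on that, here's a back-of-the-envelope sketch in
Python. Every figure in it is an assumption for illustration: the
populations are round numbers (about 1 billion then, about 7 billion
now), and the per-capita rate is simply made up, since the point is only
that it cancels out.

```python
def aggregate_innovation(population_millions, per_million_rate):
    """Total 'innovations per year' if every million people produce the same amount."""
    return population_millions * per_million_rate


RATE = 5    # hypothetical innovations per million people per year, held constant

past = aggregate_innovation(1000, RATE)     # ~1 billion people (round figure)
today = aggregate_innovation(7000, RATE)    # ~7 billion people (round figure)

print(past, today, today / past)    # 5000 35000 7.0
```

Whatever rate you pick, the ratio is just the ratio of populations, so
"more innovation than ever" tells you nothing by itself about whether IP
policy helped.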

But if there were no patents, does anybody honestly think anything we
have around us wouldn't exist? Would we have not bothered with steam
power, railroads, electricity, phones, cars, rockets, satellites, or any
of that? Would we have never noticed any of the major medical
breakthroughs? I doubt it. I think we'd have pretty much everything we
do now. We'd have them because we *need* them to compete between
nations -- in an arena where IP protections don't really exist.

Accordingly, I see no evidence whatsoever that IP works. I don't know
of any major series of breakthroughs that simply wouldn't have happened
in roughly the same order, at roughly the same pace, without IP. At best
it seems just a big nuisance. But my real fear is it's more than just a
nuisance. Rather, it's an active damping function on human innovation.

I fear the primary effect of patent today is to introduce arbitrary
"waiting periods" before old inventions can be compounded into new ones.
It introduces enormous fear, uncertainty, and doubt into the
inventor's mind -- a sense of "why should I even bother doing this thing
that would be awesome when I'll probably just be sued into personal
bankruptcy out of the blue by some nameless corporation?" It's not
focused on quality *or* quantity, but on creating an unnecessary
tollbooth on innovation and then charging society by the mile, with the
proceeds not even going to the innovators responsible.

Similarly, I fear the whole design of copyright is maliciously misguided:
focused on creating "the next big thing" rather than maximizing the
accessibility and influence of the untold millions of "past big things".
It holds the output of all past artists hostage -- most of whom are
dead or who are lucky to make a single thing of widespread appeal in
their entire lives -- disingenuously invoking the plight of nameless
future artists to justify another unnecessary tollbooth, the vast
majority of whose proceeds don't go to artists.

This isn't a call for communism -- IP shouldn't be shared out of some
moral responsibility. And it's not a call for socialism; the government
needn't seize private invention for the public gain. It's saying IP is
a *detriment* to competition, the most important foundation of
capitalism. It's saying private inventors (and the businesses who
employ them, and the investors who fund them) would all be better off
without IP.

The world doesn't need IP. Innovators and artists don't need IP. It
was created by those who don't innovate, to control, contain, and profit
from those who do. It's just a raw deal for the world. And it needs to
be stopped.