Dan Heller's Photography Business Blog
Industry analysis from www.danheller.com

The photography world -- the business, the culture, the art, the politics, the technology.


Friday, March 02, 2012

Market Efficiencies and Stock Photo Pricing

In my last blog post, Selling Stock: It's About Search Rank, Not Price, I argued that the price variability in the stock photo industry can be exploited by those who garner high search rankings. The rationale is that the direct and indirect cost (overhead) of finding an image so far exceeds typical license fees that photo buyers are more indifferent to those fees than sellers believe. Thus, well-ranked photo sites can command higher license fees simply because they have first access to the buyer.

In fact, well-ranked photo websites are undermining their own profitability by lowering prices unnecessarily, mostly because they are following their perceived competitors, not because the customer is demanding lower prices. Their rationale would follow traditional economic theory under most market conditions, but therein lies the exception. The photo industry does not represent "normal economic conditions." Indeed, the photo industry represents a classic case of an "inefficient market."

Let me explain by starting with the definition of an "efficient market." It can be summarized as a market of buyers and sellers engaging under conditions where all information is available to parties on both sides of a transaction. (See this wikipedia link for extended definitions, examples, and citations.)

Examples of efficient markets include exchange-traded commodities like oil and orange juice, as well as markets like automobiles where pricing information is broadly published. Here, producers make their wares generally available, and market-makers trade on this information. It is exceedingly difficult (if not impossible) to hold inventory that the market is unaware of, or to purchase commodities without the broader market's awareness. These are the conditions that define an "efficient market."

While there will always be price volatility, it is almost entirely governed by predictions of how supply and demand might be affected by external events. The weather affects the price of orange juice; war and instability affect the price of oil; and a litany of factors affect the auto industry.

When it comes to image licensing, most buyers and sellers do not have that much information about the "global" market of buyers or sellers, let alone access to conditions that can affect future supply and demand. This is "market inefficiency," and it produces price inconsistencies, precisely as predicted by economists. Prices therefore vary from high to low across the spectrum, depending on the perception of the buyers in any given time and place, because those buyers have limited and incomplete information about the global supply chain.

This also explains why people objected to my proposition from my prior article. They do not have access to "all information," and worse, they are unaware that their worldview is limited. That is, most pro photographers are under the illusion that the entire market of stock photos is monopolized by a small number of stock agencies.

Ironically, the other markets (non-agency buyers/sellers) don't see the other side either. These discrete and separate markets will, by definition, find different prices than buyers in other markets. Stock agencies will view one another as competitors and lower their prices, whereas websites that are unaware of stock agencies (or don't attempt to compete with them) will command higher prices.

To optimize prices and create an efficient market, the following would have to take place:

  • Stock agencies would have to expand to cover a larger proportion of the image-buying market. As my prior article advised, the way to do this is to partner (or merge) with photo-centric websites, whose proportion of global internet traffic is very high. This will allow "more information to be more universally available to a greater proportion of the buyers and sellers." This, in turn, moves the market toward efficiency.
  • Once the market became efficient, it could then be automated through predictive pricing algorithms, precisely the way Google automated online ad prices using an auction-based mechanism (a simplified sketch of such an auction appears below). This is not a simple algorithm; Google's took years to evolve and required considerable data mining to determine optimal market pricing. But it was achieved to the point where it is now a highly viable (and mutually beneficial) economic model for buyers and sellers. The market of photo buying is similarly large, and there's enough economic activity that appropriate data-mining efforts could lead to similar algorithms for auction-based image license pricing.
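To make that second point more concrete, here is a minimal sketch of how an auction-based license price might be computed, loosely patterned on the second-price auctions used for online ads. Everything in it -- the function, the bids, the floor price -- is hypothetical, offered only to illustrate the mechanism, not anything an agency actually runs today.

    # Hypothetical sketch: second-price ("Vickrey") style pricing for an image
    # license, loosely modeled on auction-based ad pricing. All names, bids,
    # and floor prices are made up for illustration.

    def clearing_price(bids, floor_price):
        """Winner pays the second-highest bid (or the floor), letting the
        market discover a price rather than the seller guessing at one."""
        qualified = sorted((b for b in bids if b >= floor_price), reverse=True)
        if not qualified:
            return None  # no sale: demand never met the seller's floor
        if len(qualified) == 1:
            return floor_price  # only one qualified bidder: floor sets the price
        return qualified[1]  # otherwise the runner-up bid sets the price

    # Example: three hypothetical buyers bid on the same license.
    print(clearing_price([120.0, 75.0, 40.0], floor_price=50.0))  # -> 75.0

The appeal of a second-price rule is that buyers can bid what an image is actually worth to them without fear of overpaying, which is one reason auction mechanisms tend to discover prices more reliably than sellers guessing at their competitors' rates.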

The question is whether anyone is willing to invest enough into this untapped market.


Friday, February 17, 2012

Selling Stock: It's About Search Rank, Not Price

Yesterday, I reposted an article I originally wrote in 2007, discussing the misconception that microstock pricing is what's driving down overall license fees.

I got a few emails that still challenged my assertion, and it appears I haven't emphasized strongly enough the most compelling arguments supporting this thesis.

All of my research supports the premise that the primary cost of licensing images is not the license fee, but the overhead associated with finding and acquiring the right image. A breakdown of the overhead and administration of a typical project involving photo licensing shows that the actual license fee ranks very low on the budget -- hence, low on the buyer's priority list. My 2007 surveys of buyers showed exactly that.

If the person responsible for finding images for a project is paid $60/hr, and this person spends 2-3 more hours looking for a photo just to pay $1 instead of $50, this translates to paying someone $120-180 just to save about $50. People who control budgets know that the license fee for photos is negligible relative to the total cost of production, even at traditional stock photo rates. The bigger the project, the lower the proportion of the budget the license fee represents.
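The same back-of-the-envelope arithmetic, spelled out; the hourly rate and hours are the figures from the paragraph above, and the "2.5 hours" simply splits the 2-3 hour range.

    # Back-of-the-envelope version of the example above: extra search time
    # spent hunting for the cheaper image dwarfs the license savings.
    hourly_rate = 60          # what the image researcher is paid per hour
    extra_hours = 2.5         # extra time spent hunting for the $1 image
    license_savings = 50 - 1  # cheap license vs. traditional license fee

    extra_labor_cost = hourly_rate * extra_hours
    print(extra_labor_cost)                     # 150.0 in extra labor
    print(extra_labor_cost - license_savings)   # ~101.0 net loss from "saving" money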

Those who sell images are dropping their prices because they're looking at their competition, not the buyer. Further, there is absolutely no evidence to show that sites that have lower prices sell more images. There is definitely a perception that there's a correlation, but that's because people are comparing apples to oranges. Getty sales vs iStock sales are not apples-to-apples because the two entities vary dramatically in search engine results (and other important factors). People talk about microstock sites more, and they link to them (in blogs, discussion forums) and the quantity of images on microstock sites is rapidly growing. So naturally, these sites get higher rankings in search results. Search engines don't rank sites because they have lower prices. They rank sites by size (content), links, and a black magic formula that is best described as "dispersion of discussion in and around the net." In short, microstock sites have more content and get more attention. Hence, better rankings, which translates to more traffic, which attracts more photographers to submit images to them, perpetuating the feedback loop.

In my 2007 survey, among those who indicated they were aware of -- and use -- microstock sites, most don't go to them because the prices are lower; it's mostly because those sites rank higher in search engine results, where the buyer starts.

Because search engine ranking drives traffic -- especially from the untapped (and unaware) segment of the global economy that doesn't use stock agencies -- and because the greatest cost in photo acquisition is time, not the license fee, 90% of the time-savings comes from the image results the buyer gets on that initial search. If that search takes the buyer to a stock agency site -- microstock or otherwise -- then the deal is nearly done, price notwithstanding.

This is primarily why I have advocated for years that stock sites should focus their entire effort on optimizing search engine rankings. While they could have done something about it on their own in the past, the rise of social networks and the plethora of image-related websites and apps has made it impossible for agencies to rank highly in image-search results by themselves. In today's market, they have no choice but to either partner with, or acquire/be acquired by, a social-networking site.

The Getty<->Flickr combination is a very pragmatic example. Yahoo is circling the drain, and it needs to shed its non-performing assets and focus its attention on ... something. Whatever that is, it isn't Flickr, and there aren't a lot of buyers that would be interested in that asset, except for Getty or Corbis. The combined product would involve retooling Flickr to be far more socially active (to keep up with modern social networking trends), and to integrate licensing/acquisition into the user/social experience. Most importantly, to provide incentive programs for photo submitters to participate economically. (I've written a great deal about this in the past.)

Of course, perhaps Yahoo should just buy Getty. Facebook is getting into the game, which tends to lead one's eyes towards Google, but Google is still struggling to play catch-up in the social-networking arena, and its photo division is not run by someone with a disposition towards stock or an awareness of the economics of the photo industry. The company is more interested in building assets that support its advertising model. There's no evidence that "licensing" is on its radar -- a pity, because it would put Google at the forefront of the Web 3.0 economic model, where images would play a huge role. (See here.)

In the meantime, there's a $25B shadow economy in peer-to-peer photo licensing that's up for grabs. (See here.)

So, you ask, "how do you convince agencies of this?"
I've been trying since 1998.

(For fun, see this web archive of my site from 1999 discussing this topic.)


Monday, February 14, 2011

Search Engine Optimization and The Long Tail

I was inspired by an entertaining article I read in today's New York Times titled, The Dirty Little Secrets of Search, detailing the rise and fall of JC Penney's Google rankings. Turns out, JC Penney's SEO consulting firm allegedly bought a huge number of paid links on websites, most of which aren't actual sites at all, but domain names purchased solely for the purpose of placing links to JC Penney. Google takes this very seriously, and has been known to eliminate offending sites from its index completely.

The rationale for this approach is, as most people know by now, that your ranking is governed largely by the number of other sites that link to yours. Unfortunately, what many people still don't know is that gaming the system doesn't work. (Link exchanges are a sure way to lower the ranking of both sites that link to each other; that's why JC Penney's SEO firm created sites with one-way links.) While it'd be nice to have organic linking, where people simply "talk about you" (and provide a link) on many websites around the net, that's not so easy to do and takes a lot of time.

In this day and age, if you're going to succeed as a stock photographer, you have no choice but to figure this out. This strategy begins with two questions: 1) which keywords or phrases do you want to rank highly for, and 2) how do you seed yourself around the net?

The answer to the second question begins with the first: find the right keywords.

Here is where most photographers (and agencies) get it wrong: they shoot for keywords like "stock photography" and other industry trade terms. But this doesn't work so well. Google's Traffic Estimator shows that a term like "stock photography" yields only about 90,000 global monthly searches. Sites that rank highly for only a few keywords or phrases never do well, even for popular search terms. Instead, reach for many search terms -- as many as possible.

My site (danheller.com) ranks in the top five positions on 751 search terms, and 1205 search terms rank in the top 10 on Google Search results, according to Google's Webmaster Tools. But I'm not actually trying to rank highly for any given search term at all. That would be futile. Odd as it may sound, I rank #1 for "stock photography business," but I swear I didn't try to. Of course not, because that search term doesn't generate enough traffic to warrant investing any special time or effort. That's the point. This is the "long tail" approach to keyword indexing: it's about breadth, not depth. I don't get that much traffic to any single page. By ranking highly in such a vast number of terms, it's the aggregate that matters.
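A toy calculation shows why breadth beats depth. The head-term volume below is the ~90,000 monthly searches cited above; every other number (the share of clicks, the count and volume of niche phrases) is made up purely to illustrate the aggregation effect, not measured from any real site.

    # Illustration of the "long tail" point above, with made-up numbers:
    # many low-volume search terms, taken together, can outweigh one
    # popular term that everyone fights over.
    head_term_monthly_searches = 90_000   # a trade term like "stock photography"
    head_term_share = 0.02                # hypothetical share for a site not in the top spots

    tail_terms = 1_200                    # distinct niche phrases a large site ranks for
    avg_tail_searches = 300               # hypothetical monthly volume per niche phrase
    tail_share = 0.25                     # hypothetical share when ranking near the top

    head_traffic = head_term_monthly_searches * head_term_share
    tail_traffic = tail_terms * avg_tail_searches * tail_share
    print(head_traffic, tail_traffic)     # 1800.0 vs 90000.0 visits per month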

All this starts with simply being indexed. That is, search engines have to know what words and phrases you have before they can rank them. Choosing the right words is one thing, but you also need Google to trust your keywords -- in other words, to trust you. Unlike standard text on a page, which Google parses well, photos are opaque: an algorithm doesn't know what's inside a photo, so it has to look at other characteristics to determine its content, such as surrounding text, the name of the page it's on, and, of course, its metadata -- in particular, the "keywords" tags embedded in the IPTC header of the image file.

Once again, here's where most photographers and agencies get it wrong: they "pollute" their keyword lists with dozens, if not hundreds, of phrases and expressions, hoping the target image will come up as a search result for any one of them. But Google will actually penalize people who try to game the system with "black hat" approaches, like repetition (singulars and plurals together), lots of synonyms, intentional misspellings (detected by seeing the misspelled and correctly spelled words together), and tons of generic terms (such as "photo", "image", "photography," etc.).

Products like Cradoc's Keyword Harvester and A2Z Keywording each suffer from (and perpetuate) this problem. The main reason is that they try to anticipate everything a searcher might look for. This is not only impossible, but the mere attempt reduces your credibility index in the eyes of almost all search engines.

Almost all? Which search engines does it actually work for? One of the people responsible for this policy told me "microstock agencies is where our customers submit their photos, and those search engines are not that smart. So, we have to be thorough."

True enough, but this raises two issues. First, despite the fact that microstock websites are popular among amateur photographers and a growing population of desperate pros, looking to pick up the pennies from as many sources as possible, the vast majority of those looking to license images don't go to stock agencies. They go to main search engines.

Second, even with the brain-dead search technology employed by stock agencies (except for Getty's, whose search technology is quite good), proper keywording techniques still perform quite well at those places. The reason is that people searching for images don't go about it in the diligent, thoughtful way that photographers think they do. People do not search using the conceptual terms that those who sell keywording products would lead you to believe.

Keywording properly is really boring, and far less time-intensive than people make it out to be: just the basic "facts" about the photo can be described in a handful of terms. The search engine will do the hard part. Granted, this is a bit simplified, because it doesn't address issues like word ambiguity, synonyms, and so on. But that isn't done well by humans anyway; it needs to be handled by the search engine's heuristics. True, stock agencies don't have them, but again, the trade-off is whether to achieve "good enough" at the less-frequently used stock agencies or to follow the "proper" method advocated by the search engines.
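A minimal sketch of that pared-down, "just the facts" approach: drop generic filler, collapse singular/plural repetition, and cap the list at a handful of terms. The stop-word list, the naive plural rule, and the limit are illustrative assumptions, not rules published by any search engine.

    # Minimal sketch of "just the facts" keywording: strip generic filler,
    # collapse singular/plural repetition, and cap the list at a handful of
    # terms. The stop-word list and crude plural rule are illustrative only.
    GENERIC_TERMS = {"photo", "photos", "image", "images", "photography",
                     "picture", "pictures"}

    def clean_keywords(raw_keywords, limit=10):
        seen, cleaned = set(), []
        for word in (w.strip().lower() for w in raw_keywords):
            stem = word[:-1] if word.endswith("s") else word  # crude plural folding
            if not word or word in GENERIC_TERMS or stem in seen:
                continue
            seen.add(stem)
            cleaned.append(word)
        return cleaned[:limit]

    print(clean_keywords(["Photo", "lighthouse", "lighthouses", "coast",
                          "California", "image", "sunset", "sunsets"]))
    # -> ['lighthouse', 'coast', 'california', 'sunset']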

This is why the "proper" method achieves the best of both worlds: you will be indexed properly and given higher "credibility" with public search engines like Google, and you won't be penalized by the microstock agencies even though images might only use a handful of keywords, rather than dozens or a hundred.

The next question is how to get all those coveted links from other sites to direct traffic your way. This is not easy; it requires work. You need to write a lot, post to discussion forums, socialize and network, be on the "inside" with industry people, and above all, talk about what you know. And here's the real hidden secret: I'm not talking about photography. The discussion forums, the industry people, and the topics you talk about are best when they're about something other than photography, because it's highly likely that you're an expert at something other than photography.

Of course, if you are well-informed about photography and are regarded as a leader in the field, then go for it. But if you are, then you're probably not reading this... at least, not with the goal of improving your photography business. I am better known for my business analysis, which happens to be in the photography field, than I am for my photography as an art form. That I sell lots of images (prints and licenses) is not a byproduct of my artistic skills. It's the byproduct of having published so much about the business of photography.

The more you engage in discussions online and offer useful, insightful and meaningful commentary, the more people will link to you. Offer to write for magazines. Try even writing a book or two. Sure, it's an investment of time. What'd you expect? That it'd be easy?


Friday, October 16, 2009

Might Picscout Ultimately Cause Yahoo to Acquire Getty?

I realize the title of this blog is rather provocative. But let me lead you through this.

It all starts with David Sanger's blog on picscout's new Image Registry and Image Exchange, which is the system that Picscout uses to index images and bring buyers and sellers together through third-party licensors. David makes insightful comments on three critical points.

First, his point #2:
Picscout aims to take a percent of sales, noting on their site: “ImageExchange acts as an online affiliate program, sharing image-licensing income between PicScout and licensors.” This will reduce the percent that goes to the photographer.


David is not the first to observe this, but it illustrates how the big picture is being missed. The premise begins with the fact that the universe of image users (some of whom are active buyers, but most of whom are not) use applications that produce documents (digital and print). Those applications are developed by third-party Independent Software Vendors (ISVs), such as Adobe or Microsoft. If the applications that ISVs produce adopt the Picscout API to hook into the registry and identify the images a user places in his document, those users will not only be automatically notified that they are using copyrighted images, but will also be given the opportunity to license them. This concept isn't far-fetched -- exactly the same thing is done when users try to view movies or listen to songs on some devices.

However, because such a thing is not yet done for images, it has the potential to transform the stock licensing industry. If enough ISVs adopt the API and hook into the registry, a critical mass of users will be invariably recruited into the photo licensing economy. The more ISVs that adopt this API, the more applications will be using them, which casts a wider and wider net of users... who themselves become image buyers.

Here's the hitch: those ISVs will not adopt the API unless they have a stake in the game. That is, a cut of the license revenue. Unless someone has another carrot to wave in front of those ISVs, that's the only way to get them to participate in the program. If ISVs don't adopt the API, this whole discussion is moot. No one uses the registry. Game Over.

Therefore, the game is to capture the ISVs. And the only financial incentive they can possibly have is to participate in the licensing model -- that is, a rev-share. This has the added advantage of giving the ISV even more incentive to get its own users to license images: the more they license, the more money the ISV makes. The ISVs will not just promote these features; they may make it pretty darn difficult for users to avoid them.

Imagine what Adobe would do if it could get a cut of a $10B economy just by adding a feature to InDesign that assured that the photos used in any given document were properly licensed... much the same way an iPod assures that the movie it's about to play has been purchased.
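Here is a small sketch of how the rev-share arithmetic might look once an ISV is in the loop. None of these percentages come from PicScout or anyone else; they exist only to illustrate the "smaller slice of a much larger pie" argument.

    # Hypothetical split of a single license fee once an ISV participates.
    # The percentages are invented for illustration, not published terms.
    def split_license_fee(fee, registry_cut=0.15, isv_cut=0.20, agency_cut=0.25):
        photographer_cut = 1.0 - registry_cut - isv_cut - agency_cut
        return {
            "registry": round(fee * registry_cut, 2),
            "isv": round(fee * isv_cut, 2),
            "agency": round(fee * agency_cut, 2),
            "photographer": round(fee * photographer_cut, 2),
        }

    # The photographer's percentage shrinks, but if in-application licensing
    # multiplies the number of transactions, the absolute dollars can grow.
    print(split_license_fee(50.0))   # one traditional-style sale
    print(split_license_fee(10.0))   # one of many smaller, automated sales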

This is the same model I've described in my article, The Economics of Migrating from Web 2.0 to Web 3.0: convert the vast majority of image users into image buyers, and sales volumes go way up.

So David's observation that the photographers' percentage of the royalty goes down is true, but it misses the big picture. Obviously the ISV rev-share cuts the pie into smaller slices, but it's a smaller slice of a much larger pie.

David then makes another keen observation in point #7 about Picscout's underlying technology:
Evaluating an entire page of thumbnails is time-consuming. Each thumbnail must be downloaded and analyzed by the PicScout servers before returning index comparison results...


Though David only cites Google search as an example of how users expect "speed," this is only the tip of the iceberg. Picscout's web browser plug-in that examines Google searches is merely a prototype to demonstrate how the API works. Once again, the real goal is to capture ISVs.

But David's observation is more prescient than he may have thought, for performance is probably even more important than rev-sharing by ISVs. If their apps degrade in performance by using the Picscout API, they won't use it, irrespective of rev-share.

The technology Picscout has introduced is clearly a first-stage prototype meant to introduce the business model and put the company first on the map. Yet it's also Picscout's Achilles' heel, as there is a race about to ensue.

Let's not be naive: Picscout is not the only company on this track. Image recognition is a science that's akin to text search: there are many ways to do it -- some better than others -- but it only needs to perform to a minimal threshold for the business model to succeed. Many other factors dictate success or failure. Picscout may have superior image-recognition algorithms, but that part isn't the crown jewels. Indeed, there are many companies with image-recognition algorithms, Google being one of them.

The real challenge is to build a network protocol that can communicate image information between a client and a server as quickly as possible, using as little network bandwidth as possible. Then, this mechanism needs to scale up to service huge volumes of requests from huge numbers of applications on the net. Picscout may be the first to introduce the proof-of-concept and a prototype, but the real race is on the back-end... as David pointed out.
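To make the bandwidth point concrete, here is a rough sketch of what a frugal lookup might look like: the client sends a short fingerprint rather than the image itself, and the server replies with candidate matches. The message format and field names are entirely invented for illustration; nothing here describes PicScout's actual protocol.

    # Rough sketch of a bandwidth-frugal image lookup: the client sends a
    # short fingerprint (not the image itself) and gets candidate matches
    # back. The message format is invented purely for illustration.
    import json
    from dataclasses import dataclass

    @dataclass
    class FingerprintQuery:
        image_id: str        # client-side identifier for the image in the document
        fingerprint: str     # e.g., a 64-bit perceptual hash, hex-encoded
        max_results: int = 5

        def to_wire(self) -> bytes:
            # A real system would use a tighter binary encoding; JSON keeps
            # the example readable while showing how little data must travel.
            return json.dumps({
                "id": self.image_id,
                "fp": self.fingerprint,
                "n": self.max_results,
            }).encode("utf-8")

    query = FingerprintQuery("fig-03.jpg", "9f3a5c7e01b2d4e6")
    payload = query.to_wire()
    print(len(payload), "bytes on the wire")   # tens of bytes, not megabytes

Keeping each request that small is what makes it plausible to service huge volumes of lookups from huge numbers of applications without degrading the host application's performance.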

On the surface, this would seem difficult -- and it is -- but it's hardly new. All large-scale social-network sites do this on a regular basis, from Twitter to Facebook to Flickr. Though cloud computing is mature, the real barrier to entry here is the costly capital investment necessary to run such a service. There are many players in the field that already have this infrastructure. By comparison, it would be harder for Picscout to ramp up to that level of computing resources than for a larger company to find some sort of image-recognition technology (if it doesn't already have one).

For now, the game is Picscout's to lose, since they're first. But "first" players often find themselves playing catch-up soon thereafter. If they even moderately demonstrate the concept's viability, much larger players (such as photo-sharing sites) that already have such resources will be quick to swoop in.

Lastly, David notes in his point #3:
If buyers find it easier to find images through web search they will move away from distributor sites for search, and only use the distributor site for the final licensing.


Yes. Exactly. But that's nothing new. It's been that way since about 2002, a fact I've been pounding on since that time: the vast majority of image licenses are transacted on a peer-to-peer basis, directly between buyers and photographers. Stock agencies have suffered because they've missed this point, and have since struggled to figure out how to fight their way out of the paper bag.

But that struggle will end without their having to do much about it. With the combination of image recognition and web crawling, the emerging business model Picscout is attempting is now a fait accompli. That is, David is correct to say that the stock agencies of today will become nothing more than hosting sites and clearing houses that supply inventory to other middle-man sites (like Picscout) that do the real job of pairing buyers and sellers.

But is this really a bad thing? He says it in a way that suggests agencies somehow preserve stock prices. Let's not forget that if ISVs and others realize there's money to be made, they won't want to under-price inventory either. If you want to preserve price stability, convert the social networks from photo-sharing into photo-licensing businesses.

I've nothing against agencies, but their future will require them to do two things they never did before -- in fact, that they avoided: rank well in search engines (so that end-users are more likely to find content in the first place), and attract as much content as possible. That is, stop being editors. Let any and all images in, and let the natural ranking abilities of search engines and social networks be the real editors. To date, stock agencies have neither sufficient content volume nor web ranking in search results, nor do they employ social-network aspects on their sites to attract users in high volumes. (Again, their heads were in the sand for too long.)

So the question is, who can do this? Answer: Photo-sharing social networks.

Back in 2008, I posted an article titled, Stock Photography, the Consumer, and the Future, that forecasts this very phenomenon. Once it's realized that there's lots of money to be made by creating a streamlined and automated image-licensing mechanism, the sleeping giants of the photo-sharing social networks will awaken and bulldoze over the traditional stock agencies in ways that no one would have believed.

Indeed, I wrote in January, 2008 in an article titled, Pulling the Flickr sword out of the Yahoo stone:
Flickr is one of the very few photo-asset powerhouses on the web that could monetize its content in ways that would exceed even modest expectations.
In fact, I also wrote in an article titled, The Solution to Getty's Woes that Getty should acquire Flickr for this very reason.

But times have changed considerably since then -- Getty has shrunk in size, and Yahoo! has recovered handsomely. Getty could never acquire Flickr now... but if this whole business model of using image-recognition as a vehicle for licensing images shows promise, then I wouldn't be surprised if Yahoo! starts casting devious stares towards Getty.

Hmmmm......


Monday, November 03, 2008

The Economics of Migrating from Web 2.0 to Web 3.0

Synopsis
  1. The Web 2.0 financial model: user-generated content attracts visitors, which boosts search rank, which attracts more visitors, which boosts advertising revenue.
  2. The traffic-to-dollars business model resulted in unintended social phenomena: social networks and blogs, which themselves spurred new web design goals that focus on encouraging visitor participation and contribution.
  3. The un-monetized "waste byproduct" from these websites is a stockpile of user-generated content, such as photos, videos, paintings, drawings, stories, commentary, and opinions.
  4. To make use of all that flotsam, a suite of semantic analysis tools is being developed to organize and structure it in ways that help search engines produce better results. This is called Web 3.0 -- or, "the semantic web."
  5. The consequence of Web 3.0 will be the unanticipated financial incentive for websites to monetize their content, rather than just host it to attract visitors.
  6. The opportunity to make money from user-generated content will give incentive to visitors to produce "better" content, and for websites to be more discerning about the content they receive, which affects both the social and economic landscape of the internet.
  7. How this type of transformation may evolve is the challenge for technologists and entrepreneurs, who must be both visionary about how the future may appear and cognizant of the missed opportunities of the past.


In this second installment (of three) about the consumer's role in the future of stock photography, I turn my attention to the economic effects from new developments in search technology. As I'll make clear soon, "search" is merely a spark that launches a much larger fire: the social aspects of the web. And whenever the subject of "social networks" mixes with economics, the spotlight focuses squarely on the consumer. I presented a simple example of this in part one of this series by showing that more consumers both buy and sell photos than do professional photographers or stock photo agencies, which itself has caused a dramatic shift in how licensing is done. This phenomenon is not as readily visible to the undiscerning eye, because the greatest proportion of these transactions is done on a peer-to-peer basis (directly between the photographer and the buyer). In fact, asymmetric analysis shows that approximately 80% of photo media content is acquired directly from the photographer, and images are found primarily from search engines. This mechanism creates economic incentive for those on both sides of the transaction: the buyer uses search engines to find what they want, so the seller tunes his content to conform to the kind of information that search engines look for.

The "missed opportunity" from this shift has been primarily the lack of technical infrastructure to support broad peer-to-peer licensing. Only the traditional licensing methods are available on a broad scale, which requires photos being submitted to a company (a "stock photo agency"), who then licenses them to buyers. Each individual agency operates entirely independently from others, and none of them have prominent placement in traditional search engine results, so only a small percentage of potential buyers ever end up on those sites. The majority of them go directly to the websites of the photographers themselves (because the search engines index them), but most of these photographers are unaware that they could make money licensing their content. Indeed, even social-networking sites are unaware of the financial opportunities to license the content. The net result is that few photos are actually monetized, and of those that are, the pricing is arbitrary and spurious. It's estimated the $15-20B of licensing is done on a peer-to-peer basis, and countless more dollars are simply unrealized due to this inefficiency.

I should clarify that even though my first article demonstrated this type of media growth in the photo industry, the broader market of all types of media is evolving similarly. In fact, events in online photography serve as an excellent "base case" to help forecast economic effects for other media types, such as music, video, line art, books, and so on. What all these have in common, and what everyone has known for years, is that non-professionals create this content as well. And people do so without necessarily intending (or expecting) to make money from it. I call this class of content creators "consumers."

By examining the photo industry, we can extrapolate what might happen with the industries of other media types. Accordingly, photography has these important characteristics:

  1. Everyone does photography on a regular basis in large quantities.
  2. People upload their photos to the internet more frequently and in higher volumes than other media types.
  3. Economically, photos are used more than any other media type in publishing of both commercial and editorial content. (Thus, economic value.)
  4. More people can create "salable content" with less expertise, less effort, and greater speed than other media types.
  5. The high quantity and low price per unit of licensed images equate to a lower barrier of entry for both buyers and sellers.


Forecasting the economic future of media on the internet is difficult because it isn't clear whether the same lack of awareness will plague other media types as it has with photography. There's already been a major economic shift in the photo industry as a result of the internet, and evidence suggests that similar economic changes are happening with video and music as well. As technology for creating content of any type improves, as does self-publishing of this content, the economics are likely to follow the trend set forth by the photo industry, and consumers will find themselves in a position to make money with their content.

The next phase of internet search technologies and standardized communication protocols may inadvertently help. As we've seen with the evolution of the internet to date, whose economic growth came from unintentional consequences, we can learn from how those circumstances came about, and use them to forecast how the future economics might also take shape.

The goal and challenge for media-oriented industries of all types is to build a more structured, formal, internet-wide framework that handles content licensing in general, regardless of the media type or who the buyers and sellers are, and to establish these foundations before consumers set precedents that are harder to unravel and that could deflate content's value before it has a chance to flourish.

The good news is that some developments are already under way. But to put them into context, we need to understand today's business model and how it evolved into what we currently work with. What we'll find is a very tight sequence that starts with technical innovation, followed by social adaptation, which affects financial incentives, which comes full circle to innovation again. This feedback mechanism of constant reinforcement and revision is an economic truism. Since the technology is already underway, and some of the social fabric is similarly taking shape, the only question that remains is how the business environment evolves with it.

Web 2.0 Business Models Setting the Stage


Most people are familiar with MySpace, FaceBook, and Flickr as common and well-known examples of social-networking sites. They are essentially places where people sign up and contribute "content" in the form of information about themselves, while also contributing photos, music, poetry, writing, and ideas of various sorts. In return, they get to socialize -- learn, teach, meet, and access.

Getting people to participate and contribute content is what is commonly called, "Web 2.0", and most sites on the internet are so enabled. You can visit most any blog, news organization, movie review site, shopping site, or cooking site, and you'll probably find a way to contribute something, whether it's as simple as voting on how much you liked a book, or as involved as contributing your own recipes, movies, photographs, short stories, or politically biased nonsense that you hope others will agree with.

While this kind of activity has always been technically possible to program into websites, these features of the web were largely ignored until there was a business incentive to use them. That incentive came in the form of Google's advertising network. If a site had good information, and it was indexed well by search engines, advertisers paid more dollars to have ads there. Consequently, for a site to get those advertising dollars (or to sell its own products or services), it needed to be indexed well, which means it needed more content. The easiest and cheapest way to get content is to encourage people to contribute their content. The incentive that sites give to consumers to contribute is the "social rewards."

By being smarter, funnier, cuter, or more ridiculous than others, people get attention, and people love that. So websites used the "social" carrot to get people to participate, and in return, the site got its free content and, of course, traffic. These both raise the site's ranking and boost advertising revenue (or sales of its own stuff). Today, it's almost unheard of for a site not to have some way for users to contribute. Everyone wins.

What's notable about this development is that it was unintentional. Social websites had been around in the earlier days of the net, but they didn't gather much attention or traffic. Even for those that did, it wasn't easy to make any money from those users. No one bought anything, and they wouldn't pay subscription fees. So having millions of users did nothing but cost the company money in technical infrastructure (which itself was vastly more expensive than it is today). Companies that had stuff to sell typically didn't garner much traffic, except for dating sites like match.com.

It wasn't until Google introduced its auction-based advertising model -- which inadvertently rewarded highly-trafficked sites with unanticipated revenue -- that a financial incentive existed. There was then an instant awareness that social networks are where the money is.

Yet it was never Google's intention to create social networks or anything of the kind. In fact, it didn't intend to affect the nature of the internet at all. It just wanted to create a model for analyzing the internet as it is (or was) for purposes of setting auction-based advertising rates. Google did not anticipate that the very act of analyzing the data would change the data itself. (This is akin to the observer effect: measuring a system alters it.) Indeed, not only has the data changed because Google observes it, the company created a feedback mechanism in which the more it looked at the data (and reported its observations by way of search rankings), the more the data itself morphed into the kind that people thought Google wanted to see. This, in turn, forced Google to change how it looked at the data, because people were manipulating the content on their sites to artificially bump their rankings higher.

And so it goes to this day: websites and search engines are in an endless cat-and-mouse game, in which sites try to get higher search rankings while Google tries to maintain a plausible ranking system that users can trust to be objective when they search. That credibility is required for advertisers to trust it, too.

This underscores these basic, fundamental socio-economic principles:
  1. Financial incentives promote user behaviors.
  2. Examining user behaviors to calculate financial incentives causes sites to filter those behaviors that optimize financial returns.
  3. The constant feedback mechanism and subtle refinement of behaviors and economics creates a state of unpredictability.
  4. The unpredictability invokes the Law of Unintended Consequences, which yields a new economic model.


This raises some questions: What comes after Web 2.0? And what are the potential byproducts of whatever that is? More social networks? Or something else? Our objective here is to anticipate future financial opportunities without forgetting that the feedback mechanism produces unpredictable results. Also, the Law of Unintended Consequences suggests that examining user behavior changes those very behaviors, which itself often forms the basis for new developments and incentives. (Knowing the future will affect your behavior, thereby changing the future.) But, as any good entrepreneur and venture capitalist know (er, should know), the goal isn't to predict or (worse) to control or shape the future, but rather to anticipate the parameters that are most likely to frame that future.

Web 3.0


The answer to "What's next?" is easy: Web 3.0. In inner circles, this is called, "The Semantic Web." That is, the content on the web that was generated during the Web 2.0 era will be more intelligently analyzed than before. In essence, the data is not just indexed as it is today, but is being better understood for its semantic meaning. This very core nugget of change sparks a feedback mechanism on a very large scale, which will transform the economic foundations of the internet, much the same way the web itself changed the world.

For example, let's say you have a weird and embarrassing rash. Today, searching relies on brute-force matching of search terms, such as "weird rash" or "red rash". Type that into Google, and the results you get are pages that happen to contain both words. Though the pages themselves may be ranked according to popularity (the Mayo Clinic's site may rank higher in search results than some guy's blog), you still have to sift through more than just the first set of results to find what you're really looking for.

You could do an image search for "red rash," but search engines don't really know what's in the content of photos. If you do such a search on images.google.com, you'll get photos of red rashes, but this isn't because Google analyzed the photos. They match because the photos happen to have the words "red" and "rash" in the image's filename, e.g., red-rash.jpg. (Google also looks at the file's pathname as well as the filename.) If you try it, you'll see photos of every possible kind of red rash you can get, most of them having nothing in common with the others, nor are they sorted or ranked in any intelligible way. The results are merely arbitrary listings of all matches for images with conveniently-named files. Google relies on the fortunate-but-useful naming convention that some people happen to use: naming their files according to their content. In practice, this technique is spurious and nearly useless, but it's the only thing they can go on for the moment. As it is today, you have to examine each photo to see whether it looks like your rash, and then examine each of the pages it came from to determine what the rash is.

That Google relies on photos' filenames to infer their content is not just unreliable, it's not even complete. Statistically, most people don't change the filenames of their photos; they leave them as they came from the camera, such as DSC1004.JPG. Photos with those names are never going to come up as a result for any search, let alone searches they could be valuable matches for, were the search engine to genuinely know what it was looking at. So there are a lot of photos out there that may well be of red rashes, but they aren't found because nothing about them indicates that's what they are.
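To make the limitation concrete, here is a toy version of filename-based matching. It is not how Google actually implements image search; it only shows why unrenamed camera files can never surface, no matter what they depict.

    # Toy version of the filename-based matching described above: a query only
    # matches images whose file names happen to contain the search words, so
    # unrenamed camera files (DSC1004.JPG) never surface, whatever they show.
    def filename_matches(query, filenames):
        terms = query.lower().split()
        results = []
        for name in filenames:
            base = name.lower().replace("-", " ").replace("_", " ").rsplit(".", 1)[0]
            tokens = base.split()
            if all(term in tokens for term in terms):
                results.append(name)
        return results

    files = ["red-rash.jpg", "red_rash_arm.jpg", "DSC1004.JPG", "sunset.jpg"]
    print(filename_matches("red rash", files))  # ['red-rash.jpg', 'red_rash_arm.jpg']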

Pro photographers may be asking, "What about metadata? What about the keywords that I apply to photos? Why doesn't Google look at that?" It doesn't, for the same reason Google no longer looks at the "keywords" tag on HTML pages themselves when compiling results for general searches: websites learned they can game the system by stuffing these attributes with unreliable data.

People can still game the system to some degree by naming their image files accordingly, but for the moment, doing so doesn't reward the behavior very much. It would only yield sporadic results, because Google doesn't sort or prioritize results according to any sensible pattern that matters to searchers. And there's currently no other benefit to naming a photo deceptively. After all, why would someone name a photo sexy-woman.jpg if it's just a red rash?

One reason to do so would be if there were a financial incentive -- say, advertising revenue if your site were to get more traffic. That would provide incentive to name photos sexy-woman.jpg, even if it's a photo of a rash. As you can imagine, this would completely ruin Google's current image search feature, which is why all search engines are somewhat stuck in a corner with image search results: unless they can assure some degree of reliability that cannot be gamed once there is a financial incentive to do so, it's best to leave the system as it is -- nearly useless, but not so much so that people don't tinker with it. And therein lies the paradox: because it's still the only game in town, people tinker a lot, and it's the source of most image searching on the internet today.

So, unless one can actually, reliably determine the content of a photo, image searches will have to remain circumstantial, unscientific, and without reward.

We can envision what the economic effects would be in a Web 3.0 world, where there was a better semantic understanding of media content beyond just text. Using new algorithms that can determine the content of photos, for example, you may one day search for "red rash" and get a lot more relevant search results than before, simply because the existing content on the web is better understood.

Image-recognition algorithms are evolving in many ways, and they look at very different aspects of images to determine characteristics, genres, attributes and, ultimately, content. A couple of examples I often cite in my blogs are tineye.com and picscout.com, which examine photos and find "similars" on the net based on pattern recognition, identifying photos (and portions of them) by assigning each a unique "fingerprint ID." Another site, xcavator.net, finds photos based on conceptual elements and can do so using specific keywords, like "train" or "window." This technique is even better than a text search for "red rash" or "weird rash" because the image-recognition algorithm can do a better job of analysis and order results according to proximity. Here, you could just take a picture of your rash, upload it to the web, and get search results that are not only more relevant, but that capture similarities and differences an untrained eye would overlook or simply can't see.
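For a flavor of what a "fingerprint ID" can be, here is a simplified "average hash," a well-known technique that is only in the spirit of what services like TinEye or PicScout do; their actual algorithms are far more sophisticated and are not described here. The sketch assumes the Pillow imaging library and hypothetical file names.

    # Simplified "average hash" fingerprint, in the spirit of the
    # fingerprinting described above (real systems are far more
    # sophisticated). Requires the Pillow library.
    from PIL import Image

    def average_hash(path, hash_size=8):
        """Shrink, grayscale, and threshold against the mean to get a 64-bit ID."""
        img = Image.open(path).convert("L").resize((hash_size, hash_size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        bits = "".join("1" if p > mean else "0" for p in pixels)
        return int(bits, 2)

    def hamming_distance(a, b):
        """Small distance = visually similar images (resizes and light edits often survive)."""
        return bin(a ^ b).count("1")

    # Usage sketch with hypothetical files: near-duplicates give a small distance.
    # print(hamming_distance(average_hash("rash1.jpg"),
    #                        average_hash("rash1_cropped.jpg")))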

And image recognition is only one example -- there are also music-recognition technologies that do the same sort of thing. Shazam (www.shazam.com) is a service built on music-recognition technology that is evolving an economic model of its own. Hold your phone (connected to the company) up to a song you like, and it'll tell you what song it is. That's just one application of the technology, and the company is partnering with many different, diverse businesses, many having nothing to do with the web or "search engines" directly, but it is nonetheless a search technology with widespread economic effects that are, so far, unanticipated by industry watchers. A similar-but-different development from Widisoft (www.widisoft.com) does more detailed analysis of music for conversion between music file formats, assisting musicians and sound engineers in compiling and arranging the musical components of a song.

The social (and, by extension, economic) ramifications of all these developments should be self-evident to anyone that works in an internet or media company. Economic predictions, on the other hand, would be premature without understanding how these technologies would evolve both technically and socially.

Problems with Deployment


The first question most people ask is why aren't these websites (or their technologies) more widely deployed, or even more usefully employed, especially by larger search engines? Several reasons.

First, these algorithms are still pretty young; recognizing patterns is a difficult and imprecise science. Of course, so is traditional text search, but the difference between the two is rather substantial. (Just matching a photo with another photo -- or a song with another song -- isn't yet sufficient for a "semantic" search.)

Second, pattern recognition of content is only the beginning. Information about what those patterns are and what they mean still needs to be seeded, and that information needs to start from humans. This isn't a major barrier, as the current Web 2.0 content on the internet contains a great deal of that information already. But its vast size and disorganization mean that time is required to harness it properly. During this process, major search engines are left to guess at semantic meaning from the text on the same page as a photo, for example, which is about as error-prone and unreliable as their current reliance on file names, though a notch better. The lesson we learned about examining data altering user behavior must be heeded strongly here: if there's incentive to "lie" or manipulate data, search engines will lose credibility.

Thus, the third problem: trustworthy information about content. Because there will eventually be a financial incentive to provide trusted and controlled data feeds about various content types, new methods need to be established to "search and rank" the sources of information. That sounds similar to traditional text search and website rankings, but in this case, it's not the site's credibility at stake; it's the credibility of the data found on the site -- or rather, the information about the content. (That is, the description of the rash in the photo may need to be ranked, independently of the site that hosts the photo.) Unlike the text on a site, which can be interpreted and ranked and is closely linked to the site, photos and other media on that same website can be sourced from anywhere, and the data about that media may well have come from an entirely different source. In the Web 3.0 world, it will be much more common for crowd-sourced content to be annotated by someone other than the content's creator. (Wiki-based sites are good examples of this today.)

And lastly, the three problems noted above will be difficult for any one entity to solve, since the ingredients in this recipe require participation from many different organizations. Getting that participation is difficult, especially since each is intimately focused on its own small view of the world. (This is the very problem that every player in the photo licensing industry exhibited, which is the primary cause of its slow demise.) Having an entity that sits above the trees, sees the broader economic opportunities, and uses that vantage point to direct the tribe as it hacks through the forest is not easy, and no one is currently poised to accept such a role. It is unlikely to be a small organization, and larger companies tend not to have the entrepreneurial spirit or vision.

Yet these technologies continue to develop. Just as Web 2.0 evolved relatively slowly, so too are Web 3.0 capabilities evolving, and with them will come leading industries that pave the way for the others by establishing standards and protocols that set the economic wheel in motion. The economic wheel for Web 2.0 was advertising dollars and traffic, so people looked to Google for the parameters by which to design sites and user experiences that lead to traffic, so money could be made. In the 3.0 world, there is currently no such leader, since it isn't yet clear where the incentives are. Nor will such a model exist without the iterative social feedback mechanisms that are part of every economic development.

What might that social environment look like? Let's consider your rash again: you take a picture of it and upload it to an image-recognition site, which matches it to a set of potential candidates; each one is checked against a medical website that has information about the rashes, which is then fed into a pharmaceutical website that may list potential remedies. If it turns out you just went camping over the weekend, you can assume it's likely to be poison ivy and choose the appropriate remedy. If, on the other hand, you recently visited the red light district in Bangkok, then your spouse will be alerted, and your lawyer will be notified to accept the divorce papers being prepared for you.

Such possibilities would be objectionable to many, so limitations would be naturally put into place to protect privacy. And that's just one example. As technologies develop and new capabilities are evident, people react, and social acceptance or rejection alters future developments. These must take place before effective and long-standing economic models can form.

Personalized Search


A critical component of this social evolution of Web 3.0 is found in a very old technology that hasn't been exploited to its potential: "predictive preferences" -- that is, search results ordered according to what might be appropriate or relevant to the searcher. While many may not be aware of it, the vast amount of raw content from the Web 2.0 world is being analyzed by "crowd-analysis algorithms," which look at data that people have voted on or expressed some kind of opinion about. This data is then examined for patterns to predict how individuals might like something, which can then be used to determine whether any given search result should rise or fall in "relevance" ranking.
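Here is a tiny sketch of the kind of crowd analysis involved: predicting how much a user might like an item from the votes of users with similar tastes. The users, items, and ratings are all invented, and real recommender systems use far richer signals than this cosine-similarity toy.

    # Tiny sketch of crowd-based predictive preference: predict how much a
    # user might like an item from the votes of users with similar tastes.
    # All names and ratings are made up for illustration.
    from math import sqrt

    ratings = {   # user -> {item: rating}
        "alice": {"lighthouse": 5, "sunset": 4, "rash_closeup": 1},
        "bob":   {"lighthouse": 4, "sunset": 5, "skyline": 4},
        "carol": {"lighthouse": 1, "rash_closeup": 5, "skyline": 2},
    }

    def similarity(a, b):
        """Cosine similarity over the items both users rated."""
        common = set(ratings[a]) & set(ratings[b])
        if not common:
            return 0.0
        dot = sum(ratings[a][i] * ratings[b][i] for i in common)
        norm_a = sqrt(sum(ratings[a][i] ** 2 for i in common))
        norm_b = sqrt(sum(ratings[b][i] ** 2 for i in common))
        return dot / (norm_a * norm_b)

    def predict(user, item):
        """Weight other users' ratings of the item by their similarity to `user`."""
        votes = [(similarity(user, other), r[item])
                 for other, r in ratings.items() if other != user and item in r]
        total = sum(w for w, _ in votes)
        return sum(w * score for w, score in votes) / total if total else None

    # ~3.4: pulled toward bob's 4, since alice's tastes match bob's more than carol's.
    print(predict("alice", "skyline"))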

In the music industry, there are two applications of this technology that you may be aware of. Amazon.com has been using predictive preferences for years, and I rely on it almost exclusively when I buy music. I let Amazon choose new albums for me based on what it knows about me: the things I've bought in the past, searched for, and/or rated as it tracks my behavior on its site. It even looks at preferences that aren't music related. When it offers suggestions for what I'd like, I'm always shocked at its accuracy. (Porcupine Tree is my latest miraculous find.)

Another example is Pandora (www.pandora.com). People with an iPhone were recently introduced to it this way: name a song, a band, or a genre, and the site will stream music to you in radio-station format, all comprised of songs that you are likely to enjoy.

Part of how Pandora does this is by applying conceptual attributes to songs, such as "acoustic guitar solo." There are over 400 such attributes, which the site calls "the music genome project." For the moment, this requires humans to assign the attributes to songs manually, but music analysis isn't that hard. It isn't a stretch to envision combining these two technologies, so that an algorithm determines attributes from the digitized sound waves and can then assess and assign them to other songs in real time.

Hypothetically, I could rent a car in a city I've never been to and program the radio's stations by singing a song that I happen to like. The radio can be instantly programmed to assign stations to the preset buttons. No more "scan button!"

A similar "genome sequence" of photography or video has never been done (or proposed as far as I know), but it seems as one would be inevitable, and would lead to another step in the semantic understanding of media on the web. When you combine the semantic understanding of content with predictive preferences, you have a readily monetized network of resources that, currently, no one is capitalizing on.

Web 3.0 Business Model: It's the Content, Stupid


This scenario presents the potential for a pivotal shift in where economic value lies: from the website to the content. As my earlier research in the photography realm revealed, the less time it takes for a searcher to find a relevant photo from a search, the more likely it is that the searcher will license it. There's every reason to believe that photos are not unique in this regard. The semantic web will make it easier to find content of any sort, and if the searcher's results are also tuned to their particular preferences, the likelihood that such content will be purchased rises beyond the ratio we see today. Thus, the value of content on a site goes up because it has a higher chance of being monetized.

Remember all those photos named, DSC1004.JPG? That's content that is currently next to useless because it carries no semantic meaning, and is therefore not seen or understood by current search engines. The semantic web will eventually find all those abstract media objects and make sense of them, adding them to the set of possible search results. Such data exists in all media types, not just photos, making the economic possibilities far greater than anyone has anticipated. So, once all the "useless raw content" from Web 2.0 is semantically analyzed, it is likely to emerge in the Web 3.0 world as "invaluable data assets" that contribute even more to the Long Tail of internet economics.

As the perception of content's value continues to increase, Web 3.0 sites will have more incentive to attract those who create quality content.

An example illustrating this socio-economic development can be found in this story from the New York Times. Joel Moss Levinson, "a college dropout with dozens of failed jobs on his resume," has earned more than $200,000 by creating homemade movies that major corporations are now using in their mainstream commercials. Where did they find him? YouTube. The article goes on to mention many companies getting content directly from ordinary consumers, rather than through traditional ad agencies, and how this trend is reshaping many aspects of the marketing and advertising industries.

The beneficiaries of this are obviously not limited to individuals. While Levinson created his own content, there is quite a bit of mainstream content from traditional media companies that can be applied to the same business model: make the content available for a fee. More and more consumers are actually paying for movies and videos over the web, a trend that was predicted long ago but failed to materialize for years. In fact, there was doubt it would ever happen, because users had grown used to the net being "free," and many early sites that tried to charge for membership found they couldn't make the numbers work. But as users learn that good content is harder to come by, this model is finally becoming economically stable. And it has a twist: users aren't just paying for content; they are also given further incentives to contribute feedback or other information about the content they're paying for. These incentives come in the form of reduced fees or fewer advertisements to sit through before the content plays. An example is found in this article in the New York Times, where Hulu (www.hulu.com) allows visitors to view programming with fewer ads, and encourages visitors to vote on shows with thumbs-up and thumbs-down buttons.

As new and different kinds of websites are built to respond to that economic incentive, they will continue to reinforce this behavior by adjusting compensation and other reward systems. They'll also want "semantic information" about that content, not just raw data. That changes the user experience, which in turn changes how and why people go to websites in the first place.

Evidence of this is already making headlines: YouTube recently announced that it will begin to let users purchase songs and other content found on the site, whereas before, such content was only used to attract more visitors. Similarly, Flickr's relationship with Getty Images is one where a company cherry-picks user-generated content from a social network and sells that very same content on a professional photography site. Still another example is the growing number of online discussion forums converting from free access to paid access, the most overt illustration of a site changing its business model from using user-generated content to attract visitors to using that same content to generate subscription fees.

All this is part of the feedback mechanism that perpetuates unpredictable change. Users themselves are already ranked and "scored" for the content they create and contribute, a phenomenon found in many forms on social networks and discussion boards; add a financial incentive to raise your own ranking, or to lower others', and that effect gets amplified. This type of human behavior has not yet been tested on a broad scale on the internet, so its economic effects cannot be predicted.

New Frontier for Web Design and User Participation


The economic models I described above have all been on insular sites that host content. That is, people realize that content has value, so they are using the Web 2.0 world to find it. However, in the Web 3.0 world, content may very well exist on websites that don't yet have an ecommerce infrastructure. It's not just about taking credit cards or other forms of payment; it's about pricing models and legal licensing terms. This is the very inefficiency of peer-to-peer licensing that I focused on in part one of this series.

New internet-wide methods and protocols must be established to enable any website that carries licensable content. As more and better content is produced, and as search engines become better able to analyze it semantically and produce results sorted by personalized preferences, more of that content must be licensable through a universally available infrastructure. That, in turn, will transform the way websites are designed, further affecting visitor behaviors and incentives.

So what about that licensing mechanism? One development in this area is ACAP (the Automated Content Access Protocol), found at http://www.the-acap.org/. Its main initial purpose is to communicate access and usage permissions about the content on any given site to web crawlers (also known as 'spiders' or 'robots'). Just as you currently accept (and need) Google and other search engines to crawl your site so it will come up in search results, an ACAP-aware crawler does the same thing, but it also looks for details about your content beyond its semantic meaning: the license terms and conditions the owner stipulates, should someone want to license something from your site.
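
To give a flavor of what machine-readable licensing terms could look like to a crawler, here is a purely illustrative sketch; the field names, paths, and URL are my own invention, not actual ACAP syntax (see the-acap.org for the real protocol):

# Purely illustrative: a made-up, machine-readable statement of per-path usage
# terms of the kind ACAP aims to convey to crawlers. None of these field names
# come from the ACAP specification.
site_terms = {
    "/images/": {
        "crawl": True,               # crawler may index this content
        "preview_thumbnail": True,   # crawler may show a thumbnail in results
        "license_required": True,    # any use beyond preview requires a license
        "license_contact": "http://example.com/licensing",  # hypothetical URL
    },
    "/private/": {
        "crawl": False,
    },
}

def crawler_policy(path):
    """Return the usage terms that apply to a path (longest-prefix match)."""
    matches = [p for p in site_terms if path.startswith(p)]
    return site_terms[max(matches, key=len)] if matches else {}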

Of course, this is a huge and complicated effort, since mechanisms need to be put in place to track and verify content ownership. But waving the magic wand over that for the moment: if it were to exist, it paves the way for a content licensing protocol to sit on top of the entire stack of media and the search data about it, completing the puzzle. Any content crawler could assess market conditions for any given type of media and estimate a market value. Plug that into an existing auction-based system like Google's AdWords program, and the financial models are in place for a new economic model in which a series of automated analytical robots crawl the web, analyze content, rate and rank the information and its creators, and come up with a high/low pricing range, which can then seed a more fine-tuned, auction-based mechanism.
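
To make that pricing step concrete, here is a minimal sketch, under the assumption that a crawler has already gathered license fees recently paid for comparable images; the numbers and function names are hypothetical:

# Estimate a high/low range from the middle 50% of fees paid for comparable works.
from statistics import quantiles

comparable_fees = [35.0, 60.0, 48.0, 120.0, 75.0, 52.0, 44.0, 90.0]  # hypothetical data

def estimate_price_range(fees):
    """Return a (low, high) range spanning the middle half of comparable fees."""
    q1, _median, q3 = quantiles(fees, n=4)  # quartiles
    return (q1, q3)

low, high = estimate_price_range(comparable_fees)
print(f"Suggested auction range: ${low:.2f} - ${high:.2f}")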

As futuristic as this may sound, all of the technologies that do these tasks exist today in one form or another. It's merely applying them in a generalized way to arbitrary and abstract data types that makes this an inevitable development. What's more, it's self-regulating and self-perpetuating. Taken out of the equation are the inefficiencies of peer-to-peer licensing models, where prices are arbitrary and the transaction itself is costly and time-consuming.

Just as Web 2.0 created a feedback mechanism (where social networks yielded financial returns, which stimulated the growth of social networks), the Web 3.0 world will have a similar feedback mechanism, where content creators are given incentives to create good content, describe it well, and allow third-party, automated market-makers to handle transactions. Though the content itself may still be exchanged between creators and publishers, the transaction will more likely be officiated through market-makers.

It's also a more efficient system in that incentives to cheat are reduced. This comes in two forms. First, because everyone's search may not necessarily yield the same results, attempting to manipulate content to match what someone might think search engines are looking for may actually diminish the content's value. Searchers looking for a photo of a "woman" aren't always looking for porn -- they may genuinely be looking for a photo to be used in legitimate mainstream media. If the content creator tries to "lie" to manipulate search engine results for the photo, he may inadvertently eliminate as many buyers as he would attract if he were just honest about the content in the first place. That's not to say that all content is equally valued, but that brings up the second aspect to semantic awareness by search engines: the content itself would be ranked, not just the site it came from. If a particular set of photos were manipulated with "keyword pollution" (where the photographer adds a huge amount of keywords in the hopes of being indexed to match a large number of search parameters), then that image would be reduced in its credibility ranking, irrespective of what the photo's content actually depicted, or what site the photo came from. Being a bad actor in the economic game has penalties, and being a good actor has rewards.
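
As a minimal sketch of that "keyword pollution" penalty (my own construction, not any search engine's actual ranking; the keywords and detected concepts are hypothetical), an image whose supplied keywords wildly outnumber the concepts actually found in it loses credibility, which lowers its rank:

def credibility_score(supplied_keywords, detected_concepts):
    """Score in [0, 1]: fraction of supplied keywords supported by the image itself."""
    if not supplied_keywords:
        return 0.0
    supported = sum(1 for kw in supplied_keywords if kw in detected_concepts)
    return supported / len(supplied_keywords)

honest = credibility_score(
    ["woman", "office", "laptop"],
    {"woman", "office", "laptop", "window"})            # -> 1.0
polluted = credibility_score(
    ["woman", "beach", "car", "sunset", "dog", "money"],
    {"woman", "office", "laptop", "window"})             # -> about 0.17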

The Spoiler


So, what can disrupt this potential future? The elephant in the room that I haven't mentioned is copyright. User-generated content is copyrighted material, owned by its creators, and those creators have rights. A website's ability to sell content that visitors submit is restricted in ways that aren't easy for everyone to quickly understand, and navigating those restrictions requires skills in three disciplines: political, legal and socio-economic. I'll tackle all that and more in part three of this series. Stay tuned.


Wednesday, May 14, 2008

My take on The Orphan Works Act of 2008

If you're a pro photographer and haven't been hiding in a cave, you've probably heard about the Orphan Works Act (OWA), also known as H.R.5889 (the House version of the bill) and S.2913 (the Senate version). Both versions are currently in draft stages, and they are similar enough to discuss as a single document.

Even if you do live in a cave, you surely must have heard the screams from protesters about the bill echoing off your cavernous walls. In fact, a Google search yields more web pages advocating protests against the bill than actual content on the bill itself. These perspectives run the gamut from utter hysteria to a kinder, gentler kind of hysteria.

An example of the total hysteria -- which is always coupled with loads of misinformation -- is Mark Simon's blog posting, which you can read here. This is the piece that's been passed around to photographers and other artists everywhere: by email, internet forums, faxes, and word-of-mouth. It, and other emails like it, is responsible for dispensing untruths and rumors that have only confused people. Yet, as our culture dictates, if you got it in email, it must be true. (Hint to dumb people: whenever you read something peppered with lots of exclamation points, you are reading propaganda, and are also being lied to.)

A far more sound, balanced, and informed retort to Simon's piece can be found on Meredith Patterson's blog. Unfortunately, Meredith's post hasn't really made the rounds in photo circles.

In short, just about every objection I've read about the OWA has been rife with unsubstantiated statements about how photographers will lose their copyright protections, or that people will be able to use their images for free. Yet, at no time does anyone cite text from the bill that even hints at this possibility.

And though Meredith does a good job of dispensing with the most common misconceptions about the OWA, her post doesn't talk about the things that really matter to artists. So that's what I'd like to do.

To begin, I'd like to do what no one else who argues about this bill typically does: actually provide a link to the bill itself, so that those playing the home game can read along. I'll be citing text from it to illustrate the points that matter, so this reference might help:

http://www.thomas.gov/cgi-bin/query/z?c110:H.R.5889:

The bill, which is surprisingly short and easy enough to read (if you don't mind long lists of comma-separated items), is broken down into several sections, only one of which has real substance regarding the "uses and limitations" that are the source of everyone's consternation. I'll get to that very soon. But first, the summary: the OWA intends to provide certain protections for those who use copyrighted works in certain ways, so long as the original author of the work cannot be found. Hence, the work is an "orphan." If you need more background than that, you should do some independent research. A fantastic summary of the problem the bill intends to solve can be found here: http://www.copyright.gov/orphan/

Of all the objections you can find on the internet, if you exclude the unfounded and ridiculous (which is virtually everything), what's really left to discuss is the notion that publishers could potentially use a copyrighted work (like a photograph) "for free," so long as they claim they couldn't find the copyright holder. This has created the fear that major publishers and broadcast television stations will crawl the internet for photos, use them carte blanche, and never pay license fees.

This is the part of the bill's text that alludes to this very point:

Section 2(c)(1)(B)
An order requiring the infringer to pay ... compensation for the use of the infringed work may not be made ... if the infringer is a nonprofit educational institution, library, or archives, or a public broadcasting entity...


In short, the protesters are worried that non-profits, libraries and TV stations have free rein to steal photos at will. Then the fear-mongers take it one step further: that a user of the photo that isn't one of those entities may use legal maneuvering or other forms of masquerade to pose as one, so as to ultimately steal images for commercial use (a use which normally commands an even higher license fee that the photographer will have missed out on).

Fortunately, it's not so simple. And this is why it's important to read the text of the bill. As mentioned above, the meat of the bill that applies here is Section 2, which has three headings: (a) Definitions, (b) Conditions for Eligibility, and (c) Limitations on Remedies. The excerpt quoted above is from section (c), where it lists the entities that do not have to pay compensation if they use a work whose copyright holder later comes forward. But the mistake people are making is assuming these entities are automatically exempt. No, they're not. First, they must become eligible for exemption by satisfying part (b), which states that the user must have done a "Qualifying Search" to discover who the copyright holder is. And this is a rather arduous process, as you can read for yourself:

(A) REQUIREMENTS FOR QUALIFYING SEARCHES-
(i) IN GENERAL- For purposes of paragraph (1)(A)(i)(I), a search is qualifying if the infringer undertakes a diligent effort to locate the owner of the infringed copyright.

(ii) DETERMINATION OF DILIGENT EFFORT- In determining whether a search is diligent under this subparagraph, a court shall consider whether--

(I) the actions taken in performing that search are reasonable and appropriate under the facts relevant to that search, including whether the infringer took actions based on facts uncovered by the search itself;

(II) the infringer employed the applicable best practices maintained by the Register of Copyrights under subparagraph (B); and

(III) the infringer performed the search before using the work and at a time that was reasonably proximate to the commencement of the infringement.


(iii) LACK OF IDENTIFYING INFORMATION- The fact that a particular copy or phonorecord lacks identifying information pertaining to the owner of the infringed copyright is not sufficient to meet the conditions under paragraph (1)(A)(i)(I).


In other words, before anyone is eligible for limitations on damages, they must have done a search that complies with the methodology listed above, documented in such a way as to prove to a court that the user has complied with the Act. This makes "frivolously stealing an image and hiding behind the OWA" much less likely to be a problem. One would have to carefully weigh the cost of properly documenting a legally defensible "diligent search" against the cost of just licensing the photo in the first place. (Actually, there's more to it than this, and I'll come back to it soon.)

Of course, this also assumes that the photographer is known. And that might not be the case. Hence, the second concern is that because photos are passed around the internet like wind blowing sand in the desert, it's nearly impossible to really know where any given picture might have originated. Even honest publishers don't know whom to go to. So, could they also get away with using the photo for free? Perhaps, but they also have to assume risk: that someone would still come forward and file an infringement claim. Few want to take this risk, as I'll come back to later.

But, it's the requirement to do a "diligent search" that brings me to what I believe to be the best part of the OWA:
Section 3: DATABASE OF PICTORIAL, GRAPHIC, AND SCULPTURAL WORKS
This section states, "The Register of Copyrights shall undertake a certification process for the establishment of an electronic database to facilitate the search for pictorial, graphic, and sculptural works that are subject to copyright protection." Furthermore, the Copyright Office "shall make available to the public through the Internet a list of all electronic databases that are certified."

Read that closely: a certification process for the establishment of a database. This means that it isn't just the copyright office that has a database, but that many companies could build such a solution and apply for certification. Each would then offer services to the general public for finding copyright holders. For example, a service may provide the user with a form to upload a photo to the site, much like the way you upload photos to a photo-sharing site, and the user gets back a report detailing who the registered copyright owner of that photo is.

Sound like magic? Sound too good to be true? Sound familiar? I publicly proposed such an idea in this blog entry, written on January 21, 2008, after I had privately proposed it to the Copyright Office the prior year. I make no claim that it was my idea that made its way into the bill. I am only saying that, because of its similarity to my proposal, I am familiar with the ideas and intent behind it, and feel it does everyone a great deal of good.

My intent at the time had nothing to do with the OWA or anything like it, but rather to provide an infrastructure for verifying who owns a photo, for a variety of reasons. At the time, the topic du jour was the Creative Commons dilemma: any anonymous person could declare any image to be "free" by placing it under a Creative Commons license, with no registration, verification or authentication of any kind. I argued that this aspect of the CC had created a breeding ground (not to mention an incentive) for both sides (photographers and licensees alike) to game the system for their own profit. To avoid this problem, licensees need a way of verifying (at the very least) that a photo hasn't already been copyrighted. My idea of the certification process happened to address that problem, but it can easily address the OWA as well. (As you can see, it's part of it.)

One of the things I pointed out in my proposal, and which applies directly to why it's so great to see it in the OWA, is that the entire idea can be turned on like a light switch (well, in government time, that is). It could be live at nearly the same time the OWA is enacted, because both the database and the image matching/search technology already exist. Several firms, like PicScout and Idée Inc., use image-recognition algorithms right now: they start with a sample image, determine its "fingerprint" (that's the algorithm), and then find where else on the internet the photo exists. They do this by comparing the fingerprint against all the other fingerprints they've collected from the web pages their robots have been crawling for years. The farther and deeper the crawl goes into the web, the more matches are found. The clients of these companies are large stock agencies, who pay to find infringers of their works and then demand payments or damages.
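
For a sense of how such fingerprinting can work, here is a minimal sketch using a simple "difference hash," assuming the Pillow imaging library is installed. This is my own toy illustration, not PicScout's or Idée's actual algorithm, which would be far more robust to cropping, recompression, and editing:

from PIL import Image

def dhash(path, hash_size=8):
    """Reduce the image to grayscale, shrink it, and record whether each pixel
    is brighter than its neighbor to the right: a compact 64-bit fingerprint."""
    img = Image.open(path).convert("L").resize((hash_size + 1, hash_size))
    pixels = list(img.getdata())
    bits = 0
    for row in range(hash_size):
        for col in range(hash_size):
            left = pixels[row * (hash_size + 1) + col]
            right = pixels[row * (hash_size + 1) + col + 1]
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming_distance(a, b):
    """Number of differing bits; small distances suggest the same photo."""
    return bin(a ^ b).count("1")

# Usage (hypothetical files): distances of roughly 0-5 usually indicate a match.
# print(hamming_distance(dhash("original.jpg"), dhash("found_on_web.jpg")))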

If the OWA passes, each of these companies could simply process the Copyright Office's "library" of images (just as they processed the web), fingerprint them, and then do instant comparison analysis against any image a user submits. The only thing keeping that from happening today is access to the Copyright Office's database of images.

Other players in this could be Google, Yahoo and other search engines, because they already do all this as well; in fact, they do it faster and more thoroughly, for obvious reasons. They don't make it available to the public due to certain business and legal concerns that are beyond the scope of this article, but the OWA would alleviate those legal concerns. The doors would open to a truly public system nearly the same day the certification process became enabled.

Of course, the one thing this relies on is photographers actually registering their works with the Copyright Office. Not doing so has always been dim-witted, but once the OWA is enacted, there will be every incentive to do so. And now that you can register online, the process is even easier than the one-page form you used to have to fill out.

What's the net effect of all this on the photo licensing industry? As I wrote in my January 2008 blog, infringements themselves could become a thing of the past. While people could still "steal" images and publish them without the photographer's consent, they'd be taking a huge risk in doing so: if the photographer caught them (a highly likely event, given that media of all sorts is being digitized and indexed, and is therefore "findable"), the case in court is pretty cut and dried: "Your honor, all one needs to do is input the photo into the copyright office database and my name comes right up." How could a judge not find the infringer guilty? Better still, a court could determine that the search is so easy that not doing it implies willful infringement. By statute, "willful infringement" raises the ceiling on the damages a judge may award from $30,000 per infringement to $150,000. With that kind of risk and a sure-fire losing case in court, the number of infringements would drop considerably.

Another unexpected benefit of the copyright database: it might even generate sales. If an honest company finds your photo on a website somewhere, or it's been passed around in email, and they want to use it, they can simply use the database to find you and license the image legitimately. Today, they'd never know it was you.

Here's another benefit: it would be harder for someone to claim someone else's images as their own--a phenomenon that has already happened everywhere from major stock photo agencies to social-networking sites like Flickr. So long as the photo has been registered with the copyright office, a simple search will usually yield the correct owner. Though this is obviously not bullet-proof, it's far superior to what's available today.

True, there will always be "orphaned works" out there, much of them not on the internet. But the OWA's "diligent search" requirements are onerous enough that one doesn't want to mess frivolously with offline content either. After all, a work may not be online, but it may still have been registered with the copyright office, and if the promise of the online database holds true, these offline items may end up being found as well.

Once again, this works best when works have been registered. But what about those that haven't been? Does the OWA have sufficient teeth to address everyday people and their works, whether images, songs, or what-have-you? If a work is not registered, it won't turn up in the database search, making it much harder to legitimately find the copyright holder. There are those who say this alone makes stealing easy for publishers: it's easy to claim that there are tens of billions of photos online, and that finding the owner is like finding a needle in a haystack. But the court also knows that the OWA isn't there to protect people from litigation just because they didn't find that needle. The court is going to consider whether the publisher was looking in a haystack where nearly every straw probably has a known, current copyright holder. Judges look for "intent" by the parties, and it isn't going to be hard to see what's going on when such cases come before them.

Oh, and let's remember the pragmatic reality of how these things go in real life. If a company were dumb enough to try to hide behind the OWA and got sued by a copyright holder for infringement, the company's lawyer is going to do what every lawyer does: avoid litigation by trying to reach a settlement. Though it's sad when innocent companies get sued on baseless claims, they still know it's usually better to settle than to go to court. And those are the innocent companies. I'll bet you Bill Gates' next paycheck that a guilty party is even more eager to settle than to risk going to court and losing, which would not only make them ineligible for safe harbor (even if they are a nonprofit, library or PBS station), but would also mean the existing statutory damages apply. Such a settlement is virtually assured to cost far more than what they would have paid had they licensed the image legitimately in the first place. (A good lawyer will make sure of that!)

In the end, photographers are really not losing anything at all with the OWA, and I see no real cause for concern in any of the areas that have been getting all the hoopla. Granted, it's not a perfect bill, and I don't doubt there is language that needs cleaning up. Nor am I disputing the (currently unknown) possibility that the OWA might exacerbate infringements. But that doesn't mean they will necessarily be "successful" infringements. And even if there is an increase, it would be a short-term anomaly, quick to subside once people become aware that the OWA doesn't protect them the way they thought it would.

In my mind, the true golden nugget is Section 3 of the bill, where the public can access databases of registered works. This will have the greatest effect on providing disincentive for infringers of all types, even those that have nothing to do with the OWA.


Thursday, August 30, 2007

Radio Interview: the future of photography

On Thursday, Aug 30, I was interviewed on the morning radio talk show "Forum" on KQED FM (San Francisco).

The one-hour broadcast can be heard here: http://www.kqed.org/epArchive/R708301000

For those interested, this is a summary of the status of the photo industry today:

Over the past 15 years, the internet crawled its way into our collective culture, and with it, the technology behind digital photography has made it possible for everyday people to produce images that were once made only by professionals. The combination of photography and the internet has created not just a new wave of interest in the art form; the social-networking aspect has also given rise to new business opportunities. Just about anyone with a camera and an internet connection can engage in the business of photography and make money--and they are doing so in rising volumes. This can come in the form of consumers making and selling their own prints, self-publishing books or other merchandise, or, most commonly of all, licensing photos to third-party publishers who use them for everything from magazine and newspaper articles to product packaging and other marketing purposes.

Selling "photographs" as a business can be broken down into two basic forms: the service industry (wedding, portrait, event, staff and/or work-for-hire shooters), and the more general freelance photographer, who usually shoots first and sells the pictures later.

The overall stock photo industry has grown by orders of magnitude since the internet has reached more and more people. The largest growth of the photo sector has been in freelance and stock photography for these primary reasons:

1) technological advances in digital cameras have enabled more people to create professional images, and 2) the internet acts as a distribution channel for those who never had access to one before (i.e., through traditional stock agencies).

The effect has been a massive influx of both supply and demand, new buyers and sellers, new ways to compete, different sales and marketing models, and a fundamental change in the "culture."

Photography, once considered a difficult profession to break into, much less succeed in, is now being pursued casually by everyday people, who also happen to be making very good money at it.

SIZE OF MARKET

Many industry analysts currently believe the size of the photo market to be around $2B, based on traditional survey methods that pre-date the internet era. Thus, the data doesn't factor in non-traditional photo sources, such as consumers, semi-pros and pros who traditionally had not sold "stock" (but now do so because of the convenience of the internet). A detailed discussion of the total size of the stock photo market is here:

http://danheller.blogspot.com/2007/07/total-size-of-licensing-market.html

The size of the stock-photography market is more likely around $20B based on inference data like sales of pro-level cameras by the major manufacturers and statistics from sub-industry segments within the photo field that are not part of traditional surveys.

LACK OF INFRASTRUCTURE

Because the market is much bigger than anyone has yet prepared for, there is currently a lack of infrastructure to accommodate both the informal buyers and sellers, who are currently doing business through direct (one-to-one) contact, rather than through traditional sales channels, such as agencies.

This is much like the dormant market that was awaiting the emergence of Ebay: the assets were there, and the buyers and sellers were there, but these elements never came together until Ebay facilitated it. And even that required some time for people to "get it." Ebay's continued growth today shows that masses of people are still getting into the business of selling their otherwise unwanted junk.

Similarly, the massive number of photos being taken has "value," and there are many people who use these photos in informal ways (often "lifted" from the net without thought). What's missing is the infrastructure to find these assets and to facilitate transactions.

TWO EMERGING (TRANSITIONAL) SOLUTIONS

The most recent attempt to address that is the emergence of "microstock" agencies, who pitch themselves as being the agencies that anyone can (and does) join to sell their photos. However, they suffer from many problems, as discussed here:

http://danheller.blogspot.com/2007/03/myth-that-microstock-agencies-hurt.html

In short, microstock agencies evolved from an insiders' exchange program, without the intent of forming a broader business beyond the industry. Because they don't appeal to the consumer, buyers and sellers of photos outside the industry are largely unaware of these sites.

Ironically, the "photo-sharing" and social-networking sites that millions of consumers use every day are in the best position to provide that infrastructure, but they don't. The simple but unfortunate reason is a lack of awareness that latent demand for photography exists. This topic is discussed in full here:

http://danheller.blogspot.com/2007/02/future-of-photo-sharing-sites-and.html

This has placed the industry in an unusual state of transition, bifurcated between two models: the larger, traditional stock agencies (Getty, Corbis, et al.), which manage the more senior pro photographers and cater to larger media and advertising companies; and the erratic, organic grassroots players, such as photo-sharing sites and the websites of individual photographers. The "hole" between these two extremes is currently large enough to drive a truck through.

The biggest losers in this are emerging pro photographers, not consumers or existing pros. Consumers are happy to sit back and let things happen casually through the informal networks, and seasoned pros are already engaged with the existing larger agencies (though their future is in flux). The emerging pro has the hardest work ahead of him because no existing infrastructure works well for him in today's economic climate. There's no room in the top agencies, which are already imperiled by downsizing, and microstocks simply don't generate income sufficient for a pro. Photo-sharing social-network sites are fine for planting seeds, but they don't produce the short-term income a pro would need. Their only option, which has always been a good one anyway, is to build and evolve their own photo site. Yet this is not a simple task with today's tools (though it's easier than ever before), and it also takes considerable time to rise high enough in search engine results to yield sufficient returns.

The consumer is currently in the best position because he doesn't depend on immediate income from photography. People are finding and licensing photos through photo-sharing/social-networking sites in small doses, but enough to bring in income that isn't insignificant. Microsoft made news within the confines of the stock industry when it announced that several of the photos used in Windows Vista were obtained from everyday consumers through www.flickr.com, the largest of such photo-sharing social networks. Yet, because purchases are a small ratio of the total pool, the perception is that these are anomalies. Still, the growth has been as persistent as Ebay's was in its early days. Without a formal infrastructure for licensing these photos, however, the growth has been stymied, hence the perception of weak demand.

PARADIGM SHIFT

The misunderstanding of the photo market is largely due to the mundane nature of photography in the first place: everyone does it, and everyone assumes that only the "pros" really make any money at it. And, of course, the "pros" prefer it that way. Thus, most pro organizations promote the same model of the industry as a way of preserving their livelihoods. But this short-sightedness has not served their constituents well, causing an even greater decline of "pros" in the traditional definition, and escalating the growth of the non-pro photographer's economic activity. Membership in pro photo organizations such as ASMP (American Society of Media Photographers, the largest of the group) has been fairly static over the past 10-20 years. In 1999, ASMP had about 5000 members, whereas today there are about 5500. Yet, the number of pro-level cameras sold by all manufacturers has grown from about 2M units per year in 1999 to well over 100M units in 2006. If even one percent of those buyers were actually pros, that would be a million photographers, dwarfing those membership rolls; it certainly suggests that the organizations that propose to represent their interests are failing at the task.

A primer on the basics of the traditional photographer's viewpoint--and the counterpoint--is here:

Chapter 2: The Five Truisms of the Photography Business

In a nutshell, if there's money to be made, entrepreneurs will find a way to make it. As people discover ways of earning income from photography, the opportunities in the photo industry will become as obvious to everyone else as Ebay made the opportunity in selling second-hand junk.

FUTURE DIRECTIONS

The catalyst that will bring about change will come in the form of a familiar player: search. That is, what makes photos sell will be the same thing that accounts for why many other things sell on the net: the user finds them. Coming up first in search results is the holy grail for most businesses on the internet, and has proven to be a multi-billion dollar business for sites like Google. So it is (and will be more so) for photos.

Opportunity lies in the fact that traditional search engines play only a minor role in photo search (for the time being, at least) because searching image data is not the same as parsing language. Today's photo-search industry is in much the same condition as generic internet search was long before Google came onto the scene. Sure, it was there, but its usefulness was a crap shoot at best, and placement within search results was untrustworthy. A detailed discussion of this subject is here:

http://danheller.blogspot.com/2007/04/keywording-and-future-of-stock.html

This brings us to where innovators and entrepreneurs will eventually make a big shift in the industry towards the consumer (who is already there, whether he knows it or not). As search technology for images improves, and as photo-intensive sites realize that revenue is available from licensing, there will be a fusion (through consolidation) of traditional stock agencies with photo-sharing/networking sites, resulting in a type of Ebay-for-photographers model, but even easier. Rather than selling individual photos piecemeal, people will simply maintain a continuously updated supply (an activity they already perform on photo-sharing sites).

LEGAL SUPPORT

The supply is there, the demand is there, and the infrastructure is coming. What's left to support the premise? A legal infrastructure. Again, that's already there: copyrighted material (such as photos) is easy to protect, and is supported by substantial fines and accommodating courts. For discussion, see:

http://danheller.blogspot.com/2007/06/making-money-from-your-stolen-images.html

As "image search" becomes more routine in everyday search engines, the ability to find your images on other people's sites will be as easy as finding your own "text" on other sites. If it's this easy to monitor, and the fines are hefty, there's little incentive to steal images. This, in turn, spurs sales.

We have supply, demand, infrastructure, search, and legal recourse--all the elements necessary to sustain a viable economic model. The only thing needed is a triggering event, which will likely come as a byproduct of other internet-related tectonic plates shifting. The timeframe? About 2-4 years for the emergence of a truly viable business, and 4-6 years for it to properly gain the attention of the wider consumer public.

WHAT TO DO IN THE MEANTIME

So, what can people do in the interim? The answer is similar to that of the same question posed in 1995, well after the "hype" of the internet had spread, but well before there were mature web-development tools or infrastructure for internet commerce: land grab. It was clear that the future of the net would provide economic opportunities, but only the most advanced were going to capitalize on it well. People eventually made millions doing nothing more than registering popular domain names, like "jeans.com" or "laundry.com." For photography, it's less about the domain name than about the traffic.

And that's the name of the game for making money in the photo industry of tomorrow: take pictures, get them online, and publicize yourself through any means possible: post to photo-sharing sites and discussion groups, write a blog, and, of course, have your own website. The main "kicker" that draws traffic is being known for something. It doesn't have to be photography--it can be anything. If people cite you as a source of information, and your site also contains photos, then your photo assets' value piggy-backs on the success of your reputation.

As the infrastructure eventually emerges for recognizable name-brand companies to market and sell consumer-based imagery, your content (and even your "web property") becomes vastly more valuable. Knowing that "search" will be critical, you should keyword all your photo content well, and so on.
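
As a small, practical illustration of that keywording advice, here is a sketch that embeds IPTC keywords directly into an image file so they travel with it wherever it's copied; it assumes the exiftool command-line utility is installed, and the file name and keywords are hypothetical:

import subprocess

def add_keywords(path, keywords):
    """Append each keyword to the image's IPTC Keywords field via exiftool."""
    args = ["exiftool", "-overwrite_original"]
    args += [f"-keywords+={kw}" for kw in keywords]
    args.append(path)
    subprocess.run(args, check=True)

add_keywords("santorini_sunset.jpg", ["Greece", "Santorini", "sunset", "travel"])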

Camera "equipment" is irrelevant--all SLR cameras today can take perfectly good photos, and even point-n-shoot models are mostly good enough for generic consumer-related uses.

Are your pictures "good enough?" Who knows, but another misunderstanding is that only really good photographers make money. Not so--even the most mediocre pictures sell quite well. The most influential factor in a sale is the image being found in the first place.

Should you set up a formal business? Do you need to think about taxes and other things? It all depends on how seriously you intend to get into this business. Again, think about Ebay: if you sell a few things here and there, don't sweat it. If you're making real money, you may need to formalize your business.
