This paper has been written partially in response to recent ruminations by Microsoft about their new or newly emphasized source code sharing initiatives. I discuss four strategies for proprietary source code distribution, including a brief Unix history lesson, and a recommendation for legislative action.
It may be the case that a company wishes to share its code in such a way that it can be looked at but not used. These strategies are only useful for presenting a pretense of sharing, or for augmenting technical documentation. There are a variety of ways a company can do this. First of all, they can offer an incomplete system that is not capable of regenerating itself. Parts of it may build, but they may do so in such a way that they cannot actually be executed. For example, this is what Digital Equipment Corporation did with some versions of Ultrix. Clearly, if you can't rebuild your system, you're not going to fix any bugs or make any non-trivial improvements. You can only peer into a perfect world encased in glass (or plastic), and perhaps turn it upside down and watch the snow fall.
Even easier than using technical measures to prevent useful modification of the source code, and perhaps also more useful, is the use of contractual provisions. This can be done by denying (or not explicitly granting) permission to create derivative works of the licensed source code. While I don't know the terms of Microsoft's source code licenses, I assume that they are not using a snow globe strategy. If they were, their rhetoric of sharing and gaining the advantages of Open Source would be utterly indefensible. However, it is really impossible to evaluate their claims without seeing their licensing terms. The free software community freely publishes its licensing agreements. I am not aware whether Microsoft has done the same.
Shade 2: Paternalism
The second choice is a much more reasonable one. Under the paternalism scheme, the licensed user is granted the technical means and the legal permission to modify the software and regenerate a full, operable system. This is probably a bare minimum for any kind of advanced research work. The system is paternalistic if the user is denied the right to share the results of their research with others. This denial includes a prohibition on the distribution of patches and technically detailed information on the system to others having the same license. In this case, since you cannot share patches with your peers, you have scant recourse if the proprietor does not accept them. Subsequent releases from the vendor may break your modifications and you might spend all your time just keeping them up to date. Larger customers will obviously fare better, but can any single customer be powerful enough to swim against a strategic objective? Paternalism is good for a proprietor who might want to get a few ideas from its users now and then, or who would like to coordinate research and simply obtain a large unpaid workforce. On the other hand, the total lack of control acts as a disincentive for a user to invest heavily in adding new features, particularly features that the proprietor is not interested in supporting. The inability to share effectively bars the users from providing any meaningful community support.
Shades 1 and 2 put the user under the complete domination of the intellectual property owner. The evolution of the product is determined purely by the economic impact that particular improvements have on the proprietor. In fact, if the users are restricted from sharing patches, it probably means the users cannot even discuss their innovations with anybody but the proprietor. The system would break down if you could discuss your work in detail sufficient that it could be reproduced. As the reproduction of experimental results is one of the foundations of science, this means that under the snow globe or paternalism regime, science is impossible.
This issue is of primary importance in academia. Computer scientists need to think this through before signing any deals. Perhaps more importantly, high school students should thoroughly investigate what kinds of secrecy agreements they will be bound to before deciding on what school to attend.
Shade 3: Gated Communities
Under this regime, the user gets a full system as in Shade 2, but can share their enhancements with other licensees. Thus with enough effort they can fork the code, but such a fork has a dependency. Even if their patches are freely redistributable to everybody, patches alone are useless without a license for the base system. It has been stated elsewhere, but it is worth reiterating that this is the model under which Unix was developed for most of its history. It allows a limited but significant escape from an ineffectual proprietor. Virtual memory, the fast file system, networking, IPC, RPC and NFS, to name a few, were all first added to Unix as part of a dependent fork. Of course, if the proprietor is doing its job, there will probably be no need to fork. There would have been no BSD if AT&T had kept up with the changes coming from the Unix community. Berkeley effectively became a publisher working on behalf of AT&T. They sold a lot of copies of Unix that wouldn't have been sold otherwise. If AT&T had managed Unix well and been competitive on pricing, there might have been no Linux or FreeBSD. Notice also that the really serious forking problems did not occur until AT&T entered the commercial Unix market in 1984. Who do many believe was the worst offender in terms of incompatibility? That's right, Microsoft XENIX.
AT&T had very liberal terms in its Unix source licenses. They explicitly granted permission to create derivative works, and then claimed no ownership in stuff added by users that was not originally the property of AT&T. In fact, AT&T's educational software agreements said that if you make a modification and make your work available, you must share it with any other licensee who wants it for no more than the cost of distribution. In effect, the Unix educational license agreement was precisely a license to fork.
What is the risk of this strategy? First bear in mind that it succeeded for AT&T for 20 years. The only risk is that you must actually do the job of providing software that people want at a reasonable price. Economically speaking, this is a role that free markets and profit maximizing incentives are supposed to play. In principle, rational software firms will provide exactly the features that users are willing to pay for. In practice this is not always the case. Permission to fork cures some of the market failures that make software different from other economic goods. You can't sit back and use market distorting techniques such as predatory file format and interface changes. You can't exploit your network effects and barriers to entry by intentionally breaking interoperability. The only thing Microsoft needs trade secrets for is monopoly maintenance.
If you are slow to support your users and raise your licensing fees so high that you price your most valuable contributors out of the market, you stand the chance of getting killed by free clones, which is what happened to Unix. But the fact is, most users don't want to spend their days cloning software merely on principle. However, if your users revolt, you'd better hope you have mentally contaminated the developer community enough to keep them in their place.
Throughout the 1980s, the Unix source code was so widely available that virtually anybody studying operating systems could get access to it if they wanted to. Richard Stallman never looked at the Unix source code. He was trying to clone it, and he was justifiably afraid of getting contaminated by trade secrets. On the other hand, how do you claim that something that has been seen and talked about by a generation of operating systems researchers is a secret? Should someone be allowed thus to own the minds of a generation of computer scientists?
Believe it or not, this is actually still a major open question in American law.
A trade secret is defined to be something that is not generally known. How can a widely distributed work of source code be said to be a secret? In 1978, the National Commission on New Technological Uses of Copyrighted Works, established by Congress to study the impact of new technologies on copyright law, thought the answer was clear. "Because secrecy is paramount, [trade secrecy] is inappropriate for protecting works that contain the secret and are designed to be widely distributed." CONTU Final Report at 43. Courts have yet to face this issue head on, but there is some reason to think they might get it wrong. However, are Microsoft's investors prepared to take the risk that the courts don't get it wrong, resulting in Microsoft thereby inadvertently giving away "the very thing they produced that was of greatest value"?
The courts almost decided this issue in 1992 when Unix System Laboratories sued Berkeley Software Design, Inc. over BSDI's commercial Unix product based on the University of California at Berkeley's Net/2 release. After a ruling calling into question the Unix copyrights, USL apparently did not want to press the trade secret issue and risk losing their sole asset. The case settled, leading to the 4.4BSD-Lite release, which became the basis for BSDI, NetBSD, and FreeBSD.
The heart of the issue is that a software vendor must pick a strategy. Are you going to contribute broadly to the education of the world's computer scientists, as AT&T did with Unix? Are you going to promulgate standards with the aim that they be universally adopted? Or are you going to hoard your knowledge by sharing it only with a select few, under strict contractual provisions? I believe the law as it stands today requires that you make this choice -- you cannot have it both ways.
Shade 4: No More Secrets
One solution to the gated community problem is a return to the principles of copyright law. A book, film, or song is an expressive work. The economic value to the proprietor and the social value to the public of such a work lies precisely in that expression. Software is unique among copyrightable works in that its value is predominantly functional rather than expressive. In fact, software can be protected by copyright even when it is obfuscated and compiled to binary machine code. A work thus intentionally rendered incomprehensible (or nearly so) is nevertheless protectable under a statute that defines a computer program as a literary work (yes, this is the same statute that defines a "useful article" as a boat hull). It is secrets, not copyrights, that threaten the user community. Congress or the courts must resolve this question by holding that trade secrets cannot be applicable to source code or interoperability specifications with an extremely wide degree of distribution. Note that this change would have no effect on the current marketability of binary only software where the vendor retains the source code as an "unpublished work." It would only set a minimum standard: if you wish to educate the greater public and gain true benefits from that learning, you may not do so in such a way as to create a fiefdom.
Copyright violations, source code or binary, are still illegal in the "digital millennium," and the vast majority of the public has proven to be more than willing to pay a reasonable price for good software. Some companies may well choose to voluntarily abandon the trade secrets, while retaining copyrights, on some of their programs. This is a step short of open source or free software. Under a pure copyright regime, the proprietor can control the number of users of the software, block derivative works, and block the distribution of full systems. In all likelihood, copyright protection alone will prove insufficient to block the distribution of patches. Ideally, the software proprietor would grant rights to create and distribute derivative works to other licensees. Thus, this regime has the positive characteristics of a gated community, it maintains a proprietary software model with sufficient incentives to create, and it avoids the spectre of trade secret contamination.