Source Code Review: Best Practices
“Few tasks excite a defendant less… Engineers and management howl at the notion of providing strangers, and especially a fierce competitor, access to the crown jewels. Counsel struggles to understand even exactly what code exists and exactly how it can be made available for reasonable inspection. All sorts of questions are immediately posed… Put simply, source code production is disruptive, expensive, and fraught with monumental opportunities to screw up.”
Apple Inc. v. Samsung Elecs. Co., №11–1846, 2012 U.S. Dist. LEXIS 62971, *10–11 (N.D. Cal. May 4, 2012) (ECF №898).
The last five years have seen an enormous change in the U.S. judicial climate as it relates to intellectual property, and patent litigation in particular. A direct consequence of the America Invents Act was an increased burden of proof for plaintiffs before or shortly after filing a lawsuit. The trend continued with Alice and similar decisions: the going has gotten tougher for patent holders, especially software patent holders. Today, an overwhelming portion of cases terminate at the PTAB, and even when they survive inter partes review (IPR), plaintiffs are on notice to drill their infringement contentions down to the physical implementation. As a result, while the number of software cases has decreased, the importance of a detailed source code review has increased in the software, telecom and other software-related cases that do survive the IPR.
Hosting and Conducting a Source Code Review
A source code review can be expensive on more than one dimension: time, cost and security. For a number of modern corporations such as Google, Uber and Facebook, the real business value lies in their source code. Any theft that lands crucial algorithms in the hands of a competitor can become an existential threat. Similarly, with an increasing amount of our identity (and finances) now online, a theft of source code also represents a general security risk that can in turn lead to theft of consumers' personal information. Outside counsel must therefore exercise even greater sensitivity and responsibility when source code needs to be produced or reviewed in a case. Additionally, clients reasonably want to cut litigation costs, presenting yet another constraint that attorneys need to balance during the process.
Beyond the security risk, source code presents unique intricacies that necessitate extra diligence:
- A single software product can undergo dozens of iterations (versions) before and after it is released.
- Specialized tools are required for reading and reviewing the code.
- It requires special security procedures for review and transport.
- Code is often a combination of open source, proprietary and third-party modules.
- It is highly interconnected: one file (or even one function within a file) cannot be analyzed independently of the others.
These differences raise a number of additional considerations that attorneys must take into account when hosting or reviewing code. Both parties must, at a minimum, agree on the following provisions, either in the protective order or through a separate meet and confer, ahead of the code production.
1. REPRESENTATIVE VERSIONS
As code journeys in time from inception to a full product (and versions thereof), it grows in size, complexity, components and number of authors. For most software, and especially enterprise software, these changes can result in terabytes of code — which is labor-intensive to review as well as to collect and host. It is not uncommon for software to contain dozens if not hundreds of versions — especially if any portions of the software are open source.
It is usually advantageous for both the plaintiff and the defendant to concur on specific products and versions of the code that will be produced for review. If the specific functionality has undergone extensive modifications over the product lifetime, the first version, the most recent released version and/or the version corresponding to the most popular accused product can be designated as representative versions for production.
For the plaintiff, narrowing down the size and scope of production translates directly to reduction in necessary effort and cost of review by experts and attorneys. It can even help simplify damages valuation.
For the defendant, having to produce fewer versions means a reduction in code collection costs — but more importantly a reduction in exposure of critical code assets to strangers.
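The selection rule described above (first version, most recent released version, and the version tied to the most popular accused product) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Release` record, its `units_sold` field as a stand-in for "most popular", and the sample data are all hypothetical, and the actual designation is a matter for negotiation between the parties.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Release:
    """One released version of an accused product (hypothetical record)."""
    version: str
    released: date
    units_sold: int  # assumed proxy for "most popular accused product"


def representative_versions(releases: list[Release]) -> list[Release]:
    """Return the first, most recent, and best-selling releases, without duplicates."""
    if not releases:
        return []
    by_date = sorted(releases, key=lambda r: r.released)
    candidates = [
        by_date[0],                                  # first version
        by_date[-1],                                 # most recent released version
        max(releases, key=lambda r: r.units_sold),   # most popular version
    ]
    picked: list[Release] = []
    for release in candidates:
        if release not in picked:  # one release may satisfy several criteria
            picked.append(release)
    return picked


# Made-up release history, for illustration only.
releases = [
    Release("1.0", date(2012, 3, 1), 10_000),
    Release("2.0", date(2013, 6, 15), 250_000),
    Release("2.1", date(2014, 1, 10), 40_000),
    Release("3.0", date(2015, 9, 30), 90_000),
]

for release in representative_versions(releases):
    print(release.version)
```

With this sample data, four versions reduce to three representative ones; in a real matter the input would come from the producing party's release records.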
2. CODE REVIEW TOOLS
Several tools exist that can help experts and attorneys review source code easily and quickly. Using industry-standard tools can not only reduce the cost of review but also help experts generate flowcharts and diagrams that can be used as exhibits. The following are the tools most commonly used by source code experts in the industry:
- SciTools Understand
An easy-to-use review platform for C/C++, Objective-C, Objective-C++, C#, FORTRAN, Java, JOVIAL, Delphi/Pascal, PL/M, VHDL, COBOL, PHP, JavaScript and Python. It also provides advanced diagramming and graphing capabilities.
- Eclipse SDK
Used most often as a development platform for Java applications, but also useful for reviewing production in other languages like Ada, ABAP, C, C++, COBOL, Fortran, Haskell, JavaScript, Julia, Lasso, Lua, NATURAL, Perl, PHP, Prolog, Python, R, Ruby, Rust, Scala, Clojure, Groovy, Scheme and Erlang.
- Microsoft Visual Studio
Used as a development platform for .NET applications and Windows software applications. Supports C, C++, VB.NET, C# and F#.
- Xcode
Used as a development platform for iOS and macOS software. Supports C, C++, Objective-C, Objective-C++, Java, AppleScript, Python and Ruby.
- NetBeans
Used primarily for Java development, but also useful for reviewing code written in web languages such as PHP and HTML5.
- Beyond Compare
Useful for showing side-by-side comparisons of file content. It is language-independent and used most often in copyright and trade secret cases.
- CodeSuite
Useful for trade secret and copyright cases where portions of source code may have been copied/modified by the alleged infringer. Offers advanced code abstraction and comparison analyses.
- WinGrep / PowerGREP
Useful for quickly searching file content for specific keywords.
- Notepad++
Useful for reading text files, unknown file types and formatted code files. Also useful for printing code files with line numbers for easy reference.
In addition to the specialized source code tools above, counsel should also consider whether generic software such as a word processor (MS Office or OpenOffice), a PDF reader/creator (Adobe Acrobat Reader, Print2PDF, etc.) or an archiving utility (WinRAR, 7-zip) should be requested, depending on the specifics of the production. Code review experts can determine whether any such tools are necessary from a quick reconnaissance of the production at the beginning of the review.
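To make the keyword-search step concrete, here is a minimal Python sketch of the kind of recursive search that tools such as WinGrep or PowerGREP perform. It is not how those products are implemented; the function name `search_tree`, the default extension list and the sample files are assumptions made for illustration. Reporting line numbers with each hit mirrors the practice of citing code by file and line.

```python
import re
import tempfile
from pathlib import Path


def search_tree(root, keywords, extensions=(".c", ".h", ".cpp", ".java")):
    """Yield (path, line_number, line_text) for each source line containing a keyword."""
    if not keywords:
        return
    # Escape keywords so they are matched literally, ignoring case.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        # errors="replace" keeps the search going on files with unusual encodings.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if pattern.search(line):
                    yield str(path), number, line.rstrip("\n")


# Demonstration on a throwaway directory with hypothetical file contents.
with tempfile.TemporaryDirectory() as root:
    Path(root, "crypto.c").write_text(
        "int x;\nvoid encrypt_payload(void);\n", encoding="utf-8"
    )
    # Not searched: .txt is outside the assumed list of source extensions.
    Path(root, "notes.txt").write_text("encrypt\n", encoding="utf-8")
    hits = list(search_tree(root, ["encrypt", "decrypt"]))

for path, number, text in hits:
    print(f"{Path(path).name}:{number}: {text}")
```

On a real production machine, the review tools agreed in the protective order would be used instead; the sketch only shows what a reviewer's keyword pass produces.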
3. ELECTRONIC DEVICES
Most protective orders entered nowadays prohibit bringing any external electronic devices into the room where a code production is hosted. Electronic devices such as smartphones, tablets, USB drives, portable disk drives and cameras pose a security risk for source code theft. Producing parties and their counsel therefore often have good reason to prohibit any such devices from being in close proximity to the source code.
There are, however, definite advantages to the receiving party if at least certain electronics are allowed into the same room. The code reviewers, for example, may want to bring their own laptops or be able to take phone calls during the review in order to complete it more efficiently. The receiving party’s counsel should work closely with the code reviewers to determine whether such an allowance, given the size and complexity of the production, would make a significant difference to the review costs.
Even when such an allowance is made, the producing party can take additional security measures to ensure the electronic devices are not used for information theft. For example:
- The producing party can itself provide a second desktop computer for taking notes. The second computer can be kept disconnected from the production computer.
- If the code reviewers are allowed to bring in their own laptops or phones, all cameras and ports on the devices can be sealed with tamper-evident tape.
- If the code is being produced on a desktop PC, the hardware cabinet can be secured inside a metal lockbox that does not allow a user to remove or connect any devices to the computer.
- Unused device ports and network connectivity on the production computer can and should be disabled by the IT admin.
- A monitor can be assigned to observe the code reviewers without directly viewing activity on the computer or their notes.
Given these measures, a mutually agreeable allowance can be achieved that reduces costs for the receiving party while ensuring reduced exposure risks for the producing party.
4. VOLUME OF PRINTOUTS
A limit on the number of printouts that the receiving party can request from the code review can be a challenging restriction for the code reviewer. Counsel should deliberate on the appropriate limits, taking into account the size of the accused products, the scope of the claims and the particular programming languages used in the production. For example, code in languages like Objective-C, C# and C/C++ often sits in large files, so a limit on printing consecutive pages (such as 10–15 consecutive pages) can place an undue burden on the receiving party. Java files, on the other hand, tend to be individually smaller but more numerous per functionality than in other languages. Hence for Java, a limit on the number of files can be a daunting challenge for the reviewers.
Counsel should consult the code review experts to arrive at limits that do not place an undue burden on either party while still containing the logistics costs and exposure risks associated with any code production.
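The effect of a consecutive-page limit can be estimated before printouts are requested. The sketch below is a rough illustration, not a rule from any protective order: the 15-page consecutive limit echoes the 10–15 page example above, while the 60 lines per page, the 500-page total cap, the `PrintRequest` record and the file names are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PrintRequest:
    """A contiguous range of lines requested from one file (hypothetical record)."""
    file: str
    first_line: int
    last_line: int  # inclusive


def pages_needed(request: PrintRequest, lines_per_page: int = 60) -> int:
    """Pages required to print the requested range at an assumed page density."""
    line_count = request.last_line - request.first_line + 1
    return math.ceil(line_count / lines_per_page)


def check_requests(requests, max_consecutive_pages=15, max_total_pages=500):
    """Return (total_pages, violations) for a batch of print requests."""
    total = 0
    violations = []
    for request in requests:
        pages = pages_needed(request)
        total += pages
        if pages > max_consecutive_pages:
            violations.append(
                f"{request.file}: {pages} consecutive pages exceeds the limit of "
                f"{max_consecutive_pages}"
            )
    if total > max_total_pages:
        violations.append(f"total of {total} pages exceeds the limit of {max_total_pages}")
    return total, violations


# Made-up requests: one long Objective-C file and one short Java excerpt.
requests = [
    PrintRequest("Parser.m", 1, 1200),
    PrintRequest("Session.java", 10, 130),
]
total, violations = check_requests(requests)
print(total)
for violation in violations:
    print(violation)
```

Running the same check with the limits proposed by each side gives counsel and the reviewers a concrete basis for negotiating the printing provisions.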
5. DEDICATED SOURCE CODE EXPERTS
Source code review is not only a highly critical component of e-Discovery, but also requires a niche expertise. A good technology expert witness should be able to analyze source code and point out specific excerpts to support infringement.
However, schedule, budget and expertise constraints often make it necessary to retain dedicated code reviewers who bring in specific expertise and take on the bulk of the expert witness’ workload.
Adding more people to the effort carries with it extra cost and time, of course. Therefore, in choosing the right code reviewers, counsel should consider the reviewers’ technology expertise, credentials, physical location and knowledge of prevalent programming platforms and languages.
A reviewer with the right combination of skills and experience can cover the review at a fraction of the cost, and the more experienced ones can bring down the overall cost of review by almost 50% by streamlining the discovery process.
Purva works with leading counsel and corporations on identifying licensing opportunities to sharply realize business value. She specializes in analyzing patents and products in the consumer electronics, software and telecommunications industries.
For over 8 years, she has been working with small businesses and large corporations to handle multi-million dollar projects in software, telecommunications, media and travel-related websites.