SSRN 2844964
Alan C. Marco
U.S. Patent and Trademark Office
Joshua D. Sarnoff
DePaul University
Charles A. deGrazia
U.S. Patent and Trademark Office (ADDX Corp.) and Royal Holloway,
University of London
The views expressed are those of the individual authors and do not necessarily reflect official positions
of the Office of Chief Economist or the U. S. Patent and Trademark Office. USPTO Economic Working
Papers are preliminary research being shared in a timely manner with the public in order to stimulate
discussion, scholarly debate, and critical comment.
For more information about the USPTO’s Office of Chief Economist, visit www.uspto.gov/economics.
The USPTO’s Patent Claims Research Dataset will be made available at www.uspto.gov/economics.
Electronic copy available at: https://ssrn.com/abstract=2844964
Patent Claims and Patent Scope
Alan C. Marco
U.S. Patent and Trademark Office
Joshua D. Sarnoff
DePaul University
Charles A. deGrazia
U.S. Patent and Trademark Office (ADDX Corporation) and Royal Holloway,
University of London
Abstract
Patent scope is one of the important aspects in the debates over “patent quality.”
The purported decrease in patent quality over the past decade or two has
supposedly led to granting patents of increased breadth (or “overly broad”
patents), decreased clarity, and questionable validity. Such patents allegedly
diminish the incentives for innovation due to increased licensing and litigation
costs. However, these debates often occur without well-defined measurements of
patent scope. This paper explores two very simple metrics for measuring patent
scope based on claim language: independent claim length and independent claim
count. We validate these measures by showing that they have explanatory power
for several correlates of patent scope used in the literature: patent maintenance
payments, forward citations, the breadth of patent classes, and novelty. Using
these data, we provide the first large-scale analysis of patent scope changes during
the examination process. Our results show that narrower claims at publication are
associated with a higher probability of grant and a shorter examination process
than broader claims. Further, we find that the examination process tends to narrow
the scope of patent claims in terms of both claim length and claim count, and that
the changes are more significant when the duration of examination is longer.
The USPTO’s Patent Claims Research Dataset will soon be made available at
www.uspto.gov/economics.
1 Introduction
For many years, debates over the effectiveness of the patent system have
focused on the central issue of “patent quality.” In 2002, then-former Assistant
Secretary of Commerce and Commissioner of the U.S. Patent and Trademark
Office (PTO) Gerald Mossinghoff noted a “real concern that with the dramatic
increase in the number of patent applications filed and patents granted - and with
the influx of new and unavoidably inexperienced examiners hired to handle the
workload - compromises to patent quality may be inevitable.”1 This increasing
number of patents of purportedly diminishing quality supposedly led to dramatic
increases in assertion of patents to extract rents through licensing and litigation,
particularly by non-practicing entities (NPEs).2 In turn, the purported decrease in
patent quality supposedly led to diminished innovation due to increased licensing
and litigation costs as well as to reduced sequential innovation in various
industries, particularly in regard to software patents.3
1 Gerald J. Mossinghoff & Vivian S. Kuo, Post-Grant Review of Patents: Enhancing the Quality of the Fuel
of Interest, 43 IDEA 83, 83 (2002).
2 See, e.g., Patent Quality Improvement: Hearing Before the Subcomm. on Courts, the Internet, and
Intellectual Property of the H. Comm. on the Judiciary, 109th Cong. 18 (2005) (statement of Richard J.
Lutton, Jr., Chief Patent Counsel, Apple) (“The current patent system has given rise to too many low quality
patents being issued, and a growing pattern of assertions of weak patents that threaten to damage productive
companies and stifle innovation.”); David L. Schwartz & Jay P. Kesan, Analyzing the Role of Non-Practicing
Entities in the Patent System, 99 CORNELL L. REV. 425, 429 (2014) (“Under this narrative, NPEs assert
marginal patents and read patent claims unreasonably expansively. Under any reasonable view, the patents
are likely invalid or not infringed by the NPEs’ targets. NPEs, who themselves do not innovate or introduce
any products into the marketplace, merely extract rents from the large, innovative companies that they sue.
They create fear of holdup by selecting venues where injunctive relief is available such as the International
Trade Commission. They seek and accept ‘nuisance’ settlement amounts, far below the cost of litigation, so
that the NPEs’ targets have no incentive to defend in costly litigation.”). But see John R. Allison & Ronald J.
Mann, The Disputed Quality of Software Patents, 85 WASH. U. L. REV. 297, 298 (2007) (“In general, we find
that patents the computer technology firms obtain on software inventions have more prior art references,
claims, and forward citations than the patents that the same firms obtain on nonsoftware inventions. We also
find that the patents that ‘pure’ software firms (those producing only software) obtain on software inventions
have more prior art references, claims, and forward citations than the software patents obtained by the firms
that derive revenues from other product lines.”).
3 See, e.g., Arti K. Rai, Improving (Software) Patent Quality Through The Administrative Process, 51 HOUS.
L. REV. 503, 505 (2013) (“Low-quality software patents … generate the usual negative static effects, in the
form of either unnecessary licensing fees or deadweight loss. They also generate deleterious dynamic effects,
as firms in the information and communications technology industries must accumulate large defensive
arsenals in order to avoid being sued.”). Cf. Jay Pil Choi & Heiko Gerlach, Patent Pools, Litigation, and
Innovation, 46 RAND. J. ECON. 499, 499 (2015) (“If patents are sufficiently weak, patent pools with
complementary patents reduce social welfare as they charge higher licensing fees and chill subsequent
innovation incentives.”).
Patent claim scope and claim clarity have been identified as significant
concerns for patent quality.4 Even the basic approach to determining claim
meaning has been called into question.5 Software patents in particular have been
criticized for having unduly broad and/or unclear claims employing functional
claiming language.6 Thus, in 2003, a Federal Trade Commission (FTC) Report
summarized hearings where “[m]any panelists and participants expressed the
view that software and Internet patents are impeding innovation.”7 And in 2004,
the authors of the most comprehensive litigation study of patents to that date
noted – after determining that litigated (and by hypothesis more valuable)
individual patents experienced significantly longer and more complex
prosecutions at the PTO – that this “could suggest that the much-maligned PTO is
doing a better job than expected in evaluating the patents that really matter, or it
could mean that patent examiners are buried in paper by those critical
applications.”8
4 See, e.g., Rai, supra note 3, at 512 (identifying as concerns raised by commentators “the grant of patents
with scope that exceeds their level of disclosure; and the grant of patents with unclear claim language that
fails to provide adequate notice”); Dargaye Churnet, Patent Claims Revisited, 11 NW. J. TECH. & INT. PROP.
501, 502 (2013) (“Reading an entire patent application and gaining a thorough understanding of the claims
may take weeks. Patent examiners, however, are expected to do so in less than 24 hours. It is no wonder,
then, that many have questioned the quality of patents the PTO has issued.”); Lee Petherbridge, On
Addressing Patent Quality, 158 U. PA. L. REV. PENNUMBRA 13 (2009) (Polk Wagner “thoroughly and
dispassionately identifies and examines the incentives that patent applicants and the Patent Office have to
draft and issue, respectively, large quantities of patents with opaque disclosures and indeterminate claims.”)
(citing R. Polk Wagner, Understanding Patent Quality Mechanism, 157 U. Pa. L. Rev. 2135 (2009)): cf. John
P. Zimmer, To Infinity and Beyond: The Problem of Open-Ended Claim Language in the Unpredictable Arts,
59 S. Car. L. Rev. 865, 867-68 (2008) (arguing for a more stringent enablement examination approach to
open-ended claims having uncertain scope at one end of a range of claimed values).
5 See, e.g., Dan L. Burk & Mark A. Lemley, Fence Posts or Sign Posts? Rethinking Patent Claim
Construction, 157 U. PA. L. REV. 1743, 1745 (2009) (“Claim construction is sufficiently uncertain that many
parties don’t settle a case until after the court has construed the claims, because there is no baseline for
agreement on what the patent might possibly cover. Even after claim construction, the meaning of the claims
remains uncertain, not only because of the very real prospect of reversal on appeal but also because lawyers
immediately begin fighting about the meaning of the words used to construe the words of the claims.”).
6 See, e.g., U.S. Government Accountability Office (GAO), Report to Congressional Committees, Intellectual
Property: Assessing Factors that Affect Patent Infringement Litigation Could Help Improve Patent Quality,
GAO-13-465, at 28-30 (Aug. 2013), available at
http://www.uspto.gov/sites/default/files/aia_implementation/GAO-12-465_Final_Report_on_Patent_Litigation.pdf
(last visited Nov. 3, 2015) (citing views of various
stakeholders). See generally Mark A. Lemley, Software Patents and the Return of Functional Claiming,
2013 WISC. L. REV. 905.
7 Federal Trade Commission, To Promote Innovation: The Proper Balance of Competition and Patent Law
9 Christi J. Guerrini, Defining Patent Quality, 82 FORDHAM L. REV. 3091, 3096 (2014). See, e.g., James E.
Malackowski & Jonathan A. Barney, What is Patent Quality? A Merchant Banc’s Perspective, 43 LES
NOUVELLES 123, 124-28 (2008) (distinguishing low quality resulting from examination – validity – errors
from low quality resulting from low standards of patentability).
10 See Guerrini, supra note 9, at 3098 & n.27, 3099 & nn.28-30 (citing numerous sources).
11 See 35 U.S.C. §§ 120, 121 (2014); 37 C.F.R. § 1.78(a) (2014).
12 See American Inventors Protection Act of 1999, § 4403 (Title IV of the Intellectual Property and
Communications Omnibus Reform Act of 1999 (S. 1948)), as enacted by Pub. L. No. 106-113, § 1000(a)(9),
Division B, 113 Stat. 1501 (1999); 37 C.F.R. § 1.114 (2014).
13 See, e.g., Bruce A. Kaser, Patent Application Recycling: How Continuations Impact Patent Quality &
What The USPTO Is Doing About It, 88 J. PAT. & TRADEMARK OFF. SOC’Y 426, 427-35 (2006). See
generally Cecil D. Quillen Jr. & Ogden H. Webster, Continuing Patent Applications and Performance of the
U.S. Patent and Trademark Office, 11 FED. CIR. B.J. 1 (2001); Cecil D. Quillen Jr. & Ogden H. Webster,
Continuing Patent Applications and Performance of the U.S. Patent and Trademark Office - Extended, 12
FED. CIR. B.J. 35 (2002); Cecil D. Quillen Jr. & Ogden H. Webster, Continuing Patent Applications and
Performance of the U.S. Patent and Trademark Office - Updated, 15 FED. CIR. B.J. 635 (2005-2006); Cecil
D. Quillen Jr. & Ogden H. Webster, Continuing Patent Applications and Performance of the U.S. Patent and
Trademark Office – One More Time, 18 FED. CIR. B.J. 379, 387-94 (2008-2009); Christopher A. Cotropia,
Cecil D. Quillen Jr. & Ogden H. Webster, Patent Applications and the Performance of the U.S. Patent and
Trademark Office, 23 FED. CIR. B.J. 179 (2013).
14 See generally Mark A. Lemley, Rational Ignorance at the Patent Office, 95 NW. U. L. REV. 1495 (2001).
The PTO itself has expressed concerns about patent quality, and has adopted
many other initiatives over the last decade to address patent quality concerns.17
In early 2015, the PTO adopted an “Enhanced Patent Quality Initiative,”
supervised by a new Deputy Commissioner for Patent Quality, which focuses on
three “pillars” of “excellence”: (1) work quality; (2) measuring patent quality; and
(3) customer service.18 Some of the measures being considered by the PTO
include metrics of quality that go far beyond assessing validity of final
determinations, such as processing time, correctness of intermediate actions, and
(particularly) assuring clarity of claims and of other aspects of the prosecution
record.19 As noted by the PTO in 2011, its “previous focus on the correctness of
actions taken by an examiner in an individual application has been widened to
better encompass the entirety of the patent application and examination
process.”20 And as noted by the PTO in 2015:
15 See USPTO, Changes to Practice for Continuing Applications, Requests for Continued Examination
Practice, and Applications Containing Patentably Indistinct Claims, 71 Fed. Reg. 48, 58-61 (Jan. 3, 2006)
(proposing restrictions by amending Rules 78 and 114); USPTO, Changes To Practice for Continued
Examination Filings, Patent Applications Containing Patentably Indistinct Claims, and Examination of
Claims in Patent Applications, 72 Fed. Reg. 46716, 46837-41 (Aug. 21, 2007) (adopting amendments to
Rules 78 and 114).
16 See USPTO, Changes to Practice for Continued Examination Filings, Patent Applications Containing
Patentably Indistinct Claims, and Examination of Claims in Patent Applications, 74 Fed. Reg. 52686, 52689-
91 (Oct. 14, 2009) (withdrawing amendments to rules 78 and 114); Tafas v. Kappos, 586 F.3d 1369, 1370
(Fed. Cir. 2009) (en banc) (dismissing appeal without vacating District Court judgment); Tafas v. Doll, 328
Fed.Appx. 658 (Fed. Cir. 2009) (en banc) (granting en banc rehearing and vacating panel opinion); Tafas v.
Doll, 559 F.3d 1345, 1359-63 (Fed. Cir. 2009) (panel decision invalidating PTO rule restricting continuation
filings, but upholding PTO rule restricting RCEs); Tafas v. Doll, 541 F. Supp. 2d 805, 814-16 (E.D. Va.
2008) (District Court judgment invalidating amendments restricting both continuations and RCEs).
17 See, e.g., USPTO, 2010-2015 Strategic Plan (2010), available at http://
5, 2015) (hereinafter “PTO Request for Comments 2015”). See also Guerrini, supra note 9, at 3099 & nn.31-
32 (citing various PTO documents).
20 USPTO, Adoption of Metrics for the Enhancement of Patent Quality Fiscal Year 2011, at
Litigation and Patent Quality? A (Partial) Defense of the Most Litigated Patents, 16 STAN. TECH. L. REV. 313
(2013); John R. Allison, Mark A. Lemley & Joshua Walker, Patent Quality and Settlement Among Repeat
Patent Litigants, 99 GEO. L.J. 677 (2011); John R. Allison, Mark A. Lemley & Joshua Walker, Extreme Value
or Trolls on Top? The Characteristics of the Most-Litigated Patents, 158 U. PA. L. REV. 1 (2009); Stuart J.H.
Graham, et al., Post-Issue Patent “Quality Control”: A Comparative Study of US Patent Re-Examinations and
European Patent Oppositions, NBER Working Paper 8807, at 1-5 (2002), available at
http://www.nber.org/papers/w8807 (last visited Nov. 3, 2015).
23 See Michael D. Frakes & Melissa F. Wasserman, Does the U.S. Patent and Trademark Office Grant Too
Many Bad Patents?: Evidence From A Quasi-Experiment, 67 STAN. L. REV. 613 (2015).
24 Michael D. Frakes & Melissa F. Wasserman, Does Agency Funding Affect Decisionmaking?: An Empirical
25 See Pierre Régibeau and Katharine Rockett, Innovation Cycles and Learning at the Patent Office: Does the
Early Patent Get the Delay?, 58 J. INDUS. ECON. 222, 222-24 (2010).
26 See, e.g., Stefano Comino & Clara Graziano, How Many Patents Does It Take to Signal Innovation
Quality?, 43 INT’L. J. INDUS. ORG. 66, 66-69 (2015) (positing that “true innovators” are forced to patent more
intensively in the presence of “bad patents”).
27 See, e.g., Florian Schuett, Patent Quality and Incentives at the Patent Office, 44 RAND J. Econ. 313
(2013).
28 See, e.g., Bernard Caillaud & Anne Duchêne, Patent Office in Innovation Policy: Nobody’s Perfect, 29
(2010).
30 See, e.g., Mark A. Lemley & Bhaven Sampat, Is the Patent Office a Rubber Stamp? 58 EMORY L.J. 181,
182-83 (2008).
31 Note that the coding of actions in PAIR does not itself distinguish claim amendments from other
application amendments (such as changes in the specification), although it is possible to read the associated,
scanned documents to distinguish these types of amendments.
32 See, e.g., Petherbridge, supra note 4, at 16-18 (discussing various critiques focused on litigation); Alan
Marco et al., U.S. Patent and Trademark Office, Patent Litigation and USPTO Trials: Implications for Patent
Examination Quality 7-9 (January 2015) (discussing various studies of the relationship of patent quality to
litigation and post-grant administrative reviews).
We are able to observe the claim language for applications at the date of
their pre-grant publication (“PGPub”) and for granted patents at the date of
issuance. We also calculate changes in ICL and ICC from publication to grant for
each published application resulting in a grant (“publication-grant pair”).
Moreover, we make the underlying claims data available for public use in order to
stimulate more research in the area.35 We validate ICL and ICC as measures of
scope by testing the explanatory power with respect to several patent scope
correlates from the previous literature: patent maintenance fee payments, forward
citations, the number of technology classes to which the patent was assigned, and
patent novelty as defined by Fleming (2001) and Strumsky et al (2012).
This paper presents the first large-scale analysis of patent application and
granted patent scope changes during the examination process. Our results reveal
several interesting features about the patent examination process. First, we find
that applications with narrower claims (in terms of ICL) are more likely to be
33 See, e.g., Shawn P. Miller, “Fuzzy” Software Patent Boundaries and High Claim Construction Reversal
Rates, 17 STAN. TECH. L. REV. 809 (2015); J. Jonas Anderson & Peter S. Menell, Informal Deference: A
Historical, Empirical, and Normative Analysis of Patent Claim Construction, 108 NW. U. L. REV. 1 (2014);
Thomas W. Krause & Heather F. Auyang, What Reversals and Close Cases Reveal About Claim
Construction: The Sequel, 13 J. MARSHALL REV. INTELL. PROP. L. 525 (2014); Thomas W. Krause & Heather
F. Auyang, What Close Cases and Reversals Reveal About Claim Construction, 12 J. MARSHALL REV.
INTELL. PROP. L. 583 (2013); R. Polk Wagner & Lee Petherbridge, Did Phillips Change Anything? Analysis
of the Federal Circuit’s Claim Construction Jurisprudence, in INTELLECTUAL PROPERTY AND THE COMMON
LAW (S. Balganesh ed. Cambridge U. Press 2011); David L. Schwartz, Practice Makes Perfect?: An
Empirical Study of Claim Construction Reversal Rates in Patent Cases, 107 MICH. L. REV. 223 (2008);
Kimberley A. Moore, Markman Eight Years Later: Is Claim Construction More Predictable?, 9 LEWIS &
CLARK L. REV. 231 (2005); Joseph Scott Miller & James A. Hilsenteger, The Proven Key: Roles and Rules
for Dictionaries and the Patent Office and the Courts, 54 AM. U. L. REV. 829 (2005).
34 We also considered alternative measures for ICL including the average independent claim length and the
length of the first independent claim. The results are largely insensitive to the definition of ICL.
35 The USPTO’s Patent Claims Research Dataset will soon be made available at www.uspto.gov/economics.
36 See, e.g., Kristen J. Osenga, The Shape of Things to Come: What We Can Learn from Patent Claim Length,
28 SANTA CLARA COMPUTER & HIGH TECH. L.J. 617, 619-37 (2012). Cf. Johannes Koenen & Martin Peitz,
Firm Reputation and Incentives to “Milk” Pending Patents, 43 INT’L J. INDUS. ORG. 18, 18-23 (2015)
(discussing equilibrium effects of reputation to seek only meritorious grants and benefits from extending
beyond that and from examination errors); Stephen Yelderman, Improving Patent Quality with Applicant
Incentives, 28 HARV. J.L. & TECH. 77, 78-81 (2014) (arguing that various measures, such as fees, could be
used to affect applicant willingness to file overbroad claims).
Our analysis proceeds from the theoretical presumption that the word length
of an independent claim and the scope of the claim (equivalently, claim breadth)
tend to be negatively correlated: adding more words to a claim should generally
decrease its scope of potential application.37 A patent application contains two
distinct parts: the specification and the claims.38 The specification encompasses a
written description and background, along with drawings or figures. The claims
represent the legal metes and bounds of the invention. Importantly, the
specification may not be significantly altered after filing, whereas it is common to
amend claims during prosecution.
However, it is common in the industry to refer to the “spec” as distinct from the claims.
Where claim language is ambiguous or vague, the examiner may reject the
claim under section 112.40 Clarification by adding words normally narrows the
claim scope because it excludes a set of potential embodiments, whether by
restricting the meaning of the ambiguous or vague language or by specifying a
narrower conception of the things (or relevant properties of things) that the
meaning denotes. Note that the approach of treating additional words as
narrowing does not necessarily mean that comparing two different claims from
different patents on unrelated inventions will permit a general inference that the
longer claim implies the narrower scope. Rather, it only indicates that adding
words in a particular application tends (all else being equal) to add limitations that
reduce or otherwise restrict claim scope. However, comparing word lengths
within narrow technology groups may be appropriate. Further, comparing word
lengths across patents may enable us to observe general trends over time.
39 See Appendix A for the full text claim language of application 10/495,059 as reflected in U.S. pre-grant
publication 20050065799 (published March 24, 2005) and U.S. patent 7,769,690 (issued August 3, 2010).
40 35 USC 112.
Our observations below focus on claim length and on changes to claim length
in particular patent applications during prosecution. Claims as published in the
PGPub are a good indication of the claims at filing, because only 8.1 percent of
patents have any claim amendments between the date of filing and the date of
publication. Further, the change in independent claim length acts as a good proxy
for assessing changes to patent scope during prosecution. By comparison, in an effort
to assess changes to claiming practices over decades, Osenga (2012) looked at
average independent and dependent claim length at grant alone, using small
samples of randomly selected patents. She found that claim length practices had
remained surprisingly stable over five decades, notwithstanding significant
doctrinal and technological changes.42 In contrast, we find significant variations
in claim length from 1976 to 2014 for granted patents and 2001 to 2014 for
published applications.
41 Similarly, a Markush claim provides alternatives as being “selected from the group consisting of A, B, and
C” (MPEP 803.02). Adding more elements to the group would add words and increase the scope.
42 See Osenga, supra note 36, at 619-22, 632-37.
43 In comparison, many scholars have used a count of total claims. For example, Allison and Lemley (2000)
performed analyses on the total number of claims at grant, based on the assumption that comparative
increases across unrelated patents in the total number of claims should reflect either increased complexity or
increased value of the technology sought to be protected, given that additional claims will normally cost
patent applicants additional filing fees and drafting and prosecution costs. With respect to scope, however,
the number of independent claims is more accurate, because dependent claims may not be broader than their
independent claims.
44 See, e.g., World Class Tech. Corp. v. Ormco Corp., 769 F.3d 1120, 1125 (Fed. Cir. 2014) (“‘The doctrine
of claim differentiation creates a presumption that distinct claims, particularly an independent claim and its
3 Data
To develop the datasets, we first cleaned and identified the claims section
of each bulk file for published applications and patents. Second, we applied an
algorithm to the parsed files to identify individual claims as well as the
dependency relationships between claims. From the parsed claims text, we
measured the length of each claim based on word count. We created data sets at
the claim level and summary statistics at the document level.46
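The parsing and measurement steps described above can be sketched roughly as follows. This is a minimal illustration, not the algorithm used to build the dataset: the regular-expression heuristic for detecting dependent claims, the function names, and the example claims are all assumptions introduced here, and ICL is taken to be the word count of the shortest independent claim.

```python
import re

# Assumed heuristic (not from the paper): a claim is treated as dependent
# if its text refers back to another claim, e.g. "The widget of claim 1".
_DEPENDENCY_RE = re.compile(r"\bclaims?\s+(\d+)", re.IGNORECASE)


def word_count(text):
    """Count whitespace-delimited tokens in a claim."""
    return len(text.split())


def parent_claim(text):
    """Return the first referenced claim number, or None if independent."""
    match = _DEPENDENCY_RE.search(text)
    return int(match.group(1)) if match else None


def claim_metrics(claims):
    """Compute (ICL, ICC) for one document given {claim_number: claim_text}.

    ICL is taken here as the word count of the shortest independent claim;
    ICC is the number of independent claims.
    """
    independent = [t for t in claims.values() if parent_claim(t) is None]
    if not independent:
        return None, 0
    icl = min(word_count(t) for t in independent)
    return icl, len(independent)


# Hypothetical claims written for illustration only.
example = {
    1: "A widget comprising a base and a handle attached to the base.",
    2: "The widget of claim 1, wherein the handle is made of wood.",
    3: "A method of making a widget, comprising molding a base.",
}
icl, icc = claim_metrics(example)
print(icl, icc)
```

A real parser would need to handle cases this heuristic misses, such as multiple dependencies ("any of claims 1 to 3") or an independent claim that mentions another claim without depending on it.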
dependent claim, have different scopes.’”) (quoting Kraft Foods, Inc. v. Int'l Trading Co., 203 F.3d 1362,
1368 (Fed.Cir.2000)). See generally Joshua D. Sarnoff & Edward D. Manzo, An Introduction to, Premises of,
and Problems with Patent Claim Construction, in CLAIM CONSTRUCTION IN THE FEDERAL CIRCUIT § 0:4(2)
(2014 on-line ed. Thompson West). Again, this does not necessarily mean that comparing two unrelated
patents with different numbers of claims will indicate that the patent with the larger number of claims has the
greater scope.
45 The USPTO’s Patent Claims Research Dataset will soon be made available at www.uspto.gov/economics.
46 The data sets are provided at www.uspto.gov/economics. More information about the methodology and the
From the full data set, we constructed publication-patent pairs for those
applications for which we can identify both a PGPub and a granted patent. These
pairs enable the observation of changes in an application’s claims between
publication and grant. For these pairs we define ΔICL and ΔICC, which represent
the value of ICL or ICC at grant less the corresponding values at publication.
Note that the shortest independent claim at grant may be a different claim number
than the shortest independent claim at publication. First, claims may be
renumbered at various times during prosecution and the particular forms on which
claim amendments are made are not machine readable. Second, amendments may
cause the shortest independent claim on the PGPub to grow longer than another
independent claim.
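As a rough sketch of the change measures for a publication-patent pair (value at grant less value at publication), the following computes both differences from document-level summaries. The `ClaimStats` type, the function name, and the numbers are hypothetical and only illustrate the sign convention.

```python
from typing import NamedTuple


class ClaimStats(NamedTuple):
    icl: int  # word count of the shortest independent claim
    icc: int  # number of independent claims


def scope_change(publication, grant):
    """Return (delta_icl, delta_icc): value at grant minus value at publication."""
    return grant.icl - publication.icl, grant.icc - publication.icc


# Hypothetical publication-patent pair; the numbers are illustrative only.
at_publication = ClaimStats(icl=80, icc=3)
at_grant = ClaimStats(icl=120, icc=2)

delta_icl, delta_icc = scope_change(at_publication, at_grant)
print(delta_icl, delta_icc)
```

Because each document's ICL is computed separately, the difference does not require matching claim numbers across the two documents. Under the paper's interpretation, a positive change in ICL (words added) and a negative change in ICC (independent claims removed) both indicate narrowing.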
4.2 Trends
Figures 3 to 10 show the trends over time in claims for PGPubs and for
grants, which provide some insights into applicant filing behavior as well as
potential changes in examination practice. For patents, we can observe claims
information from 1976 to 2014. For published applications, we can observe
claims data from 2001 to 2014. The figures graph annual arithmetic means for
three different cohort aggregations.
A few stylized facts emerge from examining the trends in Figures 3-10:
There have been significant trends in ICL and ICC over time. Figures
3 and 4 show the trends in ICL and ICC, respectively, for granted
patents. There is a notable shift towards broader patents from 1984 to
2004, after which there is a shift towards narrower patents (2004 to
2014). The trend holds for both ICL (Figure 3) and ICC (Figure 4).
There has been an upward trend in the number of words added to ICL
between publication and grant, as shown in Figure 9. At the same time,
the average change in the number of independent claims has moved
from -0.7 in 2001 to -0.2 in 2014 (Figure 10). These facts are
consistent with the cohort comparisons in Figures 5 and 6.
47 The paired comparisons are aggregated based on the date of disposal (issuance). As with cohort
comparisons, it is feasible to aggregate by date of publication, which may better highlight applicant filing
rather than examination behaviors.
48 See, e.g., 21st Century Strategic Plan, supra note 17, at 8-9 (discussing measures to improve examiner
competency and to enhance quality assurance techniques); National Academy of Public Administration,
Report for the U.S. Congress and the USPTO, U.S. Patent and Trademark Office: Transforming to Meet the
Challenges of the 21st Century 66-67 (Aug. 2005).
These results are intuitive, particularly for ICC, as examiners should require
more time to evaluate each additional independent claim (which by hypothesis
should require independent evaluation). The results are identical if we restrict the
definition of pendency to examination pendency only (post-first-action
pendency).49
For our publication-patent pairs, we calculate the change in ICL and the
change in ICC between publication and grant (as defined in Figures 9 and 10).
Our interest is in whether these differences are correlated with pendency. In both
cases, we find that greater pendency is associated with more narrowing of the
claims during prosecution. Figure 13 shows that there is a positive correlation
between pendency and ΔICL (more time is correlated with more words added to
the claim). Correspondingly, Figure 14 shows that greater pendency is associated
with more independent claims being removed during prosecution.
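One simple way to quantify associations of this kind is a Pearson correlation coefficient between pendency and the change in each measure. The sketch below is illustrative only: the figures in this paper may rest on a different method, the function name is an assumption, and the toy values are invented for the example rather than drawn from the dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented toy values (not from the dataset): pendency in years, with the
# change in ICL (words added) and in ICC (independent claims added).
pendency_years = [1.0, 2.0, 3.0, 4.0]
delta_icl = [5, 20, 25, 50]
delta_icc = [0, -1, -1, -2]

r_icl = pearson_r(pendency_years, delta_icl)
r_icc = pearson_r(pendency_years, delta_icc)
print(r_icl > 0, r_icc < 0)
```

In this toy setup, a positive coefficient for the change in ICL and a negative coefficient for the change in ICC would both correspond to more narrowing at longer pendency.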
In short, we find that broader applications are subject to longer pendency, and
longer pendency is associated with more significant narrowing of claims, in both
the length of claims and the number of claims. This is confirmed in Figures 15
and 16, which show the change in ICL and ICC during prosecution against the
values of ICL and ICC at publication, respectively. The scatter plot (Figure 15)
shows a negative correlation between ICL and ΔICL, indicating that broader
49 Post-first-action pendency measures the time from the first action by an examiner to the time of final
disposal. This definition of pendency reflects the time under examination at the office, which is impacted by
both examiner and applicant behavior.
Table 2 provides the ICL and ICC for PGPubs grouped by entity size,50
examination unit (technology center),51 technology category,52 and parent
application type.53 The technology center analysis was generally similar to that for
the NBER technology categories; thus, we restrict our discussion to the
technology centers. For each case, the ICL is higher for PGPub-grants relative to
PGPub-abandonments. The number of claims is not substantially different
50 Entity status is based on fee payments at the time of filing. Small and micro entities are combined as a
single category relative to large entities.
51 There are eight technology centers (TCs) used during our period of study, including Biotechnology and
Organic Chemistry (1600), Chemical and Materials Engineering (1700), Computer Architecture, Software,
and Information Security (2100), Computer Networks, Multiplex Communication, Video Distribution, and
Security (2400), Communications (2600), Semiconductors, Electrical, and Optical Systems and
Components (2800), Transportation, Construction, Electronic Commerce, Agriculture, National Security and
License & Review (3600), and Mechanical Engineering, Manufacturing and Medical Devices/Processes
(3700).
52 NBER technology categories, as defined by Hall, Jaffe, and Trajtenberg (2001) and Marco, et al (2015)
are: Chemical (1), Computers and Communications (2), Drugs and Medical (3), Electrical and Electronics
(4), Mechanical (5), and Other (6).
53 Parent application type or application status relative to the parent. If there was no parent (a first time
filing), we identified the application as having “no parent” (not applicable, or USNA). For applications
having a parent application, we identified the type of such application. These were divided into applications
having a parent that was: a foreign application (Foreign, or FOR); a Patent Cooperation Treaty (PCT)
application (which was further subdivided by the designated office of the parent – either PCT-foreign or
PCT-US); a prior US non-provisional application (and if so, the relationship to that parent application as
discussed below), or a US provisional application (US-provisional, or US-PRO). If the application had a prior
US non-provisional application as its parent, we denoted the application’s relationship to the parent as a
continuation (CON), a divisional (DIV), or a continuation-in-part (CIP) application to a US application.
There are some notable characteristics that differ from the means. With
regard to technology, applications in biotechnology (TC 1600) have the largest
difference in ICL between granted applications and abandoned applications:
approximately 28 words. This is driven by the very low values for PGPub-
abandonments, which are about 12-15 words below the PGPub-abandonment
mean of 94 (from Table 1). However, these applications tend to have the most
independent claims at filing. Further, biotech is the only examination unit for
which PGPub-abandonments have more claims, on average, than PGPub-grants.
Table 3 provides the ICL and ICC for publication-patent pairs, by groups
based on application characteristics. By comparing claims at publication to claims
at grant, we can identify the average change in claims during patent prosecution.
The publication values in Table 3 match those found in Table 2 for granted
applications. There are several interesting facts that emerge from Table 3. Most
notably, for each group applications are narrowed between publication and grant,
in terms of ICL and ICC. We also see interesting differences between application
types.
Small and large entity applications tend to be similar at filing, but small
entities experience greater narrowing during prosecution, leading to 5 more words
and 0.25 fewer claims at issue relative to large entities. Biotech applications again
stand out relative to other examination groups: they are not significantly narrowed
with respect to ICL (only 11 words), but they lose an average of 1.5 independent
claims during prosecution. This is likely based on the nature of the invention and the
terminological (nomenclature) conventions for how certain types of inventions
(particularly chemical products) are claimed. Computer-related patents—on the
other hand—are more subject to increases in ICL than to decreases in ICC.
5 Validation
To validate our ICL and ICC measures of patent scope, we employ several
statistical tests to compare these measures with post-grant outcomes and other
variables traditionally correlated with patent scope, as shown in Tables 4a and 4b.
The tests extend the previous literature and examine the impact of patent scope—
based on ICL and ICC—on (1) forward citations, (2) the number of Cooperative
Patent Classification (CPCs)54 subclasses to which the patent was assigned, (3)
patent maintenance, and (4) a novelty measure based on whether the granted
patent was issued in a “new” US patent classification subclass. We use a variant
of the validation method from Lerner (1994), which analyzes the relationship
between a proxy for patent scope—the number of 4-digit International Patent
54 The CPC classification system was jointly developed by the USPTO and European Patent Office (EPO)
and is a descendant of the IPC classification. For more information on the CPC classification system, please
visit http://www.cooperativepatentclassification.org/.
Lerner (1994) found that a proxy for patent scope, the number of 4-digit IPCs,
was positively and significantly related to the number of forward citations a patent
receives. An increase in the number of 4-digit IPCs assigned to a patent reflects
an increasing number of distinct technologies incorporated into the invention,
which can be interpreted as increasing broadness of a given patent. Lerner (1994)
used a simple Poisson regression to examine the relationship between the
dependent variable, a count of forward citations for a given patent, and the
independent variable, the number of IPCs. He also controlled for the time since
grant, to account for varying exposure time among patents in his sample of
biotechnology firms. The results show that as the number of IPCs increases, the
number of forward citations received by a given patent increases as well.
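A Poisson regression of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code used by Lerner (1994) or in this paper: it uses a single regressor, treats time since grant as an exposure offset (one of several ways to control for exposure), omits fixed effects, and runs on made-up data. The function name fit_poisson and all sample values are hypothetical.

```python
import math

def fit_poisson(x, y, exposure, iterations=50):
    """Newton-Raphson fit of E[y_i] = exposure_i * exp(b0 + b1 * x_i)."""
    b0 = math.log(sum(y) / sum(exposure))  # start at the pooled citation rate
    b1 = 0.0
    for _ in range(iterations):
        mu = [t * math.exp(b0 + b1 * xi) for xi, t in zip(x, exposure)]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian) for the two parameters.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical sample: IPC counts, forward citations, and years since grant.
n_ipc = [1, 2, 3, 4, 5]
cites = [2, 5, 6, 12, 11]
years = [4, 6, 5, 7, 5]
b0, b1 = fit_poisson(n_ipc, cites, years)
print(f"intercept={b0:.3f}, slope on IPC count={b1:.3f}")
```

A positive slope in such a fit corresponds to the reported finding that patents with more IPCs receive more forward citations per unit of exposure time.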
For each of the four dependent variables we estimate three models based on the
explanatory variables: ICL, ICC, and ICL and ICC together. Each model includes
year fixed effects and US Patent Class fixed effects. Our expectation is that ICL
will have a negative coefficient and ICC will have a positive coefficient, both of
which correspond to a positive correlation between our scope measures and the
dependent variables of value and scope.
For ICC, all coefficients are positive and statistically significant at the 1
percent level in all specifications. For ICL, all coefficients are negative and
statistically significant at the 0.1 percent level, with three exceptions: the
coefficient is positive for forward citations when combined with ICC (Model 6),
and it is negative but not statistically significant in the new subclass
specifications. The robustness of the results across specifications implies
that ICL and ICC are useful measures of patent scope. That the models
including both measures tend to have the expected signs further implies that ICC
and ICL represent different aspects of patent scope.
As further evidence that ICL and ICC represent patent scope, we rely on results
from Marco et al (2015). There, the authors find that patent scope—as measured
by average independent claim length and independent claim count—is correlated
with the incidence of patent litigation. Lanjouw and Schankerman (2001) explain
why patent breadth should be positively correlated with litigation. Thus, the result
provides more evidence that ICL and ICC are indicators of patent scope.
6 Conclusion
This paper presents the first large-scale analysis of patent claim language as it
applies to patent scope. We define two document-level measurements of scope
that should be useful to researchers interested in patent value and patent quality:
independent claim length (ICL) and independent claim count (ICC). Our
hypotheses that ICL is negatively correlated with patent scope and ICC is
positively correlated with patent scope are borne out in several ways. First, we find
that the narrowing process that occurs during examination tends to add words to
the shortest independent claim and tends to remove independent claims, leading to
greater ICL and lower ICC. Second, our formal validation exercise shows that
ICL and ICC independently explain patent maintenance, forward citations, the
breadth of patent classes, and—to a lesser extent—novelty.
Our continuing research agenda includes more in-depth analysis into the
examination process, as well as exploring how natural language processing
techniques can be applied to claim text. By making these data widely available we
hope to stimulate more research into the usefulness of analyzing claim text in
order to understand patent scope and its relationship to examination quality and
patent quality.
References
"Adoption of Metrics for the Enhancement of Patent Quality Fiscal Year 2011."
U.S. Patent and Trademark Office (blog). Accessed November 3, 2015.
http://www.uspto.gov/sites/default/files/patents/init_events/qual_comp_metric.pdf.
Allison, John R., and Ronald J. Mann. "The Disputed Quality of Software
Patents." Washington University Law Review 85 (2007): 297.
Allison, John R., Mark A. Lemley, Kimberly A. Moore, and R. Derek Trunkey.
"Valuable Patents." Georgetown Law Review 92 (2004): 435.
Allison, John R., J. H. Walker, and Mark A. Lemley. "Patent Quality and
Settlement among Repeat Patent Litigants." Georgetown Law Review 99 (2011):
677.
Allison, John R., Mark A. Lemley, and Joshua Walker. "Extreme Value or Trolls
on Top? The Characteristics of the Most-Litigated Patents." University of
Pennsylvania Law Review 158 (2009): 1.
Allison, John R., and Mark A. Lemley. "Who's Patenting What? An Empirical
Exploration of Patent Prosecution." Vanderbilt Law Review 53 (2000): 2107.
Burk, Dan L. and Mark A. Lemley, “Fence Posts or Sign Posts? Rethinking
Patent Claim Construction,” U. Pa. L. Rev. 157 (2009): 1743.
Choi, Jay Pil. "Patent Pools And Cross-Licensing In The Shadow Of Patent
Litigation." International Economic Review 51, no. 2 (2010): 441.
Churnet, Dargaye, “Patent Claims Revisited,” Nw. J. Tech. & Int. Prop. 11 (2013):
501.
Comino, Stefano, and Clara Graziano. "How Many Patents Does It Take to Signal
Innovation Quality?" International Journal of Industrial Organization 43 (2015):
66.
Cotropia, Christopher A., Cecil D. Quillen, Jr., and Ogden H. Webster. "Patent
Applications and the Performance of the U.S. Patent and Trademark Office." The
Federal Circuit Bar Journal 23 (2013): 179.
Frakes, Michael D., and Melissa F. Wasserman. "Does the U.S. Patent and
Trademark Office Grant Too Many Bad Patents?: Evidence From A Quasi-
Experiment." Stanford Law Review 67 (2015): 613.
Graham, Stuart J., Bronwyn Hall, Dietmar Harhoff, and David Mowery. "Post-
Issue Patent "Quality Control": A Comparative Study of US Patent Re-
examinations and European Patent Oppositions." NBER Working Paper 8807,
2002. Accessed November 3, 2015. http://www.nber.org/papers/w8807.
Hall, Bronwyn, Adam Jaffe, and Manuel Trajtenberg. "The NBER Patent Citation
Data File: Lessons, Insights and Methodological Tools." NBER Working Paper
8498, October 2001.
Koenen, Johannes, and Martin Peitz. "Firm Reputation and Incentives to ‘Milk’
Pending Patents." International Journal of Industrial Organization 43 (2015): 18.
Krause, Thomas W. & Heather F. Auyang, “What Close Cases and Reversals
Reveal About Claim Construction,” J. Marshall Rev. Intell. Prop. L. 12 (2013):
583.
Krause, Thomas W. & Heather F. Auyang, “What Reversals and Close Cases
Reveal About Claim Construction: The Sequel,” J. Marshall Rev. Intell. Prop. L.
13 (2014): 525.
Lemley, Mark A., and Bhaven Sampat. "Is the Patent Office a Rubber Stamp?"
Marco, Alan C., Michael Carley, Steven Jackson, and Amanda F. Myers. "The
USPTO Historical Patent Data Files: Two Centuries of Innovation." U.S. Patent
and Trademark Office Working Paper, 2015.
Marco, Alan C., Richard D. Miller, Kathleen Kahler Fonda, Pinchus M. Laufer,
Paul Dzierzynski, and Martin Rater, “U.S. Patent and Trademark Office, Patent
Litigation and USPTO Trials: Implications for Patent Examination Quality”
(January 2015). Accessed November 3, 2015.
http://www.uspto.gov/sites/default/files/documents/Patent%20litigation%20and%20USPTO%20trials%2020150130.pdf
Miller, Shawn P. "What’s the Connection Between Repeat Patent Litigation and
Patent Quality? A (Partial) Defense of the Most Litigated Patents." Stanford
Technology Law Review 16 (2013): 313.
Moore, Kimberley A., “Markman Eight Years Later: Is Claim Construction More
Predictable?,” Lewis & Clark L. Rev. 9 (2005): 231.
National Academy of Public Administration, “Report for the U.S. Congress and
the USPTO, U.S. Patent and Trademark Office: Transforming to Meet the
Challenges of the 21st Century” (Aug. 2005).
Osenga, Kristen J. "The Shape of Things to Come: What We Can Learn from
Quillen, Cecil D., Jr., and Ogden H. Webster. "Continuing Patent Applications
and Performance of the U.S. Patent and Trademark Office." The Federal Circuit
Bar Journal 11 (2001): 1.
Quillen, Cecil D., Jr., and Ogden H. Webster. "Continuing Patent Applications
and Performance of the U.S. Patent and Trademark Office - Extended." The
Federal Circuit Bar Journal 12 (2002): 35.
Quillen, Cecil D., Jr., and Ogden H. Webster. "Continuing Patent Applications
and Performance of the U.S. Patent and Trademark Office - Updated." The
Federal Circuit Bar Journal 15 (2006): 635.
Quillen, Cecil D., Jr., and Ogden H. Webster. "Continuing Patent Applications
and Performance of the U.S. Patent and Trademark Office – One More Time."
The Federal Circuit Bar Journal 18 (2009): 379.
Sarnoff, Joshua D. and Edward D. Manzo, “An Introduction to, Premises of, and
Problems with Patent Claim Construction,” in Claim Construction in the Federal
Circuit § 0:4(2) (2014 on-line ed. Thomson West).
Schuett, Florian. "Patent Quality and Incentives at the Patent Office." The RAND
Journal of Economics 44, no. 2 (2013): 313-36.
Wagner, R. Polk & Lee Petherbridge, “Did Phillips Change Anything? Analysis
of the Federal Circuit’s Claim Construction Jurisprudence,” in Intellectual
Property and the Common Law (S. Balganesh ed. Cambridge U. Press 2011).
Zimmer, John P., “To Infinity and Beyond: The Problem of Open-Ended Claim
Language in the Unpredictable Arts,” S. Car. L. Rev. 59 (2008): 865.
Table 1. Distributional statistics for pre-grant publications (2001-2014) and patent grants
(1976-2014)
                                                        ICL                        ICC
                                      Frequency   Mean  P25  P50  P75      Mean  P25  P50  P75
Publications (2001-2014)
  Later Abandoned                       1089427   94.2   46   75  115      3.03    1    2    3
  Later Granted                         2113273  111.4   58   90  137      3.08    2    3    4
  Pending*                               790019  107.1   59   90  133      2.73    2    3    3
  All                                   3992719  105.8   54   86  130      2.99    2    3    3
Grants
  At Publication                        2113273  111.4   58   90  137      3.08    2    3    4
  At Grant (previously published)       2113273  155.9   93  136  195      2.70    1    2    3
  At Grant (not previously published)    634235  141.0   82  121  176      3.12    2    3    4
  At Grant (1976-2000)                  2203409  155.6   92  137  198      2.43    1    2    3
* As of December 31, 2016
Table 2. ICL and ICC for PGPubs, by application characteristics
                                      IC Length                      IC Count
                                       Later      Later                 Later      Later
                                      Issued  Abandoned  Difference    Issued  Abandoned  Difference
Small entity status
  Large                               111.03      94.07       16.96      3.09       3.08        0.02
  Small or Micro                      112.24      94.03       18.21      3.03       2.94        0.08
Technology Center
  1600 Biotech, Organic Chem          110.23      81.79       28.44      3.75       3.98       -0.23
  1700 Chem & Mat Engineering          97.75      84.37       13.38      2.74       2.66        0.08
  2100 Comp Architecture              107.80      95.68       12.11      3.60       3.50        0.10
  2400 Comp Networks                  107.73      95.60       12.13      3.59       3.50        0.10
  2600 Communications                 109.21      95.68       13.53      3.47       3.19        0.27
  2800 Semiconductors, Electrical     110.99      95.65       15.35      2.89       2.62        0.27
  3600 Trans, Constr, E-Comm, Ag      125.45     106.86       18.60      2.80       2.78        0.02
  3700 Mech, Mfg, Products            117.04      99.93       17.11      2.84       2.67        0.17
NBER category
  1 - Chemicals                       102.07      95.20        6.87      2.91       2.84        0.07
  2 - Comp & Comm                     109.71      97.43       12.28      3.43       3.36        0.07
  3 - Drugs & Medical                 107.28      78.80       28.47      3.54       3.72       -0.18
  4 - Electrical                      110.82      95.41       15.41      2.85       2.63        0.23
  5 - Mechanical                      123.43     105.25       18.18      2.66       2.49        0.17
  6 - Others                          114.17      95.72       18.45      2.83       2.58        0.25
Parent application type
  Foreign                             122.95     101.84       21.10      2.69       2.66        0.03
  PCT - foreign                       119.92      97.11       22.81      2.66       2.81       -0.15
  PCT - US                            109.40      87.94       21.46      3.39       3.60       -0.21
  CIP of US app                       107.15      95.77       11.37      3.58       3.51        0.07
  CON of US app                       112.09      94.85       17.24      3.27       3.43       -0.17
  DIV of US app                       108.99      94.25       14.74      3.16       3.12        0.04
  No parent                            98.73      91.91        6.82      3.32       2.97        0.36
  US provisional                       98.83      83.19       15.64      3.67       3.44        0.23
Table 3. ICL and ICC for publication-patent pairs, by application characteristics
                                      IC Length                           IC Count
                                               At         At                        At         At
                                      publication   issuance  Difference  publication   issuance  Difference
Small entity status
  Large                                    111.03     154.97       43.94         3.09       2.75       -0.34
  Small or Micro                           112.24     160.03       47.79         3.03       2.50       -0.53
Technology Center
  1600 Biotech, Organic Chem               110.23     121.39       11.16         3.75       2.27       -1.48
  1700 Chem & Mat Engineering               97.75     138.64       40.88         2.74       2.21       -0.54
  2100 Comp Architecture                   107.80     175.53       67.73         3.60       3.32       -0.28
  2400 Comp Networks                       107.73     183.72       75.99         3.59       3.34       -0.25
  2600 Communications                      109.21     159.73       50.53         3.47       3.26       -0.21
  2800 Semiconductors, Electrical          110.99     145.36       34.36         2.89       2.66       -0.23
  3600 Trans, Constr, E-Comm, Ag           125.45     179.36       53.90         2.80       2.58       -0.22
  3700 Mech, Mfg, Products                 117.04     168.23       51.18         2.84       2.53       -0.31
NBER category
  1 - Chemicals                            102.07     135.32       33.25         2.91       2.20       -0.71
  2 - Comp & Comm                          109.71     165.59       55.88         3.43       3.19       -0.24
  3 - Drugs & Medical                      107.28     138.32       31.04         3.54       2.47       -1.07
  4 - Electrical                           110.82     148.19       37.37         2.85       2.60       -0.25
  5 - Mechanical                           123.43     167.59       44.16         2.66       2.44       -0.22
  6 - Others                               114.17     165.71       51.54         2.83       2.53       -0.30
Parent application type
  Foreign                                  122.95     166.83       43.88         2.69       2.49       -0.20
  PCT - foreign                            119.92     168.06       48.14         2.66       2.22       -0.44
  PCT - US                                 109.40     150.78       41.38         3.39       2.58       -0.81
  CIP of US app                            107.15     149.30       42.15         3.58       3.02       -0.56
  CON of US app                            112.09     141.68       29.59         3.27       2.89       -0.38
  DIV of US app                            108.99     140.48       31.49         3.16       2.37       -0.79
  No parent                                 98.73     150.02       51.30         3.32       3.03       -0.29
  US provisional                            98.83     146.68       47.85         3.67       3.04       -0.63
Note: 10,311 of 2,113,273 publication-patent pairs were lost due to data availability issues for application
characteristics. IC Length is defined as the length of an application's shortest Independent Claim.
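The two document-level measures can be illustrated with a short sketch. It takes ICL as the word count of the shortest independent claim, as in the table note, and ICC as the number of independent claims. The rule used here to separate dependent from independent claims (any reference to "claim N") is an assumption made for illustration, as are the names word_count and scope_measures and the toy claims; it is not the classification routine behind the published datasets.

```python
import re

# Assumption: a claim that refers to another claim by number is dependent.
CLAIM_REF = re.compile(r"\bclaims?\s+\d+", re.IGNORECASE)

def word_count(text):
    """Count whitespace-separated tokens in a claim."""
    return len(text.split())

def scope_measures(claims):
    """Return (ICL, ICC) for a list of claim texts."""
    independent = [c for c in claims if not CLAIM_REF.search(c)]
    if not independent:
        return None, 0
    icl = min(word_count(c) for c in independent)  # shortest independent claim
    return icl, len(independent)

# Hypothetical three-claim application.
claims = [
    "A widget comprising a frame and a wheel mounted on the frame.",
    "The widget of claim 1, wherein the wheel is rubber.",
    "A method of making a widget, comprising molding a frame and "
    "attaching a wheel to the frame.",
]
icl, icc = scope_measures(claims)
print(icl, icc)
```

Under the hypotheses of the paper, a lower ICL and a higher ICC would both indicate broader scope.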
Figure 2
Figure 4
Figure 6
Figure 8
Figure 10
Figure 12
Figure 14
Figure 16
Independent Claim
1. A method for supply of data relating to a described entity to a relying entity, the
method comprising:
generating a first digital certificate signed with an electronic signature by a first
signing entity and including:
one or more attributes of the described entity;
one or more attributes of the first digital certificate which include one or
more attributes identifying the first signing entity;
an indication of data relating to the described entity which is to be
supplied;
an indication of one or more sources for the data to be supplied; and
one or more attributes identifying one or more relying entities to which
the data is to be supplied;
the relying entity forwarding the first digital certificate for processing; and
a source supplying the data indicated in the first digital certificate.
Dependent Claims
5. The method of claim 1, wherein some or all of the data relating to the described entity
is supplied by a second digital certificate to the relying entity, the second digital
certificate signed with an electronic signature by a second signing entity and including:
one or more attributes of the described entity including the data which is to be
supplied;
one or more attributes of the second digital certificate which include one or more
attributes identifying the second signing entity; and
one or more attributes identifying one or more relying entities to which the data
is to be supplied.
6. The method of claim 5, wherein the first digital certificate authorises the relying entity
to use the first digital certificate to obtain a second digital certificate.
Independent Claim
1. A method for supply of data relating to a described entity to a relying entity, the
method comprising:
generating, using a computer device, a first digital certificate signed with an
electronic signature by a first signing entity and including:
one or more attributes of the described entity;
one or more attributes identifying the first signing entity;
an indication of data relating to the described entity which is to be
supplied;
an indication of one or more sources for the data to be supplied; and
one or more attributes identifying one or more relying entities to which
the data is to be supplied;
the relying entity forwarding the first digital certificate for processing; and
after the processing, the one or more sources supplying the data indicated in the first
digital certificate to the relying entity,
wherein some or all of the data relating to the described entity is supplied by a second
digital certificate to the relying entity, the second digital certificate signed with an
electronic signature by a second signing entity and including:
one or more attributes of the described entity including the data which is to be
supplied;
one or more attributes of the second digital certificate which include one or more
attributes identifying the second signing entity; and
one or more attributes identifying one or more relying entities to which the data
is to be supplied, and
wherein the first digital certificate authorizes the relying entity to use the first digital
certificate to obtain the second digital certificate.
This Appendix details the data sources, methodology, descriptive statistics, and
some general trends that can be observed in the claims_stats,
claims_fulltext, and document_stats datasets. It is our hope that
researchers will be able to use these data to enhance understanding of the
examination process, including but not limited to assessing patent scope and how
it changes during examination.
Data Sources
Our primary data sources for the claims-level datasets include the Patent
Application Publication Full-Text and Patent Grant Full Text files provided by the
U.S. Patent and Trademark Office (USPTO).59 The Patent Application Publication
Full-Text data, provided in XML format and disseminated as separate files by
57 The Python code used to generate the USPTO’s Patent Claims Research Datasets will be made available
soon on GitHub.
58 The data were obtained from USPTO Electronic Bulk Data Products (http://www.uspto.gov/learning-and-resources/electronic-bulk-data-products).
59 Full text of patents and patent applications is available at http://patft.uspto.gov/. Bulk data are available at
http://www.uspto.gov/learning-and-resources/electronic-bulk-data-products.
60 Cancelled claims were identified in claims_fulltext but were not included in independent claim count and
length summary statistic calculations.
61 For a fuller description of all of the prosecution characteristic variables that were available for coding,
Data Limitations
Relying on publicly available information on claims as captured from existing
databases limits our sample in several ways. First, we can observe the claim text
only at the time of publication and at the time of grant. This reliance also restricts
the time period, because pre-grant publication of patent applications has been
practiced by the USPTO only for applications filed after November 29, 2000.63
Since that time, and without a non-publication request (which requires forgoing
international protection on the patented innovation), publication has been required
by statute 18 months after the filing priority date requested in relation to the
earliest related parent application.64 Applications filed prior to November 29,
2000 are unpublished. Thus, although our source patent dataset (grants) extends
back to 1976, the bulk patent application data contains applications filed only
during and after 2000. We have calculated that since November 29, 2000,
approximately ten percent of filed applications have opted out of publication.
Further, in contrast to the captured data on claims from granted applications (at
publication and at issue), machine-readable claim text is not readily available for
abandoned applications (after publication). That is, we cannot observe the change
in claims between publication and abandonment. Consequently, we limit our
analysis of difference variables (dif_wrd_min, dif_wrd_avg, and
dif_clm_ct) to publication-patent pairs (i.e., to applications that resulted in
granted patents).
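The difference variables can be sketched for a single publication-patent pair as follows. The mapping of dif_wrd_min, dif_wrd_avg, and dif_clm_ct to the minimum word count, the average word count, and the count of independent claims is inferred from the variable names, and the sign convention (grant minus publication) is assumed to follow the Difference columns of Table 3. The function name claim_differences and the word counts are hypothetical.

```python
def claim_differences(pub_word_counts, grant_word_counts):
    """Grant-minus-publication differences in independent-claim statistics.

    Each argument is a list of word counts, one per independent claim.
    """
    def stats(counts):
        # (shortest claim, average claim length, number of claims)
        return min(counts), sum(counts) / len(counts), len(counts)

    p_min, p_avg, p_ct = stats(pub_word_counts)
    g_min, g_avg, g_ct = stats(grant_word_counts)
    return {
        "dif_wrd_min": g_min - p_min,  # change in shortest independent claim
        "dif_wrd_avg": g_avg - p_avg,  # change in average independent claim length
        "dif_clm_ct": g_ct - p_ct,     # change in independent claim count
    }

# Hypothetical pair: three independent claims at publication, two at grant.
diffs = claim_differences([90, 120, 150], [140, 160])
print(diffs)
```

In this toy pair the shortest claim grows and one independent claim is dropped, which is the pattern of narrowing described for most groups in Table 3.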
Although it is possible for claims in a particular application to change between
filing and publication, we believe this is a relatively infrequent event. Our
analysis shows that only 8.11 percent of total applications in the dataset have a
preliminary claims amendment filed after their actual (not priority) filing date but
before the publication date. Normal office practice is to incorporate preliminary
65 It should be noted that not all preliminary amendments are included in an application’s publication. See
MPEP 1121.
66 There are exceptions in the claims_fulltext data set: (1) the first claims of twenty-two utility patents begin
with the general (introductory) claiming language, “I claim”; and (2) claims in ten patents, such as patent
6,901,209, begin with the words, “I Claim.” For example, claim 5 states, “I claim the access system of claim
4 characterized by the addition of data manager means to allow a user to access the program.” This list is not
exhaustive.
67 The Claim number can be found as a separate field in the claims_fulltext data set (claim_no).
68 See https://www.google.com/patents/US4788349
69 See Python code in Appendix D
Following our assumption that patent scope depends on the length and number of
independent claims, it is important to provide the arithmetic difference in the
length and number of independent claims between publication and grant. These
differences from publication to grant provide an approximation of the changes in
breadth of the independent claims from filing to grant and thus of the change in
the scope of the applications during prosecution. For example, as a direct result of
70 It may be the case that a claim will contain references to other claims that do not incorporate the other
claims’ limitations. However, we believe this to be a rare event.
71 Because the algorithm uses natural language processing, claims that separate portions of words with spaces
are automatically read as including separate words, which may thereby artificially increase the claim’s word
count. For example, chemical formulas are sometimes written as a single word without spaces, but
occasionally may contain many spaces, which would artificially increase the word count by as many spaces
as are added. See US Patent 3,262,977, claim 4 (“N – [1’ -phenyl-propyl-(‘1)] – 1,1 diphenyl-propyl-(3)-
amine”).
72 Our algorithm also identifies specific words or phrases (e.g., “or” and “selected from”) that are more likely
to have the potential to broaden the scope of an independent claim by addition of other words, to permit
robustness checks.
73 To measure the dependent claim length (DCL), we would need to start with a simple count of the number
of words in each dependent claim, and then add the count of the limitations language of the claim(s) from
which the dependent claim depends and eliminate the count of the referential language in the dependent claim
(as such language would then become duplicative and unnecessary). Nevertheless, the data in
claims_fulltext are coded with the claim number(s) from which each dependent claim directly
depends. Accordingly, some automated counts to approximate the number of words of dependent claims are
possible to perform, e.g., by tracing the chains of dependency and adding the simple count of the words of
each dependent claim and of the claim(s) from which it depends. (Such simple counts would be slightly over-
weighted, by including counts of both the referential language and of the full text of the claim(s) to which
those dependent claims refer). Some dependent claims, moreover, reference multiple independent or
dependent claims that may have different lengths, which makes it more difficult to provide a count that is an
accurate length for any such dependent claim. (Of course, each such multiply dependent claim could be
decomposed into separate claims for further analysis.)
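The chain-tracing approximation described in footnote 73 can be sketched as follows. This is an illustration only: the function name approximate_dcl and the input dictionaries are hypothetical, the word counts are made up, and the result is over-weighted in the way the footnote describes because referential language is not removed. For a multiply dependent claim the sketch simply adds every referenced claim, which overstates the length of any single decomposed claim.

```python
def approximate_dcl(claim_no, word_counts, parents):
    """Approximate dependent claim length by tracing the dependency chain.

    word_counts: {claim number: simple word count of that claim}
    parents:     {claim number: [claims it directly depends on]}
                 (independent claims have no entry or an empty list)
    """
    seen = set()
    stack = [claim_no]
    total = 0
    while stack:
        n = stack.pop()
        if n in seen:  # count each claim in the chain only once
            continue
        seen.add(n)
        total += word_counts[n]
        stack.extend(parents.get(n, []))
    return total

# Hypothetical chain: claim 6 depends on claim 5, which depends on claim 1.
word_counts = {1: 100, 5: 80, 6: 25}
parents = {5: [1], 6: [5]}
dcl_6 = approximate_dcl(6, word_counts, parents)
print(dcl_6)
```

An independent claim passes through unchanged, so the same routine can be applied to every claim in a document.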
** While document-level dependent claim word count statistics are included in our datasets, these statistics
are not accurate measures of claim scope. The dependent claim word counts do not add the word counts of
referenced claims on which the given dependent claims are dependent. For more information, please see footnote
73.