Bayesian Inference for Incomplete $2\times 2$ Diagnostic Tables

Sara Antonijevic, Danielle Sitalo, and Brani Vidakovic

(Department of Statistics, Texas A&M University, College Station, TX 77843, USA)

Abstract

Incomplete reporting of diagnostic accuracy data remains a persistent problem in medical research. In many studies, only part of the $2\times 2$ diagnostic table is reported, leaving denominators for diseased and non-diseased groups unknown and preventing direct calculation of sensitivity, specificity, predictive values, and related operating characteristics. To address this limitation, we develop hierarchical Bayesian models for reconstructing incomplete $2\times 2$ diagnostic tables from such partial information. Two motivating scenarios are considered: one in which only a single test-outcome row is observed, and another in which true positives, false positives, and the total sample size are reported but the remaining cells are missing. The proposed models are illustrated on a benchmark breast MRI study with complete counts, treated as partially observed in order to assess reconstruction performance under controlled missingness. The framework yields posterior inference for the missing cell counts and associated diagnostic measures, together with uncertainty quantification in weakly identified settings.

Introduction

Incomplete or partially reported diagnostic accuracy data remain a common obstacle to interpretation, reproducibility, and evidence synthesis. In many applied studies, authors report only fragments of the $2\times 2$ diagnostic table, for example a single cell count, one observed row, or a pair of summary measures, without providing the full denominators for diseased and non-diseased groups. As a result, readers may be unable to reconstruct the full table, verify internal consistency, or derive clinically relevant quantities such as predictive values, false discovery rates, or expected error counts (U.S. Food and Drug Administration, 2007; Macaskill et al., 2010).

This problem is not merely clerical. Complete cross-tabulation is central to transparent reporting of diagnostic accuracy studies, and the STARD initiative was developed precisely to improve such reporting. The STARD 2015 revision explicitly recommends reporting both the cross-tabulation of index test results against the reference standard and a participant flow diagram showing how the study denominators were obtained (Bossuyt et al., 2015; Cohen et al., 2016; EQUATOR Network, 2015). When these elements are omitted, clinically important operating characteristics may no longer be recoverable from the published record.

Empirical assessments suggest that such omissions remain common. Earlier audits showed that incomplete reporting often prevented reconstruction of the full $2\times 2$ table (Smidt et al., 2005; Wilczynski et al., 2008), and more recent work indicates that adherence to STARD recommendations remains uneven even in contemporary medical imaging diagnostic accuracy studies (White et al., 2025). For AI-centered diagnostic accuracy studies, the recent STARD-AI extension further underscores the need for clear reporting of dataset construction, model evaluation, and clinical applicability (Sounderajah et al., 2025). The practical consequence is that studies may be difficult to check, compare, or incorporate into evidence syntheses, even when the underlying clinical question is important.

A related methodological literature addresses situations in which the reference standard is applied only to a subset of participants, creating verification problems and the potential for work-up bias (de Groot et al., 2011; Buzoianu and Kadane, 2008; Umemneku Chikere et al., 2019). That literature is important for the present paper because one of our motivating examples arises from precisely such a setting. At the same time, incomplete reporting also occurs in studies where the reference standard is not missing by design, yet the published article omits enough cell counts that the full diagnostic table cannot be recovered. Our focus is on this reporting and reconstruction problem.

In practical terms, if only sensitivity and specificity are reported, a full $2\times 2$ table cannot usually be reconstructed unless additional information is available, such as the total sample size, the number of diseased subjects, or an externally justified prevalence estimate. Because many evidence-synthesis frameworks require study-level $2\times 2$ tables, missing denominators directly limit downstream meta-analytic use (Macaskill et al., 2010).

This paper develops statistical models for reconstructing incomplete $2\times 2$ diagnostic tables from partially reported data. Our motivation comes from real examples in the diagnostic accuracy literature where key cells are unobserved in the published report. We consider two representative scenarios. In the first, based on Svirsky et al. (2002), only the test-positive subgroup is reported in detail, leaving both cells of the test-negative row unobserved. This case is naturally connected to the literature on partial verification because the reference standard was applied only to a selected subgroup. In the second, based on Wismueller et al. (2020), counts for true positives and false positives are reported and the total sample size is known, but the negative-class counts are omitted. This second setting is better viewed as a constrained incomplete-table problem, since the known total sample size links the missing denominators.

Our aim is not to replace the broader verification-bias literature, nor to claim that incomplete diagnostic tables can always be uniquely recovered from sparse summaries alone. Rather, we develop Bayesian reconstruction strategies for settings in which deterministic recovery is impossible, but principled posterior inference on the missing denominators and derived operating characteristics remains feasible under clearly stated modeling assumptions.

The remainder of the paper is organized as follows. We first review notation for diagnostic $2\times 2$ tables and summarize the binomial-$n$ problem that underlies our reconstruction strategy. We then present the two motivating incomplete-table scenarios, develop the corresponding models, and illustrate their performance on a benchmark example with complete counts treated as partially observed.

Review and Notation for $2\times 2$ Diagnostic Accuracy Tables and the Binomial $n$ Problem

A $2\times 2$ diagnostic table, also called a confusion table, is the standard framework for summarizing the performance of an index test against a reference standard. It records the counts of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN), and serves as the basis for the usual measures of diagnostic performance.

	Disease Present	Disease Absent	Total
Test Positive	TP	FP	$n_{+}=\mathrm{TP}+\mathrm{FP}$
Test Negative	FN	TN	$n_{-}=\mathrm{FN}+\mathrm{TN}$
Total	$n_{1}=\mathrm{TP}+\mathrm{FN}$	$n_{2}=\mathrm{FP}+\mathrm{TN}$	$N$

Table 1: Standard $2\times 2$ diagnostic table showing true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN), with row totals $n_{+}$ and $n_{-}$, column totals $n_{1}$ and $n_{2}$, and overall sample size $N$.

From these counts one obtains the familiar diagnostic accuracy measures:

$\displaystyle\mathrm{Se}$	$\displaystyle=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}},$	$\displaystyle\qquad\mathrm{Sp}$	$\displaystyle=\frac{\mathrm{TN}}{\mathrm{FP}+\mathrm{TN}},$	(1)
$\displaystyle\mathrm{PPV}$	$\displaystyle=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}},$	$\displaystyle\qquad\mathrm{NPV}$	$\displaystyle=\frac{\mathrm{TN}}{\mathrm{FN}+\mathrm{TN}},$	(2)
$\displaystyle\mathrm{Accuracy}$	$\displaystyle=\frac{\mathrm{TP}+\mathrm{TN}}{N},$			(3)

(Langlotz, 2003; Eusebi, 2013; Vidakovic, 2017).

We write $n_{1}=\mathrm{TP}+\mathrm{FN}$ for the total number of diseased individuals, $n_{2}=\mathrm{FP}+\mathrm{TN}$ for the total number of non-diseased individuals, and $N=n_{1}+n_{2}$ for the total sample size. When all four cell counts are available, all standard operating characteristics can be computed directly. When only partial information is reported, for example TP and FP without the corresponding denominators, the full table cannot be reconstructed from the published data alone. In that setting, sensitivity, specificity, and NPV are not identified without additional information or modeling assumptions.
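The identification point above can be made concrete with a small sketch. The helper names (`ratio`, `add`, `measures`) and the counts are hypothetical and chosen only for illustration; a measure is returned as `None` whenever a cell it depends on is unreported. Zero denominators are not handled.

```python
from typing import Optional


def ratio(num: Optional[int], den: Optional[int]) -> Optional[float]:
    """Return num/den, or None when either part is unreported."""
    if num is None or den is None:
        return None
    return num / den


def add(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Sum of two counts, propagating None for a missing cell."""
    return None if a is None or b is None else a + b


def measures(tp=None, fp=None, fn=None, tn=None):
    """Measures (1)-(3); None marks a quantity that is not identified."""
    n1, n2 = add(tp, fn), add(fp, tn)   # diseased and non-diseased totals
    return {
        "Se": ratio(tp, n1),
        "Sp": ratio(tn, n2),
        "PPV": ratio(tp, add(tp, fp)),
        "NPV": ratio(tn, add(fn, tn)),
        "Accuracy": ratio(add(tp, tn), add(n1, n2)),
    }


# Only the test-positive row is reported (hypothetical counts).
partial = measures(tp=40, fp=10)
print(partial)

# All four cells are reported (hypothetical counts).
full = measures(tp=40, fp=10, fn=5, tn=45)
print(full)
```

With only TP and FP supplied, PPV is the single measure that can be computed; every other entry stays `None` until FN and TN are available or modeled.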

Bayesian Approaches to the Binomial $n$ Problem

A central ingredient in our reconstruction problem is the classical binomial-$n$ problem: infer the number of trials $n$ in a $\mathrm{Bin}(n,p)$ model when $n$ is unknown and $p$ may also be unknown. This problem is well known to be difficult, especially when only a single observation or very limited data are available. Classical estimators based on moments or maximum likelihood can be unstable, particularly when the sample variance is close to the sample mean, and the difficulty becomes more pronounced when $p$ is small or $n$ is large (Haldane, 1945; Olkin and Petkau, 1993; DasGupta and Rubin, 2005).
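The instability of the moment estimator can be seen in a few lines. The sketch below uses the standard moment relations mean $=np$ and variance $=np(1-p)$; the function name `moment_estimate_n` and the two small samples are hypothetical and serve only as an illustration.

```python
import statistics


def moment_estimate_n(counts):
    """Method-of-moments estimate of n for i.i.d. Bin(n, p) counts.

    Solves mean = n p and variance = n p (1 - p), giving
    p_hat = 1 - s^2 / mean and n_hat = mean / p_hat.
    Returns None when s^2 >= mean, where no valid estimate exists.
    """
    mean = statistics.fmean(counts)
    var = statistics.variance(counts)   # sample variance (divisor k - 1)
    if var >= mean:
        return None
    p_hat = 1.0 - var / mean
    return mean / p_hat


# Two illustrative samples that differ by one unit in the last count.
sample_a = [16, 18, 22, 25, 27]
sample_b = [16, 18, 22, 25, 28]

print(moment_estimate_n(sample_a))   # very large, since s^2 is just below the mean
print(moment_estimate_n(sample_b))   # None: s^2 exceeds the mean
```

Changing a single count by one moves the estimate from a value above one thousand to no valid estimate at all, which is the behavior that motivates regularization through priors on $n$ and $p$.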

Bayesian methods provide a natural way to regularize this problem by introducing prior information on both $n$ and $p$ . Early work considered simple prior structures such as $n\sim\mathrm{Uniform}(1,N)$ and $p\sim\mathrm{Beta}(\alpha,\beta)$ , leading to posterior inference for $n$ through either posterior modes or posterior means under the chosen loss function (Draper and Guttman, 1978; Carroll and Lombard, 1985). Rubin’s empirical Bayes treatment of the problem was especially influential in motivating later hierarchical formulations (Rubin, 1978).

Subsequent developments introduced more flexible prior models. For example, Raftery (1988) considered a predictive framework with $n\sim\mathrm{Poisson}(\mu)$ and $p\sim\mathrm{Uniform}(0,1)$ . Other extensions include beta priors for $p$ together with truncated Poisson priors for $n$ (Bayoud, 2011), as well as continuous gamma approximations for $n$ that simplify computation (Günel and Chilko, 2000). These approaches reduce the instability of purely classical procedures and allow substantive prior information about plausible ranges of $n$ and $p$ to enter the analysis.

A practically attractive alternative is the empirical Bayes or integrated-likelihood approach, which estimates $n$ jointly with beta hyperparameters for the prior on $p$ by maximizing the beta-binomial likelihood (Carroll and Lombard, 1985; DasGupta and Rubin, 2005). Such methods are often computationally convenient and can perform well in small-sample settings where direct likelihood-based inference is erratic.

Because our incomplete-table problem involves missing denominators rather than merely missing cell probabilities, the binomial-$n$ literature provides a natural modeling foundation. In particular, the unknown stratum totals $n_{1}$ and $n_{2}$ can be viewed as latent trial counts that must be inferred from partially observed binomial information under suitable prior structure.

A recent and comprehensive review of estimation in the binomial-$n$ problem, including classical, Bayesian, empirical Bayes, and computational aspects, is given by Georgieva and Vidakovic (2025).

Incomplete Diagnostic Tables

Published diagnostic studies do not always report the full set of cell counts needed to reconstruct the $2\times 2$ table. When only a subset of the cells is available, quantities such as sensitivity, specificity, negative predictive value, and overall accuracy may no longer be identified from the published data alone. In some cases, positive predictive value can still be computed from the reported true positives and false positives, but the absence of information on false negatives and true negatives prevents recovery of the complete diagnostic table.

The two examples below illustrate that incomplete $2\times 2$ tables may arise in more than one way. In the first case, the missingness is linked to study design: only test-positive subjects underwent verification with the reference standard, so the test-negative row is unobserved. This setting is naturally connected to the literature on partial verification and work-up bias (de Groot et al., 2011; Buzoianu and Kadane, 2008; Cronin and Vickers, 2008; Kohn, 2022; Umemneku Chikere et al., 2019). In the second case, the total sample size is known and the published report provides counts for true positives and false positives, but the negative-class counts are omitted. This is better viewed as an incomplete reporting problem with a known total, which leads to a different reconstruction strategy. Together, these two cases motivate the models developed in the next section.

Case 1: Partial Verification

Svirsky et al. (2002) compared computer-assisted oral brush biopsy results with follow-up scalpel biopsy and histology in order to estimate the positive predictive value of an abnormal brush-biopsy result. Among 243 patients with abnormal brush-biopsy findings who then underwent scalpel biopsy, 93 were confirmed as dysplasia or carcinoma by histology and 150 were histology negative. Thus, PPV can be calculated directly within the test-positive subgroup as $93/243\approx 0.38$ .

The difficulty is that only patients with abnormal brush-biopsy results underwent the reference standard. Patients with normal brush-biopsy results were not verified by histology, so the entire test-negative row is unobserved. As a consequence, the numbers of false negatives and true negatives are unknown, and the full $2\times 2$ table cannot be reconstructed from the published report. Sensitivity, specificity, negative predictive value, and overall accuracy therefore remain unidentified. In this sense, Case 1 is not merely an incomplete-table problem. It is also a partial-verification design, because the reference standard was applied only to a selected subgroup.

	Histology Positive	Histology Negative	Total
Brush biopsy abnormal (test positive)	93 (TP)	150 (FP)	243
Brush biopsy normal (test negative)	? (FN)	? (TN)	?
Total	$n_{1}=?$	$n_{2}=?$	$N=?$

Table 2: Available and missing information from Svirsky et al. (2002). Only patients with abnormal brush-biopsy results underwent scalpel biopsy and histology, so the test-negative row and all column and overall totals are unobserved.

Although the system has at times been described in later discussions as AI-assisted, the technology in Svirsky et al. (2002) is more accurately viewed as an early rule-based computer-assisted diagnostic tool rather than artificial intelligence in the contemporary sense.

Case 2: Incomplete Reporting with Known $N$

Wismueller et al. (2020) evaluated an AI-based system for detecting intracranial hemorrhage on emergent head CT scans. The paper reports that 105 of 122 AI-positive cases were true positives, so PPV can be calculated as $105/122\approx 0.86$ . The study also reports the total number of scans, namely $N=620$ .

However, the total number of actual hemorrhage cases, $n_{1}=\mathrm{TP}+\mathrm{FN}$ , is not reported, and neither are the counts of false negatives and true negatives. Consequently, sensitivity and specificity cannot be computed directly, and the complete $2\times 2$ table cannot be reconstructed from the published data alone.

	ICH Present	ICH Absent	Total
AI positive (flagged as ICH)	105 (TP)	17 (FP)	122
AI negative (not flagged)	? (FN)	? (TN)	?
Total	$n_{1}=?$	$n_{2}=?$	$N=620$

Table 3: Partially observed $2\times 2$ table implied by Wismueller et al. (2020). The publication reports the AI-positive counts and the total sample size $N=620$, but the negative-class counts and diseased/non-diseased totals remain unknown.

Case 2 differs from Case 1 in an important way. Here the main obstacle is not selective verification of the reference standard, but incomplete reporting despite a known total sample size. Because $N$ is available, the missing diseased and non-diseased totals are linked through the identity $n_{1}+n_{2}=N$ . This makes Case 2 a constrained reconstruction problem rather than a partial-verification design. From a modeling point of view, that structural constraint provides information that is absent in Case 1.
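The constraint $n_{1}+n_{2}=N$ can be explored directly. The sketch below enumerates the values of $n_{1}$ compatible with the reported counts and shows the sensitivity and specificity each would imply; the function name `implied_table` and the particular values of $n_{1}$ printed are illustrative choices, not part of the published analysis.

```python
TP, FP, N = 105, 17, 620   # counts reported by Wismueller et al. (2020)


def implied_table(n1, tp=TP, fp=FP, total=N):
    """Missing cells and (Se, Sp) implied by a candidate diseased total n1."""
    n2 = total - n1
    fn, tn = n1 - tp, n2 - fp
    if fn < 0 or tn < 0:
        raise ValueError(f"n1={n1} is incompatible with the reported counts")
    return {"n1": n1, "n2": n2, "FN": fn, "TN": tn,
            "Se": tp / n1, "Sp": tn / n2}


# n1 must be at least TP, and n2 = N - n1 must be at least FP.
feasible = range(TP, N - FP + 1)
print(f"feasible n1: {feasible[0]} to {feasible[-1]}")

for n1 in (feasible[0], 120, 150, 300, feasible[-1]):
    t = implied_table(n1)
    print(f"n1={t['n1']:3d}  FN={t['FN']:3d}  TN={t['TN']:3d}  "
          f"Se={t['Se']:.3f}  Sp={t['Sp']:.3f}")
```

The known total narrows $n_{1}$ to the interval from 105 to 603, but sensitivity still ranges from 1 down to about 0.17 across that interval, so a prior on $n_{1}$ is still needed to obtain useful inference.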

These two examples therefore represent distinct forms of incomplete diagnostic reporting. Case 1 combines incomplete reporting with selective verification, whereas Case 2 is an incomplete-table problem with known total sample size. The distinction is important because it determines how much structural information is available for reconstruction and, consequently, what type of model is appropriate.

Models

We consider two settings for reconstructing incomplete $2\times 2$ diagnostic tables. The first arises when only one row of the table is observed, typically the test-positive row, and the corresponding denominators are unreported. The second arises when $\mathrm{TP}$ , $\mathrm{FP}$ , and the total sample size $N$ are known, so that the missing cells are linked through the constraint $n_{1}+n_{2}=N$ .

Independent Binomial-$n$ Reconstruction for a Single Observed Row

Suppose that only one test-outcome row is reported, with observed counts in the diseased and non-diseased columns. Let $y$ denote the observed count in one column of that row, and let $n$ denote the corresponding unreported denominator. The interpretation of $p$ depends on the column under analysis. If $y$ is the number of true positives among diseased subjects, then $p$ is sensitivity and $n=n_{1}$ . If $y$ is the number of false positives among non-diseased subjects, then $p$ is the false positive rate and $n=n_{2}$ .

We model the observed count as

$\displaystyle y\mid n,p$	$\displaystyle\sim$	$\displaystyle\mathrm{Bin}(n,p),\qquad n\geq y.$	(4)

The success probability is assigned a beta prior

$\displaystyle p$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(\alpha,\beta),$	(5)

with hyperparameters chosen to reflect plausible values of sensitivity or false positive rate, depending on the column.

To regularize the unknown denominator, we assign a truncated negative-binomial prior in the WinBUGS parameterization,

$\displaystyle n\mid p^{\star},r$	$\displaystyle\sim$	$\displaystyle\mathrm{NegBin}(p^{\star},r)\,\mathbf{1}\{n\geq y\},$	(6)

where $\mathrm{NegBin}(p^{\star},r)$ denotes the distribution of the number of failures before the $r$th success in independent trials with success probability $p^{\star}$. Before truncation,

$\displaystyle\mathbb{E}[n]=\frac{r(1-p^{\star})}{p^{\star}},\qquad\mathrm{Var}(n)=\frac{r(1-p^{\star})}{p^{\star 2}}.$
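As a quick numerical check of these two moments under the failures-before-the-$r$th-success parameterization, the sketch below sums the untruncated probability mass function directly. The values $p^{\star}=0.2$ and $r=5$ and the summation limit are arbitrary choices made for the check.

```python
import math


def negbin_pmf(n, p_star, r):
    """P(n failures before the r-th success), success probability p_star."""
    log_pmf = (math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
               + r * math.log(p_star) + n * math.log1p(-p_star))
    return math.exp(log_pmf)


p_star, r = 0.2, 5           # hypothetical values, used only for this check
n_grid = range(0, 2001)      # upper limit lies far in the tail for these values
pmf = [negbin_pmf(n, p_star, r) for n in n_grid]

mean = sum(n * w for n, w in zip(n_grid, pmf))
var = sum((n - mean) ** 2 * w for n, w in zip(n_grid, pmf))

# Numerical moments next to the closed forms r(1-p)/p and r(1-p)/p^2.
print(mean, r * (1 - p_star) / p_star)
print(var, r * (1 - p_star) / p_star ** 2)
```

The same function can be reused to inspect how much prior mass a given $(p^{\star},r)$ places above an observed count $y$ before the truncation in (6) is applied.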

We complete the hierarchy with

$\displaystyle r\mid\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Poisson}(\lambda),$	(7)
$\displaystyle\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Gamma}(a,b),$	(8)
$\displaystyle p^{\star}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(\alpha^{\star},\beta^{\star}).$	(9)

Posterior inference for $(n,p)$ follows from (4)-(9). If the observed count is $\mathrm{TP}$, then the missing cell is

$\displaystyle\mathrm{FN}=n-\mathrm{TP}.$

If the observed count is $\mathrm{FP}$, then the missing cell is

$\displaystyle\mathrm{TN}=n-\mathrm{FP}.$

When both columns of the reported row are available, we fit this model separately to the diseased and non-diseased strata. This yields posterior inference for $n_{1}$ and $n_{2}$ , and hence for the missing cells.

Identifiability and prior sensitivity.

With only a single binomial observation, $n$ and $p$ are only weakly identified from the likelihood. The beta prior on $p$ and the negative-binomial hierarchy on $n$ provide the regularization needed for posterior inference. In practice, sensitivity analyses over $(\alpha,\beta)$ and $(\alpha^{\star},\beta^{\star},a,b)$ are important and can be summarized through posterior intervals for $n$ and the derived missing counts.

The WinBUGS/OpenBUGS implementation of this single-column model is provided in the Supplemental File and can be used twice in Case 1 type applications, once for the diseased column and once for the non-diseased column. For the diseased stratum, $(\alpha,\beta)$ may encode plausible sensitivity values; for the non-diseased stratum, it may encode plausible false positive rates.
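For readers without access to BUGS, the structure of the single-column posterior can be illustrated in a simplified form. The sketch below is not the implementation from the Supplemental File: it holds $(p^{\star},r)$ fixed instead of placing the hierarchy (7)-(9) on them, integrates $p$ out analytically (which gives a beta-binomial marginal likelihood), and evaluates the posterior of $n$ on a finite grid. The values $p^{\star}=0.2$, $r=20$, and the grid limit `n_max` are assumptions made for illustration.

```python
import math


def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_betabinom(y, n, alpha, beta):
    """log P(y | n) after integrating p ~ Beta(alpha, beta) out of Bin(n, p)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + log_beta(y + alpha, n - y + beta) - log_beta(alpha, beta)


def log_negbin(n, p_star, r):
    """log pmf of the number of failures n before the r-th success."""
    return (math.lgamma(n + r) - math.lgamma(r) - math.lgamma(n + 1)
            + r * math.log(p_star) + n * math.log1p(-p_star))


def posterior_n(y, alpha, beta, p_star, r, n_max):
    """Posterior pmf of n on {y, ..., n_max} with (p_star, r) held fixed."""
    support = range(y, n_max + 1)
    logw = [log_negbin(n, p_star, r) + log_betabinom(y, n, alpha, beta)
            for n in support]
    shift = max(logw)                       # stabilize the exponentials
    w = [math.exp(v - shift) for v in logw]
    total = sum(w)
    return {n: wi / total for n, wi in zip(support, w)}


# Hypothetical inputs: y = 71 observed successes, Beta(2, 1) prior on p.
post = posterior_n(y=71, alpha=2, beta=1, p_star=0.2, r=20, n_max=1000)
mean_n = sum(n * pr for n, pr in post.items())
print(f"posterior mean of n = {mean_n:.1f}, implied missing cell = {mean_n - 71:.1f}")
```

Rerunning `posterior_n` over a range of `p_star` and `r` values gives a direct, if crude, view of the prior sensitivity discussed above.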

Reconstruction of the Full $2\times 2$ Table Given $\mathrm{TP}$ , $\mathrm{FP}$ , and $N$

We now consider the setting in which $\mathrm{TP}$, $\mathrm{FP}$, and the total sample size $N$ are reported. Let

$\displaystyle n_{1}=\mathrm{TP}+\mathrm{FN},\qquad n_{2}=\mathrm{FP}+\mathrm{TN},\qquad n_{1}+n_{2}=N.$

Once $n_{1}$ is inferred, the remaining quantities follow from

$\displaystyle n_{2}=N-n_{1},\qquad\mathrm{FN}=n_{1}-\mathrm{TP},\qquad\mathrm{TN}=n_{2}-\mathrm{FP}.$

The likelihood is

	$\displaystyle\mathrm{TP}\mid n_{1},p_{1}$	$\displaystyle\sim$	$\displaystyle\mathrm{Bin}(n_{1},p_{1}),$		(10)
	$\displaystyle\mathrm{FP}\mid n_{2},p_{2}$	$\displaystyle\sim$	$\displaystyle\mathrm{Bin}(n_{2},p_{2}),$		(11)

where $p_{1}$ is sensitivity and $1-p_{2}$ is specificity. The three models below share this same likelihood and differ only in the prior assigned to $n_{1}$ .

Model 1: Discrete uniform prior.

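A non-informative baseline model assigns equal prior mass to each value of $n_{1}$ in $\{1,\dots,N-1\}$; values with $n_{1}<\mathrm{TP}$ or $n_{2}<\mathrm{FP}$ receive zero likelihood under (10)-(11) and are therefore excluded from the posterior: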

	$\displaystyle n_{1}$	$\displaystyle\sim$	$\displaystyle\mathrm{Uniform}\{1,\dots,N-1\},$		(12)
	$\displaystyle n_{2}$	$\displaystyle=$	$\displaystyle N-n_{1},$		(13)

with independent beta priors

	$\displaystyle p_{1}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(a_{1},b_{1}),$		(14)
	$\displaystyle p_{2}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(a_{2},b_{2}).$		(15)

Model 2: Truncated Poisson prior.

To favor moderate values of $n_{1}$ , we replace the discrete uniform prior by a truncated Poisson prior:

$\displaystyle n_{1}\mid\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Poisson}(\lambda)\,\mathbf{1}\{1\leq n_{1}\leq N-1\},$	(16)
$\displaystyle n_{2}$	$\displaystyle=$	$\displaystyle N-n_{1},$	(17)
$\displaystyle\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Gamma}(a_{\lambda},b_{\lambda}),$	(18)

again with independent beta priors on $p_{1}$ and $p_{2}$ .

Model 3: Truncated negative-binomial prior.

To allow additional dispersion in the diseased stratum size, we use a truncated negative-binomial prior:

$\displaystyle n_{1}\mid p_{3},r$	$\displaystyle\sim$	$\displaystyle\mathrm{NegBin}(p_{3},r)\,\mathbf{1}\{\mathrm{TP}\leq n_{1}\leq N-1\},$	(19)
$\displaystyle n_{2}$	$\displaystyle=$	$\displaystyle N-n_{1},$	(20)
$\displaystyle p_{3}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(a_{3},b_{3}),$	(21)
$\displaystyle r$	$\displaystyle\sim$	$\displaystyle\mathrm{Gamma}(a_{r},b_{r}),$	(22)

together with independent beta priors on $p_{1}$ and $p_{2}$ .

Comparison of the three priors.

Model 1 provides a flat baseline over the feasible values of $n_{1}$ . Model 2 introduces mild regularization through the Poisson mean $\lambda$ . Model 3 allows heavier tails and greater dispersion through $(p_{3},r)$ , and is therefore more flexible when disease prevalence is uncertain or substantial imbalance between strata is plausible.

In all three cases, posterior inference for $n_{1}$ determines $n_{2}$ , $\mathrm{FN}$ , and $\mathrm{TN}$ , thereby yielding a reconstructed $2\times 2$ table and allowing calculation of sensitivity, specificity, predictive values, and accuracy.

WinBUGS code for all three variants is provided in the Supplemental File.
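Because $n_{1}$ is the only discrete unknown once $p_{1}$ and $p_{2}$ are integrated out, the posterior under Model 1 can also be computed by direct enumeration. The sketch below is an independent illustration, not the WinBUGS code: it uses the beta-binomial marginal of (10)-(11), the breast MRI counts $\mathrm{TP}=71$, $\mathrm{FP}=28$, $N=182$, and beta hyperparameters $(2,1)$ and $(2,5)$ borrowed from the single-row application as an assumption. The optional `log_prior` argument is a hypothetical hook for other fixed-hyperparameter priors on $n_{1}$.

```python
import math


def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_betabinom(y, n, a, b):
    """log P(y | n) with the success probability ~ Beta(a, b) integrated out."""
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + log_beta(y + a, n - y + b) - log_beta(a, b)


def posterior_n1(tp, fp, total, prior1, prior2, log_prior=lambda n1: 0.0):
    """Posterior pmf of n1 given TP, FP, and N.

    prior1 = (a1, b1) for sensitivity, prior2 = (a2, b2) for the false
    positive rate. log_prior is the log prior mass of n1 up to a constant;
    the constant default corresponds to the discrete uniform Model 1.
    """
    # Values outside this range have zero prior mass or zero likelihood.
    support = range(max(tp, 1), total - max(fp, 1) + 1)
    logw = [log_prior(n1)
            + log_betabinom(tp, n1, *prior1)
            + log_betabinom(fp, total - n1, *prior2)
            for n1 in support]
    shift = max(logw)
    w = [math.exp(v - shift) for v in logw]
    z = sum(w)
    return {n1: wi / z for n1, wi in zip(support, w)}


post = posterior_n1(tp=71, fp=28, total=182, prior1=(2, 1), prior2=(2, 5))
mean_n1 = sum(n1 * pr for n1, pr in post.items())
print(f"posterior mean of n1 = {mean_n1:.1f}")
print(f"implied FN = {mean_n1 - 71:.1f}, implied TN = {182 - mean_n1 - 28:.1f}")
```

A Poisson prior with a fixed mean could be tried by passing, for example, `log_prior=lambda n1: n1 * math.log(lam) - math.lgamma(n1 + 1)` for a chosen `lam`; the full Models 2 and 3 additionally place priors on those hyperparameters, which this enumeration does not attempt.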

Empirical Application

To evaluate the proposed models, we applied them to a complete contingency table from a breast MRI study (Langlotz, 2003). The dataset consists of 182 women with clinically or mammographically suspicious lesions, all of whom underwent biopsy, taken here as the reference standard. A true positive (TP) denotes an MRI-positive case with malignancy confirmed on biopsy, a false positive (FP) an MRI-positive case with benign biopsy, a false negative (FN) an MRI-negative case with malignancy on biopsy, and a true negative (TN) an MRI-negative case with benign biopsy.

Table 4 gives the complete $2\times 2$ table. Because the full table is known, this example permits direct comparison between reconstructed and true counts.

Table 4: Patient data from the breast MRI study (Langlotz, 2003).

MRI Result	Malignant	Benign	Total
Positive	71	28	99
Negative	3	80	83
Total	74	108	182

Single-Row Reconstruction

The next two subsections apply the single-row model separately to the diseased and non-diseased strata. We assume that only the first row of Table 4 is available, namely $\mathrm{TP}=71$ malignant and $\mathrm{FP}=28$ benign cases among MRI-positive patients. The stratum totals $n_{1}$ and $n_{2}$ are then treated as unknown and estimated from the corresponding single-row models.

Diseased Stratum

For the diseased stratum, we observe $\mathrm{TP}=71$ and model

$\displaystyle\mathrm{TP}\mid n_{1},p$	$\displaystyle\sim$	$\displaystyle\mathrm{Bin}(n_{1},p),$	(23)

where $p$ is the sensitivity of MRI. The denominator $n_{1}$ is assigned the truncated negative-binomial prior

$\displaystyle n_{1}\mid p^{\star},r$	$\displaystyle\sim$	$\displaystyle\mathrm{NegBin}(p^{\star},r)\,\mathbf{1}\{n_{1}\geq\mathrm{TP}\},$	(24)

with hierarchical priors

$\displaystyle r\mid\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Poisson}(\lambda),\qquad\lambda\sim\mathrm{Gamma}(a,b),$	(25)
$\displaystyle p^{\star}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(\alpha^{\star},\beta^{\star}),$	(26)
$\displaystyle p$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(\alpha,\beta).$	(27)

For this example we use

$\displaystyle a=1,\;b=0.1,\qquad\alpha=2,\;\beta=1,\qquad\alpha^{\star}=1,\;\beta^{\star}=1.$	(28)

The prior $\mathrm{Beta}(2,1)$ on $p$ reflects the expectation of moderate to high sensitivity without being overly restrictive. The prior on $\lambda$ is diffuse, and the uniform $\mathrm{Beta}(1,1)$ prior on $p^{\star}$ allows the negative-binomial prior on $n_{1}$ to adapt to the data.

MCMC sampling is initialized at $n_{1}=100$ , $r=70$ , $\lambda=70$ , and $p=0.5$ .

Table 5: Posterior summary: diseased stratum, single-row model (100,000 MCMC iterations).

Parameter	Mean	SD	2.5%	Median	97.5%
$\lambda$	16.25	11.59	1.49	13.74	45.32
$n_{1}$	76.57	8.73	71.0	74.0	99.0
$p$	0.926	0.080	0.705	0.951	0.998
$p^{\star}$	0.178	0.104	0.019	0.165	0.410
$r$	16.88	12.05	1.0	14.0	47.0

The posterior mean of $n_{1}$ is 76.6, close to the true value of 74, and the 95% credible interval $(71,99)$ contains the truth; the posterior median of 74 matches it exactly. The posterior mean of $p$ is 0.93, consistent with high MRI sensitivity. The implied estimate $\mathrm{FN}=n_{1}-\mathrm{TP}\approx 6$, based on the posterior mean, is reasonably close to the true value of 3. Together with the observed $\mathrm{TP}=71$ and $\mathrm{FP}=28$, this fills three of the four cells, leaving only $\mathrm{TN}$ to be reconstructed.

Non-Diseased Stratum

For the benign stratum, we observe $\mathrm{FP}=28$ and estimate the unreported denominator $n_{2}$ together with the false positive rate $p$ . The model is

$\displaystyle\mathrm{FP}\mid n_{2},p$	$\displaystyle\sim$	$\displaystyle\mathrm{Bin}(n_{2},p),$	(29)
$\displaystyle n_{2}\mid p^{\star},r$	$\displaystyle\sim$	$\displaystyle\mathrm{NegBin}(p^{\star},r)\,\mathbf{1}\{n_{2}\geq\mathrm{FP}\},$	(30)
$\displaystyle r\mid\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Poisson}(\lambda),\qquad\lambda\sim\mathrm{Gamma}(a,b),$	(31)
$\displaystyle p^{\star}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(\alpha^{\star},\beta^{\star}),\qquad p\sim\mathrm{Beta}(\alpha,\beta).$	(32)

Here we set

$\displaystyle a=2,\;b=1,\qquad\alpha=2,\;\beta=5,\qquad\alpha^{\star}=1,\;\beta^{\star}=50.$	(33)

The prior $\mathrm{Beta}(2,5)$ reflects the expectation that the false positive rate is below $0.5$ while remaining flexible. The small prior mean of $p^{\star}$ places substantial mass on larger values of $n_{2}$ , consistent with the expectation that the non-diseased stratum may be larger than the diseased stratum.
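The verbal descriptions of the beta priors used in this application can be checked numerically. The sketch below computes prior means and tail probabilities for $\mathrm{Beta}(2,1)$, $\mathrm{Beta}(2,5)$, and $\mathrm{Beta}(1,50)$; it relies on the identity $I_{x}(a,b)=P\{\mathrm{Bin}(a+b-1,x)\geq a\}$, which holds for positive integer $a$ and $b$. The function names are illustrative.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, valid for positive integer a and b.

    Uses I_x(a, b) = P(Bin(a + b - 1, x) >= a).
    """
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))


def beta_mean(a, b):
    return a / (a + b)


# Sensitivity prior used for the diseased stratum.
print("Beta(2,1):  mean =", beta_mean(2, 1),
      " P(p > 0.5) =", 1 - beta_cdf_int(0.5, 2, 1))

# False positive rate prior used for the non-diseased stratum.
print("Beta(2,5):  mean =", beta_mean(2, 5),
      " P(p < 0.5) =", beta_cdf_int(0.5, 2, 5))

# Prior on p* for the non-diseased stratum.
print("Beta(1,50): mean =", beta_mean(1, 50))
```

The $\mathrm{Beta}(2,5)$ prior places $57/64\approx 0.89$ of its mass below 0.5, and the $\mathrm{Beta}(2,1)$ prior places 0.75 of its mass above 0.5, in line with the descriptions in the text. The $\mathrm{Beta}(1,50)$ prior has mean $1/51\approx 0.02$, so for a given $r$ the untruncated negative-binomial mean $r(1-p^{\star})/p^{\star}$ evaluated at that value is $50r$.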

Sampling is initialized at $n_{2}=100$ , $r=70$ , $\lambda=70$ , and $p=0.5$ .

Table 6: Posterior summary: non-diseased stratum, single-row model (100,000 MCMC iterations).

Parameter	Mean	SD	2.5%	Median	97.5%
$\lambda$	1.712	1.205	0.200	1.439	4.723
$n_{2}$	106.0	75.3	40.0	86.0	295.0
$p$	0.336	0.149	0.094	0.319	0.664
$p^{\star}$	0.016	0.015	0.0004	0.011	0.054
$r$	1.428	1.556	0.0	1.0	5.0

The posterior mean of $n_{2}$ is 106, close to the true total of 108, although the 95% credible interval $(40,295)$ is wide. This reflects the limited information contained in a single observed cell together with the intentionally overdispersed prior. The posterior mean of $p$ is 0.34, consistent with a moderate false positive rate.

The resulting estimate implies $\mathrm{TN}=n_{2}-\mathrm{FP}\approx 78$ , close to the true value of 80. Combining this with the estimated diseased total produces the reconstructed table in Table 7.

Table 7: Reconstructed $2\times 2$ table from the single-row model.

	Malignant	Benign
MRI positive	71	28
MRI negative	6	78

Even with only one observed row, the hierarchical Bayesian model recovers plausible denominators and yields a reasonable approximation to the full diagnostic structure, albeit with substantial uncertainty in the non-diseased stratum.
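To see what the reconstruction means for the operating characteristics, the measures implied by Table 7 can be set beside those from the complete Table 4. The sketch below is plain arithmetic on the two tables; the function name `measures` is illustrative, and the reconstructed cells are the rounded posterior-mean values, so the comparison ignores posterior uncertainty.

```python
def measures(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from a complete 2x2 table."""
    return {
        "Se": tp / (tp + fn),
        "Sp": tn / (fp + tn),
        "PPV": tp / (tp + fp),
        "NPV": tn / (fn + tn),
    }


true_table = dict(tp=71, fp=28, fn=3, tn=80)       # Table 4
reconstructed = dict(tp=71, fp=28, fn=6, tn=78)    # Table 7

m_true = measures(**true_table)
m_rec = measures(**reconstructed)

for key in m_true:
    print(f"{key}: true {m_true[key]:.3f}, reconstructed {m_rec[key]:.3f}")
```

PPV is identical in both tables because the test-positive row is observed; the differences in the other measures come entirely from the reconstructed FN and TN.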

Single-Stratum Models with Known $N$

We next examine the same strata when the total sample size is treated as known. This adds the constraint that neither stratum size can exceed $N$, which changes the posterior behavior, especially for the non-diseased group.

Diseased Stratum with Known $N$

For the diseased stratum we use

$\displaystyle\mathrm{TP}\mid n_{1},p$	$\displaystyle\sim$	$\displaystyle\mathrm{Binomial}(n_{1},p),$	(34)
$\displaystyle n_{1}\mid p^{\star},r$	$\displaystyle\sim$	$\displaystyle\mathrm{Negative\text{-}Binomial}(p^{\star},r)\,\mathbf{1}\{\mathrm{TP}\leq n_{1}\leq N\},$	(35)
$\displaystyle r\mid\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Poisson}(\lambda),\qquad\lambda\sim\mathrm{Gamma}(a,b),$	(36)
$\displaystyle p^{\star}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(\alpha^{\star},\beta^{\star}),\qquad p\sim\mathrm{Beta}(\alpha,\beta).$	(37)

The truncation $\mathrm{TP}\leq n_{1}\leq N$ enforces compatibility with both the observed true positives and the known study size.

We again set

$\displaystyle a=1,\;b=0.1,\qquad\alpha=2,\;\beta=1,\qquad\alpha^{\star}=1,\;\beta^{\star}=1.$	(38)

These choices favor moderate to high sensitivity while keeping the prior on $n_{1}$ fairly diffuse. MCMC sampling is initialized at $n_{1}=100$ , $r=70$ , $\lambda=70$ , $p=0.5$ , and $p^{\star}=0.5$ .

Table 8: Posterior summary: diseased stratum, known $N$ (900,000 MCMC iterations).

Parameter	Mean	SD	2.5%	Median	97.5%
$\lambda$	16.68	12.02	1.963	13.89	46.98
$n_{1}$	76.34	7.792	71.0	74.0	97.0
$p$	0.927	0.076	0.716	0.952	0.998
$p^{\star}$	0.182	0.106	0.026	0.167	0.420
$r$	17.35	12.50	2.082	14.45	48.84

The posterior mean of $n_{1}$ is 76.3, again close to the true diseased count of 74, which lies within the 95% credible interval $(71,97)$. This estimate is nearly identical to that obtained under the unconstrained single-row model, suggesting that inference on $n_{1}$ is driven mainly by the observed true positives rather than by the upper-bound constraint. The posterior for $p$ remains concentrated near high sensitivity.

Non-Diseased Stratum with Known $N$

For the non-diseased stratum, the observed count is $\mathrm{FP}=28$ and the inferential target is the number of benign cases $n_{2}\leq N$ together with the false positive rate $p$ . The model is

$\displaystyle\mathrm{FP}\mid n_{2},p$	$\displaystyle\sim$	$\displaystyle\mathrm{Binomial}(n_{2},p),$	(39)
$\displaystyle n_{2}\mid p^{\star},r$	$\displaystyle\sim$	$\displaystyle\mathrm{NegBin}(p^{\star},r)\,\mathbf{1}\{\mathrm{FP}\leq n_{2}\leq N\},$	(40)
$\displaystyle r\mid\lambda$	$\displaystyle\sim$	$\displaystyle\mathrm{Poisson}(\lambda),\qquad\lambda\sim\mathrm{Gamma}(a,b),$	(41)
$\displaystyle p^{\star}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(\alpha^{\star},\beta^{\star}),\qquad p\sim\mathrm{Beta}(\alpha,\beta).$	(42)

We use

$\displaystyle a=2,\;b=1,\qquad\alpha=2,\;\beta=5,\qquad\alpha^{\star}=1,\;\beta^{\star}=50.$	(43)

The prior on $p$ again reflects the expectation of a modest false positive rate, while the prior on $p^{\star}$ favors larger values of $n_{2}$ without allowing arbitrarily large realizations once the upper bound $N$ is imposed.
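To make the effect of the $\mathrm{Beta}(1,50)$ prior concrete, consider the mean of the untruncated negative binomial. This is a back-of-the-envelope calculation under the assumption that $\mathrm{NegBin}(p^{\star},r)$ counts failures before the $r$-th success. It plugs in the prior mean of $p^{\star}$ rather than averaging over its prior.

```latex
\mathbb{E}[n_{2}\mid p^{\star},r]=\frac{r\,(1-p^{\star})}{p^{\star}},
\qquad
\mathbb{E}[p^{\star}]=\frac{\alpha^{\star}}{\alpha^{\star}+\beta^{\star}}=\frac{1}{51}\approx 0.020
\quad\Longrightarrow\quad
\left.\frac{r\,(1-p^{\star})}{p^{\star}}\right|_{p^{\star}=1/51}=50\,r .
```

With $\mathbb{E}[r]=\mathbb{E}[\lambda]=a/b=2$ under (43), this plug-in value is about 100 before truncation, which places the bulk of the prior mass well above $\mathrm{FP}=28$ while the bound $n_{2}\leq N$ removes the far right tail.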

Sampling is initialized at $n_{2}=100$, $r=70$, $\lambda=70$, $p=0.5$, and $p^{\star}=0.5$.

Table 9: Posterior summary: non-diseased stratum, known $N$ (900,000 MCMC iterations).

Parameter	Mean	SD	2.5%	Median	97.5%
$\lambda$	2.092	1.249	0.387	1.854	5.147
$n_{2}$	80.36	32.39	38.0	73.0	162.0
$p$	0.388	0.142	0.165	0.373	0.697
$p^{\star}$	0.025	0.018	0.002	0.021	0.072
$r$	2.186	1.438	0.304	1.888	5.749

Imposing $N$ as an upper bound substantially concentrates the posterior for $n_{2}$ relative to the unconstrained single-row model. In the unconstrained analysis, the heavy right tail pushed the posterior mean toward the true value of 108, albeit with considerable uncertainty. Under the bounded model, those extreme values are removed, producing a posterior mean of 80.4 and a median of 73. Thus, the known-$N$ constraint improves stability and interpretability, but in this example it reduces point-estimation accuracy for the non-diseased stratum.

Joint $\mathrm{TP}/\mathrm{FP}$ Model with Fixed $N$

Finally, we consider a joint model in which $\mathrm{TP}=71$ and $\mathrm{FP}=28$ are analyzed simultaneously under the fixed total $N=182$. The inferential targets are $n_{1}$, $n_{2}=N-n_{1}$, sensitivity $p_{1}$, and false positive rate $p_{2}$.

The model is

$\displaystyle\mathrm{TP}\mid n_{1},p_{1}$	$\displaystyle\sim$	$\displaystyle\mathrm{Binomial}(n_{1},p_{1}),$	(44)
$\displaystyle\mathrm{FP}\mid n_{2},p_{2}$	$\displaystyle\sim$	$\displaystyle\mathrm{Binomial}(n_{2},p_{2}),$	(45)
$\displaystyle n_{1}\mid p_{3},r$	$\displaystyle\sim$	$\displaystyle\mathrm{NegBin}(p_{3},r)\,\mathbf{1}\{\mathrm{TP}\leq n_{1}\leq N-1\},$	(46)
$\displaystyle n_{2}$	$\displaystyle=$	$\displaystyle N-n_{1},$	(47)
$\displaystyle p_{1}$	$\displaystyle\sim$	$\displaystyle\mathrm{Beta}(a_{1},b_{1}),\qquad p_{2}\sim\mathrm{Beta}(a_{2},b_{2}),\qquad p_{3}\sim\mathrm{Beta}(a_{3},b_{3}),$	(48)
$\displaystyle r$	$\displaystyle\sim$	$\displaystyle\mathrm{Gamma}(0.1,0.01).$	(49)

The fixed-$N$ constraint couples the two strata and enforces coherence across the reconstructed table.

We set

$\displaystyle a_{1}=1,\;b_{1}=0.1,\qquad a_{2}=0.1,\;b_{2}=1,\qquad a_{3}=0.1,\;b_{3}=0.5,$	(50)

with $r\sim\mathrm{Gamma}(0.1,0.01)$ as in (49). The prior on $p_{1}$ places substantial mass near one, the prior on $p_{2}$ favors smaller false positive rates, and the prior on $p_{3}$ controls dispersion in the negative-binomial prior for $n_{1}$. MCMC is initialized at $n_{1}=80$, $r=20$, and $p_{1}=p_{2}=p_{3}=0.5$.
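The coupling induced by $n_{2}=N-n_{1}$ can be sketched with a grid calculation over $n_{1}$ alone, in which $p_{1}$ and $p_{2}$ are integrated out under the Beta priors in (50). This is a simplified illustration, not the MCMC fit reported here. The values of $p_{3}$ and $r$ are fixed at arbitrary illustrative numbers (an assumption) instead of being given the priors in (48) and (49). The negative binomial is assumed to count failures before the $r$-th success. Function names are hypothetical.

```python
from math import lgamma, log, exp


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_betabinom(k, n, a, b):
    """log P(k | n) for k ~ Binomial(n, p) with p ~ Beta(a, b) integrated out."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def log_negbin(x, p, r):
    """log pmf of x failures before the r-th success (assumed parameterization)."""
    return lgamma(x + r) - lgamma(r) - lgamma(x + 1) + r * log(p) + x * log(1 - p)


def joint_posterior_n1(tp, fp, N, a1, b1, a2, b2, p3, r):
    """Posterior of n1 when n2 = N - n1 is determined by the fixed total N."""
    logw = {}
    for n1 in range(tp, N):  # truncation TP <= n1 <= N - 1
        n2 = N - n1
        if n2 < fp:
            continue  # FP cannot exceed n2, so the likelihood is zero here
        logw[n1] = (
            log_betabinom(tp, n1, a1, b1)    # diseased stratum
            + log_betabinom(fp, n2, a2, b2)  # non-diseased stratum
            + log_negbin(n1, p3, r)          # prior on n1
        )
    m = max(logw.values())  # subtract the maximum for numerical stability
    unnorm = {n1: exp(v - m) for n1, v in logw.items()}
    total = sum(unnorm.values())
    return {n1: w / total for n1, w in unnorm.items()}


# Beta hyperparameters from (50); p3 = 0.2 and r = 20 are illustrative fixed values.
post = joint_posterior_n1(tp=71, fp=28, N=182,
                          a1=1.0, b1=0.1, a2=0.1, b2=1.0, p3=0.2, r=20.0)
mode_n1 = max(post, key=post.get)
mean_n1 = sum(n1 * pr for n1, pr in post.items())
mean_n2 = 182 - mean_n1  # n2 is a deterministic function of n1
```

The feasible range for $n_{1}$ in this sketch is $71\leq n_{1}\leq 154$, since $n_{2}=N-n_{1}$ must be at least $\mathrm{FP}=28$. This upper limit is tighter than the $N-1$ bound in the prior alone.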

Table 10: Posterior summary: joint $\mathrm{TP}/\mathrm{FP}$ model with fixed $N$ (900,000 MCMC iterations).

Parameter	Mean	SD	2.5%	Median	97.5%
$n_{1}$	73.04	8.205	71.0	71.0	93.0
$n_{2}$	109.0	8.205	89.0	111.0	111.0
$p_{1}$	0.978	0.067	0.768	1.000	1.000
$p_{2}$	0.089	0.197	${<}10^{-3}$	${<}10^{-3}$	0.782
$p_{3}$	0.144	0.187	0.014	0.059	0.673
$r$	19.64	38.84	0.039	4.435	143.1

Under this joint specification, posterior inference for the stratum totals is both concentrated and internally consistent: the posterior means of $n_{1}$ and $n_{2}$ are 73.0 and 109.0, each within about one individual of the true values (74 and 108). The posterior for $p_{1}$ concentrates near one, while the posterior for $p_{2}$ is centered near zero but retains a right tail, reflecting residual uncertainty in the non-diseased stratum.
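Once stratum totals are available, the remaining cells and the usual accuracy measures follow by arithmetic. The sketch below plugs in the rounded posterior means $n_{1}=73$ and $n_{2}=109$ purely for illustration. A full analysis would instead compute these quantities for each posterior draw so that uncertainty is propagated. The function names are hypothetical.

```python
def complete_table(tp, fp, n1, n2):
    """Fill in FN and TN from the observed row and the stratum totals."""
    if not (0 <= tp <= n1 and 0 <= fp <= n2):
        raise ValueError("observed counts must not exceed stratum totals")
    return {"TP": tp, "FN": n1 - tp, "FP": fp, "TN": n2 - fp}


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den > 0 else float("nan")


def accuracy_measures(t):
    """Standard operating characteristics of a completed 2x2 table."""
    return {
        "sensitivity": _ratio(t["TP"], t["TP"] + t["FN"]),
        "specificity": _ratio(t["TN"], t["TN"] + t["FP"]),
        "ppv": _ratio(t["TP"], t["TP"] + t["FP"]),
        "npv": _ratio(t["TN"], t["TN"] + t["FN"]),
    }


# Observed row TP = 71, FP = 28; totals are rounded posterior means (illustrative plug-in).
table = complete_table(tp=71, fp=28, n1=73, n2=109)
measures = accuracy_measures(table)
```

Note that the positive predictive value depends only on the observed row, $\mathrm{TP}/(\mathrm{TP}+\mathrm{FP})=71/99$, so it is unaffected by the reconstruction. Sensitivity, specificity, and the negative predictive value all depend on the inferred totals.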

Compared with the preceding analyses, the joint model combines the information in $\mathrm{TP}$ and $\mathrm{FP}$ under the fixed-$N$ constraint and therefore avoids the instability of fitting the two strata separately. In this example, it provides the most balanced reconstruction, with realistic stratum sizes and appropriately quantified uncertainty.

Conclusions

We have developed hierarchical Bayesian models for reconstructing incomplete $2\times 2$ diagnostic tables in settings where only partial cell counts are reported. The proposed framework covers both the single-row setting, in which the denominators of the diseased and non-diseased strata are unobserved, and the constrained setting in which the total sample size $N$ is known. By combining binomial likelihoods with flexible priors on the latent denominators and diagnostic probabilities, the models provide a coherent way to infer missing cells and derived accuracy measures from incomplete published information.

The empirical application illustrates both the potential and the limitations of this approach. When only the test-positive row is observed, posterior inference can recover plausible values for the missing denominators and yield a reasonable reconstruction of the full diagnostic table, although uncertainty may remain substantial, especially for the non-diseased stratum. When the total sample size is known, the additional structural constraint can sharpen inference, and in the joint fixed-$N$ formulation the reconstructed stratum sizes are close to the true values in the benchmark example. At the same time, the results show that reconstruction accuracy depends on the amount of information available and on the prior specification, particularly in weakly identified single-row settings.

The main contribution of the paper is therefore methodological and practical: the proposed models do not claim to identify a unique missing table from sparse summaries alone. Rather, they provide a principled Bayesian framework for posterior inference on plausible completions of incompletely reported diagnostic tables, together with uncertainty quantification for the reconstructed cells and the resulting operating characteristics.

A further point concerns likelihood specification. One might consider a multinomial likelihood for the full $2\times 2$ table as an alternative starting point. Within the settings studied here, however, this does not appear to yield a substantive advantage over the binomial formulations already used. Once the unobserved cells are treated as latent and the structural constraints are imposed, the multinomial representation does not contribute additional identifiability beyond that already supplied by the observed counts, the prior structure, and, when available, the known total sample size.
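One way to see this is through the standard factorization of the multinomial likelihood. The notation here is an assumption introduced for illustration: $\pi$ denotes prevalence, $p_{1}$ sensitivity, $p_{2}$ the false positive rate, $\mathrm{FN}=n_{1}-\mathrm{TP}$, and $\mathrm{TN}=n_{2}-\mathrm{FP}$ with $n_{2}=N-n_{1}$.

```latex
\frac{N!}{\mathrm{TP}!\,\mathrm{FN}!\,\mathrm{FP}!\,\mathrm{TN}!}\,
(\pi p_{1})^{\mathrm{TP}}\{\pi(1-p_{1})\}^{\mathrm{FN}}
\{(1-\pi)p_{2}\}^{\mathrm{FP}}\{(1-\pi)(1-p_{2})\}^{\mathrm{TN}}
=
\underbrace{\binom{N}{n_{1}}\pi^{n_{1}}(1-\pi)^{n_{2}}}_{\text{stratum sizes}}
\;
\underbrace{\binom{n_{1}}{\mathrm{TP}}p_{1}^{\mathrm{TP}}(1-p_{1})^{\mathrm{FN}}}_{\text{diseased stratum}}
\;
\underbrace{\binom{n_{2}}{\mathrm{FP}}p_{2}^{\mathrm{FP}}(1-p_{2})^{\mathrm{TN}}}_{\text{non-diseased stratum}} .
```

Conditional on $n_{1}$, the last two factors are exactly the binomial likelihoods used in the joint model. The first factor involves only $n_{1}$ and $\pi$, so with $n_{1}$ latent it acts like a prior on the stratum size rather than as an additional source of data.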

More broadly, the work highlights the continuing importance of complete and transparent reporting in diagnostic accuracy studies. When denominators or entire rows of the diagnostic table are omitted, clinically relevant measures may become unavailable without additional modeling assumptions. Bayesian reconstruction cannot replace good reporting practice, but it can provide a useful inferential tool when incomplete reporting prevents direct recovery of the full table.

Data and code availability.

Reproducible code supporting the analyses in this manuscript, including the WinBUGS and R scripts used in the empirical analyses, is available at https://github.com/saraantonijevic/bayesian_diagnostic_table-reconstruction. A standalone appendix containing the full Bayesian analyses of the Svirsky (2002) and Wismueller (2020) motivating examples, including posterior summaries and reconstructed $2\times 2$ tables, is provided in the same repository.

Acknowledgment. B. Vidakovic and S. Antonijevic acknowledge support from the National Science Foundation under Grant No. 2515246 at Texas A&M University.

References

F. Bayoud (2011) Bayesian and empirical Bayes estimation of binomial $n$ under truncated Poisson priors. Journal of Statistical Computation and Simulation 81, pp. 121–135.
P. M. Bossuyt, J. B. Reitsma, D. E. Bruns, C. A. Gatsonis, P. P. Glasziou, L. Irwig, et al. (2015) STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ 351, pp. h5527.
M. Buzoianu and J. B. Kadane (2008) Adjusting for verification bias in diagnostic test evaluation: a Bayesian approach. Statistics in Medicine 27 (13), pp. 2453–2473.
R. J. Carroll and F. Lombard (1985) A Bayesian approach to the binomial $n$ problem using the integrated likelihood. Biometrika 72, pp. 583–590.
J. F. Cohen, D. A. Korevaar, D. G. Altman, D. E. Bruns, C. A. Gatsonis, L. Hooft, et al. (2016) STARD 2015 guidelines for reporting diagnostic accuracy studies: Explanation and elaboration. BMJ Open 6 (11), pp. e012799.
A. M. Cronin and A. J. Vickers (2008) Statistical methods to correct for verification bias in diagnostic studies. Statistics in Medicine 27 (23), pp. 4670–4685.
A. DasGupta and D. B. Rubin (2005) Improved moment and maximum likelihood estimators for the binomial $n$ problem. Statistica Sinica 15, pp. 709–722.
J. A. H. de Groot, P. M. M. Bossuyt, J. B. Reitsma, A. W. S. Rutjes, N. Dendukuri, K. J. M. Janssen, and K. G. M. Moons (2011) Verification problems in diagnostic accuracy studies: consequences and solutions. BMJ 343, pp. d4770.
N. R. Draper and I. Guttman (1978) Bayesian estimation of binomial $n$ with beta prior for $p$. Technometrics 20, pp. 217–222.
EQUATOR Network (2015) STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. Note: Checklist and resources available at https://www.equator-network.org/reporting-guidelines/stard/. Accessed 2025-10-21.
P. Eusebi (2013) Diagnostic accuracy measures. Cerebrovascular Diseases 36 (4), pp. 267–272.
M. Georgieva and B. Vidakovic (2025) Revisiting estimation of number of trials in the Binomial $(n,p)$ problem with a single observation. International Statistical Review 93 (2), pp. 246–266.
F. Günel and R. Chilko (2000) Continuous approximations for Bayesian estimation of binomial $n$. Computational Statistics 15 (3), pp. 345–361.
J. B. S. Haldane (1945) On a method of estimating $n$ and $p$ in the binomial distribution. Biometrika 33, pp. 264–274.
M. A. Kohn (2022) Partial verification bias and test result-based sampling. Journal of Clinical Epidemiology 145, pp. 179–182.
C. P. Langlotz (2003) Fundamental measures of diagnostic examination performance: usefulness for clinical decision making and research. Radiology 228, pp. 3–9.
P. Macaskill, C. Gatsonis, J. J. Deeks, R. M. Harbord, and Y. Takwoingi (2010) Cochrane handbook for systematic reviews of diagnostic test accuracy, version 1.0, chapter 10. The Cochrane Collaboration.
I. Olkin and A. J. Petkau (1993) Stabilized estimators for the binomial $n$ problem. Journal of Statistical Planning and Inference 37, pp. 89–105.
A. E. Raftery (1988) A Bayesian approach to the binomial $n$ problem. Journal of the American Statistical Association 83, pp. 703–709.
H. Rubin (1978) Empirical Bayes estimation of the Binomial $n$ problem. Journal of the American Statistical Association 73 (363), pp. 173–178.
N. Smidt, A. W. S. Rutjes, D. A. W. M. van der Windt, R. W. J. G. Ostelo, J. B. Reitsma, P. M. Bossuyt, et al. (2005) Quality of reporting of diagnostic accuracy studies. Radiology 235 (2), pp. 347–353.
V. Sounderajah, A. Guni, X. Liu, G. S. Collins, A. Karthikesalingam, S. R. Markar, R. M. Golub, A. K. Denniston, S. Shetty, D. Moher, P. M. Bossuyt, A. Darzi, H. Ashrafian, and S. S. Committee (2025) The STARD-AI reporting guideline for diagnostic accuracy studies using artificial intelligence. Nature Medicine 31 (10), pp. 3283–3289.
J. A. Svirsky, H. L. Burns, S. S. Carpenter, et al. (2002) Comparison of computer-assisted brush biopsy results with follow-up scalpel biopsy and histology. General Dentistry 50 (5), pp. 500–503.
U.S. Food and Drug Administration (2007) Statistical guidance on reporting results from studies evaluating diagnostic tests. Note: Guidance for Industry and FDA Staff, March 13, 2007.
C. M. Umemneku Chikere, K. Wilson, S. Graziadio, L. Vale, and A. J. Allen (2019) Diagnostic test evaluation methodology: a systematic review of methods employed to evaluate diagnostic tests in the absence of gold standard – an update. PLOS ONE 14 (10), pp. e0223832.
B. Vidakovic (2017) Engineering Biostatistics: An Introduction using MATLAB and WinBUGS. 1st edition, Wiley, Hoboken, NJ. ISBN 978-1119168966.
S. J. White, M. Chau, E. Arruzza, M. Ong, H. John, R. Theiss, K. L. Yaxley, and M. To (2025) Assessment of Standards for Reporting of Diagnostic Accuracy (STARD) 2015 guideline adherence in medical imaging diagnostic accuracy studies published in 2023. Journal of Clinical Epidemiology 179, pp. 111654.
N. L. Wilczynski, R. B. Haynes, and H. Team (2008) Quality of reporting of diagnostic accuracy studies: no change since STARD statement publication, before and after study. Radiology 248 (3), pp. 817–823.
A. Wismueller, A. M. McKinney, M. A. Riedl, E. J. Rummeny, and R. Wismueller (2020) A prospective randomized clinical trial for measuring radiology study reporting time on artificial intelligence-based detection of intracranial hemorrhage in emergent care head CT. Note: arXiv preprint arXiv:2002.12515. Accessed 2025-10-20.
