Bayesian Inference for Incomplete Diagnostic Tables
Abstract
Incomplete reporting of diagnostic accuracy data remains a persistent problem in medical research. In many studies, only part of the diagnostic table is reported, leaving denominators for diseased and non-diseased groups unknown and preventing direct calculation of sensitivity, specificity, predictive values, and related operating characteristics. To address this limitation, we develop hierarchical Bayesian models for reconstructing incomplete diagnostic tables from such partial information. Two motivating scenarios are considered: one in which only a single test-outcome row is observed, and another in which true positives, false positives, and the total sample size are reported but the remaining cells are missing. The proposed models are illustrated on a benchmark breast MRI study with complete counts, treated as partially observed in order to assess reconstruction performance under controlled missingness. The framework yields posterior inference for the missing cell counts and associated diagnostic measures, together with uncertainty quantification in weakly identified settings.
Introduction
Incomplete or partially reported diagnostic accuracy data remain a common obstacle to interpretation, reproducibility, and evidence synthesis. In many applied studies, authors report only fragments of the diagnostic table, for example a single cell count, one observed row, or a pair of summary measures, without providing the full denominators for diseased and non-diseased groups. As a result, readers may be unable to reconstruct the full table, verify internal consistency, or derive clinically relevant quantities such as predictive values, false discovery rates, or expected error counts (U.S. Food and Drug Administration, 2007; Macaskill et al., 2010).
This problem is not merely clerical. Complete cross-tabulation is central to transparent reporting of diagnostic accuracy studies, and the STARD initiative was developed precisely to improve such reporting. The STARD 2015 revision explicitly recommends reporting both the cross-tabulation of index test results against the reference standard and a participant flow diagram showing how the study denominators were obtained (Bossuyt et al., 2015; Cohen et al., 2016; EQUATOR Network, 2015). When these elements are omitted, clinically important operating characteristics may no longer be recoverable from the published record.
Empirical assessments suggest that such omissions remain common. Earlier audits showed that incomplete reporting often prevented reconstruction of the full table (Smidt et al., 2005; Wilczynski et al., 2008), and more recent work indicates that adherence to STARD recommendations remains uneven even in contemporary medical imaging diagnostic accuracy studies (White et al., 2025). For AI-centered diagnostic accuracy studies, the recent STARD-AI extension further underscores the need for clear reporting of dataset construction, model evaluation, and clinical applicability (Sounderajah et al., 2025). The practical consequence is that studies may be difficult to check, compare, or incorporate into evidence syntheses, even when the underlying clinical question is important.
A related methodological literature addresses situations in which the reference standard is applied only to a subset of participants, creating verification problems and the potential for work-up bias (de Groot et al., 2011; Buzoianu and Kadane, 2008; Umemneku Chikere et al., 2019). That literature is important for the present paper because one of our motivating examples arises from precisely such a setting. At the same time, incomplete reporting can also occur in studies where the reference standard is not obviously missing for design reasons, but the published article still omits sufficient cell counts to prevent recovery of the full diagnostic table. Our focus is on this reporting and reconstruction problem.
In practical terms, if only sensitivity and specificity are reported, a full table cannot usually be reconstructed unless additional information is available, such as the total sample size, the number of diseased subjects, or an externally justified prevalence estimate. Because many evidence-synthesis frameworks require study-level tables, missing denominators directly limit downstream meta-analytic use (Macaskill et al., 2010).
This paper develops statistical models for reconstructing incomplete diagnostic tables from partially reported data. Our motivation comes from real examples in the diagnostic accuracy literature where key cells are unobserved in the published report. We consider two representative scenarios. In the first, based on Svirsky et al. (2002), only the test-positive subgroup is reported in detail, leaving both cells of the test-negative row unobserved. This case is naturally connected to the literature on partial verification because the reference standard was applied only to a selected subgroup. In the second, based on Wismueller et al. (2020), counts for true positives and false positives are reported and the total sample size is known, but the negative-class counts are omitted. This second setting is better viewed as a constrained incomplete-table problem, since the known total sample size links the missing denominators.
Our aim is not to replace the broader verification-bias literature, nor to claim that incomplete diagnostic tables can always be uniquely recovered from sparse summaries alone. Rather, we develop Bayesian reconstruction strategies for settings in which deterministic recovery is impossible, but principled posterior inference on the missing denominators and derived operating characteristics remains feasible under clearly stated modeling assumptions.
The remainder of the paper is organized as follows. We first review notation for diagnostic tables and summarize the binomial-$n$ problem that underlies our reconstruction strategy. We then present the two motivating incomplete-table scenarios, develop the corresponding models, and illustrate their performance on a benchmark example with complete counts treated as partially observed.
Review and Notation for Diagnostic Accuracy Tables and the Binomial-$n$ Problem
A diagnostic table, also called a confusion table, is the standard framework for summarizing the performance of an index test against a reference standard. It records the counts of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN), and serves as the basis for the usual measures of diagnostic performance.
| | Disease Present | Disease Absent | Total |
|---|---|---|---|
| Test Positive | TP | FP | TP + FP |
| Test Negative | FN | TN | FN + TN |
| Total | TP + FN | FP + TN | $N$ |
From these counts one obtains the familiar diagnostic accuracy measures:
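$$\text{Sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \text{Specificity} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}, \tag{1}$$
$$\text{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \qquad \text{NPV} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FN}}, \tag{2}$$
$$\text{Accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN} + \mathrm{TN}}. \tag{3}$$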
We write $n_D = \mathrm{TP} + \mathrm{FN}$ for the total number of diseased individuals, $n_{ND} = \mathrm{FP} + \mathrm{TN}$ for the total number of non-diseased individuals, and $N = n_D + n_{ND}$ for the total sample size. When all four cell counts are available, all standard operating characteristics can be computed directly. When only partial information is reported, for example TP and FP without the corresponding denominators, the full table cannot be reconstructed from the published data alone. In that setting, sensitivity, specificity, and NPV are not identified without additional information or modeling assumptions.
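As a minimal illustration of how these measures follow from the four cell counts, the Python sketch below computes them for a complete table. The function name `diagnostic_measures` and the dictionary keys are illustrative choices and are not taken from the paper's code; the counts are those of the complete breast MRI table used later as a benchmark.

```python
def diagnostic_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures from a complete 2x2 diagnostic table."""
    n_diseased = tp + fn          # column total: reference standard positive
    n_non_diseased = fp + tn      # column total: reference standard negative
    n_total = n_diseased + n_non_diseased
    return {
        "n_diseased": n_diseased,
        "n_non_diseased": n_non_diseased,
        "n_total": n_total,
        "sensitivity": tp / n_diseased,
        "specificity": tn / n_non_diseased,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / n_total,
    }


# Complete breast MRI table: TP = 71, FP = 28, FN = 3, TN = 80.
mri = diagnostic_measures(tp=71, fp=28, fn=3, tn=80)
for name, value in mri.items():
    print(name, round(value, 3))
```

Only `ppv` depends solely on the test-positive row. Every other ratio needs FN or TN, which is why those measures are lost when the test-negative row is unreported.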
Bayesian Approaches to the Binomial-$n$ Problem
A central ingredient in our reconstruction problem is the classical binomial-$n$ problem: infer the number of trials $n$ in a $\text{Bin}(n, p)$ model when $n$ is unknown and $p$ may also be unknown. This problem is well known to be difficult, especially when only a single observation, or very limited data, are available. Classical estimators based on moments or maximum likelihood can be unstable, particularly when the sample variance is close to the sample mean, and the difficulty becomes more pronounced when $p$ is small or $n$ is large (Haldane, 1945; Olkin and Petkau, 1993; DasGupta and Rubin, 2005).
Bayesian methods provide a natural way to regularize this problem by introducing prior information on both $n$ and $p$. Early work considered simple prior structures for $n$ and $p$, leading to posterior inference for $n$ through either posterior modes or posterior means under the chosen loss function (Draper and Guttman, 1978; Carroll and Lombard, 1985). Rubin's empirical Bayes treatment of the problem was especially influential in motivating later hierarchical formulations (Rubin, 1978).
Subsequent developments introduced more flexible prior models. For example, Raftery (1988) considered a predictive framework with hierarchical priors on $n$ and $p$. Other extensions include beta priors for $p$ together with truncated Poisson priors for $n$ (Bayoud, 2011), as well as continuous gamma approximations for $n$ that simplify computation (Günel and Chilko, 2000). These approaches reduce the instability of purely classical procedures and allow substantive prior information about plausible ranges of $n$ and $p$ to enter the analysis.
A practically attractive alternative is the empirical Bayes or integrated-likelihood approach, which estimates $n$ jointly with beta hyperparameters for the prior on $p$ by maximizing the beta-binomial likelihood (Carroll and Lombard, 1985; DasGupta and Rubin, 2005). Such methods are often computationally convenient and can perform well in small-sample settings where direct likelihood-based inference is erratic.
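The integrated likelihood can be sketched in a few lines. The Python example below evaluates the beta-binomial probability of a single observed count with the success probability integrated out, and normalizes it over a grid of candidate trial counts. It is a simplified illustration rather than the cited estimators: the beta hyperparameters `a` and `b` are held fixed instead of being estimated, and the grid limit `n_max`, the function names, and the input values are all assumptions made for the example.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_beta_binomial(x: int, n: int, a: float, b: float) -> float:
    """log P(x | n) after integrating p ~ Beta(a, b) out of Bin(n, p)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return log_choose + log_beta(x + a, n - x + b) - log_beta(a, b)


def profile_over_n(x: int, a: float, b: float, n_max: int) -> dict:
    """Normalized integrated likelihood of n on the grid x, ..., n_max."""
    support = range(x, n_max + 1)
    log_w = [log_beta_binomial(x, n, a, b) for n in support]
    shift = max(log_w)                      # stabilize the exponentials
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return {n: wi / total for n, wi in zip(support, w)}


# Hypothetical inputs: one observed count x = 71, Beta(9, 1) on p, grid up to 182.
weights = profile_over_n(x=71, a=9.0, b=1.0, n_max=182)
best_n = max(weights, key=weights.get)
print(best_n, round(weights[best_n], 4))
```

Because the grid is weighted equally, the normalized values can also be read as a posterior for the trial count under a flat prior on the grid. An empirical Bayes fit would instead maximize `log_beta_binomial` over the trial count and both hyperparameters jointly.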
Because our incomplete-table problem involves missing denominators rather than merely missing cell probabilities, the binomial-$n$ literature provides a natural modeling foundation. In particular, the unknown diseased and non-diseased stratum totals, $n_D$ and $n_{ND}$, can be viewed as latent trial counts that must be inferred from partially observed binomial information under suitable prior structure.
A recent and comprehensive review of estimation in the binomial-$n$ problem, including classical, Bayesian, empirical Bayes, and computational aspects, is given by Georgieva and Vidakovic (2025).
Incomplete Diagnostic Tables
Published diagnostic studies do not always report the full set of cell counts needed to reconstruct the table. When only a subset of the cells is available, quantities such as sensitivity, specificity, negative predictive value, and overall accuracy may no longer be identified from the published data alone. In some cases, positive predictive value can still be computed from the reported true positives and false positives, but the absence of information on false negatives and true negatives prevents recovery of the complete diagnostic table.
The two examples below illustrate that incomplete tables may arise in more than one way. In the first case, the missingness is linked to study design: only test-positive subjects underwent verification with the reference standard, so the test-negative row is unobserved. This setting is naturally connected to the literature on partial verification and work-up bias (de Groot et al., 2011; Buzoianu and Kadane, 2008; Cronin and Vickers, 2008; Kohn, 2022; Umemneku Chikere et al., 2019). In the second case, the total sample size is known and the published report provides counts for true positives and false positives, but the negative-class counts are omitted. This is better viewed as an incomplete reporting problem with a known total, which leads to a different reconstruction strategy. Together, these two cases motivate the models developed in the next section.
Case 1: Partial Verification
Svirsky et al. (2002) compared computer-assisted oral brush biopsy results with follow-up scalpel biopsy and histology in order to estimate the positive predictive value of an abnormal brush-biopsy result. Among 243 patients with abnormal brush-biopsy findings who then underwent scalpel biopsy, 93 were confirmed as dysplasia or carcinoma by histology and 150 were histology negative. Thus, PPV can be calculated directly within the test-positive subgroup as $93/243 \approx 0.383$.
The difficulty is that only patients with abnormal brush-biopsy results underwent the reference standard. Patients with normal brush-biopsy results were not verified by histology, so the entire test-negative row is unobserved. As a consequence, the numbers of false negatives and true negatives are unknown, and the full table cannot be reconstructed from the published report. Sensitivity, specificity, negative predictive value, and overall accuracy therefore remain unidentified. In this sense, Case 1 is not merely an incomplete-table problem. It is also a partial-verification design, because the reference standard was applied only to a selected subgroup.
| | Histology Positive | Histology Negative | Total |
|---|---|---|---|
| Brush biopsy abnormal (test positive) | 93 (TP) | 150 (FP) | 243 |
| Brush biopsy normal (test negative) | ? (FN) | ? (TN) | ? |
| Total | ? | ? | ? |
Although the system has at times been described in later discussions as AI-assisted, the technology in Svirsky et al. (2002) is more accurately viewed as an early rule-based computer-assisted diagnostic tool rather than artificial intelligence in the contemporary sense.
Case 2: Incomplete Reporting with Known $N$
Wismueller et al. (2020) evaluated an AI-based system for detecting intracranial hemorrhage on emergent head CT scans. The paper reports that 105 of 122 AI-positive cases were true positives, so PPV can be calculated as $105/122 \approx 0.861$. The study also reports the total number of scans, $N$.
However, the total number of actual hemorrhage cases, $n_D$, is not reported, and neither are the counts of false negatives and true negatives. Consequently, sensitivity and specificity cannot be computed directly, and the complete table cannot be reconstructed from the published data alone.
| | ICH Present | ICH Absent | Total |
|---|---|---|---|
| AI positive (flagged as ICH) | 105 (TP) | 17 (FP) | 122 |
| AI negative (not flagged) | ? (FN) | ? (TN) | ? |
| Total | ? | ? | $N$ |
Case 2 differs from Case 1 in an important way. Here the main obstacle is not selective verification of the reference standard, but incomplete reporting despite a known total sample size. Because $N$ is available, the missing diseased and non-diseased totals are linked through the identity $n_D + n_{ND} = N$. This makes Case 2 a constrained reconstruction problem rather than a partial-verification design. From a modeling point of view, that structural constraint provides information that is absent in Case 1.
These two examples therefore represent distinct forms of incomplete diagnostic reporting. Case 1 combines incomplete reporting with selective verification, whereas Case 2 is an incomplete-table problem with known total sample size. The distinction is important because it determines how much structural information is available for reconstruction and, consequently, what type of model is appropriate.
Models
We consider two settings for reconstructing incomplete diagnostic tables. The first arises when only one row of the table is observed, typically the test-positive row, and the corresponding denominators are unreported. The second arises when TP, FP, and the total sample size $N$ are known, so that the missing cells are linked through the constraint $\mathrm{TP} + \mathrm{FP} + \mathrm{FN} + \mathrm{TN} = N$.
Independent Binomial-$n$ Reconstruction for a Single Observed Row
Suppose that only one test-outcome row is reported, with observed counts in the diseased and non-diseased columns. Let $x$ denote the observed count in one column of that row, and let $n$ denote the corresponding unreported denominator. The interpretation of the success probability $p$ depends on the column under analysis. If $x = \mathrm{TP}$ is the number of true positives among diseased subjects, then $p$ is sensitivity and $n = \mathrm{TP} + \mathrm{FN}$. If $x = \mathrm{FP}$ is the number of false positives among non-diseased subjects, then $p$ is the false positive rate and $n = \mathrm{FP} + \mathrm{TN}$.
We model the observed count as
$$x \mid n, p \sim \text{Bin}(n, p). \tag{4}$$
The success probability is assigned a beta prior
$$p \sim \text{Beta}(a, b), \tag{5}$$
with hyperparameters $a$ and $b$ chosen to reflect plausible values of sensitivity or false positive rate, depending on the column.
To regularize the unknown denominator, we assign a truncated negative-binomial prior in the WinBUGS parameterization,
| (6) |
where denotes the number of failures before successes with success probability . Before truncation,
We complete the hierarchy with
| (7) | |||||
| (8) | |||||
| (9) |
Posterior inference for $(n, p)$ follows from (4)-(9). If the observed count is $x = \mathrm{TP}$, then the missing cell is
$$\mathrm{FN} = n - \mathrm{TP}.$$
If the observed count is $x = \mathrm{FP}$, then the missing cell is
$$\mathrm{TN} = n - \mathrm{FP}.$$
When both columns of the reported row are available, we fit this model separately to the diseased and non-diseased strata. This yields posterior inference for the diseased total $n_D$ and the non-diseased total $n_{ND}$, and hence for the missing cells.
Identifiability and prior sensitivity.
With only a single binomial observation, $n$ and $p$ are only weakly identified from the likelihood. The beta prior on $p$ and the negative-binomial hierarchy on $n$ provide the regularization needed for posterior inference. In practice, sensitivity analyses over the prior hyperparameters are important and can be summarized through posterior intervals for $n$ and the derived missing counts.
The WinBUGS/OpenBUGS implementation of this single-column model is provided in the Supplemental File and can be used twice in Case 1-type applications, once for the diseased column and once for the non-diseased column. For the diseased stratum, the beta prior on $p$ may encode plausible sensitivity values; for the non-diseased stratum, it may encode plausible false positive rates.
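Reconstruction of the Full Table Given TP, FP, and $N$
We now consider the setting in which TP, FP, and the total sample size $N$ are reported. Let
$$n_D = \mathrm{TP} + \mathrm{FN}.$$
Once $n_D$ is inferred, the remaining quantities follow from
$$n_{ND} = N - n_D, \qquad \mathrm{FN} = n_D - \mathrm{TP}, \qquad \mathrm{TN} = n_{ND} - \mathrm{FP}.$$
The likelihood is
$$\mathrm{TP} \mid n_D, Se \sim \text{Bin}(n_D, Se), \tag{10}$$
$$\mathrm{FP} \mid n_D, Sp \sim \text{Bin}(N - n_D, 1 - Sp), \tag{11}$$
where $Se$ is sensitivity and $Sp$ is specificity. The three models below share this same likelihood and differ only in the prior assigned to $n_D$.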
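The bookkeeping implied by the known total can be made concrete. The Python sketch below fills in FN and TN for a given diseased total and enumerates every completion consistent with TP, FP, and the total sample size. The names `Table`, `complete_table`, and `feasible_tables` are hypothetical helpers introduced for this illustration; the counts are those of the breast MRI benchmark analyzed later in the paper.

```python
from typing import Iterator, NamedTuple


class Table(NamedTuple):
    tp: int
    fp: int
    fn: int
    tn: int


def complete_table(tp: int, fp: int, n_total: int, n_diseased: int) -> Table:
    """Fill in FN and TN once the diseased total is fixed."""
    n_non_diseased = n_total - n_diseased
    fn = n_diseased - tp
    tn = n_non_diseased - fp
    if fn < 0 or tn < 0:
        raise ValueError("n_diseased is incompatible with TP, FP and N")
    return Table(tp, fp, fn, tn)


def feasible_tables(tp: int, fp: int, n_total: int) -> Iterator[Table]:
    """All completions consistent with TP, FP and N (TP <= n_D <= N - FP)."""
    for n_diseased in range(tp, n_total - fp + 1):
        yield complete_table(tp, fp, n_total, n_diseased)


# Benchmark counts from the breast MRI example: TP = 71, FP = 28, N = 182.
candidates = list(feasible_tables(71, 28, 182))
print(len(candidates), candidates[0], candidates[-1])
```

Every candidate reproduces the reported counts exactly, so the data alone cannot rank them. The priors on the diseased total and on the two test probabilities are what distinguish one completion from another.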
Model 1: Discrete uniform prior.
A non-informative baseline model assigns equal prior mass to each feasible value of the diseased total $n_D$:
| (12) | |||||
| (13) |
with independent beta priors
| (14) | |||||
| (15) |
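Under a discrete uniform prior the posterior of the diseased total can be computed by direct enumeration, because both beta priors integrate out in closed form. The Python sketch below does this for the benchmark counts. It is an illustration under stated assumptions, not the WinBUGS model from the Supplemental File: the prior on specificity is expressed through the false positive rate (a Beta(c, d) prior on specificity corresponds to Beta(d, c) on the false positive rate), and the hyperparameter values passed in the example are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_beta_binomial(x, n, a, b):
    """log P(x | n) with the Beta(a, b) success probability integrated out."""
    log_choose = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return log_choose + log_beta(x + a, n - x + b) - log_beta(a, b)


def posterior_n_diseased(tp, fp, n_total, se_prior=(1.0, 1.0), fpr_prior=(1.0, 1.0)):
    """Posterior of n_D under a discrete uniform prior on TP, ..., N - FP."""
    support = range(tp, n_total - fp + 1)
    log_w = [
        log_beta_binomial(tp, n_d, *se_prior)
        + log_beta_binomial(fp, n_total - n_d, *fpr_prior)
        for n_d in support
    ]
    shift = max(log_w)                      # stabilize the exponentials
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return {n_d: wi / total for n_d, wi in zip(support, w)}


# Hypothetical hyperparameters: Beta(9, 1) on sensitivity, Beta(2, 6) on the FPR.
post = posterior_n_diseased(tp=71, fp=28, n_total=182,
                            se_prior=(9.0, 1.0), fpr_prior=(2.0, 6.0))
mean_n_d = sum(n_d * pr for n_d, pr in post.items())
print(round(mean_n_d, 1))
```

With flat Beta(1, 1) priors on both probabilities, each integrated binomial term reduces to one over the stratum size plus one, so the posterior weight is proportional to $1/[(n_D + 1)(N - n_D + 1)]$ and is largest at the edges of the feasible range. This is one way to see that informative priors on the test probabilities carry much of the information in this setting.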
Model 2: Truncated Poisson prior.
To favor moderate values of $n_D$, we replace the discrete uniform prior by a truncated Poisson prior:
| (16) | |||||
| (17) | |||||
| (18) |
again with independent beta priors on and .
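A truncated Poisson prior is simply a Poisson distribution renormalized to the feasible range. The Python sketch below computes such a prior for a fixed mean. Holding the mean fixed is a simplification made here for illustration (the model above may place a further prior on it), and the function name `truncated_poisson_pmf` and the value of `lam` are assumptions.

```python
import math


def truncated_poisson_pmf(lam: float, lower: int, upper: int) -> dict:
    """Poisson(lam) probabilities renormalized to lower, ..., upper."""
    support = range(lower, upper + 1)
    # Unnormalized log-probabilities k*log(lam) - lam - log(k!).
    log_w = [k * math.log(lam) - lam - math.lgamma(k + 1) for k in support]
    shift = max(log_w)                      # stabilize the exponentials
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return {k: wi / total for k, wi in zip(support, w)}


# Hypothetical mean lam = 74.5 on the feasible range for TP = 71, FP = 28, N = 182.
prior = truncated_poisson_pmf(lam=74.5, lower=71, upper=154)
print(max(prior, key=prior.get))
```

Such a prior can replace the equal weights of the discrete uniform model on the same support, which is the only change between the first two models.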
Model 3: Truncated negative-binomial prior.
To allow additional dispersion in the diseased stratum size, we use a truncated negative-binomial prior:
| (19) | |||||
| (20) | |||||
| (21) | |||||
| (22) |
together with independent beta priors on and .
Comparison of the three priors.
Model 1 provides a flat baseline over the feasible values of $n_D$. Model 2 introduces mild regularization through the Poisson mean. Model 3 allows heavier tails and greater dispersion through its negative-binomial prior, and is therefore more flexible when disease prevalence is uncertain or substantial imbalance between strata is plausible.
In all three cases, posterior inference for $n_D$ determines $n_{ND}$, FN, and TN, thereby yielding a reconstructed table and allowing calculation of sensitivity, specificity, predictive values, and accuracy.
WinBUGS code for all three variants is provided in the Supplemental File.
Empirical Application
To evaluate the proposed models, we applied them to a complete contingency table from a breast MRI study (Langlotz, 2003). The dataset consists of 182 women with clinically or mammographically suspicious lesions, all of whom underwent biopsy, taken here as the reference standard. A true positive (TP) denotes an MRI-positive case with malignancy confirmed on biopsy, a false positive (FP) an MRI-positive case with benign biopsy, a false negative (FN) an MRI-negative case with malignancy on biopsy, and a true negative (TN) an MRI-negative case with benign biopsy.
Table 4 gives the complete table. Because the full table is known, this example permits direct comparison between reconstructed and true counts.
| MRI Result | Malignant | Benign | Total |
|---|---|---|---|
| Positive | 71 | 28 | 99 |
| Negative | 3 | 80 | 83 |
| Total | 74 | 108 | 182 |
Single-Row Reconstruction
The next two subsections apply the single-row model separately to the diseased and non-diseased strata. We assume that only the first row of Table 4 is available, namely malignant and benign cases among MRI-positive patients. The stratum totals $n_D$ and $n_{ND}$ are then treated as unknown and estimated from the corresponding single-row models.
Diseased Stratum
For the diseased stratum, we observe $\mathrm{TP} = 71$ and model
$$\mathrm{TP} \mid n_D, p \sim \text{Bin}(n_D, p), \tag{23}$$
where $p$ is the sensitivity of MRI. The denominator $n_D$ is assigned the truncated negative-binomial prior
| (24) |
with hierarchical priors
| (25) | |||||
| (26) | |||||
| (27) |
For this example we use
| (28) |
The prior on reflects the expectation of moderate to high sensitivity without being overly restrictive. The prior on is diffuse, and the uniform prior on allows the negative-binomial prior on to adapt to the data.
MCMC sampling is initialized at , , , and .
| Parameter | Mean | SD | 2.5% | Median | 97.5% |
|---|---|---|---|---|---|
| | 16.25 | 11.59 | 1.49 | 13.74 | 45.32 |
| $n_D$ (diseased total) | 76.57 | 8.73 | 71.0 | 74.0 | 99.0 |
| $p$ (sensitivity) | 0.926 | 0.080 | 0.705 | 0.951 | 0.998 |
| | 0.178 | 0.104 | 0.019 | 0.165 | 0.410 |
| | 16.88 | 12.05 | 1.0 | 14.0 | 47.0 |
The posterior mean of $n_D$ is 76.6, close to the true value of 74, and the 95% credible interval contains the truth. The posterior mean of $p$ is 0.93, consistent with high MRI sensitivity. The implied estimate $\widehat{\mathrm{FN}} = 76.6 - 71 \approx 6$ is reasonably close to the true value of 3. Together with the observed $\mathrm{TP} = 71$, this yields a plausible near-complete reconstruction of the diagnostic table.
Non-Diseased Stratum
For the benign stratum, we observe $\mathrm{FP} = 28$ and estimate the unreported denominator $n_{ND}$ together with the false positive rate $p$. The model is
| (29) | |||||
| (30) | |||||
| (31) | |||||
| (32) |
Here we set
| (33) |
The prior reflects the expectation that the false positive rate is below while remaining flexible. The small prior mean of places substantial mass on larger values of , consistent with the expectation that the non-diseased stratum may be larger than the diseased stratum.
Sampling is initialized at , , , and .
| Parameter | Mean | SD | 2.5% | Median | 97.5% |
|---|---|---|---|---|---|
| | 1.712 | 1.205 | 0.200 | 1.439 | 4.723 |
| $n_{ND}$ (non-diseased total) | 106.0 | 75.3 | 40.0 | 86.0 | 295.0 |
| $p$ (false positive rate) | 0.336 | 0.149 | 0.094 | 0.319 | 0.664 |
| | 0.016 | 0.015 | 0.0004 | 0.011 | 0.054 |
| | 1.428 | 1.556 | 0.0 | 1.0 | 5.0 |
The posterior mean of $n_{ND}$ is 106, close to the true total of 108, although the credible interval is wide. This reflects the limited information contained in a single observed cell together with the intentionally overdispersed prior. The posterior mean of $p$ is 0.34, consistent with a moderate false positive rate.
The resulting estimate implies $\widehat{\mathrm{TN}} = 106 - 28 = 78$, close to the true value of 80. Combining this with the estimated diseased total produces the reconstructed table in Table 7.
| | Malignant | Benign |
|---|---|---|
| MRI positive | 71 | 28 |
| MRI negative | 6 | 78 |
Even with only one observed row, the hierarchical Bayesian model recovers plausible denominators and yields a reasonable approximation to the full diagnostic structure, albeit with substantial uncertainty in the non-diseased stratum.
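To see what the reconstruction implies for the operating characteristics, the Python sketch below compares the measures computed from the true table with those computed from the reconstructed table above. The helper `accuracy_summary` is a hypothetical name introduced for this comparison, and only point values are compared; the posterior uncertainty discussed in the text is not propagated.

```python
def accuracy_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Measures that depend on the test-negative row."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
    }


# True benchmark table versus the single-row reconstruction (FN = 6, TN = 78).
true_table = accuracy_summary(tp=71, fp=28, fn=3, tn=80)
reconstructed = accuracy_summary(tp=71, fp=28, fn=6, tn=78)

for key in true_table:
    diff = reconstructed[key] - true_table[key]
    print(f"{key}: true {true_table[key]:.3f}, "
          f"reconstructed {reconstructed[key]:.3f}, difference {diff:+.3f}")
```

A fuller analysis would evaluate the same function on each posterior draw of the missing cells, giving credible intervals for the derived measures rather than single numbers.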
Single-Stratum Models with Known $N$
We next examine the same strata when the total sample size $N$ is treated as known. This adds the design constraint that each stratum size must lie below $N$, which changes the posterior behavior, especially for the non-diseased group.
Diseased Stratum with Known $N$
For the diseased stratum we use
| (34) | |||||
| (35) | |||||
| (36) | |||||
| (37) |
The truncation enforces compatibility with both the observed true positives and the known study size.
We again set
| (38) |
These choices favor moderate to high sensitivity while keeping the prior on fairly diffuse. MCMC sampling is initialized at , , , , and .
| Parameter | Mean | SD | 2.5% | Median | 97.5% |
|---|---|---|---|---|---|
| | 16.68 | 12.02 | 1.963 | 13.89 | 46.98 |
| $n_D$ (diseased total) | 76.34 | 7.792 | 71.0 | 74.0 | 97.0 |
| $p$ (sensitivity) | 0.927 | 0.076 | 0.716 | 0.952 | 0.998 |
| | 0.182 | 0.106 | 0.026 | 0.167 | 0.420 |
| | 17.35 | 12.50 | 2.082 | 14.45 | 48.84 |
The posterior mean of $n_D$ is 76.3, again close to the true diseased count of 74, which lies within the 95% credible interval. This estimate is nearly identical to that obtained under the unconstrained single-row model, suggesting that inference on $n_D$ is driven mainly by the observed true positives rather than by the upper-bound constraint. The posterior for $p$ remains concentrated near high sensitivity.
Non-Diseased Stratum with Known $N$
For the non-diseased stratum, the observed count is $\mathrm{FP} = 28$ and the inferential target is the number of benign cases $n_{ND}$ together with the false positive rate $p$. The model is
| (39) | |||||
| (40) | |||||
| (41) | |||||
| (42) |
We use
| (43) |
The prior on again reflects the expectation of a modest false positive rate, while the prior on favors larger values of without allowing arbitrarily large realizations once the upper bound is imposed.
Sampling is initialized at , , , , and .
| Parameter | Mean | SD | 2.5% | Median | 97.5% |
|---|---|---|---|---|---|
| | 2.092 | 1.249 | 0.387 | 1.854 | 5.147 |
| $n_{ND}$ (non-diseased total) | 80.36 | 32.39 | 38.0 | 73.0 | 162.0 |
| $p$ (false positive rate) | 0.388 | 0.142 | 0.165 | 0.373 | 0.697 |
| | 0.025 | 0.018 | 0.002 | 0.021 | 0.072 |
| | 2.186 | 1.438 | 0.304 | 1.888 | 5.749 |
Imposing $N$ as an upper bound substantially concentrates the posterior for $n_{ND}$ relative to the unconstrained single-row model. In the unconstrained analysis, the heavy right tail pushed the posterior mean toward the true value of 108, albeit with considerable uncertainty. Under the bounded model, those extreme values are removed, producing a posterior mean of 80.4 and a median of 73. Thus, the known-$N$ constraint improves stability and interpretability, but in this example it reduces point-estimation accuracy for the non-diseased stratum.
Joint Model with Fixed $N$
We finally consider a joint model in which TP and FP are analyzed simultaneously under the fixed total $N$. The inferential targets are $n_D$, $n_{ND}$, the sensitivity, and the false positive rate.
The model is
| (44) | |||||
| (45) | |||||
| (46) | |||||
| (47) | |||||
| (48) | |||||
| (49) |
The fixed-$N$ constraint couples the two strata and enforces coherence across the reconstructed table.
We set
| (50) |
with . The prior on places substantial mass near one, the prior on favors smaller false positive rates, and the prior on controls dispersion in the negative-binomial prior for . MCMC is initialized at , , and .
| Parameter | Mean | SD | 2.5% | Median | 97.5% |
|---|---|---|---|---|---|
| $n_D$ (diseased total) | 73.04 | 8.205 | 71.0 | 71.0 | 93.0 |
| $n_{ND}$ (non-diseased total) | 109.0 | 8.205 | 89.0 | 111.0 | 111.0 |
| Sensitivity | 0.978 | 0.067 | 0.768 | 1.000 | 1.000 |
| False positive rate | 0.089 | 0.197 | | | 0.782 |
| | 0.144 | 0.187 | 0.014 | 0.059 | 0.673 |
| | 19.64 | 38.84 | 0.039 | 4.435 | 143.1 |
Under this joint specification, posterior inference for the stratum totals is both concentrated and internally consistent: the posterior means are 73.0 for $n_D$ and 109.0 for $n_{ND}$, differing from the true values by at most one individual. The posterior for the sensitivity concentrates near one, while the posterior for the false positive rate is centered near zero but retains a right tail, reflecting residual uncertainty in the non-diseased stratum.
Compared with the preceding analyses, the joint model combines the information in TP and FP under the fixed-$N$ constraint and therefore avoids the instability of fitting the two strata separately. In this example, it provides the most balanced reconstruction, with realistic stratum sizes and appropriately quantified uncertainty.
Conclusions
We have developed hierarchical Bayesian models for reconstructing incomplete diagnostic tables in settings where only partial cell counts are reported. The proposed framework covers both the single-row setting, in which the denominators of the diseased and non-diseased strata are unobserved, and the constrained setting in which the total sample size is known. By combining binomial likelihoods with flexible priors on the latent denominators and diagnostic probabilities, the models provide a coherent way to infer missing cells and derived accuracy measures from incomplete published information.
The empirical application illustrates both the potential and the limitations of this approach. When only the test-positive row is observed, posterior inference can recover plausible values for the missing denominators and yield a reasonable reconstruction of the full diagnostic table, although uncertainty may remain substantial, especially for the non-diseased stratum. When the total sample size is known, the additional structural constraint can sharpen inference, and in the joint fixed-$N$ formulation the reconstructed stratum sizes are close to the true values in the benchmark example. At the same time, the results also show that reconstruction accuracy depends on the amount of information available and on the prior specification, particularly in weakly identified single-row settings.
The main contribution of the paper is therefore methodological and practical rather than purely deterministic. The proposed models do not claim to identify a unique missing table from sparse summaries alone. Rather, they provide a principled Bayesian framework for posterior inference on plausible completions of incompletely reported diagnostic tables, together with uncertainty quantification for the reconstructed cells and the resulting operating characteristics.
A further point concerns likelihood specification. One might consider a multinomial likelihood for the full table as an alternative starting point. Within the settings studied here, however, this does not appear to yield a substantive advantage over the binomial formulations already used. Once the unobserved cells are treated as latent and the structural constraints are imposed, the multinomial representation does not contribute additional identifiability beyond that already supplied by the observed counts, the prior structure, and, when available, the known total sample size.
More broadly, the work highlights the continuing importance of complete and transparent reporting in diagnostic accuracy studies. When denominators or entire rows of the diagnostic table are omitted, clinically relevant measures may become unavailable without additional modeling assumptions. Bayesian reconstruction cannot replace good reporting practice, but it can provide a useful inferential tool when incomplete reporting prevents direct recovery of the full table.
Data and code availability.
Reproducible code supporting the analyses in this manuscript, including the WinBUGS and R scripts used in the empirical analyses, is available at https://github.com/saraantonijevic/bayesian_diagnostic_table-reconstruction. A standalone appendix containing the full Bayesian analyses of the Svirsky (2002) and Wismueller (2020) motivating examples, including posterior summaries and reconstructed tables, is also provided at the same repository.
Acknowledgment. B. Vidakovic and S. Antonijavic acknowledge support from the National Science Foundation under Grant No. 2515246 at Texas A&M University.
References
- Bayesian and empirical Bayes estimation of binomial under truncated Poisson priors. Journal of Statistical Computation and Simulation 81, pp. 121–135.
- STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ 351, pp. h5527.
- Adjusting for verification bias in diagnostic test evaluation: a Bayesian approach. Statistics in Medicine 27 (13), pp. 2453–2473.
- A Bayesian approach to the binomial problem using the integrated likelihood. Biometrika 72, pp. 583–590.
- STARD 2015 guidelines for reporting diagnostic accuracy studies: Explanation and elaboration. BMJ Open 6 (11), pp. e012799.
- Statistical methods to correct for verification bias in diagnostic studies. Statistics in Medicine 27 (23), pp. 4670–4685.
- Improved moment and maximum likelihood estimators for the binomial problem. Statistica Sinica 15, pp. 709–722.
- Verification problems in diagnostic accuracy studies: consequences and solutions. BMJ 343, pp. d4770.
- Bayesian estimation of binomial with beta prior for . Technometrics 20, pp. 217–222.
- STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. Note: Checklist and resources available at https://www.equator-network.org/reporting-guidelines/stard/. Accessed 2025-10-21.
- Diagnostic accuracy measures. Cerebrovascular Diseases 36 (4), pp. 267–272.
- Revisiting estimation of number of trials in the Binomial problem with a single observation. International Statistical Review 93 (2), pp. 246–266.
- Continuous approximations for Bayesian estimation of binomial . Computational Statistics 15 (3), pp. 345–361.
- On a method of estimating and in the binomial distribution. Biometrika 33, pp. 264–274.
- Partial verification bias and test result-based sampling. Journal of Clinical Epidemiology 145, pp. 179–182.
- Fundamental measures of diagnostic examination performance: usefulness for clinical decision making and research. Radiology 228, pp. 3–9.
- Cochrane handbook for systematic reviews of diagnostic test accuracy, version 1.0, chapter 10. The Cochrane Collaboration.
- Stabilized estimators for the binomial problem. Journal of Statistical Planning and Inference 37, pp. 89–105.
- A Bayesian approach to the binomial problem. Journal of the American Statistical Association 83, pp. 703–709.
- Empirical Bayes estimation of the Binomial problem. Journal of the American Statistical Association 73 (363), pp. 173–178.
- Quality of reporting of diagnostic accuracy studies. Radiology 235 (2), pp. 347–353.
- The STARD-AI reporting guideline for diagnostic accuracy studies using artificial intelligence. Nature Medicine 31 (10), pp. 3283–3289.
- Comparison of computer-assisted brush biopsy results with follow-up scalpel biopsy and histology. General Dentistry 50 (5), pp. 500–503.
- Statistical guidance on reporting results from studies evaluating diagnostic tests. Note: Guidance for Industry and FDA Staff, March 13, 2007.
- Diagnostic test evaluation methodology: a systematic review of methods employed to evaluate diagnostic tests in the absence of gold standard – an update. PLOS ONE 14 (10), pp. e0223832.
- Engineering Biostatistics: An Introduction using MATLAB and WinBUGS. 1st edition, Wiley, Hoboken, NJ. ISBN 978-1119168966.
- Assessment of Standards for Reporting of Diagnostic Accuracy (STARD) 2015 guideline adherence in medical imaging diagnostic accuracy studies published in 2023. Journal of Clinical Epidemiology 179, pp. 111654.
- Quality of reporting of diagnostic accuracy studies: no change since STARD statement publication, before and after study. Radiology 248 (3), pp. 817–823.
- A prospective randomized clinical trial for measuring radiology study reporting time on artificial intelligence-based detection of intracranial hemorrhage in emergent care head CT. Note: arXiv preprint arXiv:2002.12515. Accessed 2025-10-20.