Last updated: 2026-01-06
Checks: 7 passed, 0 failed
Knit directory: fiveMinuteStats/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(12345) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version c123ede. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
working directory clean
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/LR_and_BF.Rmd) and HTML (docs/LR_and_BF.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | f2849bd | Peter Carbonetto | 2026-01-06 | A few more updates to LR_and_BF. |
| Rmd | c33776f | Peter Carbonetto | 2026-01-06 | A few small updates to the LR_and_BF vignette. |
| html | 74deb19 | Peter Carbonetto | 2026-01-06 | Various misc updates. |
| Rmd | a361e11 | Peter Carbonetto | 2026-01-06 | A bunch of small edits to the bayes_multiclass vignette. |
| Rmd | 7a8ffe4 | Peter Carbonetto | 2025-12-29 | Created pdf version of LR_and_BF vignette. |
| html | 7a8ffe4 | Peter Carbonetto | 2025-12-29 | Created pdf version of LR_and_BF vignette. |
| html | 5f62ee6 | Matthew Stephens | 2019-03-31 | Build site. |
| Rmd | 0cd28bd | Matthew Stephens | 2019-03-31 | workflowr::wflow_publish(all = TRUE) |
| html | d251967 | stephens999 | 2018-04-23 | Build site. |
| Rmd | 1b4481e | Yue-Jiang | 2017-08-16 | minor, fix typo |
| html | 34bcc51 | John Blischak | 2017-03-06 | Build site. |
| Rmd | 5fbc8b5 | John Blischak | 2017-03-06 | Update workflowr project with wflow_update (version 0.4.0). |
| Rmd | 391ba3c | John Blischak | 2017-03-06 | Remove front and end matter of non-standard templates. |
| html | 8e61683 | Marcus Davy | 2017-03-03 | rendered html using wflow_build(all=TRUE) |
| html | c3b365a | John Blischak | 2017-01-02 | Build site. |
| Rmd | 67a8575 | John Blischak | 2017-01-02 | Use external chunk to set knitr chunk options. |
| Rmd | 5ec12c7 | John Blischak | 2017-01-02 | Use session-info chunk. |
| Rmd | bd6aa11 | Nan Xiao | 2016-01-31 | formula typo fix |
| Rmd | 93e619c | Nan Xiao | 2016-01-27 | LR_and_BF URL Fix |
| Rmd | 9ef3240 | stephens999 | 2016-01-19 | small notation change |
| Rmd | e5abac8 | stephens999 | 2016-01-19 | add exercise |
| Rmd | bf76cbd | stephens999 | 2016-01-12 | add LR and BF |
See here for a PDF version of this vignette.
This document introduces the idea that drawing conclusions from likelihood ratios (or Bayes Factors) from fully specified models is context dependent. In particular, it involves weighing the information in the data against the relative (prior) plausibility of the models.
You should understand how to use likelihood ratios to compare the support for two fully specified models, for both discrete and continuous data.
Recall that a fully specified model is one with no free parameters. So a fully specified model for \(X\) is simply a probability distribution \(p(x \mid M)\). And, given observed data \(X = x\), the Likelihood Ratio for comparing two fully specified models, \(M_1\) vs. \(M_0\), is defined as the ratio of the probabilities of the observed data under each model: \[ \mathrm{LR}(M_1,M_0) := \frac{p(x \mid M_1)}{p(x \mid M_0)}. \] For fully specified models, the likelihood ratio is also known as the “Bayes Factor” (BF), so we could also define the Bayes Factor for \(M_1\) vs. \(M_0\) as \[ \mathrm{BF}(M_1,M_0) := \frac{p(x \mid M_1)}{p(x \mid M_0)}. \] When comparing fully specified models, the LR and BF are just two different names for the same thing.
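As a toy numerical illustration (the two Poisson models below are our own invention, not from this vignette), the LR for fully specified models is just a ratio of two density or mass function evaluations:

```r
# Two hypothetical fully specified models (no free parameters):
#   M0: X ~ Poisson(2),   M1: X ~ Poisson(5)
x  <- 4                                            # observed data
LR <- dpois(x, lambda = 5) / dpois(x, lambda = 2)  # p(x | M1) / p(x | M0)
LR                                                 # about 1.94
```

Since both models are fully specified, this same number is also the Bayes Factor for \(M_1\) vs. \(M_0\).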
In the example here, we considered the problem of determining whether an elephant tusk came from a savanna elephant or a forest elephant based on examining DNA data. Specifically, we computed the LR (or BF) comparing two models, \(M_S\) and \(M_F\), where \(M_S\) denotes the model that the tusk came from a savanna elephant, and \(M_F\) denotes the model that the tusk came from a forest elephant. In our example, we found LR = 1.8, so the data favored \(M_S\) by a factor of 1.8. We commented that a factor of 1.8 is relatively modest, and not sufficient to convince us that the tusk definitely came from a savanna elephant.
In the example here, we considered the problem of testing a patient for a disease based on measuring the concentration of a particular diagnostic protein in the blood. Specifically, we computed the LR (or BF) comparing two models, \(M_n\) and \(M_d\), where \(M_n\) denotes the model in which the patient is normal and \(M_d\) denotes the model that the patient has the disease. In our example, we found that the LR for \(M_n\) vs. \(M_d\) was approximately 0.07, so the data favored \(M_d\) by a factor of 14. This is a much larger LR than in the first case, but is it convincing? Can we confidently conclude that the patient has the disease?
More generally, we would like to address the question: what value of the LR should we treat as “convincing” evidence for one model vs. another?
It is crucial to recognize that the answer to this question has to be context dependent. In particular, the extent to which we should be “convinced” by a particular LR value has to depend on the relative plausibility of the models we are comparing. For example, in statistics there are many situations where we want to compare models that are not equally plausible. Suppose that model \(M_1\) is much less plausible than \(M_0\). Then we must surely demand stronger evidence from the data (larger LR) to be “convinced” that it arose from model \(M_1\) rather than \(M_0\), compared to contexts where \(M_1\) and \(M_0\) are equally plausible.
In the disease screening example, the interpretation depends on the frequency of the disease in the population being screened. For example, suppose that only 5% of people screened actually have the disease. Then to outweigh that fact, we would have to demand a high LR to “convince” us that a particular person has the disease. In contrast, if 50% of people screened have the disease, then we might be convinced by a much smaller LR.
The following calculation formalizes this intuition. Suppose we are presented with a series of observations \(x_1,\dots,x_n\), some of which are generated from model \(M_0\) and the others of which are generated from model \(M_1\). Let \(Z_i \in \{0,1\}\) indicate whether \(x_i\) was generated from model \(M_0\) (\(Z_i = 0\)) or \(M_1\) (\(Z_i = 1\)), and let \(\pi_0, \pi_1\) denote the proportions of the observations that came from models \(M_0, M_1\); that is, \(\pi_j = \Pr(Z_i = j)\). So in the disease screening example, \(\pi_1\) would be the proportion of screened individuals who have the disease.
Bayes Theorem says that \[ \Pr(Z_i = 1 \mid x_i) = \frac{\Pr(x_i \mid Z_i = 1) \Pr(Z_i=1)}{\Pr(x_i)}. \] Also, \[ \Pr(x_i) = \Pr(x_i \mid Z_i=0)\Pr(Z_i=0) + \Pr(x_i \mid Z_i=1)\Pr(Z_i=1). \] Putting these together, then substituting \(\pi_j\) for \(\Pr(Z_i=j)\), and dividing the numerator and denominator by \(\Pr(x_i \mid Z_i = 0)\) yields \[ \Pr(Z_i = 1 \mid x_i) = \frac{\pi_1 \mathrm{LR}_i}{\pi_0 + \pi_1 \mathrm{LR}_i}, \] where \(\mathrm{LR}_i\) denotes the likelihood ratio for \(M_1\) vs. \(M_0\) at observation \(x_i\).
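This posterior probability is easy to compute directly; a small R helper (the function name is ours) makes the dependence on the prior proportion \(\pi_1\) explicit, and reproduces the two worked examples that follow:

```r
# Posterior probability that Z = 1, given prior proportion pi1 and
# likelihood ratio LR = p(x | M1) / p(x | M0).
posterior_prob <- function(pi1, LR) {
  pi1 * LR / ((1 - pi1) + pi1 * LR)
}
posterior_prob(0.05, 14)  # disease prevalence 5%:  ~0.424
posterior_prob(0.50, 14)  # disease prevalence 50%: ~0.933
```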
Suppose that only 5% of screened people have the disease. Then an LR of 14 in favor of disease yields \[ \Pr(Z_i=1 \mid x_i) = 0.05 \times 14/(0.95 + 0.05 \times 14), \] which equals approximately 0.424.
In contrast, if 50% of screened people have the disease, then an LR of 14 yields \[ \Pr(Z_i = 1 \mid x_i) = 0.5 \times 14/(0.5 + 0.5 \times 14), \] which equals approximately 0.933.
Thus in the first case, of patients with \(LR = 14\), less than half would actually have the disease. In the second case, of patients with \(LR = 14\), more than 90% would have the disease!
There is another way of laying out this kind of calculation which may be slightly easier to interpret and remember, and has the advantage of holding even when more than two models are under consideration. From Bayes theorem we have \[ \Pr(Z_i = 1 \mid x_i) = \frac{\Pr(x_i \mid Z_i = 1) \Pr(Z_i=1)}{\Pr(x_i)} \] and \[ \Pr(Z_i = 0 \mid x_i) = \frac{\Pr(x_i \mid Z_i = 0) \Pr(Z_i=0)}{\Pr(x_i)}. \] Taking the ratio of these gives \[ \frac{\Pr(Z_i = 1 \mid x_i)}{\Pr(Z_i = 0 \mid x_i)} = \frac{\Pr(Z_i=1)}{\Pr(Z_i=0)} \times \frac{\Pr(x_i \mid Z_i = 1)}{\Pr(x_i \mid Z_i = 0)}. \] This formula can be conveniently stated in words, using the notion of “odds”, as follows: \[ \text{Posterior Odds} = \text{Prior Odds} \times \text{LR}. \] Or, recalling that the LR is sometimes referred to as the Bayes Factor (BF), we have \[ \text{Posterior Odds} = \text{Prior Odds} \times \text{BF}. \] Note that the “Odds” of an event \(E_1\) vs. an event \(E_2\) means the ratio of their probabilities. So \(\Pr(Z_i=1)/\Pr(Z_i=0)\) is the “Odds” of \(Z_i=1\) vs. \(Z_i=0\). It is referred to as the “Prior Odds”, because it is the odds prior to seeing the data \(x\). Similarly, the “Posterior Odds” refers to the Odds of \(Z_i=1\) vs. \(Z_i=0\) “posterior to” (after) seeing the data \(x\).
Suppose that only 5% of screened people have the disease. Then the prior odds for disease are 1/19, and an LR of 14 in favor of disease yields posterior odds of 14/19 (or “14 to 19”).
Suppose instead that 50% of screened people have the disease. Then the prior odds for disease are 1, and an LR of 14 in favor of disease yields posterior odds of 14 (or “14 to 1”).
In cases where there are only two possibilities, as in this example, the posterior odds determine the posterior probabilities: if the posterior odds of \(Z_i = 1\) vs. \(Z_i = 0\) are \(O\), then \(\Pr(Z_i = 1 \mid x_i) = O/(1+O)\).
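Converting odds back to probabilities is easy to check numerically (the helper name is ours): applied to the posterior odds from the two screening examples above, it recovers the posterior probabilities computed earlier.

```r
# Convert odds O = Pr(E1)/Pr(E2) to the probability of E1
# (valid when E1 and E2 are the only two possibilities).
odds_to_prob <- function(odds) odds / (1 + odds)
odds_to_prob(14/19)  # 5% prevalence:  ~0.424
odds_to_prob(14)     # 50% prevalence: ~0.933
```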
1a. Write a function to simulate data for the medical screening example above. The function should take as input the proportion of individuals who have the disease, and the number of individuals to simulate. It should output a table, one row for each individual, with two columns: the first column (\(x\)) is the protein concentration for that individual, the second column (\(z\)) is an indicator of disease status (1 = disease, 0 = normal).
1b. Write a function to compute the likelihood ratio in the medical screening example.
1c. Use the above functions to answer the following question by simulation. Suppose we screen a population of individuals, 20% of whom are diseased, and compute the LR for each individual screened. Among individuals with an LR “near” \(c\), what proportion are truly diseased? Denoting this proportion by \(q_D(c)\), make a plot of \(\log_{10} c\) [X axis] vs. \(q_D(c)\) [Y axis], with \(c\) varying from \(1/10\) to \(10\), say (i.e., \(\log_{10} c\) varies from \(-1\) to \(1\)), or a wider range if you like. (The wider the range, the larger the simulation study you will need to get reliable results.)
1d. Use the computations introduced in this vignette to compute the theoretical value for \(q_D(c)\), and plot these on the same graph as your simulation results to demonstrate that your simulations match the theory. [It should provide a good agreement, provided your simulation is large enough.]
1e. Repeat c and d, but in the case where only 2% of the screened individuals are diseased.
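One possible starting point for Exercises 1a–1c is sketched below. The two Gamma densities are placeholders of our own choosing, not the distributions used in the original medical screening example; substitute the actual protein-concentration densities for \(M_n\) and \(M_d\) from that vignette.

```r
# Sketch for Exercise 1a: simulate a screened population.
# NOTE: the two Gamma densities below are assumed for illustration only;
# replace them with the distributions from the original screening example.
simulate_screen <- function(p_disease, n) {
  z <- rbinom(n, size = 1, prob = p_disease)    # 1 = disease, 0 = normal
  x <- ifelse(z == 1,
              rgamma(n, shape = 2, scale = 2),  # diseased (assumed)
              rgamma(n, shape = 2, scale = 1))  # normal (assumed)
  data.frame(x = x, z = z)
}

# Sketch for Exercise 1b: LR for M_d (disease) vs. M_n (normal),
# under the same assumed densities.
compute_lr <- function(x) {
  dgamma(x, shape = 2, scale = 2) / dgamma(x, shape = 2, scale = 1)
}

# Toward Exercise 1c: bin individuals by log10(LR) and estimate the
# proportion diseased within each bin.
set.seed(1)
dat  <- simulate_screen(0.2, 1e5)
lr   <- compute_lr(dat$x)
bins <- cut(log10(lr), breaks = seq(-1, 1, length.out = 21))
qD   <- tapply(dat$z, bins, mean)
```

For Exercise 1d, the theoretical curve follows directly from the formula derived above: \(q_D(c) = \pi_1 c / (\pi_0 + \pi_1 c)\), with \(\pi_1 = 0.2\) and \(\pi_0 = 0.8\) here.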
sessionInfo()
# R version 4.3.3 (2024-02-29)
# Platform: aarch64-apple-darwin20 (64-bit)
# Running under: macOS 15.7.1
#
# Matrix products: default
# BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# time zone: America/Chicago
# tzcode source: internal
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# loaded via a namespace (and not attached):
# [1] vctrs_0.6.5 cli_3.6.5 knitr_1.50 rlang_1.1.6
# [5] xfun_0.52 stringi_1.8.7 promises_1.3.3 jsonlite_2.0.0
# [9] workflowr_1.7.1 glue_1.8.0 rprojroot_2.0.4 git2r_0.33.0
# [13] htmltools_0.5.8.1 httpuv_1.6.14 sass_0.4.10 rmarkdown_2.29
# [17] evaluate_1.0.4 jquerylib_0.1.4 tibble_3.3.0 fastmap_1.2.0
# [21] yaml_2.3.10 lifecycle_1.0.4 whisker_0.4.1 stringr_1.5.1
# [25] compiler_4.3.3 fs_1.6.6 Rcpp_1.1.0 pkgconfig_2.0.3
# [29] later_1.4.2 digest_0.6.37 R6_2.6.1 pillar_1.11.0
# [33] magrittr_2.0.3 bslib_0.9.0 tools_4.3.3 cachem_1.1.0