- [[recall]], [[false positives]], [[F1 score]], [[signal detection theory]]
# Idea
Precision is a term that comes from information retrieval theory and document retrieval systems. It measures how reliable the system's positive identifications are. It asks: "*Of all the instances we identified as positive, how many are actually positive?*" (see [[confusion matrix]]).
**What proportion of positive predictions were actually positive?**
Goal: **All positive predictions should actually be positive.** We don't care about getting every positive sample classified correctly. **If I say something is positive, it must be positive. I can't make mistakes.** It is okay if we don't identify all positive instances.
**A positive prediction must be a positive instance.**
![[20240909101407.png]]
Precision is defined as the following (contrast with [[recall]] and [[accuracy]]):
$\frac{\text{true positives}}{\text{all positives}}$
$\frac{\text{true positives}}{\text{true positives} + \text{false positives}}$
$\frac{\text{true positives}}{\text{total predicted positives}}$
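A minimal sketch of the formula in Python; the toy label vectors and the cross-check against scikit-learn's `precision_score` are my own additions for illustration:

```python
# Precision from raw counts, cross-checked against scikit-learn.
# The label vectors are made up purely for illustration (1 = positive).
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual labels
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]  # model predictions

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives

precision_manual = tp / (tp + fp)  # true positives / all predicted positives
print(precision_manual, precision_score(y_true, y_pred))  # 0.75 0.75 (3 TP out of 4 predicted positives)
```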
See [[confusion matrix]] for another way to visualize precision.
![[s20230523_125150 1.png|400]]
Mnemonic: **P** stands for "pie" (circle); **rec** stands for rectangle. The denominator in precision is a circle (pie); the denominator in recall is a rectangle.
It's the proportion of positives identified by a model that are actually correctly identified. It's closely related to [[signal detection theory]].
It tells us **how valid the results are**.
It tells you how accurate the model's **positive predictions** are. It is often traded off against [[recall]], though ideally, we want to maximize both precision and recall.
Assuming we're classifying spam, it tells us the proportion of messages we classified as spam that actually were spam.
It's the ratio of true positives (messages classified as spam that actually are spam) to all predicted positives (all messages classified as spam, whether correctly or incorrectly).
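A toy calculation with hypothetical counts (the numbers below are invented for illustration):

```python
# Hypothetical spam-filter counts (invented for illustration).
flagged_as_spam = 50   # messages the classifier marked as spam (all predicted positives)
actually_spam = 45     # of those, the ones that really are spam (true positives)

precision = actually_spam / flagged_as_spam
print(precision)  # 0.9 -> 90% of the spam predictions were correct
```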
It's used to compute the [[F1 score]].
Search engine query example: the search returns 30 pages, of which 20 are relevant, but it fails to return 40 additional relevant pages (worked through in the sketch after this list):
- precision: 20/30 = 2/3 ("how **valid** the results are")
- recall: 20/60 = 1/3 ("how **complete** the results are")
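The same numbers worked through in a short Python sketch:

```python
# Search-engine example from above: 30 pages returned, 20 of them relevant,
# and 40 additional relevant pages that were never returned.
returned = 30
relevant_returned = 20   # true positives
relevant_missed = 40     # false negatives

precision = relevant_returned / returned                            # 20/30 = 2/3
recall = relevant_returned / (relevant_returned + relevant_missed)  # 20/60 = 1/3
print(precision, recall)
```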
## Significance of precision
Use precision when **false positives** are costlier.
High precision means the model returns substantially more relevant results than irrelevant ones. Precision matters in many real-world applications where false positives can be costly or dangerous. It is crucial when we want to avoid false positives: when we detect something as positive, we want to be extremely certain it is positive.
In medical diagnosis, a high-precision classifier will rarely misdiagnose healthy patients as having the disease (i.e., it produces few false positives).
Fraud detection: Flagging legitimate transactions as fraudulent can damage customer relationships and cause financial loss. High precision ensures most flagged cases are true fraud.
The core theme is that **improper positive classifications (i.e., false positives) have direct negative consequences**. Maximizing precision safeguards systems and users from harm due to **false positives**. It provides reliable and trustworthy results.
To balance recall and precision, use [[F1 score]] or [[general F-beta score]].
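A sketch of how the two combine, using the standard F-beta formula; the helper function and the example values (taken from the search-engine example above) are my own:

```python
# F-beta combines precision and recall; beta = 1 gives the F1 score.
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """beta > 1 weights recall more heavily, beta < 1 weights precision more."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

precision, recall = 2 / 3, 1 / 3   # values from the search-engine example
print(f_beta(precision, recall))             # F1 ~= 0.444 (harmonic mean)
print(f_beta(precision, recall, beta=2.0))   # F2 ~= 0.370, emphasizes recall
```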
# References
- [A Technique to Remember Precision and Recall](https://blog.dailydoseofds.com/p/a-technique-to-remember-precision)
- [Precision and recall (Wikipedia)](https://en.wikipedia.org/wiki/Precision_and_recall)
- [You Will Never Forget Precision and Recall If You Use the Mindset Technique](https://www.blog.dailydoseofds.com/p/you-will-never-forget-precision-and?publication_id=1119889&post_id=140686520&isFreemail=true&r=35ltar)
- [A Simple Technique to Understand TP, TN, FP and FN](https://blog.dailydoseofds.com/p/a-simple-technique-to-understand)