**Your company implemented a biometric system that matches fingerprints against the model database to control access to the computer room. A Type I error occurs when an IT engineer is authorized to enter the computer room by the management but rejected by the system. Which of the following is the best null hypothesis to determine Type I error? (Wentz QOTD)**

A. The subject is either an employee or an imposter

B. The false rejection rate is higher than the false acceptance rate

C. The sample fingerprint matches the template in the model repository

D. The sample fingerprint doesn’t match the template in the model repository

**Kindly be reminded that the suggested answer is for your reference only. It doesn’t matter whether you have the right or wrong answer. What really matters is your reasoning process and justifications.**

My suggested answer is C. The sample fingerprint matches the template in the model repository.

Wentz’s book, *The Effective CISSP: Security and Risk Management*, helps CISSP and CISM aspirants build a solid conceptual security model. It is a tutorial for information security, a supplement to the official study guides for the CISSP and CISM exams, and an informative reference for security professionals.

## Null and Alternative Hypotheses

The null hypothesis is a presumption of zero or no deviation from the normal state. Because proving an assumption directly is difficult, we typically look for evidence against the null hypothesis. If that evidence is strong enough, we reject the null hypothesis and accept the alternative, rather than proving the alternative directly. For the biometric system in this question, the null and alternative hypotheses can be written as follows:

- Alternative Hypothesis: The sample fingerprint doesn’t match the template in the model repository
- Null Hypothesis: The sample fingerprint matches the template in the model repository

Biometrics-related terms like False Acceptance Rate (FAR) and False Rejection Rate (FRR) are commonly used and quite effective for communication. It’s not uncommon for people or books to relate FAR/FRR to Type I/II errors (used in statistical hypothesis testing) or False Positives/Negatives (used in binary classification). I wrote this question to stress the importance of stating the null hypothesis whenever we talk about Type I/II errors, because the same wrong decision can be a Type I or a Type II error depending on which hypothesis is treated as the null.

## Type I and Type II Errors

In statistics, we typically don’t propose a single hypothesis and then gather sufficient evidence to prove it. Instead, we accept the alternative hypothesis when the evidence against the null hypothesis is strong enough to reject it at a predefined significance level (e.g., 5%).

The decision in statistical hypothesis testing is whether or not to reject the null hypothesis. That decision can be wrong in two ways:
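As a rough illustration of a predefined significance level, the sketch below picks a rejection threshold from match scores. Everything in it is an assumption made for illustration: the scores are made up, the names `genuine_scores`, `rejection_threshold`, and `reject_null` are hypothetical, and a real biometric system would calibrate its threshold differently. The null hypothesis is "the sample fingerprint matches the template", and the threshold is chosen so that about 5% of genuine samples would be rejected.

```python
# Hypothetical match scores for genuine (authorized) users; higher means a closer match.
genuine_scores = [0.91, 0.88, 0.95, 0.79, 0.93, 0.86, 0.97, 0.90, 0.84, 0.92,
                  0.89, 0.94, 0.81, 0.96, 0.87, 0.85, 0.98, 0.83, 0.99, 0.62]

alpha = 0.05  # predefined significance level (5%)


def rejection_threshold(scores, alpha):
    """Return a score threshold such that roughly `alpha` of the genuine
    scores fall strictly below it (an empirical quantile)."""
    ordered = sorted(scores)
    k = round(alpha * len(ordered))  # how many genuine scores may fall below
    return ordered[k]


def reject_null(score, threshold):
    """Reject H0 ("the sample matches the template") when the score is too low."""
    return score < threshold


threshold = rejection_threshold(genuine_scores, alpha)

# Share of genuine samples that would be wrongly rejected (empirical Type I rate).
type_i_rate = sum(reject_null(s, threshold) for s in genuine_scores) / len(genuine_scores)

print(f"threshold={threshold}, empirical Type I rate={type_i_rate}")
```

Lowering `alpha` moves the threshold down, so fewer authorized users are rejected, at the cost of letting more low-scoring samples through.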

- Type I Error: we reject a null hypothesis that is actually true. (reject a normal case)
- Type II Error: we fail to reject a null hypothesis that is actually false. (accept an abnormal case)
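The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `error_type` and `biometric_outcome` and the `null_is_match` flag are hypothetical, and "authorized" is treated as equivalent to "the sample truly matches the template". The sketch also shows why the choice of null hypothesis matters: the same wrong decision changes its error type when the null is flipped.

```python
def error_type(null_is_true: bool, null_rejected: bool) -> str:
    """Classify a hypothesis-testing decision."""
    if null_is_true and null_rejected:
        return "Type I error"      # rejected a true null hypothesis
    if not null_is_true and not null_rejected:
        return "Type II error"     # failed to reject a false null hypothesis
    return "correct decision"


def biometric_outcome(is_authorized: bool, access_granted: bool,
                      null_is_match: bool = True) -> str:
    """Map a biometric decision onto Type I/II errors.

    null_is_match=True  -> H0: the sample matches the template (option C).
    null_is_match=False -> H0: the sample doesn't match the template (option D).
    """
    if null_is_match:
        # H0 is true for an authorized user; denying access rejects H0.
        return error_type(null_is_true=is_authorized,
                          null_rejected=not access_granted)
    # H0 is true for an unauthorized subject; granting access rejects H0.
    return error_type(null_is_true=not is_authorized,
                      null_rejected=access_granted)


# An authorized engineer is rejected by the system (a false rejection).
print(biometric_outcome(True, False))                       # Type I error
print(biometric_outcome(True, False, null_is_match=False))  # Type II error
```

With the null hypothesis "the sample matches the template", the false rejection described in the question is a Type I error. With the opposite null hypothesis, the same event becomes a Type II error.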

## False Positive and False Negative

When it comes to binary classification in machine learning, a model is trained on sample data to act as a binary classifier, assigning each instance/case one of two labels (e.g., 0/1, spam/not spam, weapon/no weapon).

A system that implements anomaly-based detection may use Imposter/No Imposter for classification as follows:

- “Imposter” is the label for a positive class.
- “No Imposter” is the label for a negative class.

A false positive means the system flags a subject as an imposter, but the decision is wrong: the subject is legitimate. A false negative means the system does not flag the subject as an imposter, and the decision is again wrong: the subject really is an imposter. It’s common for people to relate false positives to Type I errors and false negatives to Type II errors, even though the terms come from contexts that use different techniques. Li’s paper compares statistical hypothesis testing with machine learning binary classification very well.
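To make the labels concrete, here is a minimal sketch that counts false positives and false negatives with "Imposter" as the positive class and then derives FRR and FAR from them. The `attempts` data is made up and the helper `confusion_counts` is a hypothetical name used only for illustration.

```python
from collections import Counter

# Hypothetical access attempts as (actual label, predicted label) pairs.
attempts = [
    ("No Imposter", "No Imposter"),
    ("No Imposter", "Imposter"),     # false positive: legitimate user flagged
    ("Imposter", "Imposter"),
    ("Imposter", "No Imposter"),     # false negative: imposter not flagged
    ("No Imposter", "No Imposter"),
    ("Imposter", "Imposter"),
    ("No Imposter", "No Imposter"),
    ("Imposter", "Imposter"),
]


def confusion_counts(pairs, positive="Imposter"):
    """Count TP, FP, TN, FN for the given positive class."""
    counts = Counter()
    for actual, predicted in pairs:
        if predicted == positive:
            counts["TP" if actual == positive else "FP"] += 1
        else:
            counts["FN" if actual == positive else "TN"] += 1
    return counts


counts = confusion_counts(attempts)

# With "Imposter" as the positive class:
#   a false positive is a legitimate user rejected (false rejection),
#   a false negative is an imposter accepted (false acceptance).
frr = counts["FP"] / (counts["FP"] + counts["TN"])  # share of legitimate users rejected
far = counts["FN"] / (counts["FN"] + counts["TP"])  # share of imposters accepted

print(dict(counts), f"FRR={frr}", f"FAR={far}")
```

Note that this mapping depends on which label is chosen as the positive class. If "No Imposter" were the positive class, a false positive would correspond to a false acceptance instead.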

## References

- Statistical Hypothesis Testing versus Machine Learning Binary Classification: Distinctions and Guidelines
- Intro to Hypothesis Testing in Statistics – Hypothesis Testing Statistics Problems & Examples
- Intro to Hypothesis Testing
- 4 Types of Classification Tasks in Machine Learning
- Classification: True vs. False and Positive vs. Negative

