A Gentle Intro to Bayesian Statistics for College Admissions
Predictive analytics is playing an increasingly important role in college admissions modeling. It helps predict the likelihood that a student will enroll if admitted. That, in turn, helps explain why demonstrated interest has become a key factor in college admissions decisions.
Bayesian statistics, which trace back to an essay by Thomas Bayes published in 1763, are a key component of the mathematics behind predictive analytics. It all begins with the concept of a conditional probability. For example, what is the probability that a student will enroll if admitted, given the student’s GPA and zip code?
The conditional probability of an event A, given an event B, is written as P( A | B ). This is the same as the probability that both A and B occur, divided by the probability of B on its own (its marginal probability), provided that P( B ) is not zero. Thus,
P( A | B ) = P( A and B ) / P( B )
This definition can be used to prove Bayes’ Theorem, which relates the conditional probability of A, given B, to the conditional probability of B, given A. Thus,
P( A | B ) = P( B | A ) * P( A ) / P( B )
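The proof is short enough to sketch. Apply the definition of conditional probability a second time with the roles of A and B swapped, solve for P( A and B ), and substitute the result back into the original definition. The derivation below assumes that P( A ) and P( B ) are both greater than zero:

```latex
\begin{align*}
P(B \mid A) &= \frac{P(A \text{ and } B)}{P(A)}
  \quad\Longrightarrow\quad
  P(A \text{ and } B) = P(B \mid A)\, P(A), \\[4pt]
P(A \mid B) &= \frac{P(A \text{ and } B)}{P(B)}
  = \frac{P(B \mid A)\, P(A)}{P(B)}.
\end{align*}
```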
For example, suppose that A occurs half the time and B one quarter of the time, and A and B occur together one-tenth of the time. Then, we can use these equations to calculate the conditional probability of A, given B, as (1/10) / (1/4) = 2/5.
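The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name conditional is made up for illustration, and exact fractions are used so that no rounding is involved. It also confirms that Bayes’ Theorem gives the same answer as the definition.

```python
from fractions import Fraction


def conditional(p_joint, p_condition):
    """Return P(X | Y) = P(X and Y) / P(Y); hypothetical helper for this example."""
    if p_condition == 0:
        raise ValueError("cannot condition on an event with probability zero")
    return p_joint / p_condition


# Probabilities taken from the example in the text
p_a = Fraction(1, 2)         # A occurs half the time
p_b = Fraction(1, 4)         # B occurs one quarter of the time
p_a_and_b = Fraction(1, 10)  # A and B occur together one-tenth of the time

# Directly from the definition of conditional probability
p_a_given_b = conditional(p_a_and_b, p_b)
p_b_given_a = conditional(p_a_and_b, p_a)

# The same quantity via Bayes' Theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_a_given_b_bayes = p_b_given_a * p_a / p_b

print(p_a_given_b, p_b_given_a, p_a_given_b_bayes)  # prints: 2/5 1/5 2/5
```

Both routes give 2/5, which is what Bayes’ Theorem promises: knowing P( B | A ), P( A ), and P( B ) is enough to recover P( A | B ).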
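Actual Bayesian analysis is a bit more complicated than this, but in effect it uses known information to reason about the probabilities of unknown information. You’ll often hear people talking about the Monte Carlo method and Markov chains, which are simulation techniques commonly used to approximate probabilities that are too hard to calculate directly.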
There’s one other important thing you need to know about Bayesian statistics, which is that many Bayesian methods rely on an independence assumption. Two probability distributions can be combined by simple multiplication if the variables are independent of each other; that is, P( A and B ) = P( A ) * P( B ). For example, among applicants for college admission, age and gender are usually independent of each other. The mutual information between the two variables is very small. However, age and year-in-school are not independent. There is a high degree of mutual information between the two variables. If you know a student’s age, you can usually predict their year in school with a high degree of accuracy.
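To make the idea of mutual information concrete, here is a small Python sketch that computes it from a table of joint probabilities. The function name mutual_information and both tables are assumptions made up for illustration; the numbers are not real admissions data. The result is measured in bits: zero means the variables are independent, and larger values mean that knowing one variable tells you more about the other.

```python
import math


def mutual_information(joint):
    """Mutual information in bits for a joint table {(x, y): probability}."""
    # Marginal probabilities of each variable on its own
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    # Sum of p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), skipping empty cells
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Made-up table in which age and gender are exactly independent
age_and_gender = {
    ("17", "F"): 0.25, ("17", "M"): 0.25,
    ("18", "F"): 0.25, ("18", "M"): 0.25,
}

# Made-up table in which age strongly predicts year in school
age_and_year = {
    ("17", "junior"): 0.45, ("17", "senior"): 0.05,
    ("18", "junior"): 0.05, ("18", "senior"): 0.45,
}

print(mutual_information(age_and_gender))  # 0.0 bits: independent
print(mutual_information(age_and_year))    # about 0.53 bits: strongly related
```

In the first table every joint probability equals the product of its marginals, so the mutual information is zero and the two distributions can safely be multiplied. In the second table they cannot, because multiplying the marginals would give 0.25 in every cell and hide the strong link between age and year in school.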
One of the core benefits of the Cappex system is that it provides student inquiries based on demonstrated interest. As these students are more likely to enroll if admitted, admissions departments can more accurately predict student behaviors and optimize their overall recruiting efforts.