ML Katas

KL Divergence Calculation and Interpretation

medium (<30 mins) · probability · KL-divergence · info-theory · statistics · VI

The Kullback-Leibler (KL) divergence (also known as relative entropy) is a non-symmetric measure of how much a probability distribution P differs from a second, reference distribution Q. A KL divergence of 0 indicates that the two distributions are identical. It is commonly used in variational autoencoders (VAEs) and reinforcement learning, and it can be interpreted as the information gained when moving from Q to P.

The discrete KL divergence from Q to P is defined as:

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P(i) \log\!\left(\frac{P(i)}{Q(i)}\right)$$

For continuous distributions, the sum is replaced by an integral. Note that the divergence is undefined (treated as infinite) if Q(i) = 0 for some i with P(i) > 0. For numerical stability, it is often computed as $\sum_i P(i)\,(\log P(i) - \log Q(i))$.
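As a quick worked example (using natural logarithms, which is also the default of scipy.stats.entropy), take P = (0.1, 0.9) and Q = (0.9, 0.1):

$$D_{\mathrm{KL}}(P \,\|\, Q) = 0.1 \log\frac{0.1}{0.9} + 0.9 \log\frac{0.9}{0.1} = 0.8 \log 9 \approx 1.7578 \text{ nats}$$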

Your task is to implement a function to calculate the KL Divergence for discrete probability distributions.

Implementation Details:

* kl_divergence(p, q):
  * p: A 1D NumPy array representing probability distribution P.
  * q: A 1D NumPy array representing probability distribution Q.
  * Both p and q should sum to 1.0 and contain non-negative values.
  * Handle the case where P(i) > 0 but Q(i) = 0. In such cases, the KL divergence tends to infinity, which should be represented by np.inf.
  * For P(i) = 0, the term P(i) log(P(i)/Q(i)) is 0 · log(0/Q(i)), which is typically defined as 0. Use np.where or similar for robust handling.
  * Return the scalar KL divergence.
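For reference, one possible sketch of such a function, assuming only NumPy; it masks out the zero-probability entries of P rather than using np.where, but either approach works:

```python
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """Compute D_KL(P || Q) for discrete distributions given as 1D arrays."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)

    # Terms with P(i) = 0 contribute nothing (0 * log 0 is taken to be 0).
    support = p > 0

    # If Q(i) = 0 anywhere on the support of P, the divergence is infinite.
    if np.any(q[support] == 0):
        return np.inf

    # Numerically stable form: sum of P(i) * (log P(i) - log Q(i)) over the support.
    return float(np.sum(p[support] * (np.log(p[support]) - np.log(q[support]))))
```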

Verification:

1. Test with identical distributions: p = np.array([0.5, 0.5]), q = np.array([0.5, 0.5]). The expected KL divergence is 0.
2. Test with different distributions: p = np.array([0.1, 0.9]), q = np.array([0.9, 0.1]).
3. Test with a distribution where Q(i) = 0 for some P(i) > 0: p = np.array([0.5, 0.5]), q = np.array([1.0, 0.0]). Expected result: np.inf.
4. Compare your results with scipy.stats.entropy(pk=p, qk=q) and ensure the values match for valid cases.
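A minimal check along these lines, assuming the kl_divergence sketch above and that SciPy is installed:

```python
import numpy as np
from scipy.stats import entropy  # computes KL divergence (in nats) when qk is given

# kl_divergence as sketched above

# 1. Identical distributions: expect 0.
print(kl_divergence(np.array([0.5, 0.5]), np.array([0.5, 0.5])))  # 0.0

# 2. Different distributions: compare against scipy.stats.entropy.
p, q = np.array([0.1, 0.9]), np.array([0.9, 0.1])
print(kl_divergence(p, q), entropy(pk=p, qk=q))  # both ~1.7578

# 3. Q(i) = 0 where P(i) > 0: expect inf.
print(kl_divergence(np.array([0.5, 0.5]), np.array([1.0, 0.0])))  # inf
```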