KL Divergence Calculation and Interpretation
The Kullback-Leibler (KL) Divergence (also known as relative entropy) is a non-symmetric measure of how one probability distribution $P$ differs from a second, reference probability distribution $Q$. A KL divergence of 0 indicates that the two distributions are identical. It is commonly used in variational autoencoders (VAEs) and reinforcement learning, and it can be interpreted as the information gain when going from $Q$ to $P$.
The discrete KL divergence from $Q$ to $P$ is defined as:

$$D_{KL}(P \,\|\, Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}$$
For continuous distributions, the sum is replaced by an integral. Note that the divergence is undefined if $Q(i) = 0$ while $P(i) > 0$ for any $i$. For numerical stability, it is often computed as $\sum_{i} P(i)\,(\log P(i) - \log Q(i))$.
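As a quick worked example (using natural logarithms, so the result is in nats, matching SciPy's default), take $P = [0.1, 0.9]$ and $Q = [0.9, 0.1]$:

$$D_{KL}(P \,\|\, Q) = 0.1 \ln\frac{0.1}{0.9} + 0.9 \ln\frac{0.9}{0.1} = (0.9 - 0.1)\ln 9 \approx 1.7578$$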
Your task is to implement a function to calculate the KL Divergence for discrete probability distributions.
Implementation Details:
* kl_divergence(p, q):
* p: A 1D NumPy array representing probability distribution $P$.
* q: A 1D NumPy array representing probability distribution $Q$.
* Both p and q should sum to 1.0 and contain non-negative values.
* Handle the case where $P(i) > 0$ but $Q(i) = 0$. In such cases, the KL divergence tends to infinity, which should be represented by np.inf.
* For $P(i) = 0$, the term $P(i) \log \frac{P(i)}{Q(i)}$ is $0 \log 0$, which is typically defined as 0. Use np.where or similar for robust handling.
* Return the scalar KL divergence (see the sketch after this list).
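A minimal NumPy sketch that satisfies these requirements might look like the following; the early return for the infinite case and the boolean mask over p are one reasonable way to handle the edge cases, not the only one:

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)

    # If any P(i) > 0 where Q(i) = 0, the divergence is infinite.
    if np.any((p > 0) & (q == 0)):
        return np.inf

    # Only terms with P(i) > 0 contribute; 0 * log(0 / q) is taken as 0.
    # log(p) - log(q) is the numerically stable form mentioned above.
    mask = p > 0
    return float(np.sum(p[mask] * (np.log(p[mask]) - np.log(q[mask]))))
```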
Verification:
1. Test with identical distributions: p = np.array([0.5, 0.5]), q = np.array([0.5, 0.5]). The expected KL divergence is 0.
2. Test with different distributions: p = np.array([0.1, 0.9]), q = np.array([0.9, 0.1]). The expected value is approximately 1.7578 nats (see the worked example above).
3. Test with a distribution where $Q(i) = 0$ for some $i$ with $P(i) > 0$: p = np.array([0.5, 0.5]), q = np.array([1.0, 0.0]). The expected result is np.inf.
4. Compare your results with scipy.stats.entropy(pk=p, qk=q) and ensure the values match for valid cases; see the example checks after this list.
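The verification steps can be run directly against the SciPy reference. The snippet below assumes the kl_divergence sketch from above is in scope:

```python
import numpy as np
from scipy.stats import entropy

# 1. Identical distributions -> divergence of 0
p, q = np.array([0.5, 0.5]), np.array([0.5, 0.5])
print(kl_divergence(p, q), entropy(pk=p, qk=q))   # both 0.0

# 2. Different distributions -> approximately 1.7578 nats (0.8 * ln 9)
p, q = np.array([0.1, 0.9]), np.array([0.9, 0.1])
print(kl_divergence(p, q), entropy(pk=p, qk=q))   # the two values should match

# 3. Q(i) = 0 where P(i) > 0 -> infinite divergence
p, q = np.array([0.5, 0.5]), np.array([1.0, 0.0])
print(kl_divergence(p, q))                        # np.inf
```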