The Implicit Higher Dimension of Kernels
Support Vector Machines (SVMs) are powerful, and the "kernel trick" allows them to find non-linear decision boundaries without explicitly mapping data to high-dimensional spaces.
- Linear Separability: Consider a 2D dataset where the points are not linearly separable in their original feature space. Sketch an example of such a dataset (e.g., concentric circles or an XOR-like pattern); a code sketch that generates one appears after this list.
- Feature Mapping: Imagine a simple mapping function $\phi(x_1, x_2) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$ that transforms a 2D input into a 3D feature space. Apply this mapping to two sample points of your choice, say $\mathbf{x}$ and $\mathbf{z}$.
- The Kernel Trick: The "kernel trick" avoids explicit mapping by computing the dot product in the higher-dimensional space directly with a kernel function $K(\mathbf{x}, \mathbf{z})$. For the mapping above, show that $K(\mathbf{x}, \mathbf{z}) = (\mathbf{x} \cdot \mathbf{z})^2 = \phi(\mathbf{x}) \cdot \phi(\mathbf{z})$. This is the quadratic (polynomial degree 2) kernel.
- Intuition: Explain in simple terms how the kernel trick allows SVMs to find non-linear boundaries in the original space. Why is explicitly computing $\phi(\mathbf{x})$ often computationally expensive or even intractable for very high-dimensional feature spaces?
- Verification: You can compute $\phi(\mathbf{x}) \cdot \phi(\mathbf{z})$ directly and compare it to $K(\mathbf{x}, \mathbf{z}) = (\mathbf{x} \cdot \mathbf{z})^2$ for the points you picked in step 2 to confirm your derivation of the kernel function; a numerical check is sketched after the list.
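For the first step, here is a minimal sketch of a non-linearly-separable dataset and of the effect of a degree-2 kernel. It assumes NumPy and scikit-learn are available; the `make_circles` generator and the specific sample sizes, noise level, and kernel parameters are illustrative choices, not part of the original exercise.

```python
# Sketch: a 2D dataset that is not linearly separable (concentric circles),
# classified with a linear SVM vs. a degree-2 polynomial-kernel SVM.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: one class on the outer ring, the other on the inner ring.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A linear SVM cannot separate the rings; a quadratic kernel can.
linear_svm = SVC(kernel="linear").fit(X_train, y_train)
poly_svm = SVC(kernel="poly", degree=2, coef0=0, gamma=1.0).fit(X_train, y_train)

print("linear kernel accuracy:   ", linear_svm.score(X_test, y_test))
print("degree-2 poly kernel accuracy:", poly_svm.score(X_test, y_test))
```

The linear model hovers near chance accuracy while the degree-2 kernel separates the rings almost perfectly, which is the behaviour the exercise asks you to explain.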
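For the feature-mapping and verification steps, the sketch below assumes the standard degree-2 mapping $\phi(x_1, x_2) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$ and checks numerically that the explicit dot product in the 3D feature space equals the quadratic kernel $(\mathbf{x} \cdot \mathbf{z})^2$ computed in the 2D input space; the sample points are arbitrary.

```python
# Sketch: verify that phi(x) . phi(z) == (x . z)^2 for the degree-2 mapping.
import numpy as np

def phi(p):
    """Explicit 2D -> 3D feature map for the quadratic kernel."""
    x1, x2 = p
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def quadratic_kernel(x, z):
    """Kernel trick: the same quantity computed directly in the 2D input space."""
    return np.dot(x, z) ** 2

# Arbitrary sample points (any 2D points work).
x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(z))    # dot product in the 3D feature space
via_kernel = quadratic_kernel(x, z)  # never leaves the 2D input space

print(explicit, via_kernel)          # both equal (x . z)^2 = 1.0 for these points
assert np.isclose(explicit, via_kernel)
```

Swapping in any other pair of points leaves the assertion intact, which is exactly the identity the exercise asks you to derive algebraically.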