Bishop - Pattern Recognition & Machine Learning, Exercise 1.4

Moderator: Statistisches Maschinelles Lernen

Posts: 45

Registered: 20 Oct 2011, 22:38


Post by Stefan1992 » 28 Sep 2018, 23:34

I'm working on exercise 1.4 in Bishop's Pattern Recognition & Machine Learning book.

This exercise is about probability densities, and I have two questions about it.

First, I don't understand equation (1.27). He writes:
"Under a nonlinear change of variable, a probability density transforms differently from a simple function, due to the Jacobian factor."

I have never heard of the Jacobian factor before. What is that factor?

"For instance, if we consider a change of variables $x = g(y)$, then a function $f(x)$ becomes $\tilde f(g(y))$. Now consider a probability density $p_x(x)$ that corresponds to a density $p_y(y)$ with respect to the new variable $y$, where the suffixes denote the fact that $p_x(x)$ and $p_y(y)$ are different densities. Observations falling in the range $(x, x + \delta x)$ will, for small values of $\delta x$, be transformed into the range $(y, y + \delta y)$ where $p_x(x)\delta x \simeq p_y(y)\delta y$, [...]"

What does the relation $\simeq$ mean in this context?
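My current guess (please correct me if I'm wrong): $\simeq$ means "approximately equal, with equality in the limit $\delta x \to 0$", because both sides approximate the same probability mass:

$$p_x(x)\,\delta x \;\simeq\; \int_x^{x+\delta x} p_x(x')\,\mathrm{d}x' \;=\; \int_y^{y+\delta y} p_y(y')\,\mathrm{d}y' \;\simeq\; p_y(y)\,\delta y.$$

Is that the right way to read it?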

"[...] and hence
$$\begin{aligned}
p_y(y) &= p_x(x) \left| \frac{\mathrm{d}x}{\mathrm{d}y}\right|\\
&= p_x(g(y))\left|g'(y)\right|.
\end{aligned}$$"

This is equation (1.27). I don't understand where it comes from. Why is there an absolute value?
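To convince myself that the $|g'(y)|$ factor is really needed, I tried a small Monte-Carlo check (my own toy example, not from the book): sample $x$ from an exponential density, transform to $y = g^{-1}(x) = \ln x$, and compare the histogram of $y$ against the density $p_x(g(y))\,|g'(y)|$ predicted by (1.27).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy example (mine, not the book's): x ~ Exp(1), so p_x(x) = exp(-x),
# with the nonlinear change of variable x = g(y) = exp(y), i.e. y = ln(x).
x = rng.exponential(1.0, size=1_000_000)
y = np.log(x)

# Density predicted by (1.27): p_y(y) = p_x(g(y)) |g'(y)| = exp(-e^y) * e^y
p_y = lambda t: np.exp(-np.exp(t)) * np.exp(t)

# Empirical density of y from the samples
hist, edges = np.histogram(y, bins=110, range=(-8.0, 3.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

err = np.max(np.abs(hist - p_y(centers)))
print(err)  # close to zero: the histogram matches p_x(g(y)) |g'(y)|
```

The match only works with the $|g'(y)|$ factor included, so I can see *that* it is needed; I just don't see the derivation.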

"One consequence of this property is that the concept of the maximum of a probability density is dependent on the choice of variable."

And at this point the book refers to exercise 1.4:

"Consider a probability density $p_x(x)$ defined over a continuous variable $x$, and suppose that we make a nonlinear change of variable using $x = g(y)$, so that the density transforms according to (1.27). By differentiating (1.27), show that the location $\hat y$ of the maximum of the density in $y$ is not in general related to the location $\hat x$ of the maximum of the density over $x$ by the simple functional relation $\hat x = g(\hat y)$ as a consequence of the Jacobian factor. This shows that the maximum of a probability density (in contrast to a simple function) is dependent on the choice of variable. Verify that, in the case of a linear transformation, the location of the maximum transforms in the same way as the variable itself."
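For what it's worth, here is the numeric experiment I ran to see what the exercise is apparently getting at (my own toy densities, not the book's): with a Gaussian $p_x$ and the nonlinear map $x = e^y$, the mode of $p_y$ is not at $g^{-1}(\hat x)$, but with a linear map $x = ay + b$ it lands exactly at $(\hat x - b)/a$.

```python
import numpy as np

# Toy setup (mine, not the book's): p_x is a Gaussian with mode x_hat = 1.
x_hat = 1.0
p_x = lambda x: np.exp(-0.5 * (x - x_hat) ** 2) / np.sqrt(2.0 * np.pi)

y = np.linspace(-3.0, 3.0, 600001)

# Nonlinear change of variable x = g(y) = exp(y); by (1.27),
# p_y(y) = p_x(exp(y)) * exp(y).
p_y_nonlin = p_x(np.exp(y)) * np.exp(y)
y_hat_nonlin = y[np.argmax(p_y_nonlin)]
print(y_hat_nonlin, np.log(x_hat))  # ~0.4812 vs 0.0 -> NOT g^{-1}(x_hat)

# Linear change of variable x = g(y) = a*y + b; p_y(y) = p_x(a*y + b) * |a|.
a, b = 2.0, 3.0
p_y_lin = p_x(a * y + b) * abs(a)
y_hat_lin = y[np.argmax(p_y_lin)]
print(y_hat_lin, (x_hat - b) / a)  # both -1.0 -> mode moves with the variable
```

So numerically I can reproduce the claimed behaviour, but I don't see how to show it "by differentiating (1.27)".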

I don't understand what this exercise is asking me to do... :/

It would be great if someone could help me...
