Learning calculus in order to understand neural nets

I've decided I need to learn what derivatives are before I can get to the heart of what's happening inside the "hidden layers" of neural nets, because backpropagation is still not making sense to me. Turns out derivatives are at the heart of calculus. I avoided calculus all my life, but now I have an actual use for it, so here goes. But first, a rant of sorts.

It took me a long time to figure out that the reason I didn't like math is that I organize information intuitively, thinking in pictures, not linearly or abstractly, and most math is taught -- even to those who learn linearly -- with huge and arbitrarily-placed leaps of intuitive logic required.

When I want to understand a new mathematical concept, I have learned not to go straight for Wikipedia or even Wolfram or Khan or anywhere mathy types might go first. Instead, I know to hit a search engine with "new-math-concept" plus the word "intuitive" or "new-math-concept INTUITION" and sure enough, a few websites and articles written by others who think intuitively on the subject will appear. Sometimes there are many. Using this approach, often I am able to grok the new idea within seconds, as soon as the intuitive images are presented. Other times I still labor through the concepts a few times before I get them, but even then, learning by allegory is so much more fun than the complete bafflement I get when trying to learn in the more abstract way which seems to be common.

Anyway, point is, I've at least learned what my limitation is -- that I learn intuitively, not linearly -- and with that I understand why Wikipedia doesn't work for me on math: it does not organize mathematical information this way, although it does all right with things like history, biography, and so forth.

So as I was saying, now I'm finally learning what a derivative is, because the concept keeps popping up while I'm trying to figure out how neural nets work. Here's the first useful link I found, a little verbose, but with some good intuitive points being made: Calculus: Building Intuition for the Derivative. It has rich images like the following: "Imagine a shirtless Santa on a treadmill. We're going to measure his heart rate in a stress test: we attach dozens of heavy, cold electrodes and get him jogging." I can do that, and while doing so, learn a little about the limitations of measurement.

The following link, on the other hand, pretends to be intuitive but starts right out by defining the derivative of a function as "instantaneous rate of change" (see An Intuitive Introduction to Derivatives).

Now people who already understand what it means will think it's silly that I don't like that definition, but here's what's going on: that definition is of almost no value whatsoever to me, and it stops me cold. It's a meaningless phrase, as gibberish as if I said "flying pink elephant."

What instantaneously changes?

Seems incoherent to me, right out of the box. Here's why: Instantaneous means a slice of time smaller than that in which any change can be made. Change means something that requires time, so forget about joining it with a word like instantaneous, which means without time. It's a logically impossible thing, this concept of instantaneous change. My brain sits there and churns over trying to make sense out of such nonsense, while the rest of the paragraph goes on, boldly tossing out other such local paradoxes within its globally-coherent narrative that somehow makes enough sense to be the introductory paragraph on the subject in a popular online encyclopedia.

That site then goes on and says "well, if that made no sense, it's the slope of a curve" and carries on like this for a while. Really? That's of little value also. Slope of a curve? What's instantaneous about a slope? My brain is still trying to parse "slope of a curve" -- which has problems if you're picturing a circle instead of a hill -- as something similar to the equivalent phrase "instantaneous change." I can see how "curve" and "change" could be similar because of the non-linearity, but "instantaneous" is less sensible with "slope" than with "change"!
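For what it's worth, here is one picture that could connect the two phrases. It is only a rough sketch of my own, not something taken from either linked page. Measure change over a stretch that does take time, then keep shrinking the stretch and watch where the answer settles. The names `average_rate` and `square` are made up for illustration, and `square` is just a stand-in for whatever quantity is changing.

```python
def average_rate(f, x, dx):
    """Change in f between x and x + dx, divided by dx.

    This is an ordinary slope: the slope of the straight line
    joining the two points on the curve.
    """
    return (f(x + dx) - f(x)) / dx


def square(x):
    # Illustrative stand-in for "the thing that is changing".
    return x * x


# Shrink the stretch we measure over and watch the slope settle down.
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, average_rate(square, 3.0, dx))
```

Nothing here changes in zero time. Each printed number is a plain slope measured across a real gap, and as the gap shrinks the numbers head toward 6. That settling-down value is what the phrase "instantaneous rate of change" seems to be pointing at.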

Brain is melting. I abandon trying to map "instantaneous change" to "slope of a curve" and look for a picture somewhere.

Show me a way I can find this idea in my kitchen or backyard and I'll understand what you're talking about, but throw abstract ideas like "slopes" and "curves" around and there are all kinds of assumptions you're making which I cannot draw out of your words. And all of those assumptions have other such assumptions. Give me something I can stand on, not just a house of words that fit together in your world -- not mine. Give me pictures (not icons and graphics necessarily, but ideas which I can picture in my mind), some structure I can begin to see inwardly, so I can figure out how to fit it together with what already exists in my world.

The page goes on, delighted with how intuitive it is, and soon points to another long page: The Definition of Derivative: The Intuition Behind It, which is slightly more intuitive, but also not as useful to me as the first link above (even though the first link has lots of irrelevant asides like this present parenthesis here, the content is easier for me to move through).

So there's my rant.

I'm not stupid, I just think like an artist, not a mathematician, and I've learned not only that intuitive explanations are the best for me, but also that even so-called "intuitive" explanations are sometimes not so intuitive after all.

Does this lament make sense to you? If so, then hopefully it helped you. You see, it helped me, merely to discover that intuitive thinking is a perfectly reasonable way to think and even has a large following of many others who also think it's the right way of seeing the world. I have learned to be persistent: keep searching to find the right articles, written by people who are truly intuitive, not people who think an abstract linear jargon-laden description is sufficiently intuitive to greet newbies.

Lastly, that being said, I'm still GREATLY enjoying the intuitive (for a programmer) Hacker's Guide to Neural Nets mentioned in a previous post. Carry on.

 
