Ravi Mohan's Blog

Monday, January 30, 2006

To Teach is To Learn Twice ... And More

I am sometimes asked by clients to implement Artificial Intelligence based solutions to problems involving massive datasets, real-time requirements and so on. A nontrivial part of this work is transferring knowledge of the AI algorithms and the underlying maths to the client's programmers. After doing this a few times, I now have a clearer idea of some of the difficulties involved and some partial solutions to them.
  1. Problem 1 : Transferring Mathematical Intuition

    Using AI effectively often demands a deep understanding of the mathematics underlying whatever AI 'paradigm' or algorithm you use. For example, to even think about whether Neural Networks are a good solution for a problem, one needs to understand Linear Algebra at a very intuitive level. To many people, maths == a set of equations or symbols to be manipulated. This manipulation and rearrangement of concepts to achieve precise effects is something programmers are naturally good at, so this tendency is even more deeply rooted in good programmers.

    For example, a lot of us have learned the equation F = m * a, where 'F' is force, 'm' is mass and 'a' is acceleration. To get through our exams we treat this as a kind of pluggable system in which two quantities are known (or can be derived from the given data) and the third has to be calculated. This looks fairly simple. To check whether you understand the reality underlying the equation, ask yourself this: "Does force cause acceleration? Or vice versa? Or both? Why? When?".

    Using an equation to *calculate* a quantity is different from being able to think effectively in terms of the concepts underlying the equation. To use mathematics effectively, one has to constantly translate between the world of mathematics and the domain of the problem. Very few people (including mathematicians) feel the need to acquire this skill. People who use mathematics to get work done (e.g. physicists, astronomers, race car designers) acquire it out of sheer necessity. Forcing oneself to translate (and asking students to translate) back and forth between English (without using mathematical terms) and mathematics is a valuable exercise. (Try doing this for a simple differential equation and you will see what I mean.)
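As one possible illustration of the translation exercise, here is a simple differential equation built on F = m * a. It is only a sketch under assumptions that are not in the post: an object of mass m slowed by a drag force proportional to its speed v, with a hypothetical drag constant k and starting speed v_0.

```latex
% Assumed setup (not from the post): linear drag, F = -k v, with k > 0.
\begin{align}
  m \frac{dv}{dt} &= -k\,v
    && \text{(F = m a, with the drag force as F)} \\
  \frac{dv}{dt}   &= -\frac{k}{m}\,v
    && \text{(divide both sides by m)} \\
  v(t)            &= v_0\, e^{-(k/m)\,t}
    && \text{(solution with } v(0) = v_0\text{)}
\end{align}
```

In plain English, without mathematical terms: the faster the object is moving, the harder it is held back, so it loses speed quickly at first and ever more gently later, and a heavier object takes longer to slow down. Going the other way, from that sentence back to the first line above, is the other half of the exercise.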

    Thus, it isn't possible for someone to grab a few equations out of the latest AI, Pattern Recognition or Robotics book and apply them straight away to solve real world problems. To even know what is possible takes an understanding of the mathematical underpinnings of the various "families" of AI algorithms, and this intuition takes a long time (at least for folks like me) to acquire. If programmers are just shown some algorithms or mathematical proofs, it is very unlikely they will be able to program an effective system, or even maintain a system so that it keeps up with a changing environment. This is complicated by the fact that the intuition may take a few years to acquire but has to be transferred in a week or less.

    While there is no perfect solution to this, I find that one can transfer large "chunks" of intuition through carefully selected (sequences of) motivating examples. In my first "iteration" of teaching, I said (in effect), "O.k., here is the (basic) theory. This is the algorithm in pseudocode. Use this and you will get the results you want". Everyone seemed happy, but the whole effort "thrashed" for quite a while before delivering the (astoundingly effective) results.

    So these days, I try to instill an understanding of the theory distinct from the programming effort. More of a "teach how to fish" while giving the student enough "fish" so he doesn't starve till he learns to fend for himself.

    Thus, instead of just throwing out the algorithms for (say) prediction and monitoring of data streams using Markov models, I start with real world examples of prediction vs monitoring and slowly feed in the maths (equations first, trivial programs next, then proofs, then real world programs). Given sharp "students" (as most of the programmers I interact with on a daily basis are, thank God!), this works surprisingly well. I sometimes feel I am spinning a rope bridge across yawning chasms, but the notion of a "slice" through a system works just as well in mathematics as in agile "story card" based implementation. Once a few cables are thrown across, programmers feel brave enough to explore the abyss a few feet at a time without too much assistance.
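To give a flavour of the "trivial programs" step, here is a minimal sketch of prediction vs monitoring on a toy Markov model. Everything in it is an assumption made up for illustration, not taken from any client system: the two hidden states ("normal" and "faulty"), the "low"/"high" readings, all the probabilities, and the function names step, predict and monitor.

```python
# Hypothetical two-state model of a data stream. All numbers are invented.
STATES = ("normal", "faulty")

# TRANSITION[i][j] = probability of moving from state i to state j in one step.
TRANSITION = {
    "normal": {"normal": 0.9, "faulty": 0.1},
    "faulty": {"normal": 0.3, "faulty": 0.7},
}

# EMISSION[s][o] = probability of seeing reading o while in state s.
EMISSION = {
    "normal": {"low": 0.8, "high": 0.2},
    "faulty": {"low": 0.3, "high": 0.7},
}


def step(belief):
    """Push a belief (state -> probability) one step through the transitions."""
    return {
        nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in STATES)
        for nxt in STATES
    }


def predict(belief, steps):
    """Prediction: where will the system be `steps` ahead, with no new data?"""
    for _ in range(steps):
        belief = step(belief)
    return belief


def monitor(belief, readings):
    """Monitoring (filtering): revise the belief as each reading arrives."""
    for reading in readings:
        belief = step(belief)  # the system moves on by one step
        weighted = {s: belief[s] * EMISSION[s][reading] for s in STATES}
        total = sum(weighted.values())  # rescale so the belief sums to 1
        belief = {s: weighted[s] / total for s in STATES}
    return belief


start = {"normal": 1.0, "faulty": 0.0}
print("predicted 3 steps ahead:", predict(start, 3))
print("after three 'high' readings:", monitor(start, ["high", "high", "high"]))
```

With these made-up numbers, prediction alone puts the chance of "faulty" three steps ahead at roughly 0.2, while a single "high" reading already raises the monitored belief to about 0.28 and further "high" readings push it higher. Prediction only rolls the model forward; monitoring keeps correcting the belief against the incoming data, which is the distinction the real world examples are meant to bring out.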

  2. Problem 2 : Getting beyond "right" and "wrong"

    In spite of my best attempts at simplifying and communicating, my "students" (I think of them as peer programmers rather than as students, hence the quote marks) sometimes misunderstand and "screw up" the concepts and write strange programs or advance baffling arguments. In the beginning my response was "That is NOT what I said! THIS is what you need to do. PLEASE look at the examples/notes/code. Aaaargh!!!! Not Again!".

    A more effective way is to try to figure out what the student's mental model is, faulty though it be. So now, instead of reacting to a totally off track argument by (mentally) banging my head on the nearest wall, I say to myself, "What he says makes sense to him. So what mental model/thought processes has he adopted that cause him to see the world in a fashion that makes this argument logical?". The moment I can identify this precisely, I am able to come up with an illuminating example that demonstrates the error. And sometimes it turns out they are on to something important and it is my perceptions that need fixing. "Learn Twice" indeed.

  3. Problem 3 : Ability to Do != Ability to Teach

    Sometimes the "teaching" gets so fatigue-inducing that I think to myself, "I could have coded all this up in one tenth of the time it takes me to communicate it to someone else". My client (being wiser than I) insisted on the teaching approach. What I have realized is that many people (including Yours Truly) know how to do many things but are often unable to explain how they do them or teach others to do likewise. When you teach others, you have to know each concept in a crystal clear fashion and also grok all the interconnections between the concepts. Teaching often involves rearranging concepts in increasing order of complexity and interdependence, seeking real life and programming examples that illustrate each facet of a concept with great clarity, and slowly transitioning into complex real world problems and solutions. After teaching others, I understand many things at a much deeper level than I used to.

    That being said, teaching consumes massive amounts of time. I have to work 10 hours or more to prepare for a three hour "lecture". Thus "teaching" conflicts with "doing". For the time being, I'll focus on being a programmer more than a "teacher" type but I wonder how others manage.

  4. There is a lot more I could write but this post is long enough. In another entry I will look into the difference between "Real world" and "Toy" AI.

7 comments:

Anonymous said...

Enlightening blog!
It says most of these things explicitly and very clearly. Many of us may have come across one or another of the situations mentioned but never understood them so clearly. Reading this, I could relate it to an experience I had a long time back. The father of one of my friends is a scientist, and we were doing a project under his guidance. He was making us understand the domain of the problem to be solved using maths. I was (and so was his son :)) amazed and dazzled by the insight he had into the concepts underlying those mathematical equations. He would make us write an equation and explain its meaning (and that of every variable in it) in the physical world. He would then go ahead and slowly transform the domain object/concept (say the initial state of an object/particle was x and it starts moving in a particular fashion and reaches a state y) and explain the whole process so clearly that we could not just visualize the process but also "see" exactly how the variables in the equations were changing. At the end of a complex transformation of the physical object, he asked us to write the final result, and we were kind of speechless, as we could write the final equation without deriving it mathematically from the initial equation. I am not sure if this is always possible, as I am not a great mathematician, but of course the problem was a good deal more complex than most simple textbook physics questions, and so was the domain. That is why it was so exciting to see that, with a little college maths insight, we were able to write the results so perfectly and map the final state of the object to the mathematical equations so well. I had heard about maths as an alternative representation/view of the physical world, but had never "seen" the relation applied so well to a very complex problem.
I think most of the people who create/discover new concepts/algorithms go back and forth between maths and the real world and use them as complementary tools to understand the domain in question.
This is one of the best learning experiences I have ever had. Most of the time we do not have the luxury of a great teacher like the person above, but recently I was quite amused when one of my friends told me about a great scientist/programmer who learns from his code! Of course, he most probably does this because there is no greater authority/teacher in his domain than himself, but it was a whole new perspective. I had generally seen programs as a way to make computers do things faster/more efficiently, not as a teaching aid (or rather a teacher). Soon enough I also realized that this applies only when you work on complex domains. I mean, what could using, say, AJAX (it has its own merits though) possibly "teach" me? Of course it gives me more ideas for implementing, in a more efficient/better way, things *I am clear about* but thought had technical limitations earlier. But does it clarify a vague concept in my mind, unlike, say, AI, where I learn more about a concept or real world domain from the behaviour of my program?

pavan said...

Ravi,

I did make a copy of this particular post on my blog (http://yarapavan.blospot.com) without your consent. Hope you don't mind.

Ravi said...

Pavan,
No Problem. I see you have attributed the blog to me. No issues at all.
(The URL you pasted in your comment does not work. It should be http://yarapavan.blogspot.com; you missed a 'g'.)

Tejaswi said...

Am waiting for the next post in this series.

Ravi said...

series? :-P
I just write what I feel like writing at a particular moment with zero editing (as the messed up punctuation and misspelled words show) :-).

Having said that, I do hope to continue writing. Urk that sounds very grandiose. Oh well, we'll see.
Once I get my new laptop loaded with my favorite pieces of software ....

Tejaswi said...

In another entry I will look into the difference between "Real world" and "Toy" AI.

I specifically meant the above when I said I am waiting for the next post in the series [on AI, say].

Ravi said...

Teju,
I have some inchoate thoughts whirling in the empty and echoing caverns of my brain. Will post something as soon as I can string some logical thoughts together.
Regds,