Thursday, May 28, 2009

Fulton course/instructor evaluations

Dear all:

Hope you are starting to have a good break and all your mental bruises are slowly healing.

I just received the results of the teaching evaluations that you folks filled out, and enjoyed reading them.

Thanks to all of you who took the time to fill out the evaluations!

It is my somewhat quixotic custom to give the students in the class access to the evaluations for a limited time. It might give you a feel for how your individual
views stacked up against those of the rest of the class (you know--sorrow desires company and all that).

In keeping with it, here are links to the full evaluations--warts and all--in case you are interested:   (471 section)      (598 section)

Regarding the comments about the difficulty level of the course, the following is the link for a comparable course taught by the textbook author 

So look at the bright side--you got almost all that, with a lower tuition, fewer projects, easier exams, *and*  a suaver accent! ;-)


Monday, May 18, 2009



 I agonized for some three days and  just submitted your grades; you should be able to see them on the registrar's site.

 It has been fun teaching you folks; I hope to see some of you in other classes. Feel free to drop by if I can be of any help.

Good luck with your degree programs (or real life, if you were so unlucky as to graduate already :).
Hope you get to recall and use at least some of the things we talked about this semester
down the line somewhere.


Sunday, May 17, 2009

Go Huygens! A new world record in a difficult game for computers

From the article:
At the Taiwan Open 2009, held in Taiwan from Feb. 10-13, the Dutch national supercomputer Huygens, which is located at SARA Computing and Networking Services in Amsterdam, defeated two human Go professionals in an official match.
Here's more:

specimen solutions for the final..

In case you are interested, here is a link to specimen solutions for the final (note that these are solutions written by the student
scoring the highest, but not perfect, on the final).


Saturday, May 16, 2009

Full spreadsheet of the final gradebook

Someone wanted to have one last look at the final grade book I am operating with. Here it is...



Final cumulatives (with the final exam scores thrown in)


 Here are the final cumulatives (with final exam scores thrown in). The highest for final in UG is 102 and in grad is 104 (out of 110).
The averages are 64 and 84 respectively.

The top students in CSE471 and the CSE598 categories are both guaranteed A+ grades (assuming their photo-finish holds up ;-)).
They both are welcome to offer me (non-binding) advice on where to put grade cutoffs for the rest of the class...

As for the rest, they shall find out their letter grades from the registrar come Tuesday.



Thursday, May 14, 2009

a brain aphorism based on the final..

There is a quote I used to like, and it goes like this:
 "If our brains are so simple that we can understand them,
    we will be so simple that we can't"

(Of course, I don't believe it, but I like the sound of it ;-)

Anyways, I was thinking about it, as I am grading the final and several of you wrote
"True" for the question which says that an agent has the best chance of success when
all the variables are independent.. yes you can do reasoning fast, but to what end? You
can't change a thing in the world...


Another short-answer question that only a few people got right is the last one. The point there is that
if you add a random number to the h value, then you are likely to make the h-value of each node unique.
This means there can be as many distinct f-values as there are nodes. That is a death knell for IDA*, which needs a fresh iteration for every distinct f-value.
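
To make the point concrete, here is a rough sketch that counts distinct f-values with and without random noise added to h. The toy node set, the function name `distinct_f_values`, and the noise scale are all assumptions made up for illustration; they are not from the exam.

```python
import random

def distinct_f_values(nodes, noise=0.0, seed=0):
    """Count distinct f = g + h values over a collection of (g, h) pairs.

    IDA* raises its cost bound to the next larger f-value on each
    iteration, so this count bounds the number of iterations.
    """
    rng = random.Random(seed)
    return len({g + h + noise * rng.random() for g, h in nodes})

# Hypothetical toy search space: 1000 nodes with small integer g and h.
rng = random.Random(1)
nodes = [(rng.randint(0, 10), rng.randint(0, 10)) for _ in range(1000)]

plain = distinct_f_values(nodes)              # integer f-values in 0..20
noisy = distinct_f_values(nodes, noise=0.5)   # nearly one f-value per node

print(plain, noisy)
```

With integer heuristics the bound has to be raised only a handful of times; with the noisy heuristic it may have to be raised about once per node, and each raise restarts the depth-first search from the root.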


Saturday, May 9, 2009

Re: Regarding graph planning in Homework 5

First of all, since I discussed mutual exclusion propagation only briefly in the class, you are not responsible for that part.

Since you asked however, just because two actions are mutex doesn't mean that  their effects are mutex--since after all the effects may also have been given by other actions.

Consider for example a situation where p is given by m different actions, and q is given by n different actions.

In order for p and q to be mutex, it must be the case that every pair of actions in the Cartesian product is
mutex--i.e., there must be m*n mutexes.  If there exists even one pair--say ai giving p and bk giving q--such that
ai and bk are not mutex, then p and q are not mutex.

[Suppose you started with the belief that you shouldn't hit people because god might punish you. If you then went on to become an atheist, it doesn't necessarily mean that you should now believe that hitting people is fine. You may have found other reasons why hitting people is not reasonable.]

[In contrast, if *any* pair of preconditions of two actions are mutex, then the actions themselves are mutex.
I.e., if action a has m preconds and action b has n preconds, and any of the m*n precondition pairs are mutex, then actions a and b are mutex.]
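
The two rules above (an "all pairs" test for propositions, an "any pair" test for actions) can be sketched as follows. This is only an illustration: the function names and the representation of a mutex as a `frozenset` pair are assumptions, and the action test covers only the precondition-based rule discussed here, not other sources of action mutexes.

```python
from itertools import product

def props_mutex(supporters_p, supporters_q, action_mutex):
    """p and q are mutex iff EVERY (a, b) pair of supporting actions is mutex."""
    return all(frozenset((a, b)) in action_mutex
               for a, b in product(supporters_p, supporters_q))

def actions_mutex_by_preconds(pre_a, pre_b, prop_mutex):
    """a and b are mutex if ANY pair of their preconditions is mutex."""
    return any(frozenset((x, y)) in prop_mutex
               for x, y in product(pre_a, pre_b))

# Hypothetical example: p is given by a1 and a2, q is given by b1.
action_mutex = {frozenset(("a1", "b1"))}
one_pair_only = props_mutex(["a1", "a2"], ["b1"], action_mutex)   # (a2, b1) is not mutex

action_mutex.add(frozenset(("a2", "b1")))
all_pairs = props_mutex(["a1", "a2"], ["b1"], action_mutex)       # now every pair is mutex

print(one_pair_only, all_pairs)
```

Note that an action that gives both p and q forms a pair with itself, which is never in the mutex set, so p and q come out non-mutex in that case.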


On Sat, May 9, 2009 at 3:05 PM, Sidharth Gupta <> wrote:
The mutexes shown between variables at level 2 in the planning graph problem in the homework 5 solutions don't quite seem right.... Shouldn't there be mutexes between all the pairs of variables whose actions also have a mutex...? Many seem to be missing, like the one between R and S, whose actions o1 and o2 are also mutex?

Sidharth Gupta

(yet another mail about) Undergraduate research opportunity...


 I just learned that we will be getting a National Science Foundation grant to support some work on stochastic planning.
This is based on the ideas in the  paper

This grant also has a "Research Experiences for Undergraduates" component (through which UG students can take part in research projects, and also get a modest stipend of about 4K/semester ).

If you are interested, let me know. This can start as early as this summer.


Friday, May 8, 2009

Final pep-rally.. (and anxiety amelioration)

Some of you have started wondering whether it makes sense to plan to stand in the interminable security checks in your nice polyester cookers (err graduation gowns) and get into the Obammencement given your worrisome cumulatives in this class.

My suggestion is that you quit worrying and  focus on the final and do that well.

As I said after the mid-term, I am much more interested in torturing (I mean educating)  you during the semester than in haunting your GPA after it.
I have  a lot of respect for students who had the perseverance to stay with what people tell me is a challenging course.

Good luck
  "Not to make-up your minds, but to open them
     to make the agony of decision-making so intense
       that you can escape only by thinking"

Cumulatives for everything other than participation and final..


 Here are the cumulatives for 75% of your grade (the participation credit and the final exam marks are missing).
Note that some of you still have project 4 points missing--this is because the TA has contacted you for your source code and
will complete the grading once it arrives.


Re: Final exam question

Yes. Sorry for the typo.

Sent from my iPod

On May 8, 2009, at 2:38 PM, wrote:

> On Qn VII (1), Given a database D and a fact f, if D does not entail
> p, then D entails ~p.
> Is fact f supposed to be fact p?
> Thanks,
> Cameron

Fwd: CSE Undergraduate Research Scholarship Fall 2009 - deadline today!

Let me know if any of you UG students are interested in being nominated.


---------- Forwarded message ----------
From: Amy Sever <>
Date: Fri, May 8, 2009 at 7:55 AM
Subject: CSE Undergraduate Research Scholarship Fall 2009 - deadline today!

CSE Faculty,


I'm writing to remind you that today is the deadline to nominate a student or students for the CSE Undergraduate Research Scholarship for Fall 2009.  See details and get the application form at:


Please consider supporting one of our top students!





Amy Sever

Assistant Director, Academic Services

School of Computing and Informatics







From: Amy Sever
Sent: Tuesday, April 28, 2009 1:51 PM
To: ''
Cc: Sandra Hoeffer
Subject: CSE Undergraduate Research Scholarship - nominate a student for Fall 2009!


CSE Faculty,

It is time to nominate exceptional students for the CSE Undergraduate Research Scholarship! This program supports strong academic performance among our undergraduate students and encourages interest in graduate studies. Please review the following guidelines.

  1. Faculty must initiate all applications and turn them in to the department. These applications will not be accepted from students. However, if students find any potential opportunities on this site or elsewhere, they can come to you to initiate the application.
  2. Faculty overseeing research must commit $1,000 to student support.
  3. Students can work a maximum of ten hours per week in lab.
  4. The department will award an additional $1,000 to the student based on a competitive process.
  5. Monies will be awarded on a semester basis.
  6. All applications must be received by May 8th for the Fall 2009 term.
  7. All students must meet the qualifications listed below.

Student Qualifications:

  • Grade Point Average of 3.25 or above
  • Must be registered for at least 12 credit hours in the Fall 2009 term
  • Junior/Senior level in program
  • Every student receiving this award is expected to produce a poster at the end of the semester in which they received financial support through the FURI Symposium. Instructions for completing these posters will be provided.

The application can be downloaded at:  Please share as much information as possible about the student you are nominating, as there are a limited number of awards and the selection process is competitive!

Please turn in the application for the Fall 2009 semester to me in the SCI Advising Center, BYENG 208, by May 8th. I've attached a query of students who are juniors and seniors with a 3.25 GPA or higher. Perhaps one of the names is of a promising student from one of your classes!  Please contact me if you know of a student that is not on this list that you believe is a junior or senior in the program.

Here are other ways students can engage in research:

1)      Mentor a student in Fulton Undergraduate Research Initiative (FURI).  Application deadline for Fall 2009 is May 15th.   This is a student-initiated process. See for more information.

2)      Supervise a student in CSE 499 Independent Study: For CS and CSE seniors with a 3.0 GPA or higher in their major area. Form is at

3)      Supervise an honors student in CSE 492 Research and CSE 493 Thesis.  Form is at

4)      Post any available research positions for undergraduates (and graduate students) at:

Thanks for your help in supporting our students to engage in undergraduate research.



Amy Sever

Assistant Director, Academic Services










What is the difference....?

A nice FAQ on the differences between functional, logic and procedural languages....

1. What makes a language functional, logic programming or procedural ?

A procedural or imperative language focuses on telling the computer
what to do, step by step. These are descended from Turing machines
and assembly language. You knew a lot about them before this course.

A logic programming language focuses on computation as constructive
proofs, i.e., proving a term that says that two employees or three
numbers are in a particular relationship. Typically unification and
backtracking are used to help efficiently construct the proofs, but
there are other possible strategies. These languages are descended
from Prolog, and you have been learning about them.

A functional language focuses on computation as the evaluation of
mathematical functions. These are descended from the lambda calculus
and Lisp. I'll say a little more below in answer to your questions.

In a purely functional language, a program simply defines a bunch of
functions, in the mathematical sense of "function." The return value
of a function depends only on its arguments; the function has no side
effects and is not sensitive to side effects, so it returns the same
value every time it's called.

Of course, you can program in this functional style in other
languages, too. But programming in an actual functional language
forces you to learn this style. :-) More important, compilers for a
functional language can take advantage of the guaranteed lack of side
effects to introduce optimizations -- including memoization and lazy
evaluation.

In a really pure functional language, the only thing that happens
within a function is to call other functions -- e.g., you would define
f(x,y) as g(h(x),f(x,minus(y,1))). This is just putting f,g,h, and
minus together by wiring the outputs of some functions into the inputs
of others. Note that minus is presumably a built-in function. So are
conditionals: if(condition, then-value, else-value).

Similarly, in a really pure logic language, the only thing that a
query can do is to combine other queries by unifying their arguments,
which is similar to the way that a function call in a functional
language combines other function calls by wiring their inputs and
outputs together. A query may also have nondeterminism, e.g., there
may be a choice of several ways to answer it (e.g., several clauses
with the same head). This is why constructing a proof may involve
backtracking.

So in both pure functional and pure logic programming languages, there
is no mechanism for modifying values. This rules out side effects,
but it actually rules out more than that, because there's not even a
way to modify values privately within the definition of a function.
Objects can't be modified once they're created. Indeed, there is no
way even to change the value of a variable! (You can introduce a
LOCAL variable as a temporary name for something, but once you've
introduced it, its value will never change (except perhaps for being
specialized through unification), and it goes away once you leave the
scope where the variable was introduced.)

As an example, you can't write loops since you can't change the loop
variable. You use recursion instead. This introduces new local
variables on every recursive call rather than changing the value of
your loop variable. The compiler may secretly turn the recursive
calls back into loops where it can, of course.
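
As a small illustration of recursion standing in for a loop, here is a sketch in Python used in a functional style. The function name `sum_to` is made up for this example.

```python
def sum_to(n, acc=0):
    """Sum 1..n without ever changing a variable.

    Each recursive call binds a fresh n and acc instead of updating a
    loop counter and a running total.
    """
    return acc if n == 0 else sum_to(n - 1, acc + n)

total = sum_to(10)
print(total)
```

CPython does not turn such recursive calls back into loops, so a large n will hit the interpreter's recursion limit; compilers for functional languages typically do perform that optimization.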

As another example, you can't write a destructive function to append
two lists A and B, i.e., changing the last pointer of A to point to
the start of B, because this would be a side effect. You have to
write a function that leaves A and B intact and returns a new list C,
which is basically the list that you would get if you made a copy of A
and changed the last pointer of the copy to point to the start of B.
(There is no need to copy B.)
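
The non-destructive append described above might look like this, assuming a linked list is represented as nested `(head, tail)` tuples with `None` as the empty list. The representation and the names `cons` and `append` are assumptions for illustration.

```python
def cons(head, tail):
    """Build a new list cell; existing cells are never modified."""
    return (head, tail)

def append(a, b):
    """Return a new list holding a's elements followed by b's.

    The cells of a are copied; b is shared as-is, so it is not copied.
    """
    if a is None:
        return b
    head, tail = a
    return cons(head, append(tail, b))

A = cons(1, cons(2, None))
B = cons(3, None)
C = append(A, B)

print(C)            # nested tuples for the list 1, 2, 3
print(C[1][1] is B) # the tail of C is the very same object as B
```

A and B are both left intact, which is safe precisely because nothing can modify the cells they share with C afterwards.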

As another example, you can't easily memoize the result of a function,
because it would be a side effect to store the result in a global
array that would be accessible to future calls of that function. To
avoid handling this as a side effect, you would have to pass the array
as an argument to the function, and the function would return a
modified array that you could pass to the next call of the function.
Some functional or logic languages do have memoization built in,
however -- the lack of side effects means that the memo will remain
valid.
Many functional and logic languages are not pure, though -- they are
extended with some ability to have side effects. In Prolog, there are
"assert" and "retract" predicates which actually modify the program as
it's running (e.g., adding new facts to the database) if you query
them. The original functional language Lisp allows modification of
both local and global variables. Several recently developed
functional languages like OCaml and Haskell have more principled or
careful ways to let side effects into the language in restricted
ways.

Is it ability to use functions as arguments to other functions ?

No, but this is indeed a feature commonly found in functional
languages. That is because the feature is useful, and because those
languages are mainly descended from the lambda calculus, in which
functions are the only objects in the language.

A language "has first-class functions" if functions can be treated
like any other object, e.g.,
* as the argument to another function
* as the return value of another function
* as the value of a variable
* as fair game for type checking (i.e., if there is a strong type
system, then it must distinguish different types of functions, e.g.,
by their input and output types)

A "higher-order function" is a particular function that has other
functions as arguments or return values. A language that has
first-class functions obviously allows higher-order functions.
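
A short sketch of first-class and higher-order functions, written in Python. The helper names `compose`, `double` and `increment` are made up for the example.

```python
def compose(f, g):
    """Higher-order function: takes two functions and returns a new one."""
    def composed(x):
        return f(g(x))
    return composed

def double(x):
    return 2 * x

def increment(x):
    return x + 1

# A function as the value of a variable ...
double_then_increment = compose(increment, double)

# ... and as the argument to another function (the built-in map).
results = list(map(double_then_increment, [1, 2, 3]))
print(results)
```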

If it is then what is python ?

Python is a procedural, object-oriented language that has first-class
functions, recursion, and other things that make it easy to program in
a functional style if you choose not to use any side effects.

2. How important is the concept of immutability to any functional
programming language ?

Immutability means that objects can be created but not subsequently
modified. In pure functional languages, everything is immutable, as
noted above.

3. How is functional programming different from logic programming ?

Logic programming usually has nondeterminism (backtracking) and
unification as built-in features. Few functional programming
languages have this.

Logic programming doesn't have return values; it describes relations
like times(A,B,C) (which is either provable or not) rather than
functions like times(A,B).
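
To see the difference between a relation and a function, here is a rough Python imitation of a times(A,B,C) relation. It enumerates a small finite domain by brute force, which is only a stand-in for the unification and backtracking a real logic language would use; the domain and the convention that `None` means "unbound" are assumptions.

```python
from itertools import product

def times(a=None, b=None, c=None, domain=range(0, 13)):
    """Yield every (A, B, C) with A * B == C that matches the bound arguments.

    A and B range over the domain; an argument left as None is unbound.
    """
    for x, y in product(domain, repeat=2):
        z = x * y
        if a in (None, x) and b in (None, y) and c in (None, z):
            yield (x, y, z)

forward = list(times(a=3, b=4))        # used like a function: what is 3 * 4?
backward = list(times(c=6))            # run "backwards": which pairs multiply to 6?
unprovable = list(times(a=2, b=3, c=7))

print(forward, backward, unprovable)
```

The same definition answers several kinds of query, and an empty result plays the role of "not provable".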

Logic programming generally does *not* have higher-order functions,
since it doesn't have functions. But there are logic programming
languages that do, like lambda-Prolog.

Thursday, May 7, 2009



 Here is the URL for the final exam. Please read the instructions on the front page carefully. You can either return the exam to the front office
or push it under my door.

Please note that giving the final exam as a take-home is a mark of the trust I have in your academic honesty. Please don't give me any reason to regret it.


Final (Word document)
(pdf form)

Wednesday, May 6, 2009

My availability this week


 I should be in my office much of the day tomorrow (Thursday) as well as before noon on Friday.  If you have questions related to the course and exam,
feel free to stop by. If you want to make sure I am in my office before you come, call 965-0113 to confirm.


Solutions for the final homework posted online


Chess game showing which moves the computer is considering

I stumbled across this chess game this morning
Every time you make a move, it shows you what moves the program is considering. The brighter the lines are, the better the move.

On the "About" page, they say this:

The chess engine we built is simple and uses only basic algorithms from the 50s (alpha-beta pruning and quiescence search). The program's unconventional initial moves may raise eyebrows among experts: we did not give it an "opening book" of standard lines since we wanted it to think through every position.

Although I wouldn't recommend using this if you actually wanted to complete an entire game in less than a couple of hours, it's pretty cool seeing all of the moves being analyzed in real time.

Tuesday, May 5, 2009

Bias, generalization and stereotypes: A half-baked lesson in Ethics and computation..


[ACM suggests that some percentage of time in all Undergraduate CS courses should be spent on discussing
ethics. Maybe this will fill that role... At any rate, I have been sending some version of the following since Fall 2003, and I see no reason
to break the tradition this year ;-) ]

We talked a lot about the role of biases in making learning feasible. When the kid jumps to the conclusion that the
whole big thing his mommy is pointing to and crying "BUS" must be the bus, or when you assume that the rabbit-like thing
that jumped into your line of vision, as you stood in the African savannah with a Masai irrationally screaming "GAVAGAI" in your ears, must
be gavagai, you seem to be making both computationally efficient and correct generalizations.

Inductive generalizations are what allow the
organisms with their limited minds to cope with the staggering complexity
of the real world. Faced with novel situations, our ancestors had to
make rapid "fight or flight" decisions, and they had to do biased
learning to get anywhere close to survival.

So, after the wisdom of this class, should we really wear complaints of biases in our behavior
as badges of honor?

Hmm..  Where does this leave us vis-a-vis
stereotypes and racial profiles--of the
 "all Antarciticans are untrustworthy" or "all
Krakatoans are smelly" variety?

After all, they too are instances of
our mind's highly useful ability to induce patterns from limited
samples. How can we legitimately ask our mind not to do the thing it is so darned good at doing?

So, what, if any, is the best computational argument against stereotyping?

One normal argument is that stereotypes may actually be wrong--in
other words, they are actually wrong (non-PAC) generalizations, either
because they are based on selective (non-representative) samples, or
because the learner intentionally chose to ignore training samples
disagreeing with its hypothesis. True, some
stereotypes--e.g., those of the "women can't do math" or "men can't cook" variety--are of this form.

However, this argument alone will not suffice, as it leaves open the
possibility that it is okay to stereotype if the stereotype is
correct. (By correct, we must, of course, mean "probably approximately
correct," since there are few instances where you get metaphysical
certainty of generalization.)

What exactly could be wrong in distrusting a specific Antarcitican because
you have come across a large sample of untrustworthy Antarciticans?

I think one way to see it is perhaps in terms of "cost-based
learning". In these types of scenarios, you, the learning agent, have
a high cost on false negatives--if you missed identifying an
untrustworthy person, or a person who is likely to mug you on a dimly
lit street, or a person who is very likely to be a "bad" employee in
your organization, your success/survival chances slim down.
At the same time, the agent has much less cost on false positives, despite
the fact that the person who is classified falsely positive by your
(negative) stereotype suffers a very large cost. Since the false
positive *is* a member of the society, the society does incur a cost for
your false positives, and we have the classic case of individual good
clashing with societal good.
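
The cost argument can be put in numbers with a small expected-cost sketch. All the costs below are invented for illustration, and the function names are hypothetical; the only point is how the decision threshold moves when the false positive's cost is counted.

```python
def act_on_stereotype(p_bad, cost_false_negative, cost_false_positive):
    """Flag the person iff flagging has the lower expected cost."""
    expected_cost_if_flag = (1 - p_bad) * cost_false_positive
    expected_cost_if_not = p_bad * cost_false_negative
    return expected_cost_if_flag < expected_cost_if_not

def threshold(cost_false_negative, cost_false_positive):
    """Probability above which flagging starts to pay off."""
    return cost_false_positive / (cost_false_positive + cost_false_negative)

# Individual's view: a miss is costly, a false alarm costs the agent little.
individual = threshold(cost_false_negative=100, cost_false_positive=1)

# Society's view: the harm to the wrongly flagged person is counted too.
societal = threshold(cost_false_negative=100, cost_false_positive=100)

print(individual, societal)
```

With these made-up costs the individual would act on a belief that is right only about 1% of the time, while the societal accounting demands 50%, which is one way to picture the clash between individual and societal good.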

This then is the reason civil societies must go the extra mile to
discourage acting on negative stereotypes, so we do not round up all
Antarciticans and put them in bootcamps, or stop all Krakatoans at
airport securities and douse them with Chanel 5. And societies, the
good ones, by and large, do, or at least try to do. The golden rule,
the "better to let a thousand guilty go free than imprison one innocent" maxim, and
the general societal strictures about negative stereotypes are all
measures towards this.

You need good societal laws (economists call these "Mechanism Design")
 precisely when the individual good/instinct clashes with the societal good.

So, you are forced to learn to sometimes avoid acting on the highly
efficient, probably PAC, generalizations that your highly evolved
brain makes. I think.

Yours illuminatingly... ;-)

Epilogue/can skip:

It was a spring night in College Park, Maryland sometime in
1988. Terrapins were doing fine.  The Len Bias incident was slowly
getting forgotten.  It was life as usual at UMD. About the only big
(if a week-old) news was that of a non-caucasian guy assaulting a
couple of women students in parking lots.  I was a graduate student,
and on this particular night I did my obligatory late-evening visit to
my lab to feign the appearance of  some quality work. My lab is towards the edge of the campus;
just a couple more buildings down the Paint Branch Drive, and you get
to the poorly lit open-air parking lots.

On that night I parked my car, walked down the couple of blocks to my
lab, only to remember that I left a book in the car. So, I turned, and
started walking back to the parking lot. As I was walking, I noticed
that this woman walking in front turned a couple of times to look back at me. I remembered
that I had passed her by in the opposite direction. Presently I
noticed her turning into the Cryogenics building, presumably her
lab. As I passed by the cryo lab, however, I saw the woman standing
behind the glass doors of the lab and staring at me.

Somewhere after I took a few more steps it hit me with lightning
force--I was a false positive! The woman was  ducking into
the lab to avoid the possibility that I might be the non-caucasian
male reportedly assaulting campus women. I knew, at a rational level,
that what she was exhibiting is a reasonably rational survival
instinct. But it did precious little to assuage the shock and
diminution I felt (as evidenced by the fact that I still remember the
incident freshly, after these many years.).
There is no better way to assess the cost of false positives than being a false positive
yourself sometime in your life...

Decision regarding final being a take-home or in-class is made (democratically) today in class... up and think about which option you would want.


Heads up: Participation sheet that you will be asked to fill in the class today


 Each of you will be asked to fill in the participation sheet below in the class today. Please get ready with the relevant numbers:


CSE 471/598   Participation Evaluation Sheet



As you recall, the participation credit for this class is measured in terms of attendance, attentiveness and active participation (either through questions in class or via comments on the blog). Please help me evaluate your participation by providing answers to the following.


You will have to return this to me, in hard copy, in the last class.






How many regular classes (i.e., not counting the one makeup class) did you miss:



(Please look at if you have trouble remembering which classes, if any, you missed)



How many of the above classes did you miss with prior notification:


Class participation:


Approximately how many times did you ask a question (or respond to one asked):




Blog participation:



How many times did you post on the blog (not counting times you posted questions asking for clarifications on the homework)?




(I know quantity does not equal quality; but you can count on me to take quality into account ;-). It is the quantity I want help with.)

Monday, May 4, 2009

My digression has a digression--a change-of-heart re: "Gang Leader for a Day"

Paraphrasing the woman in Brazil (who says "my complications started having complications"), I feel compelled to make
a digression about a digression.

I mentioned the book "Gang Leader For a Day" in passing last class, and put it in a positive light. Well, I was half-way through at that time.
Over the weekend, I read the rest of the book and was quite disturbed by several things.

I then looked around on the internet for critical reviews, and found the following which summarizes some of my concerns.

Anyways, just on the off chance that I may have convinced any of you to read the book, I want to take it back. You are
welcome to read it of course, just don't blame me for it.

This *does not* change the point about gamma=0; just the generally positive tone of my description of the book.
Like Columbus, the author seems to take his own (re)discovery of known ideas a tad too seriously.

I know you didn't sign-up for this course to be swayed by  my biases, but I am afraid you might nonetheless, if I am any good at this
teaching thing. So, I felt compelled to send this note.

We now return you to the regularly scheduled homework.


The learning question in hw 4 is not required (as we have not covered the material yet in the class)

Sunday, May 3, 2009

Homework 4 Part 5f-5i

I don't believe we discussed neural networks yet. Are parts 5f-5i still required for this homework?

Reminder: Mandatory blog posting of homework 4 question

Just a reminder that the answer to the following homework 4 question must be posted onto the blog by Tuesday's class.


[Mandatory] [Answer to this question *must* also be posted on the class blog as a comment to my post (see the link below)]. List up to five non-trivial ideas you were able to appreciate during the course of this semester. (These cannot be of the "I thought Bayes Nets were Groovy" variety--and have to include a sentence of justification). (Here is the link to the blog comment section: )

(im)Possibility of an *optional* extra class on "Learning" [Monday 10:30--11:45, BY 210] Free Food Event. [RSVP]


Just to let you know that there will not be an extra class tomorrow (Monday).

 Apparently the recession isn't hitting the ASU students all that hard. Even offers of free food are able to
rustle-up only a couple of students to warm the seats of an extra class.

See you Tuesday.


On Thu, Apr 30, 2009 at 3:45 PM, Subbarao Kambhampati <> wrote:

 I felt a little queasy that I may not have the full opportunity to sufficiently brainwash you with my view of learning.

So, I am offering to run a completely optional extra class on Monday 5/4 morning in BY 210 (10:30--11:45AM) *if* there is
sufficient interest.

As a clarification, nothing I discuss in this class would be included in the final exam, so if you skip it, you won't be in any way
disadvantaged with respect to your course grade.

Also, the Tuesday regular class will not depend on what is discussed on Monday; it will take off from where we left today.

What I want to do is give a bigger picture about learning, that I didn't get to do because of time constraints.

*If* you are interested and will be able to attend, let me know by email.

If I get sufficient interest, I will confirm by Sunday if the meeting is going to be held.

Since I cannot use sticks, I will dangle the carrot of free food if you show up ;-)


Saturday, May 2, 2009

Fwd: [cse471/598 Intro to AI Spring 2009 Blog] Homework 5 alpha-beta

Yes--please show where the cutoffs occur and what the backed-up value and the corresponding move are (as was shown in the examples done in the class).
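
For reference, here is a minimal alpha-beta sketch that reports the three things asked for: the cutoffs, the backed-up value, and the corresponding move. It is not the homework tree; the example tree, the node naming, and the assumption that children are examined left to right in the order drawn are all made up for illustration.

```python
import math

def alphabeta(node, alpha, beta, maximizing, name, cutoffs):
    """Return (backed-up value, index of best child).

    A node is either a number (leaf) or a list of child nodes.
    Each cutoff is recorded as (node name, index of first pruned child).
    """
    if not isinstance(node, list):      # leaf: static evaluation
        return node, None
    best_value = -math.inf if maximizing else math.inf
    best_move = None
    for i, child in enumerate(node):
        value, _ = alphabeta(child, alpha, beta, not maximizing,
                             f"{name}.{i}", cutoffs)
        if maximizing and value > best_value:
            best_value, best_move = value, i
        elif not maximizing and value < best_value:
            best_value, best_move = value, i
        if maximizing:
            alpha = max(alpha, best_value)
        else:
            beta = min(beta, best_value)
        if alpha >= beta:               # remaining children cannot matter
            if i + 1 < len(node):
                cutoffs.append((name, i + 1))
            break
    return best_value, best_move

# Hypothetical tree: MAX root "A" over three MIN nodes with three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
cutoffs = []
value, move = alphabeta(tree, -math.inf, math.inf, True, "A", cutoffs)
print(value, move, cutoffs)
```

On this tree the second MIN node is cut off after its first leaf, because the 2 found there is already no better for MAX than the 3 guaranteed by the first branch.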


---------- Forwarded message ----------
From: Cameron L <>
Date: Sat, May 2, 2009 at 5:05 PM
Subject: [cse471/598 Intro to AI Spring 2009 Blog] Homework 5 alpha-beta

The power point slide doesn't really have any instructions. I'm assuming just show which nodes get pruned and the final value at A. But as we're traversing the tree, do we know that all values on a given level are ordered from least to most as we go left to right?

Posted By Cameron L to cse471/598 Intro to AI Spring 2009 Blog at 5/02/2009 05:01:00 PM

Homework 5 alpha-beta

The power point slide doesn't really have any instructions. I'm assuming just show which nodes get pruned and the final value at A. But as we're traversing the tree, do we know that all values on a given level are ordered from least to most as we go left to right?

Homework 5 MDP 1c

In question 1c of the MDP section of the homework what is meant by synchronous? Does that mean that state 2 depends on the result of state 1, or does it mean that state 1 and state 2 should be evaluated independently.

Friday, May 1, 2009

Is DeepBlue intelligent? Some extra-curricular philosophy

This is a mail I had sent to the class in Fall 2003. As they say
about re-runs, if you haven't seen it, it is *new* for you ;-)


Here is an article that discusses the question of whether Deep Blue--the
Kasparov-beating chess program that we are discussing in the class--is
intelligent.
I send this to you because this is pretty much my bias/position too on
this issue (plus I like Drew McDermott's style--if you ever get a
chance, you should read his paper "Artificial Intelligence meets
Natural Stupidity"--which can be found at -- and was written in the
early days of AI (~1978) to criticize researchers' tendency to
self-delude... which is also related to the AI/Thermos Flask
joke--ask me about it sometime).

Bottom line: Introspection is a lousy way to theorize about thinking.

See the end for a pointer to a different perspective


How Intelligent is Deep Blue?

Drew McDermott

[This is the original, long version of an article that appeared in the
May 14, 1997 New York Times with a more flamboyant title.]

IBM's chess computer, Deep Blue, has shocked the world of chess by
defeating Garry Kasparov in a six-game match. It surprised many in
computer science as well. Last year, after Kasparov's victory against
the previous version, I told the students in my class, ``Introduction
to Artificial Intelligence,'' that it would be many years before
computers could challenge the best humans. Now that I and many others
have been proved wrong, there are a lot of people rushing to assure us
that Deep Blue is not actually intelligent, and that its victory this
year has no bearing on the future of artificial intelligence as such.
I agree that Deep Blue is not actually intelligent, but I think the
usual argument for this conclusion is quite faulty, and shows a basic
misunderstanding of the goals and methods of artificial intelligence.

Deep Blue is unintelligent because it is so narrow. It can win a
chess game, but it can't recognize, much less pick up, a chess piece.
It can't even carry on a conversation about the game it just won.
Since the essence of intelligence would seem to be breadth, or the
ability to react creatively to a wide variety of situations, it's hard
to credit Deep Blue with much intelligence.

However, many commentators are insisting that Deep Blue shows no
intelligence whatsoever, because it doesn't actually ``understand'' a
chess position, but only searches through millions of possible move
sequences ``blindly.'' The fallacy in this argument is the assumption
that intelligent behavior can only be the result of intelligent
cogitation. What the commentators are failing to acknowledge is that
if there ever is a truly intelligent computer, then the computations
it performs will seem as blind as Deep Blue's. If there is ever a
nonvacuous explanation of intelligence, it will explain intelligence
by reference to smaller bits of behavior that are not themselves
intelligent. Presumably *your brain* works because each of its
billions of neurons carry out hundreds of tiny operations per second,
none of which in isolation demonstrates any intelligence at all.

When people express the opinion that human grandmasters do not examine
200,000,000 move sequences per second, I ask them, ``How do you
know?'' The answer is usually that human grandmasters are not *aware*
of searching this number of positions, or *are* aware of searching
many fewer. But almost everything that goes on in our minds we are
unaware of. I tend to agree that grandmasters are not searching the
way Deep Blue does, but whatever they are doing would, if implemented
on a computer, seem equally ``blind.'' Suppose most of their skill
comes from an ability to compare the current position against 10,000
positions they've studied. (There is some evidence that this is at
least partly true.) We call their behavior insightful because they
are unaware of the details; the right position among the 10,000 ``just
occurs to them.'' If a computer does it, the trick will be revealed;
we will see how laboriously it checks the 10,000 positions. Still, if
the unconscious version yields intelligent results, and the explicit
algorithmic version yields essentially the same results, then they
will be intelligent, too.

Another example: Most voice-recognition systems are based on a
mathematical theory called Hidden Markov Models. Consider the
following argument: ``If a computer recognizes words using Hidden
Markov Models, then it doesn't recognize words the way I do. I don't
even know what a Hidden Markov Model is. I simply hear the word and
it sounds familiar to me.'' I hope this argument sounds silly to you.
The truth is that we have no introspective idea how we recognize
spoken words. It is perfectly possible that the synaptic connections
in our brains are describable, at least approximately, by Hidden
Markov Models; if they aren't, then some other equally
counterintuitive model is probably valid. Introspection is a lousy
way to theorize about thinking. There are fascinating questions about
why we are unaware of so much that goes on in our brains, and why our
awareness is the way it is. But we can answer a lot of questions
about thinking before we need to answer questions about awareness.

I hope I am not taken as saying that all the problems of artificial
intelligence have been solved. I am only pointing out one aspect of
what a solution would look like. There are no big breakthroughs on
the horizon, no Grand Unified Theory of Thought. Doing better and
better at chess has been the result of many small improvements (as was
the proof of a novel theorem last year by a computer at Argonne Lab.)
There have been other such developments, such as the
speech-recognition work I referred to earlier, and many results in
computer vision, but few ``breakthroughs.'' As the field has matured,
it has focused more and more on incremental progress, while worrying
less and less about some magic solution to all the problems of
intelligence. A good example is the reaction by AI researchers to
neural nets, which are a kind of parallel computer based on ideas from
neuroscience. Although the press and some philosophers hailed these
as a radical paradigm shift that would solve everything, what has
actually happened is that they have been assimilated into the AI
toolkit as a technique that appears to work some of the time --- just
like Hidden Markov Models, game-tree search, and several other
techniques. Of course, there may be some breakthroughs ahead for the
field, but it is much more satisfying to get by on a diet of solid but
unglamorous results. If we never arrive at a nonvacuous theory of
intelligence, we will no doubt uncover a lot of useful theories of
more limited mental faculties. And we might as well aim for such a theory.

So, what shall we say about Deep Blue? How about: It's a ``little
bit'' intelligent. It knows a tremendous amount about an incredibly
narrow area. I have no doubt that Deep Blue's computations differ in
detail from a human grandmaster's; but then, human grandmasters differ
from each other in many ways. On the other hand, a log of Deep Blue's
computations is perfectly intelligible to chess masters; they speak
the same language, as it were. That's why the IBM team refused to
give game logs to Kasparov during the match; it would be equivalent to
bugging the hotel room where he discussed strategy with his seconds.
Saying Deep Blue doesn't really think about chess is like saying an
airplane doesn't really fly because it doesn't flap its wings.

It's entirely possible that computers will come to seem alive before
they come to seem intelligent. The kind of computing power that fuels
Deep Blue will also fuel sensors, wheels, and grippers that will allow
computers to react physically to things in their environment,
including us. They won't seem intelligent, but we may think of them
as a weird kind of animal --- one that can play a very good game of chess.

For a radically different view point, see

This one is by David Gelernter, who, get this, is a colleague of Drew
McDermott at Yale. On an unrelated note, Gelernter is also one of the
scientists who were targeted by the Unabomber (Ted Kaczynski), and was
seriously injured by a letter bomb--thus the title of his book at the
end of the article.


Thursday, April 30, 2009

Using "evaluation function" learning as a segue into Machine Learning isn't as artificial as it might sound..

So when I used "learning of evaluation functions" as a segue to shift from game trees to learning, it may have looked
like little more than a contrivance of convenience.

It turns out, however, that this has almost biblical importance. One of the first successful computer programs to use machine learning
(and the program that pretty much gave the name "Machine Learning" to the field) is Samuel's Checkers player.  Samuel was an IBM researcher
who developed one of the first checkers-playing programs, which improved its performance by learning evaluation functions.

Here are a few links:  (definitely see this--you will be impressed at how many buzzwords you can understand now ;-) (slightly more technical)


Last chance to don the [Thinking Cap] on MDPs & learning..


1. What happens to MDPs if the underlying dynamics are "deterministic"? Can we still talk about value and policy iteration? Do we still have non-stationary policies for finite horizon deterministic MDPs?
2. We talked about how the infinite horizon case is easier than the finite horizon case, and said, in passing, that here is one case where "infinite" is easier to handle. Consider the following two cases:
2.1. CSPs where the variables are "real valued", and constraints between variables are expressed as linear inequalities. Clearly, the number of potential assignments for the CSP variables is infinite. What do you think will be the complexity of finding a satisfying assignment of the variables? (recall that discrete variable CSP is NP-complete) If the complexity is "low" by AI standards, what can you do to increase it? (hint: consider modifying constraints).

2.2. Planning problem where (some of) the variables are real valued. Actions have effects that can increment and decrement the variables by arbitrary amounts. What do you think will be the complexity of planning in this case? (recall that discrete variable planning is PSPACE-complete, i.e., it is among the hardest problems that can be solved with polynomial space).

3. The MDPs we considered until now are called "fully observable"--in that during execution, the agent can tell which state it is in (thus the policy only needs to map states to actions).  What happens if the domain is only "partially observable"?
Note that this means that the agent may not know which unique state it is in, but knows the "probability distribution" over the possible states it could be in. When the agent does an action, the effect of the action is to modify the distribution into a new distribution over states (with some states being more likely and others less).

Notice that the "distribution" is fully observable although the underlying states aren't.
So, one idea will be to consider the distributions as the states. What happens to the complexity of MDPs? How is this connected to MDPs over "continuous" state spaces?

[Notice that the above still works for the special case of a fully observable domain. The initial distribution will be a delta function--with probability being 1.0 for a single state, and 0.0 for all others. Doing an action converts it into another delta function--with the 1.0 for a different state].
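To make "the action modifies the distribution into a new distribution" concrete, here is a minimal sketch of just that step: pushing a distribution over states through an action's transition model. The three-state chain, the 0.8/0.2 probabilities, and the names `predict`, `NOISY_RIGHT`, and `DETERMINISTIC_RIGHT` are assumptions for illustration. It only covers the effect of the action itself (no sensing), and it does not answer the complexity question.

```python
def predict(belief, transition):
    """Push a distribution over states through one action.

    belief:     dict mapping state -> probability
    transition: dict mapping state -> {next_state: probability} for the action
    """
    new_belief = {s: 0.0 for s in belief}
    for state, p in belief.items():
        for next_state, t in transition[state].items():
            new_belief[next_state] += p * t
    return new_belief

# Hypothetical "move right" action on a chain A -> B -> C.
NOISY_RIGHT = {
    "A": {"B": 0.8, "A": 0.2},   # succeeds with 0.8, slips with 0.2
    "B": {"C": 0.8, "B": 0.2},
    "C": {"C": 1.0},
}
DETERMINISTIC_RIGHT = {"A": {"B": 1.0}, "B": {"C": 1.0}, "C": {"C": 1.0}}

delta_at_a = {"A": 1.0, "B": 0.0, "C": 0.0}

# With a deterministic action, a delta function stays a delta function.
print(predict(delta_at_a, DETERMINISTIC_RIGHT))

# With a noisy action, the probability mass spreads over several states.
once = predict(delta_at_a, NOISY_RIGHT)
twice = predict(once, NOISY_RIGHT)
print(once, twice)
```

Each distinct vector of probabilities is a distinct "state" in this view, which is why the number of such states is not finite even when the underlying state space is.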


Qn 0. [George Costanza qn] Consider two learners that are trying to solve the same classification problem with two classes (+  and -). L1 seems to be averaging about 50% accuracy on the test cases while L2 seems to be averaging 25% accuracy. Which learner is good?  Why is this called the George Costanza question? ;-)
Qn 1. Consider a scenario where the training set examples have been labeled by a slightly drunk teacher--and thus they sometimes have wrong labels (e.g. +ve are wrongly labelled negative etc.).  Of course, for the learning to be doable, the percentage of these mislabelled instances should be quite small.  We have two learners, L1 and L2. L1 seems to be 100% correct on the *training* examples. L2 seems to be 90% correct on the training examples. Which learner is likely to do well on test cases?
Qn 2.  Compression involves using the pattern in the data to reduce the storage requirements of the data.  One way of doing this would be to find the rule underlying the data, and keep the rule and throw the data out. Viewed this way, compression and learning seem one and the same. After all, learning too seems to take the training examples, find a hypothesis ("pattern"/"rule") consistent with the examples, and use that hypothesis instead of the training examples.  What, if any, differences do you see between Compression and Learning?
Qn 3. We said that most human learning happens in the context of prior knowledge. Can we view prior knowledge as a form of bias?
In particular, can you say that our prior knowledge helps us focus on certain hypotheses as against other ones in explaining the data?

that is all for now.

Possibility of an *optional* extra class on "Learning" [Monday 10:30--11:45, BY 210] Free Food Event. [RSVP]


 I felt a little queasy that I may not have the full opportunity to sufficiently brainwash you with my view of learning.

So, I am offering to run a completely optional extra class on Monday 5/4 morning in BY 210 (10:30--11:45AM) *if* there is
sufficient interest.

As a clarification, nothing I discuss in this class would be included in the final exam, so if you skip it, you won't be in any way
disadvantaged with respect to your course grade.

Also, the Tuesday regular class will not depend on what is discussed on Monday; it will take off from where we left today.

What I want to do is give a bigger picture about learning, that I didn't get to do because of time constraints.

*If* you are interested and will be able to attend, let me know by email.

If I get sufficient interest, I will confirm by Sunday if the meeting is going to be held.

Since I cannot use sticks, I will dangle the carrot of free food if you show up ;-)


Clarification on the planning question in Homework 4


We have not done partial order planning and mutual exclusion propagation in class. So you don't need to do
parts 1.d and 1.h in the planning question.


Monday, April 27, 2009

If Chess is gone, why should Jeopardy stay in human sphere?

The New York Times

April 27, 2009

Computer Program to Take On 'Jeopardy!'

YORKTOWN HEIGHTS, N.Y. — This highly successful television quiz show is the latest challenge for artificial intelligence.

What is "Jeopardy"?

That is correct.

I.B.M. plans to announce Monday that it is in the final stages of completing a computer program to compete against human "Jeopardy!" contestants. If the program beats the humans, the field of artificial intelligence will have made a leap forward.

I.B.M. scientists previously devised a chess-playing program to run on a supercomputer called Deep Blue. That program beat the world champion Garry Kasparov in a controversial 1997 match (Mr. Kasparov called the match unfair and secured a draw in a later one against another version of the program).

But chess is a game of limits, with pieces that have clearly defined powers. "Jeopardy!" requires a program with the suppleness to weigh an almost infinite range of relationships and to make subtle comparisons and interpretations. The software must interact with humans on their own terms, and fast.

Indeed, the creators of the system — which the company refers to as Watson, after the I.B.M. founder, Thomas J. Watson Sr. — said they were not yet confident their system would be able to compete successfully on the show, on which human champions typically provide correct responses 85 percent of the time.

"The big goal is to get computers to be able to converse in human terms," said the team leader, David A. Ferrucci, an I.B.M. artificial intelligence researcher. "And we're not there yet."

The team is aiming not at a true thinking machine but at a new class of software that can "understand" human questions and respond to them correctly. Such a program would have enormous economic implications.

Despite more than four decades of experimentation in artificial intelligence, scientists have made only modest progress until now toward building machines that can understand language and interact with humans.

The proposed contest is an effort by I.B.M. to prove that its researchers can make significant technical progress by picking "grand challenges" like its early chess foray. The new bid is based on three years of work by a team that has grown to 20 experts in fields like natural language processing, machine learning and information retrieval.

Under the rules of the match that the company has negotiated with the "Jeopardy!" producers, the computer will not have to emulate all human qualities. It will receive questions as electronic text. The human contestants will both see the text of each question and hear it spoken by the show's host, Alex Trebek.

The computer will respond with a synthesized voice to answer questions and to choose follow-up categories. I.B.M. researchers said they planned to move a Blue Gene supercomputer to Los Angeles for the contest. To approximate the dimensions of the challenge faced by the human contestants, the computer will not be connected to the Internet, but will make its answers based on text that it has "read," or processed and indexed, before the show.

There is some skepticism among researchers in the field about the effort. "To me it seems more like a demonstration than a grand challenge," said Peter Norvig, a computer scientist who is director of research at Google. "This will explore lots of different capabilities, but it won't change the way the field works."

The I.B.M. researchers and "Jeopardy!" producers said they were considering what form their cybercontestant would take and what gender it would assume. One possibility would be to use an animated avatar that would appear on a computer display.

"We've only begun to talk about it," said Harry Friedman, the executive producer of "Jeopardy!" "We all agree that it shouldn't look like Robby the Robot."

Mr. Friedman added that they were also thinking about whom the human contestants should be and were considering inviting Ken Jennings, the "Jeopardy!" contestant who won 74 consecutive times and collected $2.52 million in 2004.

I.B.M. will not reveal precisely how large the system's internal database would be. The actual amount of information could be a significant fraction of the Web now indexed by Google, but artificial intelligence researchers said that having access to more information would not be the most significant key to improving the system's performance.

Eric Nyberg, a computer scientist at Carnegie Mellon University, is collaborating with I.B.M. on research to devise computing systems capable of answering questions that are not limited to specific topics. The real difficulty, Dr. Nyberg said, is not searching a database but getting the computer to understand what it should be searching for.

The system must be able to deal with analogies, puns, double entendres and relationships like size and location, all at lightning speed.

In a demonstration match here at the I.B.M. laboratory against two researchers recently, Watson appeared to be both aggressive and competent, but also made the occasional puzzling blunder.

For example, given the statement, "Bordered by Syria and Israel, this small country is only 135 miles long and 35 miles wide," Watson beat its human competitors by quickly answering, "What is Lebanon?"

Moments later, however, the program stumbled when it decided it had high confidence that a "sheet" was a fruit.

The way to deal with such problems, Dr. Ferrucci said, is to improve the program's ability to understand the way "Jeopardy!" clues are offered. The complexity of the challenge is underscored by the subtlety involved in capturing the exact meaning of a spoken sentence. For example, the sentence "I never said she stole my money" can have seven different meanings depending on which word is stressed.

"We love those sentences," Dr. Nyberg said. "Those are the ones we talk about when we're sitting around having beers after work."


Sunday, April 26, 2009

Added a video for the second alpha-beta example...

I added a short video explaining the second alpha-beta example (right below the audio for the lecture). You might find it useful
if you found the discussion in the (make-up) class too fast to follow.

A second alpha-beta example. Video with narration (150 MB) (PowerPoint slide with animation)


Friday, April 24, 2009

Re: Lecture notes and audio for today's class are available...

If you want a single avi file that has both the moving slides and audio together, and you have a high-speed internet connection, you can try the following link
(Warning: this is a 2 GB avi file..)


On Fri, Apr 24, 2009 at 3:40 PM, Subbarao Kambhampati <> wrote:
The slides are at
 (start at slide 44--which says 4/24  and end at slide 63--which says "4/24 class ended here")

here is the link to the audio

Here is a summary of what was covered

Audio of [Apr 24, 2009] (Make-up for April 28th class). Policy Iteration for MDPs, Finding policies in dynamic domains. Real-time dynamic programming. RTA* as a special case. Going from dynamic to multi-agent domains--where RTDP becomes min-max (or max-max if you are Ned in Simpsons). Discussion of adversarial search. Types of games (and the exciting nature of the deterministic version of Snakes and Ladders ;-). Minimax with depth-first search. Alpha-beta pruning. The effectiveness of alpha-beta pruning.

You now have everything needed to do all parts of homework except the learning ones.

When I come back for the regular class on Thursday, I will tie up a few loose ends on game tree search (~15min)
and start learning  (for which we will cover 18.1-18.5 and then 20.1 and 20.2)


Ps: As I said, the Tuesday class will now be an optional review session led by Will Cushing.

Tuesday class/review

In addition to letting me know if you have any specific requests, I'd
appreciate a generic "I'll be showing up" from everyone who is, in
fact, planning on attending.


P.S. For that matter, if you're *not* planning on attending it would
still be a good idea to let me know. Or if your plans change one way
or another.

Interactive Review Qn added to the homework--post your answer as a comment to this thread on the blog

 I added an extra question to the last homework

 [Mandatory] [Answer to this question *must*  also be posted on the
            class blog as a comment to my post]. List up to five non-trivial ideas you were
            able to appreciate during the course of this
            semester. (These cannot be "I thought Bayes Nets were Groovy"
            variety--and have to include a sentence of

Your answer to this should be posted to the blog as a comment on *this* thread. (Due by the same time--last class)

The collection of your answers will serve as a form of interactive review of the semester.


Optional review class with Will Cushing on Tuesday during class time


 As you know, I won't be here on Tuesday and am making up for that class with a meeting today.
Those of you who cannot make it can use the slides and audio (and maybe video if my recording works).

Regarding Tuesday, since all of you will be available during the class, it seemed to me that we should use it
for a review session (since the semester is ending anyway, and getting a review session with everyone being
available is going to be hard).

Will Cushing--who did the review before the mid-term--is willing to come and sort of pick up where he left off
(at least go over the stuff from homework 3).   I encourage you to make use of it.

Send an email to Will (cc'd here) if you have any questions about specific things you want him to go over.


Final Homework posted; Solutions to last homework posted


 I posted the final homework--it has 5 questions. You are already ready to do the first two. You will be able to do the
first four by the end of today's class. The last question, on learning, will be "due" only if the material gets covered
next Thursday.

I posted the solutions to homework 4; I will post solutions to this homework at the last class (which is May 5th--Tuesday after next).


Wednesday, April 22, 2009

Project 4

Do we have to include a trace for each of the queries of the second domain?
It seems like they get pretty large.

Tuesday, April 21, 2009

(really really) mandatory reading for the next class..

You should make sure to read 17.1--17.3 for the next class.

After that, we will get into 6.1-6.4


Sunday, April 19, 2009

Proj4 task 2 & task 3 questions

Rao or Yunsong,
A) Can you say if the prolog function in task 2 is supposed to include the depth check from task 1, or is that giving too much away?

B) Also, in task 3, it says to modify theorem prover to produce an answer other than T or nil, but then the example shows the prolog function being called. To me it makes more sense to modify the prolog function, rather than theorem prover, but can you clear this up for me?

Var Substitution

When dealing with the variable substitution function in part 2, what is the format for the parameters? In the project write-up it is given as
(varsubst '(parent (? x) (mother-of (? y)))
'( ((? x) Fred) ((? y) Mary)))

But it seems that '(parent (? x) (mother-of (? y))) should be '((parent (? x) (mother-of (? y))))

What is the correct format?

Saturday, April 18, 2009

Thinking Cap Questions: Planning

See if you can crack some of these...

[0.][Don't ask why something is bad. Ask why it is not worse?] We said that regression searches in the space of partial states or "sets" of real world states. We also said that there are
3^n such partial states (where n is the number of state variables). However, given that there are actually 2^n complete states,
there should be 2^{2^n} distinct sets of states. How come regression is searching in only 3^n of them? (Notice 3^n is hugely better
than 2^{2^n}).
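To get a feel for the sizes involved, here is a small sketch that enumerates, for n boolean state variables, the complete states, the partial states (each variable true, false, or unspecified), and the set of complete states each partial state stands for. The function name `count_spaces` and the encoding of "unspecified" as `None` are assumptions for illustration. It only counts; the "why" is still yours to work out.

```python
from itertools import product

def count_spaces(n):
    """Return (#complete states, #distinct sets denoted by partial states,
    #all possible sets of complete states) for n boolean variables."""
    complete_states = list(product([True, False], repeat=n))
    partial_states = list(product([True, False, None], repeat=n))

    def denotes(partial):
        # A partial state stands for every complete state that agrees with it
        # on the variables it actually specifies.
        return frozenset(
            state for state in complete_states
            if all(p is None or p == v for p, v in zip(partial, state))
        )

    distinct_sets = {denotes(p) for p in partial_states}
    return len(complete_states), len(distinct_sets), 2 ** len(complete_states)

for n in (1, 2, 3, 4):
    print(n, count_spaces(n))
```

Already at n = 4 the three numbers are 16, 81, and 65536, so the gap between 3^n and 2^{2^n} opens up very quickly.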

[1] One thing that I left unsaid w.r.t. planning graph based heuristic computation is to how many layers the planning graph should be expanded. Can you see a natural place to stop expanding?
      Also, if you are too lazy to go all that far, can you think of how the heuristic will change if you expanded it a little less far than is needed?

[2] Suppose the actions have differing costs (i.e., they are not all equally costly). Can you think of how planning graph based heuristics will behave in that case? (How does your answer to 1 above change?)

[3] In all the actions we looked at, we assumed that the effects are unconditional. That is, the action will give all its effects once it is executed. Consider modeling the action of driving a bus from Tempe to LA.
The requirements for the driving are that the bus is originally in Tempe (and that there is a driver). The "unconditional" effects of the action are that the bus will be in LA, and the driver also will be in LA.
Now how about anyone else who is in the bus? Suppose Tom is in the bus when it is in Tempe--Tom will be in LA after the action. And yet, Tom being in the bus is not a requirement of the bus taking off. In
these cases, the effect of Tom being in LA is a "conditional effect"--if Tom is in the bus before the action, then Tom will be in LA after the action.  How do you see progression and regression changing
in the presence of conditional effects?


ps: if you are curious about planning graph heuristics, check out

Thursday, April 16, 2009

Note for Project 4

I was having trouble getting lisp's trace function to show any useful information so I talked to the professor. If you're compiling and then executing your lisp files like I was, the compiler can optimize out tail-recursion. So when you trace a recursive function, it looks like it only gets called once.

The solution is to load your lisp file using lisp's load function, like this:
(load "Z:/Documents/CSE471/Project4/Project4.lisp")

It won't get compiled. Now you can trace your recursive function:
(trace my-function)

And then when you call it, you'll see the traced output. Hope that helps someone.

Query regarding Task1 Project 4

Hi All,

I am missing something in my understanding of Task 1. It is mentioned in the project description that we need to implement a depth criterion that will cut off the recursion when it crosses a depth-limit. But it is not clear to me what the value of depth-limit should be. Also, further on, the description hints at "guessing the limit". So does that mean we need to experiment with different depth-limit values?

Please help me out.


Wednesday, April 15, 2009

The expressiveness roller-coaster picture revisited...


 I went ahead and added a slide to yesterday's lecture for the expressiveness roller coaster that we discussed at the beginning of the class yesterday.
I am enclosing a jpeg here..

Comments and/or questions welcome (on the blog)


Thursday, April 9, 2009

Undergraduate research opportunities with my group..

This is meant especially for the undergraduate students in the class who
think they are doing well in the class and enjoying it.

I will likely have funding--starting summer--for supporting  undergraduate students
in AI research with my group (which does work in automated planning--
a topic we will hear a bit about next week). If this sort of thing interests you, get in touch with me.

Recent publications from my group can be found at
(the titles of the papers might give at least some inkling on the sorts of things we do).


Office hours for project 2 and project 3 grading

Dear All, 
    I will be holding office hours throughout the day (from 10am to 5pm) tomorrow for project 3 and project 4 grading. My cubicle is 464AC in the Brickyard building. If 5 points were deducted from your project 2 for only considering one diagonal, you can just go to the TA Yunsong's office to have that corrected. I apologize for this mis-deduction, as I wasn't aware that Professor Rao told you guys one diagonal was ok.  For any other concerns, you are welcome to see me at my office. Best wishes.

Xin Sun
Phd student at Arizona State University
Tempe, AZ, 85281, USA
Tel: 4809659038

In case the (ir)rationality of sqrt(2)^sqrt(2) is bugging you... + Constructive vs. Existential math.

..In case you are dying to know whether sqrt(2)^sqrt(2) is rational or irrational, you can rest assured
that it is irrational (actually transcendental  (*)). So a constructive proof for
our theorem is with p=sqrt(2)^sqrt(2) and q=sqrt(2).


(This also points to a more general and easy-to-understand constructive proof. Consider
  e^{log_e q} for any transcendental number e and rational number q--which will be q. All you need to show is that log_e(q) is irrational, and you can show this easily: if log_e(q) = m/n with integers m and n without common factors, then
q = e^{m/n}. This would mean that e is the root of the algebraic equation x^m - q^n = 0. But the definition of a transcendental number is that it cannot be the root of any algebraic equation!)


(*) By the way, transcendental => irrational but not vice versa. In particular, transcendentals are those irrational numbers that cannot be roots of any algebraic equation (i.e., any polynomial equation with rational coefficients). Two famous examples of course are e and pi.  Notice that proving that a number e *is* transcendental involves showing that e^r for any nonzero rational number r cannot be rational (since if it is, then e will be the root of an algebraic equation). Thus, proving transcendentality is not all that easy.

ps 2:

Check out

for a nice discussion on Constructive vs. Classical mathematics--and how during Hilbert's time there was a pretty big controversy in mathematics--with mathematicians such as Brouwer insisting that all math that depended on existential proofs be thrown out. Papa Hilbert had to come to the rescue--pretty heady stuff.

You might also look at

which also talks about the slick "irrational power irrational can be rational" proof...

*Important*--a fully worked out example for variable elimination..

Several of you have been asking questions on the details of the variable elimination.

The URL here contains a fully worked out version of the example that the textbook has. See if this reduces your confusion:

I am also linking it from the lecture notes.


Wednesday, April 8, 2009

CS101 Robot Program

On my way to the Computer Lab on the second floor, I saw a bunch of students making mini LEGO robots for maze solving. Each robot was equipped with a fixed ultrasonic sensor (which can compute distance from obstacles like walls), a pair of wheels, and a pivot to rotate. The maze was cardboard on an uneven surface, meaning the robots couldn't keep going without slipping or sliding, which changed their orientation and often messed everything up. This was happening a lot.
The memory is sufficient, but not enough to build a map of the maze. The good news is that the uneven surface can only cause the robot to change orientation by a max of 30-40 degrees... which is potentially recoverable... the maze is not too big and time is unlimited.

I know that it was simple DFS using the sensors; distance from the furthest wall would do as the depth measure... When you get to a wall, pivot right and left and spot the next furthest wall...

But there was a very implicit "A*-ish" behaviour used in the best performing design, even when the search was DFS... Can you spot it!!??
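
For anyone who wants the plain DFS baseline in front of them while thinking about this, here is a minimal sketch. It is not the students' robot code: it assumes a known grid (0 = open, 1 = wall) and enough memory for a visited set, which the robots did not have, and the names `dfs_maze` and `maze` are made up for illustration.

```python
def dfs_maze(grid, start, goal):
    """Depth-first search on a grid; returns a list of (row, col) cells or None."""
    rows, cols = len(grid), len(grid[0])
    stack = [(start, [start])]   # each entry: current cell and the path that led to it
    visited = {start}
    while stack:
        (r, c), path = stack.pop()   # LIFO order is what makes this depth-first
        if (r, c) == goal:
            return path
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in visited:
                visited.add((nr, nc))
                stack.append(((nr, nc), path + [(nr, nc)]))
    return None


maze = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
]
print(dfs_maze(maze, (0, 0), (0, 2)))
```

Note that nothing in this sketch prefers one neighbour over another; the order of the four directions is arbitrary.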

Tuesday, April 7, 2009

project 4 released


 Project 4 is now released. It will require you to implement a prolog-style theorem prover starting from a given code base.
It basically involves doing the apartment-pet example discussed in today's class.

Please look at that example as well as the project assignment by next class so I can answer any questions you may have.


Another problem in FOL
In FOL there seems to be a separation between the constants and atomic terms of the logic and the actual statements made over variables. So a natural question comes to mind: when we construct statements like FORALL(x) Happy(x) => Playing(x) and then start putting constants and terms into our knowledge base or model, how do we validate the model that we build? When we did it in propositional logic, we essentially made sure, by using certain refutation and resolution methods, that the knowledge base was consistent. Since we saw in the lecture that there is an exponential blow-up if we do inference, the same would go for model checking too... Another big problem...?
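
One way to make the question concrete is to check the sentence against a small finite model by brute force. The sketch below is only an illustration under assumptions: the domain and the predicate extensions are invented, and the helper `holds_forall_implies` is hypothetical. It tries every constant, which is exactly the step that gets expensive as the domain and the number of quantified variables grow.

```python
# Hypothetical finite model: a set of constants plus the extension of each predicate.
domain = {"alice", "bob", "carol"}
happy = {"alice", "bob"}
playing = {"alice", "bob", "carol"}


def holds_forall_implies(domain, antecedent, consequent):
    """Check FORALL(x) antecedent(x) => consequent(x) by trying every constant."""
    return all((x not in antecedent) or (x in consequent) for x in domain)


# True here: every happy constant is also playing.
print(holds_forall_implies(domain, happy, playing))
# False if some happy constant is not playing.
print(holds_forall_implies(domain, happy, {"alice"}))
```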

Monday, April 6, 2009

Required reading for tomorrow's class (class will cover project 4...!)

Please read 9.1, 9.2, 9.4(*) and 9.5

9.2 (unification) and 9.4--backward chaining--are connected to project 4 that will be released this week. So
it helps for you to be sure of this material.


Friday, April 3, 2009

project 2 grades and current cumulatives


 The project 2 just got graded. I will return them next Tuesday. However, since some of you wanted to
know your project 2 grade before this weekend, I am sending the information about who got how much
(as well as the current cumulatives--out of 51 points until now).

 Please note that I haven't yet had a chance to look at the graded projects and there may be some changes
in the points by the time you get them back on Tuesday.



Thursday, April 2, 2009

Homework 3 released--includes bayes networks and First-order Logic topics

The fopc topics will get covered by next Thursday--the homework is due the Tuesday after that.


Wednesday, April 1, 2009

Part 4 which network to use

In part 4 it says " Modify the bayes network to show this improved understanding of the
domain. Show the topology as well as the .bn representation"

Do we use the network we created in part 1 or part 3? Also, the first entry in the edit menu gives .bif files, not .bn files, is that okay?

Godel and First Order Logic?

With help from wikipedia, here's my summary of Godel's Theorems:

Godel's incompleteness theorems talk about the limitations of formal systems. First, two definitions: A set of axioms is consistent if you can't find a statement such that you can prove the statement and its negation from those axioms. A set of axioms is complete if for any statement in the axioms' language, you can prove it or its negation using those axioms.

So there are actually two theorems. The first one is something like "any system capable of expressing basic arithmetic cannot be both consistent and complete." So, if a consistent system can express basic arithmetic, then there is an arithmetic statement that is true, but not provable in that system.

The second theorem is something like "for any system that can express arithmetic and provability, that system can prove its own consistency if (and only if) it's inconsistent."

This was a big deal when Godel proved this, because mathematicians at the time were trying to rebuild math from the ground up, 100% provable and complete, and he basically showed that it was an impossible task.

So that's my non-mathematician's interpretation.

As far as our class and first order logic, the question becomes, is first order logic sufficiently capable of expressing basic arithmetic such that it must be either consistent or complete, but not both? This matters because we need to know if an agent can reach faulty conclusions because its logic is inconsistent, or if true statements can't be reached because its logic is incomplete, right?

Based on what Rao said in class, Godel's theorem doesn't apply to first order logic. Is it because you can create an infinite number of axioms in first order logic, and thus it's impossible to try and prove a given statement starting from an infinity of axioms? Or is there another reason? I'm not good enough at math to see the answer on my own.

(Also, there's a great book called "Godel, Escher, Bach" that I think everyone in this class would really like).

Tuesday, March 31, 2009

Google develops world's first artificial intelligence tasked-array system...

Research group switches on world's first "artificial intelligence" tasked-array system.

For several years now a small research group has been working on some challenging problems in the areas of neural networking, natural language and autonomous problem-solving. Last fall this group achieved a significant breakthrough: a powerful new technique for solving reinforcement learning problems, resulting in the first functional global-scale neuro-evolutionary learning cluster.

Glow-in-the-Dark Lenny & Watery Squishies

The probability of glow-in-the-dark employees given a meltdown is 0.5, and the probability of glow-in-the-dark employees given Homer is an idiot is 0.05 (P(Homer is an Idiot) = 1.0). Should these combine in a fashion similar to the noisy-or combination of inferior Pu and heavy water, i.e. is P(GitD|MD,Homer) = 0.525 or simply 0.5?
Also, is the same true of Apu's bad Squishie machine, the meltdown, and watery squishies?
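
For reference, here is where the 0.525 figure would come from if the two causes are combined noisy-or style. This is a sketch of the arithmetic only, assuming the causes act independently; it does not settle which combination the assignment intends, and `noisy_or` is just an illustrative helper name.

```python
def noisy_or(cause_probs):
    """P(effect | all listed causes present) under a noisy-or combination.

    Each entry is the probability that one cause alone produces the effect;
    the effect is absent only if every cause independently fails to produce it.
    """
    p_none = 1.0
    for p in cause_probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none


# Meltdown alone gives 0.5, Homer being an idiot alone gives 0.05.
# 1 - (1 - 0.5) * (1 - 0.05) = 1 - 0.475 = 0.525
print(noisy_or([0.5, 0.05]))
```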

mandatory Readings for next class:

For First order logic, we will cover

8.1, 8.2, 9.1, 9.2, 9.4 and 9.5

read at least the first three sections before coming to class on Thursday


Monday, March 30, 2009

(at-home) optional midterm marks


 Here are the marks on the at-home version of the mid-term. I juxtaposed them with
the in-class one just so you have an idea.  Note that the marks are not completely comparable
since the at-home one is graded by Yunsong.



Thursday, March 26, 2009

An article about the Loebner Prize Turing Test Competition

Re: clarification?

On Thu, Mar 26, 2009 at 12:21 PM, student wrote
Hi professor,
Could you help me get a breakdown of the marks allotted to homeworks and projects and exams? I think you mentioned in class today that the homework is about 5 points each and that the exam is worth 20 points. Could you also tell me approximately how many projects are left before the finals and how much the final would be worth?
Thank you so much!

The "default", as I announced at the beginning, is  20pts for all homeworks together; 35-40 for projects, and 40-45  for exams and 5-10 points for class/blog participation.

homeworks are typically worth 5pts each

projects are, on the average, worth 10 points each (bayes net project doesn't
involve coding and will probably have a little less weight than 10)

After the bayes net project, we will either have at least one more longer project (more likely; will be on first order logic) or
two B.N. like coding-light projects; haven't quite decided.

Hope this helps.


ps: I am broadcasting my answer as others might also be interested in it

Wednesday, March 25, 2009

Current cumulatives


 With two homeworks, project 0, project 1 and the mid-term graded, here are the current cumulatives.
Note that I scaled project 0 to be 1 percentage point, each homework to be 5 percentage points,
project 1 to be 10 percentage points and the midterm 20 percentage points--so you have completed 41 points

The "percentage" column just shows your score (over 41 points) as a percentage.

Note that this is meant to be a rough estimate. The exact weightings for the projects/homeworks/midterm may change.
We will also make room for class/blog participation grade. Nevertheless, you can get a feel for how you are doing right now.
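
If you want to redo the arithmetic yourself, the scaling described above can be sketched as follows. The weights are the ones stated in this post; the function name and the sample scores are made up for illustration, and the real weightings may still change.

```python
# Weights (in percentage points) as stated in the post: 1 + 5 + 5 + 10 + 20 = 41.
weights = {
    "project 0": 1,
    "homework 1": 5,
    "homework 2": 5,
    "project 1": 10,
    "midterm": 20,
}
total_points = sum(weights.values())


def cumulative_percentage(fractions, weights):
    """fractions maps each item to the fraction of its marks earned (0.0 to 1.0)."""
    earned = sum(weights[item] * fractions[item] for item in weights)
    return 100.0 * earned / sum(weights.values())


# Hypothetical student: full marks on everything except half marks on the midterm.
sample = {"project 0": 1.0, "homework 1": 1.0, "homework 2": 1.0,
          "project 1": 1.0, "midterm": 0.5}
print(total_points)
print(cumulative_percentage(sample, weights))
```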



URL for midterm solutions

Here are the solutions for the midterm. Feel free to question them in the blog...


Midterm grade distribution...

Here are the stats for midterm:

 Average: 43.75;  Standard Deviation: 16
 Max: 69.5    Min: 15.5
     70-80: 0 
     60-70: 6
     50-60: 1
     40-50: 10
     30-40: 1
     20-30: 5
     10-20: 1

 Average: 55.3   Standard Dev: 18.35
 Max: 76.5   Min: 21.5
   70-80: 4
   60-70: 0
   50-60: 3
   40-50: 2
   30-40: 1
   20-30: 1

I will distribute the graded midterms in the class tomorrow.


Tuesday, March 24, 2009

Status of midterm grading.. (for the in-class version)


 I am about three-quarters way through grading the midterm. I expect to post the grades and bring the exams for distribution next class.

If any of you need to know your midterm grade earlier than that, please do let me know by email and I will let you know your grade asap.


Mandatory reading for next class: sections 14.4 and 14.5 in the textbook


 Please make sure to read sections 14.4 and 14.5 before coming to the class on Thursday. It is only 12 pages
and even a cursory reading will significantly increase your chances of following the lecture.


Saturday, March 21, 2009

Another cool course

This is what I found is being taught at Johns Hopkins at the moment... Very cool course structure...

Thursday, March 19, 2009

An exam so nice, some do it twice... or the low-down on the at-home version of the in-class exam...

The at-home-version-of-the-in-class-exam (ahvotice) is a pedagogical innovation
next only to the socket-open-socket-close-homework-assignments and blunt-force-trauma-causing
thinking-cap questions. Here is the standard FAQ on ahvotice

0. What are the ground rules for doing this--

Only that (a) you work independently and (b) you
submit it at the beginning of the class
on Tuesday 3/24

1. Can I just do the parts that I thought I didn't do well in the in-class version?

No. The at-home and in-class versions are graded as full papers.

2. Do I lose anything if I don't do it at home?

No (okay--you do lose the satisfaction of doing it twice;-). Your
in-class grade will stand.

3. How is the effective midterm grade computed?

Eff = max( in-class; w*in-class+(1-w)*at-home )

4. What is the range of w?

0.5 < w <1

(typical values in the past ranged between .6 and .666)

5. But if everyone else does it at home and improves their grade, and
I decide to watch Simpsons/Seinfeld reruns instead, don't I lose out?

No. First of all, *nobody* ever loses out by watching reruns of Simpsons (Channel 6, weeknights 10 & 10:30) and
Seinfeld (Channel 10; week nights 10:30 and again at 11:30).

The difference between your in-class score and the Eff score will be
considered as your _extra credit_ on the mid term (and thus those
points won't affect grade cutoffs).

6. How do you devise these ludicrously complex schemes?

Well, I had a relaxing spring break ;-)

7. Okay. I have no life outside of this course anyways. Tell me where I can find the exam?
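
Going back to questions 3 to 5 of the FAQ, the effective-grade rule can be sketched as below. The formula and the range of w are the ones stated above; the function names and the sample scores are only illustrative, and the actual w is not fixed here.

```python
def effective_midterm(in_class, at_home, w):
    """Eff = max(in-class, w*in-class + (1-w)*at-home), with 0.5 < w < 1."""
    if not 0.5 < w < 1:
        raise ValueError("w must satisfy 0.5 < w < 1")
    return max(in_class, w * in_class + (1 - w) * at_home)


def extra_credit(in_class, at_home, w):
    """The part of Eff above the in-class score, treated as extra credit."""
    return effective_midterm(in_class, at_home, w) - in_class


# A better at-home score pulls the effective grade up part of the way.
print(effective_midterm(40, 60, 0.6))
# A weaker at-home score never lowers it: the in-class grade stands.
print(effective_midterm(40, 30, 0.6))
```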



Thinking Cap qns on Bayes Networks...

0. In class, we seemed to convince ourselves that the CPT entries don't have to add up to 1. Suppose you have a boolean node with m boolean parents. What is the maximum value of the sum of CPT entries? When does it happen?
1. You have been given the topology of a bayes network, but haven't yet gotten the conditional probability tables
    (to be concrete, you may think of the Pearl alarm-earthquake scenario bayes net).
    Your friend shows up and says he has the joint distribution all ready for you. You don't quite trust your
    friend and think he is making these numbers up. Is there any way you can prove that your friend's joint
    distribution is not correct?

2. Continuing with bad friends: in the question above, suppose a second friend comes along and says that he can give you
   the conditional probabilities that you want to complete the specification of your bayes net. You ask him a CPT entry,
   and pat comes a response--some number between 0 and 1. This friend is well meaning, but you are worried that the
   numbers he is giving may lead to some sort of inconsistent joint probability distribution. After all, your friend is a bayesian and is making up his *personal* probabilities that may not have any interpretation from a frequency point of view. Is your worry justified (i.e., can your
   friend give you numbers that can lead to an inconsistency?)

  (To understand "inconsistency", consider someone who insists on giving you P(A), P(B), P(A&B) as well as P(AVB)  and they
wind up not satisfying P(AVB) = P(A)+P(B)-P(A&B)
[or alternately, they insist on giving you P(A|B), P(B|A), P(A) and P(B), and the four numbers don't satisfy the bayes rule].)

3. Your other friend (okay--your social life is full of geeks ever since you started taking this course) heard your claims that Bayes Nets can represent any possible conditional independence assertions exactly. She comes to you
and says she has four random variables, X, Y, W and Z, and only TWO conditional independence assertions:

X .ind. Y |  {W,Z}
W .ind. Z  |  {X, Y}

She dares you to give her a bayes network topology on these four nodes that exactly represents these and only these conditional independencies.
Can you? (Note that you only need to look at 4 vertex directed graphs).
4. If your answer to 3 above is going to be "No", how serious an issue do you think this is? In particular, suppose your domain has exactly the set A of conditional independencies. You have two bayes network configurations B1 and B2. CIA(B1) is a superset of
A and CIA(B2) is a subset of A.   Clearly, neither B1 nor B2 exactly represents what you know about the domain. If you have to choose one to model the domain, what are the tradeoffs in choosing B1 vs. B2?
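
To make the notion of "inconsistency" from question 2 concrete, here is a small sketch that tests the two identities mentioned there. It only checks numbers someone hands you directly; it does not answer whether CPT entries for a bayes net can ever be inconsistent. The function names and the tolerance are assumptions made for illustration.

```python
import math


def union_consistent(p_a, p_b, p_a_and_b, p_a_or_b, tol=1e-9):
    """Check P(AVB) = P(A) + P(B) - P(A&B), up to a small tolerance."""
    return math.isclose(p_a_or_b, p_a + p_b - p_a_and_b, abs_tol=tol)


def bayes_consistent(p_a_given_b, p_b_given_a, p_a, p_b, tol=1e-9):
    """Check bayes rule in product form: P(A|B) * P(B) = P(B|A) * P(A)."""
    return math.isclose(p_a_given_b * p_b, p_b_given_a * p_a, abs_tol=tol)


# Consistent: 0.5 + 0.4 - 0.2 = 0.7
print(union_consistent(0.5, 0.4, 0.2, 0.7))
# Inconsistent: someone insists that P(AVB) is 0.9 instead.
print(union_consistent(0.5, 0.4, 0.2, 0.9))
```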