Thursday, April 30, 2009
Using "evaluation function" learning as a segue into Machine Learning isn't as artificial as it might sound; it is much more than a contrivance of convenience.
It turns out that this has almost biblical importance. One of the first successful computer programs to use machine learning
(and the program that pretty much gave the name "Machine Learning" to the field) is Samuel's checkers player. Samuel was an IBM researcher
who developed one of the first checkers-playing programs, which learned to improve its performance by learning evaluation functions.
Here are a few links:
http://en.wikipedia.org/wiki/Arthur_Samuel#Computer_checkers_.28draughts.29_development (definitely see this--you will be impressed at how many buzzwords you can understand now ;-)
http://www.ece.eps.hw.ac.uk/~pym1/private/docs/sutton/node109.html (slightly more technical)
cheers
rao
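Samuel's basic trick can be sketched in a few lines: represent the board evaluation as a weighted sum of features, and nudge the weights toward the value that a deeper lookahead (or the eventual game outcome) reports for the same position. A minimal sketch in Python--the features, numbers, and update rule here are illustrative stand-ins, not Samuel's exact procedure:

```python
# Sketch: learning a linear evaluation function, Samuel-style.
# The evaluation is a weighted sum of hand-picked board features;
# the weights are nudged toward a better estimate of the position's
# value (e.g., what a deeper search reports). All numbers are made up.

def evaluate(weights, features):
    """Linear evaluation: dot product of weights and feature values."""
    return sum(w * f for w, f in zip(weights, features))

def update(weights, features, target, alpha=0.1):
    """Move the weights so evaluate() gets closer to `target`."""
    error = target - evaluate(weights, features)
    return [w + alpha * error * f for w, f in zip(weights, features)]

# Hypothetical checkers features: piece advantage, king advantage, mobility.
weights = [0.0, 0.0, 0.0]
features = [2.0, 1.0, 3.0]
deeper_estimate = 1.5   # value a deeper search reports for this position

for _ in range(100):
    weights = update(weights, features, deeper_estimate)

print(round(evaluate(weights, features), 3))  # converges toward 1.5
```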
Last chance to don the [Thinking Cap] on MDPs & learning..
2.2. Planning problem where (some of) the variables are real valued. Actions have effects that can increment and decrement the variables by arbitrary amounts. What do you think will be the complexity of planning in this case? (recall that discrete variable planning is PSPACE-complete, i.e., it is among the hardest problems that can be solved with polynomial space).
3. The MDPs we considered until now are called "fully observable"--in that during execution, the agent can tell which state it is in (thus the policy needs to only map states to actions). What happens if the domain is only "partially observable"?
Note that this means that the agent may not know which unique state it is in, but knows the "probability distribution" over the possible states it could be in. When the agent does an action, the effect of the action is to modify this distribution into a new distribution over states (with some states becoming more likely and others less so).
Notice that the "distribution" is fully observable although the underlying states aren't.
So, one idea will be to consider the distributions as the states. What happens to the complexity of MDPs? How is this connected to MDPs over "continuous" state spaces?
[Notice that the above still works for the special case of a fully observable domain. The initial distribution will be a delta function--with probability being 1.0 for a single state, and 0.0 for all others. Doing an action converts it into another delta function--with the 1.0 for a different state].
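To make the "distributions as states" idea concrete: a belief is just a probability distribution over the underlying states, and doing an action pushes it through the action's transition probabilities. A small sketch (the states and transition numbers are invented):

```python
# Sketch: acting on a belief state (a probability distribution over states).
# T[s][s2] = probability of ending in s2 when the action is done in s.
# These transition numbers are invented for illustration; a deterministic
# action would map a delta-function belief to another delta function.

def act(belief, T):
    """Push a belief (dict: state -> prob) through a stochastic action."""
    new_belief = {s2: 0.0 for s2 in T[next(iter(T))]}
    for s, p in belief.items():
        for s2, pt in T[s].items():
            new_belief[s2] += p * pt
    return new_belief

T = {
    "s1": {"s1": 0.2, "s2": 0.8},
    "s2": {"s1": 0.0, "s2": 1.0},
}

b0 = {"s1": 1.0, "s2": 0.0}   # delta-function belief (fully observable case)
b1 = act(b0, T)               # {'s1': 0.2, 's2': 0.8} -- no longer a delta
b2 = act(b1, T)
print(b1, b2)
```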
*****************Learning
Qn 0. [George Costanza qn] Consider two learners that are trying to solve the same classification problem with two classes (+ and -). L1 seems to be averaging about 50% accuracy on the test cases while L2 seems to be averaging 25% accuracy. Which learner is better? Why is this called the George Costanza question? ;-)
Possibility of an *optional* extra class on "Learning" [Monday 10:30--11:45, BY 210] Free Food Event. [RSVP]
I felt a little queasy that I may not have the full opportunity to sufficiently brainwash you with my view of learning.
So, I am offering to run a completely optional extra class on Monday 5/4 morning in BY 210 (10:30--11:45AM) *if* there is
sufficient interest.
As a clarification, nothing I discuss in this class will be included in the final exam, so if you skip it, you won't be in any way
disadvantaged with respect to your course grade.
Also, the Tuesday regular class will not depend on what is discussed on Monday; it will take off from where we left today.
What I want to do is give the bigger picture of learning that I didn't get to present because of time constraints.
*if* you are interested and will be able to attend, let me know by email.*************
If I get sufficient interest, I will confirm by Sunday if the meeting is going to be held.
Since I cannot use sticks, I will dangle the carrot of free food if you show up ;-)
cheers
Rao
Clarification on the planning question in Homework 4
We have not done partial order planning and mutual exclusion propagation in the class. So you don't need to do
parts 1.d and 1.h in the planning question.
regards
rao
Wednesday, April 29, 2009
Monday, April 27, 2009
If Chess is gone, why should Jeopardy stay in the human sphere?
Computer Program to Take On 'Jeopardy!'
YORKTOWN HEIGHTS, N.Y. — This highly successful television quiz show is the latest challenge for artificial intelligence.
What is "Jeopardy"?
That is correct.
I.B.M. plans to announce Monday that it is in the final stages of completing a computer program to compete against human "Jeopardy!" contestants. If the program beats the humans, the field of artificial intelligence will have made a leap forward.
I.B.M. scientists previously devised a chess-playing program to run on a supercomputer called Deep Blue. That program beat the world champion Garry Kasparov in a controversial 1997 match (Mr. Kasparov called the match unfair and secured a draw in a later one against another version of the program).
But chess is a game of limits, with pieces that have clearly defined powers. "Jeopardy!" requires a program with the suppleness to weigh an almost infinite range of relationships and to make subtle comparisons and interpretations. The software must interact with humans on their own terms, and fast.
Indeed, the creators of the system — which the company refers to as Watson, after the I.B.M. founder, Thomas J. Watson Sr. — said they were not yet confident their system would be able to compete successfully on the show, on which human champions typically provide correct responses 85 percent of the time.
"The big goal is to get computers to be able to converse in human terms," said the team leader, David A. Ferrucci, an I.B.M. artificial intelligence researcher. "And we're not there yet."
The team is aiming not at a true thinking machine but at a new class of software that can "understand" human questions and respond to them correctly. Such a program would have enormous economic implications.
Despite more than four decades of experimentation in artificial intelligence, scientists have made only modest progress until now toward building machines that can understand language and interact with humans.
The proposed contest is an effort by I.B.M. to prove that its researchers can make significant technical progress by picking "grand challenges" like its early chess foray. The new bid is based on three years of work by a team that has grown to 20 experts in fields like natural language processing, machine learning and information retrieval.
Under the rules of the match that the company has negotiated with the "Jeopardy!" producers, the computer will not have to emulate all human qualities. It will receive questions as electronic text. The human contestants will both see the text of each question and hear it spoken by the show's host, Alex Trebek.
The computer will respond with a synthesized voice to answer questions and to choose follow-up categories. I.B.M. researchers said they planned to move a Blue Gene supercomputer to Los Angeles for the contest. To approximate the dimensions of the challenge faced by the human contestants, the computer will not be connected to the Internet, but will make its answers based on text that it has "read," or processed and indexed, before the show.
There is some skepticism among researchers in the field about the effort. "To me it seems more like a demonstration than a grand challenge," said Peter Norvig, a computer scientist who is director of research at Google. "This will explore lots of different capabilities, but it won't change the way the field works."
The I.B.M. researchers and "Jeopardy!" producers said they were considering what form their cybercontestant would take and what gender it would assume. One possibility would be to use an animated avatar that would appear on a computer display.
"We've only begun to talk about it," said Harry Friedman, the executive producer of "Jeopardy!" "We all agree that it shouldn't look like Robby the Robot."
Mr. Friedman added that they were also thinking about whom the human contestants should be and were considering inviting Ken Jennings, the "Jeopardy!" contestant who won 74 consecutive times and collected $2.52 million in 2004.
I.B.M. will not reveal precisely how large the system's internal database would be. The actual amount of information could be a significant fraction of the Web now indexed by Google, but artificial intelligence researchers said that having access to more information would not be the most significant key to improving the system's performance.
Eric Nyberg, a computer scientist at Carnegie Mellon University, is collaborating with I.B.M. on research to devise computing systems capable of answering questions that are not limited to specific topics. The real difficulty, Dr. Nyberg said, is not searching a database but getting the computer to understand what it should be searching for.
The system must be able to deal with analogies, puns, double entendres and relationships like size and location, all at lightning speed.
In a demonstration match here at the I.B.M. laboratory against two researchers recently, Watson appeared to be both aggressive and competent, but also made the occasional puzzling blunder.
For example, given the statement, "Bordered by Syria and Israel, this small country is only 135 miles long and 35 miles wide," Watson beat its human competitors by quickly answering, "What is Lebanon?"
Moments later, however, the program stumbled when it decided it had high confidence that a "sheet" was a fruit.
The way to deal with such problems, Dr. Ferrucci said, is to improve the program's ability to understand the way "Jeopardy!" clues are offered. The complexity of the challenge is underscored by the subtlety involved in capturing the exact meaning of a spoken sentence. For example, the sentence "I never said she stole my money" can have seven different meanings depending on which word is stressed.
"We love those sentences," Dr. Nyberg said. "Those are the ones we talk about when we're sitting around having beers after work."
-----------
Sunday, April 26, 2009
Added a video for the second alpha-beta example...
if you found the discussion in the (make-up)class too fast to follow.
A second alpha-beta example: video with narration (150mb) (PowerPoint slide (with animation))
Rao
Friday, April 24, 2009
Re: Lecture notes and audio for today's class are available...
(Warning: this is a 2gig avi file..)
http://www.public.asu.edu/~subbarao/ai-class-4-24.avi
rao
The slides are at
http://rakaposhi.eas.asu.edu/cse471/week13-s09.ppt
(start at slide 44--which says 4/24 and end at slide 63--which says "4/24 class ended here")
here is the link to the audio
http://rakaposhi.eas.asu.edu/cse471/cse471-s09-week14.1.WAV
Here is a summary of what was done
Audio of [Apr 24, 2009] (Make-up for April 28th class). Policy iteration for MDPs. Finding policies in dynamic domains. Real-time dynamic programming. RTA* as a special case. Going from dynamic to multi-agent domains--where RTDP becomes min-max (or max-max if you are Ned in The Simpsons). Discussion of adversarial search. Types of games (and the exciting nature of the deterministic version of Snakes and Ladders ;-). Minimax with depth-first search. Alpha-beta pruning. The effectiveness of alpha-beta pruning.
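The minimax-with-depth-first-search and alpha-beta pruning items from the summary can be sketched compactly. The game tree below is a made-up example in the standard textbook style, not anything from the lecture slides:

```python
# Sketch: depth-first minimax with alpha-beta pruning on an explicit tree.
# A tree is either a number (leaf utility for MAX) or a list of subtrees.

def alphabeta(node, maximizing=True, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(node, list):       # leaf: return its utility
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:            # MIN above will never allow this
                break                    # -> prune remaining children
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:            # MAX above will never allow this
                break
        return value

# MAX root over three MIN nodes; the minimax value is 3, and the second
# subtree gets cut off after its first leaf.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # 3
```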
You now have everything needed to do all parts of the homework except the learning ones.
When I come back for the regular class on Thursday, I will tie up a few loose ends on game tree search (~15min)
and start learning (for which we will cover 18.1-18.5 and then 20.1 and 20.2)
rao
Ps: As I said, the Tuesday class will now be an optional review session led by Will Cushing.
Tuesday class/review
I'd appreciate a generic "I'll be showing up" from everyone who is, in
fact, planning on attending.
-Will
P.S. For that matter, if you're *not* planning on attending it would
still be a good idea to let me know. Or if your plans change one way
or another.
Lecture notes and audio for today's class are available...
Interactive Review Qn added to the homework--post your answer as a comment to this thread on the blog
I added an extra question to the last homework
===========
[Mandatory] [The answer to this question *must* also be posted on the
class blog as a comment to my post]. List up to five non-trivial ideas you were
able to appreciate during the course of this
semester. (These cannot be of the "I thought Bayes Nets were Groovy"
variety--and have to include a sentence of
justification).
=============
Your answer to this should be posted to the blog as a comment on *this* thread. (Due by the same time--last class)
The collection of your answers will serve as a form of interactive review of the semester.
Rao
Optional review class with Will Cushing on Tuesday during class time
As you know, I won't be here on Tuesday and am making up for that class with a meeting today.
Those of you who cannot make it can use the slides and audio (and maybe video if my recording works).
Regarding Tuesday, since all of you will be available during the class, it seemed to me that we should use it
for a review session (since the semester is ending anyway, and getting a review session with everyone being
available is going to be hard).
Will Cushing--who did the review before the mid-term--is willing to come and pick up where he left off
(at least go over the stuff from homework 3). I encourage you to make use of it.
Send an email to Will (cc'd here) if you have any questions about specific things you want him to go over.
Rao
Final Homework posted; Solutions to last homework posted
I posted the final homework--it has 5 questions. You are already ready to do the first two. You will be able to do the
first four by the end of today's class. The last question, on learning, will be "due" only if the material gets covered
by next Thursday.
I posted the solutions to homework 4; will post solutions to this homework on the last class (which is May 5th--Tuesday after next).
Rao
Wednesday, April 22, 2009
Project 4
It seems like they get pretty large
Tuesday, April 21, 2009
(really really) mandatory reading for the next class..
After that, we will get into 6.1-6.4
rao
Sunday, April 19, 2009
Proj4 task 2 & task 3 questions
Var Substitution
(varsubst '(parent (? x) (mother-of (? y)))
'( ((? x) Fred) ((? y) Mary)))
But it seems that '(parent (? x) (mother-of (? y))) should be '((parent (? x) (mother-of (? y))))
What is the correct format?
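While waiting for the official word on the argument format, the operation itself is simple to state: walk the term tree and replace each variable by its binding. A rough Python analogue of varsubst (the representation and names here are mine, not the project code's Lisp conventions):

```python
# Rough analogue of the project's varsubst, in Python for illustration.
# Terms are nested tuples; a variable is ("?", name); bindings is a dict
# from variable to replacement term. This mirrors the idea only, not the
# project's exact argument format.

def is_var(term):
    return isinstance(term, tuple) and len(term) == 2 and term[0] == "?"

def varsubst(term, bindings):
    if is_var(term):
        return bindings.get(term, term)      # unbound vars pass through
    if isinstance(term, tuple):
        return tuple(varsubst(t, bindings) for t in term)
    return term

query = ("parent", ("?", "x"), ("mother-of", ("?", "y")))
bindings = {("?", "x"): "Fred", ("?", "y"): "Mary"}
print(varsubst(query, bindings))
# ('parent', 'Fred', ('mother-of', 'Mary'))
```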
Saturday, April 18, 2009
Thinking Cap Questions: Planning
[0.][Don't ask why something is bad. Ask why it is not worse?] We said that regression searches in the space of partial states, or "sets" of real-world states. We also said that there are
3^n such partial states (where n is the number of state variables). However, given that there are actually 2^n complete states,
there should be 2^{2^n} distinct sets of states. How come regression is searching in only 3^n of them? (Notice that 3^n is hugely better
than 2^{2^n}).
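The 3^n vs. 2^{2^n} counting in question [0] can be sanity-checked by brute force for tiny n: each variable in a partial state is true, false, or unspecified, whereas an arbitrary set of states has no such per-variable structure. A quick check in Python for n = 2:

```python
# Sanity check of the counting argument for n = 2 state variables.
from itertools import product, chain, combinations

n = 2
states = list(product([True, False], repeat=n))       # 2^n complete states

# Partial states: each variable is True, False, or unspecified (None).
partial_states = list(product([True, False, None], repeat=n))
assert len(partial_states) == 3 ** n                  # 9

# Arbitrary sets of states: the full powerset of the state space.
subsets = list(chain.from_iterable(
    combinations(states, k) for k in range(len(states) + 1)))
assert len(subsets) == 2 ** (2 ** n)                  # 16

print(len(partial_states), len(subsets))  # 9 16
```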
[1] One thing that I left unsaid w.r.t. planning-graph-based heuristic computation is how many layers the planning graph should be expanded. Can you see a natural place to stop expanding?
Also, if you are too lazy to go all that far, can you think of how the heuristic will change if you expand it a little less far than is needed?
[2]Suppose the actions have differing costs (i.e., they are not all equally costly). Can you think of how planning graph based heuristics will behave in that case? (how does your answer to 1 above change?)
[3] In all the actions we looked at, we assumed that the effects are unconditional. That is, the action gives all its effects once it is executed. Consider modeling the action of driving a bus from Tempe to LA.
The requirements for the driving are that the bus is originally in Tempe (and that there is a driver). The "unconditional" effects of the action are that the bus will be in LA, and the driver will also be in LA.
Now how about anyone else who is in the bus? Suppose Tom is in the bus when it is in Tempe--Tom will be in LA after the action. And yet, Tom being in the bus is not a requirement of the bus taking off. In
these cases, the effect of Tom being in LA is a "conditional effect"--if Tom is in the bus before the action, then Tom will be in LA after the action. How do you see progression and regression changing
in the presence of conditional effects?
Rao
ps: if you are curious about planning graph heuristics, check out
http://rakaposhi.eas.asu.edu/pgSurvey.pdf
and
http://rakaposhi.eas.asu.edu/pg-tutorial
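On question [3], one way to picture the change to progression: an action carries, besides its unconditional effects, (condition, effect) pairs, and progression fires exactly those pairs whose conditions hold in the current state. A toy sketch of the bus example (the proposition names are mine):

```python
# Sketch: progression with conditional effects, using the bus example.
# A state is a set of true propositions. The drive action unconditionally
# moves the bus (and driver) to LA, and conditionally moves Tom if he
# happens to be in the bus. Proposition names are made up.

drive = {
    "preconds": {"bus-in-tempe"},
    "effects_add": {"bus-in-la", "driver-in-la"},
    "effects_del": {"bus-in-tempe"},
    "conditional": [
        # (condition, conditional adds, conditional deletes)
        ({"tom-in-bus"}, {"tom-in-la"}, set()),
    ],
}

def progress(state, action):
    """Apply an action (with conditional effects) to a complete state."""
    assert action["preconds"] <= state, "preconditions unmet"
    add, delete = set(action["effects_add"]), set(action["effects_del"])
    for cond, c_add, c_del in action["conditional"]:
        if cond <= state:            # fire only if the condition holds now
            add |= c_add
            delete |= c_del
    return (state - delete) | add

s = {"bus-in-tempe", "tom-in-bus"}
result = progress(s, drive)
print(result)   # tom-in-la appears because tom-in-bus held beforehand
```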
Thursday, April 16, 2009
Note for Project 4
The solution is to load your lisp file using lisp's load function, like this:
(load "Z:/Documents/CSE471/Project4/Project4.lisp")
It won't get compiled. Now you can trace your recursive function:
(trace my-function)
And then when you call it, you'll see the traced output. Hope that helps someone.
Query regarding Task1 Project 4
I am missing something in my understanding of Task 1. The project description says we need to implement a depth criterion that will cut off the recursion when it crosses a depth limit. But it is not clear to me what the value of the depth limit should be. Also, further ahead the description hints at "guessing the limit". So does that mean we need to experiment with different depth-limit values?
Please help me out.
Thanks.
Wednesday, April 15, 2009
The expressiveness roller-coaster picture revisited...
I went ahead and added a slide to yesterday's lecture for the expressiveness roller coaster that we discussed at the beginning of the class yesterday.
I am enclosing a jpeg here..
Comments and/or questions welcome (on the blog)
Rao
Thursday, April 9, 2009
Undergraduate research opportunities with my group..
This is mainly for those of you who think they are doing well in the class and enjoying it.
I will likely have funding--starting summer--for supporting undergraduate students
in AI research with my group (which does work in automated planning--
a topic we will hear a bit about next week). If this sort of thing interests you, get in touch with me.
Recent publications from my group can be found at http://rakaposhi.eas.asu.edu/papers.html
(the titles of the papers might give at least some inkling on the sorts of things we do).
cheers
rao
Office hours for project 2 and project 3 grading
--
Xin Sun
Phd student at Arizona State University
Tempe, AZ, 85281, USA
Tel: 4809659038
In case the (ir)rationality of sqrt(2)^sqrt(2) is bugging you... + Constructive vs. Existential math.
It turns out that sqrt(2)^sqrt(2) is irrational (actually transcendental (*)). So a constructive proof for
our theorem is with p=sqrt(2)^sqrt(2) and q=sqrt(2)
see http://www.math.hmc.edu/funfacts/ffiles/10004.3-5.shtml
(which also points out a more general and easy-to-understand constructive proof. Consider
e^{log_e q} for any transcendental number e and rational number q--which will be q. All you need to show is that log_e(q) is irrational, and you can show this easily (if log_e(q) = m/n with integers m and n without common factors, then
q = e^{m/n}. This would mean that e is a root of the algebraic equation x^m - q^n = 0. But the definition of a transcendental number is that it cannot be the root of any algebraic equation!).
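For reference, the whole constructive argument fits in a few lines of LaTeX (just a restatement of the above, nothing new):

```latex
% Claim: there exist irrationals p, q with p^q rational.
% Constructive witness, using the fact (Gelfond--Schneider) that
% \sqrt{2}^{\sqrt{2}} is transcendental, hence irrational:
\[
  p = \sqrt{2}^{\sqrt{2}}, \qquad q = \sqrt{2}, \qquad
  p^q = \bigl(\sqrt{2}^{\sqrt{2}}\bigr)^{\sqrt{2}}
      = \sqrt{2}^{\sqrt{2}\cdot\sqrt{2}} = \sqrt{2}^{\,2} = 2 .
\]
% Lemma used in the second proof: if \log_e q = m/n for integers m, n,
% then e^m = q^n, so e is a root of x^m - q^n = 0, contradicting the
% transcendence of e; hence \log_e q is irrational.
```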
Rao
(*) By the way, transcendental => irrational but not vice versa. In particular, transcendentals are those irrational numbers that cannot be roots of any algebraic equation. Two famous examples of course are e and pi. Notice that proving that a number e *is* transcendental involves showing that e^r for any rational number r cannot be rational (since if it is, then e will be the root of an algebraic equation). Thus, proving transcendentality is not all that easy.
(ps 2:
Check out
http://digitalphysics.org/Publications/Cal79/html/cmath.htm
for a nice discussion of Constructive vs. Classical mathematics--and how during Hilbert's time there was a pretty big controversy in mathematics--with mathematicians such as Brouwer insisting that all math that depended on existential proofs be thrown out. Papa Hilbert had to come to the rescue--pretty heady stuff.
You might also look at
http://plato.stanford.edu/entries/mathematics-constructive/
which also talks about the slick "irrational power irrational can be rational proof..."
*Important*--a fully worked out example for variable elimination..
The URL here contains a fully worked out version of the example that the text book has. See if this reduces your confusion:
http://www.cs.uiuc.edu/class/sp08/cs440/notes/varElimLec.pdf
I am also linking it from the lecture notes.
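As a complement to that writeup, here is the core variable-elimination step in miniature: multiply the factors that mention a variable, then sum the variable out. The numbers are invented; this computes P(B) from P(A) and P(B|A) by eliminating A:

```python
# Tiny variable-elimination step: eliminate A from P(A) * P(B|A) to get P(B).
# The probabilities are invented for illustration.

p_a = {True: 0.3, False: 0.7}                      # factor over A
p_b_given_a = {                                    # factor over (A, B)
    (True, True): 0.9,  (True, False): 0.1,
    (False, True): 0.2, (False, False): 0.8,
}

# Multiply the two factors pointwise and sum out A:
p_b = {
    b: sum(p_a[a] * p_b_given_a[(a, b)] for a in (True, False))
    for b in (True, False)
}

print(p_b)  # {True: 0.41, False: 0.59}, up to float rounding
```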
rao
Wednesday, April 8, 2009
CS101 Robot Program
The memory is sufficient for the program but not enough to build a map of the maze. The good news is that the uneven surface can only cause the robot to change orientation by a max of 30-40 degrees...which is potentially recoverable...the maze is not too big and time is unlimited.
I know that it was simple DFS using the sensors; distance from the furthest wall would do as the depth measure...When you get to a wall, pivot right and left and spot the next furthest wall...
But there was a very implicit "A-starish" behaviour used in the best performing design, even when the search was DFS...Can you spot it!!??
Tuesday, April 7, 2009
project 4 released
Project 4 is now released. It will require you to implement a prolog-style theorem prover starting from a given code base.
It basically involves doing the apartment-pet example discussed in today's class.
Please look at that example as well as the project assignment by next class so I can answer any questions you may have.
thanks
Rao
In FOL there seems to be a separation between the constants and atomic terms of the logic and the actual statements made over variables. So a natural question comes to mind: while constructing statements like FORALL(x) Happy(x) => Playing(x) and then putting the constants and terms into our knowledge base or model, how do we validate the model that we build? When we did it in propositional logic we essentially made sure, by using certain refutation and resolution methods, that the knowledge base was consistent. Since we saw in the lecture how there is exponential blow-up if we do inference, the same would go for model checking too.... Another big problem...?
Monday, April 6, 2009
Required reading for tomorrow's class (class will cover project 4...!)
9.2 (unification) and 9.4--backward chaining--are connected to project 4 that will be released this week. So
it helps for you to be sure of this material.
thanks
rao
Friday, April 3, 2009
project 2 grades and current cumulatives
Project 2 just got graded. I will return the projects next Tuesday. However, since some of you wanted to
know your project 2 grade before this weekend, I am sending the information about who got how much
(as well as the current cumulatives--out of 51 points until now).
Please note that I haven't yet had a chance to look at the graded projects and there may be some changes
in the points by the time you get them back on Tuesday.
Rao
Thursday, April 2, 2009
Homework 3 released--includes Bayes networks and First-Order Logic topics
rao
Wednesday, April 1, 2009
Part 4 which network to use
domain. Show the topology as well as the .bn representation"
Do we use the network we created in part 1 or part 3? Also, the first entry in the edit menu gives .bif files, not .bn files, is that okay?
Godel and First Order Logic?
Godel's incompleteness theorems talk about the limitations of formal systems. First, two definitions: A set of axioms is consistent if you can't find a statement such that you can prove the statement and its negation from those axioms. A set of axioms is complete if for any statement in the axioms' language, you can prove it or its negative using those axioms.
So there are actually two theorems. The first one is something like "any consistent system capable of expressing basic arithmetic cannot be complete." So, if a consistent system can express basic arithmetic, then there is an arithmetic statement that is true, but not provable in that system.
The second theorem is something like "for any system that can express arithmetic and provability, that system can prove its own consistency if (and only if) it's inconsistent."
This was a big deal when Godel proved this, because mathematicians at the time were trying to rebuild math from the ground up, 100% provable and complete, and he basically showed that it was an impossible task.
So that's my non-mathematician's interpretation.
As far as our class and first order logic, the question becomes, is first order logic sufficiently capable of expressing basic arithmetic such that it must be either consistent or complete, but not both? This matters because we need to know if an agent can reach faulty conclusions because its logic is inconsistent, or if true statements can't be reached because its logic is incomplete, right?
Based on what Rao said in class, Godel's theorem doesn't apply to first order logic. Is it because you can create an infinite number of axioms in first order logic, and thus it's impossible to try and prove a given statement starting from an infinity of axioms? Or is there another reason? I'm not good enough at math to see the answer on my own.
(Also, there's a great book called "Godel, Escher, Bach" that I think everyone in this class would really like.)