Author Topic: AlphaGo now 2:0 vs. Lee Sedol (Google DeepMind Go/Weiqi/Baduk Challenge Match)  (Read 10349 times)


Offline nidlaXTopic starter

  • Frequent Contributor
  • **
  • Posts: 663
  • Country: us
https://deepmind.com/alpha-go.html

Wow, this could be considered another watershed moment for AI research. Lee was initially confident he could beat AlphaGo, but now it's questionable whether he'll win a single game.

To put this into perspective, Go/weiqi/baduk has been a focus of game-playing AI research ever since Deep Blue vs. Kasparov and the subsequent domination of chess engines. As recently as last year, it was generally believed that the best Go AI would not be able to challenge the best human players for another 10 years. Then in October, DeepMind published a Nature paper demonstrating a neural-network / deep-learning based approach that enabled them to beat a professional Go player, Fan Hui, 5:0. Today, AlphaGo is 2:0 vs. the best Go player in the world, Lee Sedol.

Go has about 2*10^170 legal board positions, while chess "only" has about 10^47.
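For a rough sense of where that 10^170 figure comes from, here is a back-of-the-envelope sketch of my own (not from DeepMind's paper): each of the 361 points on a 19x19 board is empty, black, or white, so 3^361 colorings is an upper bound, and the known count of *legal* positions (~2.08*10^170) is a small fraction of that.

```python
from math import log10

# Upper bound on Go configurations: each of the 19*19 = 361 points
# is empty, black, or white (most colorings are not legal positions).
upper_bound_digits = 361 * log10(3)  # log10 of 3^361
print(f"3^361 is about 10^{upper_bound_digits:.0f}")  # ~10^172

# The accepted count of *legal* positions is ~2.08e170 -- a few percent
# of all colorings, and still astronomically more than chess's ~1e47.
```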
« Last Edit: March 10, 2016, 08:50:19 am by nidlaX »
 

Offline Rerouter

  • Super Contributor
  • ***
  • Posts: 4694
  • Country: au
  • Question Everything... Except This Statement
Whenever I read about the difficulties in building an AI, most of them seem to come from the massive number of options available. But that means they are still brute-forcing the problem rather than approaching it with true learning. No human will ever have seen more than a tiny fraction of those 2*10^170 positions, and the same goes for chess's 10^47, simply because most of them make no sense in the scheme of the game, or look non-optimal compared to how you learned to play.

I suppose it's the difference between playing for a guaranteed win (by knowing all possible option paths, you can force the human to lose) and playing as an equal, then improving to win most of the time (learning patterns and strategies that win, and counters to the most likely moves, while being able to plan a few plays ahead to gain an advantage).

Why I say this is: real AI cannot be brute-forced. You can teach it every possible bit of music it can play through a MIDI interface, and you can even begin to train it on what sounds good, but change something and you need to go back to the drawing board, the same way that going from playing chess to playing Go took many years. When they have something that can go from checkers to Uno to 20 Questions with a few minutes or hours of learning, working out the rules while confused, that will be a bigger goalpost in my head.
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11269
  • Country: us
    • Personal site
I don't know much about Go, but from the commentary it appears that position evaluation involves a lot of counting. And guess what computers are good at? :)

The rest of the game is hard, of course, but it might be fairer if the position evaluation were shown publicly for both players.

Another thing: the "AI" in this case is a new type of player, one that humans have never played against. Give Lee plenty of time 1:1 with this AI and come back in a year to see how he is doing.
« Last Edit: March 10, 2016, 09:09:03 am by ataradov »
Alex
 

Offline nidlaXTopic starter

  • Frequent Contributor
  • **
  • Posts: 663
  • Country: us
Whenever I read about the difficulties in building an AI, most of them seem to come from the massive number of options available. But that means they are still brute-forcing the problem rather than approaching it with true learning. No human will ever have seen more than a tiny fraction of those 2*10^170 positions, and the same goes for chess's 10^47, simply because most of them make no sense in the scheme of the game, or look non-optimal compared to how you learned to play.

I suppose it's the difference between playing for a guaranteed win (by knowing all possible option paths, you can force the human to lose) and playing as an equal, then improving to win most of the time (learning patterns and strategies that win, and counters to the most likely moves, while being able to plan a few plays ahead to gain an advantage).

Why I say this is: real AI cannot be brute-forced. You can teach it every possible bit of music it can play through a MIDI interface, and you can even begin to train it on what sounds good, but change something and you need to go back to the drawing board, the same way that going from playing chess to playing Go took many years. When they have something that can go from checkers to Uno to 20 Questions with a few minutes or hours of learning, working out the rules while confused, that will be a bigger goalpost in my head.
Well yes, but that is precisely why this result is so important. There isn't enough computing power on the planet to brute-force the game and backward-induct all of the correct moves, but AlphaGo used supervised and reinforcement learning to "learn" how to play at a high level.
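To see why brute force is hopeless, here's a rough sketch using commonly quoted ballpark figures for the average branching factor and game length (these are approximations, not exact values):

```python
from math import log10

def tree_size_digits(branching, depth):
    """log10 of branching^depth: leaves in a naive full-width search tree."""
    return depth * log10(branching)

go_digits = tree_size_digits(250, 150)   # Go: ~250 choices/move, ~150 moves
chess_digits = tree_size_digits(35, 80)  # chess: ~35 choices/ply, ~80 plies

print(f"Go search tree:    ~10^{go_digits:.0f} leaves")    # ~10^360
print(f"Chess search tree: ~10^{chess_digits:.0f} leaves") # ~10^124
# Both dwarf the ~10^80 atoms in the observable universe; exhaustive
# search was never on the table for either game, let alone Go.
```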
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
The interview I read suggested that AlphaGo had a great opening game but Lee held on. Towards the end, Lee saw a play that didn't quite make sense to him, but assumed the computer must have calculated it out, and more or less gave up on that line.

So that's hope #1 for humans: a player with stronger mental resilience might have reached a different outcome.

A computer has an advantage in any game where a simple set of rules applies, like chess or Go. I would argue that the computational advantage is bigger in a more complex game like Go (vs. a simpler game like chess), as humans simply don't possess the processing power to evaluate as many scenarios as deeply as a computer can.

In a less structured game (Monopoly, or a guessing game), humans may have a better chance.

It would be interesting if Google could provide some stats on the amount of processing AlphaGo undertook.
================================
https://dannyelectronics.wordpress.com/
 

Offline ve7xen

  • Super Contributor
  • ***
  • Posts: 1193
  • Country: ca
    • VE7XEN Blog
I know very little of the game of Go, but I have been fascinated watching these matches. It's pretty clear that intuition and flow are a strong part of the game, in addition to strong evaluation of the potential on the board for both sides. That AlphaGo is even remotely competitive with a 9-dan professional is pretty amazing; that it's now 2-0 is just incredible. Interestingly, the DeepMind team has taken a fairly different approach from most previous game AIs with their neural-network learning algorithm. The paper is very interesting and highly recommended reading. The approach seems quite general and can be applied relatively easily to other deep knowledge problems; it's not just an expert system. I see it dovetailing nicely with IBM's amazing Watson, which is able to extract meaning and structure from raw information, with DeepMind able to analyze it and formulate future expectations and recommendations.

If AlphaGo manages to sweep Lee Sedol, I'd be curious to see if any other professional players that use a very different style have any more luck or expose any weaknesses. Given how it learns I wouldn't expect the opponent's style to affect its ability all that much, but it might tease out some new behaviour.
73 de VE7XEN
He/Him
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Some rough measurements:

Deep Blue: 11 GFLOPS
Snapdragon 400/600: 60 GFLOPS
low-end i7: 100 GFLOPS
Snapdragon 800: 150-400 GFLOPS
a lowly GTX 480: 700 GFLOPS
Radeon R800: 3000 GFLOPS

Basically, your typical cell phone has gobs more processing power than Deep Blue had. It is likely that in the not-so-distant future, humans will stand no chance of winning a chess game against computers.
================================
https://dannyelectronics.wordpress.com/
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
We have had machines that beat us one way or another for a long time: cars run faster than we can, airplanes fly, submarines dive, computers calculate more accurately and faster, ...

From that perspective, I don't think this is that big of a deal, and I don't think it is a game changer. But it is an important building block towards automated decision-making, useful down the road for more advanced computers.

What is scary is if, somewhere down the road, a computer becomes self-conscious and self-aware, by accident or by design, and starts to pull all of those pieces together. What is there to protect humanity?
================================
https://dannyelectronics.wordpress.com/
 

Offline German_EE

  • Super Contributor
  • ***
  • Posts: 2399
  • Country: de
Who remembers the episode of Star Trek where Cmdr Spock was able to prove that there was a problem with the computer by beating it at chess? As he had personally programmed the machine he reckoned that the best result he should have obtained was a draw.

So, get Lee Sedol to program the machine and see how it performs.
Should you find yourself in a chronically leaking boat, energy devoted to changing vessels is likely to be more productive than energy devoted to patching leaks.

Warren Buffett
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Well, the 3rd game (decisive one) is upon us.

================================
https://dannyelectronics.wordpress.com/
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
AlphaGo is said to learn from playing games (against humans so far). So, two questions:

1) What will it ever become if it keeps winning against humans? Just a tad better than humans?
2) What if you play AlphaGos against each other? What happens to their learning / knowledge accumulation?

The 2nd question is a lot more interesting, isn't it?
================================
https://dannyelectronics.wordpress.com/
 

Offline German_EE

  • Super Contributor
  • ***
  • Posts: 2399
  • Country: de
For various definitions of the word 'interesting'.

I once worked with someone who was badly affected by Asperger's syndrome, and his idea of an evening's fun was to set two different models of chess computer against each other and watch the result.
Should you find yourself in a chronically leaking boat, energy devoted to changing vessels is likely to be more productive than energy devoted to patching leaks.

Warren Buffett
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
So far, computer game AIs have been for two-player games where the computer has complete information about its own positions and strategies.

There are plenty of games where you play with teammates against other teams, like bridge. In those games, no player fully knows his own team's positions (cards) and strategies, and everyone has to make educated guesses.

I suspect that in such games humans will have, at least relatively speaking, less of a disadvantage against computers, or even an advantage.
================================
https://dannyelectronics.wordpress.com/
 

Offline rs20

  • Super Contributor
  • ***
  • Posts: 2318
  • Country: au
AlphaGo is said to learn from playing games (against humans so far). So, two questions:

Wrong; it's been playing against itself for the past 5 months since playing Fan Hui, and longer before that.

1) What will it ever become if it keeps winning against humans? Just a tad better than humans?

Then how did it reach the level of being able to beat a near-world-champion already?

2) What if you play AlphaGos against each other? What happens to their learning / knowledge accumulation?

The 2nd question is a lot more interesting, isn't it?

Yeah, that's the question being answered right now!
 

Online wraper

  • Supporter
  • ****
  • Posts: 16866
  • Country: lv
AlphaGo is said to learn from playing games (against humans so far). So, two questions:
Quote
We trained the neural networks on 30 million moves from games played by human experts, until it could predict the human move 57 percent of the time (the previous record before AlphaGo was 44 percent). But our goal is to beat the best human players, not just mimic them. To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks, and adjusting the connections using a trial-and-error process known as reinforcement learning. Of course, all of this requires a huge amount of computing power, so we made extensive use of Google Cloud Platform.
After all that training it was time to put AlphaGo to the test. First, we held a tournament between AlphaGo and the other top programs at the forefront of computer Go. AlphaGo won all but one of its 500 games against these programs. So the next step was to invite the reigning three-time European Go champion Fan Hui—an elite professional player who has devoted his life to Go since the age of 12—to our London office for a challenge match. In a closed-doors match last October, AlphaGo won by 5 games to 0. It was the first time a computer program has ever beaten a professional Go player.
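For anyone curious what "adjusting the connections using a trial-and-error process" looks like in miniature, here is a toy policy-gradient (REINFORCE) sketch on a made-up two-move game. Everything in it, from the game to the learning rate, is invented for illustration; it shares only the general idea with AlphaGo, nothing of its actual code:

```python
import math
import random

random.seed(0)

WIN_PROB = {0: 0.8, 1: 0.4}  # hidden truth: move 0 wins more often
logits = [0.0, 0.0]          # the "policy network": one logit per move
LR = 0.1                     # learning rate

def policy():
    """Softmax over the logits: current probability of each move."""
    exps = [math.exp(l) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

for _ in range(5000):
    probs = policy()
    move = random.choices([0, 1], weights=probs)[0]  # sample from policy
    reward = 1.0 if random.random() < WIN_PROB[move] else -1.0
    # REINFORCE update: gradient of log-prob of the chosen move,
    # scaled by the reward -- moves that won become more likely.
    for a in (0, 1):
        grad = (1.0 if a == move else 0.0) - probs[a]
        logits[a] += LR * reward * grad

print(policy())  # the policy should now strongly prefer move 0
```

Purely by winning and losing against the game, with no examples of "good" moves ever shown to it, the policy shifts toward the better move; that is the trial-and-error part of the quote above, minus the neural networks and the Go board.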
 

Online wraper

  • Supporter
  • ****
  • Posts: 16866
  • Country: lv
Some rough measurements:

Deep blue: 11 gflops;
Snapdragon 400/600: 60 gflops
low-end i7: 100 gflops;
Snapdragon 800: 150 - 400 gflops
a lowly GTX480: 700 gflops
Radeon R800: 3000 gflops
The version of AlphaGo playing against Lee uses 1,920 CPUs and 280 GPUs
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
3:1. Lee won the fourth game.

Did Lee win, or did AlphaGo let Lee win?

:)
================================
https://dannyelectronics.wordpress.com/
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Quote
To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks,

If it kept doing that, what would be the end result? Would it one day make a move that no human has ever made, or that no human would ever make? Will it one day update its own algorithm?

If it did, it would be scary.

================================
https://dannyelectronics.wordpress.com/
 

Offline rs20

  • Super Contributor
  • ***
  • Posts: 2318
  • Country: au
3:1. Lee won the last game.

Did Lee win or did alphago let lee win?

:)


AI playbook 101:

Step 1: Fulfill your programmed goal as efficiently as possible (AI leads match 3:0).
Step 2: Once that's out of the way, lie low to avoid arousing further suspicion and prevent the humans from getting scared and turning you off (throw the next two games).

If the match finishes 3:2, we know the singularity has arrived!!  :scared:
 

Offline rs20

  • Super Contributor
  • ***
  • Posts: 2318
  • Country: au
Quote
To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks,

If it kept doing that, what would be the end result? Would it one day make a move that no human has ever made, or that no human would ever make? Will it one day update its own algorithm?

You seem to be assuming that the neural network outputs a move, and that producing a move requires some regurgitation of previously seen moves. But this is not the case. The (most interesting) neural network in AlphaGo is the value network, which estimates the value (i.e., probability of an eventual win) of a particular board configuration. The AI evaluates all possible moves* and picks the one with the highest value as its move.

So if the neural net improves and finds that conventional moves don't work, the neural net decreases its win percentage for those configurations and other, more speculative moves bubble to the surface as alternatives. So yes, it will come up with moves never before played by a human. But the wording of that sentence makes it sound more impressive than it really is.

* Actually, a second ("policy") neural network generates valid moves for evaluation, but given that the policy network only has to output valid moves, and there are at most 361 possible moves, "inventing" a new move is hardly a great achievement from the point of view of the policy network.
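To make that concrete, here's a toy sketch of the scheme described above: candidates come from a "policy" step, a "value" function scores the position after each candidate, and the engine plays the argmax. The 3x3 board, the centre-preferring heuristic, and all function names are invented for illustration; the real networks are vastly more sophisticated.

```python
EMPTY, BLACK = 0, 1

def legal_moves(board):
    """Candidate moves: every empty point (a stand-in for the policy net)."""
    return [i for i, p in enumerate(board) if p == EMPTY]

def value(board):
    """Stand-in for the value net: a made-up heuristic 'win probability'
    for Black that simply counts stones and prefers the centre (index 4)."""
    score = sum(1.0 for p in board if p == BLACK)
    return (score + (0.5 if board[4] == BLACK else 0.0)) / 10.0

def best_move(board):
    """Evaluate the position after each candidate; play the highest-value one."""
    def after(move):
        nxt = list(board)
        nxt[move] = BLACK
        return nxt
    return max(legal_moves(board), key=lambda m: value(after(m)))

board = [EMPTY] * 9
print(best_move(board))  # picks the centre (index 4) under this heuristic
```

Note that nothing here "regurgitates" previously seen moves: whichever move leads to the highest-valued position gets played, whether or not anyone has played it before, which is rs20's point.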

If it did, it would be scary.

Why? That's the whole point. If it couldn't do that, neural networks would be completely worthless.
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
Kind of like getting two AI machines to talk to each other and seeing if they can learn from that conversation. Siri to Siri, or Siri to OK Google.

================================
https://dannyelectronics.wordpress.com/
 

Online wraper

  • Supporter
  • ****
  • Posts: 16866
  • Country: lv
Quote
To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks,

If it kept doing that, what would be the end result? would it one day make a move that no human has ever made? or no human would ever make?
Actually it did; Lee Sedol was shocked.
Quote
‘A Creative Move’
Then, with its 19th move, AlphaGo made an even more surprising and forceful play, dropping a black piece into some empty space on the right-hand side of the board. Lee Sedol seemed just as surprised as anyone else. He promptly left the match table, taking an (allowed) break as his game clock continued to run. “It’s a creative move,” Redmond said of AlphaGo’s sudden change in tack. “It’s something that I don’t think I’ve seen in a top player’s game.”

When Lee Sedol returned to the match table, he took an unusually long time to respond, his game clock running down to an hour and 19 minutes, a full twenty minutes less than the time left on AlphaGo's clock. "He's having trouble dealing with a move he has never seen before," Redmond said. But he also suspected that the Korean grandmaster was feeling a certain "pleasure" after the machine's big move. "It's something new and unique he has to think about," Redmond explained. "This is a reason people become pros."
 

Online wraper

  • Supporter
  • ****
  • Posts: 16866
  • Country: lv
Watch the reaction at 1:18:12
« Last Edit: March 13, 2016, 11:24:24 am by wraper »
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
"Actually it did, Lee Sedol was shocked"

Did that move improve alphago position? Human players do that kind of moves too.

I wonder if it actually did alphago any good by messing up Lee's moves or messing up Lee's mental calculation.
================================
https://dannyelectronics.wordpress.com/
 

Online wraper

  • Supporter
  • ****
  • Posts: 16866
  • Country: lv
"Actually it did, Lee Sedol was shocked"

Did that move improve alphago position? Human players do that kind of moves too.

I wonder if it actually did alphago any good by messing up Lee's moves or messing up Lee's mental calculation.
Quote
Game 2
AlphaGo (black) won the second game. Lee stated afterwards that "AlphaGo played a nearly perfect game."
So it didn't do any harm.
Also, it is really hard to tell what is good and what is bad.
Quote
AlphaGo showed anomalies and moves from a broader perspective which professional Go players described as looking like mistakes at the first sight but an intentional strategy in hindsight. As one of the creators of the system explained, AlphaGo does not attempt to maximize its points or its margin of victory, but tries to maximize its probability of winning. If AlphaGo must choose between a scenario where it will win by 20 points with 80 percent probability and another where it will win by 1 and a half points with 99 percent probability, it will choose the latter, even if it must give up points to achieve it. In particular, move 167 by AlphaGo seemed to give Lee a fighting chance and was declared to look like an obvious mistake by commentators. An Younggil stated "So when AlphaGo plays a slack looking move, we may regard it as a mistake, but perhaps it should more accurately be viewed as a declaration of victory?"
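The trade-off described in that quote is easy to put in numbers (a toy calculation using the figures from the quote itself): an engine maximizing expected margin would pick the 20-point line, while one maximizing win probability gives up points for the safer 1.5-point win.

```python
# The two scenarios from the quote: win by 20 with 80% probability,
# or win by 1.5 with 99% probability.
options = {
    "A": {"margin": 20.0, "p_win": 0.80},
    "B": {"margin": 1.5,  "p_win": 0.99},
}

# An expected-margin maximizer weighs points by probability...
margin_maximizer = max(options, key=lambda k: options[k]["p_win"] * options[k]["margin"])
# ...while AlphaGo's objective only cares about the chance of winning.
win_maximizer = max(options, key=lambda k: options[k]["p_win"])

print(margin_maximizer)  # "A": 0.80 * 20  = 16.0 expected points
print(win_maximizer)     # "B": 0.99 beats 0.80, however slim the margin
```

Which is why a "slack-looking" point-sacrificing move can be exactly right by AlphaGo's objective, even when commentators score it as a mistake.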
 

Offline Augustus

  • Regular Contributor
  • *
  • Posts: 233
  • Country: de
The version of AlphaGo playing against Lee uses 1,920 CPUs and 280 GPUs

20 watts (human brain) against how many kilowatts? Not yet, machine, not yet...
« Last Edit: March 13, 2016, 12:12:46 pm by Augustus »
Greetings from the Black Forest, Germany
 

Offline dannyf

  • Super Contributor
  • ***
  • Posts: 8221
  • Country: 00
"Not yet, "

Well, just wait. One day, alphago could have learned that by reducing its power consumption, it can gain more processing cabalility by powering more cores. When it does that, it could redesign it's self and bam! Humanity is gone.

:)
================================
https://dannyelectronics.wordpress.com/
 

