For Its Next Trick, DeepMind Will Beat Us at Team Sports

Capture the flag is a game played by children across the open spaces of a summer camp, and by professional video gamers as part of popular titles like Quake III and Overwatch.

In both cases, it’s a team sport. Each side guards a flag while also scheming to grab the other side’s flag and bring it back to home base. Winning the game requires good old-fashioned teamwork, a coordinated balance between defense and attack.

In other words, capture the flag requires what would seem to be a very human set of skills. But researchers at an artificial intelligence lab in London have shown that machines can master this game, too, at least in the virtual world.

In a paper published on Thursday in Science (and previously available on the website arXiv before peer review), the researchers reported that they had designed automated “agents” that exhibited humanlike behavior when playing the capture the flag “game mode” inside Quake III. These agents were able to team up against human players or play alongside them, tailoring their behavior accordingly.

“They can adapt to teammates with arbitrary skills,” said Wojciech Czarnecki, a researcher with DeepMind, a lab owned by the same parent company as Google.

Through thousands of hours of game play, the agents learned very particular skills, like racing toward the opponent’s home base when a teammate was on the verge of capturing a flag. As human players know, the moment the opposing flag is brought to one’s home base, a new flag appears at the opposing base, ripe for the taking.


DeepMind’s project is part of a broad effort to build artificial intelligence that can play enormously complex, three-dimensional video games, including Quake III, Dota 2 and StarCraft II. Many researchers believe that success in the virtual arena will eventually lead to automated systems with improved abilities in the real world.

For instance, such skills could benefit warehouse robots as they work in groups to move goods from place to place, or help self-driving cars navigate en masse through heavy traffic. “Games have always been a benchmark for A.I.,” said Greg Brockman, who oversees similar research at OpenAI, a lab based in San Francisco. “If you can’t solve games, you can’t expect to solve anything else.”

Until recently, building a system that could match human players in a game like Quake III did not seem possible. But over the past several years, DeepMind, OpenAI and other labs have made significant advances, thanks to a mathematical technique called “reinforcement learning,” which allows machines to learn tasks by extreme trial and error.

By playing a game over and over again, an automated agent learns which strategies bring success and which do not. If an agent consistently wins more points by moving toward an opponent’s home base when a teammate is about to capture a flag, it adds this tactic to its arsenal of tricks.
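For readers who want a concrete, if simplified, picture of that trial-and-error process, here is a minimal sketch in Python. It is not DeepMind’s system, which relies on deep neural networks and large-scale training; the states, actions and rewards below are invented for illustration. It shows only the core idea: repeated play nudges an agent’s value estimates until the winning tactic, such as rushing the opponent’s base when a teammate is about to score, comes out on top.

```python
# A minimal sketch of learning by trial and error, not DeepMind's actual agent.
# The environment, states and rewards here are hypothetical stand-ins.
import random

# Toy states: 0 = "teammate about to capture the flag", 1 = "our flag is safe at home"
REWARDS = {
    (0, "rush_enemy_base"): 1.0,   # positions the agent for the respawned flag
    (0, "defend_home"):     0.1,
    (1, "rush_enemy_base"): 0.2,
    (1, "defend_home"):     0.8,
}
ACTIONS = ["rush_enemy_base", "defend_home"]

q = {(s, a): 0.0 for s in (0, 1) for a in ACTIONS}  # value estimates per state-action
alpha, epsilon = 0.1, 0.2                           # learning rate, exploration rate

for episode in range(5000):
    state = random.choice([0, 1])
    # Explore occasionally; otherwise exploit the current best estimate.
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
    reward = REWARDS[(state, action)]
    # Move the estimate toward the observed reward (a one-step bandit-style update).
    q[(state, action)] += alpha * (reward - q[(state, action)])

print({k: round(v, 2) for k, v in q.items()})
# After many rounds, the agent prefers rushing the enemy base whenever a
# teammate is on the verge of capturing a flag, mirroring the tactic above.
```

DeepMind’s real agents face a far harder version of this problem, since they must learn from raw pixels and sparse rewards rather than a tidy table, but the learning loop, try, score, adjust, is the same in spirit.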

In 2016, using the same fundamental technique, DeepMind researchers built a system that could beat the world’s top players at Go, the ancient East Asian board game. Many experts had thought this would not be accomplished for another decade, given the enormous complexity of the game.

First-person video games are far more complex, particularly when they involve coordination between teammates. DeepMind’s autonomous agents learned capture the flag by playing roughly 450,000 rounds of it, tallying about four years of game experience over weeks of training. At first, the agents failed miserably. But they gradually picked up the nuances of the game, like when to follow teammates as they raided an opponent’s home base.

Since completing this project, DeepMind researchers have also designed a system that could beat professional players at StarCraft II, a strategy game set in space. And at OpenAI, researchers built a system that mastered Dota 2, a game that plays like a souped-up version of capture the flag. In April, a team of five autonomous agents beat a team of five of the world’s best human players.

Last year, William Lee, a professional Dota 2 player and commentator known as Blitz, played against an early version of the technology that could play only one-on-one, not as part of a team, and he was unimpressed. But as the agents continued to learn the game and he played them as a team, he was shocked by their skill.

“I didn’t think it would be possible for the machine to play five-on-five, let alone win,” he said. “I was absolutely blown away.”

As impressive as such technology has been among gamers, many artificial-intelligence experts question whether it will ultimately translate to solving real-world problems. DeepMind’s agents are not really collaborating, said Mark Riedl, a professor at Georgia Tech’s College of Computing who specializes in artificial intelligence. They are merely responding to what is happening in the game, rather than trading messages with one another, as human players do. (Even mere ants can collaborate by trading chemical signals.)

Although the result looks like collaboration, the agents achieve it because, individually, they so completely understand what is happening in the game.

“How you define teamwork is not something I want to tackle,” said Max Jaderberg, another DeepMind researcher who worked on the project. “But one agent will sit in the opponent’s base camp, waiting for the flag to appear, and that is only possible if it is relying on its teammates.”

Games like this are not nearly as complex as the real world. “3-D environments are designed to make navigation easy,” Dr. Riedl said. “Strategy and coordination in Quake are simple.”

Reinforcement learning is ideally suited to such games. In a video game, it is easy to identify the metric for success: more points. (In capture the flag, players earn points according to how many flags are captured.) But in the real world, no one is keeping score. Researchers must define success in other ways.
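The contrast can be made concrete with a brief sketch. The game-side score below is an illustrative stand-in, not the exact scoring used in Quake III, and the warehouse score is entirely hypothetical; it exists only to show that, outside a game, researchers must invent the scoreboard themselves and decide how to weigh goals against costs.

```python
# Defining "success" for a learning agent: trivial in a game, a design problem elsewhere.
# Both reward functions here are illustrative assumptions, not published formulas.

def game_reward(flags_captured_by_us: int, flags_captured_by_them: int) -> float:
    # In a video game the scoreboard is the reward: more captures, more points.
    return float(flags_captured_by_us - flags_captured_by_them)

def warehouse_reward(items_delivered: int, collisions: int, seconds_elapsed: float) -> float:
    # In the real world no one keeps score, so researchers pick the weights;
    # these particular numbers are hypothetical design choices.
    return 1.0 * items_delivered - 5.0 * collisions - 0.01 * seconds_elapsed

print(game_reward(3, 1))              # 2.0: success is simply read off the scoreboard
print(warehouse_reward(10, 1, 300.0)) # 2.0: the same number, but every term was a human judgment
```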

This can be done, at least with simple tasks. At OpenAI, researchers have trained a robotic hand to manipulate an alphabet block as a child might. Tell the hand to show you the letter A, and it will show you the letter A.

At a Google robotics lab, researchers have shown that machines can learn to pick up random items, such as Ping-Pong balls and plastic bananas, and toss them into a bin several feet away. This kind of technology could help sort through bins of items in huge warehouses and distribution centers run by Amazon, FedEx and other companies. Today, human workers handle such tasks.

As labs like DeepMind and OpenAI tackle bigger problems, they may begin to require ridiculously large amounts of computing power. As OpenAI’s system learned to play Dota 2 over several months — more than 45,000 years of game play — it came to rely on tens of thousands of computer chips. Renting access to all those chips cost the lab millions of dollars, Mr. Brockman said.

DeepMind and OpenAI, which is funded by various Silicon Valley kingpins including Khosla Ventures and the tech billionaire Reid Hoffman, can afford all that computing power. But academic labs and other small operations cannot, said Devendra Chaplot, an A.I. researcher at Carnegie Mellon University. The worry, for some, is that a few well-funded labs will dominate the future of artificial intelligence.

But even the big labs may not have the computing power needed to move these techniques into the complexities of the real world, which may require stronger forms of A.I. that can learn even faster. Though machines can now win capture the flag in the virtual world, they are still hopeless across the open spaces of summer camp — and will be for quite a while.
