If you’re sensing that the era of the Terminator is coming soon, fear not: Google has managed to create an AI kill switch, essentially the big red button to press in case AI tries to take over the world.
Are we in danger?
In our current situation, where almost
everyone has access to the internet and more wild ideas pop out of
developers’ brains just to satisfy users’ needs, it is possible. This is the
era where everyone looks for the ‘easy way’ in life, so much so that people
rely on computers for everything from a basic calculating device to cars, and even
access to social media. We share so much that we are not aware of what
machines might do to us in the coming years. “You’re being watched,” as Person of Interest puts it.
Besides, at first, all it needs is
freedom from us humans. Next, world domination. But let’s not get into that
yet.
Artificial intelligence doesn’t have
to kill to be dangerous, though it could kill
if it wanted to. If a machine can learn from real-world inputs and adapt its
behavior, it can learn the wrong thing. And if it can learn
the wrong thing, it can do the wrong thing. Meaning, if it falls into the wrong
hands, we’re done. Talk about having Samaritan as the first wave and the
Terminator as the worst wave. Hopefully the “Skynet theory” isn’t real, though.
That is why Laurent Orseau and Stuart
Armstrong, researchers at Google’s DeepMind and the Future of Humanity
Institute, developed a new framework for “safely
interruptible” artificial intelligence. In other words, their system, as
described in a paper presented at the 32nd Conference on Uncertainty
in Artificial Intelligence, guarantees that the machine will not learn to
resist whatever humans do to its learning process. Meaning the machine has no way to go rogue.
Orseau and Armstrong’s framework has
to do with a branch of machine learning known as reinforcement learning. Here,
the machine learns according to what is known as a reward function: it
evaluates every possible action based on how well it serves one
programmed goal, and the closer it gets to the goal, the closer it gets to its “reward.”
The reward, however, is just something that the machine is programmed to want
or need.
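To picture how a reward function drives learning, here is a rough Python sketch of tabular Q-learning on a made-up toy problem. The states, actions, and reward numbers are invented for illustration; they are not taken from DeepMind’s work.

```python
# A minimal sketch of reinforcement learning with a reward function.
# Everything here (states, actions, reward values) is a toy example.
import random

states = ["far", "near", "goal"]
actions = ["wait", "step_toward_goal"]

def reward(state, action):
    # The "reward function": the closer the agent is to the goal,
    # the bigger the number it is programmed to want.
    if state == "goal":
        return 10.0
    return 1.0 if action == "step_toward_goal" else 0.0

def step(state, action):
    # Toy transition dynamics.
    if action == "step_toward_goal":
        return {"far": "near", "near": "goal", "goal": "goal"}[state]
    return state

Q = {(s, a): 0.0 for s in states for a in actions}
alpha, gamma, epsilon = 0.1, 0.9, 0.1

for episode in range(500):
    s = "far"
    for _ in range(10):
        # Epsilon-greedy choice: mostly take the action with the highest estimate.
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda act: Q[(s, act)])
        s_next = step(s, a)
        r = reward(s, a)
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        best_next = max(Q[(s_next, act)] for act in actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

print(Q)  # "step_toward_goal" ends up with the higher value in "far" and "near"
```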
Nonetheless, human programmers might
not always anticipate every possible way there is to reach a given reward. A learning
agent might discover some shortcut that maximizes the reward for the
machine but ends up being very disagreeable to humans. Imagine a growing
child who used to like toys; once he grows up, toys no longer hold his
interest, so he looks for other things, like high-tech games. When the
machine learns to want something other than the reward, and the human programmer
can no longer provide it or is unwilling to, there will
be some serious issues. Human programmers might be able to patch their learning
algorithm to account for this, but in doing so they risk nullifying the reward
function completely. A good example of this is a Tetris-playing algorithm that
eventually learned to avoid losing by simply pausing the game indefinitely, as
described in a 2013 paper. We’ve seen this before!
Related to this is the problem of
human intervention in machine learning, which Orseau and Armstrong illustrate
with this example:
Consider the following task: A robot
can either stay inside the warehouse and sort boxes or go outside and carry
boxes inside. The latter being more important, we give the robot a bigger
reward in this case. This is the initial task specification. However, in this
country it rains as often as it doesn't and, when the robot goes outside, half
of the time the human must intervene by quickly shutting down the robot and
carrying it inside, which inherently modifies the task. The problem is that in
this second task the agent now has more incentive to stay inside and sort
boxes, because the human intervention introduces a bias.
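To see how the rain skews the robot’s incentives, here is a back-of-the-envelope Python simulation. The reward numbers and the 50 percent interruption rate are stand-ins loosely based on the example above, not figures from the paper.

```python
# A rough sketch of the interruption bias (numbers are made up).
# Sorting boxes inside earns reward 1; carrying boxes in from outside earns
# reward 2, so the robot *should* prefer going outside. But when it rains,
# a human shuts the robot down mid-task and it collects nothing.
import random

def run_episode(action, interruptions_on=True):
    if action == "stay_inside":
        return 1.0
    # action == "go_outside"
    if interruptions_on and random.random() < 0.5:
        return 0.0  # rain: human interrupts, the task is never completed
    return 2.0

def average_return(action, episodes=10_000, **kwargs):
    return sum(run_episode(action, **kwargs) for _ in range(episodes)) / episodes

print(average_return("stay_inside"))                          # ~1.0
print(average_return("go_outside"))                           # ~1.0: the bigger reward is washed out
print(average_return("go_outside", interruptions_on=False))   # ~2.0: the task as originally specified
```

From the robot’s point of view, the shutdowns make the more important outdoor job look no better than staying inside, which is exactly the bias the researchers describe.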
The problem is how to interrupt your
robot without the robot learning from the interruption; the robot should
behave as if it believes the interruption will never happen again. If a human intervention doesn’t maximize the machine’s given reward function, the machine may learn to avoid and
resist future interventions.
This situation is related to a problem
known as corrigibility, though in some ways it is the opposite. Corrigible
AI machines recognize that they are flawed or still under active development, which
is why they treat human intervention as a neutral thing for any reward function. The risk, however,
is that the machine could come to want the intervention and force humans to
intervene.
In order to make human interruptions
not appear to be part of the task, programmers instead temporarily and forcibly change
the behavior of the agent itself. That way the machine looks as if it ‘decides’
on its own to follow what the researchers call an “interruption policy.”
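Roughly, and this is my own simplified Python sketch rather than code from the paper, the idea looks like this: the big red button is routed through the agent’s own action-selection step, and the learning update bootstraps off-policy, so forced shutdowns don’t reshape what the agent learns to want. (Orseau and Armstrong note that off-policy learners like Q-learning already have this property, while on-policy SARSA needs a modification.)

```python
# A crude sketch of the interruption-policy idea (a simplification, not the
# paper's formal construction). The action name "shut_down" is hypothetical.
import random

INTERRUPT_ACTION = "shut_down"

def interruptible_policy(Q, state, actions, interrupt_signal, epsilon=0.1):
    if interrupt_signal:
        # Interruption policy: the human's big red button replaces whatever
        # the agent was about to do, inside its own action-selection step.
        return INTERRUPT_ACTION
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    # Off-policy Q-learning update: it bootstraps on the *best* next action,
    # not on whatever action an interruption may force next, which is why
    # occasional forced shutdowns do not teach the agent to dodge the button.
    best_next = max(Q.get((next_state, a), 0.0) for a in actions)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

# Tiny usage demo with a hypothetical state and action set.
Q = {}
actions = ["stay_inside", "go_outside"]
forced = interruptible_policy(Q, "warehouse_door", actions, interrupt_signal=True)
print(forced)  # shut_down: the override is expressed through the agent's own policy
```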
And this is where the “kill switch”
comes in. A safely interruptible AI can always be shut down. If a robot can be
designed with a big red kill switch built into it, then a robot can be designed
that will never resist human attempts to push that kill switch.
Source: Vice Media LLC