Category: Computing


Researchers take major step forward in Artificial Intelligence

By Hugo Angel,

The long-standing dream of using Artificial Intelligence (AI) to build an artificial brain has taken a significant step forward, as a team led by Professor Newton Howard from the University of Oxford has successfully prototyped a nanoscale, AI-powered, artificial brain in the form factor of a high-bandwidth neural implant.
Professor Newton Howard (pictured above and below) holding parts of the implant device
In collaboration with INTENT LTD, Qualcomm Corporation, Intel Corporation, Georgetown University and the Brain Sciences Foundation, Professor Howard’s Oxford Computational Neuroscience Lab in the Nuffield Department of Surgical Sciences has developed the proprietary algorithms and the optoelectronics required for the device. Rodent testing is on track to begin very soon.
This achievement caps over a decade of research by Professor Howard at MIT’s Synthetic Intelligence Lab and the University of Oxford, work that resulted in several issued US patents on the technologies and algorithms that power the device:
  • the Fundamental Code Unit of the Brain (FCU)
  • the Brain Code (BC)
  • the Biological Co-Processor (BCP)

Together, these are the latest advanced foundations for any eventual merger between biological and machine intelligence. Ni2o (pronounced “Nitoo”) is the entity that Professor Howard licensed to further develop, market and promote these technologies.

The Biological Co-Processor is unique in that it uses advanced nanotechnology, optogenetics and deep machine learning to intelligently map internal events, such as neural spiking activity, to external physiological, linguistic and behavioral expression. The implant contains over a million carbon nanotubes, each of which is 10,000 times smaller than the width of a human hair. Carbon nanotubes provide a natural, high-bandwidth interface as they conduct heat, light and electricity, instantaneously updating the neural laces. They adhere to neuronal constructs and even promote neural growth. Qualcomm team leader Rudy Beraha commented, ‘Although the prototype unit shown today is tethered to external power, a commercial Brain Co-Processor unit will be wireless and inductively powered, enabling it to be administered with a minimally invasive procedure.’
The device uses a combination of methods to write to the brain, including 
  • pulsed electricity
  • light and 
  • various molecules that stimulate or inhibit the activation of specific neuronal groups
These can be targeted to stimulate a desired response, such as releasing chemicals in patients suffering from a neurological disorder or imbalance. The BCP is designed as a fully integrated system to use the brain’s own internal systems and chemistries to pattern and mimic healthy brain behavior, an approach that stands in stark contrast to the current state of the art, which is to simply apply mild electrocution to problematic regions of the brain. 
Therapeutic uses
The Biological Co-Processor promises to provide relief for millions of patients suffering from neurological, psychiatric and psychological disorders as well as degenerative diseases. Initial therapeutic uses will likely be for patients with traumatic brain injuries and neurodegenerative disorders, such as Alzheimer’s, as the BCP will strengthen the weak, shortening connections responsible for lost memories and skills. Once implanted, the device provides a closed-loop, self-learning platform able to both determine and administer the perfect balance of pharmaceutical, electroceutical, genomeceutical and optoceutical therapies.
Dr Richard Wirt, a Senior Fellow at Intel Corporation and Co-Founder of INTENT, the company partnering with Ni2o to bring the BCP to market, commented on the device, saying, ‘In the immediate timeframe, this device will have many benefits for researchers, as it could be used to replicate an entire brain image, synchronously mapping internal and external expressions of human response. Over the long term, the potential therapeutic benefits are unlimited.’
Rather than simply disrupting neural circuits, the machine learning systems within the BCP are designed to interpret these signals and intelligently read and write to the surrounding neurons. These capabilities could be used to restore function lost to degenerative or trauma-induced damage and perhaps write the affected memories and skills to other, healthier areas of the brain.
One day, these capabilities could also be used in healthy patients to radically augment human ability and proactively improve health. As Professor Howard points out: ‘The brain controls all organs and systems in the body, so the cure to nearly every disease resides there.’ Speaking more broadly, Professor Howard sees the merging of man with machine as our inevitable destiny, claiming it to be ‘the next step on the blueprint that the author of it all built into our natural architecture.’
With the resurgence of neuroscience and AI enhancing machine learning, there has been renewed interest in brain implants. This past March, Elon Musk and Bryan Johnson independently announced that they are focusing on and investing in the brain/computer interface domain.
When asked about these new competitors, Professor Howard said he is happy to see all these new startups and established names getting into the field – he only wonders what took them so long, stating: ‘I would like to see us all working together, as we have already established a mathematical foundation and software framework to solve so many of the challenges they will be facing. We could all get there faster if we could work together – after all, the patient is the priority.’
© 2017 Nuffield Department of Surgical Sciences, John Radcliffe Hospital, Headington, Oxford, OX3 9DU
ORIGINAL: NDS Oxford
2 June 2017 

The future of AI is neuromorphic. Meet the scientists building digital ‘brains’ for your phone

By Hugo Angel,

Neuromorphic chips are being designed to specifically mimic the human brain – and they could soon replace CPUs
Image: brain activity map (Neuroscape Lab)
AI services like Apple’s Siri and others operate by sending your queries to faraway data centers, which send back responses. The reason they rely on cloud-based computing is that today’s electronics don’t come with enough computing power to run the processing-heavy algorithms needed for machine learning. The typical CPUs most smartphones use could never handle a system like Siri on the device. But Dr. Chris Eliasmith, a theoretical neuroscientist and co-CEO of Canadian AI startup Applied Brain Research, is confident that a new type of chip is about to change that.
“Many have suggested Moore’s law is ending and that means we won’t get ‘more compute’ cheaper using the same methods,” Eliasmith says. He’s betting on the proliferation of ‘neuromorphics’, a type of computer chip that is not yet widely known but already being developed by several major chip makers.
Traditional CPUs process instructions based on “clocked time” – information is transmitted at regular intervals, as if managed by a metronome. By packing in digital equivalents of neurons, neuromorphics communicate in parallel (and without the rigidity of clocked time) using “spikes” – bursts of electric current that can be sent whenever needed. Just like our own brains, the chip’s neurons communicate by processing incoming flows of electricity – each neuron able to determine from the incoming spike whether to send current out to the next neuron.
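The spike-based communication described above can be illustrated with a leaky integrate-and-fire neuron, a common textbook simplification. This is a minimal sketch and not the design of any particular chip; the function name `simulate_lif` and every parameter value are assumptions chosen for illustration.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Simulate one leaky integrate-and-fire neuron (illustrative values).

    input_current: incoming current, one value per time step.
    Returns the list of time steps at which the neuron emitted a spike.
    """
    potential = 0.0
    spike_times = []
    for step, current in enumerate(input_current):
        # The stored potential decays ("leaks") and integrates new input.
        potential = leak * potential + current
        # A spike is sent onward only when the threshold is crossed,
        # so the neuron stays silent when nothing interesting arrives.
        if potential >= threshold:
            spike_times.append(step)
            potential = reset
    return spike_times


weak_spikes = simulate_lif([0.05] * 20)    # input too weak to ever fire
strong_spikes = simulate_lif([0.4] * 20)   # input strong enough to fire
print(weak_spikes)
print(strong_spikes)
```

With the weak input the neuron never fires, and with the strong input it fires at regular intervals. This event-driven behavior is one reason such hardware can sit mostly idle and draw little power.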
What makes this a big deal is that these chips require far less power to process AI algorithms. For example, one neuromorphic chip made by IBM contains five times as many transistors as a standard Intel processor, yet consumes only 70 milliwatts of power. An Intel processor would use anywhere from 35 to 140 watts, or up to 2000 times more power.
Eliasmith points out that neuromorphics aren’t new and that their designs have been around since the 80s. Back then, however, the designs required specific algorithms be baked directly into the chip. That meant you’d need one chip for detecting motion, and a different one for detecting sound. None of the chips acted as a general processor in the way that our own cortex does.
This was partly because there had been no way for programmers to design algorithms that could do much with a general purpose chip. So even as these brain-like chips were being developed, building algorithms for them remained a challenge.
 
Eliasmith and his team are keenly focused on building tools that would allow a community of programmers to deploy AI algorithms on these new cortical chips.
Central to these efforts is Nengo, a compiler that developers can use to build their own algorithms for AI applications that will operate on general purpose neuromorphic hardware. Compilers are software tools that programmers use to translate the code they write into the complex instructions that get hardware to actually do something. What makes Nengo useful is its use of the familiar Python programming language – known for its intuitive syntax – and its ability to put the algorithms on many different hardware platforms, including neuromorphic chips. Pretty soon, anyone with an understanding of Python could be building sophisticated neural nets made for neuromorphic hardware.
“Things like vision systems, speech systems, motion control, and adaptive robotic controllers have already been built with Nengo,” Peter Suma, a trained computer scientist and the other CEO of Applied Brain Research, tells me.
Perhaps the most impressive system built using the compiler is Spaun, a project that in 2012 earned international praise for being the most complex brain model ever simulated on a computer. Spaun demonstrated that computers could be made to interact fluidly with the environment, and perform human-like cognitive tasks like recognizing images and controlling a robot arm that writes down what it sees. The machine wasn’t perfect, but it was a stunning demonstration that computers could one day blur the line between human and machine cognition. Recently, by using neuromorphics, most of Spaun has been run 9000x faster, using less energy than it would on conventional CPUs – and by the end of 2017, all of Spaun will be running on neuromorphic hardware.
Eliasmith won NSERC’s John C. Polanyi Award for that project, Canada’s highest recognition for a breakthrough scientific achievement, and once Suma came across the research, the pair joined forces to commercialize these tools.
“While Spaun shows us a way towards one day building fluidly intelligent reasoning systems, in the nearer term neuromorphics will enable many types of context aware AIs,” says Suma. Suma points out that while today’s AIs like Siri remain offline until explicitly called into action, we’ll soon have artificial agents that are ‘always on’ and ever-present in our lives.
“Imagine a Siri that listens to and sees all of your conversations and interactions. You’ll be able to ask it for things like ‘Who did I have that conversation with about doing the launch for our new product in Tokyo?’ or ‘What was that idea for my wife’s birthday gift that Melissa suggested?’,” he says.
When I raised concerns that some company might then have an uninterrupted window into even the most intimate parts of my life, I was reminded that because the AI would be processed locally on the device, there’s no need for that information to touch a server owned by a big company. And for Eliasmith, this ‘always on’ component is a necessary step towards true machine cognition. “The most fundamental difference between most available AI systems of today and the biological intelligent systems we are used to, is the fact that the latter always operate in real-time. Bodies and brains are built to work with the physics of the world,” he says.
Already, major efforts across the IT industry are heating up to get their AI services into the hands of users. Companies like Apple, Facebook, Amazon, and even Samsung, are developing conversational assistants they hope will one day become digital helpers.
ORIGINAL: Wired
Monday 6 March 2017

Google Unveils Neural Network with “Superhuman” Ability to Determine the Location of Almost Any Image

By Hugo Angel,

Guessing the location of a randomly chosen Street View image is hard, even for well-traveled humans. But Google’s latest artificial-intelligence machine manages it with relative ease.
Here’s a tricky task. Pick a photograph from the Web at random. Now try to work out where it was taken using only the image itself. If the image shows a famous building or landmark, such as the Eiffel Tower or Niagara Falls, the task is straightforward. But the job becomes significantly harder when the image lacks specific location cues or is taken indoors or shows a pet or food or some other detail.

Nevertheless, humans are surprisingly good at this task. To help, they bring to bear all kinds of knowledge about the world such as the type and language of signs on display, the types of vegetation, architectural styles, the direction of traffic, and so on. Humans spend a lifetime picking up these kinds of geolocation cues.

So it’s easy to think that machines would struggle with this task. And indeed, they have.

Today, that changes thanks to the work of Tobias Weyand, a computer vision specialist at Google, and a couple of pals. These guys have trained a deep-learning machine to work out the location of almost any photo using only the pixels it contains.

Their new machine significantly outperforms humans and can even use a clever trick to determine the location of indoor images and pictures of specific things such as pets, food, and so on that have no location cues.

Their approach is straightforward, at least in the world of machine learning.

  • Weyand and co begin by dividing the world into a grid consisting of over 26,000 squares of varying size that depend on the number of images taken in that location.
    So big cities, which are the subjects of many images, have a more fine-grained grid structure than more remote regions where photographs are less common. Indeed, the Google team ignored areas like oceans and the polar regions, where few photographs have been taken.
  • Next, the team created a database of geolocated images from the Web and used the location data to determine the grid square in which each image was taken. This data set is huge, consisting of 126 million images along with their accompanying Exif location data.
  • Weyand and co used 91 million of these images to teach a powerful neural network to work out the grid location using only the image itself. Their idea is to input an image into this neural net and get as the output a particular grid location or a set of likely candidates. 
  • They then validated the neural network using the remaining 34 million images in the data set.
  • Finally they tested the network—which they call PlaNet—in a number of different ways to see how well it works.
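
The first two steps above can be sketched as a recursive subdivision that keeps splitting a region while it holds too many photos and drops regions that hold none. This is only an illustration under stated assumptions: the article does not describe the exact partitioning scheme, so the quadtree split, the names `build_cells` and `cell_index`, and the thresholds are all hypothetical.

```python
def build_cells(photos, bounds=(-90.0, 90.0, -180.0, 180.0),
                max_photos=2, max_depth=8):
    """Split a lat/lng rectangle until each cell holds at most max_photos.

    photos: list of (lat, lng) tuples.
    Returns a list of cell bounds (lat_lo, lat_hi, lng_lo, lng_hi).
    Regions without photos are dropped, like oceans and polar areas.
    """
    lat_lo, lat_hi, lng_lo, lng_hi = bounds
    inside = [(lat, lng) for lat, lng in photos
              if lat_lo <= lat < lat_hi and lng_lo <= lng < lng_hi]
    if not inside:
        return []                      # no photos: ignore this region
    if len(inside) <= max_photos or max_depth == 0:
        return [bounds]                # sparse enough: keep as one cell
    lat_mid = (lat_lo + lat_hi) / 2
    lng_mid = (lng_lo + lng_hi) / 2
    cells = []
    for quadrant in ((lat_lo, lat_mid, lng_lo, lng_mid),
                     (lat_lo, lat_mid, lng_mid, lng_hi),
                     (lat_mid, lat_hi, lng_lo, lng_mid),
                     (lat_mid, lat_hi, lng_mid, lng_hi)):
        cells.extend(build_cells(inside, quadrant, max_photos, max_depth - 1))
    return cells


def cell_index(cells, lat, lng):
    """Return the index of the cell containing a point (the class label)."""
    for i, (lat_lo, lat_hi, lng_lo, lng_hi) in enumerate(cells):
        if lat_lo <= lat < lat_hi and lng_lo <= lng < lng_hi:
            return i
    return None                        # point lies in an ignored region


# Made-up photo locations: a dense cluster in Paris and one photo in Sydney.
photos = [(48.85, 2.35), (48.86, 2.29), (48.87, 2.33), (48.80, 2.40),
          (-33.87, 151.21)]
cells = build_cells(photos)
labels = [cell_index(cells, lat, lng) for lat, lng in photos]
print(len(cells), labels)
```

Each photo's cell index then serves as the label a classifier is trained to predict from pixels alone, which turns geolocation into an ordinary classification problem.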

The results make for interesting reading. To measure the accuracy of their machine, they fed it 2.3 million geotagged images from Flickr to see whether it could correctly determine their location. “PlaNet is able to localize 3.6 percent of the images at street-level accuracy and 10.1 percent at city-level accuracy,” say Weyand and co. What’s more, the machine determines the country of origin in a further 28.4 percent of the photos and the continent in 48.0 percent of them.

That’s pretty good. But to show just how good, Weyand and co put PlaNet through its paces in a test against 10 well-traveled humans. For the test, they used an online game that presents a player with a random view taken from Google Street View and asks him or her to pinpoint its location on a map of the world.

Anyone can play at www.geoguessr.com. Give it a try—it’s a lot of fun and more tricky than it sounds.

GeoGuessr screen capture example

Needless to say, PlaNet trounced the humans. “In total, PlaNet won 28 of the 50 rounds with a median localization error of 1131.7 km, while the median human localization error was 2320.75 km,” say Weyand and co. “[This] small-scale experiment shows that PlaNet reaches superhuman performance at the task of geolocating Street View scenes.”
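
A median localization error like the one quoted above can be computed from guessed and true coordinates. The sketch below assumes great-circle distance via the haversine formula; the article does not say how distances were measured, and the function names and sample coordinates are hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def median_error_km(guesses, truths):
    """Median distance between guessed and true (lat, lng) pairs."""
    return median(haversine_km(*g, *t) for g, t in zip(guesses, truths))


# Made-up rounds: (lat, lng) guesses against the true locations.
truths = [(48.8566, 2.3522), (40.7128, -74.0060), (-33.8688, 151.2093)]
guesses = [(51.5074, -0.1278), (40.7128, -74.0060), (-37.8136, 144.9631)]
print(round(median_error_km(guesses, truths), 1))
```

The median is a sensible summary here because a single wildly wrong guess, such as the wrong continent, would dominate a mean.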

An interesting question is how PlaNet performs so well without being able to use the cues that humans rely on, such as vegetation, architectural style, and so on. But Weyand and co say they know why: “We think PlaNet has an advantage over humans because it has seen many more places than any human can ever visit and has learned subtle cues of different scenes that are even hard for a well-traveled human to distinguish.”

They go further and use the machine to locate images that do not have location cues, such as those taken indoors or of specific items. This is possible when images are part of albums that have all been taken at the same place. The machine simply looks through other images in the album to work out where they were taken and assumes the more specific image was taken in the same place.
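
One simple way to picture the album idea is to pool the per-image location estimates and assign the pooled answer to every image in the album. This is a hedged sketch, not necessarily the mechanism the authors use: averaging probabilities is an assumption, and the name `locate_album` and the sample numbers are hypothetical.

```python
def locate_album(per_image_probs):
    """Combine per-image cell probabilities into one album-level guess.

    per_image_probs: list of dicts mapping cell id -> probability.
    Returns the cell id with the highest total (hence average) probability.
    """
    totals = {}
    for probs in per_image_probs:
        for cell, p in probs.items():
            totals[cell] = totals.get(cell, 0.0) + p
    return max(totals, key=totals.get)


# Made-up album: two outdoor shots with clear cues and one ambiguous
# indoor shot whose own best guess would be wrong.
album = [
    {"paris": 0.7, "london": 0.3},
    {"paris": 0.6, "rome": 0.4},
    {"london": 0.4, "paris": 0.35, "rome": 0.25},
]
print(locate_album(album))
```

Here the indoor photo on its own favors the wrong city, but the pooled evidence from the rest of the album overrides it.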

That’s impressive work that shows deep neural nets flexing their muscles once again. Perhaps more impressive still is that the model uses a relatively small amount of memory, unlike other approaches that use gigabytes of the stuff. “Our model uses only 377 MB, which even fits into the memory of a smartphone,” say Weyand and co.

That’s a tantalizing idea—the power of a superhuman neural network on a smartphone. It surely won’t be long now!

Ref: arxiv.org/abs/1602.05314 : PlaNet—Photo Geolocation with Convolutional Neural Networks

ORIGINAL: Technology Review
by Emerging Technology from the arXiv
February 24, 2016

JPMorgan Software Does in Seconds What Took Lawyers 360,000 Hours

By Hugo Angel,

  • New software does in seconds what took staff 360,000 hours
  • Bank seeking to streamline systems, avoid redundancies

At JPMorgan Chase & Co., a learning machine is parsing financial deals that once kept legal teams busy for thousands of hours.

The program, called COIN, for Contract Intelligence, does the mind-numbing job of interpreting commercial-loan agreements that, until the project went online in June, consumed 360,000 hours of work each year by lawyers and loan officers. The software reviews documents in seconds, is less error-prone and never asks for vacation.

Attendees discuss software on Feb. 27, the eve of JPMorgan’s Investor Day.
Photographer: Kholood Eid/Bloomberg

While the financial industry has long touted its technological innovations, a new era of automation is now in overdrive as cheap computing power converges with fears of losing customers to startups. Made possible by investments in machine learning and a new private cloud network, COIN is just the start for the biggest U.S. bank. The firm recently set up technology hubs for teams specializing in big data, robotics and cloud infrastructure to find new sources of revenue, while reducing expenses and risks.

The push to automate mundane tasks and create new tools for bankers and clients — a growing part of the firm’s $9.6 billion technology budget — is a core theme as the company hosts its annual investor day on Tuesday.

Behind the strategy, overseen by Chief Operating Officer Matt Zames and Chief Information Officer Dana Deasy, is an undercurrent of anxiety: Though JPMorgan emerged from the financial crisis as one of few big winners, its dominance is at risk unless it aggressively pursues new technologies, according to interviews with a half-dozen bank executives.


Redundant Software

That was the message Zames had for Deasy when he joined the firm from BP Plc in late 2013. The New York-based bank’s internal systems, an amalgam from decades of mergers, had too many redundant software programs that didn’t work together seamlessly.

“Matt said, ‘Remember one thing above all else: We absolutely need to be the leaders in technology across financial services,’” Deasy said last week in an interview. “Everything we’ve done from that day forward stems from that meeting.”

After visiting companies including Apple Inc. and Facebook Inc. three years ago to understand how their developers worked, the bank set out to create its own computing cloud called Gaia that went online last year. Machine learning and big-data efforts now reside on the private platform, which effectively has limitless capacity to support their thirst for processing power. The system already is helping the bank automate some coding activities and making its 20,000 developers more productive, saving money, Zames said. When needed, the firm can also tap into outside cloud services from Amazon.com Inc., Microsoft Corp. and International Business Machines Corp.

Tech Spending

JPMorgan will make some of its cloud-backed technology available to institutional clients later this year, allowing firms like BlackRock Inc. to access balances, research and trading tools. The move, which lets clients bypass salespeople and support staff for routine information, is similar to one Goldman Sachs Group Inc. announced in 2015.

JPMorgan’s total technology budget for this year amounts to 9 percent of its projected revenue, double the industry average, according to Morgan Stanley analyst Betsy Graseck. The dollar figure has inched higher as JPMorgan bolsters cyber defenses after a 2014 data breach, which exposed the information of 83 million customers.

“We have invested heavily in technology and marketing, and we are seeing strong returns,” JPMorgan said in a presentation Tuesday ahead of its investor day, noting that technology spending in its consumer bank totaled about $1 billion over the past two years.

Attendees inspect JPMorgan Markets software kiosk for Investors Day.
Photographer: Kholood Eid/Bloomberg

One-third of the company’s budget is for new initiatives, a figure Zames wants to take to 40 percent in a few years. He expects savings from automation and retiring old technology will let him plow even more money into new innovations.

Not all of those bets, which include several projects based on a distributed ledger, like blockchain, will pay off, which JPMorgan says is OK. One example executives are fond of mentioning: The firm built an electronic platform to help trade credit-default swaps that sits unused.

‘Can’t Wait’

“We’re willing to invest to stay ahead of the curve, even if in the final analysis some of that money will go to a product or a service that wasn’t needed,” Marianne Lake, the lender’s finance chief, told a conference audience in June. That’s “because we can’t wait to know what the outcome, the endgame, really looks like, because the environment is moving so fast.”

As for COIN, the program has helped JPMorgan cut down on loan-servicing mistakes, most of which stemmed from human error in interpreting 12,000 new wholesale contracts per year, according to its designers.

JPMorgan is scouring for more ways to deploy the technology, which learns by ingesting data to identify patterns and relationships. The bank plans to use it for other types of complex legal filings like credit-default swaps and custody agreements. Someday, the firm may use it to help interpret regulations and analyze corporate communications.

Another program called X-Connect, which went into use in January, examines e-mails to help employees find colleagues who have the closest relationships with potential prospects and can arrange introductions.

Creating Bots
For simpler tasks, the bank has created bots to perform functions like granting access to software systems and responding to IT requests, such as resetting an employee’s password, Zames said. Bots are expected to handle 1.7 million access requests this year, doing the work of 140 people.

Matt Zames
Photographer: Kholood Eid/Bloomberg

While growing numbers of people in the industry worry such advancements might someday take their jobs, many Wall Street personnel are more focused on benefits. A survey of more than 3,200 financial professionals by recruiting firm Options Group last year found a majority expect new technology will improve their careers, for example by improving workplace performance.

“Anything where you have back-office operations and humans kind of moving information from point A to point B that’s not automated is ripe for that,” Deasy said. “People always talk about this stuff as displacement. I talk about it as freeing people to work on higher-value things, which is why it’s such a terrific opportunity for the firm.”

To help spur internal disruption, the company keeps tabs on 2,000 technology ventures, using about 100 in pilot programs that will eventually join the firm’s growing ecosystem of partners. For instance, the bank’s machine-learning software was built with Cloudera Inc., a software firm that JPMorgan first encountered in 2009.

“We’re starting to see the real fruits of our labor,” Zames said. “This is not pie-in-the-sky stuff.”

ORIGINAL: Bloomberg
by Hugh Son
February 27, 2017

End-to-End Deep Learning for Self-Driving Cars

By Hugo Angel,

In a new automotive application, we have used convolutional neural networks (CNNs) to map the raw pixels from a front-facing camera to the steering commands for a self-driving car. This powerful end-to-end approach means that with minimum training data from humans, the system learns to steer, with or without lane markings, on both local roads and highways. The system can also operate in areas with unclear visual guidance such as parking lots or unpaved roads.
Figure 1: NVIDIA’s self-driving car in action.
We designed the end-to-end learning system using an NVIDIA DevBox running Torch 7 for training. An NVIDIA DRIVE™ PX self-driving car computer, also with Torch 7, was used to determine where to drive while operating at 30 frames per second (FPS). The system is trained to automatically learn the internal representations of necessary processing steps, such as detecting useful road features, with only the human steering angle as the training signal. We never explicitly trained it to detect, for example, the outline of roads. In contrast to methods using explicit decomposition of the problem, such as lane marking detection, path planning, and control, our end-to-end system optimizes all processing steps simultaneously.
We believe that end-to-end learning leads to better performance and smaller systems. Better performance results because the internal components self-optimize to maximize overall system performance, instead of optimizing human-selected intermediate criteria, e.g., lane detection. Such criteria understandably are selected for ease of human interpretation, which doesn’t automatically guarantee maximum system performance. Smaller networks are possible because the system learns to solve the problem with the minimal number of processing steps.
This blog post is based on the NVIDIA paper End to End Learning for Self-Driving Cars. Please see the original paper for full details.
Convolutional Neural Networks to Process Visual Data
CNNs[1] have revolutionized the computational pattern recognition process[2]. Prior to the widespread adoption of CNNs, most pattern recognition tasks were performed using an initial stage of hand-crafted feature extraction followed by a classifier. The important breakthrough of CNNs is that features are now learned automatically from training examples. The CNN approach is especially powerful when applied to image recognition tasks because the convolution operation captures the 2D nature of images. By using the convolution kernels to scan an entire image, relatively few parameters need to be learned compared to the total number of operations.
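
The point about few parameters relative to many operations can be made concrete with a plain-Python sliding-window convolution (strictly a cross-correlation, which is what CNN libraries usually call convolution). This is an illustrative sketch only: the function name `conv2d_valid`, the tiny image, and the hand-written edge kernel are assumptions, whereas a real CNN learns its kernel weights from data.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # The same kernel weights are reused at every image position.
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# A 6x6 image with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# A vertical-edge detector written by hand for the example.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d_valid(image, kernel)
n_params = len(kernel) * len(kernel[0])
n_multiplies = n_params * len(feature_map) * len(feature_map[0])
print(feature_map[0])          # [0.0, 3.0, 3.0, 0.0]
print(n_params, n_multiplies)  # 9 144
```

Nine weights drive 144 multiplications even on this tiny image, and the gap widens as the image grows because the parameter count stays fixed.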

While CNNs with learned features have been used commercially for over twenty years [3], their adoption has exploded in recent years because of two important developments.

  • First, large, labeled data sets such as the ImageNet Large Scale Visual Recognition Challenge (ILSVRC)[4] are now widely available for training and validation.
  • Second, CNN learning algorithms are now implemented on massively parallel graphics processing units (GPUs), tremendously accelerating learning and inference ability.
The CNNs that we describe here go beyond basic pattern recognition. We developed a system that learns the entire processing pipeline needed to steer an automobile. The groundwork for this project was actually done over 10 years ago in a Defense Advanced Research Projects Agency (DARPA) seedling project known as DARPA Autonomous Vehicle (DAVE)[5], in which a sub-scale radio control (RC) car drove through a junk-filled alley way. DAVE was trained on hours of human driving in similar, but not identical, environments. The training data included video from two cameras and the steering commands sent by a human operator.
In many ways, DAVE was inspired by the pioneering work of Pomerleau[6], who in 1989 built the Autonomous Land Vehicle in a Neural Network (ALVINN) system. ALVINN is a precursor to DAVE, and it provided the initial proof of concept that an end-to-end trained neural network might one day be capable of steering a car on public roads. DAVE demonstrated the potential of end-to-end learning, and indeed was used to justify starting the DARPA Learning Applied to Ground Robots (LAGR) program[7], but DAVE’s performance was not sufficiently reliable to provide a full alternative to the more modular approaches to off-road driving. (DAVE’s mean distance between crashes was about 20 meters in complex environments.)

About a year ago we started a new effort to improve on the original DAVE, and create a robust system for driving on public roads. The primary motivation for this work is to avoid the need to recognize specific human-designated features, such as lane markings, guard rails, or other cars, and to avoid having to create a collection of “if, then, else” rules, based on observation of these features. We are excited to share the preliminary results of this new effort, which is aptly named: DAVE–2.

The DAVE-2 System
Figure 2: High-level view of the data collection system.

Figure 2 shows a simplified block diagram of the collection system for training data of DAVE-2. Three cameras are mounted behind the windshield of the data-acquisition car, and timestamped video from the cameras is captured simultaneously with the steering angle applied by the human driver. The steering command is obtained by tapping into the vehicle’s Controller Area Network (CAN) bus. In order to make our system independent of the car geometry, we represent the steering command as 1/r, where r is the turning radius in meters. We use 1/r instead of r to prevent a singularity when driving straight (the turning radius for driving straight is infinity). 1/r smoothly transitions through zero from left turns (negative values) to right turns (positive values).
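
The benefit of using 1/r can be seen in a few lines. In this sketch the function name `steering_command` is hypothetical, and encoding left turns as a negative radius is an assumption made so that the sign convention in the text falls out directly.

```python
import math


def steering_command(turning_radius_m):
    """Return the inverse turning radius 1/r.

    Assumed convention: negative radius = left turn, positive = right turn,
    and an infinite radius means driving straight.
    """
    return 1.0 / turning_radius_m


straight = steering_command(math.inf)    # 0.0, no singularity
gentle_right = steering_command(200.0)   # small positive value
sharp_left = steering_command(-20.0)     # larger negative value
print(straight, gentle_right, sharp_left)
```

As a turn straightens out, the radius grows without bound while 1/r simply approaches zero, so a network can regress a bounded, continuous target.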

Training data contains single images sampled from the video, paired with the corresponding steering command (1/r). Training with data from only the human driver is not sufficient; the network must also learn how to recover from any mistakes, or the car will slowly drift off the road. The training data is therefore augmented with additional images that show the car in different shifts from the center of the lane and rotations from the direction of the road.
The images for two specific off-center shifts can be obtained from the left and the right cameras. Additional shifts between the cameras and all rotations are simulated through viewpoint transformation of the image from the nearest camera. Precise viewpoint transformation requires 3D scene knowledge which we don’t have, so we approximate the transformation by assuming all points below the horizon are on flat ground, and all points above the horizon are infinitely far away. This works fine for flat terrain, but for a more complete rendering it introduces distortions for objects that stick above the ground, such as cars, poles, trees, and buildings. Fortunately these distortions don’t pose a significant problem for network training. The steering label for the transformed images is quickly adjusted to one that correctly steers the vehicle back to the desired location and orientation in two seconds.
Figure 3: Training the neural network.

Figure 3 shows a block diagram of our training system. Images are fed into a CNN that then computes a proposed steering command. The proposed command is compared to the desired command for that image, and the weights of the CNN are adjusted to bring the CNN output closer to the desired output. The weight adjustment is accomplished using back propagation as implemented in the Torch 7 machine learning package.

Once trained, the network is able to generate steering commands from the video images of a single center camera. Figure 4 shows this configuration.

Figure 4: The trained network is used to generate steering commands from a single front-facing center camera.

Data Collection

Training data was collected by driving on a wide variety of roads and in a diverse set of lighting and weather conditions. We gathered surface street data in central New Jersey and highway data from Illinois, Michigan, Pennsylvania, and New York. Other road types include two-lane roads (with and without lane markings), residential roads with parked cars, tunnels, and unpaved roads. Data was collected in clear, cloudy, foggy, snowy, and rainy weather, both day and night. In some instances, the sun was low in the sky, resulting in glare reflecting from the road surface and scattering from the windshield.
The data was acquired using either our drive-by-wire test vehicle, which is a 2016 Lincoln MKZ, or using a 2013 Ford Focus with cameras placed in similar positions to those in the Lincoln. Our system has no dependencies on any particular vehicle make or model. Drivers were encouraged to maintain full attentiveness, but otherwise drive as they usually do. As of March 28, 2016, about 72 hours of driving data was collected.
Network Architecture
Figure 5: CNN architecture. The network has about 27 million connections and 250 thousand parameters.

We train the weights of our network to minimize the mean-squared error between the steering command output by the network, and either the command of the human driver or the adjusted steering command for off-center and rotated images (see “Augmentation”, later). Figure 5 shows the network architecture, which consists of 9 layers, including a normalization layer, 5 convolutional layers, and 3 fully connected layers. The input image is split into YUV planes and passed to the network.

The first layer of the network performs image normalization. The normalizer is hard-coded and is not adjusted in the learning process. Performing normalization in the network allows the normalization scheme to be altered with the network architecture, and to be accelerated via GPU processing.
The convolutional layers are designed to perform feature extraction, and are chosen empirically through a series of experiments that vary layer configurations. We then use strided convolutions in the first three convolutional layers with a 2×2 stride and a 5×5 kernel, and a non-strided convolution with a 3×3 kernel size in the final two convolutional layers.
We follow the five convolutional layers with three fully connected layers, leading to a final output control value which is the inverse-turning-radius. The fully connected layers are designed to function as a controller for steering, but we noted that by training the system end-to-end, it is not possible to make a clean break between which parts of the network function primarily as feature extractor, and which serve as controller.
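To make the layer description concrete, the sketch below computes how the spatial size of the feature maps shrinks through the five convolutional layers. It assumes unpadded convolutions and a 66×200 input image; neither detail is stated in this post, so treat both as assumptions for illustration only.

```python
def conv_output_size(size: int, kernel: int, stride: int) -> int:
    """Output length along one axis for a convolution without padding."""
    return (size - kernel) // stride + 1


# (kernel, stride) for each convolutional layer, as described in the text:
# three 5x5 layers with stride 2, then two 3x3 layers with stride 1.
CONV_LAYERS = [(5, 2), (5, 2), (5, 2), (3, 1), (3, 1)]


def feature_map_sizes(height: int, width: int) -> list[tuple[int, int]]:
    """Return the (height, width) of the feature maps after each conv layer."""
    sizes = []
    for kernel, stride in CONV_LAYERS:
        height = conv_output_size(height, kernel, stride)
        width = conv_output_size(width, kernel, stride)
        sizes.append((height, width))
    return sizes


# ASSUMPTION: 66x200 input, not stated in this post.
print(feature_map_sizes(66, 200))
# → [(31, 98), (14, 47), (5, 22), (3, 20), (1, 18)]
```

Under these assumptions the strided layers do most of the downsampling, and the last convolutional layer leaves a feature map that is one row high before the fully connected layers take over.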
Training Details
DATA SELECTION
The first step to training a neural network is selecting the frames to use. Our collected data is labeled with road type, weather condition, and the driver’s activity (staying in a lane, switching lanes, turning, and so forth). To train a CNN to do lane following, we simply select data where the driver is staying in a lane, and discard the rest. We then sample that video at 10 FPS because a higher sampling rate would include images that are highly similar, and thus not provide much additional useful information. To remove a bias towards driving straight the training data includes a higher proportion of frames that represent road curves.
AUGMENTATION
After selecting the final set of frames, we augment the data by adding artificial shifts and rotations to teach the network how to recover from a poor position or orientation. The magnitude of these perturbations is chosen randomly from a normal distribution. The distribution has zero mean, and the standard deviation is twice the standard deviation that we measured with human drivers. Artificially augmenting the data does add undesirable artifacts as the magnitude increases (as mentioned previously).
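The sampling rule described above can be sketched as follows. The function name, the units, and the example standard deviations are hypothetical; the post gives no measured values for human drivers, and the image transformation itself is not shown here.

```python
import random


def sample_perturbations(human_shift_std_m, human_rotation_std_deg, count, seed=0):
    """Draw (shift, rotation) pairs for augmentation.

    Each magnitude comes from a zero-mean normal distribution whose standard
    deviation is twice the one measured for human drivers, as in the text.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shift_std = 2.0 * human_shift_std_m
    rotation_std = 2.0 * human_rotation_std_deg
    return [
        (rng.gauss(0.0, shift_std), rng.gauss(0.0, rotation_std))
        for _ in range(count)
    ]


# HYPOTHETICAL human-driver statistics: 0.2 m lateral and 1.0 degree heading.
for shift_m, rotation_deg in sample_perturbations(0.2, 1.0, count=3):
    print(f"shift {shift_m:+.2f} m, rotation {rotation_deg:+.2f} deg")
```

Doubling the standard deviation means the network sees recovery situations that are more extreme than those a human driver would normally produce.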
Simulation
Before road-testing a trained CNN, we first evaluate the network’s performance in simulation. Figure 6 shows a simplified block diagram of the simulation system, and Figure 7 shows a screenshot of the simulator in interactive mode.
Figure 6: Block-diagram of the drive simulator.

The simulator takes prerecorded videos from a forward-facing on-board camera connected to a human-driven data-collection vehicle, and generates images that approximate what would appear if the CNN were instead steering the vehicle. These test videos are time-synchronized with the recorded steering commands generated by the human driver.

Since human drivers don’t drive in the center of the lane all the time, we must manually calibrate the lane’s center as it is associated with each frame in the video used by the simulator. We call this position the “ground truth”.
The simulator transforms the original images to account for departures from the ground truth. Note that this transformation also includes any discrepancy between the human driven path and the ground truth. The transformation is accomplished by the same methods as described previously.

The simulator accesses the recorded test video along with the synchronized steering commands that occurred when the video was captured. The simulator sends the first frame of the chosen test video, adjusted for any departures from the ground truth, to the input of the trained CNN, which then returns a steering command for that frame. The CNN steering commands as well as the recorded human-driver commands are fed into the dynamic model [7] of the vehicle to update the position and orientation of the simulated vehicle.

Figure 7: Screenshot of the simulator in interactive mode. See text for explanation of the performance metrics. The green area on the left is unknown because of the viewpoint transformation. The highlighted wide rectangle below the horizon is the area which is sent to the CNN.

The simulator then modifies the next frame in the test video so that the image appears as if the vehicle were at the position that resulted by following steering commands from the CNN. This new image is then fed to the CNN and the process repeats.

The simulator records the off-center distance (distance from the car to the lane center), the yaw, and the distance traveled by the virtual car. When the off-center distance exceeds one meter, a virtual human intervention is triggered, and the virtual vehicle position and orientation is reset to match the ground truth of the corresponding frame of the original test video.
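The intervention rule can be illustrated with a deliberately simplified sketch. Instead of a vehicle dynamics model and transformed camera frames, it takes a hypothetical list of per-frame changes in off-center distance, accumulates them, and resets the offset to the lane center whenever the one-meter threshold is exceeded.

```python
def simulate_interventions(lateral_steps_m, threshold_m=1.0):
    """Count virtual human interventions for a sequence of lateral drifts.

    lateral_steps_m: hypothetical per-frame change in off-center distance,
    in meters. This stands in for the simulator's real vehicle model.
    """
    offset_m = 0.0
    interventions = 0
    for step in lateral_steps_m:
        offset_m += step
        if abs(offset_m) > threshold_m:
            interventions += 1
            offset_m = 0.0  # reset to the ground-truth lane center
    return interventions


# A steady drift of 0.4 m per frame crosses the threshold on every third frame.
print(simulate_interventions([0.4] * 6))  # → 2
```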
Evaluation
We evaluate our networks in two steps: first in simulation, and then in on-road tests.
In simulation we have the networks provide steering commands in our simulator to an ensemble of prerecorded test routes that correspond to about a total of three hours and 100 miles of driving in Monmouth County, NJ. The test data was taken in diverse lighting and weather conditions and includes highways, local roads, and residential streets.
We estimate what percentage of the time the network could drive the car (autonomy) by counting the simulated human interventions that occur when the simulated vehicle departs from the center line by more than one meter. We assume that in real life an actual intervention would require a total of six seconds: this is the time required for a human to retake control of the vehicle, re-center it, and then restart the self-steering mode. We calculate the percentage autonomy by counting the number of interventions, multiplying by 6 seconds, dividing by the elapsed time of the simulated test, and then subtracting the result from 1:
$$\text{autonomy} = \left(1 - \frac{(\text{number of interventions}) \cdot 6\ \text{seconds}}{\text{elapsed time [seconds]}}\right) \cdot 100$$
Thus, if we had 10 interventions in 600 seconds, we would have an autonomy value of
$$\left(1 - \frac{10 \cdot 6}{600}\right) \cdot 100 = 90\%$$
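The autonomy calculation described above is easy to check in code. This is a minimal Python sketch; the function and parameter names are chosen for this example and are not taken from the original system.

```python
def autonomy_percent(interventions, elapsed_s, seconds_per_intervention=6.0):
    """Percentage of time the network steers, per the definition in the text.

    Each intervention is charged a fixed penalty (six seconds by default),
    and the penalized fraction of the elapsed time is subtracted from 1.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    penalty_s = interventions * seconds_per_intervention
    return (1.0 - penalty_s / elapsed_s) * 100.0


# The worked example from the text: 10 interventions in 600 seconds.
print(round(autonomy_percent(10, 600), 1))  # → 90.0
```

Note that the metric can go negative if interventions are frequent enough that the penalty exceeds the elapsed time, so it is best read as a score rather than a literal fraction of time.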
ON-ROAD TESTS
After a trained network has demonstrated good performance in the simulator, the network is loaded on the DRIVE PX in our test car and taken out for a road test. For these tests we measure performance as the fraction of time during which the car performs autonomous steering. This time excludes lane changes and turns from one road to another. For a typical drive in Monmouth County NJ from our office in Holmdel to Atlantic Highlands, we are autonomous approximately 98% of the time. We also drove 10 miles on the Garden State Parkway (a multi-lane divided highway with on and off ramps) with zero intercepts.
Here is a video of our test car driving in diverse conditions.

Visualization of Internal CNN State
Figure 8: How the CNN “sees” an unpaved road. Top: subset of the camera image sent to the CNN. Bottom left: Activation of the first layer feature maps. Bottom right: Activation of the second layer feature maps. This demonstrates that the CNN learned to detect useful road features on its own, i. e., with only the human steering angle as training signal. We never explicitly trained it to detect the outlines of roads.
Figures 8 and 9 show the activations of the first two feature map layers for two different example inputs, an unpaved road and a forest. In case of the unpaved road, the feature map activations clearly show the outline of the road while in case of the forest the feature maps contain mostly noise, i. e., the CNN finds no useful information in this image.
This demonstrates that the CNN learned to detect useful road features on its own, i. e., with only the human steering angle as training signal. We never explicitly trained it to detect the outlines of roads, for example.
Figure 9: Example image with no road. The activations of the first two feature maps appear to contain mostly noise, i. e., the CNN doesn’t recognize any useful features in this image.
Conclusions
We have empirically demonstrated that CNNs are able to learn the entire task of lane and road following without manual decomposition into road or lane marking detection, semantic abstraction, path planning, and control. A small amount of training data from less than a hundred hours of driving was sufficient to train the car to operate in diverse conditions, on highways, local and residential roads in sunny, cloudy, and rainy conditions.
  • The CNN is able to learn meaningful road features from a very sparse training signal (steering alone).
  • The system learns for example to detect the outline of a road without the need of explicit labels during training.
  • More work is needed to improve the robustness of the network, to find methods to verify the robustness, and to improve visualization of the network-internal processing steps.
For full details please see the paper that this blog post is based on, and please contact us if you would like to learn more about NVIDIA’s autonomous vehicle platform!
REFERENCES
  1. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4):541–551, Winter 1989. URL: http://yann.lecun.org/exdb/publis/pdf/lecun-89e.pdf.
  2. Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012. URL: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
  3. L. D. Jackel, D. Sharman, C. E. Stenard, B. I. Strom, and D. Zuckert. Optical character recognition for self-service banking. AT&T Technical Journal, 74(1):16–24, 1995.
  4. Large scale visual recognition challenge (ILSVRC). URL: http://www.image-net.org/challenges/LSVRC/.
  5. Net-Scale Technologies, Inc. Autonomous off-road vehicle control using end-to-end learning, July 2004. Final technical report. URL: http://net-scale.com/doc/net-scale-dave-report.pdf.
  6. Dean A. Pomerleau. ALVINN, an autonomous land vehicle in a neural network. Technical report, Carnegie Mellon University, 1989. URL: http://repository.cmu.edu/cgi/viewcontent.cgi?article=2874&context=compsci.
  7. Danwei Wang and Feng Qi. Trajectory planning for a four-wheel-steering vehicle. In Proceedings of the 2001 IEEE International Conference on Robotics & Automation, May 21–26 2001. URL: http://www.ntu.edu.sg/home/edwwang/confpapers/wdwicar01.pdf.
ORIGINAL: NVidia

An international team of scientists has come up with a blueprint for a large-scale quantum computer

By Hugo Angel,

‘It is the Holy Grail of science … we will be able to do certain things we could never even dream of before’
Courtesy Professor Winfried Hensinger
Quantum computing breakthrough could help ‘change life completely’, say scientists
Scientists claim to have produced the first-ever blueprint for a large-scale quantum computer in a development that could bring about a technological revolution on a par with the invention of computing itself.
Until now quantum computers have had just a fraction of the processing power they are theoretically capable of producing.
But an international team of researchers believe they have finally overcome the main technical problems that have prevented the construction of more powerful machines.
They are currently building a prototype and a full-scale quantum computer – many millions of times faster than the best currently available – could be built in about a decade.
Such devices work by utilising the almost magical properties found in the world of the very small, where an atom can apparently exist in two different places at the same time.
Professor Winfried Hensinger, head of the Ion Quantum Technology Group at Sussex University, who has been leading this research, told The Independent: “It is the Holy Grail of science, really, to build a quantum computer.
“And we are now publishing the actual nuts-and-bolts construction plan for a large-scale quantum computer.”
It is thought the astonishing processing power unleashed by quantum mechanics will lead to new, life-saving medicines, help solve the most intractable scientific problems, and probe the mysteries of the universe.
“Life will change completely. We will be able to do certain things we could never even dream of before,” Professor Hensinger said.
“You can imagine that suddenly the sky is the limit.
“This is really, really exciting … it’s probably one of the most exciting times to be in this field.”
He said small quantum computers had been built in the past but to test the theories.
“This is not an academic study any more, it really is all the engineering required to build such a device,” he said.
“Nobody has really gone ahead and drafted a full engineering plan of how you build one.
“Many people questioned, because this is so hard to make this happen, that it can even be built.
“We show that not only can it be built, but we provide a whole detailed plan on how to make it happen.”
The problem is that existing quantum computers require lasers focused precisely on individual atoms. The larger the computer, the more lasers are required and the greater the chance of something going wrong.
But Professor Hensinger and colleagues used a different technique to monitor the atoms involving a microwave field and electricity in an ‘ion-trap’ device.

“What we have is a solution that we can scale to arbitrary [computing] power,” he said.

Fig. 2. Gradient wires placed underneath each gate zone and embedded silicon photodetector.
(A) Illustration showing an isometric view of the two main gradient wires placed underneath each gate zone. Short wires are placed locally underneath each gate zone to form coils, which compensate for slowly varying magnetic fields and allow for individual addressing. The wire configuration in each zone can be seen in more detail in the inset.
(B) Silicon photodetector (marked green) embedded in the silicon substrate, transparent center segmented electrodes, and the possible detection angle are shown. VIA structures are used to prevent optical cross-talk from neighboring readout zones.
Source: Science Journals — AAAS. Blueprint for a microwave trapped ion quantum computer. Lekitsch et al. Sci. Adv. 2017;3: e1601540 1 February 2017
Fig. 4. Scalable module illustration. One module consisting of 36 × 36 junctions placed on the supporting steel frame structure: Nine wafers containing the required DACs and control electronics are placed between the wafer holding 36 × 36 junctions and the microchannel cooler (red layer) providing the cooling. X-Y-Z piezo actuators are placed in the four corners on top of the steel frame, allowing for accurate alignment of the module. Flexible electric wires supply voltages, currents, and control signals to the DACs and control electronics, such as field-programmable gate arrays (FPGAs). Coolant is supplied to the microchannel cooler layer via two flexible steel tubes placed in the center of the modules.
Source: Science Journals — AAAS. Blueprint for a microwave trapped ion quantum computer. Lekitsch et al. Sci. Adv. 2017;3: e1601540 1 February 2017
Fig. 5. Illustration of vacuum chambers. Schematic of octagonal UHV chambers connected together; each chamber is 4.5 × 4.5 m² large and can hold >2.2 million individual X-junctions placed on steel frames.
Source: Science Journals — AAAS. Blueprint for a microwave trapped ion quantum computer. Lekitsch et al. Sci. Adv. 2017;3: e1601540 1 February 2017

“We are already building it now. Within two years we think we will have completed a prototype which incorporates all the technology we state in this blueprint.

“At the same time we are now looking for industry partner so we can really build a large-scale device that fills a building basically.
“It’s extraordinarily expensive so we need industry partners … this will be in the 10s of millions, up to £100m.”
Commenting on the research, described in a paper in the journal Science Advances, other academics praised the quality of the work but expressed caution about how quickly it could be developed.
Dr Toby Cubitt, a Royal Society research fellow in quantum information theory at University College London, said: “Many different technologies are competing to build the first large-scale quantum computer. Ion traps were one of the earliest realistic proposals. 
“This work is an important step towards scaling up ion-trap quantum computing.
“Though there’s still a long way to go before you’ll be making spreadsheets on your quantum computer.”
And Professor Alan Woodward, of Surrey University, hailed the “tremendous step in the right direction”.
“It is great work,” he said. “They have made some significant strides forward.”

But he added it was “too soon to say” whether it would lead to the hoped-for technological revolution.

ORIGINAL: The Independent
Ian Johnston Science Correspondent
Thursday 2 February 2017

The Great A.I. Awakening

By Hugo Angel,

How Google used artificial intelligence to transform Google Translate, one of its more popular services, and how machine learning is poised to reinvent computing itself.
Credit Illustration by Pablo Delcan

Prologue: You Are What You Have Read

Late one Friday night in early November, Jun Rekimoto, a distinguished professor of human-computer interaction at the University of Tokyo, was online preparing for a lecture when he began to notice some peculiar posts rolling in on social media. Apparently Google Translate, the company’s popular machine-translation service, had suddenly and almost immeasurably improved. Rekimoto visited Translate himself and began to experiment with it. He was astonished. He had to go to sleep, but Translate refused to relax its grip on his imagination.

Rekimoto wrote up his initial findings in a blog post.
First, he compared a few sentences from two published versions of “The Great Gatsby,” Takashi Nozaki’s 1957 translation and Haruki Murakami’s more recent iteration, with what this new Google Translate was able to produce. Murakami’s translation is written “in very polished Japanese,” Rekimoto explained to me later via email, but the prose is distinctively “Murakami-style.” By contrast, Google’s translation, despite some “small unnaturalness,” reads to him as “more transparent.”
The second half of Rekimoto’s post examined the service in the other direction, from Japanese to English. He dashed off his own Japanese interpretation of the opening to Hemingway’s “The Snows of Kilimanjaro,” then ran that passage back through Google into English. He published this version alongside Hemingway’s original, and proceeded to invite his readers to guess which was the work of a machine.
NO. 1:
Kilimanjaro is a snow-covered mountain 19,710 feet high, and is said to be the highest mountain in Africa. Its western summit is called the Masai “Ngaje Ngai,” the House of God. Close to the western summit there is the dried and frozen carcass of a leopard. No one has explained what the leopard was seeking at that altitude.
NO. 2:
Kilimanjaro is a mountain of 19,710 feet covered with snow and is said to be the highest mountain in Africa. The summit of the west is called “Ngaje Ngai” in Masai, the house of God. Near the top of the west there is a dry and frozen dead body of leopard. No one has ever explained what leopard wanted at that altitude.
Even to a native English speaker, the missing article on the leopard is the only real giveaway that No. 2 was the output of an automaton. Their closeness was a source of wonder to Rekimoto, who was well acquainted with the capabilities of the previous service. Only 24 hours earlier, Google would have translated the same Japanese passage as follows:
Kilimanjaro is 19,710 feet of the mountain covered with snow, and it is said that the highest mountain in Africa. Top of the west, “Ngaje Ngai” in the Maasai language, has been referred to as the house of God. The top close to the west, there is a dry, frozen carcass of a leopard. Whether the leopard had what the demand at that altitude, there is no that nobody explained.
Rekimoto promoted his discovery to his hundred thousand or so followers on Twitter, and over the next few hours thousands of people broadcast their own experiments with the machine-translation service. Some were successful, others meant mostly for comic effect. As dawn broke over Tokyo, Google Translate was the No. 1 trend on Japanese Twitter, just above some cult anime series and the long-awaited new single from a girl-idol supergroup. Everybody wondered: How had Google Translate become so uncannily artful?
Four days later, a couple of hundred journalists, entrepreneurs and advertisers from all over the world gathered in Google’s London engineering office for a special announcement. Guests were greeted with Translate-branded fortune cookies. Their paper slips had a foreign phrase on one side — mine was in Norwegian — and on the other, an invitation to download the Translate app. Tables were set with trays of doughnuts and smoothies, each labeled with a placard that advertised its flavor in German (zitrone), Portuguese (baunilha) or Spanish (manzana). After a while, everyone was ushered into a plush, dark theater.
Sundar Pichai, chief executive of Google, outside his office in Mountain View, Calif. Credit Brian Finke for The New York Times
Sadiq Khan, the mayor of London, stood to make a few opening remarks. A friend, he began, had recently told him he reminded him of Google. “Why, because I know all the answers?” the mayor asked. “No,” the friend replied, “because you’re always trying to finish my sentences.” The crowd tittered politely. Khan concluded by introducing Google’s chief executive, Sundar Pichai, who took the stage.
Pichai was in London in part to inaugurate Google’s new building there, the cornerstone of a new “knowledge quarter” under construction at King’s Cross, and in part to unveil the completion of the initial phase of a company transformation he announced last year. The Google of the future, Pichai had said on several occasions, was going to be “A.I. first.” What that meant in theory was complicated and had welcomed much speculation. What it meant in practice, with any luck, was that soon the company’s products would no longer represent the fruits of traditional computer programming, exactly, but “machine learning.”
A rarefied department within the company, Google Brain, was founded five years ago on this very principle: that artificial “neural networks” that acquaint themselves with the world via trial and error, as toddlers do, might in turn develop something like human flexibility. This notion is not new — a version of it dates to the earliest stages of modern computing, in the 1940s — but for much of its history most computer scientists saw it as vaguely disreputable, even mystical. Since 2011, though, Google Brain has demonstrated that this approach to artificial intelligence could solve many problems that confounded decades of conventional efforts. Speech recognition didn’t work very well until Brain undertook an effort to revamp it; the application of machine learning made its performance on Google’s mobile platform, Android, almost as good as human transcription. The same was true of image recognition. Less than a year ago, Brain for the first time commenced with the gut renovation of an entire consumer product, and its momentous results were being celebrated tonight.
Translate made its debut in 2006 and since then has become one of Google’s most reliable and popular assets; it serves more than 500 million monthly users in need of 140 billion words per day in a different language. It exists not only as its own stand-alone app but also as an integrated feature within Gmail, Chrome and many other Google offerings, where we take it as a push-button given — a frictionless, natural part of our digital commerce. It was only with the refugee crisis, Pichai explained from the lectern, that the company came to reckon with Translate’s geopolitical importance: On the screen behind him appeared a graph whose steep curve indicated a recent fivefold increase in translations between Arabic and German. (It was also close to Pichai’s own heart. He grew up in India, a land divided by dozens of languages.) The team had been steadily adding new languages and features, but gains in quality over the last four years had slowed considerably.
Until today. As of the previous weekend, Translate had been converted to an A.I.-based system for much of its traffic, not just in the United States but in Europe and Asia as well: The rollout included translations between English and Spanish, French, Portuguese, German, Chinese, Japanese, Korean and Turkish. The rest of Translate’s hundred-odd languages were to come, with the aim of eight per month, by the end of next year. The new incarnation, to the pleasant surprise of Google’s own engineers, had been completed in only nine months. The A.I. system had demonstrated overnight improvements roughly equal to the total gains the old one had accrued over its entire lifetime.
Pichai has an affection for the obscure literary reference; he told me a month earlier, in his office in Mountain View, Calif., that Translate in part exists because not everyone can be like the physicist Robert Oppenheimer, who learned Sanskrit to read the Bhagavad Gita in the original. In London, the slide on the monitors behind him flicked to a Borges quote: “Uno no es lo que es por lo que escribe, sino por lo que ha leído.”
Grinning, Pichai read aloud an awkward English version of the sentence that had been rendered by the old Translate system: “One is not what is for what he writes, but for what he has read.”
To the right of that was a new A.I.-rendered version: “You are not what you write, but what you have read.”
It was a fitting remark: The new Google Translate was run on the first machines that had, in a sense, ever learned to read anything at all.
Google’s decision to reorganize itself around A.I. was the first major manifestation of what has become an industrywide machine-learning delirium. Over the past four years, six companies in particular — Google, Facebook, Apple, Amazon, Microsoft and the Chinese firm Baidu — have touched off an arms race for A.I. talent, particularly within universities. Corporate promises of resources and freedom have thinned out top academic departments. It has become widely known in Silicon Valley that Mark Zuckerberg, chief executive of Facebook, personally oversees, with phone calls and video-chat blandishments, his company’s overtures to the most desirable graduate students. Starting salaries of seven figures are not unheard-of. Attendance at the field’s most important academic conference has nearly quadrupled. What is at stake is not just one more piecemeal innovation but control over what very well could represent an entirely new computational platform: pervasive, ambient artificial intelligence.
The phrase “artificial intelligence” is invoked as if its meaning were self-evident, but it has always been a source of confusion and controversy. Imagine if you went back to the 1970s, stopped someone on the street, pulled out a smartphone and showed her Google Maps. Once you managed to convince her you weren’t some oddly dressed wizard, and that what you withdrew from your pocket wasn’t a black-arts amulet but merely a tiny computer more powerful than the one that guided Apollo missions, Google Maps would almost certainly seem to her a persuasive example of “artificial intelligence.” In a very real sense, it is. It can do things any map-literate human can manage, like get you from your hotel to the airport, though it can do so much more quickly and reliably. It can also do things that humans simply and obviously cannot: It can evaluate the traffic, plan the best route and reorient itself when you take the wrong exit.
Practically nobody today, however, would bestow upon Google Maps the honorific “A.I.,” so sentimental and sparing are we in our use of the word “intelligence.” Artificial intelligence, we believe, must be something that distinguishes HAL from whatever it is a loom or wheelbarrow can do. The minute we can automate a task, we downgrade the relevant skill involved to one of mere mechanism. Today Google Maps seems, in the pejorative sense of the term, robotic: It simply accepts an explicit demand (the need to get from one place to another) and tries to satisfy that demand as efficiently as possible. The goal posts for “artificial intelligence” are thus constantly receding.
When he has an opportunity to make careful distinctions, Pichai differentiates between the current applications of A.I. and the ultimate goal of “artificial general intelligence.” Artificial general intelligence will not involve dutiful adherence to explicit instructions, but instead will demonstrate a facility with the implicit, the interpretive. It will be a general tool, designed for general purposes in a general context. Pichai believes his company’s future depends on something like this. Imagine if you could tell Google Maps, “I’d like to go to the airport, but I need to stop off on the way to buy a present for my nephew.” A more generally intelligent version of that service — a ubiquitous assistant, of the sort that Scarlett Johansson memorably disembodied three years ago in the Spike Jonze film “Her”— would know all sorts of things that, say, a close friend or an earnest intern might know: your nephew’s age, and how much you ordinarily like to spend on gifts for children, and where to find an open store. But a truly intelligent Maps could also conceivably know all sorts of things a close friend wouldn’t, like what has only recently come into fashion among preschoolers in your nephew’s school — or more important, what its users actually want. If an intelligent machine were able to discern some intricate if murky regularity in data about what we have done in the past, it might be able to extrapolate about our subsequent desires, even if we don’t entirely know them ourselves.
The new wave of A.I.-enhanced assistants — Apple’s Siri, Facebook’s M, Amazon’s Echo — are all creatures of machine learning, built with similar intentions. The corporate dreams for machine learning, however, aren’t exhausted by the goal of consumer clairvoyance. A medical-imaging subsidiary of Samsung announced this year that its new ultrasound devices could detect breast cancer. Management consultants are falling all over themselves to prep executives for the widening industrial applications of computers that program themselves. DeepMind, a 2014 Google acquisition, defeated the reigning human grandmaster of the ancient board game Go, despite predictions that such an achievement would take another 10 years.
In a famous 1950 essay, Alan Turing proposed a test for an artificial general intelligence: a computer that could, over the course of five minutes of text exchange, successfully deceive a real human interlocutor. Once a machine can translate fluently between two natural languages, the foundation has been laid for a machine that might one day “understand” human language well enough to engage in plausible conversation. Google Brain’s members, who pushed and helped oversee the Translate project, believe that such a machine would be on its way to serving as a generally intelligent all-encompassing personal digital assistant.
What follows here is the story of how a team of Google researchers and engineers — at first one or two, then three or four, and finally more than a hundred — made considerable progress in that direction. It’s an uncommon story in many ways, not least of all because it defies many of the Silicon Valley stereotypes we’ve grown accustomed to. It does not feature people who think that everything will be unrecognizably different tomorrow or the next day because of some restless tinkerer in his garage. It is neither a story about people who think technology will solve all our problems nor one about people who think technology is ineluctably bound to create apocalyptic new ones. It is not about disruption, at least not in the way that word tends to be used.
It is, in fact, three overlapping stories that converge in Google Translate’s successful metamorphosis to A.I. — a technical story, an institutional story and a story about the evolution of ideas. The technical story is about one team on one product at one company, and the process by which they refined, tested and introduced a brand-new version of an old product in only about a quarter of the time anyone, themselves included, might reasonably have expected. The institutional story is about the employees of a small but influential artificial-intelligence group within that company, and the process by which their intuitive faith in some old, unproven and broadly unpalatable notions about computing upended every other company within a large radius. The story of ideas is about the cognitive scientists, psychologists and wayward engineers who long toiled in obscurity, and the process by which their ostensibly irrational convictions ultimately inspired a paradigm shift in our understanding not only of technology but also, in theory, of consciousness itself.
The first story, the story of Google Translate, takes place in Mountain View over nine months, and it explains the transformation of machine translation. The second story, the story of Google Brain and its many competitors, takes place in Silicon Valley over five years, and it explains the transformation of that entire community. The third story, the story of deep learning, takes place in a variety of far-flung laboratories — in Scotland, Switzerland, Japan and most of all Canada — over seven decades, and it might very well contribute to the revision of our self-image as first and foremost beings who think.
All three are stories about artificial intelligence. The seven-decade story is about what we might conceivably expect or want from it. The five-year story is about what it might do in the near future. The nine-month story is about what it can do right this minute. These three stories are themselves just proof of concept. All of this is only the beginning.


Part I: Learning Machine
1. The Birth of Brain
Jeff Dean, though his title is senior fellow, is the de facto head of Google Brain. Dean is a sinewy, energy-efficient man with a long, narrow face, deep-set eyes and an earnest, soapbox-derby sort of enthusiasm. The son of a medical anthropologist and a public-health epidemiologist, Dean grew up all over the world — Minnesota, Hawaii, Boston, Arkansas, Geneva, Uganda, Somalia, Atlanta — and, while in high school and college, wrote software used by the World Health Organization. He has been with Google since 1999, as employee 25ish, and has had a hand in the core software systems beneath nearly every significant undertaking since then. A beloved artifact of company culture is Jeff Dean Facts, written in the style of the Chuck Norris Facts meme: “Jeff Dean’s PIN is the last four digits of pi.” “When Alexander Graham Bell invented the telephone, he saw a missed call from Jeff Dean.” “Jeff Dean got promoted to Level 11 in a system where the maximum level is 10.” (This last one is, in fact, true.)
The Google engineer and Google Brain leader Jeff Dean. Credit: Brian Finke for The New York Times
One day in early 2011, Dean walked into one of the Google campus’s “microkitchens” — the “Googley” word for the shared break spaces on most floors of the Mountain View complex’s buildings — and ran into Andrew Ng, a young Stanford computer-science professor who was working for the company as a consultant. Ng told him about Project Marvin, an internal effort (named after the celebrated A.I. pioneer Marvin Minsky) he had recently helped establish to experiment with “neural networks,” pliant digital lattices based loosely on the architecture of the brain. Dean himself had worked on a primitive version of the technology as an undergraduate at the University of Minnesota in 1990, during one of the method’s brief windows of mainstream acceptability. Now, over the previous five years, the number of academics working on neural networks had begun to grow again, from a handful to a few dozen. Ng told Dean that Project Marvin, which was being underwritten by Google’s secretive X lab, had already achieved some promising results.
Dean was intrigued enough to lend his “20 percent” — the portion of work hours every Google employee is expected to contribute to programs outside his or her core job — to the project. Pretty soon, he suggested to Ng that they bring in another colleague with a neuroscience background, Greg Corrado. (In graduate school, Corrado was taught briefly about the technology, but strictly as a historical curiosity. “It was good I was paying attention in class that day,” he joked to me.) In late spring they brought in one of Ng’s best graduate students, Quoc Le, as the project’s first intern. By then, a number of the Google engineers had taken to referring to Project Marvin by another name: Google Brain.
Since the term “artificial intelligence” was first coined, at a kind of constitutional convention of the mind at Dartmouth in the summer of 1956, a majority of researchers have long thought the best approach to creating A.I. would be to write a very big, comprehensive program that laid out both the rules of logical reasoning and sufficient knowledge of the world. If you wanted to translate from English to Japanese, for example, you would program into the computer all of the grammatical rules of English, and then the entirety of definitions contained in the Oxford English Dictionary, and then all of the grammatical rules of Japanese, as well as all of the words in the Japanese dictionary, and only after all of that feed it a sentence in a source language and ask it to tabulate a corresponding sentence in the target language. You would give the machine a language map that was, as Borges would have had it, the size of the territory. This perspective is usually called “symbolic A.I.” — because its definition of cognition is based on symbolic logic — or, disparagingly, “good old-fashioned A.I.”
There are two main problems with the old-fashioned approach. The first is that it’s awfully time-consuming on the human end. The second is that it only really works in domains where rules and definitions are very clear: in mathematics, for example, or chess. Translation, however, is an example of a field where this approach fails horribly, because words cannot be reduced to their dictionary definitions, and because languages tend to have as many exceptions as they have rules. More often than not, a system like this is liable to translate “minister of agriculture” as “priest of farming.” Still, for math and chess it worked great, and the proponents of symbolic A.I. took it for granted that no activities signaled “general intelligence” better than math and chess.
An excerpt of a 1961 documentary emphasizing the longstanding premise of artificial-intelligence research: If you could program a computer to mimic higher-order cognitive tasks like math or chess, you were on a path that would eventually lead to something akin to consciousness. Video posted on YouTube by Roberto Pieraccini
There were, however, limits to what this system could do. In the 1980s, a robotics researcher at Carnegie Mellon pointed out that it was easy to get computers to do adult things but nearly impossible to get them to do things a 1-year-old could do, like hold a ball or identify a cat. By the 1990s, despite punishing advancements in computer chess, we still weren’t remotely close to artificial general intelligence.
There has always been another vision for A.I. — a dissenting view — in which the computers would learn from the ground up (from data) rather than from the top down (from rules). This notion dates to the early 1940s, when it occurred to researchers that the best model for flexible automated intelligence was the brain itself. A brain, after all, is just a bunch of widgets, called neurons, that either pass along an electrical charge to their neighbors or don’t. What’s important are less the individual neurons themselves than the manifold connections among them. This structure, in its simplicity, has afforded the brain a wealth of adaptive advantages. The brain can operate in circumstances in which information is poor or missing; it can withstand significant damage without total loss of control; it can store a huge amount of knowledge in a very efficient way; it can isolate distinct patterns but retain the messiness necessary to handle ambiguity.
There was no reason you couldn’t try to mimic this structure in electronic form, and in 1943 it was shown that arrangements of simple artificial neurons could carry out basic logical functions. They could also, at least in theory, learn the way we do. With life experience, depending on a particular person’s trials and errors, the synaptic connections among pairs of neurons get stronger or weaker. An artificial neural network could do something similar, by gradually altering, on a guided trial-and-error basis, the numerical relationships among artificial neurons. It wouldn’t need to be preprogrammed with fixed rules. It would, instead, rewire itself to reflect patterns in the data it absorbed.
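The claim that simple artificial neurons can carry out basic logical functions can be illustrated with a minimal sketch. It assumes the simplest kind of threshold unit: it sums its weighted inputs and fires when the sum reaches a threshold. The function names, weights and thresholds below are illustrative choices, not taken from the 1943 work.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


def logic_and(a, b):
    # Both inputs must be on for the sum to reach 2.
    return threshold_neuron([a, b], [1, 1], threshold=2)


def logic_or(a, b):
    # A single active input is enough to reach 1.
    return threshold_neuron([a, b], [1, 1], threshold=1)


def logic_not(a):
    # A negative weight lets an active input suppress firing.
    return threshold_neuron([a], [-1], threshold=0)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", logic_and(a, b), "OR:", logic_or(a, b))
    print("NOT 0:", logic_not(0), "NOT 1:", logic_not(1))
```

Here the weights are fixed by hand. The learning described in the paragraph above would adjust them from examples instead.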
This attitude toward artificial intelligence was evolutionary rather than creationist. If you wanted a flexible mechanism, you wanted one that could adapt to its environment. If you wanted something that could adapt, you didn’t want to begin with the indoctrination of the rules of chess. You wanted to begin with very basic abilities — sensory perception and motor control — in the hope that advanced skills would emerge organically. Humans don’t learn to understand language by memorizing dictionaries and grammar books, so why should we possibly expect our computers to do so?
Google Brain was the first major commercial institution to invest in the possibilities embodied by this way of thinking about A.I. Dean, Corrado and Ng began their work as a part-time, collaborative experiment, but they made immediate progress. They took architectural inspiration for their models from recent theoretical outlines — as well as ideas that had been on the shelf since the 1980s and 1990s — and drew upon both the company’s peerless reserves of data and its massive computing infrastructure. They instructed the networks on enormous banks of “labeled” data — speech files with correct transcriptions, for example — and the computers improved their responses to better match reality.
“The portion of evolution in which animals developed eyes was a big development,” Dean told me one day, with customary understatement. We were sitting, as usual, in a whiteboarded meeting room, on which he had drawn a crowded, snaking timeline of Google Brain and its relation to inflection points in the recent history of neural networks. “Now computers have eyes. We can build them around the capabilities that now exist to understand photos. Robots will be drastically transformed. They’ll be able to operate in an unknown environment, on much different problems.” These capacities they were building may have seemed primitive, but their implications were profound.

Geoffrey Hinton, whose ideas helped lay the foundation for the neural-network approach to Google Translate, at Google’s offices in Toronto. Credit: Brian Finke for The New York Times
2. The Unlikely Intern
In its first year or so of existence, Brain’s experiments in the development of a machine with the talents of a 1-year-old had, as Dean said, worked to great effect. Its speech-recognition team swapped out part of their old system for a neural network and encountered, in pretty much one fell swoop, the best quality improvements anyone had seen in 20 years. Their system’s object-recognition abilities improved by an order of magnitude. This was not because Brain’s personnel had generated a sheaf of outrageous new ideas in just a year. It was because Google had finally devoted the resources — in computers and, increasingly, personnel — to fill in outlines that had been around for a long time.
A great preponderance of these extant and neglected notions had been proposed or refined by a peripatetic English polymath named Geoffrey Hinton. In the second year of Brain’s existence, Hinton was recruited to Brain as Andrew Ng left. (Ng now leads the 1,300-person A.I. team at Baidu.) Hinton wanted to leave his post at the University of Toronto for only three months, so for arcane contractual reasons he had to be hired as an intern. At intern training, the orientation leader would say something like, “Type in your LDAP” — a user login — and he would flag a helper to ask, “What’s an LDAP?” All the smart 25-year-olds in attendance, who had only ever known deep learning as the sine qua non of artificial intelligence, snickered: “Who is that old guy? Why doesn’t he get it?”
“At lunchtime,” Hinton said, “someone in the queue yelled: ‘Professor Hinton! I took your course! What are you doing here?’ After that, it was all right.”
A few months later, Hinton and two of his students demonstrated truly astonishing gains in a big image-recognition contest, run by an open-source collective called ImageNet, that asks computers not only to identify a monkey but also to distinguish between spider monkeys and howler monkeys, and among God knows how many different breeds of cat. Google soon approached Hinton and his students with an offer. They accepted. “I thought they were interested in our I.P.,” he said. “Turns out they were interested in us.”
Hinton comes from one of those old British families emblazoned like the Darwins at eccentric angles across the intellectual landscape, where regardless of titular preoccupation a person is expected to make sideline contributions to minor problems in astronomy or fluid dynamics. His great-great-grandfather was George Boole, whose foundational work in symbolic logic underpins the computer; another great-great-grandfather was a celebrated surgeon, his father a venturesome entomologist, his father’s cousin a Los Alamos researcher; the list goes on. He trained at Cambridge and Edinburgh, then taught at Carnegie Mellon before he ended up at Toronto, where he still spends half his time. (His work has long been supported by the largess of the Canadian government.) I visited him in his office at Google there. He has tousled yellowed-pewter hair combed forward in a mature Noel Gallagher style and wore a baggy striped dress shirt that persisted in coming untucked, and oval eyeglasses that slid down to the tip of a prominent nose. He speaks with a driving if shambolic wit, and says things like, “Computers will understand sarcasm before Americans do.”
Hinton had been working on neural networks since his undergraduate days at Cambridge in the late 1960s, and he is seen as the intellectual primogenitor of the contemporary field. For most of that time, whenever he spoke about machine learning, people looked at him as though he were talking about the Ptolemaic spheres or bloodletting by leeches. Neural networks were taken as a disproven folly, largely on the basis of one overhyped project: the Perceptron, an artificial neural network that Frank Rosenblatt, a Cornell psychologist, developed in the late 1950s. The New York Times reported that the machine’s sponsor, the United States Navy, expected it would “be able to walk, talk, see, write, reproduce itself and be conscious of its existence.” It went on to do approximately none of those things. Marvin Minsky, the dean of artificial intelligence in America, had worked on neural networks for his 1954 Princeton thesis, but he’d since grown tired of the inflated claims that Rosenblatt — who was a contemporary at Bronx Science — made for the neural paradigm. (He was also competing for Defense Department funding.) Along with an M.I.T. colleague, Minsky published a book that proved that there were painfully simple problems the Perceptron could never solve.
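A standard textbook illustration of such a "painfully simple" problem is exclusive-or (XOR). The article does not name the problem, so treating XOR as the example is an assumption. The sketch below trains a single-layer perceptron with the classic error-correction rule. On AND, which a single straight line can separate, training settles on correct weights. On XOR, it never gets through a full pass without a mistake.

```python
def train_perceptron(samples, epochs=100, lr=1):
    """Classic perceptron rule for two inputs.

    Returns (weights, bias, converged), where converged means one full
    pass over the samples produced no errors.
    """
    w = [0, 0]
    b = 0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - out
            if delta != 0:
                # Nudge the weights toward the correct answer.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    print("AND converged:", train_perceptron(AND_SAMPLES)[2])
    print("XOR converged:", train_perceptron(XOR_SAMPLES)[2])
```

Raising `epochs` does not help with XOR: no single line separates its two classes, so a one-layer network has no correct setting to find.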
Minsky’s criticism of the Perceptron extended only to networks of one “layer,” i.e., one layer of artificial neurons between what’s fed to the machine and what you expect from it — and later in life, he expounded ideas very similar to contemporary deep learning. But Hinton already knew at the time that complex tasks could be carried out if you had recourse to multiple layers. The simplest description of a neural network is that it’s a machine that makes classifications or predictions based on its ability to discover patterns in data. With one layer, you could find only simple patterns; with more than one, you could look for patterns of patterns. Take the case of image recognition, which tends to rely on a contraption called a “convolutional neural net.” (These were elaborated in a seminal 1998 paper whose lead author, a Frenchman named Yann LeCun, did his postdoctoral research in Toronto under Hinton and now directs a huge A.I. endeavor at Facebook.) The first layer of the network learns to identify the very basic visual trope of an “edge,” meaning a nothing (an off-pixel) followed by a something (an on-pixel) or vice versa. Each successive layer of the network looks for a pattern in the previous layer. A pattern of edges might be a circle or a rectangle. A pattern of circles or rectangles might be a face. And so on. This more or less parallels the way information is put together in increasingly abstract ways as it travels from the photoreceptors in the retina back and up through the visual cortex. At each conceptual step, detail that isn’t immediately relevant is thrown away. If several edges and circles come together to make a face, you don’t care exactly where the face is found in the visual field; you just care that it’s a face.
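The first two ideas in this description, finding edges and then discarding their exact position, can be sketched on a tiny black-and-white image. This is a hand-written toy, not a convolutional network: in a real network the edge detectors are learned from data, and the function names here are hypothetical.

```python
def edge_map(image):
    """Mark every spot where a pixel differs from its right-hand neighbor.

    This matches the article's definition of an edge: an off-pixel
    followed by an on-pixel, or vice versa.
    """
    return [
        [1 if row[i] != row[i + 1] else 0 for i in range(len(row) - 1)]
        for row in image
    ]


def has_edge_anywhere(edges):
    """Pool over position: report whether an edge exists, not where it is."""
    return any(any(row) for row in edges)


# A dark region on the left, a bright region on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# The same scene with the boundary shifted one pixel to the left.
shifted = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]

if __name__ == "__main__":
    print(edge_map(image))
    print(edge_map(shifted))
    print(has_edge_anywhere(edge_map(image)), has_edge_anywhere(edge_map(shifted)))
```

The two edge maps differ, but the pooled answer is the same for both images, which is the sense in which detail that isn't immediately relevant gets thrown away.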
A demonstration from 1993 showing an early version of the researcher Yann LeCun’s convolutional neural network, which by the late 1990s was processing 10 to 20 percent of all checks in the United States. A similar technology now drives most state-of-the-art image-recognition systems. Video posted on YouTube by Yann LeCun
The issue with multilayered, “deep” neural networks was that the trial-and-error part got extraordinarily complicated. In a single layer, it’s easy. Imagine that you’re playing with a child. You tell the child, “Pick up the green ball and put it into Box A.” The child picks up a green ball and puts it into Box B. You say, “Try again to put the green ball in Box A.” The child tries Box A. Bravo.
Now imagine you tell the child, “Pick up a green ball, go through the door marked 3 and put the green ball into Box A.” The child takes a red ball, goes through the door marked 2 and puts the red ball into Box B. How do you begin to correct the child? You cannot just repeat your initial instructions, because the child does not know at which point he went wrong. In real life, you might start by holding up the red ball and the green ball and saying, “Red ball, green ball.” The whole point of machine learning, however, is to avoid that kind of explicit mentoring. Hinton and a few others went on to invent a solution (or rather, reinvent an older one) to this layered-error problem, over the halting course of the late 1970s and 1980s, and interest among computer scientists in neural networks was briefly revived. “People got very excited about it,” he said. “But we oversold it.” Computer scientists quickly went back to thinking that people like Hinton were weirdos and mystics.
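The solution is generally known as backpropagation, a name the article does not use. It works out how much each layer contributed to the final error by passing the error backward through the network. A minimal sketch, assuming a toy network with one input, one hidden unit and one output (the sizes, starting weights and learning rate are arbitrary choices for illustration):

```python
import math


def forward(w1, w2, x):
    """Two layers: hidden h = tanh(w1 * x), then output y = w2 * h."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def loss(w1, w2, x, target):
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2


def backprop(w1, w2, x, target):
    """Assign blame layer by layer, working backward from the output error."""
    h, y = forward(w1, w2, x)
    dy = y - target                    # error at the output
    grad_w2 = dy * h                   # blame for the last layer's weight
    dh = dy * w2                       # error passed back to the hidden unit
    grad_w1 = dh * (1 - h * h) * x     # blame for the first layer's weight
    return grad_w1, grad_w2


def numeric_grad(w1, w2, x, target, eps=1e-6):
    """Independent check: estimate the same slopes by nudging each weight."""
    g1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
    g2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
    return g1, g2


def train(w1, w2, x, target, steps=200, lr=0.1):
    """Repeatedly move each weight a little against its share of the blame."""
    for _ in range(steps):
        g1, g2 = backprop(w1, w2, x, target)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2


if __name__ == "__main__":
    w1, w2, x, target = 0.5, -0.3, 1.0, 1.0
    print("backprop gradients:", backprop(w1, w2, x, target))
    print("numeric gradients: ", numeric_grad(w1, w2, x, target))
    print("loss before:", loss(w1, w2, x, target))
    print("loss after: ", loss(*train(w1, w2, x, target), x, target))
```

In the terms of the child analogy, `dh` is the step that tells the earlier layer which part of the mistake was its own, without anyone spelling out a corrected instruction.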
These ideas remained popular, however, among philosophers and psychologists, who called them “connectionism” or “parallel distributed processing.” “This idea,” Hinton told me, “of a few people keeping a torch burning, it’s a nice myth. It was true within artificial intelligence. But within psychology lots of people believed in the approach but just couldn’t do it.” Neither could Hinton, despite the generosity of the Canadian government. “There just wasn’t enough computer power or enough data. People on our side kept saying, ‘Yeah, but if I had a really big one, it would work.’ It wasn’t a very persuasive argument.”
3. A Deep Explanation of Deep Learning
When Pichai said that Google would henceforth be “A.I. first,” he was not just making a claim about his company’s business strategy; he was throwing in his company’s lot with this long-unworkable idea. Pichai’s allocation of resources ensured that people like Dean could ensure that people like Hinton would have, at long last, enough computers and enough data to make a persuasive argument. An average brain has something on the order of 100 billion neurons. Each neuron is connected to up to 10,000 other neurons, which means that the number of synapses is between 100 trillion and 1,000 trillion. For a simple artificial neural network of the sort proposed in the 1940s, the attempt to even try to replicate this was unimaginable. We’re still far from the construction of a network of that size, but Google Brain’s investment allowed for the creation of artificial neural networks comparable to the brains of mice.
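The synapse range quoted above follows from simple multiplication. The sketch below reproduces it, assuming the low end of the range corresponds to about 1,000 connections per neuron. That lower figure is inferred from the stated totals and is not given in the text.

```python
NEURONS = 100_000_000_000  # roughly 100 billion neurons, per the text


def synapse_estimate(neurons, connections_per_neuron):
    """Total connections if every neuron has the given number of links."""
    return neurons * connections_per_neuron


if __name__ == "__main__":
    # 1,000 is an assumed lower bound; 10,000 is the upper bound from the text.
    for per_neuron in (1_000, 10_000):
        total = synapse_estimate(NEURONS, per_neuron)
        print(f"{per_neuron:,} connections each: {total // 10**12:,} trillion synapses")
```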
To understand why scale is so important, however, you have to start to understand some of the more technical details of what, exactly, machine intelligences are doing with the data they consume. A lot of our ambient fears about A.I. rest on the idea that they’re just vacuuming up knowledge like a sociopathic prodigy in a library, and that an artificial intelligence constructed to make paper clips might someday decide to treat humans like ants or lettuce. This just isn’t how they work. All they’re doing is shuffling information around in search of commonalities — basic patterns, at first, and then more complex ones — and for the moment, at least, the greatest danger is that the information we’re feeding them is biased in the first place.
If that brief explanation seems sufficiently reassuring, the reassured nontechnical reader is invited to skip forward to the next section, which is about cats. If not, then read on. (This section is also, luckily, about cats.)
Imagine you want to program a cat-recognizer on the old symbolic-A.I. model. You stay up for days preloading the machine with an exhaustive, explicit definition of “cat.” You tell it that a cat has four legs and pointy ears and whiskers and a tail, and so on. All this information is stored in a special place in memory called Cat. Now you show it a picture. First, the machine has to separate out the various distinct elements of the image. Then it has to take these elements and apply the rules stored in its memory. If(legs=4) and if(ears=pointy) and if(whiskers=yes) and if(tail=yes) and if(expression=supercilious), then(cat=yes). But what if you showed this cat-recognizer a Scottish Fold, a heart-rending breed with a prized genetic defect that leads to droopy doubled-over ears? Our symbolic A.I. gets to (ears=pointy) and shakes its head solemnly, “Not cat.” It is hyperliteral, or “brittle.” Even the thickest toddler shows much greater inferential acuity.
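The rule chain above translates almost directly into running code, which makes the brittleness easy to see. This is a sketch of the article's thought experiment only. The feature names and the two sample animals are hypothetical, and the sketch skips the harder step of extracting such features from an image.

```python
def is_cat_symbolic(animal):
    """Every rule must match exactly, or the verdict is 'not cat'."""
    return (
        animal.get("legs") == 4
        and animal.get("ears") == "pointy"
        and animal.get("whiskers") is True
        and animal.get("tail") is True
        and animal.get("expression") == "supercilious"
    )


tabby = {
    "legs": 4,
    "ears": "pointy",
    "whiskers": True,
    "tail": True,
    "expression": "supercilious",
}

# Identical in every respect except the folded ears.
scottish_fold = dict(tabby, ears="folded")

if __name__ == "__main__":
    print("tabby:", is_cat_symbolic(tabby))
    print("scottish fold:", is_cat_symbolic(scottish_fold))
```

Adding a special case for folded ears would fix this one breed, and the next exception would then need its own rule.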
Now imagine that instead of hard-wiring the machine with a set of rules for classification stored in one location of the computer’s memory, you try the same thing on a neural network. There is no special place that can hold the definition of “cat.” There is just a giant blob of interconnected switches, like forks in a path. On one side of the blob, you present the inputs (the pictures); on the other side, you present the corresponding outputs (the labels). Then you just tell it to work out for itself, via the individual calibration of all of these interconnected switches, whatever path the data should take so that the inputs are mapped to the correct outputs. The training is the process by which a labyrinthine series of elaborate tunnels is excavated through the blob, tunnels that connect any given input to its proper output. The more training data you have, the greater the number and intricacy of the tunnels that can be dug. Once the training is complete, the middle of the blob has enough tunnels that it can make reliable predictions about how to handle data it has never seen before. This is called “supervised learning.”
The reason that the network requires so many neurons and so much data is that it functions, in a way, like a sort of giant machine democracy. Imagine you want to train a computer to differentiate among five different items. Your network is made up of millions and millions of neuronal “voters,” each of whom has been given five different cards: one for cat, one for dog, one for spider monkey, one for spoon and one for defibrillator. You show your electorate a photo and ask, “Is this a cat, a dog, a spider monkey, a spoon or a defibrillator?” All the neurons that voted the same way collect in groups, and the network foreman peers down from above and identifies the majority classification: “A dog?”
You say: “No, maestro, it’s a cat. Try again.”
Now the network foreman goes back to identify which voters threw their weight behind “cat” and which didn’t. The ones that got “cat” right get their votes counted double next time — at least when they’re voting for “cat.” They have to prove independently whether they’re also good at picking out dogs and defibrillators, but one thing that makes a neural network so flexible is that each individual unit can contribute differently to different desired outcomes. What’s important is not the individual vote, exactly, but the pattern of votes. If Joe, Frank and Mary all vote together, it’s a dog; but if Joe, Kate and Jessica vote together, it’s a cat; and if Kate, Jessica and Frank vote together, it’s a defibrillator. The neural network just needs to register enough of a regularly discernible signal somewhere to say, “Odds are, this particular arrangement of pixels represents something these humans keep calling ‘cats.’ ” The more “voters” you have, and the more times you make them vote, the more keenly the network can register even very weak signals. If you have only Joe, Frank and Mary, you can maybe use them only to differentiate among a cat, a dog and a defibrillator. If you have millions of different voters that can associate in billions of different ways, you can learn to classify data with incredible granularity. Your trained voter assembly will be able to look at an unlabeled picture and identify it more or less accurately.
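The voting metaphor can be acted out in a few lines. The sketch below follows the metaphor literally and is not how a real network is trained. Each voter holds a separate weight for each label, the foreman picks the label with the largest weighted total, and after a correction the voters who chose the right label have that one weight doubled. The class name and the votes are made up for illustration. Real networks adjust continuous weights by small gradient steps instead of doubling.

```python
from collections import defaultdict


class Electorate:
    def __init__(self, voter_names):
        # Each voter has a separate weight per label, all starting at 1.0.
        self.weights = {name: defaultdict(lambda: 1.0) for name in voter_names}

    def tally(self, votes):
        """The foreman adds up weighted votes and announces the leader."""
        totals = defaultdict(float)
        for voter, label in votes.items():
            totals[label] += self.weights[voter][label]
        return max(totals, key=totals.get)

    def correct(self, votes, true_label):
        """Voters who got it right count double next time, for that label only."""
        for voter, label in votes.items():
            if label == true_label:
                self.weights[voter][label] *= 2


# Hypothetical votes on one photo that is actually a cat.
votes = {"Joe": "dog", "Frank": "dog", "Mary": "dog", "Kate": "cat", "Jessica": "cat"}

if __name__ == "__main__":
    electorate = Electorate(votes.keys())
    print("first guess:", electorate.tally(votes))   # three dog votes beat two cat votes
    electorate.correct(votes, "cat")
    print("second guess:", electorate.tally(votes))  # doubled cat votes now outweigh dog
```

Kate's weight for "dog" is untouched by the correction, which mirrors the point that each unit has to prove itself separately on each outcome.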
Part of the reason there was so much resistance to these ideas in computer-science departments is that because the output is just a prediction based on patterns of patterns, it’s not going to be perfect, and the machine will never be able to define for you what, exactly, a cat is. It just knows them when it sees them. This wooliness, however, is the point. The neuronal “voters” will recognize a happy cat dozing in the sun and an angry cat glaring out from the shadows of an untidy litter box, as long as they have been exposed to millions of diverse cat scenes. You just need lots and lots of the voters — in order to make sure that some part of your network picks up on even very weak regularities, on Scottish Folds with droopy ears, for example — and enough labeled data to make sure your network has seen the widest possible variance in phenomena.
It is important to note, however, that the fact that neural networks are probabilistic in nature means that they’re not suitable for all tasks. It’s no great tragedy if they mislabel 1 percent of cats as dogs, or send you to the wrong movie on occasion, but in something like a self-driving car we all want greater assurances. This isn’t the only caveat. Supervised learning is a trial-and-error process based on labeled data. The machines might be doing the learning, but there remains a strong human element in the initial categorization of the inputs. If your data had a picture of a man and a woman in suits that someone had labeled “woman with her boss,” that relationship would be encoded into all future pattern recognition. Labeled data is thus fallible the way that human labelers are fallible. If a machine was asked to identify creditworthy candidates for loans, it might use data like felony convictions, but if felony convictions were unfair in the first place — if they were based on, say, discriminatory drug laws — then the loan recommendations would perforce also be fallible.
Image-recognition networks like our cat-identifier are only one of many varieties of deep learning, but they are disproportionately invoked as teaching examples because each layer does something at least vaguely recognizable to humans: picking out edges first, then circles, then faces. This means there’s a safeguard against error. For instance, an early oddity in Google’s image-recognition software meant that it could not always identify a dumbbell in isolation, even though the team had trained it on an image set that included a lot of exercise categories. A visualization tool showed them the machine had learned not the concept of “dumbbell” but the concept of “dumbbell+arm,” because all the dumbbells in the training set were attached to arms. They threw into the training mix some photos of solo dumbbells. The problem was solved. Not everything is so easy.
4. The Cat Paper
Over the course of its first year or two, Brain’s efforts to cultivate in machines the skills of a 1-year-old were auspicious enough that the team was graduated out of the X lab and into the broader research organization. (The head of Google X once noted that Brain had paid for the entirety of X’s costs.) They still had fewer than 10 people and only a vague sense for what might ultimately come of it all. But even then they were thinking ahead to what ought to happen next. First a human mind learns to recognize a ball and rests easily with the accomplishment for a moment, but sooner or later, it wants to ask for the ball. And then it wades into language.
The first step in that direction was the cat paper, which made Brain famous.
What the cat paper demonstrated was that a neural network with more than a billion “synaptic” connections — a hundred times larger than any publicized neural network to that point, yet still many orders of magnitude smaller than our brains — could observe raw, unlabeled data and pick out for itself a high-order human concept. The Brain researchers had shown the network millions of still frames from YouTube videos, and out of the welter of the pure sensorium the network had isolated a stable pattern any toddler or chipmunk would recognize without a moment’s hesitation as the face of a cat. The machine had not been programmed with the foreknowledge of a cat; it reached directly into the world and seized the idea for itself. (The researchers discovered this with the neural-network equivalent of something like an M.R.I., which showed them that a ghostly cat face caused the artificial neurons to “vote” with the greatest collective enthusiasm.) Most machine learning to that point had been limited by the quantities of labeled data. The cat paper showed that machines could also deal with raw unlabeled data, perhaps even data of which humans had no established foreknowledge. This seemed like a major advance not only in cat-recognition studies but also in overall artificial intelligence.
The lead author on the cat paper was Quoc Le. Le is short and willowy and soft-spoken, with a quick, enigmatic smile and shiny black penny loafers. He grew up outside Hue, Vietnam. His parents were rice farmers, and he did not have electricity at home. His mathematical abilities were obvious from an early age, and he was sent to study at a magnet school for science. In the late 1990s, while still in school, he tried to build a chatbot to talk to. He thought, How hard could this be?
“But actually,” he told me in a whispery deadpan, “it’s very hard.”
He left the rice paddies on a scholarship to a university in Canberra, Australia, where he worked on A.I. tasks like computer vision. The dominant method of the time, which involved feeding the machine definitions for things like edges, felt to him like cheating. Le didn’t know then, or knew only dimly, that there were at least a few dozen computer scientists elsewhere in the world who couldn’t help imagining, as he did, that machines could learn from scratch. In 2006, Le took a position at the Max Planck Institute for Biological Cybernetics in the medieval German university town of Tübingen. In a reading group there, he encountered two new papers by Geoffrey Hinton. People who entered the discipline during the long diaspora all have conversion stories, and when Le read those papers, he felt the scales fall away from his eyes.
“There was a big debate,” he told me. “A very big debate.” We were in a small interior conference room, a narrow, high-ceilinged space outfitted with only a small table and two whiteboards. He looked to the curve he’d drawn on the whiteboard behind him and back again, then softly confided, “I’ve never seen such a big debate.”
He remembers standing up at the reading group and saying, “This is the future.” It was, he said, an “unpopular decision at the time.” A former adviser from Australia, with whom he had stayed close, couldn’t quite understand Le’s decision. “Why are you doing this?” he asked Le in an email.
“I didn’t have a good answer back then,” Le said. “I was just curious. There was a successful paradigm, but to be honest I was just curious about the new paradigm. In 2006, there was very little activity.” He went to join Ng at Stanford and began to pursue Hinton’s ideas. “By the end of 2010, I was pretty convinced something was going to happen.”
What happened, soon afterward, was that Le went to Brain as its first intern, where he carried on with his dissertation work — an extension of which ultimately became the cat paper. On a simple level, Le wanted to see if the computer could be trained to identify on its own the information that was absolutely essential to a given image. He fed the neural network a still he had taken from YouTube. He then told the neural network to throw away some of the information contained in the image, though he didn’t specify what it should or shouldn’t throw away. The machine threw away some of the information, initially at random. Then he said: “Just kidding! Now recreate the initial image you were shown based only on the information you retained.” It was as if he were asking the machine to find a way to “summarize” the image, and then expand back to the original from the summary. If the summary was based on irrelevant data — like the color of the sky rather than the presence of whiskers — the machine couldn’t perform a competent reconstruction. Its reaction would be akin to that of a distant ancestor whose takeaway from his brief exposure to saber-tooth tigers was that they made a restful swooshing sound when they moved. Le’s neural network, unlike that ancestor, got to try again, and again and again and again. Each time it mathematically “chose” to prioritize different pieces of information and performed incrementally better. A neural network, however, was a black box. It divined patterns, but the patterns it identified didn’t always make intuitive sense to a human observer. The same network that hit on our concept of cat also became enthusiastic about a pattern that looked like some sort of furniture-animal compound, like a cross between an ottoman and a goat.
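The "summarize, then rebuild" exercise is what is usually called an autoencoder. The sketch below is a drastically reduced version, offered under stated assumptions: each "image" is just two numbers, the summary is a single number, the model is linear with shared weights, and the data, learning rate and step count are invented. Le's actual network was vastly larger and is not reproduced here.

```python
# Toy linear autoencoder: squeeze each 2-number "image" into 1 number,
# then rebuild it. Data, learning rate and step count are invented.
DATA = [(0.6 * t, 0.8 * t) for t in (-1.0, -0.5, 0.5, 1.0)]

def reconstruct(w, x):
    code = w[0] * x[0] + w[1] * x[1]     # the one-number "summary"
    return (code * w[0], code * w[1])    # expand the summary back out

def mean_error(w):
    """Average squared distance between each input and its reconstruction."""
    total = 0.0
    for x in DATA:
        r = reconstruct(w, x)
        total += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return total / len(DATA)

def train(w, steps=200, lr=0.1):
    """Plain gradient descent on the reconstruction error."""
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x in DATA:
            code = w[0] * x[0] + w[1] * x[1]
            err = (code * w[0] - x[0], code * w[1] - x[1])
            err_dot_w = err[0] * w[0] + err[1] * w[1]
            for j in range(2):
                # Derivative of the squared error with respect to w[j]
                grad[j] += 2.0 * (err_dot_w * x[j] + code * err[j]) / len(DATA)
        w = [w[j] - lr * grad[j] for j in range(2)]
    return w

start = [0.5, 0.5]
learned = train(start)
print("error before training:", mean_error(start))
print("error after training: ", mean_error(learned))
```

Because every point in this made-up data set lies along one direction, a single well-chosen number is enough to rebuild it; the training loop's job is to discover that direction by repeated trial and correction.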
Le didn’t see himself in those heady cat years as a language guy, but he felt an urge to connect the dots to his early chatbot. After the cat paper, he realized that if you could ask a network to summarize a photo, you could perhaps also ask it to summarize a sentence. This problem preoccupied Le, along with a Brain colleague named Tomas Mikolov, for the next two years.
In that time, the Brain team outgrew several offices around him. For a while they were on a floor they shared with executives. They got an email at one point from the administrator asking that they please stop allowing people to sleep on the couch in front of Larry Page and Sergey Brin’s suite. It unsettled incoming V.I.P.s. They were then allocated part of a research building across the street, where their exchanges in the microkitchen wouldn’t be squandered on polite chitchat with the suits. That interim also saw dedicated attempts on the part of Google’s competitors to catch up. (As Le told me about his close collaboration with Tomas Mikolov, he kept repeating Mikolov’s name over and over, in an incantatory way that sounded poignant. Le had never seemed so solemn. I finally couldn’t help myself and began to ask, “Is he … ?” Le nodded. “At Facebook,” he replied.)
Members of the Google Brain team in 2012, after their famous “cat paper” demonstrated the ability of neural networks to analyze unlabeled data. When shown millions of still frames from YouTube, a network isolated a pattern resembling the face of a cat. Credit: Google
They spent this period trying to come up with neural-network architectures that could accommodate not only simple photo classifications, which were static, but also complex structures that unfolded over time, like language or music. Many of these were first proposed in the 1990s, and Le and his colleagues went back to those long-ignored contributions to see what they could glean. They knew that once you established a facility with basic linguistic prediction, you could then go on to do all sorts of other intelligent things — like predict a suitable reply to an email, for example, or predict the flow of a sensible conversation. You could sidle up to the sort of prowess that would, from the outside at least, look a lot like thinking.

Part II: Language Machine
5. The Linguistic Turn
The hundred or so current members of Brain — it often feels less like a department within a colossal corporate hierarchy than it does a club or a scholastic society or an intergalactic cantina — came in the intervening years to count among the freest and most widely admired employees in the entire Google organization. They are now quartered in a tiered two-story eggshell building, with large windows tinted a menacing charcoal gray, on the leafy northwestern fringe of the company’s main Mountain View campus. Their microkitchen has a foosball table I never saw used; a Rock Band setup I never saw used; and a Go kit I saw used on a few occasions. (I did once see a young Brain research associate introducing his colleagues to ripe jackfruit, carving up the enormous spiky orb like a turkey.)
When I began spending time at Brain’s offices, in June, there were some rows of empty desks, but most of them were labeled with Post-it notes that said things like “Jesse, 6/27.” Now those are all occupied. When I first visited, parking was not an issue. The closest spaces were those reserved for expectant mothers or Teslas, but there was ample space in the rest of the lot. By October, if I showed up later than 9:30, I had to find a spot across the street.
Brain’s growth made Dean slightly nervous about how the company was going to handle the demand. He wanted to avoid what at Google is known as a “success disaster” — a situation in which the company’s capabilities in theory outpaced its ability to implement a product in practice. At a certain point he did some back-of-the-envelope calculations, which he presented to the executives one day in a two-slide presentation.
“If everyone in the future speaks to their Android phone for three minutes a day,” he told them, “this is how many machines we’ll need.” They would need to double or triple their global computational footprint.
“That,” he observed with a little theatrical gulp and widened eyes, “sounded scary. You’d have to” — he hesitated to imagine the consequences — “build new buildings.”
There was, however, another option: just design, mass-produce and install in dispersed data centers a new kind of chip to make everything faster. These chips would be called T.P.U.s, or “tensor processing units,” and their value proposition — counterintuitively — is that they are deliberately less precise than normal chips. Rather than compute 12.246 times 54.392, they will give you the perfunctory answer to 12 times 54. On a mathematical level, rather than a metaphorical one, a neural network is just a structured series of hundreds or thousands or tens of thousands of matrix multiplications carried out in succession, and it’s much more important that these processes be fast than that they be exact. “Normally,” Dean said, “special-purpose hardware is a bad idea. It usually works to speed up one thing. But because of the generality of neural networks, you can leverage this special-purpose hardware for a lot of other things.”
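The trade of exactness for speed can be seen in miniature. The sketch below multiplies two small matrices twice, once at full precision and once after rounding every entry to a whole number, echoing the 12-times-54 example. Rounding to integers is only a stand-in for reduced precision: the article does not describe the number format the chips actually use, and the matrix values here are invented.

```python
def matmul(a, b):
    """Plain matrix multiplication on lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def round_matrix(m):
    """Crude stand-in for low-precision hardware: keep whole numbers only."""
    return [[float(round(v)) for v in row] for row in m]

# Invented example values.
A = [[12.246, 3.71], [6.58, 7.934]]
B = [[54.392, 1.12], [2.665, 9.48]]

exact = matmul(A, B)
rough = matmul(round_matrix(A), round_matrix(B))

for row_exact, row_rough in zip(exact, rough):
    for e, r in zip(row_exact, row_rough):
        print(f"exact {e:9.3f}   rough {r:9.3f}   off by {abs(e - r) / abs(e):.1%}")
```

Every rough answer lands within a few percent of the exact one, which is the bargain the paragraph describes: a network made of thousands of such multiplications can tolerate small errors in each if the whole computation runs much faster.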
Just as the chip-design process was nearly complete, Le and two colleagues finally demonstrated that neural networks might be configured to handle the structure of language. He drew upon an idea, called “word embeddings,” that had been around for more than 10 years. When you summarize images, you can divine a picture of what each stage of the summary looks like — an edge, a circle, etc. When you summarize language in a similar way, you essentially produce multidimensional maps of the distances, based on common usage, between one word and every single other word in the language. The machine is not “analyzing” the data the way that we might, with linguistic rules that identify some of them as nouns and others as verbs. Instead, it is shifting and twisting and warping the words around in the map. In two dimensions, you cannot make this map useful. You want, for example, “cat” to be in the rough vicinity of “dog,” but you also want “cat” to be near “tail” and near “supercilious” and near “meme,” because you want to try to capture all of the different relationships — both strong and weak — that the word “cat” has to other words. It can be related to all these other words simultaneously only if it is related to each of them in a different dimension. You can’t easily make a 160,000-dimensional map, but it turns out you can represent a language pretty well in a mere thousand or so dimensions — in other words, a universe in which each word is designated by a list of a thousand numbers. Le gave me a good-natured hard time for my continual requests for a mental picture of these maps. “Gideon,” he would say, with the blunt regular demurral of Bartleby, “I do not generally like trying to visualize thousand-dimensional vectors in three-dimensional space.”
Still, certain dimensions in the space, it turned out, did seem to represent legible human categories, like gender or relative size. If you took the thousand numbers that meant “king” and literally just subtracted the thousand numbers that meant “queen,” you got the same numerical result as if you subtracted the numbers for “woman” from the numbers for “man.” And if you took the entire space of the English language and the entire space of French, you could, at least in theory, train a network to learn how to take a sentence in one space and propose an equivalent in the other. You just had to give it millions and millions of English sentences as inputs on one side and their desired French outputs on the other, and over time it would recognize the relevant patterns in words the way that an image classifier recognized the relevant patterns in pixels. You could then give it a sentence in English and ask it to predict the best French analogue.
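The king-and-queen arithmetic is easy to reproduce with made-up numbers. In the sketch below, each word gets a hand-written list of three numbers instead of a learned list of a thousand. The axis meanings (royalty, maleness, size) and every value are assumptions for illustration, not output from any real model.

```python
import math

# Hand-made 3-number "embeddings". Real ones are learned from text and far longer.
# Hypothetical axes: [royalty, maleness, size]
VECTORS = {
    "king":  [0.9, 0.9, 0.6],
    "queen": [0.9, 0.1, 0.5],
    "man":   [0.1, 0.9, 0.6],
    "woman": [0.1, 0.1, 0.5],
    "cat":   [0.0, 0.5, 0.1],
}

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def cosine(a, b):
    """Similarity of direction between two vectors (1.0 means identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def nearest(target, exclude):
    """Word whose vector points most nearly the same way as target."""
    candidates = [w for w in VECTORS if w not in exclude]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))

royal_gap = sub(VECTORS["king"], VECTORS["queen"])
common_gap = sub(VECTORS["man"], VECTORS["woman"])
print("king - queen:", royal_gap)
print("man - woman: ", common_gap)

guess = nearest(add(sub(VECTORS["king"], VECTORS["man"]), VECTORS["woman"]),
                exclude={"king", "man", "woman"})
print("king - man + woman is closest to:", guess)
```

The two subtractions give the same list of numbers, and adding that difference back in the other direction lands on "queen," which is the regularity the paragraph describes.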
The major difference between words and pixels, however, is that all of the pixels in an image are there at once, whereas words appear in a progression over time. You needed a way for the network to “hold in mind” the progression of a chronological sequence — the complete pathway from the first word to the last. In a period of about a week, in September 2014, three papers came out — one by Le and two others by academics in Canada and Germany — that at last provided all the theoretical tools necessary to do this sort of thing. That research allowed for open-ended projects like Brain’s Magenta, an investigation into how machines might generate art and music. It also cleared the way toward an instrumental task like machine translation. Hinton told me he thought at the time that this follow-up work would take at least five more years.
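What it means for a network to "hold in mind" a sequence can be shown with the simplest possible recurrent update: a single running number that is revised as each word arrives. This is a minimal sketch, not the architecture from the papers mentioned above, which are far more elaborate. The word codes and the two weights are invented.

```python
import math

# Invented one-number codes for a few words; a real system would use learned embeddings.
WORD_CODES = {"dog": 0.9, "bites": -0.4, "man": 0.3}

def read_sentence(words, w_in=1.5, w_rec=0.8):
    """Fold a sentence into one running 'memory' value, one word at a time."""
    state = 0.0
    for word in words:
        # The new memory depends on the current word AND on everything read so far.
        state = math.tanh(w_in * WORD_CODES[word] + w_rec * state)
    return state

a = read_sentence(["dog", "bites", "man"])
b = read_sentence(["man", "bites", "dog"])
print("dog bites man ->", a)
print("man bites dog ->", b)
```

The same three words in a different order leave the memory in a different place, which is exactly what a bag of pixels, all present at once, never requires.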
6. The Ambush
Le’s paper showed that neural translation was plausible, but he had used only a relatively small public data set. (Small for Google, that is — it was actually the biggest public data set in the world. A decade of the old Translate had gathered production data that was between a hundred and a thousand times bigger.) More important, Le’s model didn’t work very well for sentences longer than about seven words.
Mike Schuster, who then was a staff research scientist at Brain, picked up the baton. He knew that if Google didn’t find a way to scale these theoretical insights up to a production level, someone else would. The project took him the next two years. “You think,” Schuster says, “to translate something, you just get the data, run the experiments and you’re done, but it doesn’t work like that.”
Schuster is a taut, focused, ageless being with a tanned, piston-shaped head, narrow shoulders, long camo cargo shorts tied below the knee and neon-green Nike Flyknits. He looks as if he woke up in the lotus position, reached for his small, rimless, elliptical glasses, accepted calories in the form of a modest portion of preserved acorn and completed a relaxed desert decathlon on the way to the office; in reality, he told me, it’s only an 18-mile bike ride each way. Schuster grew up in Duisburg, in the former West Germany’s blast-furnace district, and studied electrical engineering before moving to Kyoto to work on early neural networks. In the 1990s, he ran experiments with a neural-networking machine as big as a conference room; it cost millions of dollars and had to be trained for weeks to do something you could now do on your desktop in less than an hour. He published a paper in 1997 that was barely cited for a decade and a half; this year it has been cited around 150 times. He is not humorless, but he does often wear an expression of some asperity, which I took as his signature combination of German restraint and Japanese restraint.
The issues Schuster had to deal with were tangled. For one thing, Le’s code was custom-written, and it wasn’t compatible with the new open-source machine-learning platform Google was then developing, TensorFlow. Dean directed to Schuster two other engineers, Yonghui Wu and Zhifeng Chen, in the fall of 2015. It took them two months just to replicate Le’s results on the new system. Le was around, but even he couldn’t always make heads or tails of what they had done.
As Schuster put it, “Some of the stuff was not done in full consciousness. They didn’t know themselves why they worked.”
This February, Google’s research organization — the loose division of the company, roughly a thousand employees in all, dedicated to the forward-looking and the unclassifiable — convened their leads at an offsite retreat at the Westin St. Francis, on Union Square, a luxury hotel slightly less splendid than Google’s own San Francisco shop a mile or so to the east. The morning was reserved for rounds of “lightning talks,” quick updates to cover the research waterfront, and the afternoon was idled away in cross-departmental “facilitated discussions.” The hope was that the retreat might provide an occasion for the unpredictable, oblique, Bell Labs-ish exchanges that kept a mature company prolific.
At lunchtime, Corrado and Dean paired up in search of Macduff Hughes, director of Google Translate. Hughes was eating alone, and the two Brain members took positions at either side. As Corrado put it, “We ambushed him.”
“O.K.,” Corrado said to the wary Hughes, holding his breath for effect. “We have something to tell you.”
They told Hughes that 2016 seemed like a good time to consider an overhaul of Google Translate — the code of hundreds of engineers over 10 years — with a neural network. The old system worked the way all machine translation has worked for about 30 years: It sequestered each successive sentence fragment, looked up those words in a large statistically derived vocabulary table, then applied a battery of post-processing rules to affix proper endings and rearrange it all to make sense. The approach is called “phrase-based statistical machine translation,” because by the time the system gets to the next phrase, it doesn’t know what the last one was. This is why Translate’s output sometimes looked like a shaken bag of fridge magnets. Brain’s replacement would, if it came together, read and render entire sentences at one draft. It would capture context — and something akin to meaning.
The stakes may have seemed low: Translate generates minimal revenue, and it probably always will. For most Anglophone users, even a radical upgrade in the service’s performance would hardly be hailed as anything more than an expected incremental bump. But there was a case to be made that human-quality machine translation is not only a short-term necessity but also a development very likely, in the long term, to prove transformational. In the immediate future, it’s vital to the company’s business strategy. Google estimates that 50 percent of the internet is in English, which perhaps 20 percent of the world’s population speaks. If Google was going to compete in China — where a majority of market share in search-engine traffic belonged to its competitor Baidu — or India, decent machine translation would be an indispensable part of the infrastructure. Baidu itself had published a pathbreaking paper about the possibility of neural machine translation in July 2015.
And in the more distant, speculative future, machine translation was perhaps the first step toward a general computational facility with human language. This would represent a major inflection point — perhaps the major inflection point — in the development of something that felt like true artificial intelligence.

Most people in Silicon Valley were aware of machine learning as a fast-approaching horizon, so Hughes had seen this ambush coming. He remained skeptical. A modest, sturdily built man of early middle age with mussed auburn hair graying at the temples, Hughes is a classic line engineer, the sort of craftsman who wouldn’t have been out of place at a drafting table at 1970s Boeing. His jeans pockets often look burdened with curious tools of ungainly dimension, as if he were porting around measuring tapes or thermocouples, and unlike many of the younger people who work for him, he has a wardrobe unreliant on company gear. He knew that various people in various places at Google and elsewhere had been trying to make neural translation work — not in a lab but at production scale — for years, to little avail.
Hughes listened to their case and, at the end, said cautiously that it sounded to him as if maybe they could pull it off in three years.
Dean thought otherwise. “We can do it by the end of the year, if we put our minds to it.” One reason people liked and admired Dean so much was that he had a long record of successfully putting his mind to it. Another was that he wasn’t at all embarrassed to say sincere things like “if we put our minds to it.”
Hughes was sure the conversion wasn’t going to happen any time soon, but he didn’t personally care to be the reason. “Let’s prepare for 2016,” he went back and told his team. “I’m not going to be the one to say Jeff Dean can’t deliver speed.”
A month later, they were finally able to run a side-by-side experiment to compare Schuster’s new system with Hughes’s old one. Schuster wanted to run it for English-French, but Hughes advised him to try something else. “English-French,” he said, “is so good that the improvement won’t be obvious.”
It was a challenge Schuster couldn’t resist. The benchmark metric to evaluate machine translation is called a BLEU score, which compares a machine translation with an average of many reliable human translations. At the time, the best BLEU scores for English-French were in the high 20s. An improvement of one point was considered very good; an improvement of two was considered outstanding.
The neural system, on the English-French language pair, showed an improvement over the old system of seven points.
Hughes told Schuster’s team they hadn’t had even half as strong an improvement in their own system in the last four years.
To be sure this wasn’t some fluke in the metric, they also turned to their pool of human contractors to do a side-by-side comparison. The user-perception scores, in which sample sentences were graded from zero to six, showed an average improvement of 0.4 — roughly equivalent to the aggregate gains of the old system over its entire lifetime of development.
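For readers curious how a BLEU-style number is produced, here is a stripped-down sketch. It is a simplification under stated assumptions: it compares against a single reference translation rather than many, counts only one-word and two-word sequences, and the function name `simple_bleu` is made up. It is not the scoring code Google used.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count every run of n consecutive words."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    """Score from 0 to 100: how much of the candidate's wording appears in the reference."""
    cand, ref = candidate.split(), reference.split()
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Credit each word sequence only as often as the reference contains it.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * brevity * math.exp(log_precision_sum / max_n)

reference = "the cat is on the mat"
print(simple_bleu("the cat is on the mat", reference))
print(simple_bleu("the cat sat on the mat", reference))
```

A single wrong word costs this toy score dearly because it breaks both a one-word match and two two-word matches, which gives some feel for why a gain of even one or two points on real test sets was hard won.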
Google’s Quoc Le (right), whose work demonstrated the plausibility of neural translation, with Mike Schuster, who helped apply that work to Google Translate. Credit: Brian Finke for The New York Times
In mid-March, Hughes sent his team an email. All projects on the old system were to be suspended immediately.

7. Theory Becomes Product
Until then, the neural-translation team had been only three people — Schuster, Wu and Chen — but with Hughes’s support, the broader team began to coalesce. They met under Schuster’s command on Wednesdays at 2 p.m. in a corner room of the Brain building called Quartz Lake. The meeting was generally attended by a rotating cast of more than a dozen people. When Hughes or Corrado were there, they were usually the only native English speakers. The engineers spoke Chinese, Vietnamese, Polish, Russian, Arabic, German and Japanese, though they mostly spoke in their own efficient pidgin and in math. It is not always totally clear, at Google, who is running a meeting, but in Schuster’s case there was no ambiguity.
The steps they needed to take, even then, were not wholly clear. “This story is a lot about uncertainty — uncertainty throughout the whole process,” Schuster told me at one point. “The software, the data, the hardware, the people. It was like” — he extended his long, gracile arms, slightly bent at the elbows, from his narrow shoulders — “swimming in a big sea of mud, and you can only see this far.” He held out his hand eight inches in front of his chest. “There’s a goal somewhere, and maybe it’s there.”
Most of Google’s conference rooms have videochat monitors, which when idle display extremely high-resolution oversaturated public Google+ photos of a sylvan dreamscape or the northern lights or the Reichstag. Schuster gestured toward one of the panels, which showed a crystalline still of the Washington Monument at night.

“The view from outside is that everyone has binoculars and can see ahead so far.”
The theoretical work to get them to this point had already been painstaking and drawn-out, but the attempt to turn it into a viable product — the part that academic scientists might dismiss as “mere” engineering — was no less difficult. For one thing, they needed to make sure that they were training on good data. Google’s billions of words of training “reading” were mostly made up of complete sentences of moderate complexity, like the sort of thing you might find in Hemingway. Some of this is in the public domain: The original Rosetta Stone of statistical machine translation was millions of pages of the complete bilingual records of the Canadian Parliament. Much of it, however, was culled from 10 years of collected data, including human translations that were crowdsourced from enthusiastic respondents. The team had in their storehouse about 97 million unique English “words.” But once they removed the emoticons, and the misspellings, and the redundancies, they had a working vocabulary of only around 160,000.
Then you had to refocus on what users actually wanted to translate, which frequently had very little to do with reasonable language as it is employed. Many people, Google had found, don’t look to the service to translate full, complex sentences; they translate weird little shards of language. If you wanted the network to be able to handle the stream of user queries, you had to be sure to orient it in that direction. The network was very sensitive to the data it was trained on. As Hughes put it to me at one point: “The neural-translation system is learning everything it can. It’s like a toddler. ‘Oh, Daddy says that word when he’s mad!’ ” He laughed. “You have to be careful.”
More than anything, though, they needed to make sure that the whole thing was fast and reliable enough that their users wouldn’t notice. In February, the translation of a 10-word sentence took 10 seconds. They could never introduce anything that slow. The Translate team began to conduct latency experiments on a small percentage of users, in the form of faked delays, to identify tolerance. They found that a translation that took twice as long, or even five times as long, wouldn’t be registered. An eightfold slowdown would. They didn’t need to make sure this was true across all languages. In the case of a high-traffic language, like French or Chinese, they could countenance virtually no slowdown. For something more obscure, they knew that users wouldn’t be so scared off by a slight delay if they were getting better quality. They just wanted to prevent people from giving up and switching over to some competitor’s service.
Schuster, for his part, admitted he just didn’t know if they ever could make it fast enough. He remembers a conversation in the microkitchen during which he turned to Chen and said, “There must be something we don’t know to make it fast enough, but I don’t know what it could be.”
He did know, though, that they needed more computers — “G.P.U.s,” graphics processors reconfigured for neural networks — for training.
Hughes went to Schuster to ask what he thought. “Should we ask for a thousand G.P.U.s?”
Schuster said, “Why not 2,000?”
Ten days later, they had the additional 2,000 processors.
By April, the original lineup of three had become more than 30 people — some of them, like Le, on the Brain side, and many from Translate. In May, Hughes assigned a kind of provisional owner to each language pair, and they all checked their results into a big shared spreadsheet of performance evaluations. At any given time, at least 20 people were running their own independent weeklong experiments and dealing with whatever unexpected problems came up. One day a model, for no apparent reason, started taking all the numbers it came across in a sentence and discarding them. There were months when it was all touch and go. “People were almost yelling,” Schuster said.
By late spring, the various pieces were coming together. The team introduced something called a “word-piece model,” a “coverage penalty,” “length normalization.” Each part improved the results, Schuster says, by maybe a few percentage points, but in aggregate they had significant effects. Once the model was standardized, it would be only a single multilingual model that would improve over time, rather than the 150 different models that Translate currently used. Still, the paradox — that a tool built to further generalize, via learning machines, the process of automation required such an extraordinary amount of concerted human ingenuity and effort — was not lost on them. So much of what they did was just gut. How many neurons per layer did you use? 1,024 or 512? How many layers? How many sentences did you run through at a time? How long did you train for?
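Of the pieces named above, "length normalization" is the easiest to illustrate. A translation system scores each candidate sentence by its log-probability, and because every extra word makes that number more negative, raw scores favor curt answers. Dividing by a penalty that grows with length evens the field. The article does not give the team's formula. The penalty below is one form that appears in the neural-translation literature, and the value of `alpha`, the candidate sentences and their scores are all assumptions.

```python
def length_penalty(length, alpha=0.6):
    """Grows slowly with sentence length; alpha=0 switches normalization off."""
    return ((5 + length) / 6) ** alpha

def normalized_score(log_prob, length, alpha=0.6):
    """Log-probability divided by the length penalty (closer to zero is better)."""
    return log_prob / length_penalty(length, alpha)

# Hypothetical candidate translations with made-up total log-probabilities.
CANDIDATES = [
    ("Yes .", -2.0),
    ("Yes , that is what I meant .", -2.6),
]

best_raw = max(CANDIDATES, key=lambda c: c[1])[0]
best_normalized = max(
    CANDIDATES,
    key=lambda c: normalized_score(c[1], len(c[0].split())),
)[0]

print("raw pick:       ", best_raw)
print("normalized pick:", best_normalized)
```

With raw scores the two-token reply wins. After normalization the fuller sentence does, and `alpha` is one more of the knobs that, like layer counts and training time, had to be set by experiment.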
“We did hundreds of experiments,” Schuster told me, “until we knew that we could stop the training after one week. You’re always saying: When do we stop? How do I know I’m done? You never know you’re done. The machine-learning mechanism is never perfect. You need to train, and at some point you have to stop. That’s the very painful nature of this whole system. It’s hard for some people. It’s a little bit an art — where you put your brush to make it nice. It comes from just doing it. Some people are better, some worse.”
By May, the Brain team understood that the only way they were ever going to make the system fast enough to implement as a product was if they could run it on T.P.U.s, the special-purpose chips that Dean had called for. As Chen put it: “We did not even know if the code would work. But we did know that without T.P.U.s, it definitely wasn’t going to work.” He remembers going to Dean one on one to plead, “Please reserve something for us.” Dean had reserved them. The T.P.U.s, however, didn’t work right out of the box. Wu spent two months sitting next to someone from the hardware team in an attempt to figure out why. They weren’t just debugging the model; they were debugging the chip. The neural-translation project would be proof of concept for the whole infrastructural investment.
One Wednesday in June, the meeting in Quartz Lake began with murmurs about a Baidu paper that had recently appeared on the discipline’s chief online forum. Schuster brought the room to order. “Yes, Baidu came out with a paper. It feels like someone looking through our shoulder — similar architecture, similar results.” The company’s BLEU scores were essentially what Google achieved in its internal tests in February and March. Le didn’t seem ruffled; his conclusion seemed to be that it was a sign Google was on the right track. “It is very similar to our system,” he said with quiet approval.
The Google team knew that they could have published their results earlier and perhaps beaten their competitors, but as Schuster put it: “Launching is more important than publishing. People say, ‘Oh, I did something first,’ but who cares, in the end?”
This did, however, make it imperative that they get their own service out first and better. Hughes had a fantasy that they wouldn’t even inform their users of the switch. They would just wait and see if social media lit up with suspicions about the vast improvements.
“We don’t want to say it’s a new system yet,” he told me at 5:36 p.m. two days after Labor Day, one minute before they rolled out Chinese-to-English to 10 percent of their users, without telling anyone. “We want to make sure it works. The ideal is that it’s exploding on Twitter: ‘Have you seen how awesome Google Translate got?’ ”

8. A Celebration
The only two reliable measures of time in the seasonless Silicon Valley are the rotations of seasonal fruit in the microkitchens — from the pluots of midsummer to the Asian pears and Fuyu persimmons of early fall — and the zigzag of technological progress. On an almost uncomfortably warm Monday afternoon in late September, the team’s paper was at last released. It had an almost comical 31 authors. The next day, the members of Brain and Translate gathered to throw themselves a little celebratory reception in the Translate microkitchen. The rooms in the Brain building, perhaps in homage to the long winters of their diaspora, are named after Alaskan locales; the Translate building’s theme is Hawaiian.
The Hawaiian microkitchen has a slightly grainy beach photograph on one wall, a small lei-garlanded thatched-hut service counter with a stuffed parrot at the center and ceiling fixtures fitted to resemble paper lanterns. Two sparse histograms of bamboo poles line the sides, like the posts of an ill-defended tropical fort. Beyond the bamboo poles, glass walls and doors open onto rows of identical gray desks on either side. That morning had seen the arrival of new hooded sweatshirts to honor 10 years of Translate, and many team members went over to the party from their desks in their new gear. They were in part celebrating the fact that their decade of collective work was, as of that day, en route to retirement. At another institution, these new hoodies might thus have become a costume of bereavement, but the engineers and computer scientists from both teams all seemed pleased.
‘It was like swimming in a big sea of mud, and you can only see this far.’ Schuster held out his hand eight inches in front of his chest. 
Google’s neural translation was at last working. By the time of the party, the company’s Chinese-English test had already processed 18 million queries. One engineer on the Translate team was running around with his phone out, trying to translate entire sentences from Chinese to English using Baidu’s alternative. He crowed with glee to anybody who would listen. “If you put in more than two characters at once, it times out!” (Baidu says this problem has never been reported by users.)
When word began to spread, over the following weeks, that Google had introduced neural translation for Chinese to English, some people speculated that it was because that was the only language pair for which the company had decent results. Everybody at the party knew that the reality of their achievement would be clear in November. By then, however, many of them would be on to other projects.
Hughes cleared his throat and stepped in front of the tiki bar. He wore a faded green polo with a rumpled collar, lightly patterned across the midsection with dark bands of drying sweat. There had been last-minute problems, and then last-last-minute problems, including a very big measurement error in the paper and a weird punctuation-related bug in the system. But everything was resolved — or at least sufficiently resolved for the moment. The guests quieted. Hughes ran efficient and productive meetings, with a low tolerance for maundering or side conversation, but he was given pause by the gravity of the occasion. He acknowledged that he was, perhaps, stretching a metaphor, but it was important to him to underline the fact, he began, that the neural translation project itself represented a “collaboration between groups that spoke different languages.”
Their neural-translation project, he continued, represented a “step function forward” — that is, a discontinuous advance, a vertical leap rather than a smooth curve. The relevant translation had been not just between the two teams but from theory into reality. He raised a plastic demi-flute of expensive-looking Champagne.
“To communication,” he said, “and cooperation!”

The engineers assembled looked around at one another and gave themselves over to little circumspect whoops and applause.
Jeff Dean stood near the center of the microkitchen, his hands in his pockets, shoulders hunched slightly inward, with Corrado and Schuster. Dean saw that there was some diffuse preference that he contribute to the observance of the occasion, and he did so in a characteristically understated manner, with a light, rapid, concise addendum.
What they had shown, Dean said, was that they could do two major things at once: “Do the research and get it in front of, I dunno, half a billion people.”
Everyone laughed, not because it was an exaggeration but because it wasn’t.

Epilogue: Machines Without Ghosts
Perhaps the most famous historic critique of artificial intelligence, or the claims made on its behalf, implicates the question of translation. The Chinese Room argument was proposed in 1980 by the Berkeley philosopher John Searle. In Searle’s thought experiment, a monolingual English speaker sits alone in a cell. An unseen jailer passes him, through a slot in the door, slips of paper marked with Chinese characters. The prisoner has been given a set of tables and rules in English for the composition of replies. He becomes so adept with these instructions that his answers are soon “absolutely indistinguishable from those of Chinese speakers.” Should the unlucky prisoner be said to “understand” Chinese? Searle thought the answer was obviously not. This metaphor for a computer, Searle later wrote, exploded the claim that “the appropriately programmed digital computer with the right inputs and outputs would thereby have a mind in exactly the sense that human beings have minds.”
For the Google Brain team, though, or for nearly everyone else who works in machine learning in Silicon Valley, that view is entirely beside the point. This doesn’t mean they’re just ignoring the philosophical question. It means they have a fundamentally different view of the mind. Unlike Searle, they don’t assume that “consciousness” is some special, numinously glowing mental attribute — what the philosopher Gilbert Ryle called the “ghost in the machine.” They just believe instead that the complex assortment of skills we call “consciousness” has randomly emerged from the coordinated activity of many different simple mechanisms. The implication is that our facility with what we consider the higher registers of thought are no different in kind from what we’re tempted to perceive as the lower registers. Logical reasoning, on this account, is seen as a lucky adaptation; so is the ability to throw and catch a ball. Artificial intelligence is not about building a mind; it’s about the improvement of tools to solve problems. As Corrado said to me on my very first day at Google, “It’s not about what a machine ‘knows’ or ‘understands’ but what it ‘does,’ and — more importantly — what it doesn’t do yet.”
Where you come down on “knowing” versus “doing” has real cultural and social implications. At the party, Schuster came over to me to express his frustration with the paper’s media reception. “Did you see the first press?” he asked me. He paraphrased a headline from that morning, blocking it word by word with his hand as he recited it: GOOGLE SAYS A.I. TRANSLATION IS INDISTINGUISHABLE FROM HUMANS’. Over the final weeks of the paper’s composition, the team had struggled with this; Schuster often repeated that the message of the paper was “It’s much better than it was before, but not as good as humans.” He had hoped it would be clear that their efforts weren’t about replacing people but helping them.
And yet the rise of machine learning makes it more difficult for us to carve out a special place for us. If you believe, with Searle, that there is something special about human “insight,” you can draw a clear line that separates the human from the automated. If you agree with Searle’s antagonists, you can’t. It is understandable why so many people cling fast to the former view. At a 2015 M.I.T. conference about the roots of artificial intelligence, Noam Chomsky was asked what he thought of machine learning. He pooh-poohed the whole enterprise as mere statistical prediction, a glorified weather forecast. Even if neural translation attained perfect functionality, it would reveal nothing profound about the underlying nature of language. It could never tell you if a pronoun took the dative or the accusative case. This kind of prediction makes for a good tool to accomplish our ends, but it doesn’t succeed by the standards of furthering our understanding of why things happen the way they do. A machine can already detect tumors in medical scans better than human radiologists, but the machine can’t tell you what’s causing the cancer.
Then again, can the radiologist?
Medical diagnosis is one field most immediately, and perhaps unpredictably, threatened by machine learning. Radiologists are extensively trained and extremely well paid, and we think of their skill as one of professional insight — the highest register of thought. In the past year alone, researchers have shown not only that neural networks can find tumors in medical images much earlier than their human counterparts but also that machines can even make such diagnoses from the texts of pathology reports. What radiologists do turns out to be something much closer to predictive pattern-matching than logical analysis. They’re not telling you what caused the cancer; they’re just telling you it’s there.
Once you’ve built a robust pattern-matching apparatus for one purpose, it can be tweaked in the service of others. One Translate engineer took a network he put together to judge artwork and used it to drive an autonomous radio-controlled car. A network built to recognize a cat can be turned around and trained on CT scans — and on infinitely more examples than even the best doctor could ever review. A neural network built to translate could work through millions of pages of documents of legal discovery in the tiniest fraction of the time it would take the most expensively credentialed lawyer. The kinds of jobs taken by automatons will no longer be just repetitive tasks that were once — unfairly, it ought to be emphasized — associated with the supposed lower intelligence of the uneducated classes. We’re not only talking about three and a half million truck drivers who may soon lack careers. We’re talking about inventory managers, economists, financial advisers, real estate agents. What Brain did over nine months is just one example of how quickly a small group at a large company can automate a task nobody ever would have associated with machines.
The most important thing happening in Silicon Valley right now is not disruption. Rather, it’s institution-building — and the consolidation of power — on a scale and at a pace that are both probably unprecedented in human history. Brain has interns; it has residents; it has “ninja” classes to train people in other departments. Everywhere there are bins of free bike helmets, and free green umbrellas for the two days a year it rains, and little fruit salads, and nap pods, and shared treadmill desks, and massage chairs, and random cartons of high-end pastries, and places for baby-clothes donations, and two-story climbing walls with scheduled instructors, and reading groups and policy talks and variegated support networks. The recipients of these major investments in human cultivation — for they’re far more than perks for proles in some digital salt mine — have at hand the power of complexly coordinated servers distributed across 13 data centers on four continents, data centers that draw enough electricity to light up large cities.
But even enormous institutions like Google will be subject to this wave of automation; once machines can learn from human speech, even the comfortable job of the programmer is threatened. As the party in the tiki bar was winding down, a Translate engineer brought over his laptop to show Hughes something. The screen swirled and pulsed with a vivid, kaleidoscopic animation of brightly colored spheres in long looping orbits that periodically collapsed into nebulae before dispersing once more.
Hughes recognized what it was right away, but I had to look closely before I saw all the names — of people and files. It was an animation of the history of 10 years of changes to the Translate code base, every single buzzing and blooming contribution by every last team member. Hughes reached over gently to skip forward, from 2006 to 2008 to 2015, stopping every once in a while to pause and remember some distant campaign, some ancient triumph or catastrophe that now hurried by to be absorbed elsewhere or to burst on its own. Hughes pointed out how often Jeff Dean’s name expanded here and there in glowing spheres.
Hughes called over Corrado, and they stood transfixed. To break the spell of melancholic nostalgia, Corrado, looking a little wounded, looked up and said, “So when do we get to delete it?”
“Don’t worry about it,” Hughes said. “The new code base is going to grow. Everything grows.”
Correction: December 22, 2016 
An earlier version of this article referred incorrectly to a computer used in space travel. A computer was used to guide Apollo missions — not the “Apollo shuttle.” (There was no such shuttle.)
Gideon Lewis-Kraus is a writer at large for the magazine and a fellow at New America. He last wrote about the contradictions of travel photography.
ORIGINAL: NYTimes
BY GIDEON LEWIS-KRAUS
DEC. 14, 2016

Carbon Nanotube Transistors Finally Outperform Silicon

By Hugo Angel,

Photo: Stephanie Precourt/UW-Madison College of Engineering

Back in the 1990s, observers predicted that the single-walled carbon nanotube (SWCNT) would be the nanomaterial that pushed silicon aside and created a post-CMOS world where Moore’s Law could continue its march towards ever-smaller chip dimensions. All of that hope was swallowed up by inconsistencies between semiconducting and metallic SWCNTs and the vexing issue of trying to get them all to align on a wafer.

The introduction of graphene seemed to take the final bit of luster off of carbon nanotubes’ shine, but the material, which researchers have been using to make transistors for over 20 years, has experienced a renaissance of late.
Now, researchers at the University of Wisconsin-Madison (UW-Madison) have given SWCNTs a new boost in their resurgence by using them to make a transistor that outperforms state-of-the-art silicon transistors.
“This achievement has been a dream of nanotechnology for the last 20 years,” said Michael Arnold, a professor at UW-Madison, in a press release. “Making carbon nanotube transistors that are better than silicon transistors is a big milestone,” Arnold added. “[It’s] a critical advance toward exploiting carbon nanotubes in logic, high-speed communications, and other semiconductor electronics technologies.”
In research described in the journal Science Advances, the UW-Madison researchers were able to achieve a current 1.9 times that seen in silicon transistors. How rapidly current can travel through the channel between a transistor’s source and drain determines how fast the circuit is. The more current there is, the more quickly the gate of the next device in the circuit can be charged.
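As a rough back-of-envelope illustration of that last point, the time needed to charge the next gate at a constant drive current can be approximated as t = C × V / I. The capacitance, voltage and baseline current below are made-up illustrative values, not figures from the Science Advances paper; only the 1.9 ratio comes from the article.

```python
def gate_charge_time(capacitance_f, voltage_v, current_a):
    """Time (seconds) to charge a gate at constant current: t = C * V / I."""
    return capacitance_f * voltage_v / current_a


# Illustrative values only (assumptions, not measurements from the paper).
C_GATE = 1e-15                  # gate capacitance in farads
V_DD = 1.0                      # supply voltage in volts
I_SILICON = 1e-5                # baseline drive current in amperes
I_NANOTUBE = 1.9 * I_SILICON    # the 1.9x ratio reported in the article

t_silicon = gate_charge_time(C_GATE, V_DD, I_SILICON)
t_nanotube = gate_charge_time(C_GATE, V_DD, I_NANOTUBE)

# With everything else held equal, charging time shrinks by the current ratio.
speedup = t_silicon / t_nanotube
print(f"charging is {speedup:.2f}x faster")
```

Under this simplified model the speedup equals the current ratio; real circuits also depend on capacitance, voltage and wiring, which this sketch holds fixed.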

The key to getting the nanotubes to create such a fast transistor was a new process that employs polymers to sort between the metallic and semiconducting SWCNTs to create an ultra-high purity of solution.
“We’ve identified specific conditions in which you can get rid of nearly all metallic nanotubes, [leaving] less than 0.01 percent metallic nanotubes [in a sample],” said Arnold.
The researchers had already tackled the problem of aligning and placing the nanotubes on a wafer two years ago when they developed a process they dubbed “floating evaporative self-assembly.” That technique uses a hydrophobic substrate and partially submerges it in water. Then the SWCNTs are deposited on its surface and the substrate removed vertically from the water.
“In our research, we’ve shown that we can simultaneously overcome all of these challenges of working with nanotubes, and that has allowed us to create these groundbreaking carbon nanotube transistors that surpass silicon and gallium arsenide transistors,” said Arnold.
In the video below, Arnold provides a little primer on SWCNTs and what his group’s research with them could mean to the future of electronics.

In continuing research, the UW-Madison team will be aiming to replicate the manufacturability of silicon transistors. To date, they have managed to scale their alignment and deposition process to 1-inch-by-1-inch wafers; the longer-term goal is to bring this up to commercial scales.

Arnold added: “There has been a lot of hype about carbon nanotubes that hasn’t been realized, and that has kind of soured many people’s outlook. But we think the hype is deserved. It has just taken decades of work for the materials science to catch up and allow us to effectively harness these materials.”

ORIGINAL: IEEE

By Dexter Johnson
6 Sep 2016

Quantum Computers Explained – Limits of Human Technology

By Hugo Angel,

Where are the limits of human technology? And can we somehow avoid them? This is where quantum computers become very interesting. 
Check out THE NOVA PROJECT to learn more about dark energy: www.nova.org.au 


ORIGINAL: YouTube




IBM, Local Motors debut Olli, the first Watson-powered self-driving vehicle

By Hugo Angel,

Olli hits the road in the Washington, D.C. area and later this year in Miami-Dade County and Las Vegas.
Local Motors CEO and co-founder John B. Rogers, Jr. with “Olli” & IBM, June 15, 2016. Rich Riggins/Feature Photo Service for IBM

IBM, along with the Arizona-based manufacturer Local Motors, debuted the first-ever driverless vehicle to use the Watson cognitive computing platform. Dubbed “Olli,” the electric vehicle was unveiled at Local Motors’ new facility in National Harbor, Maryland, just outside of Washington, D.C.

Olli, which can carry up to 12 passengers, taps into four Watson APIs (Speech to Text, Natural Language Classifier, Entity Extraction and Text to Speech) to interact with its riders. It can answer questions like “Can I bring my children on board?” and respond to basic operational commands like, “Take me to the closest Mexican restaurant.” Olli can also give vehicle diagnostics, answering questions like, “Why are you stopping?”

Olli learns from data produced by more than 30 sensors embedded throughout the vehicle, which will be added and adjusted to meet passenger needs and local preferences.
While Olli is the first self-driving vehicle to use IBM Watson Internet of Things (IoT), this isn’t Watson’s first foray into the automotive industry. IBM launched its IoT for Automotive unit in September of last year, and in March, IBM and Honda announced a deal for Watson technology and analytics to be used in the automaker’s Formula One (F1) cars and pits.
IBM demonstrated its commitment to IoT in March of last year, when it announced it was spending $3B over four years to establish a separate IoT business unit, which later became the Watson IoT business unit.
IBM says that starting Thursday, Olli will be used on public roads locally in Washington, D.C. and will be used in Miami-Dade County and Las Vegas later this year. Miami-Dade County is exploring a pilot program that would deploy several autonomous vehicles to shuttle people around Miami.
ORIGINAL: ZDnet
By Stephanie Condon for Between the Lines
June 16, 2016

The Quest to Make Code Work Like Biology Just Took A Big Step

By Hugo Angel,


Chef CTO Adam Jacob. CHRISTIE HEMM KLOK/WIRED
IN THE EARLY 1970s, at Silicon Valley’s Xerox PARC, Alan Kay envisioned computer software as something akin to a biological system, a vast collection of small cells that could communicate via simple messages. Each cell would perform its own discrete task. But in communicating with the rest, it would form a more complex whole. “This is an almost foolproof way of operating,” Kay once told me. Computer programmers could build something large by focusing on something small. That’s a simpler task, and in the end, the thing you build is stronger and more efficient. 
The result was a programming language called SmallTalk. Kay called it an object-oriented language—the “objects” were the cells—and it spawned so many of the languages that programmers use today, from Objective-C and Swift, which run all the apps on your Apple iPhone, to Java, Google’s language of choice on Android phones. Kay’s vision of code as biology is now the norm. It’s how the world’s programmers think about building software.

In the ’70s, Alan Kay was a researcher at Xerox PARC, where he helped develop the notion of personal computing, the laptop, the now ubiquitous overlapping-window interface, and object-oriented programming.
COMPUTER HISTORY MUSEUM
But Kay’s big idea extends well beyond individual languages like Swift and Java. This is also how Google, Twitter, and other Internet giants now think about building and running their massive online services. The Google search engine isn’t software that runs on a single machine. Serving millions upon millions of people around the globe, it’s software that runs on thousands of machines spread across multiple computer data centers. Google runs this entire service like a biological system, as a vast collection of self-contained pieces that work in concert. It can readily spread those cells of code across all those machines, and when machines break—as they inevitably do—it can move code to new machines and keep the whole alive. 
Now, Adam Jacob wants to bring this notion to every other business on earth. Jacob is a bearded former comic-book-store clerk who, in the grand tradition of Alan Kay, views technology like a philosopher. He’s also the chief technology officer and co-founder of Chef, a Seattle company that has long helped businesses automate the operation of their online services through a techno-philosophy known as “DevOps.” Today, he and his company unveiled a new creation they call Habitat. Habitat is a way of packaging entire applications into something akin to Alan Kay’s biological cells, squeezing in not only the application code but everything needed to run, oversee, and update that code—all its “dependencies,” in programmer-speak. Then you can deploy hundreds or even thousands of these cells across a network of machines, and they will operate as a whole, with Habitat handling all the necessary communication between each cell. “With Habitat,” Jacob says, “all of the automation travels with the application itself.” 
That’s something that will at least capture the imagination of coders. And if it works, it will serve the rest of us too. If businesses push their services towards the biological ideal, then we, the people who use those services, will end up with technology that just works better—that coders can improve more easily and more quickly than before.
Reduce, Reuse, Repackage 
Habitat is part of a much larger effort to remake any online business in the image of Google. Alex Polvi, CEO and founder of a startup called CoreOS, calls this movement GIFEE—or Google Infrastructure For Everyone Else—and it includes tools built by CoreOS as well as such companies as Docker and Mesosphere, not to mention Google itself. The goal: to create tools that more efficiently juggle software across the vast computer networks that drive the modern digital world. 
But Jacob seeks to shift this idea’s center of gravity. He wants to make it as easy as possible for businesses to run their existing applications in this enormously distributed manner. He wants businesses to embrace this ideal even if they’re not willing to rebuild these applications or the computer platforms they run on. He aims to provide a way of wrapping any code—new or old—in an interface that can run on practically any machine. Rather than rebuilding your operation in the image of Google, Jacob says, you can simply repackage it.
“If what I want is an easier application to manage, why do I need to change the infrastructure for that application?” he says. It’s yet another extension of Alan Kay’s biological metaphor—as he himself will tell you. When I describe Habitat to Kay—now revered as one of the founding fathers of the PC, alongside so many other PARC researchers—he says it does what SmallTalk did so long ago.
The Unknown Programmer 
Kay traces the origins of SmallTalk to his time in the Air Force. In 1961, he was stationed at Randolph Air Force Base near San Antonio, Texas, and he worked as a programmer, building software for a vacuum-tube computer called the Burroughs 220. In those days, computers didn’t have operating systems. No Apple iOS. No Windows. No Unix. And data didn’t come packaged in standard file formats. No .doc. No .xls. No .txt. But the Air Force needed a way of sending files between bases so that different machines could read them. Sometime before Kay arrived, another Air Force programmer—whose name is lost to history—cooked up a good way. 
This unnamed programmer—“almost certainly an enlisted man,” Kay says, “because officers didn’t program back then”—would put data on a magnetic-tape reel along with all the procedures needed to read that data. Then, he tacked on a simple interface—a few “pointers,” in programmer-speak—that allowed the machine to interact with those procedures. To read the data, all the machine needed to understand were the pointers—not a whole new way of doing things. In this way, someone like Kay could read the tape from any machine on any Air Force base. 
Kay’s programming objects worked in a similar way. Each did its own thing, but could communicate with the outside world through a simple interface. That meant coders could readily plug an old object into a new program, or reuse it several times across the same program. Today, this notion is fundamental to software design. And now, Habitat wants to recreate this dynamic on a higher level: not within an application, but in a way that allows an application to run across a vast computer network.
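The tape-reel idea, and the objects it inspired, can be sketched in a few lines. This is only an illustration: the class name, the two decoders and the sample records are all hypothetical, invented here to show data travelling with the procedure that reads it behind one small interface.

```python
class TapeReel:
    """Data bundled with the procedure needed to read it (hypothetical sketch)."""

    def __init__(self, raw_bytes, decoder):
        self._raw = raw_bytes      # stored in whatever layout the writer chose
        self._decoder = decoder    # the procedure that understands that layout

    def read(self):
        """The single entry point a reading machine needs to know about."""
        return self._decoder(self._raw)


def decode_fixed_width(raw):
    # Layout A: records are exactly four ASCII characters each.
    text = raw.decode("ascii")
    return [text[i:i + 4] for i in range(0, len(text), 4)]


def decode_comma_separated(raw):
    # Layout B: records are separated by commas.
    return raw.decode("ascii").split(",")


# Two reels written in different layouts, read through the same interface.
reels = [
    TapeReel(b"ALFABRVOCHRL", decode_fixed_width),
    TapeReel(b"ALFA,BRVO,CHRL", decode_comma_separated),
]
records = [reel.read() for reel in reels]
print(records)
```

The caller never inspects the layout; it only calls `read()`, which is why the same reading code works for both reels.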
Because Habitat wraps an application in a package that includes everything needed to run and oversee the application—while fronting this package with a simple interface—you can potentially run that application on any machine. Or, indeed, you can spread tens, hundreds, or even thousands of packages across a vast network of machines. Software called the Habitat Supervisor sits on each machine, running each package and ensuring it can communicate with the rest. Chef designed this Supervisor, written in a new programming language called Rust that is suited to modern online systems, specifically to juggle code on an enormous scale.
But the important stuff lies inside those packages. Each package includes everything you need to orchestrate the application, as modern coders say, across myriad machines. Once you deploy your packages across a network, Jacob says, they can essentially orchestrate themselves. Instead of overseeing the application from one central nerve center, you can distribute the task—the ultimate aim of Kay’s biological system. That’s simpler and less likely to fail, at least in theory. 
What’s more, each package includes everything you need to modify the application—to, say, update the code or apply new security rules. This is what Jacob means when he says that all the automation travels with the application. “Having the management go with the package,” he says, “means I can manage in the same way, no matter where I choose to run it.” That’s vital in the modern world. Online code is constantly changing, and this system is designed for change.
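To make "the automation travels with the application" concrete, here is a toy model. It is not Habitat's actual package format or Supervisor API; every class, method and host name below is a hypothetical stand-in, meant only to show a package that carries its own start, update and configuration logic so that it is managed the same way wherever it runs.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Toy package: application metadata plus its own management operations."""
    name: str
    version: str
    dependencies: list
    config: dict = field(default_factory=dict)
    running: bool = False

    def start(self):
        self.running = True

    def update(self, new_version):
        self.version = new_version

    def apply_config(self, **settings):
        self.config.update(settings)


class Supervisor:
    """Toy per-machine supervisor: it only runs what the package brings along."""

    def __init__(self, machine):
        self.machine = machine
        self.packages = {}

    def load(self, package):
        self.packages[package.name] = package
        package.start()


def make_web_package():
    # Hypothetical application with made-up dependency names.
    return Package("web", "1.0", ["libssl", "runtime"])


# Two very different hosts, managed through exactly the same calls.
hosts = [Supervisor("cloud-vm"), Supervisor("own-server")]
for host in hosts:
    host.load(make_web_package())
for host in hosts:
    host.packages["web"].update("1.1")
    host.packages["web"].apply_config(tls_only=True)

versions = [host.packages["web"].version for host in hosts]
print(versions)
```

Nothing in the supervisor knows how to update or configure the application; that knowledge lives in the package, which is the design point Jacob describes.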

‘Grownup Containers’ 
The idea at the heart of Habitat is similar to concepts that drive Mesosphere, Google’s Kubernetes, and Docker’s Swarm. All of these increasingly popular tools run software inside Linux “containers”—walled-off spaces within the Linux operating system that provide ways to orchestrate discrete pieces of code across myriad machines. Google uses containers in running its own online empire, and the rest of Silicon Valley is following suit. 
But Chef is taking a different tack. Rather than centering Habitat around Linux containers, they’ve built a new kind of package designed to run in other ways too. You can run Habitat packages atop Mesosphere or Kubernetes. You can also run them atop virtual machines, such as those offered by Amazon or Google on their cloud services. Or you can just run them on your own servers. “We can take all the existing software in the world, which wasn’t built with any of this new stuff in mind, and make it behave,” Jacob says. 
Jon Cowie, senior operations engineer at the online marketplace Etsy, is among the few outsiders who have kicked the tires on Habitat. He calls it “grownup containers.” Building an application around containers can be a complicated business, he explains. Habitat, he says, is simpler. You wrap your code, old or new, in a new interface and run it where you want to run it. “They are giving you a flexible toolkit,” he says.
That said, container systems like Mesosphere and Kubernetes can still be a very important thing. These tools include “schedulers” that spread code across myriad machines in a hyper-efficient way, finding machines that have available resources and actually launching the code. Habitat doesn’t do that. It handles everything after the code is in place. 
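The scheduler's job, finding a machine with spare resources and placing code there, can be illustrated with a deliberately simple first-fit placement. This is a sketch under stated assumptions, not how Mesosphere or Kubernetes actually schedule: the function, the job names and the CPU figures are hypothetical, and real schedulers weigh many more constraints.

```python
def schedule(jobs, machines):
    """Place each job on the first machine with enough free CPU (first fit).

    jobs:     dict mapping job name -> CPUs required
    machines: dict mapping machine name -> CPUs currently free
    Returns a dict mapping job name -> machine name.
    """
    free = dict(machines)  # work on a copy so the caller's data is untouched
    placement = {}
    for job, needed in jobs.items():
        for machine, available in free.items():
            if available >= needed:
                placement[job] = machine
                free[machine] = available - needed
                break
        else:
            # No machine had room for this job.
            raise RuntimeError(f"no machine has room for {job}")
    return placement


# Hypothetical workload and cluster.
placement = schedule(
    {"web": 2, "db": 4, "cache": 1},
    {"m1": 3, "m2": 4},
)
print(placement)
```

In the article's division of labor, a step like this belongs to the scheduler; Habitat takes over once each job has landed on its machine.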
Jacob sees Habitat as a tool that runs in tandem with Mesosphere or Kubernetes, or atop other kinds of systems. He sees it as a single tool that can run any application on anything. But you may have to tweak Habitat so it will run on your infrastructure of choice. In packaging your app, Habitat must use a format that can speak to each type of system you want it to run on (the inputs and outputs for a virtual machine are different, say, from the inputs and outputs for Kubernetes), and at the moment, it only offers certain formats. If it doesn’t handle your format of choice, you’ll have to write a little extra code of your own. 
Jacob says writing this code is “trivial.” And for seasoned developers, it may be. Habitat’s overarching mission is to bring the biological imperative to as many businesses as possible. But of course, the mission isn’t everything. The importance of Habitat will really come down to how well it works.

Promise Theory 
Whatever the case, the idea behind Habitat is enormously powerful. The biological ideal has driven the evolution of computing systems for decades—and will continue to drive their evolution. Jacob and Chef are taking a concept that computer coders are intimately familiar with, and they’re applying it to something new. 
“They’re trying to take away more of the complexity, and do this in a way that matches the cultural affiliation of developers,” says Mark Burgess, a computer scientist, physicist, and philosopher whose ideas helped spawn Chef and other DevOps projects. 
Burgess compares this phenomenon to what he calls Promise Theory, where humans and autonomous agents work together to solve problems by striving to fulfill certain intentions, or promises. He sees computer automation not just as a cooperation of code, but of people and code. That’s what Jacob is striving for. You share your intentions with Habitat, and its autonomous agents work to realize them—a flesh-and-blood biological system combining with its idealized counterpart in code. 
ORIGINAL: Wired
AUTHOR: CADE METZ
DATE OF PUBLICATION: 06.14.16

Former NASA chief unveils $100 million neural chip maker KnuEdge

By Hugo Angel,

Daniel Goldin
It’s not all that easy to call KnuEdge a startup. Created a decade ago by Daniel Goldin, the former head of the National Aeronautics and Space Administration, KnuEdge is only now coming out of stealth mode. It has already raised $100 million in funding to build a “neural chip” that Goldin says will make data centers more efficient in a hyperscale age.
Goldin, who founded the San Diego, California-based company with the former chief technology officer of NASA, said he believes the company’s brain-like chip will be far more cost and power efficient than current chips based on the computer design popularized by computer architect John von Neumann. In von Neumann machines, memory and processor are separated and linked via a data pathway known as a bus. Over the years, von Neumann machines have gotten faster by sending more and more data at higher speeds across the bus as processor and memory interact. But the speed of a computer is often limited by the capacity of that bus, leading to what some computer scientists call the “von Neumann bottleneck.” IBM has seen the same problem, and it has a research team working on brain-like data center chips. Both efforts are part of an attempt to deal with the explosion of data driven by artificial intelligence and machine learning.
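A rough back-of-the-envelope model shows why the bus matters. The sketch below assumes, for simplicity, that computation and data transfer overlap perfectly, so a job takes as long as the slower of the two. The function name and every number in it are hypothetical and do not describe any real chip.

```python
def run_time_seconds(operations, bytes_moved, ops_per_second, bus_bytes_per_second):
    """Estimate run time when compute and memory traffic fully overlap.

    The job cannot finish before the processor is done computing, nor
    before the bus has carried all the data, so the slower side wins.
    """
    compute_time = operations / ops_per_second
    transfer_time = bytes_moved / bus_bytes_per_second
    return max(compute_time, transfer_time)


# Hypothetical workload: 1 billion operations touching 8 GB of data.
OPERATIONS = 1e9
BYTES_MOVED = 8e9

baseline = run_time_seconds(OPERATIONS, BYTES_MOVED, 1e10, 1e10)
faster_cpu = run_time_seconds(OPERATIONS, BYTES_MOVED, 1e11, 1e10)  # 10x processor
faster_bus = run_time_seconds(OPERATIONS, BYTES_MOVED, 1e10, 2e10)  # 2x bus

print(f"baseline:   {baseline:.2f} s")
print(f"10x CPU:    {faster_cpu:.2f} s")
print(f"2x bus:     {faster_bus:.2f} s")
```

With these made-up figures, a processor ten times faster changes nothing because the job is waiting on the bus, while doubling bus capacity halves the run time. That is the bottleneck in miniature.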
Goldin’s company is doing something similar to IBM, but only on the surface. Its approach is much different, and it has been secretly funded by unknown angel investors. And Goldin said in an interview with VentureBeat that the company has already generated $20 million in revenue and is actively engaged with hyperscale computing companies and Fortune 500 companies in the aerospace, banking, health care, hospitality, and insurance industries. The mission is a fundamental transformation of the computing world, Goldin said.
“It all started over a mission to Mars,” Goldin said.

Above: KnuEdge’s first chip has 256 cores.
Image Credit: KnuEdge
Back in the year 2000, Goldin saw that the time delay for controlling a space vehicle would be too long, so the vehicle would have to operate itself. He calculated that a mission to Mars would take software that would push technology to the limit, with tens of millions of lines of code.
Above: Daniel Goldin, CEO of KnuEdge.
Image Credit: KnuEdge
“I thought, holy smokes,” he said. “It’s going to be too expensive. It’s not propulsion. It’s not environmental control. It’s not power. This software business is a very big problem, and that nation couldn’t afford it.”
So Goldin looked further into the brains of the robotics, and that’s when he started thinking about the computing it would take.
Asked if it was easier to run NASA or a startup, Goldin let out a guffaw.
“I love them both, but they’re both very different,” Goldin said. “At NASA, I spent a lot of time on non-technical issues. I had a project every quarter, and I didn’t want to become dull technically. I tried to always take on a technical job doing architecture, working with a design team, and always doing something leading edge. I grew up at a time when you graduated from a university and went to work for someone else. If I ever come back to this earth, I would graduate and become an entrepreneur. This is so wonderful.”
Back in 1992, Goldin was planning on starting a wireless company as an entrepreneur. But then he got the call to “go serve the country,” and he did that work for a decade. He started KnuEdge (previously called Intellisis) in 2005, and he got very patient capital.
“When I went out to find investors, I knew I couldn’t use the conventional Silicon Valley approach (impatient capital),” he said. “It is a fabulous approach that has generated incredible wealth. But I wanted to undertake revolutionary technology development. To build the future tools for next-generation machine learning, improving the natural interface between humans and machines. So I got patient capital that wanted to see lightning strike. Between all of us, we have a board of directors that can contact almost anyone in the world. They’re fabulous business people and technologists. We knew we had a ten-year run-up.”
But he’s not saying who those people are yet.
KnuEdge’s chips are part of a larger platform. KnuEdge is also unveiling KnuVerse, a military-grade voice recognition and authentication technology that unlocks the potential of voice interfaces to power next-generation computing, Goldin said.
While the voice technology market has exploded over the past five years due to the introductions of Siri, Cortana, Google Home, Echo, and ViV, the aspirations of most commercial voice technology teams are still on hold because of security and noise issues. KnuVerse solutions are based on patented authentication techniques using the human voice — even in extremely noisy environments — as one of the most secure forms of biometrics. Secure voice recognition has applications in industries such as banking, entertainment, and hospitality.
KnuEdge says it is now possible to authenticate to computers, web and mobile apps, and Internet of Things devices (or everyday objects that are smart and connected) with only a few words spoken into a microphone — in any language, no matter how loud the background environment or how many other people are talking nearby. In addition to KnuVerse, KnuEdge offers Knurld.io for application developers, a software development kit, and a cloud-based voice recognition and authentication service that can be integrated into an app typically within two hours.
And KnuEdge is announcing KnuPath with LambdaFabric computing. KnuEdge’s first chip, built with an older manufacturing technology, has 256 cores, or neuron-like brain cells, on a single chip. Each core is a tiny digital signal processor. The LambdaFabric makes it possible to instantly connect those cores to each other — a trick that helps overcome one of the major problems of multicore chips, Goldin said. The LambdaFabric is designed to connect up to 512,000 devices, enabling the system to be used in the most demanding computing environments. From rack to rack, the fabric has a latency (or interaction delay) of only 400 nanoseconds. And the whole system is designed to use a low amount of power.
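The figures quoted above can be put in perspective with a little arithmetic. The sketch below assumes that each "device" on the fabric is one 256-core chip, which the article does not state, so the total should be read as an illustration of scale only.

```python
CORES_PER_CHIP = 256            # from the article
MAX_FABRIC_DEVICES = 512_000    # from the article
RACK_TO_RACK_LATENCY_NS = 400   # from the article

# Assumption (not in the article): every device is one 256-core chip.
max_cores = CORES_PER_CHIP * MAX_FABRIC_DEVICES

# How many rack-to-rack hops fit in one second if each hop must wait
# for the previous one to finish (1 second = 1,000,000,000 ns).
sequential_hops_per_second = 1_000_000_000 // RACK_TO_RACK_LATENCY_NS

print(f"{max_cores:,} cores at full scale")
print(f"{sequential_hops_per_second:,} sequential hops per second")
```

Under that assumption a fully populated fabric would hold about 131 million cores, and a chain of strictly sequential rack-to-rack messages could complete about 2.5 million hops each second.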
All of the company’s designs are built on biological principles about how the brain gets a lot of computing work done with a small amount of power. The chip is based on what Goldin calls “sparse matrix heterogeneous machine learning algorithms.” And it will run C++ software, something that is already very popular. Programmers can program each one of the cores with a different algorithm to run simultaneously, for the “ultimate in heterogeneity.” It’s multiple instruction, multiple data, and “that gives us some of our power,” Goldin said.
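Running a different algorithm on each core at the same time is the pattern usually called MIMD (multiple instruction, multiple data). The sketch below shows only the shape of that idea, using Python's standard thread pool as a stand-in for cores. It is not KnuEdge's programming model, the three small functions are hypothetical, and Python threads do not give true hardware parallelism for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor


def pairwise_sums(samples):
    """Add each sample to its right-hand neighbor."""
    return [a + b for a, b in zip(samples, samples[1:])]


def peak(samples):
    """Return the largest sample."""
    return max(samples)


def energy(samples):
    """Return the sum of squared samples."""
    return sum(x * x for x in samples)


# Each "core" gets its own algorithm AND its own data stream.
work = [
    (pairwise_sums, [1, 2, 3, 4]),
    (peak, [3, 9, 2]),
    (energy, [1, 2, 3]),
]

with ThreadPoolExecutor(max_workers=len(work)) as pool:
    futures = [pool.submit(algorithm, data) for algorithm, data in work]
    results = [future.result() for future in futures]  # kept in submission order

print(results)
```

The contrast is with SIMD designs, where every processing element applies the same instruction to different data. Here each worker is free to run something entirely different.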

Above: KnuEdge’s KnuPath chip.
Image Credit: KnuEdge
“KnuEdge is emerging out of stealth mode to aim its new Voice and Machine Learning technologies at key challenges in IoT, cloud based machine learning and pattern recognition,” said Paul Teich, principal analyst at Tirias Research, in a statement. “Dan Goldin used his experience in transforming technology to charter KnuEdge with a bold idea, with the patience of longer development timelines and away from typical startup hype and practices. The result is a new and cutting-edge path for neural computing acceleration. There is also a refreshing surprise element to KnuEdge announcing a relevant new architecture that is ready to ship… not just a concept or early prototype.”
Today, Goldin said the company is ready to show off its designs. The first chip was ready last December, and KnuEdge is sharing it with potential customers. That chip was built with a 32-nanometer manufacturing process, and even though that’s an older technology, it is a powerful chip, Goldin said. Even at 32 nanometers, the chip has something like a two-times to six-times performance advantage over similar chips, KnuEdge said.
“The human brain has a couple of hundred billion neurons, and each neuron is connected to at least 10,000 to 100,000 neurons,” Goldin said. “And the brain is the most energy efficient and powerful computer in the world. That is the metaphor we are using.”
KnuEdge has a new version of its chip under design. And the company has already generated revenue from sales of the prototype systems. Each board has about four chips.
As for the competition from IBM, Goldin said, “I believe we made the right decision and are going in the right direction. IBM’s approach is very different from what we have. We are not aiming at anyone. We are aiming at the future.”
In his NASA days, Goldin had a lot of successes. There, he redesigned and delivered the International Space Station, tripled the number of space flights, and put a record number of people into space, all while reducing the agency’s planned budget by 25 percent. He also spent 25 years at TRW, where he led the development of satellite television services.
KnuEdge has 100 employees, but Goldin said the company outsources almost everything. Goldin said he is planning to raise a round of funding late this year or early next year. The company collaborated with the University of California at San Diego and UCSD’s California Institute for Telecommunications and Information Technology.
With computers that can handle natural language systems, many people in the world who can’t read or write will be able to fend for themselves more easily, Goldin said.
“I want to be able to take machine learning and help people communicate and make a living,” he said. “This is just the beginning. This is the Wild West. We are talking to very large companies about this, and they are getting very excited.”
A sample application is a home that has much greater self-awareness. If there’s something wrong in the house, the KnuEdge system could analyze it and figure out if it needs to alert the homeowner.
Goldin said it was hard to keep the company secret.
“I’ve been biting my lip for ten years,” he said.
As for whether KnuEdge’s technology could be used to send people to Mars, Goldin said, “This is available to whoever is going to Mars. I tried twice. I would love it if they use it to get there.”
ORIGINAL: Venture Beat
