Category: Pattern Recognition


Deep Learning AI Listens to Machines For Signs of Trouble

By Hugo Angel,

ORIGINAL: IEEE Spectrum
By Jeremy Hsu, spectrum.ieee.org
December 27th, 2016
Image: 3DSignals

 

Driving your car until it breaks down on the road is never anyone’s favorite way to learn the need for routine maintenance. But preventive or scheduled maintenance checks often miss many of the problems that can come up. An Israeli startup has come up with a better idea: Use artificial intelligence to listen for early warning signs that a car might be nearing a breakdown.

The service of 3DSignals, a startup based in Kefar Sava, Israel, relies on the artificial intelligence technique known as deep learning to understand the noise patterns of troubled machines and predict problems in advance. 3DSignals has already begun talking with leading European automakers about using the deep learning service to detect trouble both in auto factory machinery and in the cars themselves. The startup has even talked with companies about using its service to automatically detect problems in future taxi fleets of driverless cars.

Deep learning usually refers to software algorithms known as artificial neural networks. These neural networks can learn to become better at specific tasks by filtering relevant data through multiple (deep) layers of artificial neurons. Many companies such as Google and Facebook have used deep learning to develop AI systems that

Many tech giants have also applied deep learning to make their services become better at automatically recognizing the spoken sounds of different human languages. But few companies have bothered with using deep learning to develop AI that’s good at listening to other acoustic signals such as the sounds of machines or music. That’s where 3DSignals hopes it can become a big player with its deep learning focus on more general sound patterns, Lavi explains.

I think most of the world is occupied with deep learning on images. This is by far the most popular application and the most recent. But part of the industry is doing deep learning on acoustics focused on speech recognition and conversation. I think we are probably in the very small group of companies doing acoustics which is more general. This is my aim, to be the world leader in general acoustics deep learning.

For each client, 3DSignals installs ultrasonic microphones that can detect sounds ranging up to 100 kilohertz (human hearing range is between 20 hertz and 20 kilohertz). The startup’s “Internet of Things” service connects the microphones to a computing device that can process some of the data and then upload the information to an online network where the deep learning algorithms do their work. Clients can always check the status of their machines by using any Web-connected device such as a smartphone or tablet.

The first clients for 3DSignals include heavy industry companies operating machinery such as circular cutting blades in mills or hydroelectric turbines in power plants. These companies started out by purchasing the first tier of the 3DSignals service that does not use deep learning. Instead, this first tier of service uses software that relies on basic physics modeling of certain machine parts—such as circular cutting saws—to predict when some parts may start to wear out. That allows the clients to begin getting value from day one.

The second tier of the service uses a deep learning algorithm and the sounds coming from the microphones to help detect strange or unusual noises from the machines. The deep learning algorithms train on sound patterns that can signal general problems with the machines. But only the third tier of the service, also using deep learning, can classify the sounds as indicating specific types of problems. Before this can happen, though, the clients need to help train the deep learning algorithm by first labeling certain sound patterns as belonging to specific types of problems.
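To make the second tier's idea concrete, here is a minimal sketch of flagging an unusual sound frame. It is not 3DSignals' method: a simple per-band statistical baseline stands in for the deep learning model, and the function names, the toy band energies, and the threshold are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def fit_baseline(healthy_frames):
    """Per-band mean and standard deviation over frames from a healthy machine."""
    bands = list(zip(*healthy_frames))  # transpose: one tuple per frequency band
    # Fall back to a tiny value when a band never varies, to avoid dividing by zero.
    return [(mean(band), pstdev(band) or 1e-9) for band in bands]

def anomaly_score(frame, baseline):
    """Largest absolute z-score across frequency bands."""
    return max(abs(value - mu) / sigma for value, (mu, sigma) in zip(frame, baseline))

# Hypothetical band energies (three bands per frame), invented for this example.
healthy = [
    [1.0, 2.0, 0.5],
    [1.2, 1.8, 0.4],
    [0.8, 2.2, 0.6],
    [1.0, 2.0, 0.5],
]
baseline = fit_baseline(healthy)
THRESHOLD = 4.0  # assumed cut-off; a real system would tune this

normal_frame = [1.1, 2.1, 0.5]
worn_frame = [1.0, 2.0, 3.0]  # unusually strong energy in the third band

for name, frame in [("normal", normal_frame), ("worn", worn_frame)]:
    flagged = anomaly_score(frame, baseline) > THRESHOLD
    print(name, "-> unusual" if flagged else "-> ok")
```

A detector like this can only say that something sounds different. Naming the specific problem, as the third tier does, requires the labeled examples described above.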

“After a while, we can not only say when problem type A happens, but we can say before it happens, you’re going to have problem type A in five hours,” Lavi says. “Some problems don’t happen instantly; there’s a deterioration.”

When trained, the 3DSignals deep learning algorithms are able to predict specific problems in advance with 98 percent accuracy. But the current clients using the 3DSignals system have not yet begun taking advantage of this classification capability; they are still building their training datasets by having people manually label specific sound signatures as belonging to specific problems.

The one-year-old startup has just 15 employees, but it has grown fairly fast and raised $3.3 million so far from investors such as Dov Moran, the Israeli entrepreneur credited with being one of the first to invent USB flash drives. Lavi and his fellow co-founders are already eying several big markets that include automobiles and the energy sector beyond hydroelectric power plants. A series A funding round to attract venture capital is planned for sometime in 2017.

If all goes well, 3DSignals could expand its lead in the growing market for providing “predictive maintenance” to factories, power plants, and car owners. The impending arrival of driverless cars may put even more responsibility on the metaphorical shoulders of a deep learning AI that could listen for problems while the human passengers tune out from the driving experience. On top of all this, 3DSignals has the chance to pioneer the advancement of deep learning in listening to general sounds. Not bad for a small startup.

“It’s important for us to be specialists in general acoustic deep learning, because the research literature does not cover it,” Lavi says.

Google’s DeepMind Gives AI a Memory Boost That Lets It Navigate London’s Underground

By Hugo Angel,

Photo: iStockphoto

Google’s DeepMind artificial intelligence lab does more than just develop computer programs capable of beating the world’s best human players in the ancient game of Go. The DeepMind unit has also been working on the next generation of deep learning software that combines the ability to recognize data patterns with the memory required to decipher more complex relationships within the data.

Deep learning is the latest buzz word for artificial intelligence algorithms called neural networks that can learn over time by filtering huge amounts of relevant data through many “deep” layers. The brain-inspired neural network layers consist of nodes (also known as neurons). Tech giants such as Google, Facebook, Amazon, and Microsoft have been training neural networks to learn how to better handle tasks such as recognizing images of dogs or making better Chinese-to-English translations. These AI capabilities have already benefited millions of people using Google Translate and other online services.
But neural networks face huge challenges when they try to rely solely on pattern recognition without having the external memory to store and retrieve information. To improve deep learning’s capabilities, Google DeepMind created a “differentiable neural computer” (DNC) that gives neural networks an external memory for storing information for later use.
“Neural networks are like the human brain; we humans cannot assimilate massive amounts of data and we must rely on external read-write memory all the time,” says Jay McClelland, director of the Center for Mind, Brain and Computation at Stanford University. “We once relied on our physical address books and Rolodexes; now of course we rely on the read-write storage capabilities of regular computers.”
McClelland is a cognitive scientist who served as one of several independent peer reviewers for the Google DeepMind paper that describes development of this improved deep learning system. The full paper is presented in the 12 Oct 2016 issue of the journal Nature.
The DeepMind team found that the DNC system’s combination of the neural network and external memory did much better than a neural network alone in tackling the complex relationships between data points in so-called “graph tasks.” For example, they asked their system to either simply take any path between points A and B or to find the shortest travel routes based on a symbolic map of the London Underground subway.
An unaided neural network could not even finish the first level of training, based on traveling between two subway stations without trying to find the shortest route. It achieved an average accuracy of just 37 percent after going through almost two million training examples. By comparison, the neural network with access to external memory in the DNC system successfully completed the entire training curriculum and reached an average of 98.8 percent accuracy on the final lesson.
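For a sense of what the harder version of this task asks for, the sketch below finds a route with the fewest stops using classic breadth-first search. This is a hand-written algorithm, not something the DNC runs or is known to learn internally, and the five-station map is a made-up toy rather than the London Underground.

```python
from collections import deque

def shortest_route(graph, start, goal):
    """Return a route with the fewest hops from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # station -> station we arrived from
    queue = deque([start])
    while queue:
        station = queue.popleft()
        for neighbor in graph[station]:
            if neighbor in previous:
                continue  # already reached by a route that is at least as short
            previous[neighbor] = station
            if neighbor == goal:
                # Walk the back-pointers to rebuild the route.
                route = [goal]
                while previous[route[-1]] is not None:
                    route.append(previous[route[-1]])
                return route[::-1]
            queue.append(neighbor)
    return None

# Hypothetical symbolic map: station -> directly connected stations.
TOY_MAP = {
    "A": ["B", "E"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["A", "D"],
    "F": [],  # an isolated station, to show the unreachable case
}

print(shortest_route(TOY_MAP, "A", "D"))
print(shortest_route(TOY_MAP, "A", "F"))
```

The contrast with the article's experiment is that the DNC was not given a procedure like this one; it had to work out routes from training examples.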
The external memory of the DNC system also proved critical to success in performing logical planning tasks such as solving simple block puzzle challenges. Again, a neural network by itself could not even finish the first lesson of the training curriculum for the block puzzle challenge. The DNC system was able to use its memory to store information about the challenge’s goals and to effectively plan ahead by writing its decisions to memory before acting upon them.
In 2014, DeepMind’s researchers developed another system, called the neural Turing machine, that also combined neural networks with external memory. But the neural Turing machine was limited in the way it could access “memories” (information) because such memories were effectively stored and retrieved in fixed blocks or arrays. The latest DNC system can access memories in any arbitrary location, McClelland explains.
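One ingredient of that flexible access is looking memory up by content rather than by fixed position. The sketch below is a simplified illustration, assuming cosine similarity between a query key and each memory row followed by a softmax. The `sharpness` parameter, the function names, and the toy memory are assumptions, and the full DNC described in the paper has additional mechanisms that are not modeled here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-8)  # small constant guards against zero vectors

def content_read(memory, key, sharpness=10.0):
    """Soft lookup: weight every memory row by its similarity to the key."""
    scores = [sharpness * cosine(row, key) for row in memory]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract peak for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(memory[0])
    read = [sum(w * row[i] for w, row in zip(weights, memory)) for i in range(width)]
    return weights, read

# Toy memory with three stored rows; the key is closest to the second row.
memory = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
weights, read_vector = content_read(memory, key=[0.0, 0.9, 0.1])
print([round(w, 4) for w in weights])
```

Because the weights are a smooth function of the key, a lookup like this can be trained with gradient descent, which is the sense of "differentiable" in the system's name.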
The DNC system’s memory architecture even bears a certain resemblance to how the hippocampus region of the brain supports new brain cell growth and new connections in order to store new memories. Just as the DNC system uses the equivalent of time stamps to organize the storage and retrieval of memories, human “free recall” experiments have shown that people are more likely to recall certain items in the same order as first presented.
Despite these similarities, the DNC’s design was driven by computational considerations rather than taking direct inspiration from biological brains, DeepMind’s researchers write in their paper. But McClelland says that he prefers not to think of the similarities as being purely coincidental.
“The design decisions that motivated the architects of the DNC were the same as those that structured the human memory system, although the latter (in my opinion) was designed by a gradual evolutionary process, rather than by a group of brilliant AI researchers,” McClelland says.
Human brains still have significant advantages over any brain-inspired deep learning software. For example, human memory seems much better at storing information so that it is accessible by either context or content, McClelland says. He expressed hope that future deep learning and AI research could better capture the memory advantages of biological brains.
 
DeepMind’s DNC system and similar neural learning systems may represent crucial steps for the ongoing development of AI. But the DNC system still falls well short of what McClelland considers the most important parts of human intelligence.
The DNC is a sophisticated form of external memory, but ultimately it is like the papyrus on which Euclid wrote the elements. The insights of mathematicians that Euclid codified relied (in my view) on a gradual learning process that structured the neural circuits in their brains so that they came to be able to see relationships that others had not seen, and that structured the neural circuits in Euclid’s brain so that he could formulate what to write. We have a long way to go before we understand fully the algorithms the human brain uses to support these processes.
It’s unclear when or how Google might take advantage of the capabilities offered by the DNC system to boost its commercial products and services. The DeepMind team was “heads down in research” or too busy with travel to entertain media questions at this time, according to a Google spokesperson.
But Herbert Jaeger, professor of computational science at Jacobs University Bremen in Germany, sees the DeepMind team’s work as a “passing snapshot in a fast evolution sequence of novel neural learning architectures.” In fact, he’s confident that the DeepMind team already has something better than the DNC system described in the Nature paper. (Keep in mind that the paper was submitted back in January 2016.)
DeepMind’s work is also part of a bigger trend in deep learning, Jaeger says. The leading deep learning teams at Google and other companies are racing to build new AI architectures with many different functional modules—among them, attentional control or working memory; they then train the systems through deep learning.
“The DNC is just one among dozens of novel, highly potent, and cleverly-thought-out neural learning systems that are popping up all over the place,” Jaeger says.
ORIGINAL: IEEE Spectrum
12 Oct 2016

Show and Tell: image captioning open sourced in TensorFlow

By Hugo Angel,

 In 2014, research scientists on the Google Brain team trained a machine learning system to automatically produce captions that accurately describe images. Further development of that system led to its success in the Microsoft COCO 2015 image captioning challenge, a competition to compare the best algorithms for computing accurate image captions, where it tied for first place.
Today, we’re making the latest version of our image captioning system available as an open source model in TensorFlow.
This release contains significant improvements to the computer vision component of the captioning system, is much faster to train, and produces more detailed and accurate descriptions compared to the original system. These improvements are outlined and analyzed in the paper Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, published in IEEE Transactions on Pattern Analysis and Machine Intelligence.
Automatically captioned by our system.
So what’s new? 
Our 2014 system used the Inception V1 image classification model to initialize the image encoder, which produces the encodings that are useful for recognizing different objects in the images. This was the best image model available at the time, achieving 89.6% top-5 accuracy on the benchmark ImageNet 2012 image classification task. We replaced this in 2015 with the newer Inception V2 image classification model, which achieves 91.8% accuracy on the same task.
The improved vision component gave our captioning system an accuracy boost of 2 points in the BLEU-4 metric (which is commonly used in machine translation to evaluate the quality of generated sentences) and was an important factor of its success in the captioning challenge.
Today’s code release initializes the image encoder using the Inception V3 model, which achieves 93.9% accuracy on the ImageNet classification task. Initializing the image encoder with a better vision model gives the image captioning system a better ability to recognize different objects in the images, allowing it to generate more detailed and accurate descriptions. This gives an additional 2 points of improvement in the BLEU-4 metric over the system used in the captioning challenge.
Another key improvement to the vision component comes from fine-tuning the image model. This step addresses the problem that the image encoder is initialized by a model trained to classify objects in images, whereas the goal of the captioning system is to describe the objects in images using the encodings produced by the image model. For example, an image classification model will tell you that a dog, grass and a frisbee are in the image, but a natural description should also tell you the color of the grass and how the dog relates to the frisbee. In the fine-tuning phase, the captioning system is improved by jointly training its vision and language components on human generated captions. This allows the captioning system to transfer information from the image that is specifically useful for generating descriptive captions, but which was not necessary for classifying objects. In particular, after fine-tuning it becomes better at correctly describing the colors of objects. Importantly, the fine-tuning phase must occur after the language component has already learned to generate captions – otherwise, the noisiness of the randomly initialized language component causes irreversible corruption to the vision component. For more details, read the full paper here.
Left: the better image model allows the captioning model to generate more detailed and accurate descriptions. Right: after fine-tuning the image model, the image captioning system is more likely to describe the colors of objects correctly.
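The ImageNet figure quoted above for Inception V1 is a top-5 accuracy: a prediction counts as correct when the true label is among the model's five highest-scoring classes. A minimal sketch of that metric follows. The function name and the toy scores are illustrative assumptions and are not part of the released code; the example uses k=2 only so that a three-class toy stays readable.

```python
def top_k_accuracy(score_rows, true_labels, k=5):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for scores, label in zip(score_rows, true_labels):
        # Class indices ordered from highest score to lowest.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)

# Hypothetical classifier scores for three images over three classes.
scores = [
    [0.1, 0.6, 0.3],  # true class 2 is the second-highest score
    [0.5, 0.4, 0.1],  # true class 2 is the lowest score
    [0.2, 0.3, 0.5],  # true class 1 is the second-highest score
]
labels = [2, 2, 1]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```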
Until recently our image captioning system was implemented in the DistBelief software framework. The TensorFlow implementation released today achieves the same level of accuracy with significantly faster performance: time per training step is just 0.7 seconds in TensorFlow compared to 3 seconds in DistBelief on an Nvidia K20 GPU, meaning that total training time is just 25% of the time previously required.
A natural question is whether our captioning system can generate novel descriptions of previously unseen contexts and interactions. The system is trained by showing it hundreds of thousands of images that were captioned manually by humans, and it often re-uses human captions when presented with scenes similar to what it’s seen before.
When the model is presented with scenes similar to what it’s seen before, it will often re-use human generated captions.
So does it really understand the objects and their interactions in each image? Or does it always regurgitate descriptions from the training data? Excitingly, our model does indeed develop the ability to generate accurate new captions when presented with completely new scenes, indicating a deeper understanding of the objects and context in the images. Moreover, it learns how to express that knowledge in natural-sounding English phrases despite receiving no additional language training other than reading the human captions.
 

Our model generates a completely new caption using concepts learned from similar scenes in the training set
We hope that sharing this model in TensorFlow will help push forward image captioning research and applications, and will also allow interested people to learn and have fun. To get started training your own image captioning system, and for more details on the neural network architecture, navigate to the model’s home-page here. While our system uses the Inception V3 image classification model, you could even try training our system with the recently released Inception-ResNet-v2 model to see if it can do even better!

ORIGINAL: Google Blog

by Chris Shallue, Software Engineer, Google Brain Team
September 22, 2016

Deep Learning With Python & Tensorflow – PyConSG 2016

By Hugo Angel,

ORIGINAL: Pycon.SG
Jul 5, 2016
Speaker: Ian Lewis
Description
Python has lots of scientific, data analysis, and machine learning libraries. But there are many problems when starting out on a machine learning project. Which library do you use? How can you use a model that has been trained in your production app? In this talk I will discuss how you can use TensorFlow to create Deep Learning applications and how to deploy them into production.
Abstract
Python has lots of scientific, data analysis, and machine learning libraries. But there are many problems when starting out on a machine learning project. Which library do you use? How do they compare to each other? How can you use a model that has been trained in your production application?
TensorFlow is a new open-source framework created at Google for building deep learning applications. TensorFlow allows you to construct easy-to-understand data flow graphs in Python which form a mathematical and logical pipeline. Creating data flow graphs allows easier visualization of complicated algorithms and makes it possible to run training operations over multiple GPUs in parallel.
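The data flow graph idea can be illustrated without any library at all: first describe the computation as a graph of nodes, then run it later with concrete inputs. The toy below is plain Python and deliberately does not use the TensorFlow API. The `Node` class and its `run` method are hypothetical names invented for this sketch.

```python
import operator

class Node:
    """One vertex of a toy data flow graph: an operation plus its input nodes."""

    def __init__(self, name, op=None, inputs=()):
        self.name = name
        self.op = op              # None marks a placeholder fed at run time
        self.inputs = tuple(inputs)

    def run(self, feed):
        """Evaluate this node, pulling placeholder values from the feed dict."""
        if self.op is None:
            return feed[self.name]
        return self.op(*(node.run(feed) for node in self.inputs))

# Build the graph for y = w * x + b. Nothing is computed at this point.
x = Node("x")
w = Node("w")
b = Node("b")
wx = Node("wx", operator.mul, [w, x])
y = Node("y", operator.add, [wx, b])

# Only now are values supplied and the pipeline executed.
result = y.run({"x": 3.0, "w": 2.0, "b": 1.0})
print(result)
```

Separating graph construction from execution is what makes it practical to visualize the pipeline or to hand different parts of it to different devices.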
In this talk I will discuss how you can use TensorFlow to create Deep Learning applications. I will discuss how it compares to other Python machine learning libraries like Theano or Chainer. Finally, I will discuss how trained TensorFlow models could be deployed into a production system using TensorFlow Serving.
Event Page: https://pycon.sg
Produced by Engineers.SG

How a Japanese cucumber farmer is using deep learning and TensorFlow.

By Hugo Angel,

by Kaz Sato, Developer Advocate, Google Cloud Platform
August 31, 2016
It’s not hyperbole to say that use cases for machine learning and deep learning are only limited by our imaginations. About one year ago, a former embedded systems designer from the Japanese automobile industry named Makoto Koike started helping out at his parents’ cucumber farm, and was amazed by the amount of work it takes to sort cucumbers by size, shape, color and other attributes.
Makoto’s father is very proud of his thorny cucumber, for instance, having dedicated his life to delivering fresh and crispy cucumbers, with many prickles still on them. Straight and thick cucumbers with a vivid color and lots of prickles are considered premium grade and command much higher prices on the market.
But Makoto learned very quickly that sorting cucumbers is as hard and tricky as actually growing them. “Each cucumber has different color, shape, quality and freshness,” Makoto says.
Cucumbers from retail stores
Cucumbers from Makoto’s farm
In Japan, each farm has its own classification standard and there’s no industry standard. At Makoto’s farm, they sort them into nine different classes, and his mother sorts them all herself — spending up to eight hours per day at peak harvesting times.
“The sorting work is not an easy task to learn. You have to look at not only the size and thickness, but also the color, texture, small scratches, whether or not they are crooked and whether they have prickles. It takes months to learn the system and you can’t just hire part-time workers during the busiest period. I myself only recently learned to sort cucumbers well,” Makoto said.
Distorted or crooked cucumbers are ranked as low-quality product
There are also some automatic sorters on the market, but they have limitations in terms of performance and cost, and small farms don’t tend to use them.
Makoto doesn’t think sorting is an essential task for cucumber farmers. “Farmers want to focus and spend their time on growing delicious vegetables. I’d like to automate the sorting tasks before taking the farm business over from my parents.”
Makoto Koike, center, with his parents at the family cucumber farm
The many uses of deep learning
Makoto first got the idea to explore machine learning for sorting cucumbers from a completely different use case: Google AlphaGo competing with the world’s top professional Go player.
“When I saw the Google’s AlphaGo, I realized something really serious is happening here,” said Makoto. “That was the trigger for me to start developing the cucumber sorter with deep learning technology.”
Using deep learning for image recognition allows a computer to learn from a training data set what the important “features” of the images are. By using a hierarchy of numerous artificial neurons, deep learning can automatically classify images with a high degree of accuracy. Thus, neural networks can recognize different species of cats, or models of cars or airplanes from images. Sometimes neural networks can exceed the performance of the human eye for certain applications. (For more information, check out my previous blog post Understanding neural networks with TensorFlow Playground.)

TensorFlow democratizes the power of deep learning
But can computers really learn mom’s art of cucumber sorting? Makoto set out to see whether he could use deep learning technology for sorting using Google’s open source machine learning library, TensorFlow.
“Google had just open sourced TensorFlow, so I started trying it out with images of my cucumbers,” Makoto said. “This was the first time I tried out machine learning or deep learning technology, and right away got much higher accuracy than I expected. That gave me the confidence that it could solve my problem.”
With TensorFlow, you don’t need to be knowledgeable about the advanced math models and optimization algorithms needed to implement deep neural networks. Just download the sample code and read the tutorials and you can get started in no time. The library lowers the barrier to entry for machine learning significantly, and since Google open-sourced TensorFlow last November, many “non ML” engineers have started playing with the technology with their own datasets and applications.

Cucumber sorting system design
Here’s a systems diagram of the cucumber sorter that Makoto built. The system uses a Raspberry Pi 3 as the main controller to take images of the cucumbers with a camera, and then works in two phases:

  • In the first phase, it runs a small-scale neural network on TensorFlow to detect whether or not the image is of a cucumber.
  • It then forwards the image to a larger TensorFlow neural network running on a Linux server to perform a more detailed classification.
Systems diagram of the cucumber sorter
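The two-phase design can be sketched as a simple cascade: a cheap check runs first, and only images that pass it are sent on for the expensive classification. The code below is an illustration under stated assumptions, not Makoto's implementation. Both "models" are trivial brightness-based stand-ins for the two neural networks, and every function name is hypothetical.

```python
from typing import Callable, Optional, Sequence

Image = Sequence[Sequence[float]]  # grayscale pixels in [0, 1], row by row

def mean_brightness(image: Image) -> float:
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def looks_like_cucumber(image: Image) -> bool:
    """Stand-in for the small on-device network (phase one)."""
    return mean_brightness(image) > 0.2

def grade_on_server(image: Image) -> int:
    """Stand-in for the larger server-side network; returns a class index 0..8."""
    return min(8, int(mean_brightness(image) * 9))

def sort_cucumber(
    image: Image,
    detector: Callable[[Image], bool] = looks_like_cucumber,
    classifier: Callable[[Image], int] = grade_on_server,
) -> Optional[int]:
    """Run the cheap detector first; call the classifier only when it passes."""
    if not detector(image):
        return None  # nothing to sort, so the server is never contacted
    return classifier(image)

empty_belt = [[0.0, 0.0], [0.0, 0.1]]
cucumber = [[0.5, 0.6], [0.7, 0.6]]

print(sort_cucumber(empty_belt))
print(sort_cucumber(cucumber))
```

The point of the split is that the small controller filters out irrelevant frames, so the heavier model only spends time on images worth classifying.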
Makoto used the sample TensorFlow code Deep MNIST for Experts with minor modifications to the convolution, pooling and last layers, changing the network design to adapt to the pixel format of cucumber images and the number of cucumber classes.
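To see why the input size matters when adapting such a network, the sketch below works out how many features reach the first fully connected layer. It assumes the layout of the Deep MNIST for Experts tutorial (two convolution layers with 'SAME' padding, each followed by 2x2 max pooling, ending in 64 feature maps) and plugs in the 80 x 80 pixel images and nine classes mentioned elsewhere in this article. How Makoto actually changed the layers is not stated, so treat the numbers as an illustration only.

```python
def flattened_features(height, width, pool_layers=2, feature_maps=64):
    """Size of the flattened tensor after repeated 2x2 max pooling.

    With 'SAME' padding and stride 1 the convolutions keep height and width
    unchanged, so only the pooling layers shrink the image (rounding up).
    """
    for _ in range(pool_layers):
        height = -(-height // 2)  # ceiling division by 2
        width = -(-width // 2)
    return height * width * feature_maps

MNIST_SIDE = 28      # input size used by the tutorial
CUCUMBER_SIDE = 80   # low-resolution image size quoted in the article
NUM_CLASSES = 9      # number of sorting classes at the farm

mnist_features = flattened_features(MNIST_SIDE, MNIST_SIDE)
cucumber_features = flattened_features(CUCUMBER_SIDE, CUCUMBER_SIDE)

print(mnist_features)
print(cucumber_features)
print("output layer width:", NUM_CLASSES)
```

Under these assumptions the first fully connected layer sees roughly eight times as many inputs as in the MNIST case, which is one reason larger images make training slower.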
Here’s Makoto’s cucumber sorter, which went live in July:
Here’s a close-up of the sorting arm, and the camera interface:

And here is the cucumber sorter in action:

Pushing the limits of deep learning
One of the current challenges with deep learning is that you need a large amount of training data. To train the model, Makoto spent about three months taking 7,000 pictures of cucumbers sorted by his mother, but it’s probably not enough.
When I did a validation with the test images, the recognition accuracy exceeded 95%. But if you apply the system with real use cases, the accuracy drops down to about 70%. I suspect the neural network model has the issue of “overfitting” (the phenomenon in neural network where the model is trained to fit only to the small training dataset) because of the insufficient number of training images.
The second challenge of deep learning is that it consumes a lot of computing power. The current sorter uses a typical Windows desktop PC to train the neural network model. Although it converts the cucumber image into 80 x 80 pixel low-resolution images, it still takes two to three days to complete training the model with 7,000 images.
“Even with this low-res image, the system can only classify a cucumber based on its shape, length and level of distortion. It can’t recognize color, texture, scratches and prickles,” Makoto explained. Increasing image resolution by zooming into the cucumber would result in much higher accuracy, but would also increase the training time significantly.
To improve deep learning, some large enterprises have started doing large-scale distributed training, but those servers come at an enormous cost. Google offers Cloud Machine Learning (Cloud ML), a low-cost cloud platform for training and prediction that dedicates hundreds of cloud servers to training a network with TensorFlow. With Cloud ML, Google handles building a large-scale cluster for distributed training, and you just pay for what you use, making it easier for developers to try out deep learning without making a significant capital investment.
These specialized servers were used in the AlphaGo match
Makoto is eagerly awaiting Cloud ML. “I could use Cloud ML to try training the model with much higher resolution images and more training data. Also, I could try changing the various configurations, parameters and algorithms of the neural network to see how that improves accuracy. I can’t wait to try it.”

Former NASA chief unveils $100 million neural chip maker KnuEdge

By Hugo Angel,

It’s not all that easy to call KnuEdge a startup. Created a decade ago by Daniel Goldin, the former head of the National Aeronautics and Space Administration, KnuEdge is only now coming out of stealth mode. It has already raised $100 million in funding to build a “neural chip” that Goldin says will make data centers more efficient in a hyperscale age.
Goldin, who founded the San Diego, California-based company with the former chief technology officer of NASA, said he believes the company’s brain-like chip will be far more cost and power efficient than current chips based on the computer design popularized by computer architect John von Neumann. In von Neumann machines, memory and processor are separated and linked via a data pathway known as a bus. Over the years, von Neumann machines have gotten faster by sending more and more data at higher speeds across the bus as processor and memory interact. But the speed of a computer is often limited by the capacity of that bus, leading to what some computer scientists call the “von Neumann bottleneck.” IBM has seen the same problem, and it has a research team working on brain-like data center chips. Both efforts are part of an attempt to deal with the explosion of data driven by artificial intelligence and machine learning.
Goldin’s company is doing something similar to IBM, but only on the surface. Its approach is much different, and it has been secretly funded by unknown angel investors. And Goldin said in an interview with VentureBeat that the company has already generated $20 million in revenue and is actively engaged with hyperscale computing companies and Fortune 500 companies in the aerospace, banking, health care, hospitality, and insurance industries. The mission is a fundamental transformation of the computing world, Goldin said.
“It all started over a mission to Mars,” Goldin said.

Above: KnuEdge’s first chip has 256 cores.
Image Credit: KnuEdge
Back in the year 2000, Goldin saw that the time delay for controlling a space vehicle would be too long, so the vehicle would have to operate itself. He calculated that a mission to Mars would take software that would push technology to the limit, with more than tens of millions of lines of code.
Above: Daniel Goldin, CEO of KnuEdge.
Image Credit: KnuEdge
I thought, Former NASA chief unveils $100 million neural chip maker KnuEdge

It’s not all that easy to call KnuEdge a startup. Created a decade ago by Daniel Goldin, the former head of the National Aeronautics and Space Administration, KnuEdge is only now coming out of stealth mode. It has already raised $100 million in funding to build a “neural chip” that Goldin says will make data centers more efficient in a hyperscale age.
Goldin, who founded the San Diego, California-based company with the former chief technology officer of NASA, said he believes the company’s brain-like chip will be far more cost and power efficient than current chips based on the computer design popularized by computer architect John von Neumann. In von Neumann machines, memory and processor are separated and linked via a data pathway known as a bus. Over the years, von Neumann machines have gotten faster by sending more and more data at higher speeds across the bus as processor and memory interact. But the speed of a computer is often limited by the capacity of that bus, leading to what some computer scientists call the “von Neumann bottleneck.” IBM has seen the same problem, and it has a research team working on brain-like data center chips. Both efforts are part of an attempt to deal with the explosion of data driven by artificial intelligence and machine learning.
Goldin’s company is doing something similar to IBM, but only on the surface. Its approach is much different, and it has been secretly funded by unknown angel investors. Goldin said in an interview with VentureBeat that the company has already generated $20 million in revenue and is actively engaged with hyperscale computing companies and Fortune 500 companies in the aerospace, banking, health care, hospitality, and insurance industries. The mission is a fundamental transformation of the computing world, Goldin said.
“It all started over a mission to Mars,” Goldin said.

Above: KnuEdge’s first chip has 256 cores.
Image Credit: KnuEdge
Back in the year 2000, Goldin saw that the time delay for controlling a space vehicle would be too long, so the vehicle would have to operate itself. He calculated that a mission to Mars would take software that would push technology to the limit, with tens of millions of lines of code.
Above: Daniel Goldin, CEO of KnuEdge.
Image Credit: KnuEdge
“I thought, holy smokes,” he said. “It’s going to be too expensive. It’s not propulsion. It’s not environmental control. It’s not power. This software business is a very big problem, and that nation couldn’t afford it.”
So Goldin looked further into the brains of the robotics, and that’s when he started thinking about the computing it would take.
Asked if it was easier to run NASA or a startup, Goldin let out a guffaw.
“I love them both, but they’re both very different,” Goldin said. “At NASA, I spent a lot of time on non-technical issues. I had a project every quarter, and I didn’t want to become dull technically. I tried to always take on a technical job doing architecture, working with a design team, and always doing something leading edge. I grew up at a time when you graduated from a university and went to work for someone else. If I ever come back to this earth, I would graduate and become an entrepreneur. This is so wonderful.”
Back in 1992, Goldin was planning on starting a wireless company as an entrepreneur. But then he got the call to “go serve the country,” and he did that work for a decade. He started KnuEdge (previously called Intellisis) in 2005, and he got very patient capital.
“When I went out to find investors, I knew I couldn’t use the conventional Silicon Valley approach (impatient capital),” he said. “It is a fabulous approach that has generated incredible wealth. But I wanted to undertake revolutionary technology development. To build the future tools for next-generation machine learning, improving the natural interface between humans and machines. So I got patient capital that wanted to see lightning strike. Between all of us, we have a board of directors that can contact almost anyone in the world. They’re fabulous business people and technologists. We knew we had a ten-year run-up.”
But he’s not saying who those people are yet.
KnuEdge’s chips are part of a larger platform. KnuEdge is also unveiling KnuVerse, a military-grade voice recognition and authentication technology that unlocks the potential of voice interfaces to power next-generation computing, Goldin said.
While the voice technology market has exploded over the past five years due to the introductions of Siri, Cortana, Google Home, Echo, and ViV, the aspirations of most commercial voice technology teams are still on hold because of security and noise issues. KnuVerse solutions are based on patented authentication techniques using the human voice — even in extremely noisy environments — as one of the most secure forms of biometrics. Secure voice recognition has applications in industries such as banking, entertainment, and hospitality.
KnuEdge says it is now possible to authenticate to computers, web and mobile apps, and Internet of Things devices (or everyday objects that are smart and connected) with only a few words spoken into a microphone — in any language, no matter how loud the background environment or how many other people are talking nearby. In addition to KnuVerse, KnuEdge offers Knurld.io for application developers, a software development kit, and a cloud-based voice recognition and authentication service that can be integrated into an app typically within two hours.
And KnuEdge is announcing KnuPath with LambdaFabric computing. KnuEdge’s first chip, built with an older manufacturing technology, has 256 cores, or neuron-like brain cells, on a single chip. Each core is a tiny digital signal processor. The LambdaFabric makes it possible to instantly connect those cores to each other — a trick that helps overcome one of the major problems of multicore chips, Goldin said. The LambdaFabric is designed to connect up to 512,000 devices, enabling the system to be used in the most demanding computing environments. From rack to rack, the fabric has a latency (or interaction delay) of only 400 nanoseconds. And the whole system is designed to use a low amount of power.
All of the company’s designs are built on biological principles about how the brain gets a lot of computing work done with a small amount of power. The chip is based on what Goldin calls “sparse matrix heterogeneous machine learning algorithms.” And it will run C++ software, something that is already very popular. Programmers can program each one of the cores with a different algorithm to run simultaneously, for the “ultimate in heterogeneity.” It’s multiple instruction, multiple data (MIMD), and “that gives us some of our power,” Goldin said.
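As a rough illustration of the "different algorithm on each core" idea, the sketch below runs three unrelated functions concurrently, each on its own worker and its own data. It is plain Python on an ordinary CPU, not KnuPath code; the function names and the inputs are hypothetical and chosen only for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def moving_average(samples, window=3):
    """Smooth a signal with a simple sliding-window mean."""
    return [sum(samples[i:i + window]) / window
            for i in range(len(samples) - window + 1)]


def peak_index(samples):
    """Return the position of the largest sample."""
    return max(range(len(samples)), key=samples.__getitem__)


def energy(samples):
    """Return the sum of squared samples."""
    return sum(x * x for x in samples)


# Hypothetical per-core assignments: (algorithm, its own input data).
assignments = [
    (moving_average, [1.0, 2.0, 3.0, 4.0]),
    (peak_index, [0.2, 0.9, 0.4]),
    (energy, [1.0, 2.0, 2.0]),
]


def run_heterogeneous(assignments):
    """Give every (function, data) pair its own worker; return results in order."""
    with ThreadPoolExecutor(max_workers=len(assignments)) as pool:
        futures = [pool.submit(fn, data) for fn, data in assignments]
        return [f.result() for f in futures]


results = run_heterogeneous(assignments)
```

Each worker here stands in for one core: the three jobs share no code and no data, which is the property the "heterogeneity" claim is about.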

Above: KnuEdge’s KnuPath chip.
Image Credit: KnuEdge
“KnuEdge is emerging out of stealth mode to aim its new Voice and Machine Learning technologies at key challenges in IoT, cloud based machine learning and pattern recognition,” said Paul Teich, principal analyst at Tirias Research, in a statement. “Dan Goldin used his experience in transforming technology to charter KnuEdge with a bold idea, with the patience of longer development timelines and away from typical startup hype and practices. The result is a new and cutting-edge path for neural computing acceleration. There is also a refreshing surprise element to KnuEdge announcing a relevant new architecture that is ready to ship… not just a concept or early prototype.”
Today, Goldin said the company is ready to show off its designs. The first chip was ready last December, and KnuEdge is sharing it with potential customers. That chip was built with a 32-nanometer manufacturing process, and even though that’s an older technology, it is a powerful chip, Goldin said. Even at 32 nanometers, the chip has something like a two-times to six-times performance advantage over similar chips, KnuEdge said.
“The human brain has a couple of hundred billion neurons, and each neuron is connected to at least 10,000 to 100,000 neurons,” Goldin said. “And the brain is the most energy efficient and powerful computer in the world. That is the metaphor we are using.”
KnuEdge has a new version of its chip under design. And the company has already generated revenue from sales of the prototype systems. Each board has about four chips.
As for the competition from IBM, Goldin said, “I believe we made the right decision and are going in the right direction. IBM’s approach is very different from what we have. We are not aiming at anyone. We are aiming at the future.”
In his NASA days, Goldin had a lot of successes. There, he redesigned and delivered the International Space Station, tripled the number of space flights, and put a record number of people into space, all while reducing the agency’s planned budget by 25 percent. He also spent 25 years at TRW, where he led the development of satellite television services.
KnuEdge has 100 employees, but Goldin said the company outsources almost everything. Goldin said he is planning to raise a round of funding late this year or early next year. The company collaborated with the University of California at San Diego and UCSD’s California Institute for Telecommunications and Information Technology.
With computers that can handle natural language systems, many people in the world who can’t read or write will be able to fend for themselves more easily, Goldin said.
“I want to be able to take machine learning and help people communicate and make a living,” he said. “This is just the beginning. This is the Wild West. We are talking to very large companies about this, and they are getting very excited.”
A sample application is a home that has much greater self-awareness. If there’s something wrong in the house, the KnuEdge system could analyze it and figure out if it needs to alert the homeowner.
Goldin said it was hard to keep the company secret.
“I’ve been biting my lip for ten years,” he said.
As for whether KnuEdge’s technology could be used to send people to Mars, Goldin said, “This is available to whoever is going to Mars. I tried twice. I would love it if they use it to get there.”
ORIGINAL: Venture Beat

First Human Tests of Memory Boosting Brain Implant—a Big Leap Forward

By Hugo Angel,

“You have to begin to lose your memory, if only bits and pieces, to realize that memory is what makes our lives. Life without memory is no life at all.” – Luis Buñuel Portolés, Filmmaker
Image Credit: Shutterstock.com
Every year, hundreds of millions of people experience the pain of a failing memory.
The reasons are many:

  • traumatic brain injury, which haunts a disturbingly high number of veterans and football players; 
  • stroke or Alzheimer’s disease, which often plagues the elderly; or 
  • even normal brain aging, which inevitably touches us all.
Memory loss seems to be inescapable. But one maverick neuroscientist is working hard on an electronic cure. Funded by DARPA, Dr. Theodore Berger, a biomedical engineer at the University of Southern California, is testing a memory-boosting implant that mimics the kind of signal processing that occurs when neurons are laying down new long-term memories.
The revolutionary implant, already shown to help memory encoding in rats and monkeys, is now being tested in human patients with epilepsy — an exciting first that may blow the field of memory prosthetics wide open.
To get here, however, the team first had to crack the memory code.

Deciphering Memory
From the very onset, Berger knew he was facing a behemoth of a problem.
“We weren’t looking to match everything the brain does when it processes memory, but to at least come up with a decent mimic,” said Berger.
“Of course people asked: can you model it and put it into a device? Can you get that device to work in any brain? It’s those things that lead people to think I’m crazy. They think it’s too hard,” he said.
But the team had a solid place to start.
The hippocampus, a region buried deep within the folds and grooves of the brain, is the critical gatekeeper that transforms memories from short-lived to long-term. In dogged pursuit, Berger spent most of the last 35 years trying to understand how neurons in the hippocampus accomplish this complicated feat.
“At its heart, a memory is a series of electrical pulses that occur over time that are generated by a given number of neurons,” said Berger. “This is important: it suggests that we can reduce it to mathematical equations and put it into a computational framework.”
Berger hasn’t been alone in his quest.
By listening to the chatter of neurons as an animal learns, teams of neuroscientists have begun to decipher the flow of information within the hippocampus that supports memory encoding. Key to this process is a strong electrical signal that travels from CA3, the “input” part of the hippocampus, to CA1, the “output” node.
“This signal is impaired in people with memory disabilities,” said Berger, “so of course we thought if we could recreate it using silicon, we might be able to restore, or even boost, memory.”

Bridging the Gap
Yet the brain’s memory code proved to be extremely tough to crack.
The problem lies in the non-linear nature of neural networks: signals are often noisy and constantly overlap in time, which leads to some inputs being suppressed or accentuated. In a network of hundreds and thousands of neurons, any small change could be greatly amplified and lead to vastly different outputs.
“It’s a chaotic black box,” laughed Berger.
With the help of modern computing techniques, however, Berger believes he may have a crude solution in hand. His proof?
Use his mathematical theorems to program a chip, and then see if the brain accepts the chip as a replacement — or additional — memory module.
Berger and his team began with a simple task using rats. They trained the animals to push one of two levers to get a tasty treat, and recorded the series of CA3 to CA1 electronic pulses in the hippocampus as the animals learned to pick the correct lever. The team carefully captured the way the signals were transformed as the session was laid down into long-term memory, and used that information — the electrical “essence” of the memory — to program an external memory chip.
They then injected the animals with a drug that temporarily disrupted their ability to form and access long-term memories, causing the animals to forget the reward-associated lever. Next, implanting microelectrodes into the hippocampus, the team pulsed CA1, the output region, with their memory code.
The results were striking — powered by an external memory module, the animals regained their ability to pick the right lever.
Encouraged by the results, Berger next tried his memory implant in monkeys, this time focusing on a brain region called the prefrontal cortex, which receives and modulates memories encoded by the hippocampus.
Placing electrodes into the monkeys’ brains, the team showed the animals a series of semi-repeated images, and captured the prefrontal cortex’s activity when the animals recognized an image they had seen earlier. Then with a hefty dose of cocaine, the team inhibited that particular brain region, which disrupted the animals’ recall.
Next, using electrodes programmed with the “memory code,” the researchers guided the brain’s signal processing back on track, and the animals’ performance improved significantly.
A year later, the team further validated their memory implant by showing it could also rescue memory deficits due to hippocampal malfunction in the monkey brain.

A Human Memory Implant
Last year, the team cautiously began testing their memory implant prototype in human volunteers.
Because of the risks associated with brain surgery, the team recruited 12 patients with epilepsy, who already have electrodes implanted into their brain to track down the source of their seizures.
“Repeated seizures steadily destroy critical parts of the hippocampus needed for long-term memory formation,” explained Berger. So if the implant works, it could benefit these patients as well.
The team asked the volunteers to look through a series of pictures, and then recall which ones they had seen 90 seconds later. As the participants learned, the team recorded the firing patterns in both CA3 and CA1, that is, the input and output nodes.
Using these data, the team extracted an algorithm — a specific human “memory code” — that could predict the pattern of activity in CA1 cells based on CA3 input. Compared to the brain’s actual firing patterns, the algorithm generated correct predictions roughly 80% of the time.
“It’s not perfect,” said Berger, “but it’s a good start.”
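The "roughly 80% of the time" figure is a comparison between predicted and recorded activity. A minimal sketch of such a scoring step follows, assuming spike trains are binned into 0/1 values per time step. The article does not describe the team's actual metric or model, and the data below is made up.

```python
def prediction_accuracy(predicted, recorded):
    """Fraction of time bins where predicted firing (1) or silence (0) matches the recording."""
    if len(predicted) != len(recorded):
        raise ValueError("spike trains must cover the same number of time bins")
    if not recorded:
        raise ValueError("spike trains must not be empty")
    matches = sum(p == r for p, r in zip(predicted, recorded))
    return matches / len(recorded)


# Made-up example: ten time bins for one hypothetical CA1 cell.
recorded_ca1 = [0, 1, 1, 0, 0, 1, 0, 1, 0, 0]
predicted_ca1 = [0, 1, 0, 0, 0, 1, 0, 1, 1, 0]

accuracy = prediction_accuracy(predicted_ca1, recorded_ca1)
```

In this toy case the prediction disagrees with the recording in two of ten bins, so the score is 0.8.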
Using this algorithm, the researchers have begun to stimulate the output cells with an approximation of the transformed input signal.
“We have already used the pattern to zap the brain of one woman with epilepsy,” said Dr. Dong Song, an associate professor working with Berger. But he remained coy about the result, only saying that although promising, it’s still too early to tell.
Song’s caution is warranted. Unlike the motor cortex, with its clear structured representation of different body parts, the hippocampus is not organized in any obvious way.
“It’s hard to understand why stimulating input locations can lead to predictable results,” said Dr. Thomas McHugh, a neuroscientist at the RIKEN Brain Science Institute. It’s also difficult to tell whether such an implant could save the memory of those who suffer from damage to the output node of the hippocampus.
“That said, the data is convincing,” McHugh acknowledged.
Berger, on the other hand, is ecstatic. “I never thought I’d see this go into humans,” he said.
But the work is far from done. Within the next few years, Berger wants to see whether the chip can help build long-term memories in a variety of different situations. After all, the algorithm was based on the team’s recordings of one specific task — what if the so-called memory code is not generalizable, instead varying based on the type of input that it receives?
Berger acknowledges that it’s a possibility, but he remains hopeful.
“I do think that we will find a model that’s a pretty good fit for most conditions,” he said. “After all, the brain is restricted by its own biophysics: there’s only so many ways that electrical signals in the hippocampus can be processed.”
“The goal is to improve the quality of life for somebody who has a severe memory deficit,” said Berger. “If I can give them the ability to form new long-term memories for half the conditions that most people live in, I’ll be happy as hell, and so will be most patients.”
ORIGINAL: Singularity Hub

Next Rembrandt

By Hugo Angel,

01 GATHERING THE DATA
To distill the artistic DNA of Rembrandt, an extensive database of his paintings was built and analyzed, pixel by pixel.
FUN FACT:
150 Gigabytes of digitally rendered graphics

BUILDING AN EXTENSIVE POOL OF DATA
It’s been almost four centuries since the world lost the talent of one of its most influential classical painters, Rembrandt van Rijn. To bring him back, we distilled the artistic DNA from his work and used it to create The Next Rembrandt.
We examined the entire collection of Rembrandt’s work, studying the contents of his paintings pixel by pixel. To get this data, we analyzed a broad range of materials like high resolution 3D scans and digital files, which were upscaled by deep learning algorithms to maximize resolution and quality. This extensive database was then used as the foundation for creating The Next Rembrandt.
Data is used by many people today to help them be more efficient and knowledgeable about their daily work, and about the decisions they need to make. But in this project it’s also used to make life itself more beautiful. It really touches the human soul.
– Ron Augustus, Microsoft
02 DETERMINING THE SUBJECT
Data from Rembrandt’s body of work showed the way to the subject of the new painting.
FUN FACT:
346 Paintings were studied


DELVING INTO REMBRANDT VAN RIJN
  • 49% FEMALE
  • 51% MALE
Throughout his life, Rembrandt painted a great number of self-portraits, commissioned portraits and group shots, Biblical scenes, and even a few landscapes. He’s known for painting brutally honest and unforgiving portrayals of his subjects, utilizing a limited color palette for facial emphasis, and innovating the use of light and shadows.
“There’s a lot of Rembrandt data available — you have this enormous amount of technical data from all these paintings from various collections. And can we actually create something out of it that looks like Rembrandt? That’s an appealing question.”
– Joris Dik, Technical University Delft
BREAKING DOWN THE DEMOGRAPHICS IN REMBRANDT’S WORK
To create new artwork using data from Rembrandt’s paintings, we had to maximize the data pool from which to pull information. Because he painted more portraits than any other subject, we narrowed down our exploration to these paintings.
Then we found the period in which the majority of these paintings were created: between 1632 and 1642. Next, we defined the demographic segmentation of the people in these works and saw which elements occurred in the largest sample of paintings. We funneled down that selection starting with gender and then went on to analyze everything from age and head direction, to the amount of facial hair present.
After studying the demographics, the data led us to a conclusive subject: a portrait of a Caucasian male with facial hair, between the ages of thirty and forty, wearing black clothes with a white collar and a hat, facing to the right.
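The funnel described above, where one attribute is fixed at a time and only the matching paintings are carried forward, can be sketched as follows. This is an illustration under assumptions, not the project's code: the attribute names and the six portrait records are invented.

```python
from collections import Counter


def funnel(portraits, attributes):
    """Attribute by attribute, keep only the portraits sharing the most common value."""
    chosen = {}
    remaining = list(portraits)
    for attr in attributes:
        value, _count = Counter(p[attr] for p in remaining).most_common(1)[0]
        chosen[attr] = value
        remaining = [p for p in remaining if p[attr] == value]
    return chosen, remaining


# Invented records standing in for the tagged portrait database.
portraits = [
    {"gender": "male", "age": "30-40", "facing": "right"},
    {"gender": "male", "age": "30-40", "facing": "right"},
    {"gender": "male", "age": "30-40", "facing": "left"},
    {"gender": "male", "age": "50-60", "facing": "left"},
    {"gender": "female", "age": "30-40", "facing": "left"},
    {"gender": "female", "age": "20-30", "facing": "left"},
]

subject, matching = funnel(portraits, ["gender", "age", "facing"])
```

Note that the order of the funnel matters. In this toy data most portraits face left overall, yet the funnel selects "right" because it only counts the portraits that survived the gender and age steps.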
03 GENERATING THE FEATURES
A software system was designed to understand Rembrandt’s style and generate new features.
FUN FACT:
500+ Hours of rendering
MASTERING THE STYLE OF REMBRANDT
In creating the new painting, it was imperative to stay accurate to Rembrandt’s unique style. As “The Master of Light and Shadow,” Rembrandt relied on his innovative use of lighting to shape the features in his paintings. By using very concentrated light sources, he essentially created a “spotlight effect” that gave great attention to the lit elements and left the rest of the painting shrouded in shadows. This resulted in some of the features being very sharp and in focus and others becoming soft and almost blurry, an effect that had to be replicated in the new artwork.
When you want to make a new painting you have some idea of how it’s going to look. But in our case we started from basically nothing — we had to create a whole painting using just data from Rembrandt’s paintings.
– Ben Haanstra, Developer
GENERATING FEATURES BASED ON DATA
To master his style, we designed a software system that could understand Rembrandt based on his use of geometry, composition, and painting materials. A facial recognition algorithm identified and classified the most typical geometric patterns used by Rembrandt to paint human features. It then used the learned principles to replicate the style and generate new facial features for our painting.
CONSTRUCTING A FACE OUT OF THE NEW FEATURES
Once we generated the individual features, we had to assemble them into a fully formed face and bust according to Rembrandt’s use of proportions. An algorithm measured the distances between the facial features in Rembrandt’s paintings and calculated them based on percentages. Next, the features were transformed, rotated, and scaled, then accurately placed within the frame of the face. Finally, we rendered the light based on gathered data in order to cast authentic shadows on each feature.
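A minimal sketch of the two steps in this paragraph: expressing feature positions as percentages of the face, then scaling, rotating, and translating a generated feature into place. The landmark coordinates, canvas size, and function names are assumptions made for the example; the article does not give the project's actual measurements or code.

```python
import math


def proportions(landmarks, face_width, face_height):
    """Express each landmark's (x, y) position as a percentage of the face box."""
    return {name: (100 * x / face_width, 100 * y / face_height)
            for name, (x, y) in landmarks.items()}


def average_proportions(all_proportions):
    """Average the percentage positions over several paintings."""
    n = len(all_proportions)
    return {name: (sum(p[name][0] for p in all_proportions) / n,
                   sum(p[name][1] for p in all_proportions) / n)
            for name in all_proportions[0]}


def place(points, scale, angle_deg, target):
    """Scale and rotate outline points about the origin, then translate them to target."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    tx, ty = target
    return [(scale * (x * cos_a - y * sin_a) + tx,
             scale * (x * sin_a + y * cos_a) + ty) for x, y in points]


# Made-up nose positions (pixels) in two paintings with different face sizes.
painting_a = proportions({"nose": (50, 60)}, face_width=100, face_height=120)
painting_b = proportions({"nose": (110, 130)}, face_width=200, face_height=200)
typical = average_proportions([painting_a, painting_b])

# Convert the averaged percentages to a position on a hypothetical 400x500 face box.
nose_x = typical["nose"][0] * 400 / 100
nose_y = typical["nose"][1] * 500 / 100
placed = place([(1.0, 0.0)], scale=2.0, angle_deg=90, target=(nose_x, nose_y))
```

Working in percentages rather than pixels is what lets measurements from paintings of different sizes be averaged together.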
04 BRINGING IT TO LIFE
CREATING ACCURATE DEPTH AND TEXTURE
We now had a digital file true to Rembrandt’s style in content, shapes, and lighting. But paintings aren’t just 2D — they have a remarkable three-dimensionality that comes from brushstrokes and layers of paint. To recreate this texture, we had to study 3D scans of Rembrandt’s paintings and analyze the intricate layers on top of the canvas.
“We looked at a number of Rembrandt paintings, and we scanned their surface texture, their elemental composition, and what kinds of pigments were used. That’s the kind of information you need if you want to generate a painting by Rembrandt virtually.”
– Joris Dik, Technical University Delft
USING A HEIGHT MAP TO PRINT IN 3D
We created a height map using two different algorithms that found texture patterns of canvas surfaces and layers of paint. That information was transformed into height data, allowing us to mimic the brushstrokes used by Rembrandt.
We then used an elevated printing technique on a 3D printer that output multiple layers of paint-based UV ink. The final height map determined how much ink was released onto the canvas during each layer of the printing process. In the end, we printed thirteen layers of ink, one on top of the other, to create a painting texture true to Rembrandt’s style.
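One simple way to turn a height map into per-layer ink instructions is to slice it: a pixel receives ink in a layer only if its height calls for at least that many layers. The sketch below assumes this slicing scheme and a made-up 2×3 height map; the article reports thirteen layers but does not say how heights were mapped to ink amounts.

```python
NUM_LAYERS = 13  # the article reports thirteen printed layers


def layers_needed(height, max_height):
    """Map a height value to how many of the ink layers should cover that pixel."""
    return round(NUM_LAYERS * height / max_height)


def layer_masks(height_map):
    """Return one binary mask per layer: 1 where that layer deposits ink."""
    max_height = max(max(row) for row in height_map)
    if max_height == 0:
        # A completely flat map needs no raised ink at all.
        counts = [[0 for _ in row] for row in height_map]
    else:
        counts = [[layers_needed(h, max_height) for h in row] for row in height_map]
    return [[[1 if c > layer else 0 for c in row] for row in counts]
            for layer in range(NUM_LAYERS)]


# Made-up height map in arbitrary units; a real one would come from the 3D scans.
height_map = [[0.0, 0.4, 1.0],
              [0.25, 0.75, 1.0]]
masks = layer_masks(height_map)
```

Stacking the masks reproduces the relief: the tallest pixels are covered by all thirteen layers, flat canvas by none.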

ORIGINAL: Next Rembrandt

A Scale-up Synaptic Supercomputer (NS16e): Four Perspectives

By Hugo Angel,

Today, Lawrence Livermore National Lab (LLNL) and IBM announce the development of a new Scale-up Synaptic Supercomputer (NS16e) that highly integrates 16 TrueNorth Chips in a 4×4 array to deliver 16 million neurons and 4 billion synapses. LLNL will also receive an end-to-end software ecosystem that consists of a simulator; a programming language; an integrated programming environment; a library of algorithms as well as applications; firmware; tools for composing neural networks for deep learning; a teaching curriculum; and cloud enablement. Also, don’t miss the story in The Wall Street Journal (sign-in required) and the perspective and a video by LLNL’s Brian Van Essen.
To provide insights into what it took to achieve this significant milestone in the history of our project, following are four intertwined perspectives from my colleagues:

  • Filipp Akopyan — First Steps to an Efficient Scalable NeuroSynaptic Supercomputer.
  • Bill Risk and Ben Shaw — Creating an Iconic Enclosure for the NS16e.
  • Jun Sawada — NS16e System as a Neural Network Development Workstation.
  • Brian Taba — How to Program a Synaptic Supercomputer.
The following timeline provides context for today’s milestone in terms of the continued evolution of our project.
Illustration Credit: William Risk

Microsoft Neural Net Shows Deep Learning can get Way Deeper

By Hugo Angel,

Silicon Wafer by Sonic
PAUL TAYLOR/GETTY IMAGES
Computer vision is now a part of everyday life. Facebook recognizes faces in the photos you post to the popular social network. The Google Photos app can find images buried in your collection, identifying everything from dogs to birthday parties to gravestones. Twitter can pinpoint pornographic images without help from human curators.
All of this “seeing” stems from a remarkably effective breed of artificial intelligence called deep learning. But as far as this much-hyped technology has come in recent years, a new experiment from Microsoft Research shows it’s only getting started. Deep learning can go so much deeper.
“We’re staring at a huge design space, trying to figure out where to go next.”
– Peter Lee, Microsoft Research
This revolution in computer vision was a long time coming. A key turning point came in 2012, when artificial intelligence researchers from the University of Toronto won a competition called ImageNet. ImageNet pits machines against each other in an image recognition contest—which computer can identify cats or cars or clouds more accurately?—and that year, the Toronto team, including researcher Alex Krizhevsky and professor Geoff Hinton, topped the contest using deep neural nets, a technology that learns to identify images by examining enormous numbers of them, rather than identifying images according to rules diligently hand-coded by humans.
 
Toronto’s win provided a roadmap for the future of deep learning. In the years since, the biggest names on the ’net—including Facebook, Google, Twitter, and Microsoft—have used similar tech to build computer vision systems that can match and even surpass humans. “We can’t claim that our system ‘sees’ like a person does,” says Peter Lee, the head of research at Microsoft. “But what we can say is that for very specific, narrowly defined tasks, we can learn to be as good as humans.”
Roughly speaking, neural nets use hardware and software to approximate the web of neurons in the human brain. This idea dates to the 1980s, but in 2012, Krizhevsky and Hinton advanced the technology by running their neural nets atop graphics processing units, or GPUs. These specialized chips were originally designed to render images for games and other highly graphical software, but as it turns out, they’re also suited to the kind of math that drives neural nets. Google, Facebook, Twitter, Microsoft, and so many others now use GPU-powered AI to handle image recognition and so many other tasks, from Internet search to security. Krizhevsky and Hinton joined the staff at Google.
Now, the latest ImageNet winner is pointing to what could be another step in the evolution of computer vision—and the wider field of artificial intelligence. Last month, a team of Microsoft researchers took the ImageNet crown using a new approach they call a deep residual network. The name doesn’t quite describe it. They’ve designed a neural net that’s significantly more complex than typical designs—one that spans 152 layers of mathematical operations, compared to the typical six or seven. It shows that, in the years to come, companies like Microsoft will be able to use vast clusters of GPUs and other specialized chips to significantly improve not only image recognition but other AI services, including systems that recognize speech and even understand language as we humans naturally speak it.
In other words, deep learning is nowhere close to reaching its potential. “We’re staring at a huge design space,” Lee says, “trying to figure out where to go next.”
Layers of Neurons
Deep neural networks are arranged in layers. Each layer is a different set of mathematical operations—aka algorithms. The output of one layer becomes the input of the next. Loosely speaking, if a neural network is designed for image recognition, one layer will look for a particular set of features in an image—edges or angles or shapes or textures or the like—and the next will look for another set. These layers are what make these neural networks deep. “Generally speaking, if you make these networks deeper, it becomes easier for them to learn,” says Alex Berg, a researcher at the University of North Carolina who helps oversee the ImageNet competition.
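The idea that one layer's output becomes the next layer's input can be sketched in a few lines. This is only a toy. In a real network each layer's operations are learned from data, whereas the three hand-written functions below (`edge_layer`, `threshold_layer`, `count_layer`) are hypothetical stand-ins chosen for illustration.

```python
def edge_layer(pixels):
    """First layer: measure the change between neighbouring pixels."""
    return [abs(b - a) for a, b in zip(pixels, pixels[1:])]


def threshold_layer(edges, cutoff=0.5):
    """Second layer: keep only the strong changes."""
    return [1.0 if e > cutoff else 0.0 for e in edges]


def count_layer(flags):
    """Third layer: summarize how many strong edges were found."""
    return [sum(flags)]


def run_network(layers, x):
    """Feed the input through each layer in turn."""
    for layer in layers:
        x = layer(x)  # the output of one layer becomes the input of the next
    return x


# A one-dimensional "image" with a bright stripe in the middle.
result = run_network([edge_layer, threshold_layer, count_layer], [0, 0, 1, 1, 0])
```

Here the stripe has two sides, so the final layer reports two strong edges. Adding depth means adding more functions to the list.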
Today, a typical neural network includes six or seven layers. Some might extend to 20 or even 30. But the Microsoft team, led by researcher Jian Sun, just expanded that to 152. In essence, this neural net is better at recognizing images because it can examine more features. “There is a lot more subtlety that can be learned,” Lee says.
In the past, according to Lee and researchers outside of Microsoft, this sort of very deep neural net wasn’t feasible. Part of the problem was that as your mathematical signal moved from layer to layer, it became diluted and tended to fade. As Lee explains, Microsoft solved this problem by building a neural net that skips certain layers when it doesn’t need them, but uses them when it does. “When you do this kind of skipping, you’re able to preserve the strength of the signal much further,” Lee says, “and this is turning out to have a tremendous, beneficial impact on accuracy.”
Berg says that this is a notable departure from previous systems, and he believes that other companies and researchers will follow suit.
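A toy calculation shows why skipping helps. The sketch below is not Microsoft's code. It uses the shortcut form usually associated with residual networks, in which a layer's input is added back to its output, and it reduces every layer to a single made-up weight (`weight=0.001`) so the effect is easy to see.

```python
def weak_layer(x, weight=0.001):
    """A hypothetical layer that passes on only a sliver of its input."""
    return weight * x


def plain_stack(x, depth):
    """Ordinary stacking: each layer sees only the previous layer's output."""
    for _ in range(depth):
        x = weak_layer(x)
    return x


def residual_stack(x, depth):
    """Residual stacking: a shortcut carries the input around each layer."""
    for _ in range(depth):
        x = x + weak_layer(x)  # skip connection: input is added back in
    return x


depth = 152  # the depth reported for the Microsoft network
faded = plain_stack(1.0, depth)
preserved = residual_stack(1.0, depth)
```

With these made-up numbers, the plain stack's signal shrinks to effectively nothing after 152 layers, while the residual stack's signal stays close to its starting strength.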
Deep Difficulty
The other issue is that constructing this kind of mega-neural net is tremendously difficult. Landing on a particular set of algorithms—determining how each layer should operate and how it should talk to the next layer—is an almost epic task. But Microsoft has a trick here, too. It has designed a computing system that can help build these networks.
As Jian Sun explains it, researchers can identify a promising arrangement for massive neural networks, and then the system can cycle through a range of similar possibilities until it settles on the best one. “In most cases, after a number of tries, the researchers learn [something], reflect, and make a new decision on the next try,” he says. “You can view this as ‘human-assisted search.’”
According to Adam Gibson—the chief researcher at deep learning startup Skymind—this kind of thing is getting more common. It’s called “hyper parameter optimization.” “People can just spin up a cluster [of machines], run 10 models at once, find out which one works best and use that,” Gibson says. “They can input some baseline parameter—based on intuition—and the machine kind of homes in on what the best solution is.” As Gibson notes, last year Twitter acquired a company, Whetlab, that offers similar ways of “optimizing” neural networks.
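What Gibson describes can be sketched as a simple random search, one of the most basic forms of hyperparameter optimization. Everything below is illustrative. The function `validation_score` is a made-up formula standing in for "train a model and measure how well it works," and the parameter names and ranges are assumptions, not anything Skymind, Microsoft, or Whetlab is known to use.

```python
import random


def validation_score(learning_rate, num_layers):
    """Hypothetical stand-in for training a model and scoring it.

    Higher is better; this made-up formula peaks at 0.1 and 20 layers.
    """
    return -((learning_rate - 0.1) ** 2) - 0.001 * (num_layers - 20) ** 2


def random_search(baseline, num_models=10, seed=0):
    """Try the baseline plus random variations of it, and keep the best."""
    rng = random.Random(seed)
    candidates = [baseline]
    for _ in range(num_models - 1):
        candidates.append({
            "learning_rate": baseline["learning_rate"] * rng.uniform(0.5, 2.0),
            "num_layers": max(1, baseline["num_layers"] + rng.randint(-5, 5)),
        })
    # On a cluster, the candidates would be trained in parallel.
    return max(candidates, key=lambda c: validation_score(**c))


# The baseline comes from a researcher's intuition.
baseline = {"learning_rate": 0.05, "num_layers": 16}
best = random_search(baseline)
```

Because the baseline is always among the ten candidates, the search can never return something worse than the researcher's starting guess.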

‘A Hardware Problem’
As Peter Lee and Jian Sun describe it, such an approach isn’t exactly “brute forcing” the problem. “With very very large amounts of compute resources, one could fantasize about a gigantic ‘natural selection’ setup where evolutionary forces help direct a brute-force search through a huge space of possibilities,” Lee says. “The world doesn’t have those computing resources available for such a thing…For now, we will still depend on really smart researchers like Jian.”
But Lee does say that, thanks to new techniques and computer data centers filled with GPU machines, the realm of possibilities for deep learning is enormous. A big part of the company’s task is just finding the time and the computing power needed to explore these possibilities. “This work has dramatically exploded the design space. The amount of ground to cover, in terms of scientific investigation, has become exponentially larger,” Lee says. And this extends well beyond image recognition, into speech recognition, natural language understanding, and other tasks.
As Lee explains, that’s one reason Microsoft is not only pushing to improve the power of its GPU clusters, but exploring the use of other specialized processors, including FPGAs—chips that can be programmed for particular tasks, such as deep learning. “There has also been an explosion in demand for much more experimental hardware platforms from our researchers,” he says. And this work is sending ripples across the wider world of tech and artificial intelligence. This past summer, in its largest ever acquisition deal, Intel agreed to buy Altera, which specializes in FPGAs.
Indeed, Gibson says that deep learning has become more of “a hardware problem.” Yes, we still need top researchers to guide the creation of neural networks, but more and more, finding new paths is a matter of brute-forcing new algorithms across ever more powerful collections of hardware. As Gibson points out, though these deep neural nets work extremely well, we don’t quite know why they work. The trick lies in finding the complex combination of algorithms that works the best. More and better hardware can shorten the path.
The end result is that the companies that can build the most powerful networks of hardware are the companies that will come out ahead. That would be Google and Facebook and Microsoft. Those that are good at deep learning today will only get better.
ORIGINAL: Wired

NVIDIA DRIVE PX 2. NVIDIA Accelerates Race to Autonomous Driving at CES 2016

By Hugo Angel,

NVIDIA today shifted its autonomous-driving leadership into high gear.
At a press event kicking off CES 2016, we unveiled artificial-intelligence technology that will let cars sense the world around them and pilot a safe route forward.
Dressed in his trademark black leather jacket, speaking to a crowd of some 400 automakers, media and analysts, NVIDIA CEO Jen-Hsun Huang revealed DRIVE PX 2, an automotive supercomputing platform that processes 24 trillion deep learning operations a second. That’s 10 times the performance of the first-generation DRIVE PX, now being used by more than 50 companies in the automotive world.
The new DRIVE PX 2 delivers 8 teraflops of processing power. It has the processing power of 150 MacBook Pros. And it’s the size of a lunchbox in contrast to earlier autonomous-driving technology being used today, which takes up the entire trunk of a mid-sized sedan.
“Self-driving cars will revolutionize society,” Huang said at the beginning of his talk. “And NVIDIA’s vision is to enable them.”
 
Volvo to Deploy DRIVE PX in Self-Driving SUVs
As part of its quest to eliminate traffic fatalities, Volvo will be the first automaker to deploy DRIVE PX 2.
Huang announced that Volvo – known worldwide for safety and reliability – will be the first automaker to deploy DRIVE PX 2.
In the world’s first public trial of autonomous driving, the Swedish automaker next year will lease 100 XC90 luxury SUVs outfitted with DRIVE PX 2 technology. The technology will help the vehicles drive autonomously around Volvo’s hometown of Gothenburg, and semi-autonomously elsewhere.
DRIVE PX 2 has the power to harness a host of sensors to get a 360 degree view of the environment around the car.
“The rear-view mirror is history,” Jen-Hsun said.
Drive Safely, by Not Driving at All
Not so long ago, pundits had questioned the safety of technology in cars. Now, with Volvo incorporating autonomous vehicles into its plan to end traffic fatalities, that script has been flipped. Autonomous cars may be vastly safer than human-piloted vehicles.
Car crashes – an estimated 93 percent of them caused by human error – kill 1.3 million drivers each year. More American teenagers die from texting while driving than any other cause, including drunk driving.
There’s also a productivity issue. Americans waste some 5.5 billion hours of time each year in traffic, costing the U.S. about $121 billion, according to an Urban Mobility Report from Texas A&M. And inefficient use of roads by cars wastes even vaster sums spent on infrastructure.
Deep Learning Hits the Road
Self-driving solutions based on computer vision can provide some answers. But tackling the infinite permutations that a driver needs to react to – stray pets, swerving cars, slashing rain, steady road construction crews – is far too complex a programming challenge.
Deep learning enabled by NVIDIA technology can address these challenges. A highly trained deep neural network – residing on supercomputers in the cloud – captures the experience of many tens of thousands of hours of road time.
Huang noted that a number of automotive companies are already using NVIDIA’s deep learning technology to power their efforts, getting speedups of 30-40X in training their networks compared with other technology. BMW, Daimler and Ford are among them, along with innovative Japanese startups like Preferred Networks and ZMP. And Audi said it was able to do in four hours the training that took it two years with a competing solution.
  NVIDIA DRIVE PX 2 is part of an end-to-end platform that brings deep learning to the road.
NVIDIA’s end-to-end solution for deep learning starts with NVIDIA DIGITS, a supercomputer that can be used to train digital neural networks by exposing them to data collected during that time on the road. On the other end is DRIVE PX 2, which draws on this training to make inferences to enable the car to progress safely down the road. In the middle is NVIDIA DriveWorks, a suite of software tools, libraries and modules that accelerates development and testing of autonomous vehicles.
DriveWorks enables sensor calibration, acquisition of surround data, synchronization, recording and then processing streams of sensor data through a complex pipeline of algorithms running on all of the DRIVE PX 2’s specialized and general-purpose processors.
During the event, Huang reminded the audience that machines are already beating humans at tasks once considered impossible for computers, such as image recognition. Systems trained with deep learning can now correctly classify images more than 96 percent of the time, exceeding what humans can do on similar tasks.
He used the event to show what deep learning can do for autonomous vehicles.
A series of demos drove this home, showing in three steps how DRIVE PX 2 harnesses a host of sensors – lidar, radar, cameras and ultrasonic – to understand the world around it, in real time, and plan a safe and efficient path forward.
The World’s Biggest Infotainment System
 
The highlight of the demos was what Huang called the world’s largest car infotainment system — an elegant block the size of a medium-sized bedroom wall mounted with a long horizontal screen and a long vertical one.
While a third, larger screen showed the scene that a driver would take in, the wide demo screen showed how the car — using deep learning and sensor fusion — “viewed” the very same scene in real time, stitched together from its array of sensors. On its right, the huge portrait-oriented screen showed a highly precise map that marked the car’s progress.
It’s a demo that will leave an impression on an audience that’s going to hear a lot about the future of driving in the week ahead.
ORIGINAL: Nvidia
By Bob Sherbin on January 3, 2016

Robots are learning from YouTube tutorials

By Hugo Angel,

Do it yourself, robot. (Reuters/Kim Kyung-Hoon)
For better or worse, we’ve taught robots to mimic human behavior in countless ways. They can perform tasks as rudimentary as picking up objects, or as creative as dreaming their own dreams. They can identify bullying, and even play jazz. Now, we’ve taught robots the most human task of all: how to teach themselves to make Jell-O shots from watching YouTube videos.
Ever go to YouTube and type in something like, “How to make pancakes,” or, “How to mount a TV”? Sure you have. While many such tutorials are awful—and some are just deliberately misleading—the sheer number of instructional videos offers strong odds of finding one that’s genuinely helpful. And when all those videos are aggregated and analyzed simultaneously, it’s not hard for a robot to figure out what the correct steps are.

Researchers at Cornell University have taught robots to do just that with a system called RoboWatch. By watching and scanning multiple videos of the same “how-to” activity (with subtitles enabled), bots can 

  • identify common steps, 
  • put them in order, and 
  • learn how to do whatever the tutorials are teaching.
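The first two steps can be illustrated with a toy example. This is not RoboWatch's actual method, which the article does not describe in detail. It assumes each video has already been reduced to a list of step labels taken from its subtitles, and the function name `common_ordered_steps` and the "appears in at least half the videos" rule are hypothetical choices.

```python
from collections import defaultdict


def common_ordered_steps(videos, min_fraction=0.5):
    """Keep steps seen in enough videos, ordered by where they tend to occur."""
    positions = defaultdict(list)
    for steps in videos:
        seen = set()
        for index, step in enumerate(steps):
            if step not in seen:  # count each step once per video
                seen.add(step)
                # Record position as a fraction of the way through the video.
                positions[step].append(index / max(1, len(steps) - 1))
    needed = min_fraction * len(videos)
    average = {
        step: sum(found) / len(found)
        for step, found in positions.items()
        if len(found) >= needed
    }
    return sorted(average, key=lambda step: (average[step], step))


videos = [
    ["mix batter", "heat pan", "pour batter", "flip", "serve"],
    ["heat pan", "mix batter", "pour batter", "flip"],
    ["mix batter", "subscribe", "heat pan", "pour batter", "flip", "serve"],
]
recipe = common_ordered_steps(videos)
```

In this example the one-off "subscribe" step is discarded because only one video contains it, and the remaining steps come out in the order most tutorials follow, even though one video heats the pan before mixing the batter.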
Robot learning is not new, but what’s unusual here is that these robots can learn without human supervision, as Phys.Org points out.
Similar research usually requires human overseers to introduce and explain words, or captions, for the robots to parse. RoboWatch, however, needs no human help, save that someone ensure all the videos analyzed fall into a single category. The idea is that a human could one day tell a robot to perform a task and then the robot would independently research and learn how to carry out that task.
So next time you get frustrated watching a video on how to change a tire, don’t fret. Soon, a robot will do all that for you. We just have to make sure it doesn’t watch any videos about “how to take over the world.”
ORIGINAL: QZ
December 22, 2015