Here’s a tricky task. Pick a photograph from the Web at random. Now try to work out where it was taken using only the image itself. If the image shows a famous building or landmark, such as the Eiffel Tower or Niagara Falls, the task is straightforward. But the job becomes significantly harder when the image lacks specific location cues: when it was taken indoors, say, or shows a pet, food, or some other detail.

Nevertheless, humans are surprisingly good at this task. To help, they bring to bear all kinds of knowledge about the world, such as the type and language of signs on display, the types of vegetation, architectural styles, the direction of traffic, and so on. Humans spend a lifetime picking up these kinds of geolocation cues.

So it’s easy to think that machines would struggle with this task. And indeed, they have.
Today, that changes thanks to the work of Tobias Weyand, a computer vision specialist at Google, and a couple of pals. These guys have trained a deep-learning machine to work out the location of almost any photo using only the pixels it contains.
Their new machine significantly outperforms humans and can even use a clever trick to determine the location of indoor images and pictures of specific things such as pets, food, and so on that have no location cues.
Their approach is straightforward, at least in the world of machine learning.
- Weyand and co begin by dividing the world into a grid of more than 26,000 squares whose size varies with the number of images taken in each location.
So big cities, which are the subjects of many images, have a more fine-grained grid structure than more remote regions where photographs are less common. Indeed, the Google team ignored areas like oceans and the polar regions, where few photographs have been taken.
- Next, the team created a database of geolocated images from the Web and used the location data to determine the grid square in which each image was taken. This data set is huge, consisting of 126 million images along with their accompanying Exif location data.
- Weyand and co used 91 million of these images to teach a powerful neural network to work out the grid location using only the image itself. Their idea is to input an image into this neural net and get as the output a particular grid location or a set of likely candidates.
- They then validated the neural network using the remaining 34 million images in the data set.
- Finally they tested the network—which they call PlaNet—in a number of different ways to see how well it works.
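The adaptive-grid idea in the first step can be sketched in code. PlaNet actually builds its grid from Google’s S2 cells; the plain latitude/longitude quadtree below is a simplification for intuition only, and the `MAX_PHOTOS` and `MIN_SPAN` values are illustrative, not the paper’s. A cell keeps splitting while it holds too many photos, so photo-dense places such as big cities end up with fine cells while sparse regions stay coarse, and empty regions (oceans, the poles) get no cell at all.

```python
import random

MAX_PHOTOS = 50   # illustrative: split any cell holding more photos than this
MIN_SPAN = 0.5    # illustrative: stop splitting below this cell width, in degrees

def partition(photos, lat0=-90.0, lat1=90.0, lon0=-180.0, lon1=180.0):
    """Recursively split a lat/lon box until each leaf holds few photos."""
    inside = [(la, lo) for la, lo in photos
              if lat0 <= la < lat1 and lon0 <= lo < lon1]
    if not inside:                     # oceans, polar regions: no cell at all
        return []
    if len(inside) <= MAX_PHOTOS or (lon1 - lon0) <= MIN_SPAN:
        return [(lat0, lat1, lon0, lon1)]
    mid_la, mid_lo = (lat0 + lat1) / 2, (lon0 + lon1) / 2
    cells = []
    for la_range in ((lat0, mid_la), (mid_la, lat1)):
        for lo_range in ((lon0, mid_lo), (mid_lo, lon1)):
            cells += partition(inside, *la_range, *lo_range)
    return cells

# A dense cluster (a "city") plus a sparse worldwide scatter: the city
# region yields many small cells, while the rest stays coarse.
random.seed(0)
city = [(48 + random.random(), 2 + random.random()) for _ in range(400)]
rural = [(random.uniform(-60, 70), random.uniform(-180, 180)) for _ in range(100)]
cells = partition(city + rural)
print(len(cells), "cells")
```

Turning geolocation into classification over these cells is what lets an off-the-shelf image classifier attack the problem at all: the network outputs a probability for each grid cell rather than raw coordinates.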
The results make for interesting reading. To measure the accuracy of their machine, they fed it 2.3 million geotagged images from Flickr to see whether it could correctly determine their location. “PlaNet is able to localize 3.6 percent of the images at street-level accuracy and 10.1 percent at city-level accuracy,” say Weyand and co. What’s more, the machine determines the country of origin in a further 28.4 percent of the photos and the continent in 48.0 percent of them.
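Accuracy figures like these are typically computed by measuring the great-circle distance between the predicted and true coordinates and bucketing it by radius. A minimal sketch, using the standard haversine formula; the 1 km / 25 km / 750 km / 2,500 km radii are illustrative choices in the spirit of the street/city/country/continent levels quoted above, not values confirmed by the article.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Illustrative radii for each accuracy level.
LEVELS = [("street", 1.0), ("city", 25.0), ("country", 750.0), ("continent", 2500.0)]

def accuracy_report(pairs):
    """pairs: list of ((pred_lat, pred_lon), (true_lat, true_lon))."""
    dists = [haversine_km(*p, *t) for p, t in pairs]
    return {name: sum(d <= radius for d in dists) / len(dists)
            for name, radius in LEVELS}

# Paris predicted as Paris (a street-level hit) and Paris predicted as
# London (roughly 340 km off: country-level only).
pairs = [((48.8584, 2.2945), (48.8584, 2.2945)),
         ((51.5, -0.12), (48.8584, 2.2945))]
print(accuracy_report(pairs))
```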
That’s pretty good. But to show just how good, Weyand and co put PlaNet through its paces in a test against 10 well-traveled humans. For the test, they used an online game that presents a player with a random view taken from Google Street View and asks him or her to pinpoint its location on a map of the world.
Anyone can play at www.geoguessr.com. Give it a try—it’s a lot of fun and trickier than it sounds.
|GeoGuessr Screen Capture Example|
Needless to say, PlaNet trounced the humans. “In total, PlaNet won 28 of the 50 rounds with a median localization error of 1131.7 km, while the median human localization error was 2320.75 km,” say Weyand and co. “[This] small-scale experiment shows that PlaNet reaches superhuman performance at the task of geolocating Street View scenes.”
An interesting question is how PlaNet performs so well without being able to use the cues that humans rely on, such as vegetation, architectural style, and so on. But Weyand and co say they know why: “We think PlaNet has an advantage over humans because it has seen many more places than any human can ever visit and has learned subtle cues of different scenes that are even hard for a well-traveled human to distinguish.”
They go further and use the machine to locate images that have no location cues at all, such as those taken indoors or of specific items. This is possible when such images are part of albums whose photos were all taken in the same place. The machine simply works out where the other images in the album were taken and assumes the ambiguous image was taken there too.
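The album trick can be illustrated with a toy calculation. PlaNet itself learns a sequence model over albums; the naive product rule below is just a sketch of the underlying intuition. Even if one photo (say, a food close-up) is nearly uninformative on its own, multiplying the per-photo probability distributions over grid cells concentrates the mass on the place the whole album shares. The cell names and probabilities are invented for illustration.

```python
def combine_album(distributions):
    """Multiply per-photo cell probabilities, then renormalise."""
    cells = distributions[0].keys()
    combined = {c: 1.0 for c in cells}
    for dist in distributions:
        for c in cells:
            combined[c] *= dist[c]
    total = sum(combined.values()) or 1.0
    return {c: v / total for c, v in combined.items()}

# Three photos scored over cells A/B/C: two landmark shots favour A,
# the indoor shot is nearly uniform, yet the album verdict is clearly A.
album = [{"A": 0.70, "B": 0.20, "C": 0.10},
         {"A": 0.60, "B": 0.30, "C": 0.10},
         {"A": 0.34, "B": 0.33, "C": 0.33}]
verdict = combine_album(album)
print(max(verdict, key=verdict.get))  # → A under these toy numbers
```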
That’s impressive work that shows deep neural nets flexing their muscles once again. Perhaps more impressive still is that the model uses a relatively small amount of memory unlike other approaches that use gigabytes of the stuff. “Our model uses only 377 MB, which even fits into the memory of a smartphone,” say Weyand and co.
That’s a tantalizing idea—the power of a superhuman neural network on a smartphone. It surely won’t be long now!
Ref: arxiv.org/abs/1602.05314 : PlaNet—Photo Geolocation with Convolutional Neural Networks
By Eliza Strickland
27 Aug 2013

Could psychological-monitoring apps become as common as fitness and activity gadgets?
|Image: Cogito A Mind Minder: Cogito’s mood-monitoring app can detect signals of psychological distress|
In April, the software company Cogito was halfway through a clinical trial to see if it could detect symptoms of depression and post-traumatic stress disorder (PTSD) through a smartphone app. All of the 100 participants in the study lived around Boston. Then, on 15 April, two bombs went off near the finish line of the Boston Marathon, killing three people and injuring hundreds. Suddenly, Cogito’s clinical trial was a lot more relevant.
The trial was funded by the Defense Advanced Research Projects Agency (DARPA) under its Detection and Computational Analysis of Psychological Signals program. To address the troubling number of psychological problems and suicides among active-duty military personnel and veterans, the U.S. Department of Defense is seeking technologies that can identify at-risk individuals so professionals can help them.
Cogito, a Boston-based MIT spin-off, developed an app that keeps track of a person’s social behavior and vocal characteristics. The app monitors the phone’s location and time of use and also logs phone calls and text messages. (It doesn’t look at the content of those calls and texts.) Finally, there’s an active component: Participants can choose to fill out questionnaires about their mood and can record audio diaries. Cogito’s expertise is in automated speech analysis, which it applies to those audio diaries; future iterations could mine phone conversations for information as well.
Put all the data together and you’re able to tell a lot about a person, says Cogito CEO Joshua Feast. Sometimes you even find signs of distress that people don’t want to admit to or haven’t recognized themselves. “We’re able to look at sleep, mood, social isolation, and physical isolation,” says Feast, all of which can serve as “honest signals” of psychological trouble. In the Boston trial, Cogito was only testing the sophisticated algorithms it developed to aggregate the data. If the trial works out, future versions of the software could provide these summaries to clinicians to allow them to intervene and could also give the information to the subjects themselves.
All of this can seem rather creepy—apps that get inside your head and reveal your emotional secrets. But Feast says that’s why his company places so much emphasis on privacy and trust. If Cogito’s system becomes a commercial product, there will be legal guarantees that a user will always own and control his or her own data. For example, a user could choose whether or not to share the data with a clinician. Feast says he doesn’t think users would have it any other way. “Morally it’s the right thing to do, and also for adoption it’s the right thing to do,” he says.
The participants in the Boston trial included veterans, civilians with histories of trauma or depression, and some healthy civilians. While the bulk of the data from the study is still being analyzed, Feast says the impact of the April bombing is already clear. The algorithms picked up more markers of stress in the participants, including decreased use of the app’s interactive components. “Fewer survey questions were being answered, and fewer audio diaries were being recorded,” he says.
Further study of the data will answer other important questions about the nature of depression and PTSD, says Feast: “What is resilience? What kind of people fared better after the bombing? What happens to people with vulnerability when things like this happen?” The company is still formulating its research questions, he says.
The Durkheim Project, another initiative funded by this DARPA program, focuses more narrowly on identifying veterans at risk of suicide. Chris Poulin, director of the project, explains that his system predicts suicide risk by analyzing veterans’ text messages and their postings on social-networking sites like Facebook and Twitter. Poulin says he’s impressed with the scope of Cogito’s data collection and its incorporation of voice monitoring. “There are other people out there collecting mobile data and looking at activity metrics, but very few people have integrated voice data,” he says.
Cogito’s voice-analysis software, Cogito Dialog, monitors vocal characteristics such as level of excitement and fluidity of speech. Feast explains that it’s tricky to get clear data in this area because there’s so much natural variation in people’s speech habits. However, the system can detect changes to an individual’s speech patterns over time and can also be useful in telemedicine. For example, if a clinician calls veterans and asks them all the same series of questions, a monitoring system can flag people with unusual responses. “Speech analysis is well suited for looking at population norms and deviation from the norms,” says Feast.
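The population-norm idea Feast describes can be sketched as a simple deviation check: score how far each individual’s speech feature sits from the group and flag the unusually large deviations. This is only an illustration of the statistical principle; Cogito’s actual features and thresholds are not public, and the speaking-rate numbers and 2-sigma cutoff below are invented.

```python
import statistics

def flag_outliers(values, cutoff=2.0):
    """Return indices whose z-score magnitude exceeds the cutoff."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [i for i, v in enumerate(values)
            if sd and abs(v - mean) / sd > cutoff]

# Hypothetical speaking rates (words/min) for people answering the same
# question over the phone; one response is markedly slower than the rest.
rates = [148, 152, 150, 149, 151, 147, 153, 150, 110]
print(flag_outliers(rates))  # → [8]: the 110 words/min response is flagged
```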
Feast believes that the company’s experience with the Boston bombing provides a preview of a possible future where psychological monitoring apps are as common as the fitness and activity gadgets that proliferate today. “When there’s an earthquake or terrorist attack or traumatic event that hits a population center, this technology could support a rapid response team for psychological distress,” he says. “It would be like the CDC [Centers for Disease Control] responding to a flu outbreak.”