An executive’s guide to machine learning

It’s no longer the preserve of artificial-intelligence researchers and born-digital companies like Amazon, Google, and Netflix.
Machine learning is based on algorithms that can learn from data without relying on rules-based programming. It came into its own as a scientific discipline in the late 1990s as steady advances in digitization and cheap computing power enabled data scientists to stop building finished models and instead train computers to do so. The unmanageable volume and complexity of the big data that the world is now swimming in have increased the potential of machine learning—and the need for it.
Stanford’s Fei-Fei Li

In 2007 Fei-Fei Li, the head of Stanford’s Artificial Intelligence Lab, gave up trying to program computers to recognize objects and began labeling the millions of raw images that a child might encounter by age three and feeding them to computers. By being shown thousands and thousands of labeled data sets with instances of, say, a cat, the machine could shape its own rules for deciding whether a particular set of digital pixels was, in fact, a cat.1 Last November, Li’s team unveiled a program that identifies the visual elements of any picture with a high degree of accuracy. IBM’s Watson machine relied on a similar self-generated scoring system among hundreds of potential answers to crush the world’s best Jeopardy! players in 2011.

Dazzling as such feats are, machine learning is nothing like learning in the human sense (yet). But what it already does extraordinarily well—and will get better at—is relentlessly chewing through any amount of data and every combination of variables. Because machine learning’s emergence as a mainstream management tool is relatively recent, it often raises questions. In this article, we’ve posed some that we often hear and answered them in a way we hope will be useful for any executive. Now is the time to grapple with these issues, because the competitive significance of business models turbocharged by machine learning is poised to surge. Indeed, management author Ram Charan suggests that any organization that is not a math house now or is unable to become one soon is already a legacy company.2
1. How are traditional industries using machine learning to gather fresh business insights?
Well, let’s start with sports. This past spring, contenders for the US National Basketball Association championship relied on the analytics of Second Spectrum, a California machine-learning start-up. By digitizing the past few seasons’ games, it has created predictive models that allow a coach to distinguish between, as CEO Rajiv Maheswaran puts it, “a bad shooter who takes good shots and a good shooter who takes bad shots”—and to adjust his decisions accordingly.
You can’t get more venerable or traditional than General Electric, the only member of the original Dow Jones Industrial Average still around after 119 years. GE already makes hundreds of millions of dollars by crunching the data it collects from deep-sea oil wells or jet engines to optimize performance, anticipate breakdowns, and streamline maintenance. But Colin Parris, who joined GE Software from IBM late last year as vice president of software research, believes that continued advances in data-processing power, sensors, and predictive algorithms will soon give his company the same sharpness of insight into the individual vagaries of a jet engine that Google has into the online behavior of a 24-year-old netizen from West Hollywood.
2. What about outside North America?
In Europe, more than a dozen banks have replaced older statistical-modeling approaches with machine-learning techniques and, in some cases, experienced 10 percent increases in sales of new products, 20 percent savings in capital expenditures, 20 percent increases in cash collections, and 20 percent declines in churn. The banks have achieved these gains by devising new recommendation engines for clients in retailing and in small and medium-sized companies. They have also built microtargeted models that more accurately forecast who will cancel service or default on their loans, and how best to intervene.
Closer to home, as a recent article in McKinsey Quarterly notes,3 our colleagues have been applying hard analytics to the soft stuff of talent management. Last fall, they tested the ability of three algorithms developed by external vendors and one built internally to forecast, solely by examining scanned résumés, which of more than 10,000 potential recruits the firm would have accepted. The predictions strongly correlated with the real-world results. Interestingly, the machines accepted a slightly higher percentage of female candidates, which holds promise for using analytics to unlock a more diverse range of profiles and counter hidden human bias.
As ever more of the analog world gets digitized, our ability to learn from data by developing and testing algorithms will only become more important for what are now seen as traditional businesses. Google chief economist Hal Varian calls this “computer kaizen.” For “just as mass production changed the way products were assembled and continuous improvement changed how manufacturing was done,” he says, “so continuous [and often automatic] experimentation will improve the way we optimize business processes in our organizations.”4
3. What were the early foundations of machine learning?
Machine learning is based on a number of earlier building blocks, starting with classical statistics. Statistical inference does form an important foundation for the current implementations of artificial intelligence. But it’s important to recognize that classical statistical techniques were developed between the 18th and early 20th centuries for much smaller data sets than the ones we now have at our disposal. Machine learning is unconstrained by the preset assumptions of statistics. As a result, it can yield insights that human analysts do not see on their own and make predictions with ever-higher degrees of accuracy.
More recently, in the 1930s and 1940s, the pioneers of computing (such as Alan Turing, who had a deep and abiding interest in artificial intelligence) began formulating and tinkering with the basic techniques such as neural networks that make today’s machine learning possible. But those techniques stayed in the laboratory longer than many technologies did and, for the most part, had to await the development and infrastructure of powerful computers, in the late 1970s and early 1980s. That’s probably the starting point for the machine-learning adoption curve. New technologies introduced into modern economies—the steam engine, electricity, the electric motor, and computers, for example—seem to take about 80 years to transition from the laboratory to what you might call cultural invisibility. The computer hasn’t faded from sight just yet, but it’s likely to by 2040. And it probably won’t take much longer for machine learning to recede into the background.
4. What does it take to get started?
C-level executives will best exploit machine learning if they see it as a tool to craft and implement a strategic vision. But that means putting strategy first. Without strategy as a starting point, machine learning risks becoming a tool buried inside a company’s routine operations: it will provide a useful service, but its long-term value will probably be limited to an endless repetition of “cookie cutter” applications such as models for acquiring, stimulating, and retaining customers.
We find the parallels with M&A instructive. That, after all, is a means to a well-defined end. No sensible business rushes into a flurry of acquisitions or mergers and then just sits back to see what happens. Companies embarking on machine learning should make the same three commitments companies make before embracing M&A:

  • first, to investigate all feasible alternatives;
  • second, to pursue the strategy wholeheartedly at the C-suite level; and,
  • third, to use (or if necessary acquire) existing expertise and knowledge in the C-suite to guide the application of that strategy.
The people charged with creating the strategic vision may well be (or have been) data scientists. But as they define the problem and the desired outcome of the strategy, they will need guidance from C-level colleagues overseeing other crucial strategic initiatives. More broadly, companies must have two types of people to unleash the potential of machine learning.

  • “Quants” are schooled in its language and methods.
  • “Translators” can bridge the disciplines of data, machine learning, and decision making by reframing the quants’ complex results as actionable insights that generalist managers can execute.
Effective machine learning requires access to troves of useful and reliable data: consider Watson’s ability, in tests, to predict oncological outcomes better than physicians, or Facebook’s recent success teaching computers to identify specific human faces nearly as accurately as humans do. A true data strategy starts with identifying gaps in the data, determining the time and money required to fill those gaps, and breaking down silos. Too often, departments hoard information and politicize access to it—one reason some companies have created the new role of chief data officer to pull together what’s required. Other elements include putting responsibility for generating data in the hands of frontline managers.
Start small—look for low-hanging fruit and trumpet any early success. This will help recruit grassroots support and reinforce the changes in individual behavior and the employee buy-in that ultimately determine whether an organization can apply machine learning effectively. Finally, evaluate the results in the light of clearly identified criteria for success.
5. What’s the role of top management?
Behavioral change will be critical, and one of top management’s key roles will be to influence and encourage it. Traditional managers, for example, will have to get comfortable with their own variations on A/B testing, the technique digital companies use to see what will and will not appeal to online consumers. Frontline managers, armed with insights from increasingly powerful computers, must learn to make more decisions on their own, with top management setting the overall direction and zeroing in only when exceptions surface. Democratizing the use of analytics—providing the front line with the necessary skills and setting appropriate incentives to encourage data sharing—will require time.
C-level officers should think about applied machine learning in three stages: machine learning 1.0, 2.0, and 3.0—or, as we prefer to say,

  1. description, 
  2. prediction, and
  3. prescription. 

They probably don’t need to worry much about the description stage, which most companies have already been through. That was all about collecting data in databases (which had to be invented for the purpose), a development that gave managers new insights into the past. OLAP—online analytical processing—is now pretty routine and well established in most large organizations.

There’s a much more urgent need to embrace the prediction stage, which is happening right now. Today’s cutting-edge technology already allows businesses not only to look at their historical data but also to predict behavior or outcomes in the future—for example, by helping credit-risk officers at banks to assess which customers are most likely to default or by enabling telcos to anticipate which customers are especially prone to “churn” in the near term (exhibit).
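To make the prediction stage concrete, here is a purely illustrative sketch of what a churn score boils down to once a model has been trained. The feature names and coefficients below are invented for this example, not drawn from any bank’s or telco’s actual model:

```python
import math

# Hypothetical feature weights a trained model might produce.
# Every name and number here is invented for illustration.
WEIGHTS = {
    "months_as_customer": -0.04,     # longer tenure lowers churn risk
    "support_calls_last_90d": 0.35,  # frequent complaints raise it
    "monthly_spend_drop_pct": 0.02,  # shrinking spend raises it
}
BIAS = -1.0

def churn_probability(customer):
    """Logistic score: the probability (0 to 1) that this customer churns."""
    z = BIAS + sum(weight * customer[feature] for feature, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

# A long-tenured, quiet customer versus a recently disgruntled one.
loyal = {"months_as_customer": 60, "support_calls_last_90d": 0, "monthly_spend_drop_pct": 0}
at_risk = {"months_as_customer": 6, "support_calls_last_90d": 8, "monthly_spend_drop_pct": 40}

print(round(churn_probability(loyal), 2))    # low score
print(round(churn_probability(at_risk), 2))  # much higher score
```

The business value lies less in the arithmetic than in what a risk officer does with the ranking it produces.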
A frequent concern for the C-suite when it embarks on the prediction stage is the quality of the data. That concern often paralyzes executives. In our experience, though, the last decade’s IT investments have equipped most companies with sufficient information to obtain new insights even from incomplete, messy data sets, provided of course that those companies choose the right algorithm. Adding exotic new data sources may be of only marginal benefit compared with what can be mined from existing data warehouses. Confronting that challenge is the task of the “chief data scientist.”
Prescription—the third and most advanced stage of machine learning—is the opportunity of the future and must therefore command strong C-suite attention. It is, after all, not enough just to predict what customers are going to do; only by understanding why they are going to do it can companies encourage or deter that behavior in the future. Technically, today’s machine-learning algorithms, aided by human translators, can already do this. For example, an international bank concerned about the scale of defaults in its retail business recently identified a group of customers who had suddenly switched from using credit cards during the day to using them in the middle of the night. That pattern was accompanied by a steep decrease in their savings rate. After consulting branch managers, the bank further discovered that the people behaving in this way were also coping with some recent stressful event. As a result, all customers tagged by the algorithm as members of that microsegment were automatically given a new limit on their credit cards and offered financial advice.
The prescription stage of machine learning, ushering in a new era of man–machine collaboration, will require the biggest change in the way we work. While the machine identifies patterns, the human translator’s responsibility will be to interpret them for different microsegments and to recommend a course of action. Here the C-suite must be directly involved in the crafting and formulation of the objectives that such algorithms attempt to optimize.
6. This sounds awfully like automation replacing humans in the long run. Are we any nearer to knowing whether machines will replace managers?
It’s true that change is coming (and data are generated) so quickly that human-in-the-loop involvement in all decision making is rapidly becoming impractical. Looking three to five years out, we expect to see far higher levels of artificial intelligence, as well as the development of distributed autonomous corporations. These self-motivating, self-contained agents, formed as corporations, will be able to carry out set objectives autonomously, without any direct human supervision. Some DACs will certainly become self-programming.
One current of opinion sees distributed autonomous corporations as threatening and inimical to our culture. But by the time they fully evolve, machine learning will have become culturally invisible in the same way technological inventions of the 20th century disappeared into the background. The role of humans will be to direct and guide the algorithms as they attempt to achieve the objectives that they are given. That is one lesson of the automatic-trading algorithms which wreaked such damage during the financial crisis of 2008.
No matter what fresh insights computers unearth, only human managers can decide the essential questions, such as which critical business problems a company is really trying to solve. Just as human colleagues need regular reviews and assessments, so these “brilliant machines” and their works will also need to be regularly evaluated, refined—and, who knows, perhaps even fired or told to pursue entirely different paths—by executives with experience, judgment, and domain expertise.
The winners will be neither machines alone, nor humans alone, but the two working together effectively.
7. So in the long term there’s no need to worry?
It’s hard to be sure, but distributed autonomous corporations and machine learning should be high on the C-suite agenda. We anticipate a time when the philosophical discussion of what intelligence, artificial or otherwise, might be will end because there will be no such thing as intelligence—just processes. If distributed autonomous corporations act intelligently, perform intelligently, and respond intelligently, we will cease to debate whether high-level intelligence other than the human variety exists. In the meantime, we must all think about what we want these entities to do, the way we want them to behave, and how we are going to work with them.
About the authors
Dorian Pyle is a data expert in McKinsey’s Miami office, and Cristina San Jose is a principal in the Madrid office.
June 2015

The networked beauty of forests (TED) & Mother Tree – Suzanne Simard

Learn about the sophisticated underground fungal network that trees use to communicate and even share nutrients. UBC professor Suzanne Simard leads us through the forest to investigate this underground community.

Deforestation causes more greenhouse gas emissions than all trains, planes and automobiles combined. What can we do to change this contributor to global warming? Suzanne Simard examines how the complex, symbiotic networks of our forests mimic our own neural and social networks — and how those connections might make all the difference.


10 IBM Watson-Powered Apps That Are Changing Our World

Nov 6, 2014


IBM is investing $1 billion in its IBM Watson Group with the aim of creating an ecosystem of startups and businesses building cognitive computing applications with Watson. Here are 10 examples that are making an impact.
IBM considers Watson to represent a new era of computing — a step forward to cognitive computing, where apps and systems interact with humans via natural language and help us augment our own understanding of the world with big data insights.
Big Blue isn’t playing small ball with that claim. It has opened a new IBM Watson Global Headquarters in the heart of New York City’s Silicon Alley and is investing $1 billion into the Watson Group, focusing on development and research as well as bringing cloud-delivered cognitive applications and services to the market. That includes $100 million available for venture investments to support IBM’s ecosystem of start-ups and businesses building cognitive apps with Watson.
Here are 10 examples of Watson-powered cognitive apps that are already starting to shake things up.
USAA and Watson Help Military Members Transition to Civilian Life
USAA, a financial services firm dedicated to those who serve or have served in the military, has turned to IBM’s Watson Engagement Advisor in a pilot program to help military men and women transition to civilian life.
According to the U.S. Bureau of Labor Statistics, about 155,000 active military members transition to civilian life each year. This process can raise many questions, like “Can I be in the reserve and collect veteran’s compensation benefits?” or “How do I make the most of the Post-9/11 GI Bill?” Watson has analyzed and understands more than 3,000 documents on topics exclusive to military transitions, allowing members to ask it questions and receive answers specific to their needs.

LifeLearn Sofie is an intelligent treatment support tool for veterinarians of all backgrounds and levels of experience. Sofie is powered by IBM Watson™, the world’s leading cognitive computing system. She can understand and process natural language, enabling interactions that are more aligned with how humans think and interact.


Helping doctors identify treatment options
The challenge
According to one expert, only 20 percent of the knowledge physicians use to diagnose and treat patients today is evidence based, which means that one in five diagnoses is incorrect or incomplete.


Coding the “Brain”

By Francis Tseng
February 25, 2014
(Image source: IDEO)
I have always been interested in automation. In the wake of cheap computing and other technological advancements, more and more repetitive tasks are being automated, from tax preparation to checkout services to manufacturing to product handling. Recently, I’ve found that maintaining a social media presence feels repetitive too. So I’ve wondered: using natural language generation, can I automate my Internet presence? Can an algorithm represent me on my social networks? Is it possible that, on the Internet, no one knows you’re a bot?

To start, I limited my domain to Twitter because its concise, fleeting format makes it a good place for experimentation. And to represent me, I developed the Brain, a bot that learns to imitate certain users (or hand-picked “muses”) and tweets in some amalgamation of their styles.

A few of the muses include webbedspace, daniel_rehn, wwwtext, WilbotOsterman, ftrain, and a_antonellis.

Some example tweets:

The slime’s transforming your arm directly into its own cells! But if you focus, you can push back, and turn its tentacle into your flesh! – webbedspace

As those experiences trend toward the monolithic that density becomes heavier, more likely to crush than to lead to unexpected encounters – WilbotOsterman

By learning from these muses, the Brain sometimes actually tweets a poignant observation:

the mind a true internet of things

Or some kind of strangely elegant aphorism:

people are just more noise

Or vague threats:

anytime i’ve been phased out you drink the pain

So it seems possible to create one’s own algorithmic ambassador to the web. Interested in substituting cold code for human wit? Here’s how….

Act natural

Natural language generation — the name for this domain of challenges — is a multifaceted and, therefore, tricky endeavor. How can an algorithm, which does not really “understand” language, generate grammatical and sensible prose?

The distinction between “grammatical” and “sensible” is important, as both are requirements for text to be considered understandable or “passable” by humans.

For example, the following sentence is grammatical, but not sensible:

we spent the entirety of his nose between his 900 million

The following sort of makes sense — you can get the gist of it — however, the phrase is not grammatical:

operation olympic games the multiyear corruption is

But if you’re willing to toss aside those concerns, there’s a simple approach which is fun if only for its wild unpredictability: Markov chain generation.

Invisible chains

Markov chain generation is an intuitive technique based on the simple concept of sequencing possible events (or words, in this case) so that the probability of each event (or word) depends only on the previous one. Because Markov chain generation is such a simple approach, it doesn’t always produce grammatical or sensible output. But it can be very entertaining. The code can lead to quite a few “happy accidents,” which exploit the human tendency to interpret liberally. Better still? Every once in a while the algorithm spontaneously produces a surprisingly coherent metaphor.

A Markov chain is a system typically represented like so:

[Diagram: three states, A, B, and C, joined by directed edges labeled with transition probabilities]

Each circle is a “state,” and at any given moment the system is in one of these states.

The lines that connect the states (known as “edges”) have values which represent the probability that one state will lead to another state. These edges are “directed,” that is, they go from one state to another in only one direction. For instance, A is connected to C, but C is not connected to A.

In a Markov chain, look at the current state, then roll a die (or more accurately, pick a random value between 0 and 1) to determine what the next state will be.

For instance, in the above diagram, say we start in state A. Then we want to figure out what the next state will be. The two states that A is connected to are states B and C. The edge that connects A to C has a probability of 0.7, and the edge that connects A to B has a probability of 0.3. So we have a 70 percent chance of going to C and a 30 percent chance of going to B.

Imagine that we pick our random number and it’s determined that we now enter state C. Now we repeat the process. State C is connected to B, and it is also connected to itself. This means there is a 40 percent chance that the next state after C will be C again.

Pick a random number again, go to another state (B or C), and this continues ad nauseam.
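The walk just described can be sketched in a few lines of Python. This is an illustration only; the diagram doesn’t fully specify B’s outgoing edges, so the B → A edge here is an assumption:

```python
import random

# Transition probabilities for the example chain.
# B's outgoing edges aren't given in the diagram,
# so B -> A is assumed for illustration.
CHAIN = {
    "A": {"B": 0.3, "C": 0.7},
    "B": {"A": 1.0},
    "C": {"B": 0.6, "C": 0.4},
}

def step(state):
    """Roll a random number between 0 and 1 and pick the next state."""
    roll = random.uniform(0, 1)
    cumulative = 0.0
    for next_state, probability in CHAIN[state].items():
        cumulative += probability
        if roll < cumulative:
            return next_state
    return next_state  # guard against floating-point rounding at the top end

# Walk ten steps starting from state A.
state = "A"
path = [state]
for _ in range(10):
    state = step(state)
    path.append(state)
print(" -> ".join(path))
```

Run it a few times and you’ll see a different walk each run, which is exactly the “ad nauseam” wandering described above.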

Listen to many

So how can we adapt such a system for language generation?

It’s simple — assume that each “state” is a word, and generate a chain of states (words) to create a sentence.

The question is how do we determine the edges (or probabilities) of the system? If we start with the word “the,” how do we know what word we’re likely to go to next?

What we need is a corpus — a collection of text — on which to model the Markov system. In the Brain, this text is collected from other Twitter users (“muses”) and over time the Brain builds a vocabulary and “learns” the probabilities that one word leads to another.

But for demonstration purposes, let’s just use the following text from Space Ipsum, “a space-themed lorem ipsum generator.” Consider each sentence as a document and allow the collection of sentences to be our “corpus”:

The Eagle has landed. The regret on our side is, they used to say years ago, we are reading about you in science class. Now they say, we are reading about you in history class.

Never in all their history have men been able truly to conceive of the world as one: a single sphere, a globe, having the qualities of a globe, a round earth in which all the directions eventually meet, in which there is no center because every point, or none, is center — an equal earth which all men occupy as equals. The airman’s earth, if free men make it, will be truly round: a globe in practice, not in theory.

For those who have seen the Earth from space, and for the hundreds and perhaps thousands more who will, the experience most certainly changes your perspective. The things that we share in our world are far more valuable than those which divide us.

We can process this text to build a vocabulary and learn the relationships between words, accumulating “knowledge” to later generate text.

# We'll use this to keep track of words
# and the probabilities between them.
knowledge = {}

# Split the text into sentences (our "documents").
# Clean up line breaks, lowercase everything, and remove empty strings.
sentences = filter(None, text.lower().replace('\n', ' ').split('.'))

# Generate the "knowledge".
for sentence in sentences:
    # Split the sentence into words.
    # Splitting on whitespace is a decent approach.
    # We also remove empty strings.
    words = list(filter(None, sentence.split(' ')))

    # We want to keep track of the start and end
    # of sentences so we know where to start and end.
    words.insert(0, '<start>')
    words.append('<stop>')

    for idx, word in enumerate(words):
        if idx < len(words) - 1:
            entry = knowledge.get(word, {})

            # Look at the next word so we can
            # build probabilities between words.
            next_word = words[idx + 1]

            # Increment the count of this word
            # in the knowledge.
            if next_word not in entry:
                entry[next_word] = 0
            entry[next_word] += 1
            knowledge[word] = entry

Speak your mind

With training complete, you can start to use this accumulated knowledge to generate sentences.

Sentence generation works like the Markov chain example above. Begin with “<start>”; then look at the words that have started a sentence and randomly select one. Next, look at the word selected. What word comes after this word? Randomly pick one, and so on.

import random

def generate():
    # Start with the start word.
    sentence = ['<start>']

    # Start picking words for the sentence!
    while sentence[-1] != '<stop>':
        try:
            sentence.append(weighted_choice(knowledge[sentence[-1]]))
        except KeyError:
            # No known followers for this word, so stop early.
            break

    # Join the sentence, dropping the markers,
    # with a period for good measure.
    return ' '.join(sentence[1:-1]) + '.'

def weighted_choice(choices):
    """Randomly selects a key from a dictionary,
    where each key's value is its probability weight."""
    # Randomly select a value between 0 and
    # the sum of all the weights.
    rand = random.uniform(0, sum(choices.values()))

    # Seek through the dict until a key is found
    # resulting in the random value.
    summ = 0.0
    for key, value in choices.items():
        summ += value
        if rand < summ:
            return key

    # If this returns False,
    # it's likely because the knowledge is empty.
    return False


With this small set of text, we can get some fairly coherent sentences:

the things that we share in science class.

never in theory.

the qualities of the eagle has landed.

the world are reading about you in history class.

Get smart(er)

Markov chain generation is a nice approach, but after a while grammaticality gets important. When working with much larger sets of text, you might see more gibberish surface. There are more sophisticated approaches that have greater assurances of grammaticality, such as context-free grammars, where you define certain grammatical rules that are adhered to by a recursive algorithm. Such a technique requires a bit more care and craft, but you can expect smarter-sounding results, and perhaps uncover some real pearls of wit and wisdom.
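To give a flavor of the grammar-based approach, here is a minimal sketch of a context-free grammar expanded by a recursive algorithm. The rules and vocabulary are invented for this example and are not taken from the Brain’s code:

```python
import random

# A toy context-free grammar: each nonterminal maps to a list of
# possible expansions; anything not in the grammar is a terminal word.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N":  [["bot"], ["muse"], ["network"]],
    "V":  [["imitates"], ["follows"]],
}

def expand(symbol):
    """Recursively expand a symbol into a list of terminal words."""
    if symbol not in GRAMMAR:
        return [symbol]  # a terminal: return it as-is
    production = random.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part))
    return words

print(" ".join(expand("S")))  # e.g. "the bot imitates the muse"
```

Because every sentence is built by following the rules, the output is grammatical by construction; the craft lies in writing rules rich enough to be interesting.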

Complete code is available here.

Francis Tseng is an interactive developer based out of IDEO’s New York office. He is currently working on Argos, a modern news processing platform for automating the context of stories. You can follow him on Twitter @frnsys and his bot @pub_sci.