Text Analysis on the State of the Union

First up in this episode: a crash course in natural language processing, and the important steps to take if you want to use machine learning techniques on text data. Then we'll take that NLP know-how and talk about a really cool analysis of State of the Union text, one that examines the topics and word choices of every President from Washington to Obama.
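
As a rough sketch of what those first steps can look like: the steps shown here (lowercasing, tokenizing, dropping stopwords, counting words) are common choices and an assumption, not a list taken from the episode. The stopword list and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "of", "and", "a", "to", "is", "in", "our"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def bag_of_words(text):
    """Count the non-stopword tokens, giving a bag-of-words representation."""
    return Counter(tok for tok in tokenize(text) if tok not in STOPWORDS)

# Made-up sentence, not a quote from any actual address.
speech = "The state of the Union is strong, and the Union endures."
counts = bag_of_words(speech)
print(counts.most_common(3))
```

Word counts like these are the kind of numeric features a machine learning model can consume in place of raw text.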

Relevant links:

Text Analysis on the State of the Union
Linear Digressions

Paradigms in Artificial Intelligence

Artificial intelligence includes a number of different strategies for making machines more intelligent, and often more human-like, in their ability to learn and solve problems. An ambitious group of researchers is working right now to classify all the approaches to AI, perhaps as a first step toward unifying these approaches and moving closer to strong AI. In this episode, we'll touch on some of the most provocative work in many different subfields of artificial intelligence, along with the strengths and weaknesses of each.

Relevant links:

Survival Analysis

Survival analysis is all about studying how long until an event occurs--it's used in marketing to study how long a customer stays with a service, in epidemiology to estimate the duration of survival of a patient with some illness, and in social science to understand how the characteristics of a war inform how long the war goes on.  This episode talks about the special challenges associated with survival analysis, and the tools that (data) scientists use to answer all kinds of duration-related questions.
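
One challenge that usually comes up with duration data is censoring: some subjects have not yet had the event when the study ends, so only a lower bound on their duration is known. A minimal sketch of the Kaplan-Meier estimator, a standard tool for this situation, is below. The episode description does not name a specific method, so treating Kaplan-Meier as the example is an assumption, and the data are made up.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each observed event time.

    durations: how long each subject was followed.
    observed:  True if the event happened at that time, False if censored.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still being followed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events (not censorings) that happen exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical customer lifetimes in months; False means still subscribed.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
```

Censored subjects never count as events, but they do count in the at-risk group for as long as they were followed, which is how the estimator uses their partial information.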

Gravitational Waves

All aboard the gravitational waves bandwagon--with the first direct observation of gravitational waves announced this week, Katie's dusting off her physics PhD for a very special gravity-related episode.  Discussed in this episode: what gravitational waves are, how they are detected, and what this announcement means for future studies of the universe.

Relevant links:
http://www.nytimes.com/2016/02/12/science/ligo-gravitational-waves-black-holes-einstein.html
https://www.ligo.caltech.edu/news/ligo20160211

The Turing Test

Let's imagine a future in which a truly intelligent computer program exists.  How would it convince us (humanity) that it was intelligent?  Alan Turing's answer to this question, proposed over 60 years ago, is that the program could convince a human conversational partner that it, the computer, was in fact a human.  60 years later, the Turing Test endures as a gold standard of artificial intelligence.  It hasn't been beaten, either--yet.

Relevant links:
https://en.wikipedia.org/wiki/Turing_test
http://commonsensereasoning.org/winograd.html
http://consumerist.com/2015/09/29/its-not-just-you-robots-are-also-bad-at-assembling-ikea-furniture/

Item Response Theory: How Smart ARE You?

Psychometrics is all about measuring the psychological characteristics of people; for example, scholastic aptitude.  How is this done?  Tests, of course!  But there's a chicken-and-egg problem here: you need to know both how hard a test is, and how smart the test-taker is, in order to get the results you want.  How to solve this problem, one equation with two unknowns?  Item response theory--the data science behind such tests as the GRE.
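
To make the "one equation with two unknowns" concrete, here is a minimal sketch of the one-parameter logistic (Rasch) model, one of the simplest item response models. Choosing this particular model is an assumption; the description above does not say which model the episode covers, and the function and parameter names are made up for illustration.

```python
import math

def p_correct(ability, difficulty):
    """Rasch model: probability that a test-taker answers an item correctly.

    Both arguments live on the same arbitrary scale; only their
    difference affects the result.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# A test-taker whose ability equals the item's difficulty has even odds.
even_odds = p_correct(ability=0.0, difficulty=0.0)

# Shifting ability and difficulty by the same amount changes nothing,
# which is the chicken-and-egg problem in miniature.
same_gap_a = p_correct(ability=1.0, difficulty=0.0)
same_gap_b = p_correct(ability=2.0, difficulty=1.0)
```

In practice the abilities and difficulties are estimated jointly from many people answering many items, which is what breaks the tie.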

Relevant links: 
https://en.wikipedia.org/wiki/Item_response_theory

Great Social Networks in History

The Medici were one of the great ruling families of Europe during the Renaissance.  How did they come to rule?  Not through power, or money, or armies, but through the strength of their social network.  And speaking of great historical social networks, analysis of the network of letter-writing during the Enlightenment is helping humanities scholars track the dispersion of great ideas across the world during that time, from Voltaire to Benjamin Franklin and everyone in between.

Relevant links:
https://www2.bc.edu/~jonescq/mb851/Mar12/PadgettAnsell_AJS_1993.pdf
http://republicofletters.stanford.edu/index.html

How Much to Pay a Spy

A few small encores on auction theory, and then--how can you value a piece of information before you know what it is?  Decision theory has some pointers.  Some highly relevant information if you are trying to figure out how much to pay a spy.
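
One standard way decision theory prices unknown information is the expected value of perfect information: compare the best expected payoff when you must act on your current beliefs with the expected payoff if the true state were revealed first. The sketch below assumes that framing; the scenario, probabilities, and payoffs are all hypothetical.

```python
def expected_value_of_perfect_information(probs, payoffs):
    """probs: {state: probability}; payoffs: {action: {state: payoff}}."""
    # Best expected payoff when acting now, on prior beliefs alone.
    best_without = max(
        sum(probs[s] * payoff[s] for s in probs) for payoff in payoffs.values()
    )
    # Expected payoff if the state is revealed before choosing an action.
    best_with = sum(
        probs[s] * max(payoff[s] for payoff in payoffs.values()) for s in probs
    )
    return best_with - best_without

# Hypothetical beliefs and payoffs (arbitrary units).
probs = {"attack": 0.25, "no_attack": 0.75}
payoffs = {
    "defend": {"attack": 0, "no_attack": -20},
    "stand_down": {"attack": -100, "no_attack": 0},
}
evpi = expected_value_of_perfect_information(probs, payoffs)
print(evpi)
```

Under these made-up numbers, the result is an upper bound on what a perfectly reliable spy would be worth; a less reliable source is worth less.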

Relevant links:
https://tuecontheoryofnetworks.wordpress.com/2013/02/25/the-origin-of-the-dutch-auction/
http://www.nowozin.net/sebastian/blog/the-fair-price-to-pay-a-spy-an-introduction-to-the-value-of-information.html

Sold! Auctions Part 2

The Google ads auction is a special kind of auction, one you might not know as well as the famous English auction (which we talked about in the last episode).  But if it's what Google uses to sell billions of dollars of ad space in real time, you know it must be pretty cool.
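
The papers linked for this episode discuss the generalized second-price (GSP) auction. A simplified sketch follows, assuming ads are ranked purely by bid and each winner pays the next-highest bid per click; real systems also weigh ad quality, which is left out here. The bidder names and bids are made up.

```python
def generalized_second_price(bids, num_slots):
    """bids: {bidder: bid per click}.

    Returns [(bidder, price per click)] ordered from the top ad slot down.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for i in range(min(num_slots, len(ranked))):
        bidder = ranked[i][0]
        # Each winner pays the bid of the advertiser ranked just below,
        # or 0 if there is none (a zero reserve price is assumed).
        price = ranked[i + 1][1] if i + 1 < len(ranked) else 0
        results.append((bidder, price))
    return results

# Hypothetical advertisers competing for two ad slots.
bids = {"a": 4.0, "b": 3.0, "c": 1.0}
allocation = generalized_second_price(bids, num_slots=2)
print(allocation)
```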

Relevant links:
https://en.wikipedia.org/wiki/English_auction
http://people.ischool.berkeley.edu/~hal/Papers/2006/position.pdf
http://www.benedelman.org/publications/gsp-060801.pdf

Going Once, Going Twice: Auctions Part 1

The Google AdWords algorithm is (famously) an auction system for allocating a massive amount of online ad space in real time--with that fascinating use case in mind, this episode is part one in a two-part series all about auctions.  We dive into the theory of auctions, and what makes a "good" auction.   

Relevant links:
https://en.wikipedia.org/wiki/English_auction
http://people.ischool.berkeley.edu/~hal/Papers/2006/position.pdf
http://www.benedelman.org/publications/gsp-060801.pdf

Unlabeled Supervised Learning--whaaa?

In order to do supervised learning, you need a labeled training dataset.  Or do you...?

Relevant links:
http://www.cs.columbia.edu/~dplewis/candidacy/goldman00enhancing.pdf

Zipf's Law

Zipf's law describes the statistics of how word usage is distributed.  As it turns out, this is also strikingly reminiscent of how income is distributed, and populations of cities, and bug reports in software, as well as tons of other phenomena that we all interact with every day.
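
In its usual form, the law says a word's frequency is roughly inversely proportional to its rank in the frequency table. A minimal sketch of that idealized distribution is below; the function name, the vocabulary size, and the exponent of 1 are illustrative assumptions, and real text only follows the law approximately.

```python
def zipf_frequencies(num_words, exponent=1.0):
    """Ideal Zipf frequencies for ranks 1..num_words, normalized to sum to 1."""
    weights = [1.0 / rank ** exponent for rank in range(1, num_words + 1)]
    total = sum(weights)
    return [w / total for w in weights]

freqs = zipf_frequencies(1000)

# Under the ideal law, the top-ranked word is twice as common as the
# second-ranked word and ten times as common as the tenth.
ratio_first_to_second = freqs[0] / freqs[1]
ratio_first_to_tenth = freqs[0] / freqs[9]
```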

Relevant links:
http://economix.blogs.nytimes.com/2010/04/20/a-tale-of-many-cities/
http://arxiv.org/pdf/cond-mat/0412004.pdf
https://terrytao.wordpress.com/2009/07/03/benfords-law-zipfs-law-and-the-pareto-distribution/
