Deep Learning

Watson

Deep learning refers to a sub-field of machine learning (systems that examine data, from sensors or databases, and identify complex relationships) that is based on learning several levels of representations, corresponding to a hierarchy of features or factors or concepts, where higher-level concepts are defined from lower-level ones, and the same lower-level concepts can help to define many higher-level concepts.

Deep learning is part of a broader family of machine learning methods based on learning representations. An observation (e.g., an image) can be represented in many ways (e.g., a vector of pixels), but some representations make it easier to learn tasks of interest (e.g., is this the image of a human face?) from examples, and research in this area attempts to define what makes better representations and how to learn them.
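To make the notion of a representation concrete, here is a small sketch in NumPy (the shapes and the random weight matrix are made up for illustration; in a real system the weights would be learned from examples) showing the same observation represented first as a raw pixel vector and then as a more compact feature vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 8x8 grayscale "image": the raw observation.
image = rng.random((8, 8))

# Representation 1: the image as a vector of 64 pixel values.
pixel_vector = image.reshape(-1)

# Representation 2: a compact feature vector obtained by a linear map
# followed by a nonlinearity. W is random here purely to show the shape
# of the idea; a learning algorithm would fit it to data.
W = rng.normal(scale=0.1, size=(10, 64))
features = np.tanh(W @ pixel_vector)

print(pixel_vector.shape)  # (64,)
print(features.shape)      # (10,)
```

Whether the pixel vector or the 10-dimensional feature vector is the "better" representation depends on how easy it makes the task of interest, which is exactly what representation-learning research tries to characterize.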

Several learning algorithms, mostly unsupervised learning algorithms (which find hidden structure in unlabeled data), aim at discovering better representations of the inputs provided during training (the data initially fed to the system). It has been argued that an intelligent machine is one that learns a representation that disentangles the underlying factors of variation that explain the observed data. A Google team led by Andrew Ng and Jeff Dean created a neural network that learned to recognize higher-level concepts, such as cats, only from watching unlabeled images taken from YouTube videos.
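One of the simplest examples of such an unsupervised approach is an autoencoder, which learns to compress its input into a smaller representation and reconstruct it again. The sketch below is a minimal single-layer autoencoder trained by plain gradient descent on toy data; the data, layer size, and learning rate are arbitrary assumptions chosen only to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unlabeled data: 200 examples, each with 20 input features.
X = rng.normal(size=(200, 20))

n_hidden = 5   # size of the learned representation (illustrative)
lr = 0.01      # learning rate (illustrative)

# Encoder and decoder weights.
W_enc = rng.normal(scale=0.1, size=(20, n_hidden))
W_dec = rng.normal(scale=0.1, size=(n_hidden, 20))

for epoch in range(100):
    H = np.tanh(X @ W_enc)   # hidden representation of the inputs
    X_hat = H @ W_dec        # reconstruction of the inputs
    err = X_hat - X          # reconstruction error

    # Gradients of the mean squared reconstruction error.
    grad_dec = H.T @ err / len(X)
    grad_enc = X.T @ ((err @ W_dec.T) * (1 - H**2)) / len(X)

    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

# After training, tanh(X @ W_enc) is a 5-dimensional representation
# of each input, discovered without using any labels.
```

The network never sees a label; it is rewarded only for finding a compact encoding from which the input can be rebuilt, which is one way of pushing it toward representations that capture the underlying structure of the data.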

Deep learning algorithms are based on distributed representations, a notion that was introduced in the 1980s with connectionism (modeling mental or behavioral phenomena as the emergent processes of interconnected networks of simple units). Fundamentally, a distributed representation is one in which meaning is not captured by a single symbolic unit, but rather arises from the interaction of a set of units, normally in a network of some sort. In the case of the brain, the concept of ‘grandmother’ does not seem to be represented by a single ‘grandmother cell,’ but is rather distributed across a network of neurons. The underlying assumption behind distributed representations is that the observed data were generated by the interactions of many factors (not all known to the observer), and that what is learned about a particular factor from any given configuration can often generalize to other configurations.
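The contrast between a symbolic and a distributed representation can be shown in a few lines of NumPy. The feature vectors below are invented for illustration (they are not learned from anything); the point is only that distributed vectors can expose shared factors, whereas one-hot symbols treat every pair of concepts as equally unrelated.

```python
import numpy as np

# Symbolic (one-hot) representations: each concept is a single, isolated unit.
one_hot = {
    "grandmother": np.array([1, 0, 0]),
    "grandfather": np.array([0, 1, 0]),
    "car":         np.array([0, 0, 1]),
}

# Distributed representations: meaning spread across several units.
# Feature order (hypothetical): [person, elderly, female, vehicle]
distributed = {
    "grandmother": np.array([1.0, 0.9, 1.0, 0.0]),
    "grandfather": np.array([1.0, 0.9, 0.0, 0.0]),
    "car":         np.array([0.0, 0.0, 0.0, 1.0]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# One-hot vectors make every pair of concepts equally dissimilar...
print(cosine(one_hot["grandmother"], one_hot["grandfather"]))          # 0.0
# ...while distributed vectors reveal the shared "person" and "elderly" factors.
print(cosine(distributed["grandmother"], distributed["grandfather"]))  # ~0.80
print(cosine(distributed["grandmother"], distributed["car"]))          # 0.0
```

Because the factors are shared across concepts, whatever the system learns about "elderly" from grandmothers can generalize to grandfathers, which is the generalization benefit the paragraph above describes.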

Deep learning adds the assumption that these factors are organized into multiple levels, corresponding to different levels of abstraction or composition: higher-level representations are obtained by transforming or generating lower-level representations. The relationships between these factors can be viewed as similar to the relationships between entries in a dictionary or in Wikipedia, although these factors can be numerical (e.g., the position of the face in the image) or categorical (e.g., is it a human face?), whereas entries in a dictionary are purely symbolic. The appropriate number of levels and the structure that relates these factors is something that a deep learning algorithm is also expected to discover from examples.
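A tiny forward pass through a stack of layers illustrates this composition of levels. The weights here are random stand-ins for what a deep learning algorithm would actually learn, and the choice of three layers with these sizes is arbitrary; in practice the depth and structure are themselves things the algorithm is expected to get right from data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Layer sizes: raw input -> low-level features -> mid-level -> high-level.
sizes = [64, 32, 16, 8]

# Random weights stand in for weights that would normally be learned.
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=64)   # e.g., a flattened 8x8 image
representation = x
for W in weights:
    # Each level transforms the previous level's representation,
    # so higher levels are defined in terms of lower ones.
    representation = np.tanh(representation @ W)

print(representation.shape)   # (8,) -- the highest-level representation
```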
