Archive for October 17th, 2015

October 17, 2015

Unstructured Data

Unstructured data refers to information that is not organized in a predefined manner. Properly formatted computerized data is stored in a database (making it easily retrievable) and labeled with metadata (‘data about data,’ e.g., author, subject, size). Unstructured information has missing or conflicting metadata and may lack contextual clues, which makes it difficult to interpret using traditional programs.

Techniques such as data mining, Natural Language Processing (NLP), and ‘noisy-text’ analytics provide different methods to find patterns in, or otherwise interpret, this information. NLP is a field of artificial intelligence, related to linguistics, that attempts to program computers to understand human languages. There is considerable commercial interest in the field because of its applications to news gathering, text categorization, voice activation, archiving, and large-scale content analysis.
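
As a rough illustration of what finding patterns in noisy text can look like, the Python sketch below normalizes a few informal messages and counts the most frequent terms. Everything in it is an assumption made for the example: the sample messages, the small `NORMALIZE` lookup table, and the helper names `tokenize`, `normalize`, and `top_terms` are hypothetical, and real noisy-text analytics relies on far richer methods than a lookup table.

```python
import re
from collections import Counter

# Hypothetical map from noisy spellings to standard forms (illustrative only).
NORMALIZE = {"u": "you", "gr8": "great", "pls": "please", "thx": "thanks"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def normalize(tokens):
    """Replace known noisy spellings; leave every other token unchanged."""
    return [NORMALIZE.get(token, token) for token in tokens]


def top_terms(messages, n=3):
    """Count normalized tokens across all messages and return the n most common."""
    counts = Counter()
    for message in messages:
        counts.update(normalize(tokenize(message)))
    return counts.most_common(n)


# Made-up sample input standing in for unstructured, noisy text.
messages = [
    "Thx, the service was gr8!",
    "gr8 service, thx u",
    "Pls fix the service",
]

print(top_terms(messages, 1))  # the single most frequent term and its count
```

Even this toy version shows the basic idea: once the noisy spellings are mapped to a common form, a simple count reveals that the messages are mostly about "service".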
