
MIRACLE and the Content Analysis Engine (CAE)
The Multimedia Information Retrieval by Content (MIRACLE) project encompasses the technologies and interfaces required for multiple types of video search, including not only text content searches (using transcripts, subtitles, closed captions, and speech recognition) but also searches based on visual information and speaker segmentation.
Visual information includes both face clustering and scene change detection, allowing users to search video by selecting a face or scene from a list. Speaker segmentation, which differentiates among all speakers, allows users to find each instance where a particular person is speaking.

Indexing is performed by the MIRACLE content analysis engine (CAE), which integrates multiple technologies and research projects to index most multimedia content (YouTube, TV broadcasts, full-length movies, home videos, and even podcasts) and then store the content and index information in a database. The CAE indexes video and multimedia using many different types of content metadata, described in more detail on the indexing and representation techniques page.
Using our 24-hour content acquisition system (URSA), the CAE processes a large number of television channels automatically. Using the MIRACLE pipeline described above, this content can be searched semantically, checked for visual duplicates, and analyzed through several aggregation methods, which often reveal interesting patterns. Using MIRACLE and URSA, the commercial start times and commercial durations of two television talk shows on NBC (Late Night with Jimmy Fallon and The Tonight Show with Jay Leno) were analyzed over about six months.

As the illustration indicates, commercials are aired at various times throughout the program on Late Night. This variance suggests that the editors of Late Night often chose to delay a commercial break until a skit or interview segment was completed. This behavior contrasts with that of The Tonight Show, which places its commercial breaks at very regular times. This example application could help content producers verify the regularity of their programs and help viewers better plan their viewing. For more information about other metadata generated by the CAE, please view our indexing and representation techniques page.
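As a minimal sketch of how such an aggregation might be computed once per-episode metadata has been extracted from the CAE index, consider the following; the episode records and field names here are hypothetical placeholders, not the actual index schema.

```python
# Illustrative sketch only: the CAE's real metadata schema is not shown here.
# We assume each indexed episode yields a list of commercial-break start offsets
# (seconds from the start of the program) and aggregate them per show.
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical per-episode metadata extracted from the CAE index.
episodes = [
    {"show": "Late Night", "breaks": [410, 1150, 1790, 2480]},
    {"show": "Late Night", "breaks": [520, 1240, 1710, 2550]},
    {"show": "Tonight Show", "breaks": [480, 1200, 1800, 2400]},
    {"show": "Tonight Show", "breaks": [485, 1195, 1805, 2395]},
]

# Group the n-th commercial break of every episode by show.
by_show = defaultdict(lambda: defaultdict(list))
for ep in episodes:
    for i, start in enumerate(ep["breaks"]):
        by_show[ep["show"]][i].append(start)

# A larger spread (standard deviation) indicates less regular break placement.
for show, breaks in by_show.items():
    for i, starts in sorted(breaks.items()):
        print(f"{show} break {i + 1}: mean {mean(starts):.0f}s, spread {pstdev(starts):.0f}s")
```

In this sketch, a small spread across episodes would correspond to the regular placement observed for The Tonight Show, while a large spread would correspond to the variable placement observed for Late Night.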
Continuing collaboration with our product development and enterprise business unit has produced an enterprise-level Content Indexing Service (CIS), part of AT&T's Internet Video Management Service (iVMS), that provides much of the same functionality as the CAE while adding provisions for user authentication, content hosting, and the ability to generate different amounts of content metadata according to customized profiles. For more information on the availability and cost of CIS, please click here.
The Content Analysis Engine handles a wide variety of video formats and supports playback on almost any device. Numerous device types are supported for searching and viewing MIRACLE-indexed video: web browsers on any desktop platform, game consoles, and smartphones, with the iPhone having a dedicated speech-enabled mobile application called iMIRACLE. Investigations into app playback and deployment on new platforms can be found on our delivery and consumption page.
MIRACLE also provides web-based search interfaces and a straightforward search API. The metadata that MIRACLE generates is stored in an XML format, so developers can write their own interfaces to access the content. With these simple interfaces, the growing set of services MIRACLE provides can easily be added to internet-based mash-ups, letting developers spend their time building fun and interesting new applications rather than re-implementing MIRACLE services.
For a demonstration of searching within a MIRACLE-indexed video for a spoken phrase, click this link for a search of the word "sequences" in the Neil Sloane talk. The video starts from the first instance of the word "sequences," with other instances marked in the clickable interactive timeline and highlighted in the transcript. Additionally, all of the automatically segmented phrases in the transcript are clickable, allowing an immediate jump to that point in the video.
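The sketch below illustrates how a developer might issue such a phrase query against the search API and read back XML results; the endpoint URL, query parameter, and element names are assumptions made for illustration, not the published MIRACLE schema.

```python
# A minimal sketch of a third-party client; the endpoint URL and XML element
# names below are assumptions for illustration, not the actual MIRACLE API.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

SEARCH_URL = "http://example.com/miracle/search"  # hypothetical endpoint

def search(phrase):
    """Query the (hypothetical) MIRACLE search API and list matching segments."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": phrase})
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    # Assume each hit carries a video id and the time offset of the spoken phrase.
    return [(hit.get("video"), float(hit.get("offset")))
            for hit in root.iter("hit")]

if __name__ == "__main__":
    for video, offset in search("sequences"):
        print(f"{video}: phrase spoken at {offset:.1f}s")
```

A client like this could drive its own playback interface by seeking the video player to each returned offset, much as the interactive timeline in the demonstration does.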
CAE Services expose a cutting-edge suite of processing and metadata generation routines over simple HTTP interfaces, so they can be deployed "in the cloud" and accommodate large-scale applications with ease. The illustration below provides a high-level example of the benefits and applications that the CAE Services provide.

Many of the CAE Services have been defined to operate at a very granular level (usually on single images or other low-bandwidth resources) so that requests can be distributed efficiently and answered with minimal latency. Other services, like CIS, have been created to accommodate general video files, which may require lengthy processing times and large data transfers. Some functions of the CAE Services are described below.
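As a rough illustration of the single-image case, the sketch below posts one frame to a hypothetical face-detection endpoint; the service path, headers, and response format are assumptions for illustration, not the actual CAE Services interface.

```python
# Illustrative only: the service path, headers, and response format are
# assumptions standing in for the fine-grained, single-image CAE Services.
import urllib.request

FACE_SERVICE = "http://example.com/cae/face-detect"  # hypothetical endpoint

def detect_faces(image_path):
    """POST one image to the (hypothetical) face-detection service and return its raw XML reply."""
    with open(image_path, "rb") as f:
        data = f.read()
    req = urllib.request.Request(
        FACE_SERVICE,
        data=data,
        headers={"Content-Type": "image/jpeg"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

# Because each request carries only a single frame, many such calls can be
# spread across cloud instances and answered with low latency.
```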
Technical Documents
AT&T Research at TRECVID 2011
Eric Zavesky, Zhu Liu, Behzad Shahraray, Ning Zhou
TRECVID Workshop,
2011.
[PDF]
[BIB]
NIST Copyright
AT&T Research at TRECVID 2010
Eric Zavesky, Behzad Shahraray, Zhu Liu, Neela Sawant
TRECVID 2010 Workshop,
2010.
[PDF]
[BIB]
NIST Copyright
Sub-Projects
iMIRACLE - Content Retrieval on Mobile Devices with Speech
Visual Semantics for Intuitive Mid-Level Representations
eClips - Personalized Content Clip Retrieval and Delivery
Related Projects
AT&T Application Resource Optimizer (ARO) - For energy-efficient apps
CHI Scan (Computer Human Interaction Scan)
CoCITe – Coordinating Changes in Text
E4SS - ECharts for SIP Servlets
Scalable Ad Hoc Wireless Geocast
Graphviz System for Network Visualization
Information Visualization Research - Prototypes and Systems
Swift - Visualization of Communication Services at Scale
AT&T Natural Voices™ Text-to-Speech
StratoSIP: SIP at a Very High Level
Content Augmenting Media (CAM)
Content Acquisition Processing, Monitoring, and Forensics for AT&T Services (CONSENT)
Social TV - View and Contribute to Public Opinions about Your Content Live
Enhanced Indexing and Representation with Vision-Based Biometrics
AT&T WATSON (SM) Speech Technologies
Wireless Demand Forecasting, Network Capacity Analysis, and Performance Optimization