Automatic content recognition

Automatic content recognition (ACR) is an identification technology that recognizes content played on a media device or present in a media file. Devices with ACR support enable users to quickly obtain additional information about the content they see without any user-based input or search efforts. Developers of an application can, for example, use the identification to provide personalized complementary content to viewers.[1]

To start the recognition, a short media clip (audio, video, or both) is selected. This clip can be taken from within a media file or recorded by a device. Through algorithms such as fingerprinting, information is extracted from the actual perceptual content and compared to a database of reference fingerprints, each of which corresponds to a known recorded work.[2] The database may contain metadata about the work and associated information, including complementary media. If the fingerprint of the media clip is matched, the identification software returns the corresponding metadata to the client application.[3]
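The matching flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular ACR vendor: it assumes the clip has already been reduced to one dominant frequency bin per frame (the `peaks` lists), and the names `fingerprint`, `ReferenceDatabase`, `FAN_OUT` and `min_votes` are hypothetical. Hashes are built from pairs of nearby frames, and a match is accepted when enough hashes agree on the same work and the same time offset.

```python
from collections import Counter, defaultdict

FAN_OUT = 3  # assumed value: each frame is paired with the next 3 frames


def fingerprint(peaks):
    """Turn per-frame peak bins into (hash, frame_index) pairs."""
    pairs = []
    for i, anchor in enumerate(peaks):
        for delta in range(1, FAN_OUT + 1):
            if i + delta < len(peaks):
                # The hash combines two peak bins and their distance in frames.
                pairs.append(((anchor, peaks[i + delta], delta), i))
    return pairs


class ReferenceDatabase:
    """Toy in-memory index of reference fingerprints plus metadata."""

    def __init__(self):
        self._index = defaultdict(list)  # hash -> [(work_id, frame_index)]
        self._metadata = {}              # work_id -> metadata dict

    def add_work(self, work_id, peaks, metadata):
        self._metadata[work_id] = metadata
        for h, idx in fingerprint(peaks):
            self._index[h].append((work_id, idx))

    def identify(self, clip_peaks, min_votes=5):
        """Return metadata for the best-matching work, or None."""
        votes = Counter()
        for h, clip_idx in fingerprint(clip_peaks):
            for work_id, ref_idx in self._index.get(h, ()):
                # Hashes from a true match agree on one (work, offset) pair.
                votes[(work_id, ref_idx - clip_idx)] += 1
        if not votes:
            return None
        (work_id, offset), count = votes.most_common(1)[0]
        if count < min_votes:
            return None
        return {"work_id": work_id, "offset_frames": offset,
                **self._metadata[work_id]}
```

A client would call `identify` with the peaks of a recorded clip and receive the stored metadata together with the frame offset at which the clip occurs in the reference work. A production system would additionally need a real feature extractor and a persistent, scalable index.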

Audio-based ACR is commonly used in the market. Its two leading methodologies are acoustic fingerprinting and watermarking. Another common approach uses video fingerprinting.

Acoustic fingerprinting generates unique fingerprints from the content itself. Fingerprinting techniques work regardless of content format, codec, bitrate and compression technique.[4] This makes fingerprinting usable across networks and channels, and it is therefore widely used in the interactive TV, second-screen application and content monitoring sectors.[5][6] Popular apps like Shazam, YouTube, Facebook,[7] Thetake, WeChat and Weibo use audio fingerprinting to recognize the content played from a TV and trigger additional features such as votes, lotteries, topics or purchases.

In contrast to fingerprinting, digital watermarking requires inserting digital tags containing information about the content into the content itself prior to distribution. For example, a broadcast encoder might insert a watermark every few seconds that can be used to identify the broadcast channel, program ID, and time stamp. The watermark is normally inaudible or invisible to users. Terminal devices such as phones or tablets read the watermarks instead of actually recognizing the played content.[8] Watermarking technology is also used in the media protection field to trace where illegal copies originate.[9]
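As an illustration of the kind of information such a watermark might carry, the following Python sketch packs a broadcast channel, program ID and time stamp into a small binary payload with a checksum. The field widths, the byte layout and the names `WatermarkPayload`, `encode_payload` and `decode_payload` are assumptions made for this example and do not correspond to any watermarking standard. Embedding the bytes imperceptibly into the audio or video signal is a separate step that is not shown.

```python
import struct
import zlib
from typing import NamedTuple, Optional

# Hypothetical layout: 16-bit channel, 32-bit program ID, 32-bit time stamp,
# followed by a 32-bit CRC. All fields are big-endian.
_BODY = struct.Struct(">HII")
_CRC = struct.Struct(">I")


class WatermarkPayload(NamedTuple):
    channel_id: int
    program_id: int
    timestamp: int  # seconds since an epoch chosen by the broadcaster


def encode_payload(payload: WatermarkPayload) -> bytes:
    """Serialize the payload and append a CRC-32 of the body."""
    body = _BODY.pack(*payload)
    return body + _CRC.pack(zlib.crc32(body))


def decode_payload(data: bytes) -> Optional[WatermarkPayload]:
    """Return the payload, or None if the length or checksum is wrong."""
    if len(data) != _BODY.size + _CRC.size:
        return None
    body, crc_bytes = data[:_BODY.size], data[_BODY.size:]
    (expected_crc,) = _CRC.unpack(crc_bytes)
    if zlib.crc32(body) != expected_crc:
        return None
    return WatermarkPayload(*_BODY.unpack(body))
```

The checksum lets a reading device discard payloads that were damaged on the way from the broadcast encoder to the terminal, rather than reporting a wrong channel or time stamp.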

Next/Market Insights expects that 2.5 billion devices will be integrated with ACR technology to provide synchronized live and on-demand video viewing experiences.[10]

In 2011, ACR technology was applied to TV content by the Shazam service (later acquired by Apple Inc.), which captured the attention of the television industry. Shazam was previously a music recognition service that recognized music from sound recordings. By using its own fingerprint technology to identify live channels and videos, Shazam extended its business to television programming. In 2012, satellite communications provider DIRECTV partnered with TV loyalty vendor Viggle to provide an interactive viewing experience on the second screen. In 2013, LG partnered with Cognitive Networks (later purchased by Vizio and renamed Inscape), an ACR vendor, to provide ACR-driven interaction.[11] In 2015, ACR technology spread to even more applications and smart TVs. Social applications and TV manufacturers such as Facebook, Twitter, Google, WeChat, Weibo, LG, Samsung, and Vizio TV have used ACR technology either developed by themselves or integrated from third-party ACR providers.[citation needed] In 2016, additional applications and mobile operating systems with embedded automatic content recognition services became available, including Peach, Omusic and Mi OS.[12][13][14]

ACR technology helps audiences easily retrieve information about the content they watch. On smart TVs and in applications with embedded ACR technology, the audience can check the name of a song being played or the description of a movie being watched.[15] In addition, the identified video and music content can be linked to internet content providers for on-demand viewing, to third parties for additional background information, or to complementary media.

Because devices can be "aware" of the content being watched or listened to, second-screen devices can feed users complementary content beyond what is presented on the primary viewing screen. ACR technology can identify not only the content but also the precise location within the content, so additional information can be presented to the user at the right moment. ACR can enable a variety of interactive features based on timestamp, such as polls, coupons, lotteries or purchases of goods.[16]
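Once the position within the content is known, selecting what to show on the second screen reduces to a lookup by timestamp. The Python sketch below assumes a hypothetical per-program "cue sheet" of interactive features with start and end times in seconds. The names `Cue`, `CueSheet` and `active_at` are invented for this example.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start_s: float  # position in the program where the feature appears
    end_s: float    # position where it disappears (exclusive)
    kind: str       # e.g. "poll", "coupon", "lottery", "purchase"
    payload: str    # whatever the client needs to render the feature


class CueSheet:
    """Interactive features for one program, indexed by start time."""

    def __init__(self, cues):
        self._cues = sorted(cues, key=lambda c: c.start_s)
        self._starts = [c.start_s for c in self._cues]

    def active_at(self, position_s):
        """Return the cues that should be visible at the given position."""
        # Only cues that have already started can be active.
        hi = bisect.bisect_right(self._starts, position_s)
        return [c for c in self._cues[:hi] if position_s < c.end_s]
```

A client would call `active_at` with the position reported by the recognition step each time a new match arrives, and show or hide features accordingly.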

Real-time audience measurement is now achievable by applying ACR technology in smart TVs, set-top boxes and mobile devices such as smartphones and tablets. This measurement data is essential for quantifying audience consumption in order to set advertising pricing policies.

For advertisers and content owners, it is vital to know when and where their content has been played. Traditionally, agencies or advertisers had to audit the presentation manually, and at scale this could only be checked through statistical sampling. ACR technology enables automatic monitoring of the content played on TV. Information such as the time of play, duration and frequency can be obtained without any manual effort.[17][18] Many people have expressed some concern[weasel words] about the information that smart TVs send to the companies collecting this data. However, almost every set offers an option to disable this feature.[19]
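The monitoring figures mentioned above (time of play, duration and frequency) can be derived from a log of individual recognitions. The Python sketch below assumes a hypothetical monitoring service that emits one `(content_id, channel, unix_timestamp)` detection every few seconds while the content is on air. The function name `summarize_airings` and the parameters `max_gap_s` and `sample_s` are illustrative assumptions.

```python
from collections import defaultdict


def summarize_airings(detections, max_gap_s=10, sample_s=5):
    """Group periodic detections into airings per content item.

    detections: iterable of (content_id, channel, unix_timestamp) tuples.
    max_gap_s:  detections further apart than this start a new airing.
    sample_s:   assumed time covered by a single detection.
    Returns {content_id: [{"channel", "start", "duration_s"}, ...]}.
    """
    stamps_by_key = defaultdict(list)
    for content_id, channel, ts in detections:
        stamps_by_key[(content_id, channel)].append(ts)

    airings = defaultdict(list)
    for (content_id, channel), stamps in sorted(stamps_by_key.items()):
        stamps.sort()
        start = prev = stamps[0]
        for ts in stamps[1:]:
            if ts - prev > max_gap_s:
                # The gap is too long, so close the current airing.
                airings[content_id].append(
                    {"channel": channel, "start": start,
                     "duration_s": prev - start + sample_s})
                start = ts
            prev = ts
        airings[content_id].append(
            {"channel": channel, "start": start,
             "duration_s": prev - start + sample_s})
    return dict(airings)
```

The number of entries in each list gives the frequency of play, and each entry carries the start time and estimated duration of one airing.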

The alternative approaches are video-based automated content recognition technologies. These are a suite of technologies that revolve around the convergence of video and TV Everywhere.[20] On this view, audio fingerprinting and digital watermarking will be incapable of handling the millions of unique streams going out and the billions of hours of footage to be reviewed, with metadata extracted or enriched in relation to the content in real time. Acoustic fingerprinting is limited by its reliance on a database of reference fingerprints, while digital watermarking relies on intrusive, frame-by-frame imprinting of every piece of content at the production stage.[21] The effectiveness of these techniques has been challenged based on their presumed inability to scale to the amount of video being generated.[22] In practice, for monetization and other user-based ACR applications, the reference database or the presence of watermarks only has to cover those videos that are targets of monetization. For example, a video that is hosted on YouTube and viewed only once does not need to be present in a worldwide ACR database or be impressed with a watermark.

ACR service providers include ACRCloud, Red Bee Media, Digimarc, Gracenote, Kantar Media, Inscape Data Services, and Shazam.