Video Big Data Whitepaper (FREE download)

The term "Video Big Data" is rarely heard. The reasons are pretty simple:

  1. It's difficult to extract data from videos
  2. It's difficult to make sense of unstructured video data

Therefore, it is no exaggeration to say that video is the most difficult medium to search and extract intelligence from. However, given the amount of video that is generated daily in the public domain (e.g. YouTube) and the private domain (e.g. broadcasters, CCTV, education, etc.), it is also no exaggeration to say that video is the King of Content.

The objective of Big Data is to gain Business Intelligence. Video Big Data is no different. What differs is the source and the type of data that can be extracted from videos.

This Video Big Data Whitepaper aims to explain how we can extract value and intelligence from videos with a 3-step approach:

  1. Extract video data 
  2. Transform unstructured video data
  3. Analyse the data to turn it into intelligence

With this whitepaper, we hope to share some of our knowledge and experience of working with Video Big Data. By our estimates, Video Big Data will dwarf Big Data as we know it, which is why we think this whitepaper matters. We hope you enjoy and benefit from it!

Yours sincerely,

The VideoSpace Team

Bringing AI Video Search to Broadcast Asia

We are super excited about bringing our A.I. Video Search to Broadcast Asia after starting out in the UK, the US and China in 2018. It feels so good to be home!

Babbobox CEO Alex Chan will be talking about "The Age of AI" and how it will transform the entire broadcast and media industry through Video Search, Personalized Content and Video Big Data.

We will also be making a big announcement and showcasing it during the show! We are pretty sure it will blow you away! So do drop by and say Hi!

Video Big Data (Part 3) – From Mess to Intelligence?

The objective of Big Data is to gain Business Intelligence. Video Big Data is no different. What differs is the source and the type of data that can be extracted from videos. Therein lie the main challenges: Extraction, Transformation and Analysis.

In this installment, we will explain why Artificial Intelligence is central to making sense of the “mess” in video big data.

In the first installment (Part 1), we explained:

  • Why Video Big Data will absolutely dwarf current Big Data, and
  • How Video is the most difficult medium to extract data from

In the previous installment (Part 2), we examined:

  • the kind of data elements that we can extract from videos (speech, text, objects, activities, motion, faces, emotions)

But first, let’s examine why there is a mess in video data. The short explanation is that a large part of video data is unstructured, in particular the data from speech and text. For example, text extracted from a 30-minute news segment could cover multiple topics and events, and mention numerous places and persons. To add to the complexity, each word has to be time-aligned with the moment it is spoken. In many ways, on-screen text (e.g. slide presentations that appear in videos) poses the same problem.

Thus, we have to answer two key questions:

  1. How do we make sense of ‘messy’ video data?
  2. How can we extract knowledge or intelligence from that mess?

The answer lies in another branch of Artificial Intelligence (A.I.): Natural Language Processing (NLP). NLP can process and attempt to make sense of unstructured text in the following areas:

  • Topic detection
  • Key phrase extraction
  • Sentiment analysis

This matters because NLP can be used to turn unstructured video data into structured data. Only then can we start making sense of the data and turning it into either intelligence or actionable items like alerts, triggers, etc.
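To make the idea of turning unstructured text into structured data a little more concrete, here is a minimal Python sketch. It is only an illustration: the segments are made up, and simple word counting stands in for real key phrase extraction, which would rely on proper NLP models. The names `segments`, `top_terms` and `STOPWORDS` are our own for this example.

```python
from collections import Counter

# A tiny stopword list, just enough for this illustration.
STOPWORDS = {"the", "a", "of", "in", "and", "to", "is", "on", "for"}

# Hypothetical time-aligned transcript segments: (start_seconds, text).
segments = [
    (0, "The election results in the capital and the election turnout"),
    (600, "Flooding in the north and the flooding warning for the coast"),
]

def top_terms(text, n=2):
    """Return the n most frequent non-stopword terms in the text."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    return [term for term, _ in Counter(words).most_common(n)]

# Each unstructured segment becomes a structured record that can be
# filtered, counted or used to trigger an alert.
structured = [
    {"start": start, "key_terms": top_terms(text)}
    for start, text in segments
]
print(structured)
```

Once each segment is a record with a start time and key terms, questions like "when was the election discussed?" become simple lookups rather than a manual viewing exercise.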

The field of Video Big Data is just starting. Without advances in multiple areas of Artificial Intelligence (Speech Recognition, Computer Vision, Facial Analysis, Text Analytics, etc.), Video Big Data wouldn’t even exist, as it needs these fields to work in tandem or in sequence.

Given the rate at which we are producing videos, alongside our growing ability to extract video data using A.I., the only way is up. We have not even scratched the tip of the Video Big Data iceberg.

Video Big Data will be bigger than BIG. 

VideoSpace will be right in the middle of it all. Let’s put this prediction into a time capsule and revisit it in a few years.

Video Big Data (Part 2) - What kind of Video Data?

In the last installment, we explained:

  • Why Video Big Data will absolutely dwarf current Big Data
  • How Video is the most difficult medium to extract data from

This explains why Video Big Data remains a largely unexplored field. It also means there are immense opportunities available, because we have not even scratched the tip of this huge data iceberg.

In this installment, we will examine the kind of data elements that we can extract from videos. 

1. Speech
In an hour of video, a person can say up to 9,000 words. So imagine the amount of data from speech alone. However, the process of transcribing speech is filled with problems and we are only starting to reach an acceptable level of accuracy.

2. Text
Besides speech, text is probably the second most important element inside videos. For example, in a presentation or lecture, the speaker would augment the spoken session with a set of slides. Another example is the news ticker that appears during a news broadcast.

3. Objects
There are thousands of objects inside a video, appearing at different times. Therefore, it can be quite challenging to identify what objects are in the video content and in which scenes they appear.

4. Activities
The difference between video and still images is motion. Different video scenes contain complex activities, such as “running in a group” or “driving a car”. The ability to extract activities gives a lot of insight into what the videos are about. This includes detecting offensive content that might contain nudity and profanity.

5. Motion
Detecting motion enables you to efficiently identify sections of interest within an otherwise long and uneventful video. That might sound simple, but what if you have 10,000 hours of videos to review every night? Eyeballing every minute of video is a near-impossible task.

6. Faces
Detecting faces from videos adds face detection ability to any surveillance or CCTV system. This will be useful to analyze human traffic within a mall, street or even a restaurant or café. When we include facial recognition, it opens up another data dimension.

7. Emotion
Emotion detection is an extension of Face Detection that returns analysis of multiple emotional attributes from the faces detected. With emotion detection, one can gauge an audience’s emotional response over a period of time.

This list of video data is certainly not exhaustive, but it is definitely a good starting point for the field of Video Big Data. In the next installment, we will examine some of the techniques used to extract this video data.

Yours sincerely,

The VideoSpace Team

Video Big Data (Part I) - An Introduction

Fact: YouTube sees more than 300 hours of video uploaded every minute. That's 18,000 years' worth of video in a year. And that's YouTube alone! If we add all the other videos in the public domain, we wouldn't even know where to start with the numbers.
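For the curious, the yearly figure follows directly from the upload rate. Here is a quick sanity check in Python, assuming a 365-day year and taking the 300 hours per minute at face value:

```python
# Sanity check of the upload arithmetic, assuming a 365-day year.
HOURS_UPLOADED_PER_MINUTE = 300

minutes_per_year = 60 * 24 * 365   # 525,600 minutes in a year
hours_per_year = 24 * 365          # 8,760 hours in a year

hours_uploaded_per_year = HOURS_UPLOADED_PER_MINUTE * minutes_per_year
years_of_video_per_year = hours_uploaded_per_year / hours_per_year

print(hours_uploaded_per_year)   # 157680000
print(years_of_video_per_year)   # 18000.0
```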

However, the even bigger numbers are actually hidden in the private domain from sources like broadcasters, media companies, CCTVs, GoPros, bodycams, smart devices, etc. We are recording videos at an unprecedented pace and scale. 

There is one word to describe this - BIG!

Which brings us to Video Big Data. Or should I say, the lack of it. Even the term "Video Big Data" is rarely heard. The reason is pretty simple: it stems from the inability to extract video data and make sense of it. But there is so much information embedded inside videos waiting to be discovered. It's an absolute goldmine!

So the real question is... how can we extract value from videos?

However, the problem with video is that it is the most difficult medium to work with. There are a few reasons why: 

  • There are so many elements inside a video (speech, text, faces, objects, etc.)
  • It is not static.
  • It is very difficult to extract the various elements of video data. 
  • Each video element requires a different data extraction technique.
  • It is very difficult to make sense of video data because of its unstructured nature.
  • It's expensive to extract data at scale

These problems are real and are preventing the arrival of the Age of Video Big Data. But there is hope yet. With substantial use of Artificial Intelligence, VideoSpace is beginning to crack this enigma.

In the next segment of this "Video Big Data" Series, we will examine how we can tackle these problems and extract value from videos. 

Launch Announcement - “Translated A.I. Video Search” to break Language Barriers for Video Search in Washington D.C.

Washington DC, 5 March 2018: - Babbobox officially announces the launch of the World’s First “Translated A.I. Video Search” at the Microsoft Tech Summit held in Washington DC, United States.

Humans are not only divided geographically, but also by language. Today, we are lifting this language barrier and allowing video search in a language that you do not understand.

Imagine you are doing research on Japanese culture and the only language that you know is English. How would you research videos that are in Japanese? The simple answer is, you can’t. That’s because even the best search engines today can only search for words in the same language as your query. If you key in English, there will not be any results because the videos are in Japanese.

Language is the BIGGEST Search barrier today.

Our “Translated A.I. Video Search” allows you to search in another language. You can search a Japanese video in English (or any other language that you choose) and we will bring you to exactly where the word is said in the video. We can do that in 600 different language pairs.

This means that these Japanese videos are no longer limited by the language barrier, and the knowledge within them is now available not just to watch, but also to search.
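To illustrate the general idea of searching a Japanese video in English, here is a minimal Python sketch. It is not a description of how our engine works internally: the three-word glossary merely stands in for a real machine translation service, and the transcript, timestamps and function names are all hypothetical.

```python
# Toy Japanese-to-English glossary standing in for a real translation service.
GLOSSARY = {"祭り": "festival", "寺": "temple", "茶": "tea"}

# Hypothetical time-aligned Japanese transcript: (start_seconds, word).
transcript = [(3.0, "寺"), (12.5, "祭り"), (40.0, "茶"), (58.0, "祭り")]

def build_translated_index(words, glossary):
    """Index each word under its translation, keeping the original timestamps."""
    index = {}
    for start, word in words:
        translated = glossary.get(word)
        if translated is not None:
            index.setdefault(translated, []).append(start)
    return index

index = build_translated_index(transcript, GLOSSARY)

# An English query now lands on the moments where the Japanese word is spoken.
print(index.get("festival", []))  # [12.5, 58.0]
```

The key point in this sketch is that the timestamps travel with the translation, so a query in one language can still jump to the exact moment in a video spoken in another.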

Unleash your video library’s true potential by allowing your audience to search your videos in their own languages. In the process, we also automatically create a massive amount of Video SEO in multiple languages, thus, allowing other search engines to index and search your videos!

From the extracted video data, we use various NLP (Natural Language Processing) techniques to transform the unstructured big data into a language that you understand, which you can then analyse further, turning data into intelligence.

As of today, our Search Engine has the following languages supported:

  • Speech Recognition - 12 languages
  • Video OCR - 26 languages
  • Documents - 100+ languages
  • Translated Search – 600+ language pairs 

About Babbobox (website: www.babbobox.com)
Babbobox enables organisations to unleash potential value in their digital assets by using A.I. Search. Babbobox developed four world-first breakthrough A.I. solutions, including our Unified Search Engine that combines numerous advanced technologies like Speech Recognition, Video OCR, Image Analysis, Artificial Intelligence, Translation and Enterprise Search. We are the only solution today that empowers you with the ability to index and search across all digital assets (documents, images, audio and videos) on a single platform. We then treat the extracted data as Unstructured Big Data and analyse it using various Natural Language Processing (NLP) techniques.

We transform your unstructured data into intelligence. Made possible by A.I.

To find out more, please CONTACT US

Bringing A.I. Video Search to DC

Super excited about bringing our A.I. Video Search to Washington D.C. at the Microsoft Tech Summit. We hope to do Asia and Singapore proud (as it looks like we are the only ones)!

On top of that, we are also super excited about the announcement that we will be making during Tech Summit. We believe it's another WORLD'S FIRST! This will bring the world closer, in terms of knowledge, language and data.

If you think our ability to search 7 video elements (speech, text, objects, motion, faces, emotions, offensive content) is awesome...

What we are going to announce next will blow you away! Watch this space!

Babbobox featured on The Record for their World's First

Our launch of the "World's First Video Search Engine with Interactive Results" in Birmingham (UK) was picked up by The Record and given some airtime. It feels great to get that bit of recognition, in front of a global audience, for doing what we do.

Click HERE for the article.

Note: The Record is a global magazine featuring the Best of Enterprise Technology on The Microsoft Platform.

Thank you Birmingham... Hello Washington!

Finally, 2 intensive days of MS Tech Summit in Birmingham... done and dusted. Absolutely the right decision to come to the UK to do this. Massive event! Exactly the right platform to showcase our Video Search technologies.

Caught up with Scott Guthrie. Held many in-depth discussions with UK enterprises, universities, government agencies, etc. If we have our way, our stuff might even end up in Scotland Yard! So let's see...

Good-bye Birmingham... Next stop, Trump-capital Washington in March! I'm excited already...

ANNOUNCEMENT: Global Launch of World’s First “Video Search Engine with Interactive Results”

Birmingham, 24 January 2018: - Babbobox and Infini Videos officially announce the launch of the world’s first “Video Search Engine with Interactive Results” at the Microsoft Tech Summit held in Birmingham, United Kingdom today.

Both tech start-ups Babbobox and Infini Videos believe the future of video search lies in immediate content relevance. Video has proven to be the hardest medium to index because there is so much detail. Aside from the metadata that an editor may have typed in, most archived videos are essentially unstructured data. Often, this is because transcripts are not made and scripts are lost, or there isn’t sufficient timing information to align with the video.

To make sense of this data, techniques such as Speech Recognition, Video OCR, Image Analysis and various Cognitive and Artificial Intelligence services are applied to extract data from media. Since many of the videos in archives contain speech, automatic transcription is a great first step in extracting data from media. With the transcript, an editor is able to search for timecodes in source videos, scrub through those sources, and manually locate viable scenes. This manual process is time-consuming, and not suitable for public use since a text search result does not make for a watchable video.

The innovative “Video Search Engine with Interactive Results” that Babbobox and Infini Videos have co-produced allows a user to search a topic and immediately view the search results as an interactive video. One is able to immediately choose scenes within the video that are relevant to the search. With the full automation of the indexing and the output scene selection, productivity is enhanced and content for the public will scale up. The platform promises increased productivity as it will provide users with fine control over topics and sources, and allow editors to focus on the direction of the stories.

Babbobox and Infini Videos believe that this will be the future of Video Search.

About Babbobox (website: www.babbobox.com)
Babbobox launched two World’s Firsts. The first is the World’s First True Unified Search Engine, which has the ability to index and “Search Everything” (all formats including video, audio and documents), positioning Babbobox to become "The Next Generation of Intelligent Storage". With VideoSpace, we created the World’s First Video-Search-as-a-Service, forming the foundations to enable a new breed of video services for the world.

About Infini Videos (website: www.infinivideos.com)
Infini Videos is a B2B online technology platform for the creation and delivery of HTML5 interactive videos. Infini Videos makes it easy to create engaging interactive videos as well as to access the rich data analytics offered on the platform. The company currently offers branching aka “Choose-your-own-adventure” and 360-degree types of interactivity. In addition to the technology platform, the company also provides specialized creative services as a one-stop solution for clients. Infini Videos is part of Mediacorp’s MediaPreneur Incubator programme.

To find out more, please CONTACT US

Bringing VideoSpace to the World... Starting with Birmingham

Many asked why we were not present at Microsoft Tech Summit Singapore this week... The reason is simple, that's because we will be in the Birmingham leg (24 to 25 Jan) next week instead!

I promise that we will be making a BIG announcement next week. And it will be another World's First! (Hint: We are bringing Video Search to another level...)

It definitely feels great to see Babbobox listed as part of the invited companies to show our wares at this Microsoft Tier-1 event. On top of that, I can also say that we are representing Asia alongside Yamaha.

If any of you happen to be at Tech Summit Birmingham, do drop by and say Hi!

Episode X - A New Search (and a Forceful 2018!)

CLICK to watch how VideoSpace aids the Rebels in their search for the New Death Star...

In search of the new Death Star plans, the Rebels secretly planted thousands of cameras within the Empire in hope of finding clues. However, with thousands of videos and an impending deadline, how can the Rebels possibly search for information within videos?! Fortunately...

In a galaxy, not far away... 

VideoSpace Search Engine is helping the Rebels search inside videos for speech, text, objects, motion, faces, emotions (note: not for stormtroopers since they wear helmets) automatically! Thus, finding vital clues for the New Death Star plans...

In the meantime, as we help the Rebels fight the Evil Empire, we wish you a...

Forceful 2018! 

From, 

All of us at Babbobox

May The Search Be With You!

What is a Video Search Engine? Part V – Detecting Objects

After discussing the ability to video search for Speech, Text, Motion, Face and Emotions, the next big element class is “Objects”.

Essentially, it is an analysis tool that extracts metadata from a video and identifies important objects or entities inside it. Object detection has come to a point where it can detect objects or entities (like cat, flower, computer, etc.) and pinpoint exactly where the scene(s) containing them appear inside the video.

The purpose of Object Detection is to help us better understand the overall content of our videos based on objects detected within the video. It also gives us a time-based understanding of when each object appears within the video. Object Detection basically uses tagging and domain-specific models to identify content and label it with confidence.
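As a rough illustration of what a time-based understanding of objects can look like, here is a minimal Python sketch. The detector output is made up, and the confidence cut-off (`min_confidence`) and gap tolerance (`max_gap`) are illustrative assumptions rather than values from any real system.

```python
from collections import defaultdict

# Hypothetical detector output: (timestamp_seconds, label, confidence).
detections = [
    (1.0, "cat", 0.92), (2.0, "cat", 0.88), (3.0, "flower", 0.40),
    (9.0, "cat", 0.95), (10.0, "computer", 0.81),
]

def appearances(detections, min_confidence=0.5, max_gap=1.5):
    """Group confident detections of each label into (start, end) time ranges."""
    ranges = defaultdict(list)
    for t, label, conf in sorted(detections):
        if conf < min_confidence:
            continue                        # ignore low-confidence labels
        spans = ranges[label]
        if spans and t - spans[-1][1] <= max_gap:
            spans[-1] = (spans[-1][0], t)   # extend the current appearance
        else:
            spans.append((t, t))            # start a new appearance
    return dict(ranges)

print(appearances(detections))
# {'cat': [(1.0, 2.0), (9.0, 9.0)], 'computer': [(10.0, 10.0)]}
```

Here the low-confidence "flower" is dropped, and the cat is reported as two separate appearances because of the long gap between them.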

As with other vision-based video searches, Computer Vision scientists developed Object Recognition on top of deep learning, using deep neural network models to detect and label thousands of objects and scenes in videos.

With Object Detection, it is now possible to search every moment of every video file in your video library and catalog to find every object, as well as its importance.

To find out more about how you can detect objects inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

Announcement – Adding “Object Detection” to VideoSpace Search Engine.

We are delighted to announce that we are adding “Object Detection” capability to our Video Search Engine.

This is a significant milestone as it enhances our already extensive video search capabilities, establishing VideoSpace Search Engine as one of the most powerful Video Search Engines in the world.

This adds to the list of elements that we can index and search inside videos:

  • Objects (NEW!)
  • Speech
  • Text
  • Motion
  • Face
  • Emotion
  • Offensive Content
  • Custom (e.g. Logos, Objects, Landmarks, etc.)

The new “Object Detection” feature enables us to detect entities (like cat, flower, computer, etc.) and pinpoint exactly where the scene(s) containing them appear inside the video.

For example, in this short four-minute VIDEO, we are able to:

  • detect 111 unique entities
  • mark exactly where these 111 entities appear

To find out more about Object Detection in videos, please click HERE

What is a Video Search Engine? Part IV – Detecting Faces and Emotions

The ability to detect faces has been around for some time for real time CCTV systems. However, these systems remain out of reach for many as they are expensive and would need specialized implementation that would drive the cost up higher. Therefore, detecting faces from videos instead is a viable alternative because it instantly adds face detection ability to any CCTV system.

Detecting faces allows you to count people and track their movements by detecting unique faces. Face detection finds and tracks human faces within a video. Multiple faces can be detected and subsequently tracked as they move around.

This will be useful to analyze human traffic within a mall, street or even a restaurant or café. It would be possible to identify and track movement of unique human faces. Therefore, it is possible to perform a headcount of human traffic within the video. 
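As a simple illustration of a headcount, suppose a face tracker has already assigned an ID to each unique face. The Python sketch below uses made-up sightings and hypothetical IDs; recognising that two sightings belong to the same face is the hard part and is assumed to be done upstream.

```python
# Hypothetical face-tracker output: (timestamp_seconds, face_id).
sightings = [(0, "f1"), (1, "f1"), (1, "f2"), (5, "f3"), (6, "f2")]

def first_seen(sightings):
    """Map each unique face ID to the first time it appears."""
    seen = {}
    for t, face_id in sorted(sightings):
        seen.setdefault(face_id, t)   # keep only the earliest timestamp
    return seen

faces = first_seen(sightings)
print(len(faces))  # 3
print(faces)       # {'f1': 0, 'f2': 1, 'f3': 5}
```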

Beyond detecting faces, it is also possible to detect emotions. Emotion Detection is an extension of the Face Detection video search that returns analysis of multiple emotional attributes from the faces detected, for example happiness, sadness, fear, anger, etc.

Recognizing the emotion of a person or crowd over time allows us to track the emotional highs and lows within a particular time-frame. It also allows us to track someone’s emotions at a specific point in time, answering questions like: how did the crowd react when the President made a particular point? Emotion detection can be used to gauge audience responses in scenarios like:

  • Speeches
  • Focus groups
  • Group reactions
  • Interviews

Emotion detection can form a very good baseline for the scenarios above.
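To show how emotional highs and lows over time might be tracked, here is a minimal Python sketch. The emotion scores are invented for illustration, and the 10-second window and the function name `average_by_window` are our own assumptions.

```python
from collections import defaultdict

# Hypothetical per-face emotion scores: (timestamp_seconds, emotion, score 0..1).
observations = [
    (2, "happiness", 0.2), (5, "happiness", 0.4),
    (12, "happiness", 0.9), (14, "happiness", 0.7),
    (21, "happiness", 0.1),
]

def average_by_window(observations, emotion, window_seconds=10):
    """Average the score of one emotion within fixed-length time windows."""
    buckets = defaultdict(list)
    for t, name, score in observations:
        if name == emotion:
            window_start = t // window_seconds * window_seconds
            buckets[window_start].append(score)
    return {start: sum(s) / len(s) for start, s in sorted(buckets.items())}

timeline = average_by_window(observations, "happiness")

# The window with the highest average is the emotional high point.
peak = max(timeline, key=timeline.get)
print(peak)  # 10
```

In this made-up example the audience is happiest in the window starting at 10 seconds, which is the kind of answer one would want when asking how a crowd reacted to a particular point in a speech.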

To find out more about how you can detect faces and emotions inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

What is a Video Search Engine? Part III – Detecting Motion

In Part I and II, we examined how we would be able to search Speech and Text inside videos. In Part III, we will look at one of the first names given to videos – “Motion” Picture. 

So, do all videos have motion? Not necessarily. Not all videos have motion (or movement) all the time, especially in the case of security and surveillance videos.

Detecting motion in videos enables you to efficiently identify sections of interest within an otherwise long and uneventful video. That might sound simple with a single video, but what if you have 10,000 hours of videos to review every night? Eyeballing every minute of video is a near-impossible task.

Motion detection can be used on static camera footage to identify sections of the video where motion occurs.

  • Detect when motion has occurred in videos with stationary backgrounds
  • Eliminate false positives caused by light changes, shadows, small insects, and others
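One common and simple way to detect motion on static camera footage is frame differencing. The Python sketch below shows the idea on tiny made-up greyscale frames. It is an illustration only, not the method used by any particular product, and both thresholds are assumptions chosen for the example.

```python
def changed_fraction(prev, curr, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than pixel_threshold."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_threshold:
                changed += 1
    return changed / total

def motion_frames(frames, pixel_threshold=25, area_threshold=0.05):
    """Indices of frames that differ noticeably from the frame before them."""
    hits = []
    for i in range(1, len(frames)):
        fraction = changed_fraction(frames[i - 1], frames[i], pixel_threshold)
        if fraction > area_threshold:
            hits.append(i)
    return hits

# Tiny 4x4 greyscale frames (0-255), made up for illustration.
still = [[10] * 4 for _ in range(4)]
flicker = [[18] * 4 for _ in range(4)]   # small global light change
moving = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]

print(motion_frames([still, flicker, moving, moving]))  # [2]
```

The per-pixel threshold ignores the slight lighting flicker, while the area threshold requires a meaningful portion of the frame to change. Real systems go much further to rule out shadows, leaves and insects.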

While there are motion sensors that can detect motion in real time, these systems tend to be expensive, which is why most CCTV surveillance systems only record at best. However, there are many scenarios that do not require real-time motion detection, like detecting a car entering a bus lane during peak hours.

Current technology has come to a point where it is able to differentiate between real motion (such as a person walking into a room), and false positives (such as leaves in the wind, along with shadow or light changes). This allows you to generate security alerts from camera feeds without being spammed with endless irrelevant events, while being able to extract moments of interest from extremely long surveillance videos.

To find out more about how you can detect motion inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

What is a Video Search Engine? Part II – Searching Text

In Part I, we found out that there are 7,099 living languages in the world. That includes both written and spoken-only languages. According to Ethnologue (20th edition), out of those 7,099 living languages, 3,866 have a developed writing system.

Which leads us to the second part of our series – Searching Text inside a video. Besides Speech, Text is probably the second most important element from which we can extract data.

For example, consider a presentation or talk given by a speaker. Besides speech, the speaker would augment the session with a set of slides. Therefore, besides the speaker’s voice, text (in the slides) is another set of data that can be captured. This is important because what the speaker says and what is presented in the slides can be vastly different.

Text that can be OCRed during a presentation

The technology to capture this text inside the video is called Video OCR (Optical Character Recognition). Video OCR is derived from OCR, a technology that has been around for a long time.

By strict definition, Optical Character Recognition (OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo or from subtitle text superimposed on an image (source: Wikipedia). The first OCR machine that read characters and converted them into standard telegraph code was invented by Emanuel Goldberg in 1914!

Unfortunately, one hundred years on, OCR technology still has some way to go, especially in supporting more languages and recognizing handwriting. However, with more A.I. and Machine Learning, the hope is that researchers can add more capabilities to what OCR can do now.

However, Video OCR is giving OCR a new lease of life by simply adding another dimension: moving images. Given the amount of video that has never been OCRed before and the amount of video being generated every day, the potential for Video OCR is immense.

To find out more about how you can search TEXT inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

What is a Video Search Engine? Part I - Searching Speech

Of all formats, videos are the most difficult to search. Typically, current search engines can only search for "Title" and "Metadata" of the videos, which are manually keyed in by a human. There is no way to search the content inside the video. For example, how do you find a specific piece of news in a news clip? Or specific words that appear inside a video? How can you find them without actually watching the videos yourself?

Before we even get into the question of what a video search engine is, we need to understand what we can search inside a video. Elements can include Speech, Words (or Text), Motion, Emotions, Faces and Objects.

To kick off this “What is a Video Search Engine?” series, let’s tackle the most obvious of the elements – Speech.

In an hour, a person can say up to 9,000 words. Given the rate at which videos are being produced today, that’s a lot of words. According to the Ethnologue catalogue of world languages, there are currently 7,099 living languages. Obviously, Speech Recognition technology has not been able to keep up with this vast number of languages. However, the good news is (depending on how you see things) that just 23 languages account for more than half of the world’s population.

Languages in the World (Source: www.ethnologue.com)

On the technical aspect of searching speech in videos, the following process is required: 

  1. Transcribe (Speech-to-Text) – transcribing speech in the video
  2. Index - make the speech searchable
  3. Search - brings the users to exactly where the search terms are in the video.
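The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the transcript is made up and stands in for the output of a real speech-to-text step, and the index is a plain in-memory dictionary rather than a real search engine.

```python
from collections import defaultdict

# Step 1 (Transcribe): hypothetical speech-to-text output as (start_seconds, word).
transcript = [
    (0.0, "good"), (0.4, "evening"), (1.2, "the"), (1.4, "election"),
    (2.1, "results"), (45.0, "weather"), (46.3, "election"),
]

def build_index(words):
    """Step 2 (Index): map each lower-cased word to the times it is spoken."""
    index = defaultdict(list)
    for start, word in words:
        index[word.lower()].append(start)
    return index

def search(index, term):
    """Step 3 (Search): return the timestamps (in seconds) where the term is spoken."""
    return index.get(term.lower(), [])

index = build_index(transcript)
print(search(index, "Election"))  # [1.4, 46.3]
```

Each timestamp returned is a point the player can jump to, which is what brings users to exactly where the search term is said in the video.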

The processes involved might sound simple, but the process of transcribing speech is filled with problems. There are factors that can affect the accuracy of speech recognition. For example:

  • heavy localized accent
  • low speech volume
  • bad diction
  • heavy background noise
  • multiple voices speaking at the same time

With the above in consideration, there are a lot of videos that are “not suitable” for machine transcribing: movies, TV shows, anything with mixed audio and sound effects, poorly recorded content with background noise (hiss).

To find out more about how you can search speech inside your videos, visit VideoSpace Video Search Engine or our Video-Search-as-a-Service.

Video Platform from a DevOps perspective - Babbobox CTO, Sabrina Lim at CloudExpo Asia 2017

Babbobox CTO, Sabrina Lim (yes.. she's a female CTO), will be speaking at CloudExpo Asia - DevOps Live at 12.05pm on 12 Oct.

Babbobox CTO Sabrina Lim

Essentially, she'll be speaking about a combination of technologies covering Media, Search, A.I., Cognitive and unstructured data (all the stuff that we are using) from the DevOps perspective. So yes... it'll be a bit geeky and techie!

So if you are at the show, do drop by and say hi!

UK, US... Here we come! Catch us at Microsoft Tech Summit Birmingham and Washington, DC!

Time to get out there and show the world our Babbobox Search Engine

We are excited to be invited to participate in the exclusive Microsoft Tech Summit. As part of our expansion plan, we have decided to take part in the events in Birmingham (UK) and Washington DC (US).

Here are the dates and venue details:

Birmingham, UK

Date: January 24-25, 2018
Venue: National Exhibition Centre (NEC) Birmingham B40 1NT United Kingdom
URL: https://www.microsoft.com/en-gb/techsummit/birmingham

Washington, DC, USA

Date: March 5-6, 2018
Venue: Ronald Reagan Building and International Trade Center
1300 Pennsylvania Avenue NW
Washington, DC 20004
URL: https://www.microsoft.com/en-us/techsummit/washington-dc

If you are in town then, do drop by and say hi! Watch this space for further Tech Summit updates!

#babbobox #videospace #videosearchengine #microsoft