Top Cognitive Services APIs for Developers

Today, we’re in the midst of the biggest leap in machine capability in human history. This period may be remembered as the birth of the age of intelligent machines.

We’re seeing more and more intelligent systems. Major technology corporations have already started building and introducing intelligent machines, intelligent systems, and intelligent apps. Your smartphone isn’t just smart anymore. It is now an intelligent machine, loaded with intelligent apps, that manages your life, including health, scheduling, reminders, chores, shopping, social networking, and recommendations.

“Artificial intelligence is the technology of 2017”

This year, the major technology conferences, including Microsoft Build, SAP Sapphire, Google I/O, Apple WWDC, and Facebook F8, all had one thing in common: artificial intelligence.

Sundar Pichai, Satya Nadella, Mark Zuckerberg, and Tim Cook have all called 2017 a year of AI and machine learning.

Clearly, AI is the “big thing” of 2017.

Machine learning is a branch of artificial intelligence that gives machines the ability to learn from data and make decisions by recognizing patterns, rather than by following hand-written rules. Machine learning is being used everywhere. Websites such as Facebook and LinkedIn use it to recommend posts and friends to you. Amazon uses it to recommend products and personalize your shopping experience. Smartphones use it to organize your photos and make map recommendations. Email clients use it to recognize spam and filter emails.

“Machine learning is a part of our daily lives.”

Today, the study of the human brain and human behavior seems to be the focus of most innovative companies.

Cognitive computing is a discipline of computer science that models how the human brain works and uses machine learning to build platforms for more natural interaction between humans and computers.

Cognitive services are platforms and APIs for building intelligent, machine learning-based applications that interact with humans. They focus on vision, speech, language, and knowledge.

In this article, I share my thoughts about some of the top cognitive services APIs for developers to build intelligent apps.

#4. Apple CoreML

Do you ever wonder how your iPhone or other smart device knows your destination, detects your friends’ faces in images, or suggests words as you text? Let’s call it the magic of machine learning. All smart devices have some kind of machine learning processing behind the scenes.


Image source: Apple

Apple’s CoreML is a native framework introduced with the iOS 11 operating system. It is built on low-level technologies such as Metal and Accelerate and uses both the CPU and GPU to provide maximum performance and efficiency.


Image source: Apple

CoreML supports APIs for vision and natural language processing, and works with GameplayKit.

  • Vision API lets app developers add computer vision features to their apps, including face tracking, face detection, landmarks, text detection, rectangle detection, barcode detection, object tracking, and image registration.
  • Natural Language Processing API lets app developers use machine learning to deeply understand text, with features such as language identification, tokenization, lemmatization, part-of-speech tagging, and named entity recognition.

Learn more about Apple CoreML here: https://developer.apple.com/documentation/coreml

#3. Amazon AI

Amazon AI is a set of cloud-based services that give developers the ability to build intelligent applications with speech, natural language, and image processing functionality.


Image source: Amazon

Amazon AI consists of three services – Lex, Polly, and Rekognition.


Image source: Amazon

  • Amazon Lex provides deep learning-based automatic speech recognition (ASR) and natural language understanding (NLU) so that you can build applications with conversational interfaces, also known as bots or chatbots. Amazon Lex uses the same deep learning technologies that power Amazon Alexa.
  • Amazon Polly is a service that converts text into speech, with a choice of languages and voices.
  • Amazon Rekognition is an image analysis service that detects objects, scenes, and faces, and can search for and compare faces across images.
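
As a rough illustration of what calling one of these services looks like, here is a minimal Python sketch of text-to-speech with Amazon Polly. It assumes the boto3 SDK and its Polly operation synthesize_speech; the helper name synthesize_to_mp3 and the default voice "Joanna" are choices made for this example, not something prescribed by Amazon AI.

```python
"""Minimal sketch: text-to-speech with Amazon Polly through a boto3-style client."""


def synthesize_to_mp3(polly_client, text, voice_id="Joanna"):
    """Return MP3 bytes for `text`.

    `polly_client` is assumed to be a boto3 Polly client, for example
    boto3.client("polly"). It is passed in so that the function can be
    exercised without AWS credentials.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # The audio comes back as a streaming body; read it fully into memory.
    return response["AudioStream"].read()


# Hypothetical usage, assuming boto3 is installed and AWS credentials are set up:
#   import boto3
#   audio = synthesize_to_mp3(boto3.client("polly"), "Hello from Polly")
#   with open("hello.mp3", "wb") as f:
#       f.write(audio)
```

Passing the client in as a parameter keeps the sketch independent of any particular credential setup.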

Learn more about Amazon AI here: https://aws.amazon.com/amazon-ai/

#2. Google Cloud Platform

Google’s cognitive services APIs are part of Google’s cloud machine learning platform, which provides powerful APIs for developers in several areas.


Image source: Google

Google Cloud provides APIs for computer vision, video analysis, speech recognition, natural language processing, and translation.

  • Google Cloud Video Intelligence API makes videos searchable and discoverable by extracting metadata, identifying key nouns, and annotating the content of the video.
  • Google Cloud Vision API enables you to understand the content of an image, including categories, objects and faces, words, and more. Face detection is a common use of the Vision API.
  • Google Cloud Speech API enables you to convert audio to text by applying neural network models in an easy-to-use API.
  • Google Natural Language API lets developers extract information about people, places, events, and much more mentioned in text documents, news articles, or blog posts.
  • Google Cloud Translation API lets developers convert text from a source language to a target language.
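
To give a feel for how these APIs are used, the sketch below builds the JSON body for a Cloud Vision images:annotate request in Python. The endpoint and the field names (requests, image.content, features, type, maxResults) follow the public Vision REST API as I understand it; the helper name build_annotate_request and its defaults are assumptions for illustration. The sketch only builds the payload and does not send it, since authentication is outside the scope of this article.

```python
import base64
import json

# REST endpoint for image annotation; an API key or OAuth token is required to call it.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes,
                           feature_types=("LABEL_DETECTION", "FACE_DETECTION"),
                           max_results=5):
    """Build the JSON-serializable body for one image and a list of features."""
    features = [{"type": t, "maxResults": max_results} for t in feature_types]
    return {
        "requests": [
            {
                # Image bytes are sent inline as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": features,
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real image file read in binary mode.
    body = build_annotate_request(b"placeholder image bytes")
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be posted to VISION_ENDPOINT with any HTTP client once credentials are in place.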

Learn more about Google machine learning here: https://cloud.google.com/products/machine-learning

#1. Microsoft Cognitive Services

Microsoft Cognitive Services is the most comprehensive and advanced set of cloud-based APIs for building intelligent applications. It consists of 29 different APIs that provide functionality for computer vision, speech recognition and translation, natural language processing, knowledge and recommendation, and intelligent search.

 
Microsoft Cognitive Services

Vision API

Vision API is a set that includes Computer Vision, Emotion, Video Indexer, Face, Video, Content Moderator, and Custom Vision Service.

  • Computer Vision API lets you analyze images for content such as adult or racy material with a confidence level, read printed text and handwritten notes, recognize landmarks and celebrities, analyze video in real time, search text and content in video, and generate thumbnails.
  • Emotion API recognizes anger, contempt, disgust, fear, happiness, neutrality, sadness, and surprise in images and videos.
  • Video Indexer generates insights about video content, including metadata, and supports search for spoken words, faces, characters, and emotions.
  • Face API provides face detection with attributes such as a person’s age, color, and sex, along with face verification, face identification, similar-face search, and face grouping.
  • Video API functionality includes stabilizing shaky videos, detecting and tracking faces, detecting motion, and generating video thumbnails.
  • Content Moderator includes image moderation, text moderation, video moderation, and a human review tool.
  • Custom Vision Service allows you to customize a computer vision model for your own unique use case.

Speech API

Speech API provides speech recognition and speech translation services. It includes the Speaker Recognition, Translator Speech, Custom Speech Service, and Bing Speech APIs.

  • Translator Speech API provides functionality to add speech translation to your app from a source language to a target language.
  • Custom Speech Service helps overcome speech recognition barriers such as speaking style, vocabulary, and background noise.
  • Bing Speech API converts spoken audio to text that may be used to build voice enabled apps.
  • Speaker Recognition API provides functionality to identify individual speakers by their voice, for example the speakers in a video.

Language API

Language API provides functionality to implement natural language processing. The Language API set includes LUIS, Bing Spell Check, Web Language Model, Text Analytics, Translator Text, and Linguistic Analysis.

  • Language Understanding Intelligent Service (LUIS) provides tools that enable developers to build their own custom models to interact with users.
  • Bing Spell Check API detects and corrects spelling errors, including word breaks, slang, incorrect person and brand names, and homonyms.
  • Web Language Model API uses language models trained on web-scale data for tasks such as inserting spaces into run-together text (for example hashtags and URLs) and predicting the next word of user input.
  • Text Analytics API detects sentiment, key phrases, topics, and language from text.
  • Translator Text API provides functionality to translate text from one language to another.
  • Linguistic Analysis API uses advanced linguistic analysis tools for natural language processing, giving you access to part-of-speech tagging and parsing. These tools allow you to zero in on important concepts and actions.
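
As an example of the request shape these language APIs expect, here is a small Python sketch that prepares a sentiment request for the Text Analytics API. It assumes the v2.0 REST endpoint, the Ocp-Apim-Subscription-Key header, and the "documents" body format as they were documented around the time of writing; the region "westus", the helper name build_sentiment_request, and the placeholder key are assumptions for illustration, so check the current documentation before relying on them.

```python
import json


def build_sentiment_request(texts, subscription_key, region="westus", language="en"):
    """Return (url, headers, body) for a Text Analytics sentiment call.

    Nothing is sent over the network; the caller can pass the three values
    to any HTTP client as a POST request.
    """
    url = f"https://{region}.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json",
    }
    # Each document needs an id that is unique within the request.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return url, headers, body


if __name__ == "__main__":
    url, headers, body = build_sentiment_request(
        ["The new release is fantastic."], "YOUR-KEY-HERE"
    )
    print(url)
    print(body.decode("utf-8"))
```

The service is expected to answer with one score per document id, which is why the ids are generated explicitly here.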

Knowledge API

Knowledge API includes Recommendations, Academic Knowledge, Knowledge Exploration Service, QnA Maker, Entity Linking, and Custom Decision Service.

  • Recommendations API learns from historical data and predicts what users may be interested in.
  • Academic Knowledge API provides access to academic content in the Microsoft Academic Graph.
  • Knowledge Exploration Service enables interactive search experiences over structured data via natural language inputs.
  • QnA Maker API can be used to build a question-and-answer service from existing content.
  • Entity Linking Intelligence Service API links entities mentioned in text to the things they refer to, using named entity recognition and disambiguation.
  • Custom Decision Service provides contextual decision making that is easy to use and learns rapidly.

Search API

The Bing Search APIs provide functionality to build smarter and more engaging apps and websites using the Bing search engine. They include Autosuggest, Video Search, Image Search, Web Search, News Search, and Custom Search.

Learn more about Microsoft Cognitive Services here: https://azure.microsoft.com/en-us/services/cognitive-services/

Summary

In this article, I shared a list of available cognitive services APIs for developers. Apple, Amazon, Google, and Microsoft are clearly leading the way. In my view, Microsoft Cognitive Services is the most comprehensive, advanced, and intelligent service among them.

