Using Microsoft Cognitive Services to find data insights in social data
Blog|by Jamie Maguire|12 October 2018
In some of my earlier blog posts I introduced Microsoft Cognitive Services and explored some of the APIs such as text analytics and machine vision. In this blog post, we stay with the Cognitive Services theme; specifically, we look at how you can use Microsoft Cognitive Services to help you surface insights in social media data.
Why social media?
Social media has been around for many years now and the uptake of these services has been impressive. Just one look at this infographic by Aryxe shows how much data is generated in only 60 seconds!
Adoption of these online services is only going to increase, and as new services are developed, alternative data structures will be created, thereby adding to the ever-growing sea of unstructured data.
Unlocking the value in social data
People often take to these platforms to share things that are important to them, rant about poor customer service or share content that their friends or colleagues might find interesting. Social data can help businesses keep their finger on the pulse of how their product, brand or service is being perceived online or help them understand their customers better (or even find new customers!).
As many as 500 million tweets are sent every day by users on Twitter! By tapping into the conversations users are having, businesses can glean all sorts of real-time data insights that include, but are not limited to:
- how users feel about a product or service
- user location
- how often users tweet
- topics users tweet about
With so many Tweets being sent on Twitter each day, it's clearly a lot of data to process manually, but this is where Cognitive Services (and the Twitter API) can help you!
Before you access Twitter data, you need to create a Twitter Application from their developer site. After you’ve done that, you’ll be issued with a set of private keys and tokens, which you need to supply to the Twitter API REST endpoints.
A good place to start is with the generic tweet search (GET search/tweets) endpoint. You can find a complete list of all the Twitter API endpoints here. When you define your search parameters (in this case ‘ipad’) and make a request to this endpoint, you’ll get a response like the following:
Granted, there is a lot of data! For this blog post, though, the key field is the text field, which contains the Tweet copy. In this instance it is:
Enter for a chance to Win an Apple Ipad (newest version)! https://t.co/jBCZnfKCuI https://t.co/EbsbV6UCB8
Armed with your Twitter Application keys/tokens and this endpoint, you can then start to build REST requests to invoke the Twitter API search which will return the JSON we’ve just looked at. You can then parse this JSON data to extract the values you’re interested in.
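To make the request and parsing steps concrete, here is a minimal sketch in Python using only the standard library. It is not the code from this post: the v1.1 endpoint URL and the use of an app-only bearer token are assumptions, the helper names are hypothetical, and the sample response is hand-trimmed to the single text field discussed above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the v1.1 URL for GET search/tweets.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_request(query, bearer_token, count=10):
    """Build (but do not send) a GET search/tweets request."""
    url = SEARCH_URL + "?" + urlencode({"q": query, "count": count})
    # Assumption: app-only authentication with a bearer token.
    return Request(url, headers={"Authorization": "Bearer " + bearer_token})


def extract_tweet_texts(response_body):
    """Return the text field of every Tweet in a search response."""
    payload = json.loads(response_body)
    return [status["text"] for status in payload.get("statuses", [])]


# Hand-written, heavily trimmed response: a real one carries many more fields.
sample_response = json.dumps({
    "statuses": [
        {"text": "Enter for a chance to Win an Apple Ipad (newest version)! "
                 "https://t.co/jBCZnfKCuI https://t.co/EbsbV6UCB8"}
    ]
})

request = build_search_request("ipad", bearer_token="YOUR_BEARER_TOKEN")
print(request.full_url)
for text in extract_tweet_texts(sample_response):
    print(text)
```

Nothing is sent over the network here; you would pass the request to an HTTP client of your choice and feed the response body into `extract_tweet_texts`.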
Alternatively, if you prefer not to have to do that, you can opt to use a third-party developer library like Tweetinvi. This NuGet package gives you access to a nice object graph that represents things such as Tweets, Users and other useful information.
Now that we’ve covered the basics of the Twitter Search and the data it returns, it’s time to turn our attention to the other piece of the puzzle – Cognitive Services.
Cognitive Services – Sentiment Analysis
You can find out more about sentiment analysis in one of my other blog posts. In a nutshell, sentiment analysis is a classification problem that is concerned with identifying the underlying emotion (positive, negative) in a given dataset.
I studied this as part of a master’s degree a few years ago and some of that research went into a software API in 2016 that mined Twitter data. It involved building a classifier, sourcing training data and then training the classifier with positive, negative and indifferent sentences. At the time I was able to get an accuracy rate of about 80%.
With Cognitive Services however, you don’t need to do any of this!
In this example, using the Microsoft.Azure.CognitiveServices.Language SDK, you can see how simple it is to apply sentiment analysis to a stream of text, or in our case, a Tweet.
When we run this code, the Cognitive Services API endpoint classifies the supplied text and returns a sentiment score. The test tweet “I love my new iphone” has been assigned a confidence score of 0.97 (or 97%) as being positive in sentiment.
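If you would rather see what travels over the wire than rely on the SDK, the sketch below builds the equivalent REST request and reads the score from a response. It is an illustration in Python, not the SDK code from this post. The v2.0 endpoint path, the `westeurope` region, the 0.5 cut-off and the helper names are all assumptions, and the sample response is hand-written to match the 0.97 score mentioned above.

```python
import json
from urllib.request import Request

# Assumption: Text Analytics v2.0 REST endpoint; region and key are placeholders.
SENTIMENT_URL = ("https://westeurope.api.cognitive.microsoft.com"
                 "/text/analytics/v2.0/sentiment")


def build_sentiment_request(text, subscription_key, language="en"):
    """Build (but do not send) a sentiment request for a single text."""
    body = {"documents": [{"id": "1", "language": language, "text": text}]}
    return Request(
        SENTIMENT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


def read_sentiment_score(response_body):
    """Return the score (0 to 1) of the first document in the response."""
    return json.loads(response_body)["documents"][0]["score"]


def label_for(score, threshold=0.5):
    """Hypothetical helper: scores at or above the threshold count as positive."""
    return "positive" if score >= threshold else "negative"


sentiment_request = build_sentiment_request("I love my new iphone", "YOUR_KEY")

# Hand-written response shaped like the service's JSON, using the score above.
sample_sentiment_response = '{"documents": [{"id": "1", "score": 0.97}], "errors": []}'
score = read_sentiment_score(sample_sentiment_response)
print(score, label_for(score))
```

Scores close to 1 indicate positive sentiment and scores close to 0 indicate negative sentiment, so the threshold is a choice you can tune for your own data.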
Cognitive Services – Keyword Extraction
Another useful feature that can help you extract further data insights is the keyword extraction (Key Phrases) endpoint that ships with the Text Analytics API. It gives you all the main talking points in a sentence.
In this example, you can see how you can easily implement this:
When we parse out the result, we get the following keyword(s) in our console application:
Note: before you try any of this, you’ll need a Cognitive Services account and a set of associated keys in Azure. You can create a free account here!
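The keyword extraction call described above takes the same document-list shape as the sentiment call, which means you can send several Tweets in one request and match the results back by id. The following Python sketch shows that idea. The helper names are hypothetical, and both the input texts and the phrases in the sample response are placeholders, not real output from the service.

```python
import json


def build_key_phrase_body(texts, language="en"):
    """Wrap several texts in one request body, using list position as the id."""
    documents = [
        {"id": str(index), "language": language, "text": text}
        for index, text in enumerate(texts, start=1)
    ]
    return {"documents": documents}


def pair_texts_with_phrases(texts, response_body):
    """Return (text, key phrases) pairs, matched on the document id."""
    payload = json.loads(response_body)
    phrases_by_id = {doc["id"]: doc["keyPhrases"] for doc in payload["documents"]}
    return [
        (text, phrases_by_id.get(str(index), []))
        for index, text in enumerate(texts, start=1)
    ]


tweet_texts = ["first tweet text", "second tweet text"]
key_phrase_body = build_key_phrase_body(tweet_texts)

# Placeholder response: the phrases are made up to show the shape only.
sample_key_phrase_response = json.dumps({
    "documents": [
        {"id": "1", "keyPhrases": ["placeholder phrase a"]},
        {"id": "2", "keyPhrases": ["placeholder phrase b"]},
    ],
    "errors": [],
})

for text, phrases in pair_texts_with_phrases(tweet_texts, sample_key_phrase_response):
    print(text, "->", ", ".join(phrases))
```

Matching on id rather than on position means a document that the service rejects (and reports under errors) simply ends up with an empty phrase list.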
Now that we’ve covered the basics from a Twitter and Cognitive Services perspective, it’s time to look at how you can bring all of this together and build a basic Twitter application that fetches Tweets and applies some Cognitive Service magic to it!
Bringing it all together
For this example, we’ll use the Tweetinvi NuGet package. You could write a set of classes and build your own custom API to construct and execute REST requests against the Twitter API endpoints, but using Tweetinvi is a quick way to consume the Twitter API without getting caught up in OAuth and Twitter API parameters!
In this sample console application, we’ll:
- Build and execute a search against the Twitter API with Tweetinvi, using the hashtag “#machinelearning” and returning a maximum of 10 Tweets
- Loop through each tweet
- Invoke the Cognitive Services Sentiment Analysis and Keyword Extraction endpoints against each tweet
- Output the results
Below you can see the main logic of the console application:
When we run the console application, it makes a call to Twitter, runs a search and applies our Cognitive Services API endpoints to the data we retrieve. You can see an example Tweet being processed by our console application here. The Tweet copy is in italics and the Cognitive Services analysis results are in bold text.
I think you’ll agree, the sentiment is positive and the key phrases in this tweet have been extracted! You can find the complete source code for it here if you want to download it and experiment with the code.
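The four steps of the console application (search, loop, analyse, output) can also be sketched independently of any particular library. The Python below is a hypothetical outline, not the Tweetinvi-based code from this post: the two analysis functions are passed in as parameters, and the stand-ins at the bottom exist only so the sketch runs offline.

```python
from itertools import islice
from typing import Callable, Iterable, List, NamedTuple


class TweetInsight(NamedTuple):
    text: str
    sentiment_score: float
    key_phrases: List[str]


def analyse_tweets(
    tweets: Iterable[str],
    score_sentiment: Callable[[str], float],
    extract_key_phrases: Callable[[str], List[str]],
    limit: int = 10,
) -> List[TweetInsight]:
    """Run both analyses against each Tweet, stopping after `limit` Tweets."""
    return [
        TweetInsight(text, score_sentiment(text), extract_key_phrases(text))
        for text in islice(tweets, limit)
    ]


def print_insights(insights: Iterable[TweetInsight]) -> None:
    """Write one block of output per Tweet."""
    for insight in insights:
        print(insight.text)
        print(f"  sentiment: {insight.sentiment_score:.2f}")
        print("  key phrases: " + ", ".join(insight.key_phrases))


# Stand-ins so the sketch runs without network access; swap in real API calls.
def fake_sentiment(text: str) -> float:
    return 0.5


def fake_key_phrases(text: str) -> List[str]:
    return [word for word in text.split() if word.startswith("#")]


found = ["Reading about #machinelearning today"] * 12
insights = analyse_tweets(found, fake_sentiment, fake_key_phrases)
print_insights(insights)
```

Passing the analysis functions in as arguments keeps the loop testable and makes it straightforward to add another step later, such as image categorisation.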
Obviously, this sample application is basic, wouldn’t scale very well and doesn’t have any fault tolerance built into it. It does, however, show how easily you can blend Cognitive Services with other endpoints such as the Twitter API. You could even add the Cognitive Services Vision API into the mix, which would let you categorise any images in Tweets.
You could further extend this application by writing your own custom API that executes REST requests against the Twitter API endpoints. This would give you complete control over the data you have available and is a more flexible approach. This is the approach that I took when building an AdTech solution that could dynamically identify sales leads and automatically create Twitter Ad Campaigns.
Some other use cases for mining social media data can include, but are not limited to:
- identifying brand advocates
- identifying digital PR disasters
- scraping reviews from sites such as Amazon and surfacing popular terms
- word clouds and visualisation of common themes for current affairs or sporting events
In this blog post we’ve explored the Twitter API and how you can extract Twitter data then apply Microsoft Cognitive Services Text Analytics API to help surface data insights.
We’ve seen how you can identify the sentiment in a collection of Tweets as well as the main keywords or phrases that are present. Are you using or considering implementing any of the Cognitive Services APIs in your software projects?
The source code for this blog post can be found in the GitHub repo here.
You can find out more about the full range of Bing APIs here.
Software Architect, Consultant, Developer, and Microsoft AI MVP. 15+ years’ experience architecting and building solutions using the .NET stack. Into tech, web, code, AI, machine learning, business and start-ups.