Getting the most out of your data – key takeaways from the Data Transformation event
Blog|21 November 2019
Grey Matter recently delivered an event focused solely on how you can get the most out of your data using Microsoft technologies. The event was split into three sessions delivered by Grey Matter Cloud Architect Danail Dimitrov, Microsoft’s Paul Henwood, and Microsoft AI MVP Jamie Maguire. Hosted at Microsoft’s Reading Campus, the event centred on Azure SQL and Azure Data Services, Microsoft Cognitive Services and Power BI, and how they could help attendees to:
- migrate existing data infrastructure to the cloud
- ingest data from external data sources
- use Azure Cognitive Services to surface AI insights in image and text data
- perform data transformations using Azure Data Factory Data Flows
- easily self-author reports using Power BI
Each session segued naturally into the next to show the complete data story, from extraction and ingestion through transformation, before the data finally came to rest in an Azure Data Lake for reporting purposes.
Key technologies, concepts and code walkthroughs presented throughout the day included:
- Azure Data Lake
- Azure SQL Database
- Data Factory V2
- Data Flow and transforming JSON documents
- Database Experimentation Assistant
- Database Migration Assistant
- Elastic Pools
- Managed Instances
- MariaDB, MySQL, PostgreSQL and Cosmos DB
- SQL Server Migration Assistant
- Storage Accounts and types of storage
Azure Cognitive Services
- Computer Vision API
- Custom Vision API
- Text Analytics API
Reporting and Visualisation
- Power BI
- Real-world use cases
- Real-time integrations
Below we give you a quick overview of what was covered in each session, linking to the full session recording where you can see the deep-dive demos that were delivered on the day.
Azure and Data Services
The opening session set the data scene. Azure SQL Database offers users an intelligent cloud database that learns and adapts, scales on the fly, redefines multi-tenancy, works in your environment, and secures and protects your data.
Two purchasing models are available, DTU-based and vCore-based, alongside Hyperscale, a relatively new service tier that decouples compute and storage nodes. This improves performance and scalability, offering rapid storage growth, nearly instantaneous backups, fast restores, and rapid scale-out and scale-up.
In this session, Danail demoed provisioning an Azure SQL Database and an Elastic Pool, and then moving that database into the Elastic Pool (skip to 23 minutes in the video). Attendees were then shown the different migration tools available to them, with a focus on Microsoft Data Migration Assistant. The different SQL Server options for data storage were also discussed, alongside usage scenarios.
Instagram Graph API and Cognitive Services
After seeing how existing databases can be migrated to Azure, attendees were shown by Jamie how interfaces built in C# connected to the Instagram Graph API and retrieved over 10,000 images.
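Retrieving thousands of items from a Graph-style API usually means following paging links from one response to the next. The demo itself was written in C#; the Python below is only a minimal sketch of that paging loop. It assumes each page is a JSON object with a `data` list and an optional `paging.next` URL, and `fetch_page` is a hypothetical callable you would supply to perform the actual HTTP request.

```python
from typing import Callable, Dict, Iterator, Optional

# Hypothetical type: takes a URL and returns the decoded JSON page as a dict.
FetchPage = Callable[[str], Dict]


def iter_media(first_url: str, fetch_page: FetchPage) -> Iterator[Dict]:
    """Yield media items one by one, following `paging.next` until it is absent."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        # Each page carries its items in a `data` list.
        for item in page.get("data", []):
            yield item
        # The last page has no `next` link, which ends the loop.
        url = page.get("paging", {}).get("next")
```

Because this is a generator, wrapping it in `itertools.islice(iter_media(url, fetch_page), 10000)` would stop requesting further pages once 10,000 items have been read.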
Layers of AI were then applied to this dataset by integrating Azure Cognitive Services Text Analytics and Computer Vision APIs, all of which provided rich insights into the Instagram dataset.
Text Analytics API
Text Analytics API and its capabilities were introduced with demos showing attendees how, with just a few lines of code, the following insights could be surfaced in the image descriptions of Instagram data:
- sentiment analysis
- key phrase extraction
There were some questions around implementing custom classification mechanisms. While no such mechanism exists in the Text Analytics API at the time of writing, it’s certainly possible to create a language model using LUIS to identify the underlying human intent being expressed in text.
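To give a feel for the "few lines of code" involved, here is a minimal Python sketch (the demos themselves used C#) that builds a Text Analytics REST request for either of the two operations above. It assumes the v2.1 REST paths (`sentiment` and `keyPhrases`) that were current in late 2019; the endpoint and key are placeholders, and the helper name `build_text_analytics_request` is made up for illustration. The request is built but not sent.

```python
import json
from typing import Dict, List, Tuple

API_VERSION = "v2.1"  # assumption: the REST version current in late 2019


def build_text_analytics_request(
    endpoint: str, key: str, operation: str, texts: List[str], language: str = "en"
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for an operation such as 'sentiment' or 'keyPhrases'."""
    url = f"{endpoint.rstrip('/')}/text/analytics/{API_VERSION}/{operation}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # subscription key header used by Cognitive Services
        "Content-Type": "application/json",
    }
    # Each document needs a unique id so results can be matched back to inputs.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return url, headers, body
```

The returned values can be passed to `urllib.request.Request(url, data=body, headers=headers)` to make the call with a real endpoint and key.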
Computer Vision and Custom Vision APIs
The Computer Vision API was introduced next, covering the functionality it offers and how the .NET SDK makes it easy to generate human-readable descriptions and tags from image data. For edge cases or unique image processing requirements, it can be better to leverage the power of the Custom Vision API.
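As a rough illustration of what those descriptions and tags look like once they come back, the Python sketch below pulls the best caption and the confident tags out of an image analysis result. It assumes the JSON shape returned by the REST "analyze" operation when the Description and Tags features are requested; the `sample` values are made up for illustration and are not output from the demo.

```python
from typing import Dict, List, Optional, Tuple


def summarise_analysis(result: Dict, min_confidence: float = 0.5) -> Tuple[Optional[str], List[str]]:
    """Return the highest-confidence caption and all tags at or above min_confidence."""
    captions = result.get("description", {}).get("captions", [])
    best = max(captions, key=lambda c: c["confidence"], default=None)
    caption = best["text"] if best else None
    tags = [t["name"] for t in result.get("tags", []) if t["confidence"] >= min_confidence]
    return caption, tags


# Made-up sample in the assumed response shape.
sample = {
    "description": {
        "captions": [{"text": "a dog sitting on a beach", "confidence": 0.92}]
    },
    "tags": [
        {"name": "dog", "confidence": 0.99},
        {"name": "beach", "confidence": 0.87},
        {"name": "boat", "confidence": 0.21},
    ],
}

caption, tags = summarise_analysis(sample)
```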
Jamie demonstrated how the Custom Vision API makes it easy to build custom state-of-the-art computer vision models using a web dashboard, and how the resulting image model can then be consumed in a C# application to make image classification predictions.
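Consuming a classification result typically comes down to picking the most probable tag and deciding whether it is confident enough to act on. The Python sketch below shows that step only; it assumes a prediction result containing a `predictions` list whose entries carry `tagName` and `probability`, and both the threshold and the sample values are illustrative rather than taken from the demo.

```python
from typing import Dict, Optional


def top_prediction(result: Dict, threshold: float = 0.6) -> Optional[str]:
    """Return the most probable tag name, or None if nothing reaches the threshold."""
    predictions = result.get("predictions", [])
    best = max(predictions, key=lambda p: p["probability"], default=None)
    if best is None or best["probability"] < threshold:
        return None
    return best["tagName"]


# Made-up sample in the assumed response shape.
prediction_sample = {
    "predictions": [
        {"tagName": "food", "probability": 0.08},
        {"tagName": "landscape", "probability": 0.91},
    ]
}
```

Returning `None` below the threshold lets the calling application route uncertain images to a fallback, such as manual review.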
Jamie also briefly touched on how Form Recognizer can be used to build middleware that can generate structured data from multiple unstructured data feeds.
For anyone wishing to experiment with the code examples that were used for the Text Analytics, Computer Vision and Custom Vision demonstrations, you can find them on Jamie’s GitHub profile here.
With the data now gathered and enriched, Danail took a look at data storage options available in Azure, going into the many benefits of Storage Accounts. This discussion drilled into Blob storage and the use of Data Lake Gen2 for big data analytics.
Different methods of accessing data stored in Storage Accounts were covered, with a demonstration that showcased accessing second-generation Data Lake files through Storage Explorer. This is the location used to store the data gathered in the previous session.
Danail shared the Azure solution architectures published by Microsoft and talked about the use of Azure Data Factory V2 within the scope of the modern data warehouse architecture. Azure Data Factory V2 was provisioned and used for a simple copy operation between two SQL relational databases. The demonstration then showcased the Azure Databricks-powered Data Flows, converting document-style JSON files into tabular CSV format while gradually manipulating the data as it passed through the Data Flow.
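To make the JSON-to-CSV idea concrete, here is a small Python sketch of the same kind of transformation done by hand. It is not how Data Flows work internally, just an illustration of flattening nested documents into columns. The field names in the usage note are invented, nested keys are joined with a dot, and lists of simple values are joined with a pipe, all of which are assumptions made for this example.

```python
import csv
import io
from typing import Dict, List


def flatten(doc: Dict, prefix: str = "") -> Dict:
    """Flatten nested objects into a single level, e.g. {'a': {'b': 1}} -> {'a.b': 1}."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # Assumes lists hold simple values; they are joined into one cell.
            flat[name] = "|".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


def json_documents_to_csv(documents: List[Dict]) -> str:
    """Convert a list of JSON-style documents into CSV text with a header row."""
    rows = [flatten(d) for d in documents]
    # Use the union of all keys so documents with missing fields still fit the table.
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing fields are written as empty cells
    return buffer.getvalue()
```

For example, a document such as `{"id": "1", "insights": {"sentiment": 0.9}}` would end up with the columns `id` and `insights.sentiment`.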
Power BI and the Microsoft Power Platform
Now that you have your data where you want it and enriched with AI, Paul Henwood demonstrated how you can use Power BI to start visualising it. 70% of organisations don’t believe that they are using their data to the fullest, and the amount of data is growing daily, with the majority of it unstructured and requiring specialist people to analyse it.
This is where Power BI comes in. Power BI unifies various services, supporting both the self-service and enterprise BI data model, to reduce the amount of time that it takes to get insights out of data. Anyone can access and analyse data at any time, from any device.
Paul emphasised the importance of creating a data culture within an organisation as a foundation for rolling out tools like Power BI and the Microsoft Power Platform (Dynamics, Power BI, Power Apps and Power Automate – formerly Flow). He also covered multi-device flexibility, licensing and best practice for building dashboards.
Four key areas were covered with demos:
- Self-service BI and collaboration
- Unified self-service and enterprise BI
- Pervasive AI for BI
- Big data analytics with Azure Data Services
The key takeaway is that Power BI allows you to pull data from one or more data sources, analyse it, visualise it, publish it and collaborate. It works both for those who want to self-serve and for the data scientists who are used to working on enterprise BI and need more functionality.
There were great questions throughout the day, and it was insightful to hear what attendees were working on and to share ideas on how technology in Azure could be used to enhance data usage.
It is clear how Microsoft technologies can be leveraged by all areas of the business to gain more meaningful insights from corporate data.
Grey Matter’s technical services team hold Microsoft certifications in Data Platform, Data Analytics and Datacentre and can execute comprehensive data platform assessments, migrations and deployments. If you’d like some expert advice on the state of your data estate, best practice insights and end-user training, please get in touch with the team at Grey Matter. They’d be happy to help.
Email: firstname.lastname@example.org | Call: +44 (0)1364 655100