re:Invent 2019 - AIM303 > Lab 3: Amazon Comprehend

Lab 3: Amazon Comprehend

NLP analysis of texts

[Figure: Lab 3 architecture diagram (arch.lab3.png)]

Let’s now look into gaining some understanding of the content of our transcribed text using Amazon Comprehend.

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. No machine learning experience required.

There is a treasure trove of potential sitting in your unstructured data. Customer emails, support tickets, product reviews, social media, and even advertising copy represent insights into customer sentiment that can be put to work for your business. The question is how to get at it. As it turns out, machine learning is particularly good at accurately identifying specific items of interest inside vast swathes of text (such as finding company names in analyst reports), and it can learn the sentiment hidden inside language (identifying negative reviews, or positive customer interactions with customer service agents), at almost limitless scale.

Amazon Comprehend uses machine learning to help you uncover the insights and relationships in your unstructured data. The service identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; analyzes text using tokenization and parts of speech; and automatically organizes a collection of text files by topic. You can also use AutoML capabilities in Amazon Comprehend to build a custom set of entities or text classification models that are tailored uniquely to your organization’s needs.

We will use the AWS Management Console in this lab and run the NLP analysis manually.

Real time analysis

On the AWS Management Console, search for the Amazon Comprehend service and click the name. You may land on a splash screen first. If that is the case, click the button labeled Launch Amazon Comprehend to continue. You will also see a Launch Amazon Comprehend Medical button, which opens a specialized version of the service for medical documents. When you land in the Amazon Comprehend console, make sure you’re in the Real-time analysis view. If you’re not, click on Real-time analysis in the navigation frame on the left.
Scroll down to the Input text section. It already comes with a sample text. Quickly read that sample text.
When you scroll down further to the Insights section, you will find the results of the NLP analysis of the above sample text in five categories:
- Entities
- Key phrases
- Language
- Sentiment
- Syntax
Take a moment to explore all the results and connect them to the above sample text.
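
The console runs these five analyses for you. As a minimal sketch of the same step outside the console, each insight category maps to one Amazon Comprehend API call. The sketch assumes the boto3 SDK is installed and that AWS credentials and a region are configured, none of which is part of this lab. The helper names `analyze_text` and `main` are ours, for illustration only.

```python
def analyze_text(client, text):
    """Run the five real-time analyses on one text and return the raw results."""
    # Language detection takes no language code; the other four calls need one.
    languages = client.detect_dominant_language(Text=text)["Languages"]
    # Pick the detected language with the highest confidence score.
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    return {
        "Language": language_code,
        "Entities": client.detect_entities(
            Text=text, LanguageCode=language_code)["Entities"],
        "Key phrases": client.detect_key_phrases(
            Text=text, LanguageCode=language_code)["KeyPhrases"],
        "Sentiment": client.detect_sentiment(
            Text=text, LanguageCode=language_code)["Sentiment"],
        "Syntax": client.detect_syntax(
            Text=text, LanguageCode=language_code)["SyntaxTokens"],
    }


def main():
    # Assumption: boto3 is installed and credentials/region are configured.
    import boto3

    comprehend = boto3.client("comprehend")
    insights = analyze_text(comprehend, "Thank you so much for your help.")
    for category, result in insights.items():
        print(category, "->", result)
```

Calling `main()` sends five requests to the service. Not every language is supported by every call, so a text in a language that, for example, syntax analysis does not cover would need extra handling that this sketch leaves out.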

Now do the same with the transcripts of the sample calls you received from Amazon Transcribe. Below is another sample that you can try. It’s a transcript of the customer audio channel, and it makes quite a positive impression:

Good afternoon! I purchased your flux capacitor and I guess I have an operational issue with it. Do you think you can help me? Oh, excellent, that explains it. Very good! Thank you so much for your help. Yes, bye.

This impression is reflected in the sentiment analysis results of the real-time analysis of the above text. Have a look at the screenshot below.
Now consider the following customer transcript. This customer does not sound happy.

I bought that lousy flux capacitor you had on sale and guess what – that crappy thing doesn’t work!
As expected! Always the same with your shitty products.
No, don’t even try to come with explanations!
No, you listen! I want to give the crap back and you owe me the money!
There you go. Bye!

Paste the above into the Input text field and see how our impression gets reflected in the sentiment analysis results. Take a moment to explore all other results and add some more call recordings if you have the time.
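
The console reports one sentiment for the whole input. To see which utterances drive that result, the transcript can be scored line by line. The following is a minimal sketch under the same assumptions as before (boto3 installed, credentials and region configured); it uses the `BatchDetectSentiment` API, and the helper names `sentiment_per_line` and `main` are hypothetical.

```python
def sentiment_per_line(client, transcript, language_code="en"):
    """Return (line, sentiment) pairs for each non-empty line of a transcript."""
    lines = [line.strip() for line in transcript.splitlines() if line.strip()]
    results = []
    # BatchDetectSentiment accepts at most 25 documents per request.
    for start in range(0, len(lines), 25):
        chunk = lines[start:start + 25]
        response = client.batch_detect_sentiment(
            TextList=chunk, LanguageCode=language_code)
        # Each result carries the index of its document within the chunk.
        # Documents that failed are reported in "ErrorList" and skipped here.
        for item in sorted(response["ResultList"], key=lambda r: r["Index"]):
            results.append((chunk[item["Index"]], item["Sentiment"]))
    return results


def main():
    # Assumption: boto3 is installed and credentials/region are configured.
    import boto3

    comprehend = boto3.client("comprehend")
    transcript = """
    No, don’t even try to come with explanations!
    There you go. Bye!
    """
    for line, sentiment in sentiment_per_line(comprehend, transcript):
        print(f"{sentiment:10} {line}")
```

Scoring per line is a design choice for illustration: short utterances give the model little context, so the per-line labels can differ from the sentiment of the transcript as a whole.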

Topic modeling

Besides real-time analysis, you can also submit batch jobs to Amazon Comprehend to process a whole set of documents in one run. One of the features supported by batch processing is topic modeling, which you can use to determine common themes, such as the subjects covered in a batch of news articles.

Take a moment to explore the documentation landing page for topic modeling with Amazon Comprehend.

As topic modeling needs a certain number of input documents to produce meaningful results (at least 1,000 documents in each topic modeling job), we will not actively perform it in this workshop.
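
For reference, here is a rough sketch of how such a batch job could be submitted once you have enough documents. It assumes boto3 with configured credentials, input documents already stored in Amazon S3 (one document per file), and an IAM role that Amazon Comprehend can assume to read and write those locations. The bucket names, role ARN, job name, and the helper `build_topics_job_request` are placeholders of ours, not resources from this workshop.

```python
def build_topics_job_request(input_s3_uri, output_s3_uri, role_arn,
                             number_of_topics=10,
                             job_name="call-transcript-topics"):
    """Assemble the parameters for a StartTopicsDetectionJob request."""
    return {
        "JobName": job_name,
        "NumberOfTopics": number_of_topics,
        "InputDataConfig": {
            "S3Uri": input_s3_uri,
            # Alternative: "ONE_DOC_PER_LINE" for one document per line.
            "InputFormat": "ONE_DOC_PER_FILE",
        },
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
    }


def main():
    # Assumption: boto3 is installed and credentials/region are configured.
    import boto3

    comprehend = boto3.client("comprehend")
    request = build_topics_job_request(
        "s3://example-bucket/transcripts/",       # hypothetical input prefix
        "s3://example-bucket/topics-output/",     # hypothetical output prefix
        "arn:aws:iam::123456789012:role/ExampleComprehendRole",  # placeholder
    )
    job = comprehend.start_topics_detection_job(**request)
    print(job["JobId"], job["JobStatus"])
    # The job runs asynchronously; its status can be checked later with
    # comprehend.describe_topics_detection_job(JobId=job["JobId"]).
```

The request is built in a separate function so that it can be inspected before anything is submitted; only `main()` talks to the service.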

Conclusion

Congratulations!

You have successfully run some NLP analyses with Amazon Comprehend and explored the insight categories it reports.

We’d like to encourage you to explore the Amazon Comprehend documentation in more detail after this workshop, starting with the detailed overview in What Is Amazon Comprehend?, followed by the list of currently supported languages in Languages Supported in Amazon Comprehend.

We will now look at how to build a processing pipeline that runs the transcription and comprehension jobs automatically for any newly replicated call recording.