How Google’s BERT Changed Natural Language Understanding
Are you a developer, data analyst, search engine optimization specialist, or just genuinely interested in technology? Then you’ve probably already heard of Google’s latest and largest change to its search algorithm. Catch up on what exactly BERT is and the technologies behind it. More importantly, get an understanding of the impact that these technologies – made up of a little something called Natural Language Processing or NLP – will have on our future.
What is Natural Language Processing?
Natural Language Processing (NLP) traces its roots to 1950, when Alan Turing proposed what became known as the Turing test. NLP is a subfield of linguistics, computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, and specifically with how an application can understand large amounts of natural language data.
Long story short? It helps search engines, smart speakers, digital assistants, and other devices understand what you’re saying and, more importantly, respond with the best result possible.
So Why Is NLP so Hard to Perfect?
Languages, especially English, are hard to build algorithms and applications around because so many words are ambiguous, a problem known as lexical ambiguity. Almost every other word in the English language has multiple meanings. Lexical ambiguity is resolved by context, so meaning can only be understood at the sentence level rather than the word level. For example, consider the different senses of the word “grounded”:
- She grounded the wire.
- She was grounded for the rest of the week.
- She is a well-grounded person.
- She grounded the airplane.
- She grounded her ship on a coral reef.
- She grounded to second base.
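To make the point concrete, here is a minimal Python sketch of why the word alone is not enough. It is only an illustration: the sense labels and cue words in `SENSES` are invented for this example, and real systems learn such associations from data rather than from a hand-written table.

```python
# Hypothetical mini-lexicon: each sense of "grounded" is paired with cue words
# that tend to appear near it. These cue lists are invented for illustration.
SENSES = {
    "electrical": {"wire", "circuit", "outlet"},
    "punished": {"week", "parents", "curfew"},
    "aviation": {"airplane", "flight", "pilot"},
    "nautical": {"ship", "reef", "shore"},
    "baseball": {"base", "inning", "shortstop"},
}

def guess_sense(sentence: str) -> str:
    """Pick the sense whose cue words overlap most with the sentence."""
    words = {w.strip(".,").lower() for w in sentence.split()}
    best = max(SENSES, key=lambda sense: len(SENSES[sense] & words))
    # With no overlapping cue word, the word alone tells us nothing.
    return best if SENSES[best] & words else "unknown"

print(guess_sense("She grounded the wire."))
print(guess_sense("She grounded her ship on a coral reef."))
print(guess_sense("grounded"))
```

Given the surrounding sentence, the sketch can pick a sense. Given only the bare word, it has nothing to go on and returns "unknown".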
There are also homophones and near-homophones, words that sound alike but differ in spelling and meaning:
- There, Their
- To, Too, Two
- Where, Were, Wear
- Whether, Weather, Wether
In another example, here are two different phrases that sound identical when spoken:
“four candles” and “fork handles”
Ok, Got It. So, Where Does a Neural Network Come In?
A neural network is a computing system loosely modeled on the human brain and nervous system. It’s designed to recognize patterns in raw, real-world data (images, sound, text, time series) by labeling or clustering that data.
Neural networks are components of larger machine learning applications, including algorithms for reinforcement learning, classification, and regression. Long story short, the neural network is the machine learning “brain” that finds the patterns in natural language data.
Ok, so Give Me an Example of When Neural Networks Are Used.
Neural networks can be applied to many things in your daily life, including:
- Classifying certain emails as spam
- Detecting fraud
- Detecting customer satisfaction
- Detecting diseases early
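As a rough illustration of the spam example, the sketch below trains a single artificial neuron (a perceptron) on a handful of messages. Everything here is an assumption made for illustration: the four training messages are invented, and a production spam filter would use a far larger network and dataset.

```python
# Toy training data, invented for illustration: (message, label) with 1 = spam.
TRAINING = [
    ("win free money now", 1),
    ("free prize claim now", 1),
    ("meeting notes attached", 0),
    ("lunch tomorrow at noon", 0),
]

VOCAB = sorted({word for text, _ in TRAINING for word in text.split()})

def features(text: str) -> list[int]:
    """Bag-of-words vector: 1 if the vocabulary word occurs in the text."""
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in VOCAB]

def predict(weights: list[float], bias: float, x: list[int]) -> int:
    """A single artificial neuron with a step activation."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0

def train(epochs: int = 10, lr: float = 1.0) -> tuple[list[float], float]:
    """Perceptron learning rule: nudge weights whenever a prediction is wrong."""
    weights, bias = [0.0] * len(VOCAB), 0.0
    for _ in range(epochs):
        for text, label in TRAINING:
            x = features(text)
            error = label - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

weights, bias = train()
print(predict(weights, bias, features("claim your free money")))   # 1 = spam
print(predict(weights, bias, features("notes from the meeting")))  # 0 = not spam
```

The neuron is never told which words signal spam. It finds that pattern on its own by adjusting its weights each time it mislabels a training message.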
Got It, so What Is Google’s BERT?
Google introduced BERT (Bidirectional Encoder Representations from Transformers) to its search engine backend on October 21, 2019. At launch, BERT applied to English-language search queries and featured snippets, affecting about 10% of all search queries. BERT’s neural network-based technique for natural language processing pre-training improves search results for complicated search queries that depend on context.
“Bidirectional” means the model looks at the text both before and after any given word. “Transformers” are models that process each word in relation to all the other words in a sentence, rather than one by one in order. In layman’s terms, this new algorithm helps Google better discern the context of the words in a search query.
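The idea of relating one word to every other word in the sentence can be sketched with a stripped-down version of the attention calculation used in transformers. This is a simplification under stated assumptions: the two-dimensional vectors in `EMBEDDINGS` are made up, and a real model like BERT uses learned projections, many attention heads, and many layers.

```python
import math

# Toy 2-dimensional word vectors, invented for illustration (not real BERT embeddings).
EMBEDDINGS = {
    "she": [0.1, 0.3],
    "grounded": [0.5, 0.5],
    "the": [0.1, 0.1],
    "airplane": [0.9, 0.2],
}

def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(tokens: list[str], position: int) -> list[float]:
    """Weights the token at `position` assigns to every token in the sentence."""
    query = EMBEDDINGS[tokens[position]]
    scale = math.sqrt(len(query))
    # Scaled dot product between the query word and every word, left and right.
    scores = [
        sum(q * k for q, k in zip(query, EMBEDDINGS[token])) / scale
        for token in tokens
    ]
    return softmax(scores)

tokens = ["she", "grounded", "the", "airplane"]
weights = attention_weights(tokens, position=1)
for token, weight in zip(tokens, weights):
    print(f"{token:>9}: {weight:.2f}")
```

Note that “airplane”, which comes after “grounded”, still receives a weight. That is the bidirectional part: a strictly left-to-right model would not be allowed to look at it.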
According to Google, “These improvements are oriented around improving language understanding, particularly for more natural language/conversational queries, as BERT is able to help Search better understand the nuance and context of words in Searches and better match those queries with helpful results.”
“Particularly, for longer and more conversational queries, or searches where prepositions like ‘for’ and ‘to’ matter a lot to the meaning. Search will be able to understand the context of the words in your query. You can search in a way that feels natural for you.”
Some exciting developments BERT brings to the NLP field are:
- Pre-training from unlabeled text
- Bi-directional contextual models
- The use of a transformer architecture
- Masked language modeling
- Focused attention
- Textual entailment (next sentence prediction)
- Disambiguation through context
- Open-sourced models and code
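Masked language modeling, one of the items listed above, is what lets BERT pre-train on unlabeled text: some words are hidden and the model has to predict them from the words on both sides. The sketch below shows only the data-preparation step, as a simplified illustration. The `mask_tokens` helper is hypothetical, and although the original BERT paper masks about 15% of tokens, it also sometimes swaps in a random word or leaves the word unchanged, which this sketch does not do.

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Hide a random subset of tokens and remember the originals as targets."""
    rng = random.Random(seed)
    # Always mask at least one token so short sentences still yield a target.
    count = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), count))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]   # the word the model must recover
        masked[pos] = MASK_TOKEN     # what the model actually sees
    return masked, targets

sentence = "she grounded her ship on a coral reef".split()
masked, targets = mask_tokens(sentence)
print(masked)
print(targets)
```

Because the hidden words come from the text itself, no human labeling is needed: any sentence can become a training example.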
How Can I Optimize My Website for BERT?
We could say, like everyone else does, that you can’t optimize for BERT. But you can, and it’s simple. Create great content that is simple to understand, grammatically correct, and answers any questions that users may have, and do it better than anyone else. If you’re already doing that, keep it up!
This addition to Google Search was implemented not only to better understand how users search, but also to better understand your website’s content and topics. In turn, it will match your content to the user’s search intent with higher accuracy.
Search engines are still in their infancy and have a long way to go. For example, search will continue to move further, or even entirely, toward voice, which is why natural language understanding is so important. These models and practices will also help in fields unrelated to search.
How Will BERT Impact the Future?
Improvements are continually being made to BERT through its open source models found on GitHub. Google’s AI team has also worked with the Toyota Technological Institute at Chicago on a state-of-the-art (SOTA) natural language processing model named ALBERT. Interestingly enough, other companies have also built on BERT in their own models. Microsoft’s MT-DNN and Facebook’s RoBERTa have already beaten the original BERT model in benchmark testing.
As you can see, BERT has dramatically advanced Natural Language Processing and Machine Learning, but there is still much more potential. Further advancements will not only improve traditional search results, but also voice search, smart homes, chatbots, business analytics, automation, and much more.
Here are some common areas of business currently leveraging NLP for increased returns that will only improve over time:
- Using NLP to exchange market intelligence with company stakeholders.
- Chatbots have become a solution for customer call centers. Chatbots provide conversational, human-like assistance to customers that reduces call volume while improving customer experience and loyalty. In the future, chatbots will be able to have even more natural conversations, to the point where you will not be able to tell whether you are speaking to a real person or a bot.
- Business operators are increasingly reliant on social, search, and BI data to monitor customer sentiment. Much of this data is text and requires NLP for sentiment analysis.
- NLP has reliably automated several customer service functions.
- NLP has also helped advertising funnels target segmented customers based on their feelings and sentiments during the buyer’s journey.
- Phones can use NLP to order movie tickets and request dining reservations.
- Omni-channel ecosystems – consistent conversational engagement across all channels, without losing track of who the user is and what they are interested in.
BERT can also be used for content writing and search engine optimization. This writer ran an NLP analysis on this article, and the following results show how it was interpreted (percentage scores not included):
- Sentiment – Identifies and extracts subjective information in source material, helping a business understand the social sentiment of its brand, product, or service while monitoring online conversations.
  - Positive
- Emotion – Analyzes the overall emotion and the targeted emotion of the content.
  - Joy
- Keywords – Determines important keywords ranked by relevance.
  - Google’s BERT Impact
  - Natural Language Understanding’s Future
  - search algorithm
  - NLP dates
  - search engines
  - Neural Networks
  - Natural Language Processing
  - basic understanding of what Google
  - Bi-Directional Encoder Representations Transformers
- Entities – Extracts people, companies, organizations, cities, geographic features, and other information from the content.
  - BERT – Person
  - Google – Company
  - BERT – Company
  - Natural Language Processing – PrintMedia
- Categories – Classifies content into a hierarchy that’s five levels deep with a score.
  - / technology and computing / internet technology / web search / people search
  - / technology and computing / internet technology / social network
- Concept – Identifies general concepts that may not be directly referenced in the text.
  - Information retrieval
  - Natural language processing
  - Artificial intelligence
  - Linguistics
  - Google search
  - Web search engine
  - Bing
- Syntax – Identifies tokens, parts of speech, sentence boundaries, and lemmas in the text.
- Semantic Roles – Parses sentences into subject, action, and object form and shows additional semantic information such as keywords, entities, sentiment, and verb normalization.
In the future, NLP will harness and understand unstructured data. You’ll be able to conversationally ask a digital assistant a question and receive a human-like response. The machine will also learn how you like the answer displayed and will organize the data accordingly. Furthermore, you’ll be able to have natural conversations with the assistant, rather than just speaking single, one-at-a-time commands.
Are you looking into Machine Learning and NLP, or have you already started implementing such systems into your business? We’d love to learn more about how you are using these technologies. Let us know your thoughts on how you use this technology to make your projects more proficient and accurate.