A few pages into the book Natural Language Processing with AWS AI, the authors describe being enamored with the Arabian Nights story of Ali Baba and the Forty Thieves. Ali Baba, a poor man, discovers a den of treasures hidden in a cave. The cave, however, opens only when one utters the magic words: Open Sesame! A centuries-old inspiration for NLP-enabled, voice-activated applications? In any case, this tale as old as time is replicated in our homes today.
Look around, and we quickly realize that our words, both written and spoken, are regularly used to interact not only with other beings but also with and through machines. Among many day-to-day tasks, we ask Maps to route us to our favorite coffee shop, we allow Text to finish our sentences, and we rely on machines to translate and transcribe our words and even embellish our communication with GIFs and emojis. Interactions between language and machines are an inevitable part of our lived reality.
So how are machines leveraged to make meaning of words? Have you ever wondered how Natural Language is indeed processed? Are you interested in knowing more about this fascinating interdisciplinary field, but don’t know where to start? Look no further! In this article, we will introduce Natural Language Processing by answering five questions that you may have wanted answered.
First, What is natural language?
Natural languages are those that we hear, write, and speak. The world has over 7000 spoken languages; examples include English, Spanish, Mandarin, and Hindi. These languages are used in interactions among humans and continue to evolve over time, through additions, edits, and shifts in what words or phrases mean. For example, the word “cute” started off as a shortening of the word “acute” and originally meant sharp-witted, before it came to mean attractive.
Now, some of us may be familiar with programming languages such as C and may wonder whether these are also natural languages. They are not. Programming languages are “artificial” languages with precisely specified construction; they don’t evolve organically through use and need intentional manipulation to evolve or be altered.
Next, What is NLP?
NLP stands for Natural Language Processing, an interdisciplinary field focused on using computational technology to break down natural languages into keywords, concepts, sentiments, or emotions, depending on the task that humans want to perform. We know that humans learn to speak, read, and write through practice. Machines, on the other hand, cannot read like humans, at least in the literary sense. Computers represent inputs in binary (0s and 1s) and work with words by first converting them to strings of binary digits. Thus, machines do not have an inherent ability to distinguish natural languages like English, Spanish, or Hindi. However, they are stellar at learning rules. Through NLP, we have been able to bridge this gap by coding rules that machines do understand, which helps us automate tasks that involve understanding natural languages. These tasks are then used to perform a wide variety of activities, from curating our playlists to recognizing voice instructions and making our homes speech-enabled, such as using voice control to turn on Alexa, crank up the air conditioner, or read us our emails.
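To make the "strings of 0s and 1s" idea concrete, here is a minimal Python sketch. It assumes UTF-8 as the text encoding (one common choice, not the only one), and the helper names `to_bits` and `from_bits` are ours, made up purely for illustration.

```python
def to_bits(text: str, encoding: str = "utf-8") -> str:
    """Encode text to bytes, then show each byte as an 8-digit binary group."""
    return " ".join(f"{byte:08b}" for byte in text.encode(encoding))


def from_bits(bits: str, encoding: str = "utf-8") -> str:
    """Reverse of to_bits: parse each 8-digit group back into a byte and decode."""
    return bytes(int(group, 2) for group in bits.split()).decode(encoding)


if __name__ == "__main__":
    bits = to_bits("Hi")
    print(bits)             # 01001000 01101001
    print(from_bits(bits))  # Hi
```

Notice that nothing in these bits says "this is English" or "this is a greeting". That meaning has to come from the rules and models layered on top, which is where NLP comes in.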
Erm, okay, but why do we need machine intelligence to make meaning of words?
Humans interact through languages, and NLP can be leveraged through machines either to make meaning of words exchanged between humans or to perform tasks by directing machines using language. In the former application, diving deep into large datasets of written or spoken words can help researchers understand contexts, behavior, attitudes, and sentiments. For example, consider the inordinate amount of text generated in engineering classrooms. NLP systems can help instructors quickly distill written feedback from their students and use it to inform changes to their pedagogy, making classes more engaging and useful. For the latter application, NLP can be thought of as one of the primary contributors to making our devices smart.
Some common areas where NLP is used on a daily basis are:
- Filtering our emails: One of the earliest applications of NLP was to help users filter out spam in emails. The technology helped uncover certain patterns that potentially signal spam. A newer application of NLP can be seen in Gmail’s email classification. Based on the content of the emails, the system is now capable of classifying them into three categories: Primary, Social, and Promotions.
- Making Smart assistants smarter: With the advent of voice recognition, interacting with and getting a quick response from smart devices like Siri, Alexa, and Google Assistant has become our new normal. From procuring information about the weather, the thermostat, stock markets, and cafes near us, to cracking jokes when we feel sad or gloomy, smart assistants like Alexa have now become intrinsic members of our households.
- Enabling Digital Phone Calls: Very often these days, we come across telephone systems that mention, ‘this call may be recorded for training purposes’. Little do we know what that entails. Most calls to service centers are now computerized and mimic human voices. The data collected goes into a database for an NLP system to learn from and improve in the future. This helps large companies categorize issues and services, improve resolution rates, and increase customer satisfaction.
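The "patterns which potentially signal spam" idea from the email example above can be sketched in a few lines of Python. This is a toy, rule-based illustration only: the trigger phrases, the threshold, and the function names are all hypothetical choices of ours, and real spam filters typically rely on statistical or machine-learned models rather than a hand-written list.

```python
import re

# Hypothetical trigger phrases, chosen for illustration only.
SPAM_PATTERNS = [r"\bfree\b", r"\bwinner\b", r"\bclick here\b", r"\burgent\b"]


def spam_score(message: str) -> int:
    """Count how many of the trigger patterns appear in the message."""
    text = message.lower()
    return sum(1 for pattern in SPAM_PATTERNS if re.search(pattern, text))


def is_spam(message: str, threshold: int = 2) -> bool:
    """Flag a message when it matches at least `threshold` patterns."""
    return spam_score(message) >= threshold


if __name__ == "__main__":
    print(is_spam("URGENT: you are a winner, click here for a free prize"))  # True
    print(is_spam("Lunch at noon? The first coffee is free."))               # False
```

The second message shows why simple rules are fragile: it contains the word "free" but is clearly not spam, so a single matching pattern is deliberately not enough to flag it.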
But surely there are challenges to NLP? What are some things researchers are working to solve?
Machines may often struggle with natural languages and with differentiating between contexts to skillfully process words. Let’s look at Groucho Marx’s timeless statement:
“Time flies like an arrow, fruit flies like a banana!”
Human brains can process this sentence because we have context-specific information that helps us differentiate between time flying and fruit flies. However, consider how similar the two halves of the sentence are in structure: “flies like” appears in the same position in both, yet plays a different role in each. Although simplistic, this example highlights that, like all other technologies, NLP has its own share of complexities and limitations. Challenges for the field of NLP lie not only in ongoing research into the aspects that make natural language unique, such as ambiguity or sentiment (e.g., irony/sarcasm), but also in infrastructure, such as the computation and storage needed to train and execute an NLP model. This is where cloud computing can come in handy. Another challenge, common to all AI systems, is building explainable and responsible NLP AI practices. Researchers across the globe are making advancements in this interdisciplinary space, addressing the challenges that this unique problem poses.
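A small Python sketch can show why the Groucho Marx sentence is tricky for a machine. The code compares the two halves word by word and finds the tokens that are identical in both. The part-of-speech labels in `READINGS` are assigned by hand to reflect the usual human reading; they are not the output of any tagger, and the function name `shared_tokens` is ours.

```python
from typing import List, Tuple


def shared_tokens(a: str, b: str) -> List[Tuple[int, str]]:
    """Return (position, word) pairs where both phrases have the same word."""
    pairs = zip(a.lower().split(), b.lower().split())
    return [(i, x) for i, (x, y) in enumerate(pairs) if x == y]


# Hand-assigned roles for the usual human reading (not produced by a tagger).
READINGS = {
    "time flies like an arrow": {"flies": "VERB", "like": "PREPOSITION"},
    "fruit flies like a banana": {"flies": "NOUN", "like": "VERB"},
}

if __name__ == "__main__":
    first, second = READINGS.keys()
    print(shared_tokens(first, second))  # [(1, 'flies'), (2, 'like')]
    for phrase, roles in READINGS.items():
        print(phrase, "->", roles)
```

On the surface, the words at positions 1 and 2 are exactly the same in both halves. Only context tells us that "flies" is a verb in one and a noun in the other, and that is the kind of information an NLP system has to learn or be given.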
Okay! I am interested. What skills do I need to brush up on to work on NLP?
By now you may have gathered that NLP is an interdisciplinary effort. One doesn’t always need a Data Science degree to explore the endless possibilities that NLP has to offer. That said, if you are interested in the science behind these technologies, exploring higher education degrees is definitely a great way to go. Several universities offer degree and certificate courses that provide insight into computational linguistics, science, social science, or the engineering of NLP systems.
However, if you are only interested in extending NLP capabilities into specific applications, all you need is some command over linguistics and basic computational understanding to begin your NLP journey. Here is what we recommend to get one started:
- A beginner-level understanding of a programming language like R or Python can be very useful for aspirants. Many of the libraries and tools available to support NLP are written in Python; NLTK, for example, is one of the most widely mentioned NLP libraries for Python. An understanding of ML algorithms and applications will definitely prove to be an added advantage while exploring NLP.
- A good book that delves into AI systems that can be leveraged for NLP projects can go a long way! Refer to this Amazon Science article on how books like “Natural language processing using AWS AI services” can help you get started on your ML journey with low-code/no-code cloud solutions and AutoML.
- Patience goes a long way, especially while working with real-world data. Finding clean data for an NLP project is often painful, and a commonly cited rule of thumb is that about 80% of the effort in an ML project goes into data cleaning. To save time, one can refer to Kaggle for existing datasets, or use the open-source datasets and Python notebooks that are often part of textbooks on NLP, to get started with a project quickly.
- Get certified! Certification helps your resume stand out. Some of the well-paying certifications are the AWS ML certification and the Google Cloud Professional ML Engineer certification.
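To give a flavor of the data cleaning mentioned in the list above, here is a minimal Python sketch that uses only the standard library. The specific steps (unescaping HTML entities, dropping tags and URLs, lowercasing, collapsing whitespace) are our assumptions about what a first pass might include; the right steps always depend on the dataset and the task, and `clean_text` is a name we made up.

```python
import html
import re


def clean_text(raw: str) -> str:
    """Apply a few common, illustrative cleanup steps to a raw text snippet."""
    text = html.unescape(raw)                   # "&amp;" becomes "&"
    text = re.sub(r"<[^>]+>", " ", text)        # drop HTML tags
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = text.lower()                         # normalize case
    text = re.sub(r"\s+", " ", text)            # collapse runs of whitespace
    return text.strip()


if __name__ == "__main__":
    raw = "<p>Loved  the LABS &amp; demos!</p>\nSlides: https://example.com/slides"
    print(clean_text(raw))  # loved the labs & demos! slides:
```

Even this tiny example involves judgment calls. Lowercasing, for instance, can hurt tasks where capitalization carries meaning, which is one reason cleaning real-world text takes patience.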
There are endless possibilities in the future of Natural Language Processing. Whether you are merely consuming the technology or actively advancing it, there is much to learn and collaborate on. If one is interested in furthering a career in NLP and ML, leveraging the SWE Mentorship platform is a great way to connect with others in the field. The authors of this article also welcome questions from any reader interested in learning more about their exploits with NLP and exploring career opportunities in this realm.
Happy Deriving Meaning in and from the words (and technologies) around you!
Dr. Sreyoshi Bhaduri is a Senator and a decade-long member of the Society of Women Engineers. She is an Engineering Educator and People Research Scientist. Sreyoshi currently works as a Research Scientist at Amazon. Her research leverages employee data to generate data-driven insights for decisions impacting organizational Culture and Talent. She employs innovative, ethical, and inclusive mixed-methods research approaches to uncover insights about the 21st century workforce. Views expressed in this article are her own, and do not necessarily reflect those of the organizations she works at or volunteers with. Learn more about Sreyoshi’s impact and get in touch at www.ThatStatsGirl.com.
Indrani is a SWE member currently pursuing her Master’s in Computing Systems from Georgia Tech. Having spent more than 6.5 years as a software developer working with Tata Consultancy Services in India, Indrani has worn multiple hats, from being an individual contributor to leading teams of engineers. Indrani also has a Master’s in Computer Application and has always been passionate about increasing representation of women in STEM. She volunteers at several global organizations to inspire, mentor, and help women persist in STEM. Indrani is a singer and a cooking enthusiast, and in her free time loves to explore cuisines and enjoy music from around the world. She can be reached at https://www.linkedin.com/in/indranisen/
Mona Mona is the author of the book Natural Language Processing with AWS AI. She is a Sr AI/ML customer engineer at Google and a highly skilled IT professional with more than 10 years of experience in software design, development, and integration across diverse work environments. Before joining Google, Mona worked at Amazon Web Services (AWS) as a Senior Machine Learning Solutions Architect, where her role was to ensure customer success in building applications and services on the AWS platform. She was responsible for crafting highly scalable, flexible, and resilient cloud architectures that address customer business problems. She publishes widely on the topic and has authored several blogs on AI and NLP and, most recently, a research paper on AI-powered search solutions that was published by Amazon Science and recognized as a runner-up at the AAAI conference. She can be reached at https://www.linkedin.com/in/mona-mona/
Prem Ranga is the author of Natural Language Processing with AWS AI along with Mona. Prem is a Sr AI/ML specialist SA and NLP domain lead at Amazon Web Services (AWS), and is the author of several ML blogs and a research paper on Reinforcement Learning.
He guest lectures at universities and enjoys helping customers with their AI/ML journey. Prem can be reached at https://www.linkedin.com/in/premkr