XenonStack

A Stack Innovator


Showing posts with label Natural Language Processing. Show all posts

Thursday, 12 December 2019

12/12/2019 05:47:00 pm

Applications of Natural Language Processing (NLP)



What is Natural Language Processing (NLP)

Natural Language Processing (NLP) is a subset of Artificial Intelligence, and its importance grows day by day as its underlying technologies improve. Language is the primary medium of communication and interaction: without language there is no communication, and without communication, processes cannot be completed. This is one reason NLP is being adopted in more and more domains, and its adoption is in turn widening the scope of those domains. Some of these domains are -
These domains can be considered use cases of Natural Language Processing, but each also has use cases of its own. The implementation of these use cases can be generalized to an extent, but the domains differ in the level of implementation and in the expertise required. Let’s look briefly at these domains through the lens of Natural Language Processing. Note that Natural Language Processing also needs to be integrated with other technologies such as Machine Learning, Deep Learning, and Big Data Analytics.

Natural Language Processing (NLP) Applications

Business Applications for Natural Language Processing

Let’s start with the commercial domain. The business domain contains a number of interesting and important problems that Natural Language Processing can address. Some use cases of Natural Language Processing in the business domain are -
  • Sentiment Analysis - Widely used in social media analytics and web monitoring, it reveals what customers think about particular products or services. Knowing customer opinion helps a company identify where a product can be improved and how to make it more robust. Natural Language Processing cannot handle this task alone; it needs Machine Learning and Deep Learning for the back-end computation and Big Data Analytics to process data at scale.
  • Email Filters - Email is now an accepted official medium of communication, even for governments. But it is also vulnerable to spam. Email providers such as Google, Zoho and Yahoo are researching ways to make it foolproof. Email filtering is an everyday use case of Natural Language Processing that applies various text analytics techniques. Spam detection is also used as a pre-processing step in sentiment analysis.
  • Voice Recognition - Techniques powered by Natural Language Processing that allow companies to build smart voice-driven services and interfaces for any product or service. Narrowing the communication gap between machines and humans is a critical step in advancing Artificial Intelligence. Voice recognition is a key way to achieve this, and it relies on Natural Language Understanding, a sub-process of Natural Language Processing.
  • Information Extraction - Information is often called the new fuel, but most of the data that arrives at any receiving end is unstructured. Advanced statistical algorithms have driven the rise of predictive and prescriptive analytics and made prediction systems more accurate, but these algorithms need ever more information to find and match patterns. Machine Learning and Deep Learning methods do an impressive job here, but they depend on Natural Language Processing to turn unstructured text into usable information.
Continue Reading: XenonStack/Blogs

Wednesday, 11 December 2019

12/11/2019 05:15:00 pm

Sentiment Analytics Solutions with Deep Learning



Overview of Sentiment and Intent Analysis

  • Sentiment Analysis is the contextual mining of text to identify and extract information and understand the social sentiment around a brand. It is a text classification tool that analyzes incoming messages and labels them as positive, negative or neutral.
  • Sentiment Analysis using Natural Language Processing involves Supervised Learning and a Neural Network approach. Sentiment Analysis using Deep Learning includes a Visual Keras Deep Learning approach.
  • Intent Analysis involves understanding the emotions and intent of a user. It involves choosing the right events, tracking behavior against retention, identifying the user’s needs, bringing real-time data insights, and deriving value from Predictive Analytics. Intent Analysis using Automated Text Classification with Machine Learning involves Supervised and Unsupervised Text Classification. Intent Analysis using Deep Learning involves Convolutional Neural Networks.

Business Challenge for Sentiment Analysis Adoption

  • Detecting sarcasm
  • Evaluating text to predict emotions
  • Parallel computing for massive data
  • Improving algorithm precision

Solution Offered for Real Time Analysis

A real-time solution focusing on Twitter trends and tweets, involving -
  • Collecting data from Twitter using Tweepy in Python.
  • Natural Language Processing to clean textual data and extract features.
The steps included are -
  • Sentence Tokenization
  • Word Tokenization
  • Regular Expressions
  • Removing Stopwords
  • Working on n-grams
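
The cleaning steps listed above can be sketched with the Python standard library alone. This is a minimal illustration, not the pipeline used in the solution: the stopword list is a tiny hand-picked assumption, the function names are made up for this example, and a real project would more likely rely on a library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "it", "to", "and", "of", "in", "this"}

def sentence_tokenize(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def clean(sentence):
    """Use regular expressions to drop URLs and @mentions, and strip '#' from hashtags."""
    sentence = re.sub(r"https?://\S+", " ", sentence)
    sentence = re.sub(r"@\w+", " ", sentence)
    return sentence.replace("#", " ")

def word_tokenize(sentence):
    """Lowercase the sentence and keep runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]

def ngrams(tokens, n):
    """Return all contiguous n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def preprocess(text, n=2):
    """Run every step in order and return tokens and n-grams per sentence."""
    result = []
    for sentence in sentence_tokenize(text):
        tokens = remove_stopwords(word_tokenize(clean(sentence)))
        result.append({"tokens": tokens, "ngrams": ngrams(tokens, n)})
    return result
```

Calling `preprocess("I love this phone! The battery is great.")` yields two entries, one per sentence; the tokens and n-grams can then be turned into features for a supervised classifier.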
Algorithms and models: Supervised Learning algorithms for Text Mining are trained on a massive volume of data for better feature extraction and better accuracy in predicting a user’s attributes.

Wednesday, 13 September 2017

9/13/2017 10:54:00 am

What is SEMANTIC ANALYSIS?

The word semantic is a linguistic term. It means relating to meaning in language or logic.
In natural language, semantic analysis relates the structures and occurrences of words, phrases, clauses, paragraphs, etc. to understand the idea of what is written in a particular text. Do the formation of the sentences and the occurrences of the words make sense?
The challenge in today’s technologically advanced world is to make the computer understand language or logic as well as a human does.
Semantic analysis requires rules to be defined for the system. These rules mirror the way we think about a language, and we ask the computer to imitate them. For example, “apple is red” is a simple sentence; a human understands that there is something called an apple, that it is red, and that red is a color.
For a computer, this is an alien language. The linguistic concept here is that the sentence has a structure: subject-predicate-object, or s-p-o for short, where "apple" is the subject, "is" is the predicate and "red" is the object. Similarly, other linguistic nuances are used in semantic analysis.
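
To make the s-p-o idea concrete, here is a deliberately naive sketch that splits a simple sentence around a known predicate word. The predicate list and the function name are assumptions made for this illustration; real semantic analysis uses a parser and not a fixed word list.

```python
# Hypothetical, hand-picked predicate words; a real system would use a parser.
PREDICATES = ("is", "are", "has", "likes")

def to_triple(sentence):
    """Return (subject, predicate, object) for a simple 'S P O' sentence, else None."""
    words = sentence.strip().rstrip(".").lower().split()
    for i, word in enumerate(words):
        # The predicate needs at least one word on each side.
        if word in PREDICATES and 0 < i < len(words) - 1:
            return (" ".join(words[:i]), word, " ".join(words[i + 1:]))
    return None

print(to_triple("Apple is red"))  # ('apple', 'is', 'red')
```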


Need for Semantic Analysis

We want the computer to understand as much as we do because we have a lot of data and need to make the most of it.
Let us restrict ourselves to text data. Extracting the appropriate data (results) for a query is a challenging task. The result can be a whole document or just an answer, depending on the query itself.
Assume that we have a million text documents in our database and a query whose answer is in those documents. The challenges are -
  • Getting the appropriate documents
  • Listing them in the ranked order
  • Giving the answer to the query if it is specific

Difference between Keyword-based Search and Semantic Search

In a search engine, keyword-based search is a technique that searches text documents for the words found in the query. The query is first cleaned and preprocessed, and then the documents are searched for the words used in the query.
The documents are returned in order of the number of query words they match.
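
A keyword-based search of this kind can be sketched in a few lines. The documents, their ids and the function names below are invented for illustration; the ranking simply counts how many distinct query words each document contains.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_search(query, documents):
    """Rank documents by how many distinct query words they contain."""
    query_words = tokenize(query)
    scored = []
    for doc_id, text in documents.items():
        matches = len(query_words & tokenize(text))
        if matches:
            scored.append((matches, doc_id))
    # Most matches first; ties broken alphabetically by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

# Made-up documents for the example.
docs = {
    "d1": "Usain Bolt is a Jamaican sprinter.",
    "d2": "A bolt is a type of fastener.",
    "d3": "Sprinter Usain Bolt won Olympic gold.",
    "d4": "The weather is sunny today.",
}

print(keyword_search("Usain Bolt sprinter", docs))  # ['d1', 'd3', 'd2']
```

Note that "d2" is still returned because it shares the word "bolt" with the query, even though it is about a fastener. Matching on words alone cannot tell the two meanings apart.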
In semantic search, we take into account the frequency of words, the syntactic structure of the natural language and other linguistic elements. The system aims to understand the exact requirement behind the search query.
When we search for “Usain Bolt” on Google, it returns the most appropriate documents and web pages about the famous athlete, even though many other people share the same name, because the search engine understands that we are searching for an athlete.
Keyword Based Search

Now, if we are a little more specific and search for Usain Bolt birthday, Google returns the following:
Semantic based Search

Since Usain Bolt is a famous figure, this may not surprise us. But there are a large number of other famous personalities, and it is close to impossible to store all the information manually and return it accurately when a user issues a query.
Moreover, search queries are not uniform; each individual may phrase a query differently. Semantic techniques are applied here to store the data and fetch the results upon querying.
Let us see a different way of phrasing the above query on Google.
Different Ways of Searching on Google

From the above figures, it is evident that however the search query is phrased, the search engine understands the intent of the user.


Semantic Search based on Domain Ontology

Earlier, we saw the search efficiency of Google, which searches without regard to any particular domain. Searches of this kind are based on open information extraction. What if we require a search engine for a specific domain?
The domain may be anything: a college, a particular sport, a specific subject, a famous location, tourist spots, etc. For example, suppose we want to create a search engine for a single college, so that any text query regarding the college is answered by the search engine. For this purpose, we create a domain ontology.


What is Ontology?

An ontology is a set of concepts together with their definitions, descriptions, properties, and relations. The relations here include relations among concepts and relations among relations.


How do we create Ontology?

Before creating an ontology, we first choose the domain under consideration. We list all the concepts related to that domain along with their relations. A predefined data structure is used to represent the ontology: ontologies are created as .owl files.
An OWL file represents concepts as classes, and classes have subclasses, properties, instances, data types and much more. All this information is stored in XML form. For simplicity, tools such as Protege are available for creating ontologies.

Creating Ontology


Storing the Unstructured Text Data in RDF Form

Once the ontology is created from the concepts, we are ready to use it to find the appropriate document for a query in a search engine. The text documents, which are available in unstructured form, need a structure, which we call a semantic structure.
This is where RDF (Resource Description Framework) comes in. RDF is a structure in which the information given in text is stored as triples. These triples are similar to the ones discussed earlier, i.e. the s-p-o form.
Machine Learning and Text Analysis are used to extract the required data and store it in the form of triples. This way the knowledge base is ready: it contains both the ontology and the structured form of the text data as RDF.
Storing Unstructured Data in RDF Form
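
As a rough illustration of the triple form, the sketch below keeps s-p-o triples in a plain Python set and answers pattern queries with wildcards. It is not RDF tooling: the identifiers and the `match` helper are assumptions for this example, and a real system would use an RDF library or a triple store and query it with a language such as SPARQL.

```python
# Each fact is a (subject, predicate, object) triple; the identifiers are
# made up for this example and are not a real RDF vocabulary.
triples = {
    ("Usain_Bolt", "type", "Athlete"),
    ("Usain_Bolt", "bornIn", "Jamaica"),
    ("Usain_Bolt", "sport", "Sprinting"),
    ("Jamaica", "type", "Country"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples matching the pattern, sorted; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# "Where was Usain Bolt born?" becomes a pattern with an unknown object.
print(match(triples, s="Usain_Bolt", p="bornIn"))
# [('Usain_Bolt', 'bornIn', 'Jamaica')]
```

Because every fact has the same three-part shape, differently phrased questions can be reduced to the same pattern, which is what makes the stored knowledge searchable.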

Architecture for Real-Time Semantic Search Engine
Implementation of the architecture on "Computer Science" Domain -
The complete architecture for the search engine would be delivered as "Platform as a Service (PaaS)". Let us consider "Computer Science" as an example domain. Here, the user can search for faculty CVs from the desired universities and research areas based on the query. The steps to build a Semantic Search Engine are -
  • Crawl the documents (DOC, PDF, XML, HTML, etc.) from various universities and classify faculty profiles.
  • Convert the unstructured text present in various formats to structured RDF form as described in earlier sections.
  • Build an ontology for the Computer Science domain.
  • Store the data (both the ontology and the RDF) in an Apache Jena triple store.
  • Query the data using the SPARQL query language.

Finally, the user can search for the required data by university or research area. Additionally, if a user has project information, i.e. a project of his/her own in Computer Science, the user can submit the project and the system analyzes it to identify appropriate faculty profiles working in a similar subject area. Various Big Data components are necessary to make the search engine capable of searching in real time.

Linguistic and Semantic Search


Data Ingestion using our Web Crawler Service

Starting with the data extraction process, a web crawler was built that scrapes content from any university or educational website. This web crawler is built using the Akka framework, which is highly scalable, concurrent and distributed. It supports almost all types of files, such as HTML, DOC, PDF, text files and even images.
Continue Reading The Full Article at - XenonStack.com/Blog

Wednesday, 24 May 2017

5/24/2017 05:58:00 pm

Overview of Artificial Intelligence and Role of Natural Language Processing in Big Data



Artificial Intelligence Overview




AI stands for ‘Artificial Intelligence’, which means making machines capable of performing intelligent tasks as human beings do. AI performs automated tasks using intelligence.

The term Artificial Intelligence has two key components -
    • Automation  
    • Intelligence

Goals of Artificial Intelligence







Stages of Artificial Intelligence


Stage 1 - Machine Learning - It is a set of algorithms used by intelligent systems to learn from experience.

Stage 2 - Machine Intelligence - An advanced set of algorithms used by machines to learn from experience, e.g. Deep Neural Networks.

Artificial Intelligence technology is currently at this stage.

Stage 3 - Machine Consciousness - It is self-learning from experience without the need for external data.





Types of Artificial Intelligence



ANI - Artificial Narrow Intelligence - It comprises basic/role tasks such as those performed by chatbots and personal assistants like Siri by Apple and Alexa by Amazon.

AGI - Artificial General Intelligence - Artificial General Intelligence comprises human-level tasks and involves continual learning by the machines. Self-driving cars by Uber and Autopilot by Tesla are often cited as steps in this direction, although they are still narrow, task-specific systems.

ASI - Artificial Super Intelligence - Artificial Super Intelligence refers to intelligence far beyond that of humans.

What Makes System AI Enabled









Difference Between NLP, AI, ML, DL & NN



AI or Artificial Intelligence - Building systems that can do intelligent things.

NLP or Natural Language Processing - Building systems that can understand language. It is a subset of Artificial Intelligence.

ML or Machine Learning - Building systems that can learn from experience. It is also a subset of Artificial Intelligence.

NN or Neural Network - Biologically inspired network of Artificial Neurons.

DL or Deep Learning - Building systems that use Deep Neural Network on a large set of data. It is a subset of Machine Learning.



What is Natural Language Processing?


Natural Language Processing (NLP) is the “ability of machines to understand and interpret human language the way it is written or spoken”.

The objective of NLP is to make computers/machines as intelligent as human beings in understanding language.



The ultimate goal of NLP is to fill the gap between how humans communicate (natural language) and what the computer understands (machine language).

There are three different levels of linguistic analysis involved in NLP -

Syntax - Which parts of the given text are grammatically correct?
Semantics - What is the meaning of the given text?
Pragmatics - What is the purpose of the text?

NLP deals with different aspects of language, such as -

  • Phonology - The systematic organization of sounds in a language.
  • Morphology - The study of word formation and the relationship of words with each other.


Approaches of NLP for understanding semantic analysis

  • Distributional - It employs large-scale statistical tactics of Machine Learning and Deep Learning.
  • Frame-Based - Sentences that are syntactically different but semantically the same are represented inside a data structure (frame) for the stereotyped situation.
  • Theoretical - This approach is based on the idea that sentences refer to the real world (the sky is blue) and that parts of a sentence can be combined to represent the whole meaning.
  • Interactive Learning - It involves a pragmatic approach in which the user is responsible for teaching the computer the language step by step in an interactive learning environment.


The true success of NLP lies in humans being deceived into believing that they are talking to humans instead of computers.

Why Do We Need NLP?


With NLP, tasks such as Automated Speech and Automated Text Writing can be performed in less time.

Given the large amount of (text) data around us, why not use the computer's untiring willingness and ability to run several algorithms to perform tasks in no time?

These tasks include other NLP applications such as Automatic Summarization (generating a summary of a given text) and Machine Translation (translating one language into another).

Process of NLP


If the input is speech, speech-to-text conversion is performed first.

The mechanism of Natural Language Processing involves two processes:
  • Natural Language Understanding
  • Natural Language Generation

Natural Language Understanding


NLU or Natural Language Understanding tries to understand the meaning of a given text. For this, the nature and structure of each word inside the text must be understood. To understand structure, NLU tries to resolve the following ambiguities present in natural language:

  • Lexical Ambiguity - A word has multiple meanings.
  • Syntactic Ambiguity - A sentence has multiple parse trees.
  • Semantic Ambiguity - A sentence has multiple meanings.
  • Anaphoric Ambiguity - A phrase or word refers back to something previously mentioned, but it is unclear which thing it refers to.


Next, the meaning of each word is understood using lexicons (vocabulary) and a set of grammatical rules.

However, some different words have similar meanings (synonyms), and some words have more than one meaning (polysemy).

Natural Language Generation


It is the process of automatically producing text from structured data in a readable format with meaningful phrases and sentences. Natural language generation is a hard problem. It is a subset of NLP.

Natural language generation is divided into three stages:

1. Text Planning - The basic content in the structured data is ordered.
2. Sentence Planning - Sentences are composed from the structured data to represent the flow of information.
3. Realization - Grammatically correct sentences are finally produced to represent the text.
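
The three stages can be sketched as three small functions chained together. This is only a toy: the weather record, its field names and the sentence templates are all assumptions made for the example, and production NLG systems are considerably more involved.

```python
def text_planning(record):
    """Stage 1: choose and order the content to mention."""
    order = ["city", "condition", "temperature_c"]
    return [(key, record[key]) for key in order if key in record]

def sentence_planning(plan):
    """Stage 2: group the ordered facts into sentence-sized specifications."""
    facts = dict(plan)
    specs = []
    if "city" in facts and "condition" in facts:
        specs.append(("weather", facts["city"], facts["condition"]))
    if "temperature_c" in facts:
        specs.append(("temperature", facts["temperature_c"]))
    return specs

def realization(specs):
    """Stage 3: turn each specification into a grammatical sentence."""
    sentences = []
    for spec in specs:
        if spec[0] == "weather":
            sentences.append(f"The weather in {spec[1]} is {spec[2]}.")
        elif spec[0] == "temperature":
            sentences.append(f"The temperature is {spec[1]} degrees Celsius.")
    return " ".join(sentences)

def generate(record):
    """Run the three stages in order on one structured record."""
    return realization(sentence_planning(text_planning(record)))

# Hypothetical structured input; the field order does not matter.
print(generate({"temperature_c": 21, "city": "Delhi", "condition": "sunny"}))
# The weather in Delhi is sunny. The temperature is 21 degrees Celsius.
```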

Difference Between NLP and Text Mining or Text Analytics


Natural language processing is responsible for understanding the meaning and structure of a given text.

Text Mining or Text Analytics is the process of extracting hidden information from text data through pattern recognition.



Natural language processing is used to understand the meaning (semantics) of given text data, while text mining is used to understand the structure (syntax) of given text data.

As an example - I found my wallet near the bank. The task of NLP is to understand whether ‘bank’ refers to a financial institution or a river bank.
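
One very simple way to approach the ‘bank’ example is to compare the other words in the sentence with cue words for each sense. The sketch below does exactly that; the cue-word lists are hand-written assumptions for illustration, and treating "wallet" as a cue for the financial sense is a simplification, since the sentence is genuinely ambiguous.

```python
import re

# Hypothetical cue words for each sense of "bank"; a real system would
# learn such associations from data and not rely on a fixed list.
SENSES = {
    "financial institution": {"money", "wallet", "deposit", "loan", "account", "cash"},
    "river bank": {"river", "water", "shore", "fishing", "boat", "stream"},
}

def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word of any sense appears in the sentence.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(disambiguate("I found my wallet near the bank"))
# financial institution
print(disambiguate("We went fishing on the bank of the river"))
# river bank
```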

What is Big Data?


According to Dr. Kirk Borne, Principal Data Scientist, big data is everything, quantified, and tracked.

For More Details on Big Data, Please Read - Ingestion And Processing of Data For Big Data and IoT Solutions

NLP for Big Data is the Next Big Thing


Today, around 80% of all data is available only in raw form. Big Data comes from information stored in large organizations and enterprises. Examples include employee information, company purchase and sale records, business transactions, historical records of organizations, social media, etc.

Human language is ambiguous and unstructured, which makes it hard for computers to interpret. Yet with the help of NLP, this huge volume of unstructured data can be mined for patterns to better understand the information it contains.

NLP can solve big problems of the business world by using Big Data, in any sector, be it retail, healthcare or financial institutions.

What is Chatbot?


Chatbots or Automated Intelligent Agents


  • These are computer programs you can talk to through messaging apps, chat windows or voice calling apps.
  • These are intelligent digital assistants used to resolve customer queries in a cost-effective, quick, and consistent manner.

Importance of Chatbots

Chatbots are important for understanding how digital customer care services are changing, and for handling the routine queries that are asked most frequently.

Chatbots are useful in scenarios where customer service requests are specific to an area and highly predictable, such as managing a high volume of similar requests with automated responses.

Working of Chatbot





Knowledge Base - It contains the database of information used to equip chatbots with what they need to respond to customer queries.

Data Store - It contains the interaction history of the chatbot with users.
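
How these two components fit together can be shown with a minimal keyword-matching bot. Everything here is an assumption made for illustration: the canned answers, the keyword sets and the class name are invented, and a real chatbot would use NLU models where this one uses keyword overlap.

```python
import re

# Hypothetical knowledge base: keyword sets mapped to canned answers.
KNOWLEDGE_BASE = [
    ({"refund", "return"}, "You can request a refund within 30 days."),
    ({"hours", "open"}, "We are open 9 am to 6 pm, Monday to Friday."),
]
FALLBACK = "Sorry, I did not understand. A human agent will contact you."

class Chatbot:
    def __init__(self):
        # Data store: interaction history as (user message, bot reply) pairs.
        self.data_store = []

    def reply(self, message):
        """Answer from the knowledge base and log the exchange in the data store."""
        words = set(re.findall(r"[a-z]+", message.lower()))
        answer = FALLBACK
        for keywords, response in KNOWLEDGE_BASE:
            if words & keywords:
                answer = response
                break
        self.data_store.append((message, answer))
        return answer

bot = Chatbot()
print(bot.reply("How do I get a refund?"))
# You can request a refund within 30 days.
```

The stored history can later be analyzed to find frequent questions that the knowledge base does not yet cover.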

Continue Reading About AI & NLP  At: XenonStack.com/Blog