FLAIR - A Framework for NLP
Last Updated: 26 Nov, 2020
What is FLAIR?
Flair is a simple framework for state-of-the-art NLP. It is a powerful library developed by Zalando Research, and it is built on top of PyTorch.
What are the Features available in Flair?
- Flair supports a number of word embeddings used in NLP tasks, such as FastText, ELMo, GloVe, BERT and its variants, XLM, and Byte Pair Embeddings, as well as its own Flair Embeddings.
- Flair Embeddings are based on the concept of contextual string embeddings, which is commonly used for sequence labelling.
- Using Flair you can also combine different word embeddings to get better results.
- Flair supports a number of languages.
Contextual String Embeddings:
In this type of embedding, the characters of each word are passed through a Character Language Model, and the word's input representation is then taken from the hidden states of the forward and backward LSTMs.
For example, the representation of the word 'Washington' depends on the context around it: the hidden state after the last character of the word (from the forward LSTM) and the hidden state before its first character (from the backward LSTM) are used to generate the word embedding.
For 'Washington', the forward LSTM output and the backward LSTM output are concatenated to obtain the word's input representation.
This input representation is then fed into a downstream sequence labelling model to solve the particular task you are dealing with, for example Named Entity Recognition (NER).
Installation of Flair:
You should have PyTorch >= 1.1 and Python >= 3.6 installed. To install PyTorch with Anaconda, run the command below:
conda install -c pytorch pytorch
To install Flair, run:
pip install flair
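A quick way to verify the installation is to import both packages and print their versions. This is just a sanity-check sketch; the version attribute on flair is assumed here (it is present in recent releases):
Python3
import torch
import flair

# confirm that both packages import correctly
print(torch.__version__)
# flair.__version__ is assumed to exist, as in recent Flair releases
print(flair.__version__)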
Working of Flair
1) Flair Datatypes:
Flair offers two core data types. They are:
- Sentence
- Token
To get the number of tokens in a sentence:
Python3
import flair
from flair.data import Sentence
# take a sentence
s = Sentence('GeeksforGeeks is Awesome.')
print(s)
Output:
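Printing the Sentence shows its tokenized form along with the token count. You can also inspect the tokens directly; a minimal sketch using the same sentence object:
Python3
# number of tokens in the sentence
print(len(s.tokens))

# iterate over the individual Token objects
for token in s.tokens:
    print(token)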
2) NER Tags:
To predict tags for a given sentence we will use a pre-trained model as shown below:
Python3
import flair
from flair.data import Sentence
from flair.models import SequenceTagger
# input a sentence
s = Sentence('GeeksforGeeks is Awesome.')
# loading NER tagger
tagger_NER = SequenceTagger.load('ner')
# run NER over sentence
tagger_NER.predict(s)
print(s)
print('The following NER tags are found:\n')
# iterate and print
for entity in s.get_spans('ner'):
    print(entity)
Output:
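If you also want the detected entities together with their labels and confidence scores in a single structure, the tagged sentence can be converted to a dictionary, as shown in the Flair tutorials:
Python3
# tagged sentence as a dictionary, including the detected entity spans
print(s.to_dict(tag_type='ner'))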
3) Word Embeddings:
Word embeddings give embeddings for each word of the text. As discussed earlier Flair supports many word embeddings including its own Flair Embeddings. Here we will see how to implement some of them.
A) Classic Word Embeddings - This class of word embeddings is static: each distinct word is given exactly one pre-computed embedding, regardless of context. Most common word embeddings, including GloVe, fall into this category.
Python3
import flair
from flair.data import Sentence
from flair.embeddings import WordEmbeddings
# using glove embedding
GloVe_embedding = WordEmbeddings('glove')
# input a sentence
s = Sentence('Geeks for Geeks helps me study.')
# embed the sentence
GloVe_embedding.embed(s)
# print the embedded tokens
for token in s:
    print(token)
    print(token.embedding)
Output:
Note: You can see here that the embeddings for the word 'Geeks' are the same for both the occurrences.
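You can verify this by comparing the two token embeddings directly. In the sketch below, the token indices assume Flair's default tokenization of the example sentence ('Geeks' appears at positions 0 and 2):
Python3
import torch

# embeddings of the first and second occurrence of 'Geeks'
first_geeks = s.tokens[0].embedding
second_geeks = s.tokens[2].embedding

# static embeddings: both occurrences are exactly the same vector
print(torch.equal(first_geeks, second_geeks))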
B) Flair Embedding - This works on the concept of contextual string embeddings. It captures latent syntactic-semantic information. The word embeddings are contextualized by their surrounding words, so the same word gets different embeddings depending on its surrounding text.
Python3
import flair
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings
# using the forward Flair embedding
forward_flair_embedding = FlairEmbeddings('news-forward-fast')
# input the sentence
s = Sentence('Geeks for Geeks helps me study.')
# embed words in the input sentence
forward_flair_embedding.embed(s)
# print the embedded tokens
for token in s:
    print(token)
    print(token.embedding)
Output:
Note: Here we see that the embeddings for the word 'Geeks' are different for both the occurrences depending on the contextual information around them.
C) Stacked Embeddings - These let you combine several embeddings into one representation. Let's see how to combine GloVe with the forward and backward Flair embeddings:
Python3
import flair
from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, WordEmbeddings
from flair.embeddings import StackedEmbeddings
# flair embeddings
forward_flair_embedding = FlairEmbeddings('news-forward-fast')
backward_flair_embedding = FlairEmbeddings('news-backward-fast')
# glove embedding
GloVe_embedding = WordEmbeddings('glove')
# create an object which combines the three embeddings
stacked_embeddings = StackedEmbeddings([forward_flair_embedding,
                                        backward_flair_embedding,
                                        GloVe_embedding])
# input the sentence
s = Sentence('Geeks for Geeks helps me study.')
# embed the input sentence with the stacked embedding
stacked_embeddings.embed(s)
# print the embedded tokens
for token in s:
    print(token)
    print(token.embedding)
Output:
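The stacked embedding is simply the concatenation of the individual vectors, so the length of each token's embedding equals the sum of the three component lengths. A quick check, assuming the embedding_length attribute exposed by Flair's embedding classes:
Python3
# total dimensionality of the stacked embedding
print(stacked_embeddings.embedding_length)

# should match the length of any embedded token's vector
print(len(s.tokens[0].embedding))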
4) Document Embeddings:
Unlike word embeddings, document embeddings give a single embedding for the entire text. The document embeddings offered in Flair are:
- A) Transformer Document Embeddings
- B) Sentence Transformer Document Embeddings
- C) Document RNN Embeddings
- D) Document Pool Embeddings
Let's have a look at how the Document Pool Embeddings work-
Document Pool Embeddings — This is a very simple document embedding: it pools over all the word embeddings in the text and returns their average (mean pooling by default).
Python3
import flair
from flair.data import Sentence
from flair.embeddings import WordEmbeddings, DocumentPoolEmbeddings
# init the glove word embedding
GloVe_embedding = WordEmbeddings('glove')
# init the document embedding
doc_embeddings = DocumentPoolEmbeddings([GloVe_embedding])
# input the sentence
s = Sentence('Geeks for Geeks helps me study.')
#embed the input sentence with the document embedding
doc_embeddings.embed(s)
# print the document embedding
print(s.embedding)
Output:
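Since the pooling operation defaults to taking the mean, you can check that the document vector equals the average of the token vectors. A small sketch, assuming the default mean pooling with no fine-tuning layer:
Python3
import torch

# average of the individual (GloVe) token embeddings
mean_of_tokens = torch.stack([token.embedding for token in s]).mean(dim=0)

# should be numerically identical to the pooled document embedding
print(torch.allclose(s.embedding, mean_of_tokens, atol=1e-6))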
Similarly, you can use other Document embeddings as well.
5) Training a Text Classification Model using Flair:
We are going to use the 'TREC_6' dataset available in Flair; you can also use your own dataset. To train our model we will use Document RNN Embeddings, which train an RNN over all the word embeddings in a sentence. The word embeddings we will use are GloVe and the forward Flair embedding.
Python3
from flair.data import Corpus
from flair.datasets import TREC_6
from flair.embeddings import WordEmbeddings, FlairEmbeddings, DocumentRNNEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer
# load the corpus
corpus = TREC_6()
# create a label dictionary
label_Dictionary = corpus.make_label_dictionary()
# list of word embeddings to be used
word_embeddings = [WordEmbeddings('glove'), FlairEmbeddings('news-forward-fast')]
# init document embeddings and pass the word embeddings list
doc_embeddings = DocumentRNNEmbeddings(word_embeddings, hidden_size=250)
# creating the text classifier
text_classifier = TextClassifier(doc_embeddings, label_dictionary=label_Dictionary)
# init the text classifier trainer
model_trainer = ModelTrainer(text_classifier, corpus)
# train your model
model_trainer.train('resources/taggers/trec',
                    learning_rate=0.1,
                    mini_batch_size=40,
                    anneal_factor=0.5,
                    patience=5,
                    max_epochs=200)
Results of training:
The accuracy of the model is around 95%.
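Training on the full corpus for up to 200 epochs can take a long time. If you only want to sanity-check the pipeline, you can first train on a fraction of the data using the downsample helper of Flair's Corpus:
Python3
# keep roughly 10% of the corpus for a quick trial run
small_corpus = TREC_6().downsample(0.1)
print(small_corpus)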
Predictions: Now we can load the trained model and make predictions:
Python3
from flair.data import Sentence
from flair.models import TextClassifier
# load the trained model
c = TextClassifier.load('resources/taggers/trec/final-model.pt')
# input example sentence
s = Sentence('Who is the President of India ?')
# predict class and print
c.predict(s)
# print the labels
print(s.labels)
Output:
[HUM (1.0)]
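Each predicted label carries both a class value and a confidence score, which can be read off individually (attribute names as in Flair's Label class):
Python3
# the predicted class and its confidence
label = s.labels[0]
print(label.value)   # e.g. 'HUM'
print(label.score)   # confidence between 0 and 1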
By now you should have a good idea of how to use the Flair library.