NLP | Storing Frequency Distribution in Redis

Last Updated : 12 Jun, 2019

The nltk.probability.FreqDist class is used in many classes throughout NLTK for storing and managing frequency distributions. It is quite useful, but it lives entirely in memory and provides no way to persist the data. A single FreqDist is also not accessible to multiple processes. Both limitations can be removed by building a FreqDist on top of Redis.

What is Redis?
Redis is a data structure server and one of the more popular NoSQL databases. Among other things, it provides a network-accessible database for storing dictionaries (also known as hash maps). Building a FreqDist interface to a Redis hash map allows us to create a persistent FreqDist that is accessible to multiple local and remote processes at the same time.

Installation :
Install both Redis and redis-py. The Redis website is at http://redis.io/ and includes many documentation resources. To use hash maps, install the latest version, which at the time of this writing is 2.8.9. The Redis Python driver, redis-py, can be installed using pip install redis or easy_install redis. The latest version at this time is 2.9.1. The redis-py home page is at http://github.com/andymccurdy/redis-py/. Once both are installed and a redis-server process is running, you're ready to go. Let's assume redis-server is running on localhost on port 6379 (the default host and port).

How it works?
The FreqDist class extends the standard library collections.Counter class, which makes a FreqDist a small wrapper with a few extra methods, such as N(). The N() method returns the number of sample outcomes, which is the sum of all the values in the frequency distribution. An API-compatible class is created on top of Redis by extending a RedisHashMap and then implementing the N() method.
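To make the role of N() concrete, here is a small sketch that uses only the standard library. TinyFreqDist is a hypothetical stand-in written for illustration, not the NLTK class itself; it only mirrors the idea that a frequency distribution is a Counter plus an N() method that sums the counts.

```python
from collections import Counter


class TinyFreqDist(Counter):
    """Hypothetical stand-in for nltk.probability.FreqDist (illustration only)."""

    def N(self):
        # Total number of sample outcomes = sum of all counts
        return sum(self.values())


fd = TinyFreqDist()
for word in ['the', 'cat', 'saw', 'the', 'dog']:
    fd[word] += 1

print(fd['the'])      # count for a single sample
print(fd['missing'])  # Counter returns 0 for samples never seen
print(fd.N())         # total number of outcomes
```

A Redis-backed class needs to reproduce the same two behaviours: a count of 0 for unseen samples, and an N() that totals every stored value.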
The RedisHashFreqDist (defined in redisprob.py) sums all the values in the hash map for the N() method.

Code : Explaining the working

```python
from rediscollections import RedisHashMap


class RedisHashFreqDist(RedisHashMap):
    def N(self):
        # Total number of sample outcomes: sum of every stored count
        return int(sum(self.values()))

    def __missing__(self, key):
        # Unseen samples have a count of 0
        return 0

    def __getitem__(self, key):
        # Redis returns stored values as strings (or nothing for a missing key),
        # so convert to int and fall back to 0
        return int(RedisHashMap.__getitem__(self, key) or 0)

    def values(self):
        return [int(v) for v in RedisHashMap.values(self)]

    def items(self):
        return [(k, int(v)) for (k, v) in RedisHashMap.items(self)]
```

This class can be used just like a FreqDist. To instantiate it, pass a Redis connection and the name of our hash map. The name should be a unique reference to this particular FreqDist so that it doesn't clash with any other keys in Redis.

Code :

```python
from redis import Redis
from redisprob import RedisHashFreqDist

r = Redis()
rhfd = RedisHashFreqDist(r, 'test')
print(len(rhfd))

rhfd['foo'] += 1
print(rhfd['foo'])

rhfd.items()
print(len(rhfd))
```

Output (assuming the hash map 'test' does not already exist in Redis) :

```
0
1
1
```

Most of the work is done in the RedisHashMap class, which extends collections.MutableMapping (collections.abc.MutableMapping in Python 3) and then overrides all methods that require Redis-specific commands.
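The source of RedisHashMap is not shown in this article, so the following is only a minimal sketch of how such a class could be written. It assumes a client object that exposes redis-py's hash commands (hlen, hexists, hget, hset, hdel, hkeys, hvals, hgetall, delete). The FakeRedis class is a hypothetical in-memory stand-in, included only so the example runs without a server; it is not part of redis-py.

```python
from collections.abc import MutableMapping


class RedisHashMap(MutableMapping):
    """Dict-like view of one Redis hash (a sketch, not the original source)."""

    def __init__(self, r, name):
        self._r = r        # any client exposing redis-py's hash commands
        self._name = name  # key under which the hash is stored in Redis

    def __len__(self):
        return self._r.hlen(self._name)

    def __contains__(self, key):
        return bool(self._r.hexists(self._name, key))

    def __getitem__(self, key):
        # hget returns None for a missing field, so no KeyError is raised
        return self._r.hget(self._name, key)

    def __setitem__(self, key, value):
        self._r.hset(self._name, key, value)

    def __delitem__(self, key):
        self._r.hdel(self._name, key)

    def __iter__(self):
        return iter(self.keys())

    def keys(self):
        return self._r.hkeys(self._name)

    def values(self):
        return self._r.hvals(self._name)

    def items(self):
        return list(self._r.hgetall(self._name).items())

    def clear(self):
        # Remove the whole hash in one command
        self._r.delete(self._name)


class FakeRedis:
    """Hypothetical in-memory stand-in so the sketch runs without a server."""

    def __init__(self):
        self._hashes = {}

    def hlen(self, name):
        return len(self._hashes.get(name, {}))

    def hexists(self, name, key):
        return key in self._hashes.get(name, {})

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)

    def hset(self, name, key, value):
        # Store values as strings, similar to how Redis stores hash values
        self._hashes.setdefault(name, {})[key] = str(value)

    def hdel(self, name, *keys):
        for key in keys:
            self._hashes.get(name, {}).pop(key, None)

    def hkeys(self, name):
        return list(self._hashes.get(name, {}))

    def hvals(self, name):
        return list(self._hashes.get(name, {}).values())

    def hgetall(self, name):
        return dict(self._hashes.get(name, {}))

    def delete(self, *names):
        for name in names:
            self._hashes.pop(name, None)


m = RedisHashMap(FakeRedis(), 'demo')
m['foo'] = 1
print(len(m), 'foo' in m, m['foo'])
m.clear()
print(len(m))
```

With a running server, passing redis.Redis() in place of FakeRedis() should behave the same way, with one difference worth noting: under Python 3, redis-py returns keys and values as bytes by default, whereas this stand-in returns str.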
Outline of each method that uses a specific Redis command:

__len__() : This uses the hlen command to get the number of elements in the hash map
__contains__() : This uses the hexists command to check if an element exists in the hash map
__getitem__() : This uses the hget command to get a value from the hash map
__setitem__() : This uses the hset command to set a value in the hash map
__delitem__() : This uses the hdel command to remove a value from the hash map
keys() : This uses the hkeys command to get all the keys in the hash map
values() : This uses the hvals command to get all the values in the hash map
items() : This uses the hgetall command to get a dictionary containing all the keys and values in the hash map
clear() : This uses the delete command to remove the entire hash map from Redis

Author : mathemagic