Python - Cumulative Mean of Dictionary keys

Last Updated : 02 May, 2023

Given a list of dictionaries, the task is to write a Python program that computes, for each key, the mean of its values across all the dictionaries.

Input : test_list = [{'gfg' : 34, 'is' : 8, 'best' : 10},
                     {'gfg' : 1, 'for' : 10, 'geeks' : 9, 'and' : 5, 'best' : 12},
                     {'geeks' : 8, 'find' : 3, 'gfg' : 3, 'best' : 8}]

Output : {'gfg': 12.666666666666666, 'is': 8, 'best': 10, 'for': 10, 'geeks': 8.5, 'and': 5, 'find': 3}

Explanation : 'best' has three values, 10, 12 and 8. Their mean is 10, so the result maps 'best' to 10.

Input : test_list = [{'gfg' : 34, 'is' : 8, 'best' : 10},
                     {'gfg' : 1, 'for' : 10, 'and' : 5, 'best' : 12},
                     {'find' : 3, 'gfg' : 3, 'best' : 8}]

Output : {'gfg': 12.666666666666666, 'is': 8, 'best': 10, 'for': 10, 'and': 5, 'find': 3}

Explanation : 'best' has three values, 10, 12 and 8. Their mean is 10, so the result maps 'best' to 10.
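
The arithmetic in the explanations can be checked directly. The sketch below assumes the standard-library statistics.mean, which the methods that follow also use. The variable names are illustrative. When integer inputs average to a whole number, mean returns it as an int, which is why the expected output shows 10 rather than 10.0.

```python
from statistics import mean

# Values of 'best' across the three dictionaries
best_mean = mean([10, 12, 8])

# Values of 'gfg' and 'geeks' from the first example
gfg_mean = mean([34, 1, 3])
geeks_mean = mean([9, 8])

print(best_mean, gfg_mean, geeks_mean)
```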

Method #1 : Using mean() + loop

In this method, a loop visits every dictionary and collects the values of each key into a list stored in a result dictionary. The mean of each list is then computed with mean() from the statistics module.

Python3

# Python3 code to demonstrate working of
# Cumulative Keys Mean in Dictionary List
# Using loop + mean()
from statistics import mean

# initializing list
test_list = [{'gfg' : 34, 'is' : 8, 'best' : 10},
             {'gfg' : 1, 'for' : 10, 'geeks' : 9, 'and' : 5, 'best' : 12},
             {'geeks' : 8, 'find' : 3, 'gfg' : 3, 'best' : 8}]
             
# printing original list
print("The original list is : " + str(test_list))

res = dict()
for sub in test_list:
    for key, val in sub.items():
        if key in res:
            
            # key seen before: add this value to the
            # list collected from earlier dictionaries
            res[key].append(val)
        else:
            res[key] = [val]

for key, num_l in res.items():
    res[key] = mean(num_l)

# printing result
print("The Extracted average : " + str(res))

Output:

The original list is : [{'gfg': 34, 'is': 8, 'best': 10}, {'gfg': 1, 'for': 10, 'geeks': 9, 'and': 5, 'best': 12}, {'geeks': 8, 'find': 3, 'gfg': 3, 'best': 8}]

The Extracted average : {'gfg': 12.666666666666666, 'is': 8, 'best': 10, 'for': 10, 'geeks': 8.5, 'and': 5, 'find': 3}

Time Complexity: O(n), where n is the total number of key-value pairs across all dictionaries.
Auxiliary Space: O(n)
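
The per-key lists are not strictly required. As a minimal sketch of a variant (not the method shown above), each key can keep a running total and a count, and the mean is obtained by dividing the two. The names totals and counts are illustrative. Unlike statistics.mean, true division always yields a float here, so 'best' comes out as 10.0.

```python
test_list = [{'gfg': 34, 'is': 8, 'best': 10},
             {'gfg': 1, 'for': 10, 'geeks': 9, 'and': 5, 'best': 12},
             {'geeks': 8, 'find': 3, 'gfg': 3, 'best': 8}]

totals = {}  # running sum of values per key
counts = {}  # number of occurrences per key
for sub in test_list:
    for key, val in sub.items():
        totals[key] = totals.get(key, 0) + val
        counts[key] = counts.get(key, 0) + 1

# mean = total / occurrences, computed once per key
res = {key: totals[key] / counts[key] for key in totals}

print("The Extracted average : " + str(res))
```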

Method #2 : Using defaultdict() + mean()

In this method, the grouping is handled by defaultdict(list), which creates an empty list the first time a key is accessed. This removes the explicit membership check and makes the code more concise.

Python3

# Python3 code to demonstrate working of
# Cumulative Keys Mean in Dictionary List
# Using defaultdict() + mean()
from statistics import mean
from collections import defaultdict

# initializing list
test_list = [{'gfg' : 34, 'is' : 8, 'best' : 10},
             {'gfg' : 1, 'for' : 10, 'geeks' : 9, 'and' : 5, 'best' : 12},
             {'geeks' : 8, 'find' : 3, 'gfg' : 3, 'best' : 8}]
             
# printing original list
print("The original list is : " + str(test_list))

# defaultdict(list) creates the list for a new key automatically
res = defaultdict(list)
for sub in test_list:
    for key, val in sub.items():
        res[key].append(val)
        
res = dict(res)
for key, num_l in res.items():
    
    # computing mean
    res[key] = mean(num_l)

# printing result
print("The Extracted average : " + str(res))

Output:

The original list is : [{'gfg': 34, 'is': 8, 'best': 10}, {'gfg': 1, 'for': 10, 'geeks': 9, 'and': 5, 'best': 12}, {'geeks': 8, 'find': 3, 'gfg': 3, 'best': 8}]

The Extracted average : {'gfg': 12.666666666666666, 'is': 8, 'best': 10, 'for': 10, 'geeks': 8.5, 'and': 5, 'find': 3}

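Time Complexity: O(n), where n is the total number of key-value pairs across all dictionaries; each pair is appended once and read once by mean().
Auxiliary Space: O(n)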
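
To see why no membership check is needed, the short sketch below shows how defaultdict(list) behaves when a key is accessed for the first time. The variable names groups and plain are illustrative.

```python
from collections import defaultdict

groups = defaultdict(list)

# 'gfg' is missing, so an empty list is created before the append
groups['gfg'].append(34)

# the key exists now, so the value goes into the same list
groups['gfg'].append(1)

# convert back to a regular dict once grouping is done
plain = dict(groups)
print(plain)
```

Merely reading a missing key on a defaultdict also inserts it, which is one reason to convert to a plain dict after grouping.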

Method #3: Using pandas library

Import the pandas library.
Create a pandas DataFrame from the test_list; keys missing from a dictionary become NaN.
Use the melt function to transform the DataFrame from wide to long format, with one row for each (dictionary, key) cell.
Use the groupby function to group the rows by key and calculate the mean of the values for each key; NaN entries are skipped.
Convert the resulting pandas Series to a dictionary.

Python3

import pandas as pd

# initializing list
test_list = [{'gfg' : 34, 'is' : 8, 'best' : 10},
             {'gfg' : 1, 'for' : 10, 'geeks' : 9, 'and' : 5, 'best' : 12},
             {'geeks' : 8, 'find' : 3, 'gfg' : 3, 'best' : 8}]

# create pandas DataFrame from test_list
df = pd.DataFrame(test_list)

# transform DataFrame from wide to long format
df = df.melt(var_name='key', value_name='value')

# group DataFrame by keys and calculate mean of values for each key
res = df.groupby('key')['value'].mean().to_dict()

# print result
print("The Extracted average : " + str(res))

Output:

The Extracted average : {'and': 5.0, 'best': 10.0, 'find': 3.0, 'for': 10.0, 'geeks': 8.5, 'gfg': 12.666666666666666, 'is': 8.0}

Time complexity: O(n*logn), where n is the total number of key-value pairs in the test_list.
Auxiliary space: O(n), where n is the total number of key-value pairs in the test_list.

Method #4: Using setdefault() and a dictionary comprehension

Create a list of dictionaries, test_list.
Create an empty dictionary, res.
Loop over each dictionary d in test_list.
Loop over each key-value pair (key, val) in d.
Call res.setdefault(key, []) so that a missing key is first mapped to an empty list, then append val to that list.
Build a new dictionary, res_mean, with a dictionary comprehension over the key-value pairs (key, val) of res.
For each key, compute the mean of its list of values, val, using the mean function from the statistics module, and store it in res_mean under that key.
Print res_mean as a string, prefixed with a message indicating that it contains the extracted averages.

Python3

from statistics import mean

test_list = [{'gfg': 34, 'is': 8, 'best': 10},
             {'gfg': 1, 'for': 10, 'geeks': 9,
              'and': 5, 'best': 12},
             {'geeks': 8, 'find': 3, 'gfg': 3, 'best': 8}]

res = {}
for d in test_list:
    for key, val in d.items():
        res.setdefault(key, []).append(val)

res_mean = {key: mean(val) for key, val in res.items()}
print("The Extracted average : " + str(res_mean))

Output:

The Extracted average : {'gfg': 12.666666666666666, 'is': 8, 'best': 10, 'for': 10, 'geeks': 8.5, 'and': 5, 'find': 3}

Time complexity: O(nk), where n is the number of dictionaries in test_list and k is the average number of keys in each dictionary.
Auxiliary space: O(mk), where m is the number of unique keys in all the dictionaries in test_list and k is the average number of values associated with each key.
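
The same steps can be wrapped in a reusable function. This is a sketch under stated assumptions: the function name cumulative_mean is hypothetical, and the input is assumed to be an iterable of dictionaries with numeric values.

```python
from statistics import mean


def cumulative_mean(dicts):
    """Return {key: mean of that key's values across all dicts}."""
    grouped = {}
    for d in dicts:
        for key, val in d.items():
            # map a new key to an empty list, then append the value
            grouped.setdefault(key, []).append(val)
    # every list holds at least one value, so mean() never sees empty data
    return {key: mean(vals) for key, vals in grouped.items()}


test_list = [{'gfg': 34, 'is': 8, 'best': 10},
             {'gfg': 1, 'for': 10, 'geeks': 9, 'and': 5, 'best': 12},
             {'geeks': 8, 'find': 3, 'gfg': 3, 'best': 8}]

res_mean = cumulative_mean(test_list)
print("The Extracted average : " + str(res_mean))
```

An empty input list simply yields an empty dictionary, since no key is ever grouped.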