Evalutation_code_for_participant

This document provides a Python script for participants to evaluate their model submissions using labeled training data. It outlines the required datasets, the structure of the input files, and the necessary checks for data integrity before evaluation. The script merges predicted results with actual outcomes to calculate and display the accuracy of the predictions.

Uploaded by

Anuja Hardaha

import pandas as pd

import sys

# Instructions for participants :

'''
Participants can run this code on labeled train/out-of-sample data to mimic the
evaluation process.
### Datasets required:
This script takes in 3 files as follows:

primary_submission.csv -> This contains match_id, dataset_type, win_pred_team_id,
win_pred_score, train_algorithm, is_ensemble, train_hps_trees, train_hps_depth,
train_hps_lr, *top 10 feature values. This is the file submitted by the participant.
secondary_submission.csv -> This contains feature_name, feature_description,
model_feature_importance_rank, model_feature_importance_percentage,
feature_correlation_dep_var. This is the file submitted by the participant.
dep_var.csv -> This contains match_id, dataset_type, win_team_id. Participants
can generate it from the labeled train data.

Please ensure that the win_pred_score column does not have any null values and
that the column names match the ones the checks in this script look up. Note that
those checks read 'match id' (with a space), 'winner_id', indep_feat_id1 to
indep_feat_id10 and, for the secondary file, feat_id, feat_name, feat_description,
model_feat_imp_train and feat_rank_train.
Please ensure that all these files are stored as ',' separated csv files.
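
A minimal sketch of how the dependent variable file could be generated from labeled
train data. It assumes the train data carries the 'match id' and 'winner_id' columns
that the merge and getAccuracy steps of this script read. The build_dep_var helper,
the 'train' label and the toy rows are illustrative assumptions, not part of the
original script.

```python
import pandas as pd


def build_dep_var(train_df, dataset_type="train"):
    # Keep only the match key and the actual winner, one row per match.
    dep_var = train_df[["match id", "winner_id"]].drop_duplicates().copy()
    # Add the dataset_type column between the key and the label.
    dep_var.insert(1, "dataset_type", dataset_type)
    return dep_var


# Toy stand-in for the labeled train data (all values are made up).
train_df = pd.DataFrame({
    "match id": [101, 102, 103],
    "team1_id": [1, 2, 3],
    "team2_id": [4, 5, 6],
    "winner_id": [1, 5, 3],
})

dep_var = build_dep_var(train_df)
print(dep_var)

# To write the file passed as the third argument (hypothetical file name):
# dep_var.to_csv("dep_var.csv", index=False)
```

With real data, train_df would come from pd.read_csv on the labeled train file
instead of the toy frame above.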

### How to use:

To use this, open a command line terminal and call the evaluation script, passing
the locations of the primary submission, secondary submission and dependent
variable files, in that order. The script exits unless exactly three file
locations are passed.
Sample example of using the command line to run the script (only the first
location is shown here):

python Evaluation_Code.py C/Users/anujahardaha/Documents/final_predictions_with_temp1.csv

'''

def checkDataType1(df):
    assert df['match id'].isna().sum() == 0, 'match id should not have NaNs'
    assert df['match id'].dtype == 'int64', 'match id is not int64 type'
    assert df['win_pred_team_id'].isna().sum() == 0, (
        'win_pred_team_id should not have NaNs')
    assert df['win_pred_team_id'].dtype == 'int64', (
        'win_pred_team_id is not int64 type')
    assert df['win_pred_score'].isna().sum() == 0, (
        'win_pred_score should not have NaNs')
    assert df['win_pred_score'].dtype == 'float64', (
        'win_pred_score is not float64 type')
    assert df['train_algorithm'].isna().sum() == 0, (
        'train_algorithm should not have NaNs')
    assert df['train_algorithm'].dtype == 'object', (
        'train_algorithm is not object type')
    assert df['is_ensemble'].isna().sum() == 0, (
        'is_ensemble should not have NaNs')
    assert df['is_ensemble'].dtype == 'object', (
        'is_ensemble is not object type')
    assert df['train_hps_trees'].isna().sum() == 0, (
        'train_hps_trees should not have NaNs')
    assert df['train_hps_depth'].isna().sum() == 0, (
        'train_hps_depth should not have NaNs')
    assert df['train_hps_lr'].isna().sum() == 0, (
        'train_hps_lr should not have NaNs')
    return None

def checkDataType2(df):
    assert df['feat_id'].isna().sum() == 0, 'feat_id should not have NaNs'
    assert df['feat_id'].dtype == 'int64', 'feat_id is not int type'
    assert df['feat_name'].isna().sum() == 0, 'feat_name should not have NaNs'
    assert df['feat_name'].dtype == 'object', 'feat_name is not object type'
    assert df['feat_description'].isna().sum() == 0, (
        'feat_description should not have NaNs')
    assert df['feat_description'].dtype == 'object', (
        'feat_description is not object type')
    assert df['model_feat_imp_train'].isna().sum() == 0, (
        'model_feat_imp_train should not have NaNs')
    assert df['model_feat_imp_train'].dtype == 'float64', (
        'model_feat_imp_train is not float type')
    assert df['feat_rank_train'].isna().sum() == 0, (
        'feat_rank_train should not have NaNs')
    assert df['feat_rank_train'].dtype == 'int64', (
        'feat_rank_train is not int64 type')
    return None

def getAccuracy(df):
    # Percentage of rows where the predicted winner matches the actual winner.
    return round(
        df[df['winner_id'] == df['win_pred_team_id']].shape[0]*100/df.shape[0], 4)

if len(sys.argv) != 4:
    sys.exit("Please pass three files only as mentioned in the Instructions.")

# Location of submission file. Header here should include match_id, dataset_type,
# win_team_id. The file should be comma separated.
input1_address = sys.argv[1]
df_input1 = pd.read_csv(input1_address, sep=",", header=0)

input2_address = sys.argv[2]
df_input2 = pd.read_csv(input2_address, sep=",", header=0)

# For participants Team : Location of Dependent Variable file. Header here would be
# match_id, dataset_type, win_team_id. Participants can generate from the labeled
# train data. These files are comma separated
round_eval = sys.argv[3]
df_round = pd.read_csv(round_eval, sep=",", header=0)

assert set(['match id', 'dataset_type', 'win_pred_team_id', 'win_pred_score',
            'train_algorithm', 'is_ensemble', 'train_hps_trees',
            'train_hps_depth', 'train_hps_lr']).issubset(
    set(df_input1.columns.tolist())), (
    'Required columns not present in primary submission file')
assert set(['indep_feat_id1', 'indep_feat_id2', 'indep_feat_id3', 'indep_feat_id4',
            'indep_feat_id5', 'indep_feat_id6', 'indep_feat_id7', 'indep_feat_id8',
            'indep_feat_id9', 'indep_feat_id10']).issubset(
    set(df_input1.columns.tolist())), (
    'Required independent feature columns not present in primary submission file')
assert set(['feat_id', 'feat_name', 'feat_description', 'model_feat_imp_train',
            'feat_rank_train']).issubset(
    set(df_input2.columns.tolist())), (
    'Required columns not present in secondary submission file')

checkDataType1(df_input1)
checkDataType2(df_input2)

'''
shape_before_join = df_round.shape[0]

r1_size = df_input1[df_input1['dataset_type'] == 'r1'].shape[0]

assert (r1_size == df_round.shape[0]), (
    'R1 data size in input file is incorrect. Expected rowsize 271 '
    f'not equal to r1 dataset_type present {r1_size}')
'''

# merging predicted file and dependent variable file

eval_data = pd.merge(df_round, df_input1, on=['match id'],
                     how='inner').drop_duplicates()
assert eval_data.shape[0] == df_round.shape[0], (
    'match ids in submission template does not match eval data')

print('All checks passed...')

print('Accuracy: ', round(getAccuracy(eval_data), 2))
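
The merge and accuracy steps above can be tried on toy data before running the full script. The sketch below is illustrative only: the two small frames use made-up values, and `accuracy_pct` is a hypothetical helper that mirrors the arithmetic in `getAccuracy`.

```python
import pandas as pd


def accuracy_pct(df):
    # Share of rows, in percent, where the predicted winner equals the actual one.
    hits = (df["winner_id"] == df["win_pred_team_id"]).sum()
    return round(hits * 100 / df.shape[0], 4)


# Toy dependent-variable rows (actual winners) and toy predictions.
actual = pd.DataFrame({"match id": [1, 2, 3, 4],
                       "winner_id": [10, 20, 30, 40]})
pred = pd.DataFrame({"match id": [1, 2, 3, 4],
                     "win_pred_team_id": [10, 21, 30, 40]})

# Inner join on the match key, as in the script; every actual row must survive.
merged = pd.merge(actual, pred, on=["match id"], how="inner").drop_duplicates()
assert merged.shape[0] == actual.shape[0], "match ids do not line up"

acc = accuracy_pct(merged)
print("Accuracy:", acc)  # 3 of 4 predictions are correct -> 75.0
```

If a match id is missing from the predictions, the inner join drops that row and the row-count assertion fails, which is the same safeguard the script applies before printing the accuracy.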
