This tutorial shows how to compute counterfactual explanations for positively predicted instances. We use movie viewing data (MovieLens 1M), where the goal is to predict gender (the positive class is a 'FEMALE' user). A counterfactual explanation is a set of movies such that removing them from the user's viewing history changes the predicted class from 'FEMALE' to 'MALE'.

Import the libraries and load the SEDC module.

In [1]:
import pandas as pd
import numpy as np
import sedc_algorithm
from function_edc import fn_1 
import scipy.sparse  # import the submodule explicitly so that scipy.sparse.csr_matrix is available
In [2]:
%run sedc_algorithm.py #run sedc_algorithm.py module

For this demonstration, we use the Movielens 1M data set, which contains movie viewing behavior of users. The target variable is binary (taking value 1 if gender = 'FEMALE' and 0 if gender = 'MALE').

In [24]:
target = pd.read_csv('target_ML1M.csv')
target = 1 - target
data = pd.read_csv('data_ML1M.csv')
feature_names = pd.read_csv('feature_names_ML1M.csv')

Split the data into a training set and a test set (80%/20%). We use the fine-tuned MLP hyperparameter configuration reported by De Cnudde et al. (2018) in 'An exploratory study towards applying and demystifying deep learning classification on behavioral big data'. We then train the MLP classifier on the training set.

In [47]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(scipy.sparse.csr_matrix(data.iloc[:,1:3707].values), target.iloc[:,1], test_size=0.2, random_state=0)
In [48]:
from sklearn.neural_network import MLPClassifier
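# Note: with solver='lbfgs', scikit-learn does not use learning_rate, learning_rate_init,
# early_stopping, or batch_size; they are kept here to mirror the reported configuration.
MLP_model = MLPClassifier(activation='relu', learning_rate_init=0.30452, alpha=0.0001,
                          learning_rate='adaptive', early_stopping=True,
                          hidden_layer_sizes=(532,135,1009), solver='lbfgs', batch_size=100)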
MLP_model.fit(x_train, y_train)
Out[48]:
MLPClassifier(activation='relu', alpha=0.0001, batch_size=100, beta_1=0.9,
              beta_2=0.999, early_stopping=True, epsilon=1e-08,
              hidden_layer_sizes=(532, 135, 1009), learning_rate='adaptive',
              learning_rate_init=0.30452, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='lbfgs', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)

Calculate the Area under the ROC curve (AUC) of the model on the test set.

In [49]:
from sklearn.metrics import roc_auc_score

Scores = MLP_model.predict_proba(x_test)[:,1] #predict scores using the trained MLP model
AUC = roc_auc_score(y_test,Scores) #output AUC of the model 
print("The AUC of the model is %f" %AUC)
The AUC of the model is 0.815354
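
The AUC can be read as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that pairwise definition, using invented labels and scores rather than the Movielens data. `pairwise_auc` is a hypothetical helper written for illustration and is not part of scikit-learn.

```python
import numpy as np

def pairwise_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Toy labels and scores, invented for illustration.
y_toy = [1, 0, 1, 0, 0]
s_toy = [0.9, 0.4, 0.35, 0.2, 0.1]
toy_auc = pairwise_auc(y_toy, s_toy)
print("Toy AUC: %f" % toy_auc)
```

Five of the six positive-negative pairs are ranked correctly here, so the toy AUC is 5/6. The pairwise comparison grows with the product of the class sizes, so for a real test set `roc_auc_score` remains the practical choice.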

Predict the 25% highest-scoring test instances as positive (gender = 'FEMALE'), e.g., because of a limited targeting budget. Obtain the indices of the test instances that are predicted as 'FEMALE', i.e., the instances for which the model is most confident that the user is 'FEMALE'.

In [50]:
probs = MLP_model.predict_proba(x_test)[:,1]
threshold_classifier_probs = np.percentile(probs,75) 
predictions_probs = (probs>=threshold_classifier_probs)
indices_probs_pos = np.nonzero(predictions_probs)#indices of the test instances that are positively-predicted
In [65]:
probs[4] >= threshold_classifier_probs
Out[65]:
False
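
The thresholding step can be checked on a small array. This is a sketch with invented scores; the `toy_` names are illustrative and do not appear elsewhere in the notebook.

```python
import numpy as np

# Eight invented scores standing in for the predict_proba output.
toy_probs = np.array([0.10, 0.80, 0.30, 0.95, 0.20, 0.60, 0.05, 0.40])

# The 75th percentile is the cut-off at or above which the top quarter of scores lies.
toy_threshold = np.percentile(toy_probs, 75)

# Boolean mask of positive predictions, then their positions in the array.
toy_predictions = toy_probs >= toy_threshold
toy_indices_pos = np.nonzero(toy_predictions)[0]

print(toy_threshold, toy_indices_pos)
```

Note that `np.nonzero` returns a tuple with one array per dimension, which is why the notebook's `indices_probs_pos` is displayed as `(array([...]),)`. Taking element `[0]` yields the index array itself.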
In [61]:
classification_model = MLP_model 

def classifier_fn(X):
    c=classification_model.predict_proba(X)
    y_predicted_proba=c[:,1]
    return y_predicted_proba

Create an SEDC explainer object. By default, the SEDC algorithm stops searching for explanations when it finds a first explanation, when a 5-minute time limit is exceeded, or when more than 50 iterations are required (see edc_agnostic.py for more details). Only the active (nonzero) features are perturbed (set to zero) to evaluate the impact on the model's predicted output. In other words, only the movies that a user has watched can become part of the counterfactual explanation of the model prediction.

In [52]:
explainer_SEDC = SEDC_Explainer(feature_names = np.array(feature_names.iloc[:,1]), 
                               threshold_classifier = threshold_classifier_probs, 
                               classifier_fn = classifier_fn)
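
The search idea can be sketched in a few lines. What follows is a simplified best-first search written for illustration; it is not the code in `sedc_algorithm.py`, and it omits the time limit and any other refinements of the real explainer. `toy_classifier_fn`, its weights, and `best_first_counterfactual` are hypothetical.

```python
import numpy as np

def toy_classifier_fn(X):
    """Hypothetical linear scorer with a sigmoid; stands in for classifier_fn."""
    weights = np.array([1.5, 0.2, 1.0, -0.5, 0.7])
    return 1.0 / (1.0 + np.exp(-(X @ weights - 1.1)))

def best_first_counterfactual(instance, score_fn, threshold, max_iter=50):
    """Zero out sets of active features until the score drops below the threshold."""
    active = [int(i) for i in np.flatnonzero(instance)]
    candidates = [(i,) for i in active]   # feature sets to evaluate in this round
    expanded = set()                      # sets already used as a starting point
    scored = {}                           # feature set -> score after removal
    for _ in range(max_iter):
        for combo in candidates:
            perturbed = instance.copy()
            perturbed[list(combo)] = 0    # "remove" the features in this set
            scored[combo] = float(score_fn(perturbed[None, :])[0])
        found = [c for c in candidates if scored[c] < threshold]
        if found:
            return found                  # the predicted class changed
        # Expand the not-yet-expanded set with the lowest score so far.
        open_sets = [c for c in scored if c not in expanded]
        if not open_sets:
            return []
        best = min(open_sets, key=scored.get)
        expanded.add(best)
        candidates = [tuple(sorted(best + (i,))) for i in active if i not in best]
        candidates = [c for c in candidates if c not in scored]
    return []

# Feature 3 is inactive (zero), so it is never considered for removal.
toy_instance = np.array([1.0, 1.0, 1.0, 0.0, 1.0])
toy_explanations = best_first_counterfactual(toy_instance, toy_classifier_fn, threshold=0.5)
print(toy_explanations)
```

On this toy instance no single removal pushes the score below 0.5, so the search expands the most promising single feature (index 0) and finds that removing features 0 and 2 together changes the predicted class.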

Show indices of positively-predicted test instances.

In [53]:
indices_probs_pos #all instances that are predicted as 'FEMALE'
Out[53]:
(array([   1,    2,    6,   13,   17,   18,   27,   35,   39,   45,   46,
          47,   50,   51,   53,   56,   68,   72,   92,   96,   98,  105,
         109,  113,  118,  121,  126,  129,  132,  134,  145,  151,  155,
         159,  165,  178,  182,  184,  187,  188,  193,  194,  196,  205,
         207,  208,  209,  210,  212,  217,  218,  225,  226,  229,  232,
         236,  251,  256,  260,  266,  267,  270,  274,  278,  281,  286,
         289,  299,  300,  304,  311,  326,  327,  334,  335,  337,  344,
         345,  347,  348,  357,  359,  362,  364,  370,  373,  376,  377,
         379,  381,  386,  388,  390,  392,  393,  400,  402,  404,  405,
         406,  410,  418,  422,  426,  428,  429,  432,  434,  435,  438,
         440,  441,  446,  447,  448,  449,  450,  452,  457,  459,  463,
         481,  491,  492,  495,  500,  502,  512,  516,  517,  518,  528,
         530,  531,  536,  538,  541,  543,  544,  545,  553,  560,  561,
         562,  567,  570,  580,  582,  585,  589,  593,  602,  604,  606,
         611,  614,  618,  629,  631,  638,  646,  652,  655,  656,  658,
         662,  664,  668,  673,  674,  675,  683,  684,  686,  692,  693,
         705,  707,  708,  712,  718,  721,  726,  727,  728,  729,  733,
         734,  736,  742,  746,  764,  774,  780,  782,  785,  787,  794,
         797,  798,  799,  802,  804,  809,  813,  835,  839,  851,  853,
         861,  862,  864,  868,  871,  872,  875,  879,  881,  884,  889,
         891,  893,  902,  903,  905,  908,  909,  913,  914,  916,  919,
         921,  931,  932,  933,  935,  951,  958,  959,  961,  972,  975,
         979,  985,  988,  991,  992,  997,  998, 1004, 1005, 1012, 1014,
        1015, 1017, 1026, 1029, 1031, 1040, 1041, 1047, 1049, 1052, 1058,
        1061, 1064, 1065, 1073, 1077, 1082, 1083, 1093, 1095, 1096, 1098,
        1099, 1102, 1109, 1110, 1111, 1112, 1114, 1116, 1119, 1131, 1144,
        1145, 1148, 1149, 1151, 1153, 1154, 1157, 1158, 1159, 1163, 1165,
        1168, 1171, 1181, 1190, 1196], dtype=int64),)

Explain why the user with index = 17 is predicted as a 'FEMALE' user by the model.

In [85]:
index = 17
instance_idx = x_test[index]
explanation = explainer_SEDC.explanation(instance_idx)
Initialization is complete.

 Elapsed time 0 


 Iteration 1 

The difference is 0.132827
Index is 7.000000
Length of new_combinations is 1 features.
New combination cannot be expanded

 Elapsed time 0 


 Size combis to expand 103 

Iterations are done.

 Elapsed time 0 

Show the explanation(s) found.

In [86]:
explanation[0]
Out[86]:
[['Birdcage']]
In [87]:
print("IF the user did not watch the movie(s) " + str(explanation[0][0]) + ", THEN the predicted class would change from 'FEMALE' to 'MALE'.")
IF the user did not watch the movie(s) ['Birdcage'], THEN the predicted class would change from 'FEMALE' to 'MALE'.

Explain why the user with index = 13 is predicted as a 'FEMALE' user by the model.

In [89]:
index = 13
instance_idx = x_test[index]
explanation = explainer_SEDC.explanation(instance_idx)
Initialization is complete.

 Elapsed time 0 


 Iteration 1 

The difference is 0.000000
Index is 6.000000
Length of new_combinations is 1 features.
New combinations can be expanded
Threshold is 0.000006

 Elapsed time 0 


 Size combis to expand 344 


 Iteration 2 

The difference is 0.000006
Index is 68.000000
Length of new_combinations is 2 features.
New combinations can be expanded
Threshold is 0.000339

 Elapsed time 0 


 Size combis to expand 514 


 Iteration 3 

The difference is 0.000339
Index is 34.000000
Length of new_combinations is 3 features.
New combinations can be expanded
Threshold is 0.012820

 Elapsed time 1 


 Size combis to expand 683 


 Iteration 4 

The difference is 0.012820
Index is 22.000000
Length of new_combinations is 4 features.
New combinations can be expanded
Threshold is 0.157805

 Elapsed time 2 


 Size combis to expand 851 


 Iteration 5 

The difference is 0.157805
Index is 67.000000
Length of new_combinations is 5 features.
New combination cannot be expanded

 Elapsed time 2 


 Size combis to expand 851 

Iterations are done.

 Elapsed time 2 

In [90]:
print("IF the user did not watch the movie(s) " + str(explanation[0][0]) + ", THEN the predicted class would change from 'FEMALE' to 'MALE'.")
IF the user did not watch the movie(s) ['Strictly Ballroom (1992)', 'Benny & Joon (1993)', 'Shakespeare in Love (1998)', 'Secrets & Lies (1996)', "Smilla's Sense of Snow (1997)"], THEN the predicted class would change from 'FEMALE' to 'MALE'.

Show more information about the explanation(s). The returned tuple contains: explanation[0], the explanation set(s); explanation[1], the number of active features of the instance to explain; explanation[2], the number of explanations found; explanation[3], the number of features in the smallest explanation; explanation[4], the time elapsed in seconds to find the explanation; explanation[5], the change in predicted score when removing the feature(s) in the smallest explanation; and explanation[6], the number of iterations that the algorithm needed.

In [91]:
explanation
Out[91]:
([['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   "Smilla's Sense of Snow (1997)"]],
 173,
 10,
 5,
 2.3148410320281982,
 [array([0.39455283])],
 5)
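
Positional indexing into the returned tuple is easy to get wrong. One option, shown here as a sketch, is to wrap it in a `namedtuple`. The field names below are hypothetical labels chosen to match the description of each position; they are not defined by `sedc_algorithm.py`.

```python
from collections import namedtuple

# Hypothetical field names, one per position of the returned tuple.
SEDCResult = namedtuple("SEDCResult", [
    "explanation_sets", "n_active_features", "n_explanations",
    "min_explanation_size", "seconds_elapsed", "score_changes", "n_iterations",
])

# Values copied from Out[91]; the score change is written as a plain float here.
raw = (
    [["Strictly Ballroom (1992)", "Benny & Joon (1993)",
      "Shakespeare in Love (1998)", "Secrets & Lies (1996)",
      "Smilla's Sense of Snow (1997)"]],
    173, 10, 5, 2.3148410320281982, [0.39455283], 5,
)
result = SEDCResult(*raw)
print(result.min_explanation_size, result.n_iterations)
```

In the notebook itself, `SEDCResult(*explanation)` would give the same named access to the explainer's output.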

Show the first 10 explanations found by the SEDC algorithm for the user with index = 13. To do so, we set max_explained to 10.

In [93]:
explainer_SEDC2 = SEDC_Explainer(feature_names = np.array(feature_names.iloc[:,1]), 
                               threshold_classifier = threshold_classifier_probs, 
                               classifier_fn = classifier_fn, max_explained = 10)
In [94]:
index = 13
instance_idx = x_test[index]
explanation = explainer_SEDC2.explanation(instance_idx)
Initialization is complete.

 Elapsed time 0 


 Iteration 1 

The difference is 0.000000
Index is 6.000000
Length of new_combinations is 1 features.
New combinations can be expanded
Threshold is 0.000006

 Elapsed time 0 


 Size combis to expand 344 


 Iteration 2 

The difference is 0.000006
Index is 68.000000
Length of new_combinations is 2 features.
New combinations can be expanded
Threshold is 0.000339

 Elapsed time 0 


 Size combis to expand 514 


 Iteration 3 

The difference is 0.000339
Index is 34.000000
Length of new_combinations is 3 features.
New combinations can be expanded
Threshold is 0.012820

 Elapsed time 1 


 Size combis to expand 683 


 Iteration 4 

The difference is 0.012820
Index is 22.000000
Length of new_combinations is 4 features.
New combinations can be expanded
Threshold is 0.157805

 Elapsed time 2 


 Size combis to expand 851 


 Iteration 5 

The difference is 0.157805
Index is 67.000000
Length of new_combinations is 5 features.
New combination cannot be expanded

 Elapsed time 2 


 Size combis to expand 851 

Iterations are done.

 Elapsed time 2 

The algorithm finds 10 explanations after 5 iterations. The elapsed time is about 2 seconds. The instance has 173 active features (movies watched).

In [96]:
explanation
Out[96]:
([['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   "Smilla's Sense of Snow (1997)"],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'While You Were Sleeping (1995)'],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'Circle of Friends (1995)'],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   "What's Eating Gilbert Grape (1993)"],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'Edward Scissorhands (1990)'],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'Mask of Zorro'],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'Bridges of Madison County'],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'Elizabeth (1998)'],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'Shine (1996)'],
  ['Strictly Ballroom (1992)',
   'Benny & Joon (1993)',
   'Shakespeare in Love (1998)',
   'Secrets & Lies (1996)',
   'Dead Again (1991)']],
 173,
 10,
 5,
 2.196092128753662,
 [array([0.39455283]),
  array([0.35266619]),
  array([0.32251244]),
  array([0.31786395]),
  array([0.25685361]),
  array([0.23765272]),
  array([0.20522964]),
  array([0.19500219]),
  array([0.19105433]),
  array([0.18220798])],
 5)
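
The ten explanations overlap heavily: each contains the same four movies and differs only in the fifth. A short sketch of how to extract that shared core, using the first three sets from Out[96] to keep the example compact (the variable names are illustrative):

```python
# First three of the ten explanation sets, copied from Out[96].
explanation_sets = [
    ["Strictly Ballroom (1992)", "Benny & Joon (1993)",
     "Shakespeare in Love (1998)", "Secrets & Lies (1996)",
     "Smilla's Sense of Snow (1997)"],
    ["Strictly Ballroom (1992)", "Benny & Joon (1993)",
     "Shakespeare in Love (1998)", "Secrets & Lies (1996)",
     "While You Were Sleeping (1995)"],
    ["Strictly Ballroom (1992)", "Benny & Joon (1993)",
     "Shakespeare in Love (1998)", "Secrets & Lies (1996)",
     "Circle of Friends (1995)"],
]

# Movies present in every explanation set.
core = set.intersection(*(set(movies) for movies in explanation_sets))

# The movie(s) that distinguish each explanation from the shared core.
differing = [[m for m in movies if m not in core] for movies in explanation_sets]

print(sorted(core))
print(differing)
```

In the notebook, the same two lines can be applied to `explanation[0]` to summarise all ten sets at once.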