by Batuhan B
Last Updated May 23, 2020 01:19 AM

I ran an experiment comparing `binary_crossentropy` and `categorical_crossentropy`. I am trying to understand how these two loss functions behave on the same problem.

I worked on a binary classification problem with this data.

In the first experiment, I used `1` neuron in the last layer with the `sigmoid` activation function and the `binary_crossentropy` loss. I trained this model 10 times and took the average accuracy. The average accuracy was 74.12760416666666%.

The code that I used for the first experiment is below.

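```
from keras.models import Sequential
from keras.layers import Dense

# `dataset` is assumed to be a 2-D NumPy array that is already loaded:
# columns 0-7 hold the features, column 8 holds the 0/1 label.
total_acc = 0
for each_iter in range(0, 10):
    print(each_iter)
    X = dataset[:, 0:8]
    y = dataset[:, 8]
    # define the keras model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile the keras model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit the keras model on the dataset
    model.fit(X, y, epochs=150, batch_size=32)
    # evaluate the keras model (on the same data it was trained on)
    _, accuracy = model.evaluate(X, y)
    print('Accuracy: %.2f' % (accuracy*100))
    temp_acc = accuracy*100
    total_acc += temp_acc
    del model

# average accuracy over the 10 runs
print('Average accuracy: %.2f' % (total_acc / 10))
```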

In the second experiment, I used `2` neurons in the last layer with the `softmax` activation function and the `categorical_crossentropy` loss. I converted my target `y` into categorical (one-hot) form, trained this model 10 times again, and took the average accuracy. The average accuracy was 66.92708333333334%.
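To make the conversion of `y` concrete, here is a minimal plain-Python sketch of one-hot encoding for 0/1 labels. The helper `one_hot` and the example labels are made up for illustration; in the experiment itself the conversion is done by the Keras utility, not by this function.

```python
def one_hot(labels, num_classes=2):
    """Turn a list of integer-valued labels into one-hot rows.

    Hypothetical helper, for illustration only.
    """
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[int(label)] = 1.0  # put a 1 in the column of the label's class
        encoded.append(row)
    return encoded


# Made-up labels, not taken from the dataset.
y_example = [0.0, 1.0, 1.0, 0.0]
y_encoded = one_hot(y_example)
print(y_encoded)
```

Each label becomes a row with two entries, one per class, which is the target shape a 2-neuron `softmax` output expects.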

The code that I used for the second experiment is below:

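```
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

# `dataset` is assumed to be a 2-D NumPy array that is already loaded:
# columns 0-7 hold the features, column 8 holds the 0/1 label.
total_acc_v2 = 0
for each_iter in range(0, 10):
    print(each_iter)
    X = dataset[:, 0:8]
    y = dataset[:, 8]
    # one-hot encode the 0/1 labels into two columns
    y = to_categorical(y)
    # define the keras model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(2, activation='softmax'))
    # compile the keras model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # fit the keras model on the dataset
    model.fit(X, y, epochs=150, batch_size=32)
    # evaluate the keras model (on the same data it was trained on)
    _, accuracy = model.evaluate(X, y)
    print('Accuracy: %.2f' % (accuracy*100))
    temp_acc = accuracy*100
    total_acc_v2 += temp_acc
    del model

# average accuracy over the 10 runs
print('Average accuracy: %.2f' % (total_acc_v2 / 10))
```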

I think that these two experiments are equivalent and should give very similar results. What is the reason for this large difference in accuracy?
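To spell out why I expect the two setups to match, here is a minimal sketch in plain Python (not Keras). It checks that a 2-way softmax assigns class 1 the same probability as a sigmoid applied to the difference of the two logits, so the two losses agree for the same example. The helper names and the logit values are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax2(z0, z1):
    # subtract the max for numerical stability
    m = max(z0, z1)
    e0, e1 = math.exp(z0 - m), math.exp(z1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)


def binary_crossentropy(y, p):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def categorical_crossentropy(y_onehot, probs):
    return -sum(t * math.log(p) for t, p in zip(y_onehot, probs))


# Made-up logits for the two softmax outputs and a made-up true label.
z0, z1 = 0.3, 1.7
y = 1

p_softmax = softmax2(z0, z1)   # probabilities for class 0 and class 1
p_sigmoid = sigmoid(z1 - z0)   # single-output equivalent for class 1

loss_cce = categorical_crossentropy((1 - y, y), p_softmax)
loss_bce = binary_crossentropy(y, p_sigmoid)

print(p_softmax[1], p_sigmoid)
print(loss_cce, loss_bce)
```

This only shows that the loss values coincide when the outputs correspond to each other. It says nothing about how training proceeds, since the two networks have a different number of output weights and different random initializations.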

