Frequently Asked Questions
1. How do I read a confusion matrix?
A confusion matrix is a tool for evaluating the performance of a trained classifier. Here's how to interpret it:
- Matrix Layout
- Rows represent the predicted class—what the trained classifier predicted for a sample.
- Columns represent the actual class—the label assigned to the sample in the source data.
- Key Elements
- Diagonals (shaded in blue): Represent true positives and true negatives, where the model correctly predicted the class.
- True Positive: The model correctly predicts the positive class.
- True Negative: The model correctly predicts the negative class.
- White Cells (off-diagonal): Represent false positives and false negatives, indicating misclassifications. In a multi-class model, these cells highlight specific class confusions.
- Marginal Information
- Bottom Row (Class Marginals): Displays accuracy (green) and error (red) rates for each class.
- Lower Right Corner: Shows the overall accuracy (green) and overall error rate (red) for the model.
- Correspondence of Axis Labels
- Numbers on the X-axis correspond to the class labels on the Y-axis. For example, if "fan-balance (1)" is on the Y-axis, "1" on the X-axis represents the same class.
- Data Representation
- Each number in the matrix corresponds to a segment or block of data from the histogram shown in sections 3.4 and 3.6.
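The marginal quantities described above (per-class accuracy, overall accuracy, overall error) can be sketched in Python. This is a minimal illustration, not the tool's implementation: the matrix follows the layout described above (rows = predicted, columns = actual), and the labels and counts are made up for the example.

```python
import numpy as np

def confusion_matrix(y_pred, y_true, n_classes):
    """Build a confusion matrix with rows = predicted class, columns = actual class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for p, a in zip(y_pred, y_true):
        cm[p, a] += 1
    return cm

# Hypothetical labels for a 3-class problem (classes 0, 1, 2)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_pred, y_true, n_classes=3)

# Diagonal = correct predictions; column sums = actual samples per class
per_class_acc = np.diag(cm) / cm.sum(axis=0)   # the "green" class marginals
per_class_err = 1.0 - per_class_acc            # the "red" class marginals
overall_acc = np.trace(cm) / cm.sum()          # lower-right corner, green
overall_err = 1.0 - overall_acc                # lower-right corner, red
```

A good matrix concentrates counts on the diagonal, which drives both the per-class and overall accuracies toward 1.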
2. What makes a good confusion matrix?
Ideally, a good confusion matrix has the smallest possible numbers in the white cells (indicating minimal misclassifications or wrong predictions).
The confusion matrix provides valuable insights into your model's strengths and areas where it may struggle, helping you refine your classifier further.
3. How should I interpret the complexity numbers?
Complexity numbers help you gauge the model's resource consumption on the target MCU/MPU. These numbers are not final, as the model can still be optimized. For example, in section 3.9 the decision significance plot shows that most of the important frequency bands fall below 50 Hz. Knowing this, we can apply a filter so that only frequencies below 50 Hz are used in feature computation, saving resources (multiplication operations).
If you preselected the target processor at the project creation stage, the top-ranked models will be the ones that fit on that processor, meaning their resource consumption falls within the range of resources available on that target.
4. What is one-vs-one classification strategy?
One-vs-One classification strategy is used in machine learning models, particularly for multi-class classification tasks. In this approach:
- A separate binary classifier is trained for every pair of classes in the dataset.
- For example, if there are three classes (A, B, C), the model trains three classifiers: one for A vs. B, one for A vs. C, and one for B vs. C.
- During prediction, the model evaluates all pairwise classifiers and determines the final class based on a voting mechanism or other aggregation strategies.
This strategy helps break down complex multi-class problems into simpler binary classification tasks, making the decision structure easier to interpret.
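The pairwise-training-plus-voting scheme above can be sketched in Python. This is a minimal illustration of the strategy, not the tool's implementation: the nearest-centroid binary classifier and the toy 3-class dataset are made up to keep the example self-contained.

```python
from itertools import combinations
import numpy as np

class NearestCentroid:
    """Trivial binary classifier, used only to illustrate the OvO scheme."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.centroids_[None], axis=2)
        return self.classes_[d.argmin(axis=1)]

def ovo_predict(X_train, y_train, X_test):
    """Train one binary classifier per class pair, then vote."""
    classes = np.unique(y_train)
    votes = np.zeros((len(X_test), len(classes)), dtype=int)
    for a, b in combinations(range(len(classes)), 2):
        # Restrict training data to the two classes of this pair
        mask = np.isin(y_train, [classes[a], classes[b]])
        clf = NearestCentroid().fit(X_train[mask], y_train[mask])
        for i, p in enumerate(clf.predict(X_test)):
            votes[i, np.where(classes == p)[0][0]] += 1
    # Final class = the one that won the most pairwise duels
    return classes[votes.argmax(axis=1)]

# Toy data: three well-separated classes (0, 1, 2), so 3 pairwise classifiers
X_train = np.array([[0, 0], [0, 1], [10, 10], [10, 11], [20, 0], [20, 1]], float)
y_train = np.array([0, 0, 1, 1, 2, 2])
X_test = np.array([[0, 0.5], [10, 10.5], [20, 0.5]])
pred = ovo_predict(X_train, y_train, X_test)
```

With three classes the loop trains exactly three classifiers (A vs. B, A vs. C, B vs. C), matching the example in the answer above; with k classes it trains k(k-1)/2.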