Validate

The Validate feature allows users to evaluate the statistical performance of their trained models using the training sample list. This feature is essential for assessing the robustness and accuracy of your classifier before testing it on unseen data. It provides key insights into the model's ability to generalize and highlights areas for improvement.

This section explains how to interpret training results, run K-fold validation, and use the feature explanation tools to understand your model's decision-making process. Each subsection below describes the fields you will see on screen.

The validation process uses two primary approaches to evaluate the trained tool's performance:

  • Training Separation: Tests prediction accuracy on 100% of the training data to assess the Base Tool’s fit to the classification problem. Accuracy lower than in the Build > Explore stage may indicate increased noise or variation in the dataset, suggesting the need for a larger dataset or a revisit of the training process.
  • K-fold Validation: A more robust method in which the dataset is split into ten parts; each part is held out for testing in turn while the remaining 90% is used for training, and the process repeats ten times. The result estimates performance on unseen data and is more representative of real-world behavior (both approaches are illustrated in the sketch below).

We recommend using K-fold accuracy as the primary benchmark for evaluating trained tools.
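
For intuition, the following minimal sketch contrasts the two approaches using scikit-learn on a synthetic dataset. It illustrates the general technique only; the model and data are placeholders, not the product's internal implementation.

```python
# Illustrative sketch: training-set accuracy vs. 10-fold cross-validation.
# Placeholder model and synthetic data; not the product's internals.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
clf = RandomForestClassifier(random_state=0)

# "Training Separation": fit and score on 100% of the training data.
train_acc = clf.fit(X, y).score(X, y)

# "K-fold Validation": 10 folds, each held out for testing exactly once.
kfold_acc = cross_val_score(clf, X, y, cv=10).mean()

print(f"training accuracy: {train_acc:.3f}")  # usually optimistic
print(f"10-fold accuracy:  {kfold_acc:.3f}")  # closer to real-world performance
```

The gap between the two scores is a quick read on overfitting: the larger it is, the less the training-set number should be trusted.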

This section contains two main subsections:

  1. Trained Tools
  2. Selected Trained Tool

Trained Tools

This subsection provides an overview of all trained tools, including key details. The table below outlines the fields displayed:

| Field Name | Description |
| --- | --- |
| Trained Tool Description | A brief description of the trained tool. |
| Version | The version number of the trained tool. |
| Created Date and Time | The timestamp when the trained tool was created. |
| Sample Rate | The rate at which data was sampled during training. |
| Target Range | The target range for regression models, showing the minimum and maximum predicted values. |

Selected Trained Tool

This subsection provides detailed information about a specific trained tool. It is divided into further categories for ease of navigation.

Base Tool Information

At the top of the section, you will see:

| Field Name | Description |
| --- | --- |
| Base Tool | The base tool used to create the trained model. |
| Version | The version number of the base tool. |
| Trained With | The dataset used to train the model. |

Training Stats

The Training Stats section provides insights into the model's performance based on the data type (the sketch after this list shows how these statistics are commonly computed):

  • For Classification Data:
    • Confusion Matrix: Displays the actual vs. predicted class performance.
    • Overall Accuracy: The percentage of correct predictions made by the model out of all predictions.
    • F1 score: A metric that balances precision and recall, useful for evaluating model performance on imbalanced datasets.
  • For Regression Data:
    • Target Value Prediction Accuracy: Visualizes actual vs. predicted values using a scatter plot.
    • Error Metrics: Includes key statistics such as R², RMSE, and MAE.
    • Error Distribution Graph: Plots the percentage range deviation (x-axis) against the point count (y-axis).
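
As a reference, here is how these statistics are commonly computed, shown with scikit-learn on toy values; this is an illustration, not the product's implementation. The class names reuse the idle/fan-blocked example from later in this section.

```python
# Illustrative sketch: the listed statistics computed with scikit-learn.
# Toy values only; not the product's internal implementation.
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             mean_absolute_error, mean_squared_error, r2_score)

# Classification: actual vs. predicted class labels.
y_true = ["idle", "idle", "fan-blocked", "idle", "fan-blocked"]
y_pred = ["idle", "fan-blocked", "fan-blocked", "idle", "fan-blocked"]
print(confusion_matrix(y_true, y_pred, labels=["idle", "fan-blocked"]))
print("overall accuracy:", accuracy_score(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred, pos_label="fan-blocked"))

# Regression: actual vs. predicted target values.
t_true = np.array([1.0, 2.0, 3.0, 4.0])
t_pred = np.array([1.1, 1.9, 3.2, 3.8])
print("R²:  ", r2_score(t_true, t_pred))
print("RMSE:", np.sqrt(mean_squared_error(t_true, t_pred)))
print("MAE: ", mean_absolute_error(t_true, t_pred))
```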

Click the Graph button at the bottom to view:

  • Data Points
  • Local Mean
  • Ideal Fit
  • Error Range: ±5%, ±10%, and ±20% (one way to read these bands is sketched below).
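
A rough sketch of one plausible reading of those bands, assuming they measure each prediction's relative deviation from the actual value (the product's exact definition may differ):

```python
# Illustrative sketch: share of predictions falling within ±5/±10/±20%
# of the actual values. The band definition here is an assumption.
import numpy as np

t_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
t_pred = np.array([1.1, 1.9, 3.2, 3.7, 5.4])

rel_dev = np.abs(t_pred - t_true) / np.abs(t_true)  # fraction of actual value
for band in (0.05, 0.10, 0.20):
    share = np.mean(rel_dev <= band)
    print(f"within ±{band:.0%}: {share:.0%} of points")
```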

K-fold Validation

This section provides detailed insights from the K-fold validation process. Follow these steps:

  1. Click Start New K-fold.
  2. Fill in the following fields in the Start New K-fold pop-up (illustrated in the sketch after these steps):
    • Folds: Specify the number of folds (e.g., 10).
    • Repetition: Enter how many times the full K-fold cycle should repeat, for a more stable estimate.
    • Holdout: Choose between Random Holdout or Grouped Holdout.
  3. Click Start to begin validation.
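
For intuition, here is a scikit-learn sketch of the same three options, assuming Grouped Holdout means that related samples (for example, all segments from one recording) stay together in the same fold; an illustration only, not the product's implementation.

```python
# Illustrative sketch: folds, repetitions, and random vs. grouped holdout,
# using scikit-learn splitters; not the product's implementation.
import numpy as np
from sklearn.model_selection import GroupKFold, RepeatedKFold

X = np.arange(20).reshape(10, 2)
groups = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])  # e.g., one recording per group

# Random holdout: 5 folds, repeated twice with fresh random splits.
rkf = RepeatedKFold(n_splits=5, n_repeats=2, random_state=0)
print(sum(1 for _ in rkf.split(X)))  # 10 train/test splits in total

# Grouped holdout: all samples from a group stay in the same fold.
gkf = GroupKFold(n_splits=5)
for train_idx, test_idx in gkf.split(X, groups=groups):
    print("test groups:", np.unique(groups[test_idx]))
```

Grouped holdout gives a more honest estimate when samples from the same source are highly correlated, since random splits would otherwise leak near-duplicates into the test fold.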

For classification datasets, the results include:

  • Confusion Matrix
  • Overall Accuracy
  • F1 Score

Use the Show Sample Level Details option to access:

| Field Name | Description |
| --- | --- |
| Sample Name | Displays the file name, start value, and end value of the sample. |
| View Sample | Allows viewing of the graphical representation of the sample. |
| Result | The predicted state (e.g., idle, fan-blocked). |
| Expected Result | The expected state or ground truth. |

Use the Filter icon to refine the data:

  • Show All
  • Show Errors
  • Show Successes
  • Show Unpredicted Samples
  • Show Excluded Samples

Click Export to CSV to download the validation results.
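
If you want to slice the exported results further, a minimal pandas sketch follows; the file and column names are hypothetical and should be matched to your actual export.

```python
# Illustrative sketch: analyzing an exported validation CSV with pandas.
# File and column names are hypothetical; adjust them to your export.
import pandas as pd

df = pd.read_csv("kfold_results.csv")  # hypothetical file name
errors = df[df["Result"] != df["Expected Result"]]  # mispredicted samples
print(f"{len(errors)} of {len(df)} samples mispredicted")
print(errors["Result"].value_counts())  # which predicted states dominate the errors
```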

Feature Explanation

The Feature Explanation tab allows deeper analysis of the model’s decision-making process:

  1. Select the Class you want to compare from the dropdown menu.
  2. Use the following options:
    • Decision Significance: View average spectral shape graphs showing the frequency (x-axis) vs. significance (y-axis).
    • Class Significance: Compare magnitude vs. frequency graphs for the selected classes (the sketch below shows how such curves can be derived).
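
As a loose analogy for the magnitude-vs-frequency view, the sketch below computes a magnitude spectrum per class with a plain FFT on synthetic signals; the product's significance measure is its own and may differ.

```python
# Illustrative sketch: per-class magnitude-vs-frequency curves via FFT.
# Synthetic signals and an assumed sample rate; not the product's method.
import numpy as np

fs = 1000  # sample rate in Hz (assumed)
t = np.arange(0, 1, 1 / fs)
idle = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.randn(t.size)
blocked = np.sin(2 * np.pi * 120 * t) + 0.1 * np.random.randn(t.size)

freqs = np.fft.rfftfreq(t.size, d=1 / fs)
for name, signal in [("idle", idle), ("fan-blocked", blocked)]:
    mag = np.abs(np.fft.rfft(signal)) / t.size  # magnitude spectrum
    peak = freqs[np.argmax(mag)]
    print(f"{name}: dominant frequency ≈ {peak:.0f} Hz")
```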

Exploration Test ID and Status

The Status field displays real-time updates about the tool's validation process. Click the Eye icon to view further details about the test progress.

The Validate feature ensures that your trained tools are robust, accurate, and ready for deployment. By using training statistics, K-fold validation, and feature explanation tools, you can gain confidence in your model's ability to perform reliably on real-world data.