Model Validation And Implementation
Expert-defined terms from the Certificate in Credit Risk Analytics in Python course at London School of International Business.
ABCD Test – A statistical test that compares the performance of two credi… #
Related terms: Model comparison, Performance metrics. It evaluates Accuracy, Balance, Calibration, and Discrimination. Example: Applying the ABCD test to a logistic regression model versus a gradient‑boosted tree to decide which better predicts default. Challenge: Requires enough out‑of‑sample data that the comparison is not itself over‑fitted to a single sample.
Acceptance Sampling – A quality‑control technique used to decide whether… #
Related terms: Sampling plan, Type I error. In practice, a bank may sample 5 % of new loan files and compute the default rate; if it exceeds a threshold, the batch is rejected. Challenge: Small sample sizes can produce high variance in estimates.
Adjusted R‑Square – A version of the R‑square statistic that penalizes mo… #
Related terms: R‑square, Over‑fitting. It is useful when comparing linear models with different numbers of predictors in a credit risk context. Example: A model with ten variables may have a higher raw R‑square than one with five, but a lower Adjusted R‑square, indicating unnecessary complexity. Challenge: Not directly applicable to non‑linear models such as random forests.
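The penalty described above can be made concrete with the usual textbook formula, Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below is illustrative only: the function name and the figures (60 observations, R² of 0.50 and 0.52) are assumptions, not values from the course.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-square: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    denom = n_obs - n_predictors - 1
    if denom <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / denom

# Hypothetical comparison: the ten-variable model has the higher raw R-square
# but the lower Adjusted R-square once the extra predictors are penalized.
five_vars = adjusted_r2(r2=0.50, n_obs=60, n_predictors=5)
ten_vars = adjusted_r2(r2=0.52, n_obs=60, n_predictors=10)
print(f"5 predictors:  {five_vars:.3f}")
print(f"10 predictors: {ten_vars:.3f}")
```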
Algorithmic Bias – Systematic error introduced by a model that leads to u… #
Related terms: Fairness, Disparate impact. For instance, a neural network may assign higher default probabilities to applicants from a particular ZIP code due to historical data patterns. Practical application: Bias detection tools are run during model validation to flag such issues. Challenge: Distinguishing genuine predictive signals from spurious correlations.
Alpha (α) Level – The probability of incorrectly rejecting a null hypothe… #
Commonly set at 0.05. Related terms: Type I error, Significance. In model validation, α determines the strength of evidence needed to claim that a new model outperforms a benchmark. Example: Running a Kolmogorov‑Smirnov test at α = 0.01 means the null hypothesis is rejected only when a KS statistic at least this large would arise less than 1 % of the time from random variation alone. Challenge: Stricter α levels reduce false positives but increase the risk of overlooking true improvements.
Alternative Model – Any model that serves as a competitor to the primary… #
Related terms: Benchmark model, Baseline. Alternatives may include logistic regression, decision trees, or neural networks. Practical application: The alternative model’s performance is compared against the primary model using lift charts and ROC curves. Challenge: Ensuring that the alternative is built on the same data preprocessing pipeline to guarantee a fair comparison.
Annualized Default Rate (ADR) – The proportion of loans that default with… #
Related terms: Default frequency, Hazard rate. ADR is used to calibrate probability‑of‑default (PD) estimates. Example: A portfolio with a 2 % quarterly default rate translates to an ADR of approximately 7.8 % using the formula 1 − (1 − q)⁴, where q is the quarterly default rate. Challenge: Assumes constant default intensity over the year, which may not hold for seasonal credit products.
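The annualization formula 1 − (1 − q)⁴ can be checked directly. This is a minimal sketch; the function name and the `periods_per_year` parameter are illustrative, and the constant-intensity assumption noted above still applies.

```python
def annualize_default_rate(period_rate: float, periods_per_year: int = 4) -> float:
    """Compound a per-period default rate into an annual rate,
    assuming the same default intensity in every period."""
    return 1.0 - (1.0 - period_rate) ** periods_per_year

adr = annualize_default_rate(0.02)  # 2 % per quarter
print(f"ADR = {adr:.4%}")
```

Simply multiplying 2 % by four would give 8 %; the compounded figure is slightly lower because a loan can default only once.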
Area Under the Curve (AUC) – The integral of the Receiver Operating Chara… #
Related terms: ROC, Gini coefficient. An AUC of 0.75 indicates that a randomly chosen defaulted borrower will receive a higher risk score than a randomly chosen non‑defaulted borrower 75 % of the time. Practical application: AUC is a standard benchmark in model validation reports. Challenge: AUC is insensitive to calibration errors; a poorly calibrated model can still achieve a high AUC.
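The pairwise interpretation above translates directly into code: count the share of (defaulter, non‑defaulter) pairs in which the defaulter has the higher score. The sketch below uses a made‑up eight‑borrower sample and counts ties as half a win; production code would normally use a library routine instead of this quadratic loop.

```python
def auc_pairwise(scores, labels):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.
    labels: 1 = default, 0 = non-default. Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores (higher = riskier) and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
auc = auc_pairwise(scores, labels)
print(auc)
```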
Back‑testing – The process of applying a model to historical data to asse… #
Related terms: Historical simulation, Validation. In credit risk, back‑testing may involve rolling‑window analyses of PD estimates against realized defaults. Example: A bank back‑tests its Basel‑III PD model over the past five years to verify that predicted default rates align with observed rates. Challenge: Structural changes in the economy can render historical performance less informative for future risk.
Bootstrap Resampling – A non‑parametric technique that creates multiple p… #
Related terms: Monte Carlo simulation, Confidence interval. Used to estimate the variability of model performance metrics such as AUC or KS. Practical application: A credit risk analyst draws 1 000 bootstrap samples to construct a 95 % confidence interval for the model’s Gini coefficient. Challenge: Computationally intensive for large datasets; may require parallel processing.
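A percentile bootstrap for the Gini coefficient might look like the sketch below. Everything here is an assumption for illustration: the synthetic scores, the helper names, the seed, and the choice of the simple percentile interval over more refined variants. Gini is computed as 2 × AUC − 1.

```python
import random

def auc_pairwise(scores, labels):
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties = 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_gini_ci(scores, labels, n_boot=1000, level=0.95, seed=42):
    """Percentile bootstrap interval for Gini = 2 * AUC - 1."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    n = len(scores)
    ginis = []
    while len(ginis) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            ss = [scores[i] for i in idx]
            ginis.append(2.0 * auc_pairwise(ss, ys) - 1.0)
    ginis.sort()
    lower = ginis[int((1.0 - level) / 2.0 * n_boot)]
    upper = ginis[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper

# Synthetic sample: 15 defaulters with higher average scores, 45 non-defaulters.
data_rng = random.Random(0)
labels = [1] * 15 + [0] * 45
scores = [data_rng.gauss(0.6 if y else 0.4, 0.15) for y in labels]

lower, upper = bootstrap_gini_ci(scores, labels)
print(f"95% bootstrap interval for Gini: [{lower:.3f}, {upper:.3f}]")
```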
Calibration Curve – A plot that compares predicted probabilities with obs… #
Related terms: Reliability diagram, PIT histogram. A well‑calibrated model will have points lying close to the 45‑degree line. Example: After fitting an XGBoost classifier, the analyst groups predictions into deciles and plots the calibration curve to assess PD accuracy. Challenge: Sparse data in high‑risk bins can produce noisy estimates, requiring smoothing techniques.
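The decile grouping described above can be sketched without any plotting library: sort by predicted PD, cut into ten equal‑sized bins, and compare the mean prediction with the observed default rate in each bin. The data below is synthetic and the function name is illustrative.

```python
import random

def calibration_table(predicted, outcomes, n_bins=10):
    """Return (mean predicted PD, observed default rate, count) per bin,
    with bins formed by sorting on the predicted PD."""
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer observations than bins
        mean_pd = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pd, obs_rate, len(chunk)))
    return rows

# Synthetic data (an assumption): outcomes are drawn from the predicted PDs
# themselves, so the "model" is well calibrated by construction.
rng = random.Random(7)
predicted = [rng.uniform(0.01, 0.30) for _ in range(2000)]
outcomes = [1 if rng.random() < p else 0 for p in predicted]

table = calibration_table(predicted, outcomes)
for mean_pd, obs_rate, count in table:
    print(f"predicted={mean_pd:.3f}  observed={obs_rate:.3f}  n={count}")
```

Plotting the first column against the second gives the calibration curve; points far from the diagonal mark score ranges where PDs are over‑ or under‑stated.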
Confusion Matrix – A table that summarizes the counts of true positives,… #
Related terms: Precision, Recall. In credit risk, the matrix helps quantify the trade‑off between missed defaults (false negatives) and unnecessary rejections (false positives). Example: A model yields 150 true positives, 30 false positives, 800 true negatives, and 20 false negatives. Challenge: Choice of classification threshold heavily influences the matrix composition.
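Using the counts from the example above, with default treated as the positive class, the standard ratios follow directly. The variable names are illustrative.

```python
# Counts from the example; "positive" means predicted/actual default.
tp, fp, tn, fn = 150, 30, 800, 20

precision = tp / (tp + fp)    # flagged borrowers who actually default
recall = tp / (tp + fn)       # actual defaults that were flagged
specificity = tn / (tn + fp)  # good borrowers correctly passed
npv = tn / (tn + fn)          # passed borrowers who stay current

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"specificity={specificity:.3f} npv={npv:.3f}")
```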
Cross‑validation – A model evaluation technique that partitions data into… #
Related terms: Hold‑out validation, K‑fold. It provides robust estimates of out‑of‑sample performance. Practical application: A 5‑fold cross‑validation is used to compare logistic regression and random forest PD models. Challenge: Time‑series data require a forward‑chaining approach to preserve temporal order.
Customer Lifetime Value (CLV) – The net present value of future cash flow… #
Related terms: Profitability, Risk‑adjusted return. CLV informs pricing and segmentation decisions. Example: A high‑risk borrower with a short loan term may have a lower CLV despite a high interest rate. Challenge: Estimating CLV requires accurate forecasts of default, prepayment, and recovery rates.
Decision Threshold – The cut‑off probability used to convert continuous r… #
(e.g., approve vs. reject). Related terms: Classification cutoff, Scorecard cutoff. Adjusting the threshold balances false positive and false negative rates. Practical application: A bank sets a PD threshold of 5 % for auto loans; borrowers with predicted PD below 5 % are approved. Challenge: Regulatory constraints may limit the ability to adjust thresholds for profit optimization.
Discriminatory Power – The ability of a model to separate defaulters from… #
Related terms: Gini coefficient, KS statistic. Measured by metrics such as AUC or the Kolmogorov‑Smirnov (KS) statistic. Example: A model with a Gini of 0.45 demonstrates moderate discrimination. Challenge: High discriminatory power does not guarantee good calibration; both aspects must be validated.
Distribution Shift – A change in the underlying data generating process b… #
Related terms: Concept drift, Covariate shift. Can degrade model performance if not detected. Practical application: Monitoring macro‑economic indicators for shifts that affect borrower behavior. Challenge: Distinguishing genuine shift from random noise requires statistical testing and domain expertise.
Dropout Regularization – A technique used in neural networks where a rand… #
Related terms: L1 regularization, L2 regularization. It reduces over‑fitting by preventing co‑adaptation of neurons. Example: A credit scoring deep‑learning model applies a dropout rate of 0.2 to hidden layers. Challenge: Selecting an appropriate dropout rate can be non‑trivial; too high a rate may impede learning.
Exposure at Default (EAD) – The total value a bank is exposed to when a b… #
Related terms: Credit conversion factor, Utilization. EAD is a key input for calculating regulatory capital. Example: For a revolving credit line, EAD is estimated as the drawn amount plus a credit conversion factor applied to the undrawn portion. Challenge: Estimating EAD for off‑balance‑sheet exposures involves significant uncertainty.
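The revolving‑line example above amounts to EAD = drawn + CCF × undrawn. A minimal sketch follows, with a hypothetical credit line and an assumed CCF of 0.5; actual CCF values come from regulation or internal estimation.

```python
def exposure_at_default(drawn: float, limit: float, ccf: float) -> float:
    """EAD for a revolving line: drawn balance plus the credit conversion
    factor applied to the undrawn part of the limit."""
    undrawn = max(limit - drawn, 0.0)
    return drawn + ccf * undrawn

# Hypothetical line: 6,000 drawn on a 10,000 limit, assumed CCF of 0.5.
ead = exposure_at_default(drawn=6_000.0, limit=10_000.0, ccf=0.5)
print(ead)
```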
Feature Engineering – The process of creating informative variables from… #
Related terms: Variable transformation, Interaction term. Common techniques include binning, one‑hot encoding, and constructing ratios such as debt‑to‑income. Practical application: Converting a borrower’s employment tenure into categorical bins improves logistic regression stability. Challenge: Excessive feature creation can lead to multicollinearity and over‑fitting.
Feature Importance – A metric that quantifies the contribution of each pr… #
Related terms: Permutation importance, SHAP values. In tree‑based models, importance is often derived from split gains. Example: A random forest model ranks “credit utilization” as the most important feature for default prediction. Challenge: Importance measures can be misleading for correlated variables; interpretation must be coupled with domain knowledge.
Finite Sample Bias – The distortion of estimator properties that arises w… #
Related terms: Small‑sample variance, Asymptotic bias. In credit risk, small portfolio segments may produce biased PD estimates. Practical application: Applying Bayesian shrinkage to mitigate bias in low‑frequency default bins. Challenge: Balancing bias reduction against increased variance.
Gini Coefficient – A normalized version of the AUC, calculated as 2 × AUC… #
Related terms: Lorenz curve, Inequality index. For a model that is no worse than random, it ranges from 0 (no discrimination) to 1 (perfect discrimination). Example: An AUC of 0.70 yields a Gini of 2 × 0.70 − 1 = 0.40. Challenge: Like AUC, the Gini does not reflect calibration quality.
Gradient Boosting – An ensemble learning method that builds additive pred… #
Related terms: XGBoost, LightGBM. Frequently used for PD modeling due to its ability to capture non‑linear relationships. Practical application: A bank trains a gradient‑boosted tree model on borrower demographics and transaction history. Challenge: Tuning hyper‑parameters (learning rate, depth, number of trees) is computationally intensive and can lead to over‑fitting if not properly regularized.
Hold‑out Validation – Splitting the dataset into a training set and a sep… #
Related terms: Train‑test split, Out‑of‑sample test. Provides an unbiased estimate of performance on unseen data. Example: 70 % of observations are used for training, 30 % for validation. Challenge: Random splits may not preserve temporal ordering, leading to optimistic performance estimates for time‑dependent credit data.
Hypothesis Testing – A statistical framework for deciding whether observe… #
Related terms: Null hypothesis, p‑value. Common tests include the paired t‑test for mean differences and the DeLong test for AUC comparison. Practical application: Testing whether a new neural network PD model significantly improves AUC over the incumbent logistic regression. Challenge: Assumptions of independence may be violated when models are trained on overlapping data.
Imbalanced Data – A situation where the number of default cases is far sm… #
Related terms: Class imbalance, Minority class. Leads to biased learning algorithms that favor the majority class. Techniques such as SMOTE, class weighting, or threshold adjustment are employed to address imbalance. Example: A portfolio with a 2 % default rate requires oversampling of defaults to train a robust classifier. Challenge: Synthetic oversampling can introduce noise and unrealistic patterns.
Information Value (IV) – A metric that quantifies the predictive power of… #
Related terms: Weight of evidence, Predictive strength. Calculated as the sum over bins of (distribution difference) × log‑ratio of distributions. IV < 0.02 suggests no predictive value; 0.1 – 0.3 is considered moderate. Practical application: Screening variables for inclusion in a scorecard. Challenge: IV can be inflated by small sample sizes in rare bins; smoothing or merging bins is often required.
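Written out, the sum described above is IV = Σ (gᵢ − bᵢ) × ln(gᵢ / bᵢ), where gᵢ and bᵢ are the shares of non‑defaulters and defaulters falling in bin i. The sketch below uses made‑up bin counts and raises an error on empty bins, since the log‑ratio is undefined there.

```python
import math

def information_value(good_counts, bad_counts):
    """IV = sum over bins of (g_i - b_i) * ln(g_i / b_i), where g_i and b_i
    are each bin's share of all non-defaulters and all defaulters."""
    total_good = sum(good_counts)
    total_bad = sum(bad_counts)
    iv = 0.0
    for good, bad in zip(good_counts, bad_counts):
        if good == 0 or bad == 0:
            raise ValueError("empty bin: merge bins or smooth counts first")
        g = good / total_good
        b = bad / total_bad
        iv += (g - b) * math.log(g / b)  # ln(g / b) is the bin's weight of evidence
    return iv

# Hypothetical variable with three bins (counts of goods and bads per bin).
iv = information_value(good_counts=[400, 300, 300], bad_counts=[10, 30, 60])
print(f"IV = {iv:.4f}")
```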
Kolmogorov‑Smirnov (KS) Statistic – The maximum vertical distance between… #
Related terms: D‑statistic, Discriminatory power. A KS of 0.30 indicates that, at the best‑separating score threshold, the cumulative share of defaulters captured exceeds the cumulative share of non‑defaulters by 30 percentage points. Used widely in banking to assess model discrimination. Challenge: KS is sensitive to the choice of score binning and may be unstable for very small datasets.
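One way to compute the statistic without binning is to evaluate both empirical cumulative distributions at every observed score and take the largest gap. The sample below is hypothetical and the function name is illustrative.

```python
from bisect import bisect_right

def ks_statistic(scores, labels):
    """Maximum gap between the empirical score CDFs of defaulters (label 1)
    and non-defaulters (label 0), evaluated at every observed score."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = bisect_right(pos, t) / len(pos)  # share of defaulters <= t
        cdf_neg = bisect_right(neg, t) / len(neg)  # share of non-defaulters <= t
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

# Hypothetical risk scores (higher = riskier) and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
ks = ks_statistic(scores, labels)
print(ks)
```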
Log‑Odds Ratio – The natural logarithm of the odds of default versus non‑… #
Related terms: Logistic regression coefficient, Odds. In logistic regression, each predictor’s coefficient represents the change in log‑odds per unit increase. Example: A coefficient of 0.5 for “credit utilization” implies that a one‑unit increase multiplies the odds of default by exp(0.5) ≈ 1.65. Challenge: Interpreting log‑odds for categorical variables with many levels can be cumbersome.
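The step from a coefficient to an odds multiplier, and from there to a changed probability, can be sketched as follows. The starting PD of 5 % is a hypothetical value chosen for illustration.

```python
import math

def odds_ratio(coefficient: float, delta: float = 1.0) -> float:
    """Multiplicative change in the odds for a `delta`-unit increase."""
    return math.exp(coefficient * delta)

def shifted_probability(p: float, coefficient: float, delta: float = 1.0) -> float:
    """Probability after the predictor rises by `delta`, all else equal."""
    odds = p / (1.0 - p) * odds_ratio(coefficient, delta)
    return odds / (1.0 + odds)

ratio = odds_ratio(0.5)                  # coefficient from the example
new_pd = shifted_probability(0.05, 0.5)  # hypothetical starting PD of 5 %
print(f"odds ratio = {ratio:.3f}, PD moves from 5.0% to {new_pd:.1%}")
```

The odds scale by about 1.65, but the probability itself rises by less than that factor, which is why coefficients should not be read as direct changes in PD.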
Macro‑validation – Validation that assesses model performance across broa… #
Related terms: Segment analysis, Global performance. Ensures that a model does not perform well overall while failing in specific sub‑populations. Practical application: Evaluating PD models separately for mortgage, auto, and credit‑card portfolios. Challenge: Data sparsity in niche segments may limit statistical confidence.
Mean Squared Error (MSE) – The average of squared differences between pre… #
Related terms: RMSE, Loss function. Though more common in regression, MSE can be used to assess PD calibration by treating binary outcomes as 0/1, in which case it is also known as the Brier score. Example: An MSE of 0.02 corresponds to a root‑mean‑square error of √0.02 ≈ 0.14, meaning predictions typically sit about 14 percentage points away from the 0/1 outcomes. Challenge: MSE penalizes large errors heavily, which may over‑emphasize outliers.
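A minimal sketch of MSE on predicted PDs against 0/1 outcomes, using four made‑up borrowers:

```python
import math

def mean_squared_error(predicted, outcomes):
    """Average squared gap between predicted PDs and 0/1 outcomes."""
    if len(predicted) != len(outcomes) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(predicted)

# Hypothetical predictions and realized outcomes (1 = default).
predicted = [0.10, 0.20, 0.05, 0.90]
outcomes = [0, 0, 0, 1]

mse = mean_squared_error(predicted, outcomes)
rmse = math.sqrt(mse)
print(f"MSE = {mse:.6f}, RMSE = {rmse:.3f}")
```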
Model Governance – The framework of policies, procedures, and controls th… #
Related terms: Model risk management, Compliance. Includes documentation standards, version control, and independent review. Practical application: A bank’s model risk committee signs off on a new PD model after a formal validation report. Challenge: Balancing thorough governance with agility in a fast‑changing credit environment.
Model Monitoring – Ongoing surveillance of a model’s performance after de… #
Related terms: Performance drift, Alert thresholds. Key metrics include population stability index (PSI), KS, and calibration error. Example: A monthly PSI exceeding 0.1 triggers an investigation into potential data drift. Challenge: Setting appropriate alert thresholds to avoid alarm fatigue while catching genuine degradation.
Model Over‑fitting – When a model captures noise in the training data rat… #
Related terms: Generalization error, Regularization. Symptoms include high training accuracy but low validation accuracy. Mitigation techniques include cross‑validation, pruning, and penalization. Challenge: Detecting over‑fitting early in high‑dimensional data where training error may still be low.
Model Risk Appetite – The level of risk a financial institution is willin… #
Related terms: Risk tolerance, Capital allocation. Defined by senior management and embedded in model validation criteria (e.g., maximum allowable PSI). Practical application: A bank sets a risk appetite that limits the acceptable increase in PD variance to 5 % after model updates. Challenge: Translating qualitative appetite statements into quantitative validation thresholds.
Monte Carlo Simulation – A computational technique that generates a large… #
Related terms: Stochastic modeling, Scenario analysis. Used to estimate the distribution of portfolio losses, incorporating PD, LGD, and EAD variability. Example: Simulating 10 000 macro‑economic paths to evaluate stress‑test impacts on credit risk capital. Challenge: Requires reliable input distributions and can be computationally demanding.
Multicollinearity – The presence of high correlation among predictor vari… #
Related terms: Variance Inflation Factor, Redundancy. In logistic regression, multicollinearity can make coefficient signs unstable. Mitigation strategies include variable selection, principal component analysis, or ridge regression. Challenge: Detecting multicollinearity in large, sparse feature sets common in credit data.
Negative Predictive Value (NPV) – The proportion of borrowers predicted a… #
Related terms: True negative rate, Specificity. Important for lender profitability because it reflects the accuracy of approvals. Example: An NPV of 0.98 means that 98 % of approved applicants remain current. Challenge: NPV is heavily influenced by the prevalence of defaults in the portfolio.
Out‑of‑Sample Testing – Evaluating a model on data that were not used dur… #
Related terms: Validation set, Hold‑out sample. Provides an unbiased estimate of future performance. Practical application: A PD model is trained on 2015‑2018 data and tested on 2019 data to assess predictive stability. Challenge: Temporal shifts may cause out‑of‑sample performance to deteriorate rapidly.
Partial Dependence Plot (PDP) – A graphical tool that shows the marginal… #
Related terms: ICE plot, Model interpretability. Helps explain non‑linear relationships in tree‑based models. Example: A PDP for “annual income” may reveal diminishing marginal impact beyond a certain threshold. Challenge: PDP assumes feature independence, which may not hold for correlated credit variables.
Population Stability Index (PSI) – A measure of how much a variable’s dis… #
Related terms: Distribution shift, Drift detection. PSI values below 0.1 indicate stability; values of 0.1‑0.25 suggest moderate shift; values above 0.25 signal significant change. Practical application: Monitoring the distribution of “credit score” monthly to detect data drift. Challenge: Selecting appropriate binning and handling sparse bins to avoid misleading PSI values.
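PSI is conventionally computed as Σ (aᵢ − eᵢ) × ln(aᵢ / eᵢ), where eᵢ and aᵢ are the shares of observations in bin i for the reference and current periods. The sketch below uses hypothetical quartile shares and refuses empty bins, which would otherwise make the logarithm undefined.

```python
import math

def population_stability_index(expected, actual):
    """PSI = sum over bins of (a_i - e_i) * ln(a_i / e_i), where e_i and a_i
    are bin shares in the reference and current periods."""
    if len(expected) != len(actual):
        raise ValueError("both periods must use the same bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        if e <= 0 or a <= 0:
            raise ValueError("empty bin: merge bins or smooth shares first")
        psi += (a - e) * math.log(a / e)
    return psi

# Hypothetical credit-score shares: development sample vs. current month.
expected = [0.25, 0.25, 0.25, 0.25]
actual = [0.20, 0.25, 0.25, 0.30]
psi = population_stability_index(expected, actual)
print(f"PSI = {psi:.4f}")
```

With these shares the index stays well under 0.1, so the shift would be read as stable under the thresholds quoted above.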
Precision – The ratio of true positives to all predicted positives #
Related terms: Positive predictive value, Accuracy. In credit risk, with default as the positive class, high precision means that borrowers flagged as high risk mostly do go on to default, so few good applicants are declined unnecessarily. Example: A model with precision 0.70 is correct about 70 % of the borrowers it flags as high risk. Challenge: Precision alone does not capture the cost of false negatives; must be balanced with recall.
Probabilistic Forecasting – Generating a full probability distribution fo… #
Related terms: Predictive distribution, Bayesian inference. Enables risk managers to assess tail risk and compute Value‑at‑Risk (VaR). Example: A Bayesian logistic regression provides posterior distributions for PD estimates. Challenge: Computational complexity and the need for prior specification.
Quantile Regression – A regression technique that estimates conditional q… #
(e.g., median, 95th percentile) of the response variable. Related terms: Conditional distribution, Pinball loss. Useful for modeling loss given default (LGD) at different confidence levels. Practical application: Estimating the 90th‑percentile LGD to support stress‑testing. Challenge: Requires larger sample sizes for stable quantile estimates, especially in the tails.
Random Forest – An ensemble of decision trees built on bootstrapped sampl… #
Related terms: Bagging, Feature importance. Provides robust, non‑linear modeling for PD estimation. Example: A random forest achieves an AUC of 0.78 on a credit‑card default dataset. Challenge: Interpretability can be limited compared to linear models; SHAP values are often used to explain predictions.
Recall – The proportion of actual defaults that are correctly identified… #
Related terms: Sensitivity, Detection rate. High recall reduces missed defaults but may increase false positives. Example: A recall of 0.85 means 85 % of defaulted borrowers are flagged. Challenge: Balancing recall against precision to meet business objectives and regulatory limits.
Recovery Rate – The proportion of exposure recovered after a default even… #
Related terms: Loss given default, Credit loss. Influences the calculation of LGD (LGD = 1 − Recovery Rate). Practical application: Historical data show a recovery rate of 40 % for unsecured personal loans. Challenge: Recovery rates can be highly volatile across economic cycles.
Regulatory Capital – The minimum amount of capital a bank must hold to co… #
(e.g., Basel III). Related terms: Risk‑weighted assets, Capital adequacy. Calculated using PD, LGD, and EAD inputs from validated models. Example: A loan portfolio with high PDs carries higher risk‑weighted assets and therefore requires more capital to satisfy the 8 % minimum total capital ratio (the CET1 minimum on its own is 4.5 %). Challenge: Frequent model updates can cause capital requirement fluctuations, requiring careful communication with senior management.
Reproducibility – The ability to obtain the same results when the same an… #
Related terms: Version control, Deterministic runs. In model validation, reproducibility ensures that audit trails can trace every step from data extraction to final metrics. Practical application: Using fixed random seeds and documented data pipelines in Python scripts. Challenge: External dependencies (e.g., library updates) can break reproducibility if not locked.
Risk‑Adjusted Return on Capital (RAROC) – A performance metric that compa… #
Related terms: Economic profit, Risk premium. Calculated as (Expected Income − Expected Loss) / Economic Capital. Example: A loan segment with a RAROC of 12 % exceeds the bank’s hurdle rate of 8 %. Challenge: Accurate estimation of expected loss and economic capital requires robust PD and LGD models.
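The RAROC formula above is a single ratio. In the sketch below the monetary inputs are hypothetical, chosen so that the result matches the 12 % in the example.

```python
def raroc(expected_income: float, expected_loss: float, economic_capital: float) -> float:
    """(Expected Income - Expected Loss) / Economic Capital."""
    if economic_capital <= 0:
        raise ValueError("economic capital must be positive")
    return (expected_income - expected_loss) / economic_capital

# Hypothetical loan segment.
segment_raroc = raroc(expected_income=1_500_000,
                      expected_loss=300_000,
                      economic_capital=10_000_000)
hurdle_rate = 0.08
print(f"RAROC = {segment_raroc:.1%}, clears hurdle: {segment_raroc > hurdle_rate}")
```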
Sample Weighting – Assigning different importance levels to observations… #
Related terms: Cost‑sensitive learning, Class weights. Used to address class imbalance or to reflect business priorities. Example: Assigning a weight of 5 to default cases and 1 to non‑defaults in a logistic regression. Challenge: Improper weighting can lead to unstable coefficient estimates and over‑fitting to the minority class.
Segmentation Analysis – Dividing a portfolio into homogeneous groups base… #
Related terms: Cluster analysis, Cohort study. Helps identify sub‑populations where a model may under‑perform. Practical application: Segmenting borrowers by industry and evaluating PD model KS within each segment. Challenge: Small segment sizes reduce statistical power; pooling may be necessary.
Shapley Additive Explanations (SHAP) – A game‑theoretic method that assig… #
Related terms: Feature attribution, Explainable AI. Provides both global and local interpretability for complex models. Example: A SHAP summary plot shows “debt‑to‑income” as the top contributor to high PD scores. Challenge: Computational cost grows with dataset size; sampling may be required.
Sensitivity Analysis – Assessing how variations in model inputs affect ou… #
Related terms: Scenario testing, Stress testing. In credit risk, sensitivity to macro‑economic variables like unemployment rates is examined. Practical application: Increasing unemployment by 2 % and observing the impact on portfolio PD. Challenge: Interactions between variables can produce non‑linear effects that are hard to capture with one‑at‑a‑time approaches.
Significance Level – The threshold at which a statistical result is consi… #
Related terms: α level, p‑value. Commonly set at 0.05 for model validation tests. Example: A KS difference with p = 0.03 is deemed statistically significant. Challenge: Multiple testing across many variables inflates the chance of false discoveries; adjustments such as Bonferroni correction may be needed.
Simplex Method – An algorithm for solving linear programming problems, of… #
Related terms: Linear optimization, Scorecard calibration. Allows the translation of logistic regression coefficients into integer score points. Practical application: Optimizing scorecard point allocations to achieve a target KS while preserving monotonicity. Challenge: Ensuring that the linear constraints do not overly restrict model flexibility.
Smoothing Techniques – Methods applied to reduce noise in empirical defau… #
Related terms: Kernel smoothing, Bayesian shrinkage. Examples include moving averages, LOESS, and hierarchical Bayesian models. Practical application: Smoothing default rates for high‑risk score deciles where observations are sparse. Challenge: Over‑smoothing can mask genuine risk differentials.
Specificity – The proportion of non‑default borrowers correctly identifie… #
Related terms: True negative rate, 1 − False positive rate. High specificity reduces unnecessary rejections. Example: A specificity of 0.92 indicates that 92 % of good borrowers are correctly approved. Challenge: Increasing specificity often reduces recall; the trade‑off must align with business strategy.
Stability Index – A metric that quantifies the consistency of model predi… #
Related terms: PSI, Drift metric. Calculated by comparing score distributions across successive periods. Practical application: A monthly stability index below 0.05 signals that the model’s score distribution has remained essentially unchanged. Challenge: Choosing appropriate binning and handling missing data points.
Stress Testing – Simulating adverse economic scenarios to evaluate the im… #
Related terms: Scenario analysis, Macro‑stress. Includes baseline, adverse, and severely adverse scenarios defined by regulators. Example: A stress test raises unemployment to 10 % and measures the resulting increase in PDs and capital requirements. Challenge: Scenario selection must be plausible yet severe enough to capture tail risk.
Supervised Learning – Machine‑learning paradigm where models are trained… #
(e.g., default vs. non‑default). Related terms: Classification, Regression. Most credit risk models are supervised, using historical outcomes to predict future defaults. Practical application: Training a neural network on borrower features with binary default labels. Challenge: Label noise and survivorship bias can impair model quality.
Temporal Validation – Validation that respects the chronological order of… #
Related terms: Forward chaining, Time‑series split. Prevents look‑ahead bias in credit risk modeling. Example: A model is trained on 2010‑2015 data and validated on 2016‑2018 data. Challenge: Limited data in early periods may restrict model complexity.
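A chronological split like the one in the example can be sketched with a simple year cut‑off. The record layout (`year`, `loan_id`) is a hypothetical stand‑in for a real loan table.

```python
def temporal_split(records, last_train_year):
    """Train on observations up to and including `last_train_year`,
    validate on everything strictly later."""
    train = [r for r in records if r["year"] <= last_train_year]
    test = [r for r in records if r["year"] > last_train_year]
    return train, test

# Hypothetical loan records keyed by origination year.
records = [{"loan_id": i, "year": y}
           for i, y in enumerate([2010, 2012, 2015, 2016, 2017, 2018])]

train, test = temporal_split(records, last_train_year=2015)
print(len(train), "training rows;", len(test), "validation rows")
```

Because no validation row predates a training row, the model cannot benefit from information that would not have been available at the time.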
Threshold Optimization – The process of selecting the decision cut‑off th… #
(e.g., profit, cost). Related terms: Cost‑benefit analysis, Utility function. Often involves evaluating trade‑offs between false positives and false negatives. Practical application: Using a profit curve to locate the PD threshold that yields the highest expected profit. Challenge: Objective functions may be non‑convex, requiring grid search or evolutionary algorithms.
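A grid search over candidate cut‑offs is the simplest way to trace a profit curve. In this sketch the payoff figures (a gain of 100 per good loan, a loss of 500 per defaulted loan), the sample, and the grid are all assumptions; applicants are approved when their predicted PD is below the threshold.

```python
def realized_profit(threshold, pds, outcomes, gain_good=100.0, loss_bad=500.0):
    """Profit from approving every applicant whose PD is below `threshold`."""
    profit = 0.0
    for pd_hat, defaulted in zip(pds, outcomes):
        if pd_hat < threshold:  # approved
            profit += -loss_bad if defaulted else gain_good
    return profit

def best_threshold(grid, pds, outcomes):
    """Grid search: return the cut-off with the highest profit."""
    return max(grid, key=lambda t: realized_profit(t, pds, outcomes))

# Hypothetical validation sample: predicted PDs and realized outcomes.
pds = [0.01, 0.02, 0.03, 0.05, 0.08, 0.12, 0.20, 0.35]
outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
grid = [0.02, 0.04, 0.06, 0.10, 0.15, 0.25, 0.50]

cutoff = best_threshold(grid, pds, outcomes)
print(cutoff, realized_profit(cutoff, pds, outcomes))
```

Grid search makes no convexity assumption, which suits the non‑convex objectives mentioned above, at the cost of only finding the best value among the candidates supplied.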
Time‑Weighted AUC – An adaptation of the AUC that accounts for the time‑t… #
Related terms: Survival analysis, C‑index. Incorporates censored observations, providing a more accurate discrimination measure for loan portfolios with varying maturities. Example: A time‑weighted AUC of 0.71 reflects better performance than a standard AUC of 0.68. Challenge: Implementation complexity and the need for survival‑type data structures.
Training Set – The portion of data used to fit model parameters #
Related terms: Development data, In‑sample data. Should be representative of the target population and free from leakage. Practical application: Allocating 80 % of a credit dataset to the training set while preserving temporal order. Challenge: Ensuring that the training set does not contain future information that would bias validation.
Traveling‑Wave Kernel – An advanced kernel function for support vector ma… #
Related terms: Kernel trick, SVM. Rarely used in credit risk but can improve discrimination for highly irregular data structures. Example: Applying a traveling‑wave kernel to a high‑dimensional borrower feature space yields a modest AUC increase. Challenge: Parameter tuning is computationally intensive and interpretation is opaque.
Under‑fitting – When a model is too simple to capture the underlying rela… #
Related terms: Bias, Model simplicity. Symptoms include low training accuracy and low validation accuracy. Mitigation includes adding features, increasing model complexity, or reducing regularization. Challenge: Distinguishing under‑fitting from a well‑regularized model that intentionally sacrifices some fit for robustness.
Validation Report – A comprehensive document summarizing the results of m… #
Related terms: Model risk assessment, Audit trail. Must meet regulatory standards such as SR 11‑7. Practical application: The validation team prepares a report for senior management before model deployment. Challenge: Balancing technical depth with readability for non‑technical stakeholders.
Variable Binning – Grouping continuous variables into discrete intervals… #
Related terms: Discretization, Scorecard creation. Techniques include equal‑frequency, equal‑width, and entropy‑based binning. Example: Binning “age” into five categories to capture non‑linear risk patterns. Challenge: Choosing bin edges that preserve predictive power while avoiding over‑fitting.
Variance Inflation Factor (VIF) – A diagnostic metric that quantifies how… #
Related terms: Collinearity, Eigenvalue. VIF > 10 often signals problematic correlation. Practical application: Computing VIF for all predictors in a logistic regression PD model and dropping variables with high VIF. Challenge: VIF does not capture non‑linear relationships that may also cause instability.
Weighted Loss Function – A loss function that assigns different penalties… #
Related terms: Cost‑sensitive learning, Class weighting. In credit risk, false negatives (missed defaults) may be weighted more heavily than false positives. Example: Using a weighted binary cross‑entropy loss where default cases have a weight of 3. Challenge: Selecting appropriate weights that reflect business costs without inducing bias.
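A weighted binary cross‑entropy with a default weight of 3, as in the example, might be written as below. The clipping constant and the two‑borrower sample are assumptions added for numerical safety and illustration.

```python
import math

def weighted_bce(predicted, outcomes, default_weight=3.0, eps=1e-12):
    """Mean binary cross-entropy in which errors on default cases (label 1)
    are multiplied by `default_weight`."""
    total = 0.0
    for p, y in zip(predicted, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total += -(default_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(predicted)

# Hypothetical predictions: one non-defaulter, one defaulter.
predicted = [0.2, 0.7]
outcomes = [0, 1]

plain = weighted_bce(predicted, outcomes, default_weight=1.0)
weighted = weighted_bce(predicted, outcomes, default_weight=3.0)
print(f"unweighted = {plain:.4f}, weighted = {weighted:.4f}")
```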
Zero‑Inflated Model – A statistical model designed for count data with an… #
Related terms: Poisson regression, Hurdle model. Occasionally applied to modeling the number of missed payments before default. Example: A zero‑inflated negative binomial model captures both the probability of no missed payments and the distribution of missed payments when they occur. Challenge: Model complexity and interpretability increase relative to standard count models.
Z‑Score – A standardized statistic representing the number of standard de… #
Related terms: Standardization, Normalization. Used to compare borrower attributes across different scales. Example: Converting “annual income” to a Z‑score puts it on the same scale as other predictors before inclusion in a logistic regression. Challenge: The mean and standard deviation are sensitive to outliers, and Z‑scores are easiest to interpret for roughly normal data; heavy‑tailed variables may require robust scaling alternatives.