As deep learning becomes embedded in critical systems — from healthcare and criminal justice to hiring and content recommendation — understanding its ethical implications and future directions is essential. This lesson explores bias, fairness, interpretability, safety, and the cutting-edge frontiers of deep learning research.
Deep learning models learn patterns from data. If the training data reflects historical biases, the model can reproduce and even amplify them. Common sources of bias include:
| Source | Description | Example |
|---|---|---|
| Historical bias | The data reflects past discrimination | Hiring data that underrepresents women in engineering |
| Representation bias | Some groups are underrepresented in the data | Medical datasets with mostly light-skinned patients |
| Measurement bias | The way data is collected introduces systematic errors | Crime prediction trained on arrest records (not actual crime) |
| Aggregation bias | A single model is used for groups with different characteristics | A medical model trained on adults applied to children |
| Label bias | Labels are assigned in a biased way | Subjective annotations influenced by annotator demographics |
These sources of bias have caused well-documented failures in deployed systems:

| Case | Bias Issue |
|---|---|
| Amazon recruiting tool (2018) | Penalised CVs containing the word "women's" due to historical hiring data |
| COMPAS recidivism (2016) | Predicted higher risk scores for Black defendants compared to White defendants with similar profiles |
| Facial recognition (2018) | Significantly higher error rates for darker-skinned women compared to lighter-skinned men |
| Medical imaging (2020) | Models trained on data from one hospital failed to generalise to different populations |
Several strategies help mitigate bias (a small auditing sketch follows this table):

| Strategy | Description |
|---|---|
| Diverse, representative datasets | Ensure training data includes all relevant demographic groups |
| Bias auditing | Evaluate model performance across demographic subgroups |
| Fairness constraints | Add constraints during training to equalise performance across groups |
| Adversarial debiasing | Train a model that cannot predict the protected attribute from its representations |
| Regular re-evaluation | Continuously monitor and audit deployed models |
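As an illustration of bias auditing, the sketch below compares accuracy and recall across demographic subgroups. It assumes a fitted scikit-learn-style classifier `model` and a test DataFrame `test_df` with feature columns, a binary `label` column, and a `group` column; all of these names are hypothetical stand-ins for your own data:

```python
# Minimal bias-audit sketch: compare metrics across demographic subgroups.
# `model`, `test_df`, `feature_cols`, `label`, and `group` are hypothetical
# names standing in for a real classifier and dataset.
import pandas as pd
from sklearn.metrics import accuracy_score, recall_score

def audit_by_group(model, df, feature_cols, label_col="label", group_col="group"):
    results = {}
    for group, subset in df.groupby(group_col):
        preds = model.predict(subset[feature_cols])
        results[group] = {
            "n": len(subset),
            "accuracy": accuracy_score(subset[label_col], preds),
            # True positive rate per group: large gaps across groups
            # are a warning sign of unequal treatment
            "recall": recall_score(subset[label_col], preds),
        }
    return pd.DataFrame(results).T

# report = audit_by_group(model, test_df, feature_cols)
# print(report)  # inspect accuracy/recall gaps between groups
```

Large disparities in these per-group metrics do not prove discrimination on their own, but they flag where a deeper fairness investigation is needed.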
Deep learning models are often called "black boxes" because their internal decision-making process is opaque. Interpretability is the ability to understand why a model makes a particular prediction.
| Technique | Type | Description |
|---|---|---|
| Grad-CAM | Model-specific | Highlights which parts of an image influenced the CNN's decision |
| SHAP | Model-agnostic | Assigns importance scores to each feature using game theory |
| LIME | Model-agnostic | Explains individual predictions using a local interpretable model |
| Attention visualisation | Model-specific | Shows which tokens a Transformer attends to |
| Feature visualisation | Model-specific | Generates images that maximise neuron activations |
| Integrated Gradients | Model-specific | Attributes predictions to input features using gradient integration |
For example, applying SHAP to a trained deep model:

```python
# SHAP example for model interpretability
import shap

# DeepExplainer approximates SHAP values for deep models, using a
# background dataset as the reference distribution
explainer = shap.DeepExplainer(model, background_data)
shap_values = explainer.shap_values(test_data)

# Summary plot: overall feature importance across the test set
shap.summary_plot(shap_values, test_data)
```
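Integrated Gradients, from the table above, is available for PyTorch models in libraries such as Captum. A minimal sketch, assuming a trained differentiable classifier `model` and an input batch `inputs` (both hypothetical names):

```python
# Integrated Gradients sketch using Captum (PyTorch).
# `model` is a trained differentiable classifier and `inputs` a batch
# of input tensors -- hypothetical names for illustration.
import torch
from captum.attr import IntegratedGradients

ig = IntegratedGradients(model)

# All-zeros baseline; attributions approximately sum to the difference
# between the model's output at `inputs` and at the baseline
baseline = torch.zeros_like(inputs)
attributions, delta = ig.attribute(
    inputs,
    baselines=baseline,
    target=0,  # class index to attribute (hypothetical choice)
    return_convergence_delta=True,
)
print(attributions.shape, delta)  # per-feature attribution scores
```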
Deep learning also raises serious privacy concerns (a differential-privacy sketch follows this table):

| Concern | Description | Solution |
|---|---|---|
| Training data memorisation | Models can memorise and regurgitate private training data | Differential privacy, data deduplication |
| Model inversion attacks | Attackers can reconstruct training data from model outputs | Limit API access, add noise to outputs |
| Membership inference | Attackers determine if a specific record was in the training data | Differential privacy, regularisation |
| Data sovereignty | Data may be subject to regulations (GDPR, CCPA) | Federated learning, data localisation |
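Differential privacy, mentioned twice in the table above, can be retrofitted onto ordinary PyTorch training with a library such as Opacus. A minimal sketch, assuming an existing `model`, `optimizer`, and `train_loader` (hypothetical names for a standard training setup):

```python
# Differentially private training sketch with Opacus (PyTorch).
# Assumes an existing `model`, `optimizer`, and `train_loader` --
# hypothetical names standing in for a normal training setup.
from opacus import PrivacyEngine

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,  # scale of Gaussian noise added to gradients
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)
# Training then proceeds as usual: gradients are clipped per sample
# and noised before each optimizer step, limiting what the final
# model can memorise about any single training example.
```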
Federated learning trains models across multiple devices or servers without sharing raw data. Each participant trains locally and shares only model updates (a minimal aggregation sketch follows the diagram below).
```text
Device 1: Train on local data → Send gradients
Device 2: Train on local data → Send gradients
Device 3: Train on local data → Send gradients
                    ↓
         Central server aggregates
                    ↓
      Updated global model sent back
```
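A minimal sketch of the server-side aggregation step (federated averaging), assuming each client returns a PyTorch state dict of its locally trained weights; `client_states` and `global_model` are hypothetical names:

```python
# Federated averaging (FedAvg) sketch: the server averages client
# model weights instead of ever seeing the clients' raw data.
# `client_states` is a list of PyTorch state dicts -- a hypothetical
# stand-in for the updates received from devices.
import torch

def federated_average(client_states):
    avg_state = {}
    for key in client_states[0]:
        # Stack each parameter across clients and take the element-wise mean
        avg_state[key] = torch.stack(
            [state[key].float() for state in client_states]
        ).mean(dim=0)
    return avg_state

# global_model.load_state_dict(federated_average(client_states))
```

Real systems weight each client's contribution by its local dataset size and add safeguards such as secure aggregation, but the core idea is this simple average.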
Training large deep learning models has a significant carbon footprint.