DR Solutions Design and Review

Based on Kumar (2024) and Corbari et al. (2024)

1. What are some of the main vendor lock-in issues the authors identify? How would you mitigate them? According to Kumar (2024), vendor lock-in manifests primarily through Technical and Organizational obstacles.

  • Technical Issues: The use of proprietary APIs and non-standard data formats creates high switching costs. Once an organization integrates deeply with a specific cloud provider’s ecosystem (e.g., using AWS Lambda or proprietary databases), migrating to a different DR provider becomes technically prohibitive due to compatibility issues.
  • Organizational/Legal Issues: Restrictive contracts and the lack of interoperability standards further bind organizations to a single vendor.

Mitigation Strategies: To mitigate these risks, I would recommend a Multi-Cloud Strategy combined with Containerization (e.g., Docker/Kubernetes). By abstracting the application layer from the underlying infrastructure, organizations can move workloads between providers with minimal friction. Additionally, enforcing the use of Open Standards and avoiding proprietary PaaS (Platform as a Service) features where possible ensures that the DR solution remains portable.
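
As a minimal sketch of the abstraction idea above, the DR workflow can depend on a small provider-neutral interface rather than on any one vendor's SDK. The names BackupStore, InMemoryStore, replicate and restore are hypothetical, and the in-memory class only stands in for real provider adapters, which would wrap each vendor's own client library.

```python
from __future__ import annotations

from typing import Protocol


class BackupStore(Protocol):
    """Provider-neutral contract that the DR workflow depends on."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a provider adapter (hypothetical, for illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def replicate(key: str, data: bytes, stores: list[BackupStore]) -> None:
    """Write the same backup to every configured provider."""
    for store in stores:
        store.put(key, data)


def restore(key: str, stores: list[BackupStore]) -> bytes:
    """Return the backup from the first provider that still holds it."""
    for store in stores:
        try:
            return store.get(key)
        except KeyError:
            continue
    raise KeyError(f"{key} not found in any store")


primary = InMemoryStore("provider_a")
secondary = InMemoryStore("provider_b")
replicate("db-snapshot", b"snapshot-bytes", [primary, secondary])

# Simulate losing the primary provider: restore falls back to the secondary.
primary._objects.clear()
recovered = restore("db-snapshot", [primary, secondary])
print(recovered)
```

Because the workflow only calls put and get, swapping or adding a provider means writing one new adapter rather than rewriting the recovery logic.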

2. What are some security concerns with the modern cloud? How can these be mitigated? A major security concern in modern cloud environments is the loss of visibility and control over the underlying infrastructure. However, a more subtle but critical concern identified by Corbari et al. (2024) is the complexity of dependencies. In complex cloud environments, it is difficult to identify exactly which assets are critical to a specific business function. If a DR plan fails to account for a hidden dependency (e.g., an external authentication service), the recovery will fail.

Mitigation Strategies:

  • Mission Thread Analysis (MTA): I would apply the framework proposed by Corbari et al. (2024) to map the “Mission Relevant Cyber Terrain.” This process involves tracing a specific operational thread (e.g., “Process Customer Payment”) end-to-end to identify every critical node and link.
  • Shared Responsibility Awareness: Organizations must clearly define where the vendor’s security responsibility ends and theirs begins, particularly regarding data encryption and access control.
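
The thread-tracing step described above can be sketched as a graph traversal. This is an illustration under stated assumptions, not the procedure from Corbari et al. (2024): the dependency map, the asset names and the DR coverage set are all hypothetical.

```python
# Hypothetical dependency map: each asset lists the assets it directly relies on.
DEPENDENCIES = {
    "process_customer_payment": ["web_frontend", "payment_api"],
    "web_frontend": ["cdn", "auth_service"],
    "payment_api": ["orders_db", "auth_service", "card_gateway"],
    "auth_service": ["external_identity_provider"],
    "orders_db": [],
    "cdn": [],
    "card_gateway": [],
    "external_identity_provider": [],
}

# Assets the (hypothetical) DR plan currently covers.
DR_PLAN_COVERAGE = {"web_frontend", "payment_api", "orders_db", "auth_service", "cdn"}


def trace_thread(start, dependencies):
    """Return every asset reachable from the mission thread (depth-first)."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for dep in dependencies.get(node, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


terrain = trace_thread("process_customer_payment", DEPENDENCIES)
uncovered = sorted(terrain - DR_PLAN_COVERAGE)
print("Mission-relevant assets:", sorted(terrain))
print("Not covered by DR plan:", uncovered)
```

In this toy map the traversal surfaces the external identity provider and the card gateway, which are exactly the kind of indirect dependencies a DR plan built from an asset list alone would miss.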

References

  • Corbari, G.I., Khatod, N., Popiak, J.F. and Sinclair, P. (2024) ‘Mission Thread Analysis: Establishing a Common Framework’, The Cyber Defense Review, 9(1), pp. 37–54.
  • Kumar, A. (2024) Cloud Vendor Lock-In: Identify, Strategies and Mitigate. Seminar Paper, Julius-Maximilians-Universität Würzburg.

Modelling Social Engineering Threats

Based on Aijaz, M. and Nazir, M. (2024)

1. What are the main challenges in modelling and evaluating the outcomes of Social Engineering Threats (SETs), and how does this study address them?

The primary challenge in modelling SETs is the inherent unpredictability of human behavior, which makes rigorous mathematical evaluation difficult. Unlike technical exploits, SETs rely on psychological manipulation, which is historically hard to quantify. The study addresses this by structuring SETs not as random events, but as systematic processes involving specific modalities (e.g., email, phone) and persuasion principles (e.g., authority, scarcity). By categorizing these variables, the authors are able to apply Markov Chain models to calculate the probability of an attack moving from one stage to the next.

2. How do persuasion principles and modalities contribute to the success of SETs?

Persuasion principles (derived from Cialdini’s framework, such as Reciprocity, Commitment, and Social Proof) act as the “exploit code” of a social engineering attack. The study highlights that the success of a SET depends heavily on the pairing of a Modality (the medium, e.g., social media) with the correct Persuasion Principle. Systematically analyzing these pairs is critical because certain combinations yield higher success rates; for example, “Authority” might be more effective via email, while “Liking” works better on social media. Understanding these combinations allows defenders to predict which specific scenarios pose the highest risk.

3. What role do the Attack Tree Model and Markov Chain Model play in estimating probabilities?

The study utilizes a hybrid approach to estimate risk:

  • Attack Tree Model: This is used to calculate the Attack Occurrence Probability (AOP). It maps the hierarchical structure of an attack, using frequency data to estimate how likely a specific attack path is to be attempted.
  • Markov Chain Model: This is used to calculate the Attack Success Probability (ASP). It models the attack as a sequence of states (e.g., Start > Medium > Persuasion > Compromise). The Markov model calculates the probability of transitioning from one state to the next based on the effectiveness of the chosen persuasion principle.
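
To make the Markov Chain step concrete, the sketch below propagates a state distribution through a small absorbing chain and reads off the probability of ending in "Compromise". The transition values and the extra "Fail" state are my own assumptions for illustration; they are not figures from Aijaz and Nazir (2024).

```python
STATES = ["Start", "Medium", "Persuasion", "Compromise", "Fail"]

# Hypothetical transition matrix; row i gives P(next state | current state i).
# "Compromise" and "Fail" are absorbing (they transition only to themselves).
P = [
    [0.0, 0.8, 0.0, 0.0, 0.2],  # Start
    [0.0, 0.0, 0.5, 0.0, 0.5],  # Medium
    [0.0, 0.0, 0.0, 0.3, 0.7],  # Persuasion
    [0.0, 0.0, 0.0, 1.0, 0.0],  # Compromise
    [0.0, 0.0, 0.0, 0.0, 1.0],  # Fail
]


def step(dist, matrix):
    """Advance the state distribution by one transition: dist' = dist x matrix."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def attack_success_probability(matrix, steps=10):
    """Probability mass in 'Compromise' after the chain has been absorbed."""
    dist = [1.0, 0.0, 0.0, 0.0, 0.0]  # the attack begins in "Start"
    for _ in range(steps):
        dist = step(dist, matrix)
    return dist[STATES.index("Compromise")]


asp = attack_success_probability(P)
print(f"ASP: {asp:.3f}")
```

For a strictly linear chain like this one, the result equals the product of the stage probabilities (0.8 × 0.5 × 0.3 = 0.12); the matrix form becomes useful once states can branch or loop back.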

4. How can the findings support the development of effective policy frameworks?

By quantifying the ASP of specific attacks, organizations can move beyond generic “Security Awareness Training” to targeted interventions. For instance, if the model shows that the “Authority” principle delivered via “Email” has the highest success probability, policies can be adjusted to enforce strict verification for executive requests (e.g., mandatory voice confirmation for wire transfers). This allows resources to be allocated based on mathematical risk rankings rather than anecdotal evidence.
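
A minimal sketch of this ranking step follows. The ASP values and the control mapping are hypothetical placeholders, not results from the study; the point is only to show how a ranked list can drive targeted policy.

```python
# Hypothetical ASP values per (modality, persuasion principle) pair.
asp_by_pair = {
    ("Email", "Authority"): 0.31,
    ("Email", "Scarcity"): 0.18,
    ("Phone", "Authority"): 0.22,
    ("Social media", "Liking"): 0.27,
    ("Social media", "Social Proof"): 0.12,
}

# Hypothetical mapping from a scenario to a targeted policy control.
controls = {
    ("Email", "Authority"): "Voice confirmation for executive payment requests",
    ("Social media", "Liking"): "Guidance on unsolicited contact requests",
    ("Phone", "Authority"): "Call-back verification via a known number",
}

# Rank scenarios from highest to lowest success probability.
ranked = sorted(asp_by_pair.items(), key=lambda item: item[1], reverse=True)

for (modality, principle), asp in ranked[:3]:
    control = controls.get((modality, principle), "General awareness training")
    print(f"{modality} + {principle}: ASP={asp:.2f} -> {control}")
```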

References

  • Aijaz, M. and Nazir, M. (2024) ‘Modelling and analysis of social engineering threats using the attack tree and the Markov model’, International Journal of Information Technology, 16(2), pp. 1231–1238. Available at: https://doi.org/10.1007/s41870-023-01540-z

Bayesian Risk Update (Think Bayes 2)

Based on Downey (2022), Chapters 1 & 2

I used Allen Downey’s ThinkBayes2 library to practice Diachronic Bayes, the process of updating a hypothesis (H) based on new data (D).

  • The Problem: I worked through the “Cookie Problem” and “Monty Hall Problem” to understand the mechanics of the formula: P(H|D) = P(H) × P(D|H) / P(D).
  • Application to Risk: I treated the “Prior” (P(H)) as our initial risk assessment (e.g., “There is a 10% chance of a breach”). I then calculated the “Likelihood” (P(D|H)) based on new evidence (e.g., “A firewall log showed 5 failed login attempts”).
  • Outcome: The calculation produced a “Posterior” probability, mathematically demonstrating that risk is dynamic. This highlighted a flaw in traditional “static” risk registers, which often fail to account for real-time threat intelligence.

Description: This script adapts the ‘Think Bayes’ methodology to a security context. It updates the probability of a specific threat (Hypothesis) being active after observing a specific indicator (Evidence).

# A simplified class structure based on Downey's 'Pmf' (Probability Mass Function)
class RiskHypothesis:
    def __init__(self, priors):
        """
        priors: Dictionary of {Hypothesis: Probability}
        e.g., {'High_Risk': 0.1, 'Low_Risk': 0.9}
        """
        self.hypotheses = priors

    def normalize(self):
        """Ensures all probabilities sum to 1.0"""
        total = sum(self.hypotheses.values())
        for hypo in self.hypotheses:
            self.hypotheses[hypo] /= total

    def update(self, evidence, likelihoods):
        """
        Bayes Theorem Application: P(H|E) = P(H) * P(E|H) / P(E)
        evidence: String name of the evidence observed
        likelihoods: Dictionary of {Hypothesis: Probability_of_Evidence}
        """
        for hypo in self.hypotheses:
            # 1. Get the Prior P(H)
            prior = self.hypotheses[hypo]
            # 2. Get the Likelihood P(E|H)
            # "If this hypothesis were true, how likely is this evidence?"
            likelihood = likelihoods[hypo]
            # 3. Calculate Un-normalized Posterior
            self.hypotheses[hypo] = prior * likelihood
        # 4. Normalize (dividing by P(E))
        self.normalize()


# --- USE CASE: INCIDENT RESPONSE ---
# Scenario: We see a failed login. Is it a Brute Force Attack or just a User Mistake?

# 1. ESTABLISH PRIORS (Baseline probability)
# We assume Brute Force attacks are rare (10%) compared to User Mistakes (90%)
priors = {'Brute_Force': 0.1, 'User_Mistake': 0.9}
risk_model = RiskHypothesis(priors)
print("--- PRIOR BELIEFS ---")
print(risk_model.hypotheses)

# 2. NEW EVIDENCE: 5 Failed Logins in 1 Minute
# Likelihood P(E|H):
# - If it IS a Brute Force attack, 5 fails in 1 min is very likely (90%)
# - If it IS a User Mistake, 5 fails in 1 min is rare (5%)
evidence_likelihoods = {
    'Brute_Force': 0.90,
    'User_Mistake': 0.05
}

# 3. UPDATE BELIEFS
risk_model.update(evidence="5_fails_1_min", likelihoods=evidence_likelihoods)
print("\n--- POSTERIOR BELIEFS (After Evidence) ---")
for hypo, prob in risk_model.hypotheses.items():
    print(f"{hypo}: {prob:.2%}")
# Result: The probability of 'Brute_Force' will jump significantly
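
To show why the posterior makes risk "dynamic", the sketch below reuses the same priors and likelihoods from the incident-response scenario and applies the update twice, feeding the first posterior back in as the next prior. The helper bayes_update is my own illustrative function, and treating a second burst of failed logins as independent of the first (given the hypothesis) is a simplifying assumption.

```python
def bayes_update(prior, likelihood):
    """Return the posterior {hypothesis: probability} after one observation."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())  # this is P(E)
    return {h: p / total for h, p in unnormalized.items()}


prior = {"Brute_Force": 0.1, "User_Mistake": 0.9}
likelihood = {"Brute_Force": 0.90, "User_Mistake": 0.05}

after_first = bayes_update(prior, likelihood)
# Assumption: a second burst is independent of the first given the hypothesis.
after_second = bayes_update(after_first, likelihood)

print(f"After 1 burst:  {after_first['Brute_Force']:.2%}")
print(f"After 2 bursts: {after_second['Brute_Force']:.2%}")
```

Worked by hand, the first update is 0.1 × 0.90 / (0.1 × 0.90 + 0.9 × 0.05) = 0.09 / 0.135, or about 66.67%; the second update raises it to 36/37, or about 97.30%.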

References:

Monte Carlo Simulation (Python)

Based on Fizell (2022)

To understand how to model risks with high variance, I developed a Python script using numpy and pandas to run a Monte Carlo simulation. Instead of relying on a single “average” prediction for future risk exposure (e.g., potential financial loss), the simulation generated 1,000 random iterations based on historical volatility.

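  • Key Concept Applied: The script used norm.ppf (the Percent Point Function, i.e. the inverse of the cumulative distribution function) to convert uniform random numbers into draws from a normal distribution with a specified mean and standard deviation, producing a spread of best- and worst-case scenarios. Because a normal distribution has thin tails, this captures routine volatility but understates true “black swan” events, which would need a heavier-tailed distribution.
  • Outcome: The output provided a probability distribution rather than a single number. This allowed me to state that 95% of simulated outcomes fell at or below a specific loss figure (and 90% fell between the 5th and 95th percentiles), providing a far more defensible metric for stakeholders than a “High/Medium/Low” label.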

Description: This script simulates 1,000 potential outcomes for a financial risk scenario (e.g., cost of a data breach) using historical volatility data. It reports the mean together with the 5th and 95th percentiles of the simulated total loss; the 95th percentile serves as the 95% Value at Risk.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import norm

# --- CONFIGURATION ---
# Scenario: Estimating potential financial loss from a supply chain disruption
# Based on historical data, we assume a normal distribution of daily loss.
simulations = 1000       # Number of iterations
days_to_forecast = 30    # Duration of the risk event
avg_daily_loss = 5000    # Mean daily loss in GBP
std_dev_loss = 1500      # Volatility (Standard Deviation)

# --- MONTE CARLO SIMULATION ---
def run_simulation():
    results = []
    for _ in range(simulations):
        # Generate random daily losses based on normal distribution
        # norm.ppf converts a random percentage (0-1) to a value on the distribution curve
        daily_losses = norm.ppf(np.random.rand(days_to_forecast), loc=avg_daily_loss, scale=std_dev_loss)
        # Cumulative sum of losses for this 30-day iteration
        total_event_cost = daily_losses.sum()
        results.append(total_event_cost)
    return np.array(results)

# --- EXECUTION & ANALYSIS ---
simulated_costs = run_simulation()

# Calculate Key Metrics
mean_cost = np.mean(simulated_costs)
worst_case = np.percentile(simulated_costs, 95)  # 95th percentile (Value at Risk)
best_case = np.percentile(simulated_costs, 5)    # 5th percentile

print("--- RISK FORECAST (30 DAYS) ---")
print(f"Mean Expected Cost: £{mean_cost:,.2f}")
print(f"95th Percentile Worst Case (VaR): £{worst_case:,.2f}")
print(f"5th Percentile Best Case: £{best_case:,.2f}")

# Optional: Visualization code would go here
# plt.hist(simulated_costs, bins=50)
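
As a sanity check on the simulation above, the 95th percentile can also be derived in closed form, because the sum of 30 independent normal draws is itself normally distributed. The sketch below is my own cross-check rather than part of Fizell's (2022) method; it uses only the standard library, a fixed seed (42, chosen arbitrarily) and the same configuration values, and it assumes daily losses are independent.

```python
import math
import random
import statistics

SIMULATIONS = 1000
DAYS = 30
AVG_DAILY_LOSS = 5000
STD_DEV_LOSS = 1500

# Closed-form result: the sum of 30 independent normal draws is itself normal,
# with mean 30 * mu and standard deviation sigma * sqrt(30).
total_dist = statistics.NormalDist(
    mu=DAYS * AVG_DAILY_LOSS,
    sigma=STD_DEV_LOSS * math.sqrt(DAYS),
)
analytic_p95 = total_dist.inv_cdf(0.95)

# Seeded simulation so the run is reproducible.
rng = random.Random(42)
totals = [
    sum(rng.gauss(AVG_DAILY_LOSS, STD_DEV_LOSS) for _ in range(DAYS))
    for _ in range(SIMULATIONS)
]

# quantiles(n=20) returns 19 cut points at 5%, 10%, ..., 95%.
cut_points = statistics.quantiles(totals, n=20)
simulated_p5, simulated_p95 = cut_points[0], cut_points[-1]

print(f"Analytic 95th percentile:  GBP {analytic_p95:,.2f}")
print(f"Simulated 95th percentile: GBP {simulated_p95:,.2f}")
print(f"Simulated 5th percentile:  GBP {simulated_p5:,.2f}")
```

If the two 95th-percentile figures diverge by more than a percent or two, that usually points to a bug in the simulation or to too few iterations.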

References:

  • Fizell, Z. (2022) How to Create a Monte Carlo Simulation using Python. Available at: https://towardsdatascience.com/how-to-create-a-monte-carlo-simulation-using-python-c24634a0978a/

GDPR Case Study Analysis: Social Engineering Attack

1. What is the specific aspect of GDPR that your case study addresses? This case addresses Article 32 (Security of Processing) and Article 5(1)(f) (Integrity and Confidentiality). The case involved a law firm where a staff member fell victim to a social engineering attack (phishing), allowing a malicious actor to install malware and defraud a client. The core GDPR issue was the data controller’s failure to implement “appropriate technical and organisational measures” to ensure a level of security appropriate to the risk. Specifically, the firm relied on a cloud email service without enforcing basic industry-standard security settings, such as strong passwords or Multi-Factor Authentication (MFA).

2. How was it resolved? Upon discovering the breach, the firm immediately commissioned a full forensic investigation to determine the root cause and extent of the compromise. Based on the findings, they implemented enhanced technical security measures (specifically enabling MFA) and conducted mandatory cyber security and data protection training for all staff. The Data Protection Commission (DPC) concluded the case by requesting updates on these implementations to ensure the risk of recurrence was mitigated.

3. If this was your organisation, what steps would you take as an Information Security Manager to mitigate the issue? As an Information Security Manager, I would align our mitigation strategy with ISO/IEC 27001 (the control references below use the 2013 Annex A numbering) to ensure compliance with GDPR Article 32:

  • Implement Technical Controls (ISO 27001 A.9): I would mandate Multi-Factor Authentication (MFA) for all external access, particularly for cloud-based email services. Reliance on passwords alone is no longer considered “appropriate” for protecting sensitive client data.
  • Security Awareness Training (ISO 27001 A.7.2.2): I would implement a continuous “phishing simulation” program rather than one-off training. This tests employee resilience to social engineering in real-time.
  • Vendor Risk Management: Since the firm used a third-party cloud provider, I would review the shared responsibility model to ensure we are not assuming default settings are secure. We must configure the “tenant” side of the cloud service to meet our specific risk appetite.

References

Threat Modelling for Industrial Cyber-Physical Systems

Based on Jbair, M., Ahmad, B., Maple, C. and Harrison, R. (2022)

1. What are the key elements and interdependencies in a cyber-physical system that must be captured in a comprehensive threat model?

A comprehensive threat model for Cyber-Physical Systems (CPS) must move beyond simple asset lists to capture the dynamic relationships between physical and digital components. Jbair et al. (2022) propose a data model that links ten critical parameters, including Threat Actors (insider/outsider), Assets (classified by the Purdue Model), Vulnerabilities, Threats (using STRIDE), and Cyber-Attacks (using ICS ATT&CK tactics).

Critically, these elements are interdependent: a Threat Actor exploits a Vulnerability via specific Tactics, Techniques, and Procedures (TTPs) to manipulate an Asset. The accuracy of the risk analysis depends on capturing these links because the Attack Impact varies depending on the asset’s physical function (e.g., a PLC at Level 1 has a higher safety impact than a workstation at Level 4). Failing to map these interdependencies results in a “siloed” view that misses cascading physical risks.

2. How can threat modelling help identify attack entry points and system vulnerabilities in cyber-physical energy systems?

Threat modelling identifies entry points by mapping the “attack surface” exposed by the convergence of IT and OT (Operational Technology). By utilizing frameworks like ICS ATT&CK, analysts can model specific attack trees—such as a “Man-in-the-Middle” attack on a PLC or a “Denial of Service” on an HMI. This structured approach moves beyond ad-hoc vulnerability scanning to identify complex attack paths where an adversary might pivot from a corporate network (Level 4) to control systems (Level 1).

However, a major challenge is that traditional threat modelling tools are often “static” and do not integrate with the engineering tools used to build CPS. Jbair et al. (2022) highlight that existing methodologies often lack the ability to determine risk severity or provide a roadmap for mitigation, making it difficult to justify security investments to engineering stakeholders.
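
The pivot from the corporate network (Level 4) to control systems (Level 1) described above can be sketched as a path search over a network graph. The topology and node names below are hypothetical and the level suffixes are only labels; this is an illustration of attack-path enumeration, not the tooling used by Jbair et al. (2022).

```python
from collections import deque

# Hypothetical plant network: node -> nodes reachable from it.
# The suffix in each name is an assumed Purdue level.
NETWORK = {
    "corp_workstation_L4": ["historian_L3", "jump_host_L3"],
    "historian_L3": ["scada_server_L2"],
    "jump_host_L3": ["engineering_ws_L2"],
    "scada_server_L2": ["hmi_L2", "plc_L1"],
    "engineering_ws_L2": ["plc_L1"],
    "hmi_L2": [],
    "plc_L1": [],
}


def attack_paths(graph, source, target):
    """Enumerate all loop-free paths from source to target (breadth-first)."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        for neighbour in graph.get(node, []):
            if neighbour not in path:  # avoid revisiting nodes
                queue.append(path + [neighbour])
    return paths


paths = attack_paths(NETWORK, "corp_workstation_L4", "plc_L1")
for path in paths:
    print(" -> ".join(path))
```

Each printed path is a candidate entry route; nodes that appear on every path (none in this toy graph beyond the endpoints) would be natural choke points for segmentation or monitoring.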

3. How can scenario-specific metrics and risk assessment methodologies be used to prioritise vulnerabilities?

To effectively prioritize vulnerabilities, organizations must move from qualitative “guesswork” to quantitative metrics. The paper proposes a formulaic approach where Risk (R) is the product of the Attack Vector (AV) and Attack Likelihood (AL): R = AV × AL.

  • Attack Vector (AV): Calculated using the geometric mean of threat actor skills, threat exposure, and impact severity.
  • Attack Likelihood (AL): Derived from historical data of similar attacks in the sector (e.g., assessing if a specific malware strain is trending in the energy sector).

By applying these metrics to a Risk Heat Map, vulnerabilities can be classified from “Very Low” to “Very High.” This allows security teams to automate the generation of Mitigation Controls (such as firmware monitoring or password enforcement) specifically for the highest-risk assets, ensuring that limited resources are targeted where they prevent the most significant physical and digital damage.
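
A minimal sketch of this scoring pipeline follows. The 1 to 5 factor scale, the 0 to 1 likelihood scale, the band thresholds and the example assets are all my own assumptions for illustration; the paper's actual scales and cut-offs may differ.

```python
from statistics import geometric_mean


def attack_vector(actor_skill, exposure, impact):
    """AV as the geometric mean of three factor scores (assumed 1-5 scale)."""
    return geometric_mean([actor_skill, exposure, impact])


def risk_score(av, likelihood):
    """R = AV x AL, with AL on an assumed 0-1 scale."""
    return av * likelihood


def classify(score):
    """Map a score to a heat-map band. Thresholds are hypothetical."""
    bands = [(1.0, "Very Low"), (2.0, "Low"), (3.0, "Medium"), (4.0, "High")]
    for upper, label in bands:
        if score < upper:
            return label
    return "Very High"


# Hypothetical assets: (actor skill, exposure, impact, likelihood)
assets = {
    "PLC (Level 1)": (4, 4, 4, 0.9),
    "HMI (Level 2)": (3, 3, 3, 0.7),
    "Workstation (Level 4)": (2, 4, 1, 0.4),
}

for name, (skill, exposure, impact, al) in assets.items():
    r = risk_score(attack_vector(skill, exposure, impact), al)
    print(f"{name}: R={r:.2f} ({classify(r)})")
```

A geometric mean penalises any single low factor more than an arithmetic mean would, so an asset only scores highly when skill, exposure and impact are all elevated.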

References

  • Jbair, M., Ahmad, B., Maple, C. and Harrison, R. (2022) ‘Threat modelling for industrial cyber physical systems in the era of smart manufacturing’, Computers in Industry, 137, p. 103611. Available at: https://doi.org/10.1016/j.compind.2022.103611

The Role of AI in Risk Management

Based on Kalogiannidis et al. (2024)

1. How does NLP improve the efficiency and accuracy of risk assessment processes? Natural Language Processing (NLP) fundamentally shifts risk assessment from a manual, labor-intensive process to an automated one capable of handling vast datasets. Kalogiannidis et al. (2024) highlight that over 80% of enterprise data is unstructured (e.g., text reports, social media), which traditional quantitative methods often struggle to process. By automating the analysis of this unstructured data, NLP significantly speeds up risk identification, a finding supported by 70.2% of technology specialists. Furthermore, NLP reduces the human bias and error inherent in manual qualitative assessments, with 79.2% of respondents agreeing it improves identification accuracy.

2. In what ways can AI-powered data analytics enhance risk prediction and support business continuity? AI-powered analytics enables a transition from reactive to proactive risk management. Unlike traditional methods that rely on historical data and static risk factors, AI analytics can detect subtle patterns and anomalies in real-time streams. The study found that 71.5% of respondents agreed AI enhances the accuracy of predicting potential risks, rather than just reporting on past ones. Crucially, for business continuity, these tools allow for the rapid identification of “emerging risks” that have not yet materialised, with 93.5% of professionals noting that it supports a proactive approach.

3. Why is it important for businesses to integrate multiple AI technologies, beyond just NLP? While NLP is effective for efficiency, the study’s regression analysis indicates its direct impact on business continuity is only moderate compared to other technologies. In contrast, the integration of AI into Incident Response Planning demonstrated the highest statistical impact on minimising business disruption (coefficient of 0.361). Therefore, a “comprehensive strategy” is required: NLP for data processing, predictive analytics for identifying emerging threats, and AI-driven incident response to enhance resilience during crises. Relying solely on one tool leaves gaps in the Risk Management Process.

References

  • Kalogiannidis, S., Kalfas, D., Papaevangelou, O., Giannarakis, G. and Chatzitheodoridis, F. (2024) ‘The Role of Artificial Intelligence Technology in Predictive Risk Assessment for Business Continuity: A Case Study of Greece’, Risks, 12(2), p. 19. Available at: https://doi.org/10.3390/risks12020019