0 Core Principle of This Application
⚠ Architecture Notice
This application is built on a Naive Bayes architecture. It assumes that every piece of evidence you enter is conditionally independent from all other evidence, given the hypothesis. Understanding this assumption is essential for using the tool correctly.
Bayesian reasoning is the mathematically principled way to update beliefs in light of new information. You start with a prior — what you believe before seeing the evidence — and you revise it into a posterior — what you should believe after seeing the evidence.
This application performs that revision iteratively: each evidence item is applied one after another, each multiplying the running probability by a factor derived from its Likelihood Ratio and your stated Confidence in it. The result is a transparent, auditable chain of reasoning.
1 How the Calculation Works
The engine converts the prior probability into odds, multiplies by each evidence item's effective Likelihood Ratio, then converts back to a probability. In compact form:
Master formula
Posterior Odds = Prior Odds × ∏ ( LRᵢ ^ confidenceᵢ )
The step-by-step logic:
- Convert your Prior Probability P into odds: Odds = P / (1 − P)
- For each evidence item i, compute its Effective LR: LR_eff = LRᵢ ^ confidenceᵢ
- Multiply the running odds by each Effective LR in sequence.
- Convert the final odds back to a probability: P_posterior = Odds / (1 + Odds)
Why odds instead of percentages? Multiplying odds by Likelihood Ratios is mathematically clean and exact. Working directly in probabilities would require more complex arithmetic at every step.
📐 Worked Numerical Example
- Prior: probability 0.40 → Odds = 0.40 / 0.60 ≈ 0.667
- Evidence 1: LR = 3.0, Confidence = 0.8 → LR_eff = 3.0 ^ 0.8 ≈ 2.41; Odds = 0.667 × 2.41 ≈ 1.607
- Evidence 2: LR = 0.5, Confidence = 1.0 → LR_eff = 0.50; Odds = 1.607 × 0.50 ≈ 0.804
- Posterior: P = 0.804 / 1.804 ≈ 0.446
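The step-by-step logic above can be condensed into a short sketch. Python is used purely for illustration; the app itself runs in the browser, and the function name `posterior` is an assumption, not the app's actual code:

```python
def posterior(prior, evidence):
    """Fold (LR, confidence) pairs into the prior using the odds form."""
    odds = prior / (1.0 - prior)       # Prior Odds = P / (1 - P)
    for lr, confidence in evidence:
        odds *= lr ** confidence       # Effective LR = LR ^ confidence
    return odds / (1.0 + odds)         # convert back to a probability

# Worked example: prior 0.40, then Ev.1 (LR 3.0, conf 0.8) and Ev.2 (LR 0.5, conf 1.0)
p = posterior(0.40, [(3.0, 0.8), (0.5, 1.0)])
print(round(p, 3))  # 0.445 (the worked example shows 0.446 because it rounds at each step)
```

Because the update is a product, the order in which evidence items are entered does not change the final posterior, only the intermediate values.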
2 Hypothesis
The Hypothesis is the central claim you want to evaluate. It must be a clearly defined, falsifiable statement — something that can be either supported or contradicted by real-world observations.
Example hypothesis:
"Village development policies that focus heavily on physical infrastructure may unintentionally increase income inequality among residents."
A well-formed hypothesis is specific enough that you can meaningfully ask: "Does this piece of evidence make this claim more or less plausible?" All evidence you enter into the system is evaluated against this single hypothesis.
3 Prior Probability
The Prior Probability is your starting belief in the hypothesis before considering any current evidence. It represents the background plausibility of the claim based on existing knowledge, prior studies, expert consensus, or institutional experience.
- 0.50 — You have no initial lean; the hypothesis is as likely as not. A reasonable default when you lack prior information.
- 0.30 — You consider the hypothesis unlikely based on prior knowledge.
- 0.70 — Prior experience or literature already suggests the hypothesis is plausible.
- 0.10 – 0.20 — Strong reason to doubt; use only if existing evidence strongly points away from the hypothesis.
- 0.80 – 0.90 — Substantial prior evidence already exists in favor; evidence is being used to confirm a near-established view.
Be honest about your prior. A biased starting point will distort the final posterior regardless of how carefully you calibrate your evidence. When in doubt, use 0.50 and let the evidence speak.
3.1 The Rule of 0 and 1 (Cromwell's Rule)
In Bayesian logic, if you set a Prior to exactly 0 or 1, no amount of evidence can ever change your mind. A prior of 0 converts to odds of 0, and multiplying zero by any Likelihood Ratio still yields zero; a prior of 1 converts to infinite odds that no finite evidence can pull back down.
Never say never: Always leave a tiny margin for doubt (e.g., 0.01 or 0.99). For a field analyst, assigning 100% certainty to an initial belief is the death of objective reasoning.
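A minimal demonstration of Cromwell's Rule, reusing the same odds-based update (Python for illustration only; `posterior` is a hypothetical helper, not the app's code). Note that a prior of exactly 1 would make the odds conversion divide by zero:

```python
def posterior(prior, evidence):
    """Odds-form Bayesian update; a prior of exactly 1.0 raises ZeroDivisionError."""
    odds = prior / (1.0 - prior)
    for lr, confidence in evidence:
        odds *= lr ** confidence
    return odds / (1.0 + odds)

# Five maximally confident, strongly supporting items (LR = 10 each)
strong_evidence = [(10.0, 1.0)] * 5
print(posterior(0.0, strong_evidence))   # 0.0: a zero prior can never move
print(posterior(0.01, strong_evidence))  # ~0.999: a near-zero prior still can
```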
4 Decision Threshold
The Decision Threshold is the posterior probability value at which you consider the hypothesis sufficiently supported to justify a decision, intervention, or recommendation. It separates exploratory uncertainty from actionable confidence.
- 0.60 — Low bar; appropriate for exploratory analysis or flagging issues for further investigation.
- 0.70 – 0.75 — Moderate confidence; suitable for programmatic recommendations or preliminary policy decisions.
- 0.80 – 0.85 — High bar; appropriate for significant resource allocation, formal reports, or public-facing recommendations.
- 0.90+ — Very high bar; use only where the cost of being wrong is severe.
Choosing the right threshold is a judgment call. Consider the consequences of a false positive (acting when the hypothesis is actually false) versus a false negative (failing to act when it is true). Higher-stakes decisions warrant a higher threshold.
5 Evidence Items
Each evidence item represents a discrete real-world observation that bears on the hypothesis. Evidence can come from a wide variety of sources:
- Field observations and site visit reports
- Survey data and community interviews
- Statistical indicators and administrative data
- Expert assessments and peer-reviewed findings
- Policy documents and program outcomes
Evidence Direction
Each item must be labeled as either supporting (For) or contradicting (Against) the hypothesis. This directionality is built into the Likelihood Ratio: an LR above 1.0 supports the hypothesis, and an LR below 1.0 contradicts it. Providing both types of evidence produces a more balanced and credible analysis.
6 Likelihood Ratio (LR)
The Likelihood Ratio quantifies how much a specific piece of evidence shifts the probability of the hypothesis. Formally, it is the ratio of two probabilities:
Definition
LR = P(Evidence | Hypothesis is TRUE) ÷ P(Evidence | Hypothesis is FALSE)
In plain language: How much more (or less) likely is it that we would observe this evidence if the hypothesis were true, compared to if it were false?
| LR Range | Interpretation |
| ≥ 5 | Strong support — evidence is much more likely if hypothesis is true |
| 2 – 4 | Moderate support — meaningfully increases confidence |
| 1.2 – 1.9 | Weak support — slight positive update |
| 1.0 | Neutral — evidence tells you nothing about the hypothesis |
| 0.6 – 0.9 | Weak contradiction — slight negative update |
| 0.3 – 0.5 | Moderate contradiction — meaningfully decreases confidence |
| ≤ 0.2 | Strong contradiction — evidence strongly argues against hypothesis |
Tip: When in doubt, err toward moderate values (1.5–3.0 for supporting, 0.3–0.7 for contradicting). Extreme values like 10 or 0.05 require very strong justification and can dominate the entire analysis.
★ Likelihood Ratio Suggestion Scale
If you are unsure what Likelihood Ratio value to assign, you can use the built-in Preset dropdown. The scale translates qualitative evidence strength into approximate LR values commonly used in Bayesian reasoning and intelligence analysis.
| Evidence Strength | Approximate LR |
| Strong Support (For) | 10.0 |
| Moderate Support (For) | 3.0 |
| Weak Support (For) | 1.5 |
| Neutral | 1.0 |
| Weak Against | 0.7 |
| Moderate Against | 0.4 |
| Strong Against | 0.2 |
These values are guidelines only. Users may still manually type custom Likelihood Ratios if more precise estimates are available.
7 Confidence — What It Is and Why It Matters
★ Key Concept
Confidence is not how strongly you believe the evidence supports the hypothesis. That is what the Likelihood Ratio captures. Confidence is a separate, orthogonal judgment: how reliable and trustworthy is the evidence itself?
The Core Distinction
Every piece of evidence carries two independent questions:
- If this evidence is real and accurate, how much does it update my belief? → This is the Likelihood Ratio.
- How much do I trust that this evidence is real and accurate in the first place? → This is your Confidence.
You might have an observation that, if true, would be very powerful evidence — but it comes from a single, unverified source. You give it a high LR because the finding is significant, but a low Confidence because you are uncertain about its quality. The model gracefully handles both dimensions simultaneously.
The Mathematical Effect
Confidence works as an exponent applied to the Likelihood Ratio:
Effective Likelihood Ratio
LR_effective = LR ^ confidence
What this means in practice:
- At Confidence = 1.0, the evidence is applied at full strength:
LR_eff = LR
- At Confidence = 0.5, the evidence is applied at half-power:
LR_eff = √LR
- At Confidence = 0.0, the evidence has no effect:
LR_eff = 1.0 (neutral)
Crucially, the direction of the evidence is preserved: reducing confidence pulls the effect toward neutrality, but it never reverses it.
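The three cases above can be verified directly. A tiny illustration with invented values; note that 4.0 ** 0.5 is √4 = 2.0, the "half-power" case:

```python
supporting = 4.0      # LR > 1: supports the hypothesis
contradicting = 0.25  # LR < 1: contradicts it

for confidence in (1.0, 0.5, 0.0):
    # Lowering confidence pulls both values toward 1.0 (neutral) but never across it
    print(confidence, supporting ** confidence, contradicting ** confidence)
```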
Confidence Reference Scale
| Confidence | Evidence Quality |
| 0.2 | Anecdotal / unverified |
| 0.4 | Single source, unclear method |
| 0.6 | Reliable source, some caveats |
| 0.8 | Strong source, minor uncertainty |
| 1.0 | Verified, robust, high-quality data |
What Degrades Confidence?
- Source quality: Unverified reports, single informants, rumors → lower confidence
- Sample size: One village versus a survey of one hundred villages → lower confidence for small samples
- Observation recency: Data from several years ago may no longer reflect current conditions
- Measurement method: Subjective assessments are less reliable than objective measurements
- Potential bias: Evidence gathered by parties with a stake in the outcome
- Consistency: Evidence that contradicts many other reliable sources
📌 Confidence in Practice
Scenario: LR = 3.0 (moderately strong support)
| Confidence | Evidence Quality | Effective LR |
| 1.0 | Verified field data, representative sample | 3.00 |
| 0.7 | Reliable source but only partial coverage | 2.16 |
| 0.4 | Single informant, unverified observation | 1.55 |
| 0.1 | Rumor / very low quality source | 1.11 |
The key insight: Confidence allows you to include weak or uncertain evidence without letting it dominate the result. It is the mechanism that prevents low-quality data from being treated as equally powerful as verified findings. This makes the model robust to real-world data quality variation.
8 Impact on Posterior
This indicator shows the percentage-point change in the posterior probability attributable to a single evidence item. It answers the question: "Of all the evidence I entered, which one moved the needle the most?"
Use this to:
- Identify the most influential factors driving your conclusion
- Spot evidence that may be over-weighted (very high impact with low confidence)
- Prioritize which data to verify or expand in further fieldwork
- Communicate to stakeholders which findings are most critical
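A sketch of how such a percentage-point figure can be computed, by comparing the running posterior before and after each item (`impacts` is a hypothetical helper name, not the app's code). Because the evidence is applied sequentially, each item's point change depends on where it sits in the chain, even though the final posterior does not:

```python
def impacts(prior, evidence):
    """Percentage-point change in the running posterior per evidence item."""
    odds = prior / (1.0 - prior)
    deltas = []
    for lr, confidence in evidence:
        before = odds / (1.0 + odds)
        odds *= lr ** confidence
        after = odds / (1.0 + odds)
        deltas.append(round(100 * (after - before), 1))
    return deltas

# The worked example from section 1: +21.6 points, then -17.1 points
print(impacts(0.40, [(3.0, 0.8), (0.5, 1.0)]))  # [21.6, -17.1]
```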
9 Notes / Evidence Description
The notes field is optional but strongly recommended for any analysis that will be reviewed, shared, or revisited. Document:
- The source of the evidence (report name, survey date, institution)
- How the observation was collected (methodology)
- Why you assigned the specific LR and Confidence values you chose
- Any caveats or limitations associated with the evidence
- Links to supporting documents or data tables
This documentation is what separates a transparent, auditable analysis from an opaque black box. It allows others — and your future self — to critically evaluate the reasoning behind each probability update.
10 The Independence Assumption & Double-Counting Risk
The core limitation of Naive Bayes is that it treats every evidence item as independent of the others. In reality, two observations may be causally linked: Evidence A may cause Evidence B. If you enter both independently, you apply the same underlying signal twice and double-count it.
Example of double-counting risk:
— Evidence A: "Road infrastructure in the area is severely damaged"
— Evidence B: "Logistics and supply distribution are significantly delayed"
If damaged roads cause distribution delays, these are not independent observations. Entering both multiplies the impact of a single underlying cause, artificially inflating the posterior.
The Fieldwork Solution: Composite Evidence
When two observations are causally connected, merge them into a single composite evidence statement and assign one representative Likelihood Ratio to the combined observation:
"Logistics and supply distribution are significantly delayed as a direct consequence of severe damage to road infrastructure."
This single entry captures the full meaning without double-counting. The Likelihood Ratio you assign should reflect the combined strength of both signals, and your Confidence should reflect how well-documented the causal link is.
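The inflation is easy to quantify. In this hypothetical sketch, evidence A (roads) and B (delays) are each given LR = 3.0 when entered separately, while the composite entry gets a single LR of 4.0; all numbers are invented for illustration:

```python
def posterior(prior, evidence):
    """Odds-form update over (LR, confidence) pairs; illustration only."""
    odds = prior / (1.0 - prior)
    for lr, confidence in evidence:
        odds *= lr ** confidence
    return odds / (1.0 + odds)

# Entered separately, the single underlying signal is applied twice...
double_counted = posterior(0.50, [(3.0, 1.0), (3.0, 1.0)])
# ...versus one composite item whose LR reflects the combined strength
composite = posterior(0.50, [(4.0, 1.0)])
print(double_counted, composite)  # 0.9 0.8
```

The double-counted version overstates the posterior even though both entries trace back to the same damaged roads.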
11 Naive Bayes vs. Full Bayesian Networks
Full Bayesian Networks (also called Bayesian Belief Networks) represent causal dependencies between variables using a Directed Acyclic Graph (DAG). Each variable can conditionally depend on others, and inference propagates through the network via algorithms such as:
- Belief Propagation (Message Passing) — efficient on tree-structured networks
- Variable Elimination — exact inference by marginalizing over hidden variables
- MCMC Sampling — approximate inference for large, complex networks
| Aspect | Naive Bayes (This App) | Full Bayesian Network |
| Structure | Star topology — all evidence connects directly to hypothesis node | Arbitrary DAG — nodes can connect to any other node |
| Independence | Evidence items assumed conditionally independent | Dependencies between variables explicitly modeled |
| Computation | Very fast — a single pass of multiplications | Computationally intensive — requires CPT tables and inference algorithms |
| Data Needed | Only LR and confidence estimates per evidence item | Full Conditional Probability Tables for each node and parent combination |
| Infrastructure | Runs fully offline in a browser; no server required | Typically requires dedicated libraries (e.g., pgmpy, Netica) or server-side compute |
| Transparency | Every update is traceable and explainable to non-experts | Inference steps are often opaque without specialist knowledge |
| Best for | Field decision support, rapid assessments, resource-constrained environments | Academic research, complex causal modeling, machine learning pipelines |
12 Why Naive Bayes Is Practical for Fieldwork
- Fully Offline Operation: The entire computation runs inside your browser. No internet connection, server, or external library is required — critical for remote or low-connectivity field environments.
- Low Cognitive Overhead: Analysts need only two numbers per evidence item — a Likelihood Ratio and a Confidence value — rather than full conditional probability tables with potentially dozens of entries.
- Empirically Robust: Despite the independence assumption, Naive Bayes classifiers consistently perform well in practice — even when variables are correlated. The model is more resilient to mild dependency violations than its theoretical limitations might suggest.
- Transparent Reasoning Chain: Every evidence item's contribution to the posterior is visible and individually interpretable. The analysis can be fully explained to policymakers, community leaders, or auditors without specialist training.
- Auditable and Reproducible: All inputs — prior, LRs, confidence values — are explicit and documented. A colleague can reproduce, challenge, or refine the analysis by adjusting any single parameter.
- Graceful Handling of Uncertainty: Through the Confidence parameter, the model explicitly represents the varying quality of real-world field evidence — something many simpler frameworks cannot do.
13 Guardrails: Detecting Confirmation Bias
Human analysts are naturally prone to Confirmation Bias: the tendency to collect only evidence that supports an existing worldview. This application monitors the composition of your evidence list in real time.
Triggering a Warning
- Homogeneous Evidence: If 100% of your items have an LR > 1.0, the system will flag a "Blindspot Warning." It is statistically rare in complex governance environments for no contradictory data to exist.
- Cluster Over-reliance: If 80% of the movement in your posterior probability comes from a single cluster (e.g., "Economic Data"), the system warns of a lack of multidimensionality.
Actionable Strategy: When a bias warning appears, do not simply delete supporting evidence. Instead, spend your next "field hour" deliberately red-teaming your hypothesis: actively search for observations that would contradict it.
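The homogeneous-evidence check is simple to sketch (the application's real implementation may differ; `blindspot_warning` is a hypothetical name):

```python
def blindspot_warning(lrs):
    """Flag a homogeneous evidence list: every item supports the hypothesis."""
    return len(lrs) > 0 and all(lr > 1.0 for lr in lrs)

print(blindspot_warning([3.0, 1.5, 10.0]))  # True  -> flag a Blindspot Warning
print(blindspot_warning([3.0, 0.7, 1.5]))   # False -> the list contains counter-evidence
```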
14 The Confidence Matrix: Calibration over Instinct
Moving a slider based on "gut feeling" can be hard to defend in a formal audit. The Confidence Matrix Calculator (🧮) uses a 3-point triangulation to standardize how evidence quality is scored:
| Dimension | Low Score (0.2 – 0.4) | High Score (0.8 – 1.0) |
| Source Reliability | Rumors, unverified social media, single informant | Official registries, institutional data, cross-verified experts |
| Sample/Method | Anecdotal observations, tiny sample size, leading questions | Randomized surveys, peer-reviewed methodology, representative samples |
| Recency | Data from >2 years ago in a fast-changing environment | Real-time data or very recent site visit (current quarter) |
By averaging these three scores, you transform a subjective "feeling" into a methodological assessment that can be documented in your final report.
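The averaging step is deliberately simple. A sketch assuming equal weights for the three dimensions (`matrix_confidence` is a hypothetical name; the app's rounding behavior may differ):

```python
def matrix_confidence(source, method, recency):
    """Average the three dimension scores (each in 0.0-1.0) into one confidence value."""
    return round((source + method + recency) / 3, 2)

# e.g. strong source (0.9), weaker method (0.6), stale data (0.4)
print(matrix_confidence(0.9, 0.6, 0.4))  # 0.63
```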
15 Pro-Tip: The Contrast Method for LR
Struggling to pick an LR? Use the "Contrast Question":
How much more likely is this evidence in a world where the hypothesis is TRUE,
compared to a world where it is FALSE?
If you would expect to see this evidence 60% of the time if the claim is true, but only 20% of the time if it's false, your LR is 3.0 (60/20).
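In code, the contrast question reduces to a single division (hypothetical helper name, numbers from the example above):

```python
def contrast_lr(p_if_true, p_if_false):
    """LR from the two answers to the contrast question."""
    return p_if_true / p_if_false

print(round(contrast_lr(0.60, 0.20), 2))  # 3.0
```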
16 🚀 Quick Start: Learning from Zero
If you are new to Bayesian reasoning, do not worry about the formulas at first.
Start with these three simple steps and focus on the logic of the evidence.
- Set the Prior to 0.5: This represents an initial position of uncertainty — you are essentially saying “I have no strong belief yet.” In Bayesian analysis, this is often called a non-informative starting point. From there, the evidence will update the probability.
- Use the Likelihood Slider Relatively: Think about how likely the evidence would appear if the hypothesis were true. If the evidence would very likely appear when the hypothesis is true, move the slider to the right (supporting evidence). If the evidence would rarely appear when the hypothesis is true, move the slider to the left (evidence against the hypothesis). The goal is not perfect numbers, but a reasonable estimate of how the evidence changes the likelihood of the hypothesis.
- Use the Confidence Matrix: If you are unsure about the reliability of the information, click the 🧮 icon. The matrix helps estimate how trustworthy or strong the evidence source is, so the system can adjust the influence of that evidence automatically.
17 🧠 Mental Model: Thinking Like a Bayesian
Bayesian thinking is closer to how a detective works than how a judge works.
A judge makes a final decision at the end of a trial, but a detective constantly
updates their belief as new clues appear.
Key idea:
Bayesian reasoning does not aim to find an absolute truth immediately.
Instead, it estimates the most reasonable belief given the evidence available
right now.
When new evidence appears tomorrow, the probability should change.
Updating beliefs in light of new data is not inconsistency — it is the
core principle of rational scientific reasoning.
18 🛑 Common Beginner Mistakes
- Using Extreme Likelihood Ratios Too Quickly: Assigning very large Likelihood Ratios (for example LR = 20) to evidence based only on rumors or weak observations. In most practical analyses, moderate values (around 1.5 – 3.0) are safer unless the evidence is extremely strong or experimentally verified.
- Ignoring Contradictory Evidence: Adding many pieces of supporting evidence while intentionally ignoring information that contradicts the hypothesis. This is classic confirmation bias. A robust analysis should always consider both supporting and opposing evidence.
- Confusing Evidence Strength with Source Reliability: Remember that these two concepts are different:
  - Likelihood Ratio (LR): how strongly the evidence supports or contradicts the hypothesis.
  - Confidence: how much you trust the source or quality of that evidence.
  Strong evidence from an unreliable source should still be treated cautiously.
19 🎯 Decision Threshold
The decision threshold defines the minimum probability required
before taking action on a hypothesis.
In Bayesian analysis, probability alone does not automatically determine a decision.
A threshold acts as a boundary between "continue investigating" and
"sufficient evidence to act."
Decision rule:
Act if Posterior ≥ Threshold.
For example, if the threshold is set to 0.75 and the calculated
posterior probability is 0.70, the evidence suggests the hypothesis
is plausible, but not strong enough to justify action yet.
The appropriate threshold depends on the context and the cost of being wrong.
- Exploratory analysis: 0.60 – 0.70
- Policy decisions: 0.70 – 0.80
- High-risk decisions: 0.80 – 0.95
Higher thresholds require stronger evidence but reduce the risk of acting on
incorrect conclusions.
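The decision rule is a one-line comparison (a sketch with illustrative names, using the 0.70 vs. 0.75 example above):

```python
def decision(posterior_p, threshold):
    """Act only when the posterior clears the threshold."""
    return "act" if posterior_p >= threshold else "keep investigating"

print(decision(0.70, 0.75))  # keep investigating
print(decision(0.78, 0.75))  # act
```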
20 📊 Confidence Matrix (Evidence Quality)
Not all evidence has the same reliability. The Confidence Matrix
helps users estimate the overall quality of an evidence source by scoring
three dimensions.
The system averages these scores to generate the final
confidence value used in the Bayesian update.
The confidence score modifies how strongly a piece of evidence influences
the final probability.
Three Evidence Quality Dimensions
- Source Reliability: Measures how trustworthy the source of the information is.
  - Rumor / informal claim → low reliability
  - Official statistics or verified data → high reliability
- Sample Size / Method: Evaluates how the evidence was collected.
  - Anecdotal observation → weaker evidence
  - Structured surveys or peer-reviewed studies → stronger evidence
- Recency: Considers how current the information is.
  - Outdated data may no longer reflect present conditions
  - Recent observations typically provide stronger signals
The average of these three dimensions becomes the final
confidence score applied to the evidence.
Effective LR = LR ^ confidence
This means low-confidence evidence still contributes to the analysis,
but its impact on the final probability is reduced.
21 📈 Evidence Impact
Each evidence item displays its impact on the posterior probability.
This shows how much the hypothesis probability changed after the evidence
was incorporated into the calculation.
- Positive impact → evidence supports the hypothesis
- Negative impact → evidence weakens the hypothesis
This feature helps analysts quickly identify which observations are
driving the final result.
22 🔍 Top Drivers
The Top Drivers section highlights the pieces of evidence
that contributed the most to the final probability update.
This allows users to quickly identify the key factors influencing the
analytical conclusion.
Large positive values indicate strong supporting evidence,
while large negative values indicate strong counter-evidence.
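Top Drivers can be reproduced by ranking items by absolute percentage-point impact on the running posterior. A sketch with invented labels and values, not the app's actual code:

```python
def top_drivers(prior, evidence, n=3):
    """Rank (label, LR, confidence) items by absolute percentage-point impact."""
    odds = prior / (1.0 - prior)
    deltas = []
    for label, lr, confidence in evidence:
        before = odds / (1.0 + odds)
        odds *= lr ** confidence
        deltas.append((label, 100 * (odds / (1.0 + odds) - before)))
    # Largest mover first, whether it pushed the posterior up or down
    return sorted(deltas, key=lambda item: abs(item[1]), reverse=True)[:n]

drivers = top_drivers(0.50, [("survey", 3.0, 0.8), ("rumor", 1.5, 0.2), ("audit", 0.4, 1.0)])
# "audit" and "survey" dominate; "rumor" barely moves the needle
```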