# PDSL to ProbLog Translation
**Version:** 0.1.0
This document describes the complete translation from PDSL to ProbLog syntax, including patterns, edge cases, and optimization strategies.
## Table of Contents
1. [Translation Overview](#translation-overview)
2. [Basic Translations](#basic-translations)
3. [Advanced Patterns](#advanced-patterns)
4. [Optimization Strategies](#optimization-strategies)
5. [Edge Cases](#edge-cases)
6. [Bidirectional Translation](#bidirectional-translation)
7. [Translation Examples](#translation-examples)
8. [Translation Validation](#translation-validation)
9. [Performance Considerations](#performance-considerations)
10. [Troubleshooting Common Issues](#troubleshooting-common-issues)
11. [Future Extensions](#future-extensions)
12. [Reference Implementation](#reference-implementation)
13. [Testing Translation](#testing-translation)
## Translation Overview
### Design Principles
1. **Semantic Preservation:** A PDSL program and its ProbLog translation have identical semantics
2. **Readability:** Generated ProbLog should be human-readable
3. **Efficiency:** Optimize for ProbLog inference performance
4. **Completeness:** Support all ProbLog features that PDSL v0.1.0 can express (continuous distributions are not yet covered)
### Translation Pipeline
```
PDSL Source
↓
Lexer + Parser
↓
AST (validated)
↓
ProbLog Generator
↓
ProbLog Output
```
## Basic Translations
### 1. Probabilistic Facts
**PDSL:**
```pdsl
0.3 :: rain
```
**ProbLog:**
```prolog
0.3::rain.
```
**Pattern:** Direct one-to-one translation: drop the spaces around `::` and add a period terminator.
### 2. Probabilistic Rules
**PDSL:**
```pdsl
0.9 :: flies(X) :- bird(X)
```
**ProbLog:**
```prolog
0.9::flies(X) :- bird(X).
```
**Pattern:** Add period at end, preserve structure.
### 3. Deterministic Facts
**PDSL:**
```pdsl
bird(sparrow)
```
**ProbLog:**
```prolog
bird(sparrow).
```
**Pattern:** Add period terminator.
### 4. Conjunctions
**PDSL:**
```pdsl
0.8 :: alarm :- burglar, window_broken
```
**ProbLog:**
```prolog
0.8::alarm :- burglar, window_broken.
```
**Pattern:** Comma separator preserved.
### 5. Negation
**PDSL:**
```pdsl
0.9 :: flies(X) :- bird(X), not penguin(X)
```
**ProbLog:**
```prolog
0.9::flies(X) :- bird(X), \+ penguin(X).
```
**Pattern:** `not` → `\+` (negation as failure).
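The clause patterns above (facts, rules, conjunctions, negation) can be sketched as one small rendering function. This is a minimal sketch, not the reference generator: the `Literal` and `Clause` shapes and the function names are hypothetical and chosen only for illustration.

```typescript
// Hypothetical AST shapes; the real parser's node types may differ.
interface Literal {
  atom: string;     // already-rendered atom text, e.g. "bird(X)"
  negated: boolean; // true for PDSL `not atom`
}

interface Clause {
  probability?: number; // absent for deterministic facts and rules
  head: string;
  body: Literal[];      // empty for facts
}

// Render one literal, mapping PDSL `not` to ProbLog `\+`.
function renderLiteral(lit: Literal): string {
  return lit.negated ? `\\+ ${lit.atom}` : lit.atom;
}

// Render a fact or rule as a single period-terminated ProbLog statement.
function renderClause(clause: Clause): string {
  const prefix = clause.probability === undefined ? '' : `${clause.probability}::`;
  const body = clause.body.map(renderLiteral).join(', ');
  return body.length > 0
    ? `${prefix}${clause.head} :- ${body}.`
    : `${prefix}${clause.head}.`;
}

const rule: Clause = {
  probability: 0.9,
  head: 'flies(X)',
  body: [
    { atom: 'bird(X)', negated: false },
    { atom: 'penguin(X)', negated: true },
  ],
};
console.log(renderClause(rule)); // 0.9::flies(X) :- bird(X), \+ penguin(X).
```

Joining the body with `', '` also gives the single-line flattening of multi-line PDSL rules for free, since line breaks in the source never reach the rendered text.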
### 6. Annotated Disjunctions
**PDSL:**
```pdsl
0.3 :: weather(rainy); 0.5 :: weather(cloudy); 0.2 :: weather(sunny)
```
**ProbLog:**
```prolog
0.3::weather(rainy); 0.5::weather(cloudy); 0.2::weather(sunny).
```
**Pattern:** Semicolon separators preserved, single period at end.
### 7. Observations (Evidence)
**PDSL:**
```pdsl
observe fever
```
**ProbLog:**
```prolog
evidence(fever, true).
```
**Pattern:** Wrap in `evidence/2` predicate with `true` value.
**Negated observation:**
**PDSL:**
```pdsl
observe not raining
```
**ProbLog:**
```prolog
evidence(raining, false).
```
**Pattern:** Negation becomes `false` value.
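The two observation patterns can be illustrated with a string-level sketch. The function name `translateObserve` is hypothetical, and a real translator would work on AST nodes rather than raw lines; the point here is only how the optional `not` selects the `true` or `false` value.

```typescript
// Translate one PDSL `observe` line into a ProbLog evidence/2 statement.
// Returns null when the line is not an observation.
function translateObserve(line: string): string | null {
  // Group 1 captures an optional leading `not`; group 2 captures the observed atom.
  const match = /^observe\s+(not\s+)?(.+)$/.exec(line.trim());
  if (match === null) {
    return null;
  }
  const value = match[1] === undefined ? 'true' : 'false';
  return `evidence(${match[2]}, ${value}).`;
}

console.log(translateObserve('observe fever'));       // evidence(fever, true).
console.log(translateObserve('observe not raining')); // evidence(raining, false).
```

Because the `not` group requires trailing whitespace, an atom that merely starts with those letters (for example `notable`) is still treated as a positive observation.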
### 8. Queries
**PDSL:**
```pdsl
query flu
```
**ProbLog:**
```prolog
query(flu).
```
**Pattern:** Wrap in `query/1` predicate.
### 9. Learning Directives
**PDSL:**
```pdsl
learn parameters from dataset("medical_data.csv")
```
**ProbLog:**
```prolog
% Learn parameters from medical_data.csv
:- learn('medical_data.csv').
```
**Pattern:** Comment + directive syntax; the dataset path is emitted as a single-quoted atom so that the file name parses as one atom.
## Advanced Patterns
### 1. Multiple Models
**PDSL:**
```pdsl
probabilistic_model Weather {
0.3 :: rain
}
probabilistic_model Traffic {
0.4 :: congestion :- rain
}
```
**ProbLog:**
```prolog
% Model: Weather
0.3::rain.
% Model: Traffic
0.4::congestion :- rain.
```
**Pattern:** Models merged into single file, separated by comments.
### 2. Complex Rules
**PDSL:**
```pdsl
0.7 :: service_available :-
up(server1),
up(router),
not maintenance_mode
```
**ProbLog:**
```prolog
0.7::service_available :- up(server1), up(router), \+ maintenance_mode.
```
**Pattern:** Multi-line PDSL flattened to single line.
### 3. Recursive Rules
**PDSL:**
```pdsl
parent(alice, bob)
ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z)
```
**ProbLog:**
```prolog
parent(alice, bob).
ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
```
**Pattern:** Recursive rules preserved.
### 4. Arithmetic and Comparisons
**PDSL:**
```pdsl
high_risk(X) :- temperature(X, T), T > 100
```
**ProbLog:**
```prolog
high_risk(X) :- temperature(X, T), T > 100.
```
**Pattern:** Comparison operators preserved.
### 5. Conditional Probabilities (Multiple Rules)
**PDSL:**
```pdsl
0.9 :: cloudy :- rain
0.5 :: cloudy :- not rain
```
**ProbLog:**
```prolog
0.9::cloudy :- rain.
0.5::cloudy :- \+ rain.
```
**Pattern:** Multiple rules for the same head with mutually exclusive bodies (a conditional probability table).
### 6. Default Probabilities
**PDSL:**
```pdsl
0.7 :: alarm :- burglar
0.6 :: alarm :- earthquake
0.01 :: alarm # Default/baseline rate
```
**ProbLog:**
```prolog
0.7::alarm :- burglar.
0.6::alarm :- earthquake.
0.01::alarm.
```
**Pattern:** Unconditional fact for baseline probability.
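When several clauses share a head and more than one body can hold at once, ProbLog combines them as a noisy-or: the head is true if at least one clause fires, and each clause fires independently. The baseline clause is therefore not an "otherwise" case but one more independent cause. The small calculation below illustrates this for the `alarm` example; the helper name `noisyOr` is only illustrative.

```typescript
// Probability that at least one of several independent causes fires (noisy-or).
function noisyOr(probabilities: number[]): number {
  return 1 - probabilities.reduce((none, p) => none * (1 - p), 1);
}

// Burglar and earthquake both true: all three alarm clauses can fire.
console.log(noisyOr([0.7, 0.6, 0.01]).toFixed(4)); // 0.8812
// Neither cause true: only the baseline clause can fire.
console.log(noisyOr([0.01]).toFixed(4));           // 0.0100
```

This is worth keeping in mind when reading "unexpected" query results: the conditional probability of `alarm` given `burglar` alone is slightly above 0.7, because the baseline clause contributes as well.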
## Optimization Strategies
### 1. Fact Reordering
Reorder facts for better inference performance:
**PDSL (any order):**
```pdsl
0.3 :: a :- b, c
c
b
```
**ProbLog (optimized):**
```prolog
% Base facts first
b.
c.
% Then rules
0.3::a :- b, c.
```
### 2. Query Ordering
Place queries at end:
**PDSL:**
```pdsl
query result
0.5 :: coin
result :- coin
```
**ProbLog (optimized):**
```prolog
% Facts and rules
0.5::coin.
result :- coin.
% Queries
query(result).
```
### 3. Evidence Grouping
Group evidence declarations:
**ProbLog (optimized):**
```prolog
% Facts
0.01::flu.
0.9::fever :- flu.
% Evidence
evidence(fever, true).
evidence(cough, true).
% Queries
query(flu).
```
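The three ordering strategies above amount to a stable sort by statement kind. The sketch below assumes a hypothetical `Statement` record that carries the already-rendered text and a kind tag; it is one possible way to implement the grouping, not necessarily how the reference generator does it.

```typescript
// Hypothetical statement kinds; a real generator would derive these from AST nodes.
type Kind = 'fact' | 'rule' | 'evidence' | 'query';

interface Statement {
  kind: Kind;
  text: string; // rendered ProbLog statement
}

// Output position of each kind: base facts, then rules, then evidence, then queries.
const ORDER: Record<Kind, number> = { fact: 0, rule: 1, evidence: 2, query: 3 };

// Array.prototype.sort is stable (ES2019+), so statements of the same kind keep source order.
function orderStatements(statements: Statement[]): Statement[] {
  return [...statements].sort((a, b) => ORDER[a.kind] - ORDER[b.kind]);
}

const ordered = orderStatements([
  { kind: 'query', text: 'query(result).' },
  { kind: 'fact', text: '0.5::coin.' },
  { kind: 'rule', text: 'result :- coin.' },
]);
console.log(ordered.map((s) => s.text).join('\n'));
// 0.5::coin.
// result :- coin.
// query(result).
```

Keeping source order within each kind is a design choice made for this sketch: it keeps the output predictable and makes it easier to match generated lines back to the PDSL source.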
### 4. Comment Annotations
Add helpful comments:
**PDSL:**
```pdsl
probabilistic_model Medical {
# Prior probability
0.01 :: flu
# Likelihood
0.9 :: fever :- flu
}
```
**ProbLog:**
```prolog
% Model: Medical
% Prior probability
0.01::flu.
% Likelihood
0.9::fever :- flu.
```
## Edge Cases
### 1. Empty Models
**PDSL:**
```pdsl
probabilistic_model Empty {
}
```
**ProbLog:**
```prolog
% Model: Empty
% (empty model)
```
**Handling:** Generate comment placeholder.
### 2. Probability Edge Values
**PDSL:**
```pdsl
0.0 :: impossible
1.0 :: certain
```
**ProbLog:**
```prolog
0.0::impossible.
1.0::certain.
```
**Handling:** Preserve exact values (though ProbLog may optimize them away).
### 3. Annotated Disjunctions with Remainder
**PDSL:**
```pdsl
0.3 :: a; 0.5 :: b # Implicit 0.2 for "neither"
```
**ProbLog:**
```prolog
0.3::a; 0.5::b.
```
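**Handling:** When the probabilities sum to less than 1, ProbLog assigns the remainder (0.2 here) to the case where none of the alternatives holds, so no explicit "neither" alternative is emitted.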
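A generator can still check the sum before emitting an annotated disjunction, since a total above 1 is not a valid distribution. The sketch below uses a hypothetical `Alternative` shape and also reports the implicit remainder; the tolerance value is an arbitrary choice for illustration.

```typescript
// Hypothetical shape for one alternative of an annotated disjunction.
interface Alternative {
  probability: number;
  atom: string;
}

// Render the disjunction and report the implicit "none of the above" mass.
function renderDisjunction(alts: Alternative[]): { text: string; remainder: number } {
  const total = alts.reduce((sum, alt) => sum + alt.probability, 0);
  // Small tolerance so that sums such as 0.1 + 0.2 + 0.7 are not rejected by rounding error.
  if (total > 1 + 1e-9) {
    throw new Error(`Annotated disjunction probabilities sum to ${total}, which exceeds 1`);
  }
  const text = alts.map((alt) => `${alt.probability}::${alt.atom}`).join('; ') + '.';
  return { text, remainder: Math.max(0, 1 - total) };
}

const ad = renderDisjunction([
  { probability: 0.3, atom: 'a' },
  { probability: 0.5, atom: 'b' },
]);
console.log(ad.text); // 0.3::a; 0.5::b.
console.log(ad.remainder.toFixed(2)); // 0.20
```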
### 4. Multiple Queries
**PDSL:**
```pdsl
query disease(flu)
query disease(covid)
query recovered
```
**ProbLog:**
```prolog
query(disease(flu)).
query(disease(covid)).
query(recovered).
```
**Handling:** Each query becomes separate statement.
### 5. Special Characters in Identifiers
**PDSL:**
```pdsl
has_fever(patient_123)
```
**ProbLog:**
```prolog
has_fever(patient_123).
```
**Handling:** Underscores preserved, alphanumeric characters allowed.
### 6. String Arguments
**PDSL:**
```pdsl
file_path("/path/to/data.csv")
```
**ProbLog:**
```prolog
file_path('/path/to/data.csv').
```
**Handling:** Convert double quotes to single quotes (Prolog convention).
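Swapping the quote characters is only safe when the string contains no single quotes or backslashes. A minimal sketch of the conversion with escaping follows; `quoteAtom` is a hypothetical helper that takes the string contents without the surrounding double quotes, and it assumes the target parser accepts standard Prolog backslash escapes inside quoted atoms.

```typescript
// Convert the contents of a PDSL double-quoted string into a single-quoted Prolog atom.
// Backslashes are escaped first so the escapes added for quotes are not doubled.
function quoteAtom(value: string): string {
  const escaped = value.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
  return `'${escaped}'`;
}

console.log(quoteAtom('/path/to/data.csv')); // '/path/to/data.csv'
console.log(quoteAtom("patient's file"));    // 'patient\'s file'
```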
## Bidirectional Translation
### ProbLog → PDSL
Basic conversion rules:
1. **Remove periods:** `rain.` → `rain`
2. **Wrap in model:** Add `probabilistic_model` wrapper
3. **Convert evidence:** `evidence(X, true)` → `observe X`; `evidence(X, false)` → `observe not X`
4. **Convert queries:** `query(X)` → `query X`
5. **Convert negation:** `\+` → `not`
**Example:**
**ProbLog:**
```prolog
0.3::rain.
0.9::cloudy :- rain.
evidence(cloudy, true).
query(rain).
```
**PDSL:**
```pdsl
probabilistic_model Generated {
0.3 :: rain
0.9 :: cloudy :- rain
observe cloudy
query rain
}
```
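The conversion rules can be sketched as a line-by-line rewrite. This is a simplified illustration that assumes one ProbLog statement per line and no `::` or `\+` inside quoted atoms; the function names and the default model name are hypothetical, and a robust converter would parse the ProbLog source instead of using regular expressions.

```typescript
// Convert one ProbLog line to its PDSL form.
function problogLineToPdsl(line: string): string {
  const trimmed = line.trim();
  // Comments: ProbLog `%` becomes PDSL `#`.
  if (trimmed.startsWith('%')) {
    return `#${trimmed.slice(1)}`;
  }
  const stmt = trimmed.replace(/\.$/, ''); // remove the period terminator
  const evidence = /^evidence\((.+),\s*(true|false)\)$/.exec(stmt);
  if (evidence !== null) {
    return evidence[2] === 'true' ? `observe ${evidence[1]}` : `observe not ${evidence[1]}`;
  }
  const query = /^query\((.+)\)$/.exec(stmt);
  if (query !== null) {
    return `query ${query[1]}`;
  }
  return stmt
    .replace(/\\\+\s*/g, 'not ')   // \+ becomes not
    .replace(/\s*::\s*/g, ' :: '); // PDSL spacing around ::
}

// Wrap all converted lines in a probabilistic_model block.
function problogToPdsl(source: string, modelName: string = 'Generated'): string {
  const body = source
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map(problogLineToPdsl);
  return [`probabilistic_model ${modelName} {`, ...body, '}'].join('\n');
}

console.log(problogToPdsl('0.3::rain.\nevidence(cloudy, true).\nquery(rain).'));
// probabilistic_model Generated {
// 0.3 :: rain
// observe cloudy
// query rain
// }
```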
### Handling ProbLog-Specific Features
Some ProbLog features don't have direct PDSL equivalents:
1. **Directives:** `:- use_module(library(lists)).`
- **Translation:** Add as comment in PDSL
2. **Built-in predicates:** `is/2`, `member/2`, etc.
- **Translation:** Preserve as-is (PDSL passthrough)
3. **Continuous distributions:** Currently not in PDSL v0.1.0
- **Translation:** Error or treat as extension
## Translation Examples
### Example 1: Medical Diagnosis
**PDSL:**
```pdsl
probabilistic_model MedicalDiagnosis {
# Priors
0.01 :: flu
0.001 :: covid
# Symptoms
0.9 :: fever :- flu
0.95 :: fever :- covid
0.1 :: fever
# Evidence
observe fever
# Queries
query flu
query covid
}
```
**ProbLog:**
```prolog
% Model: MedicalDiagnosis
% Priors
0.01::flu.
0.001::covid.
% Symptoms
0.9::fever :- flu.
0.95::fever :- covid.
0.1::fever.
% Evidence
evidence(fever, true).
% Queries
query(flu).
query(covid).
```
### Example 2: Bayesian Network
**PDSL:**
```pdsl
probabilistic_model Alarm {
# Root nodes
0.001 :: burglar
0.002 :: earthquake
# Alarm depends on both
0.95 :: alarm :- burglar, earthquake
0.94 :: alarm :- burglar, not earthquake
0.29 :: alarm :- not burglar, earthquake
0.001 :: alarm :- not burglar, not earthquake
# Calls depend on alarm
0.9 :: john_calls :- alarm
0.05 :: john_calls :- not alarm
0.7 :: mary_calls :- alarm
0.01 :: mary_calls :- not alarm
# Evidence
observe john_calls
observe mary_calls
# Query
query burglar
}
```
**ProbLog:**
```prolog
% Model: Alarm
% Root nodes
0.001::burglar.
0.002::earthquake.
% Alarm depends on both
0.95::alarm :- burglar, earthquake.
0.94::alarm :- burglar, \+ earthquake.
0.29::alarm :- \+ burglar, earthquake.
0.001::alarm :- \+ burglar, \+ earthquake.
% Calls depend on alarm
0.9::john_calls :- alarm.
0.05::john_calls :- \+ alarm.
0.7::mary_calls :- alarm.
0.01::mary_calls :- \+ alarm.
% Evidence
evidence(john_calls, true).
evidence(mary_calls, true).
% Query
query(burglar).
```
### Example 3: Social Network
**PDSL:**
```pdsl
probabilistic_model SocialNetwork {
# People
person(alice)
person(bob)
person(carol)
# Friendship probabilities
0.3 :: friends(alice, bob)
0.4 :: friends(bob, carol)
0.2 :: friends(alice, carol)
# Symmetric friendship
friends(X, Y) :- friends(Y, X)
# Transitive connections
0.7 :: connected(X, Y) :- friends(X, Y)
0.5 :: connected(X, Z) :- friends(X, Y), connected(Y, Z)
# Query
query connected(alice, carol)
}
```
**ProbLog:**
```prolog
% Model: SocialNetwork
% People
person(alice).
person(bob).
person(carol).
% Friendship probabilities
0.3::friends(alice, bob).
0.4::friends(bob, carol).
0.2::friends(alice, carol).
% Symmetric friendship
friends(X, Y) :- friends(Y, X).
% Transitive connections
0.7::connected(X, Y) :- friends(X, Y).
0.5::connected(X, Z) :- friends(X, Y), connected(Y, Z).
% Query
query(connected(alice, carol)).
```
## Translation Validation
### Validation Checklist
After translation, verify:
1. ✅ All probabilistic facts have `::` operator
2. ✅ All statements end with period
3. ✅ Negations use `\+` not `not`
4. ✅ Evidence uses `evidence/2` predicate
5. ✅ Queries use `query/1` predicate
6. ✅ Variables start with uppercase
7. ✅ Predicates/constants start with lowercase
8. ✅ Parentheses balanced
9. ✅ Comments preserved
10. ✅ Annotated disjunctions on single line with semicolons
### Validation Script
```typescript
interface ValidationResult {
  valid: boolean;
  errors: string[];
}

function validateProbLog(problog: string): ValidationResult {
  const errors: string[] = [];
  // The generator emits one statement per line, so validate line by line.
  // Blank lines and comment-only lines are skipped.
  const statements = problog
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith('%'));

  for (const stmt of statements) {
    // Check 1: probabilistic facts, rules, and annotated disjunctions end with a period
    if (/^[0-9.]+\s*::/.test(stmt) && !stmt.endsWith('.')) {
      errors.push(`Probabilistic fact missing period: ${stmt}`);
    }
    // Check 2: evidence/2 ends with a true/false value (the first argument may be nested)
    if (stmt.startsWith('evidence(') && !/,\s*(true|false)\s*\)\.?$/.test(stmt)) {
      errors.push(`Invalid evidence format: ${stmt}`);
    }
    // Check 3: queries end with `).`
    if (stmt.startsWith('query(') && !stmt.endsWith(').')) {
      errors.push(`Query missing period: ${stmt}`);
    }
  }
  return { valid: errors.length === 0, errors };
}
```
## Performance Considerations
### Translation Time Complexity
- Lexing: O(n) where n = source length
- Parsing: O(n) for recursive descent
- Code generation: O(m) where m = AST nodes
- Overall: O(n + m) = O(n), since the number of AST nodes m grows at most linearly with the source length n
### Memory Usage
- AST size: ~100 bytes per node
- Symbol table: ~50 bytes per symbol
- Output string: ~1.2x input size (due to formatting)
### Optimization Opportunities
1. **Incremental compilation:** Only recompile changed models
2. **Caching:** Cache parsed ASTs for repeated translations
3. **Streaming:** Process large files in chunks
4. **Parallel processing:** Compile independent models in parallel
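As an illustration of the caching idea, the sketch below wraps any source-to-result function with a cache keyed by the exact source text. The name `memoizeBySource` and the stand-in parse function are hypothetical; a production cache would probably key on a content hash and bound its size.

```typescript
// Wrap a source -> result function with a cache keyed by the exact source text.
function memoizeBySource<T>(compute: (source: string) => T): (source: string) => T {
  const cache = new Map<string, T>();
  return (source: string): T => {
    if (cache.has(source)) {
      return cache.get(source) as T;
    }
    const result = compute(source);
    cache.set(source, result);
    return result;
  };
}

// Stand-in for an expensive parse step: counts how often it really runs.
let parseRuns = 0;
const cachedParse = memoizeBySource((source: string) => {
  parseRuns += 1;
  return source.split('\n').length; // pretend result: number of lines
});

cachedParse('0.3 :: rain');
cachedParse('0.3 :: rain'); // served from the cache
console.log(parseRuns); // 1
```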
## Troubleshooting Common Issues
### Issue 1: Syntax Error in Generated ProbLog
**Symptom:** ProbLog parser fails
**Cause:** Unescaped special characters
**Solution:** Escape quotes and backslashes inside quoted strings
### Issue 2: Incorrect Probability Values
**Symptom:** ProbLog returns unexpected results
**Cause:** Floating point precision
**Solution:** Round probabilities to 6 decimal places
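A minimal sketch of the rounding step is shown below. The helper name `formatProbability` is hypothetical; it rounds to 6 decimal places, drops trailing zeros, and keeps a decimal point so that `0` and `1` are printed as `0.0` and `1.0`.

```typescript
// Format a probability for ProbLog output, rounded to 6 decimal places.
function formatProbability(p: number): string {
  if (!Number.isFinite(p) || p < 0 || p > 1) {
    throw new RangeError(`Probability out of range: ${p}`);
  }
  // toFixed(6) rounds to 6 decimals; Number(...) then drops trailing zeros.
  const text = String(Number(p.toFixed(6)));
  return text.includes('.') ? text : `${text}.0`;
}

console.log(formatProbability(0.1 + 0.2)); // 0.3
console.log(formatProbability(1 / 3));     // 0.333333
console.log(formatProbability(1));         // 1.0
```

Note that any probability below 0.0000005 rounds to `0.0` under this scheme, so models with very small priors may need more decimal places.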
### Issue 3: Variable Binding Errors
**Symptom:** ProbLog complains about unbound variables
**Cause:** Unsafe PDSL rules (should be caught by validator)
**Solution:** Ensure every variable in the head or in a negated literal also appears in a positive body literal
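A validator can detect unsafe rules before any ProbLog is generated. The sketch below applies the usual range-restriction check: every variable in the head or in a negated literal must also occur in a positive body literal. The `Rule` shape and function names are hypothetical, and variables are recognised by a simple uppercase-initial pattern that does not account for quoted atoms.

```typescript
// Hypothetical rule shape with atoms given as rendered text.
interface Rule {
  head: string;
  positive: string[]; // positive body atoms
  negative: string[]; // atoms that appear under `not`
}

// Collect identifiers that start with an uppercase letter (variables).
function variablesOf(atom: string): string[] {
  return atom.match(/\b[A-Z][A-Za-z0-9_]*/g) || [];
}

// Return the variables that are not bound by any positive body literal.
function unsafeVariables(rule: Rule): string[] {
  const bound: string[] = [];
  for (const atom of rule.positive) {
    bound.push(...variablesOf(atom));
  }
  const unsafe: string[] = [];
  for (const atom of [rule.head, ...rule.negative]) {
    for (const v of variablesOf(atom)) {
      if (bound.indexOf(v) === -1 && unsafe.indexOf(v) === -1) {
        unsafe.push(v);
      }
    }
  }
  return unsafe;
}

// flies(X) :- bird(X), not penguin(X)   is safe
console.log(unsafeVariables({ head: 'flies(X)', positive: ['bird(X)'], negative: ['penguin(X)'] })); // []
// likes(X, Y) :- person(X)              leaves Y unbound
console.log(unsafeVariables({ head: 'likes(X, Y)', positive: ['person(X)'], negative: [] })); // [ 'Y' ]
```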
### Issue 4: Negation Not Working
**Symptom:** Negated literals not behaving as expected
**Cause:** Used `not` instead of `\+`
**Solution:** Always translate `not` to `\+`
## Future Extensions
### Planned Translation Features
1. **Optimization passes:** Dead code elimination, constant folding
2. **Pretty printing:** Configurable formatting options
3. **Source maps:** Track PDSL→ProbLog line mappings for debugging
4. **Macro expansion:** Support for PDSL macros
5. **Type-directed optimization:** Use type info for better code
## Reference Implementation
See the code generator implementation in `src/probabilistic/codegen/generator.ts`.
## Testing Translation
```typescript
// tests/probabilistic/translation.test.ts
describe('PDSL to ProbLog Translation', () => {
test('translates probabilistic facts', () => {
const pdsl = '0.7 :: rain';
const problog = translate(pdsl);
expect(problog).toBe('0.7::rain.');
});
test('translates observations to evidence', () => {
const pdsl = 'observe fever';
const problog = translate(pdsl);
expect(problog).toBe('evidence(fever, true).');
});
test('translates negation correctly', () => {
const pdsl = '0.9 :: flies(X) :- bird(X), not penguin(X)';
const problog = translate(pdsl);
expect(problog).toContain('\\+ penguin(X)');
});
test('preserves annotated disjunctions', () => {
const pdsl = '0.3 :: a; 0.7 :: b';
const problog = translate(pdsl);
expect(problog).toBe('0.3::a; 0.7::b.');
});
});
```
---
**This translation guide ensures correct, efficient, and idiomatic ProbLog code generation from PDSL.**