Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
9897777
docs: add MIG SQLModels design plan
hf-kklein Jan 7, 2026
7336e3e
feat: add SQLModel support for Message Implementation Guides (MIG)
hf-kklein Jan 7, 2026
6979c80
docs: add MIG SQLModels documentation to README
hf-kklein Jan 7, 2026
d306310
style: fix formatting and line length issues
hf-kklein Jan 7, 2026
acdb7c4
fix: disable pylint no-member for SQLModel Relationship fields
hf-kklein Jan 7, 2026
1a68e91
chore: add .claude/ to .gitignore
hf-kklein Jan 7, 2026
1403147
feat: add MIG hierarchy materialized view and diff view
hf-kklein Jan 7, 2026
fc0bc0d
Merge origin/main into feat/mig-hierarchy-and-diff-view
hf-kklein Jan 7, 2026
8964585
test: add snapshot test for MIG diff view comparing COMDIS versions
hf-kklein Jan 7, 2026
2d37feb
test: add larger PRICAT diff snapshot for MIG diff view
hf-kklein Jan 7, 2026
0d9d9cf
refactor: use path-based matching for MIG diff view
hf-kklein Jan 7, 2026
b5178e3
test: add IFTSTA snapshot to demonstrate "modified" entries in MIG diff
hf-kklein Jan 7, 2026
32856fa
docs: add MIG hierarchy and diff view documentation to README
hf-kklein Jan 7, 2026
17d7bee
chore: add segmente, hierarchie to spellcheck ignore list
hf-kklein Jan 7, 2026
ba7231f
Merge branch 'main' into feat/mig-hierarchy-and-diff-view
hf-kklein Jan 7, 2026
4006917
fmt: format MIG SQL files with JetBrains SQL formatter
hf-kklein Jan 7, 2026
3f6932c
Update unittests/test_mig_views.py
hf-kklein Jan 9, 2026
2497e3f
Update unittests/test_mig_views.py
hf-kklein Jan 9, 2026
efca64f
Merge branch 'main' into feat/mig-hierarchy-and-diff-view
hf-kklein Mar 2, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 80 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -249,6 +249,86 @@ ORDER BY sort_path;

</details>

#### Befüllen einer Datenbank mit MIG-Informationen
Analog zu den AHBs lassen sich auch MIGs in eine Datenbank überführen und "flach" ziehen.
Da MIGs die vollständige Nachrichtenstruktur beschreiben (Segmentgruppen, Segmente, Datenelementgruppen, Datenelemente und Codes), ist die Hierarchie oft tiefer als bei AHBs.

```python
# pip install fundamend[sqlmodels]
from pathlib import Path
from fundamend.sqlmodels import create_db_and_populate_with_mig_view, MigHierarchyMaterialized
from sqlmodel import Session, create_engine, select

mig_paths = [
Path("UTILTS_MIG_1.1c_Lesefassung_2023_12_12.xml"),
# weitere MIG XML-Dateien hier hinzufügen
]
sqlite_file = create_db_and_populate_with_mig_view(mig_paths)
engine = create_engine(f"sqlite:///{sqlite_file}")
with Session(bind=engine) as session:
stmt = select(MigHierarchyMaterialized).where(MigHierarchyMaterialized.format == "UTILTS").order_by(
MigHierarchyMaterialized.sort_path
)
results = session.exec(stmt).all()
```
oder in plain SQL:
```sql
-- sqlite dialect
SELECT path,
type,
segmentgroup_id,
segment_id,
segment_name,
dataelement_id,
dataelement_name,
code_value,
code_name,
line_status_std,
line_status_specification
FROM mig_hierarchy_materialized
WHERE format = 'UTILTS'
ORDER BY sort_path;
```

<details>
<summary>Finde heraus, welche Zeilen in einem MIG zwischen zwei Versionen hinzukommen, gelöscht oder geändert wurden</summary>
<br>

Dafür gibt es die View `v_mig_diff`, die mit `create_mig_diff_view(session)` erstellt werden kann:
```python
from fundamend.sqlmodels import create_mig_diff_view
create_mig_diff_view(session)
```

Die View erwartet 4 Filter-Parameter beim Abfragen und liefert einen `diff_status`:
- `added`: Zeile existiert in der neuen Version, aber nicht in der alten
- `deleted`: Zeile existiert in der alten Version, aber nicht in der neuen
- `modified`: Zeile existiert in beiden Versionen, aber mit unterschiedlichen Werten (bei `modified` enthält `changed_columns` die Liste der geänderten Spalten)
- `unchanged`: Zeile ist in beiden Versionen identisch

Alle Wert-Spalten existieren doppelt (`old_*` und `new_*`), um die Werte aus beiden Versionen nebeneinander anzuzeigen.

**Matching-Strategie:** Anders als AHBs (die Prüfidentifikatoren als stabile semantische Anker haben) repräsentieren MIGs die vollständige Nachrichtenstruktur ohne solche Anker. Diese View matched Zeilen anhand ihrer lesbaren `path`-Spalte (z.B. "Nachrichten-Kopfsegment > Nachrichten-Kennung > ...") statt struktureller IDs. Das liefert semantisch sinnvollere Vergleiche, hat aber Einschränkungen:
- Wenn ein Element zwischen Versionen umbenannt wird, erscheint es als added+deleted statt modified
- Elemente mit identischen Namen an unterschiedlichen strukturellen Positionen könnten falsch gematcht werden

```sql
-- Alle Änderungen zwischen zwei MIG-Versionen anzeigen
SELECT path, diff_status, changed_columns,
old_line_status_std, new_line_status_std,
old_line_status_specification, new_line_status_specification,
old_line_name, new_line_name
FROM v_mig_diff
WHERE old_format_version = 'FV2504'
AND new_format_version = 'FV2510'
AND old_format = 'IFTSTA'
AND new_format = 'IFTSTA'
AND diff_status != 'unchanged'
ORDER BY sort_path;
```

</details>

### CLI Tool für XML➡️JSON Konvertierung
Mit
```bash
Expand Down
2 changes: 2 additions & 0 deletions domain-specific-terms.txt
Original file line number Diff line number Diff line change
Expand Up @@ -18,3 +18,5 @@ rekursive
finde
contrl
Elemente
segmente
hierarchie
7 changes: 7 additions & 0 deletions src/fundamend/sqlmodels/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,8 @@
MigSegmentGroup,
MigSegmentGroupLink,
)
from .mig_diff_view import MigDiffLine, create_mig_diff_view
from .migview import MigHierarchyMaterialized, create_db_and_populate_with_mig_view, create_mig_view

__all__ = [
"create_ahb_view",
Expand All @@ -62,4 +64,9 @@
"MigSegment",
"MigSegmentGroup",
"MigSegmentGroupLink",
"create_mig_view",
"MigHierarchyMaterialized",
"create_db_and_populate_with_mig_view",
"create_mig_diff_view",
"MigDiffLine",
]
188 changes: 188 additions & 0 deletions src/fundamend/sqlmodels/create_mig_diff_view.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,188 @@
-- Assume that materialize_mig_view.sql has been executed already.
-- This view allows comparing two MIG versions to find added, deleted, and modified rows.
--
-- IMPORTANT: This view produces a cross-product of all version pairs. You MUST filter by version and format.
--
-- Usage for comparing FV2410 -> FV2504 for UTILTS format:
-- SELECT * FROM v_mig_diff
-- WHERE old_format_version = 'FV2410'
-- AND old_format = 'UTILTS'
-- AND new_format_version = 'FV2504'
-- AND new_format = 'UTILTS'
-- ORDER BY sort_path;
--
-- diff_status can be: 'added', 'deleted', 'modified', 'unchanged'
-- The view compares line_status_std, line_status_specification, and line_name to determine modifications.
--
-- For deleted rows, old_ columns are populated and new_ columns are NULL.
-- For added rows, new_ columns are populated and old_ columns are NULL.
--
-- MATCHING STRATEGY:
-- Unlike AHBs (which have Prüfidentifikatoren as stable semantic anchors), MIGs represent the complete
-- message structure without such anchors. This view matches rows by their human-readable 'path' column
-- (e.g., "Nachrichten-Kopfsegment > Nachrichten-Kennung > ...") rather than structural id_path.
-- This provides more semantically meaningful comparisons but has limitations:
-- - If an element is renamed between versions, it appears as added+deleted rather than modified
-- - Elements with identical names at different structural positions may be incorrectly matched
-- For structural comparisons, use id_path directly from mig_hierarchy_materialized.

DROP TABLE IF EXISTS v_mig_diff;
DROP VIEW IF EXISTS v_mig_diff;

CREATE VIEW v_mig_diff AS
WITH version_pairs AS (SELECT DISTINCT old_v.edifact_format_version AS old_format_version,
old_v.format AS old_format,
new_v.edifact_format_version AS new_format_version,
new_v.format AS new_format
FROM (SELECT DISTINCT edifact_format_version, format FROM mig_hierarchy_materialized) old_v
JOIN (SELECT DISTINCT edifact_format_version, format
FROM mig_hierarchy_materialized) new_v
ON old_v.format = new_v.format
WHERE old_v.edifact_format_version < new_v.edifact_format_version),

-- Pre-compute changed_columns once, derive diff_status from it
modified_check AS (SELECT TRIM(
CASE
WHEN old_tbl.line_status_std IS NOT new_tbl.line_status_std
THEN 'line_status_std, '
ELSE '' END ||
CASE
WHEN old_tbl.line_status_specification IS NOT new_tbl.line_status_specification
THEN 'line_status_specification, '
ELSE '' END ||
CASE
WHEN old_tbl.line_name IS NOT new_tbl.line_name
THEN 'line_name'
ELSE '' END
, ', ') AS changed_columns,
new_tbl.id_path AS id_path,
new_tbl.sort_path AS sort_path,
new_tbl.path AS path,
new_tbl.type AS line_type,
old_tbl.edifact_format_version AS old_format_version,
old_tbl.format AS old_format,
old_tbl.segmentgroup_id AS old_segmentgroup_id,
old_tbl.segment_id AS old_segment_id,
old_tbl.dataelement_id AS old_dataelement_id,
old_tbl.code_value AS old_code_value,
old_tbl.line_status_std AS old_line_status_std,
old_tbl.line_status_specification AS old_line_status_specification,
old_tbl.line_name AS old_line_name,
new_tbl.edifact_format_version AS new_format_version,
new_tbl.format AS new_format,
new_tbl.segmentgroup_id AS new_segmentgroup_id,
new_tbl.segment_id AS new_segment_id,
new_tbl.dataelement_id AS new_dataelement_id,
new_tbl.code_value AS new_code_value,
new_tbl.line_status_std AS new_line_status_std,
new_tbl.line_status_specification AS new_line_status_specification,
new_tbl.line_name AS new_line_name
FROM version_pairs vp
JOIN mig_hierarchy_materialized new_tbl
ON new_tbl.edifact_format_version = vp.new_format_version
AND new_tbl.format = vp.new_format
JOIN mig_hierarchy_materialized old_tbl
ON old_tbl.edifact_format_version = vp.old_format_version
AND old_tbl.format = vp.old_format
AND old_tbl.path = new_tbl.path)

-- Modified and unchanged rows
SELECT CASE WHEN changed_columns != '' THEN 'modified' ELSE 'unchanged' END AS diff_status,
NULLIF(changed_columns, '') AS changed_columns,
id_path,
sort_path,
path,
line_type,
old_format_version,
old_format,
old_segmentgroup_id,
old_segment_id,
old_dataelement_id,
old_code_value,
old_line_status_std,
old_line_status_specification,
old_line_name,
new_format_version,
new_format,
new_segmentgroup_id,
new_segment_id,
new_dataelement_id,
new_code_value,
new_line_status_std,
new_line_status_specification,
new_line_name
FROM modified_check

UNION ALL

-- Added rows (exist in new but not in old for the specific version pair)
SELECT 'added' AS diff_status,
NULL AS changed_columns,
new_tbl.id_path,
new_tbl.sort_path,
new_tbl.path,
new_tbl.type AS line_type,
vp.old_format_version AS old_format_version,
vp.old_format AS old_format,
NULL AS old_segmentgroup_id,
NULL AS old_segment_id,
NULL AS old_dataelement_id,
NULL AS old_code_value,
NULL AS old_line_status_std,
NULL AS old_line_status_specification,
NULL AS old_line_name,
new_tbl.edifact_format_version AS new_format_version,
new_tbl.format AS new_format,
new_tbl.segmentgroup_id AS new_segmentgroup_id,
new_tbl.segment_id AS new_segment_id,
new_tbl.dataelement_id AS new_dataelement_id,
new_tbl.code_value AS new_code_value,
new_tbl.line_status_std AS new_line_status_std,
new_tbl.line_status_specification AS new_line_status_specification,
new_tbl.line_name AS new_line_name
FROM version_pairs vp
JOIN mig_hierarchy_materialized new_tbl
ON new_tbl.edifact_format_version = vp.new_format_version
AND new_tbl.format = vp.new_format
WHERE NOT EXISTS (SELECT 1
FROM mig_hierarchy_materialized old_tbl
WHERE old_tbl.edifact_format_version = vp.old_format_version
AND old_tbl.format = vp.old_format
AND old_tbl.path = new_tbl.path)

UNION ALL

-- Deleted rows (exist in old but not in new for the specific version pair)
SELECT 'deleted' AS diff_status,
NULL AS changed_columns,
old_tbl.id_path,
old_tbl.sort_path,
old_tbl.path,
old_tbl.type AS line_type,
old_tbl.edifact_format_version AS old_format_version,
old_tbl.format AS old_format,
old_tbl.segmentgroup_id AS old_segmentgroup_id,
old_tbl.segment_id AS old_segment_id,
old_tbl.dataelement_id AS old_dataelement_id,
old_tbl.code_value AS old_code_value,
old_tbl.line_status_std AS old_line_status_std,
old_tbl.line_status_specification AS old_line_status_specification,
old_tbl.line_name AS old_line_name,
vp.new_format_version AS new_format_version,
vp.new_format AS new_format,
NULL AS new_segmentgroup_id,
NULL AS new_segment_id,
NULL AS new_dataelement_id,
NULL AS new_code_value,
NULL AS new_line_status_std,
NULL AS new_line_status_specification,
NULL AS new_line_name
FROM version_pairs vp
JOIN mig_hierarchy_materialized old_tbl
ON old_tbl.edifact_format_version = vp.old_format_version
AND old_tbl.format = vp.old_format
WHERE NOT EXISTS (SELECT 1
FROM mig_hierarchy_materialized new_tbl
WHERE new_tbl.edifact_format_version = vp.new_format_version
AND new_tbl.format = vp.new_format
AND new_tbl.path = old_tbl.path);
Loading
Loading