Commit a56dcf4

Merge pull request #221 from eccenca/feature/databricks-and-jdbc-ECC-8000
add JDBC stub / databricks integration / JDBC tag ...
2 parents ee2e24b + 2f3a34b commit a56dcf4

8 files changed

Lines changed: 136 additions & 52 deletions

File tree

data/integrations.yml

Lines changed: 39 additions & 37 deletions
Original file line number | Diff line number | Diff line change
@@ -256,66 +256,68 @@ integrations:
256256
# Databases
257257
#####
258258

259+
- name: Hive
260+
icon: ":simple-apachehive:"
261+
content: |
262+
Read from or write to an embedded Apache {{p.Hive}} endpoint.
263+
264+
- name: pgvector
265+
icon: ":black_large_square:"
266+
content: |
267+
Store vector embeddings into [pgvector](https://github.com/pgvector/pgvector)
268+
using the {{p.cmem_plugin_pgvector_Search}}.
269+
259270
- name: Neo4J
260271
icon: ":simple-neo4j:"
261272
content: |
262273
Use the {{p.neo4j}} dataset for reading and writing [Neo4j graphs](https://neo4j.com/).
263274
264-
- name: PostgreSQL
265-
icon: ":simple-postgresql:"
275+
- name: Snowflake
276+
icon: ":simple-snowflake:"
266277
content: |
267-
PostgreSQL can be accessed with the {{p.Jdbc}} dataset and a
268-
[JDBC driver](https://central.sonatype.com/artifact/org.postgresql/postgresql/versions).
278+
Snowflake can be accessed with the {{p.SnowflakeJdbc}} dataset (JDBC driver included).
269279
270-
- name: MariaDB
271-
icon: ":simple-mariadb:"
280+
- name: Microsoft SQL
281+
icon: ":material-microsoft:"
272282
content: |
273-
MariaDB can be accessed with the {{p.Jdbc}} dataset and a
274-
[JDBC driver](https://central.sonatype.com/artifact/org.mariadb.jdbc/mariadb-java-client/overview).
283+
The Microsoft SQL Server can be accessed with the {{p.Jdbc}} dataset (JDBC driver included).
275284
276-
- name: SQLite
277-
icon: ":simple-sqlite:"
285+
- name: PostgreSQL
286+
icon: ":simple-postgresql:"
278287
content: |
279-
SQLite can be accessed with the {{p.Jdbc}} dataset and a
280-
[JDBC driver](https://central.sonatype.com/artifact/org.xerial/sqlite-jdbc).
288+
PostgreSQL can be accessed with the {{p.Jdbc}} dataset (JDBC driver included).
281289
282290
- name: MySQL
283291
icon: ":simple-mysql:"
284292
content: |
285-
MySQL can be accessed with the {{p.Jdbc}} dataset and a
286-
[JDBC driver](https://central.sonatype.com/artifact/org.mariadb.jdbc/mariadb-java-client/overview).
287-
288-
- name: Hive
289-
icon: ":simple-apachehive:"
290-
content: |
291-
Read from or write to an embedded Apache {{p.Hive}} endpoint.
293+
MySQL can be accessed with the {{p.Jdbc}} dataset (JDBC driver included).
292294
293-
- name: Microsoft SQL
294-
icon: ":material-microsoft:"
295-
content: |
296-
The Microsoft SQL Server can be accessed with the {{p.Jdbc}} dataset and a
297-
[JDBC driver](https://central.sonatype.com/artifact/com.microsoft.sqlserver/mssql-jdbc).
298-
299-
- name: Snowflake
300-
icon: ":simple-snowflake:"
295+
- name: MariaDB
296+
icon: ":simple-mariadb:"
301297
content: |
302-
Snowflake can be accessed with the {{p.SnowflakeJdbc}} dataset and a
303-
[JDBC driver](https://central.sonatype.com/artifact/net.snowflake/snowflake-jdbc).
298+
MariaDB can be accessed with the {{p.Jdbc}} dataset (JDBC driver included).
304299
305-
- name: pgvector
306-
icon: ":black_large_square:"
300+
- name: SQLite
301+
icon: ":simple-sqlite:"
307302
content: |
308-
Store vector embeddings into [pgvector](https://github.com/pgvector/pgvector)
309-
using the {{p.cmem_plugin_pgvector_Search}}.
303+
SQLite can be accessed with the {{p.Jdbc}} dataset and a
304+
[Custom JDBC driver](https://central.sonatype.com/artifact/org.xerial/sqlite-jdbc).
305+
Please have a look at
306+
[Setup and use of JDBC Drivers](../../deploy-and-configure/configuration/dataintegration/jdbc/).
310307
311308
- name: Trino
312309
icon: ":simple-trino:"
313310
content: |
314-
[Trino](https://github.com/trinodb/trino) can be access with the
315-
{{p.Jdbc}} dataset and a [JDBC driver](https://trino.io/docs/current/client/jdbc.html).
311+
[Trino](https://github.com/trinodb/trino) can be accessed with the {{p.Jdbc}} dataset and a
312+
[Custom JDBC driver](https://trino.io/docs/current/client/jdbc.html).
313+
Please have a look at
314+
[Setup and use of JDBC Drivers](../../deploy-and-configure/configuration/dataintegration/jdbc/).
316315
317316
- name: Databricks
318317
icon: ":simple-databricks:"
319318
content: |
320-
[Databricks](https://www.databricks.com) can be access with the
321-
{{p.Jdbc}} dataset and a [JDBC driver](https://www.databricks.com/spark/jdbc-drivers-download).
319+
[Databricks](http://databricks.com/) can be accessed with the {{p.Jdbc}} dataset and a
320+
[Custom JDBC driver](https://github.com/databricks/databricks-jdbc).
321+
Please have a look at
322+
[Setup and use of JDBC Drivers](../../deploy-and-configure/configuration/dataintegration/jdbc/).
323+

docs/build/integrations/index.md

Lines changed: 16 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -51,8 +51,10 @@ to interact with any [Azure AI Foundry provided Large Language Models](https://a
5151

5252
---
5353

54-
[Databricks](https://www.databricks.com) can be access with the
55-
[Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a [JDBC driver](https://www.databricks.com/spark/jdbc-drivers-download).
54+
[Databricks](http://databricks.com/) can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a
55+
[Custom JDBC driver](https://github.com/databricks/databricks-jdbc).
56+
Please have a look at
57+
[Setup and use of JDBC Drivers](../../deploy-and-configure/configuration/dataintegration/jdbc/).
5658

5759

5860
- :material-email-outline:{ .lg .middle } eMail / SMTP
@@ -142,8 +144,7 @@ GraphDB can be used as the integrated Quad Store as well.
142144

143145
---
144146

145-
MariaDB can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a
146-
[JDBC driver](https://central.sonatype.com/artifact/org.mariadb.jdbc/mariadb-java-client/overview).
147+
MariaDB can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset (JDBC driver included).
147148

148149

149150
- :simple-mattermost:{ .lg .middle } Mattermost
@@ -158,16 +159,14 @@ the [Send Mattermost messages](../../build/reference/customtask/cmem_plugin_matt
158159

159160
---
160161

161-
The Microsoft SQL Server can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a
162-
[JDBC driver](https://central.sonatype.com/artifact/com.microsoft.sqlserver/mssql-jdbc).
162+
The Microsoft SQL Server can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset (JDBC driver included).
163163

164164

165165
- :simple-mysql:{ .lg .middle } MySQL
166166

167167
---
168168

169-
MySQL can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a
170-
[JDBC driver](https://central.sonatype.com/artifact/org.mariadb.jdbc/mariadb-java-client/overview).
169+
MySQL can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset (JDBC driver included).
171170

172171

173172
- :simple-neo4j:{ .lg .middle } Neo4J
@@ -254,8 +253,7 @@ using the [Search Vector Embeddings](../../build/reference/customtask/cmem_plugi
254253

255254
---
256255

257-
PostgreSQL can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a
258-
[JDBC driver](https://central.sonatype.com/artifact/org.postgresql/postgresql/versions).
256+
PostgreSQL can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset (JDBC driver included).
259257

260258

261259
- :other-powerbi:{ .lg .middle } PowerBI
@@ -315,8 +313,7 @@ execute a [SOQL query (Salesforce)](../../build/reference/customtask/cmem_plugin
315313

316314
---
317315

318-
Snowflake can be accessed with the [Snowflake SQL endpoint](../../build/reference/dataset/SnowflakeJdbc.md) dataset and a
319-
[JDBC driver](https://central.sonatype.com/artifact/net.snowflake/snowflake-jdbc).
316+
Snowflake can be accessed with the [Snowflake SQL endpoint](../../build/reference/dataset/SnowflakeJdbc.md) dataset (JDBC driver included).
320317

321318

322319
- :simple-apachespark:{ .lg .middle } Spark
@@ -331,7 +328,9 @@ execute a [SOQL query (Salesforce)](../../build/reference/customtask/cmem_plugin
331328
---
332329

333330
SQLite can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a
334-
[JDBC driver](https://central.sonatype.com/artifact/org.xerial/sqlite-jdbc).
331+
[Custom JDBC driver](https://central.sonatype.com/artifact/org.xerial/sqlite-jdbc).
332+
Please have a look at
333+
[Setup and use of JDBC Drivers](../../deploy-and-configure/configuration/dataintegration/jdbc/).
335334

336335

337336
- :material-ssh:{ .lg .middle } SSH
@@ -357,8 +356,10 @@ Tentris can be used as the integrated Quad Store as well (beta).
357356

358357
---
359358

360-
[Trino](https://github.com/trinodb/trino) can be access with the
361-
[Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a [JDBC driver](https://trino.io/docs/current/client/jdbc.html).
359+
[Trino](https://github.com/trinodb/trino) can be accessed with the [Remote SQL endpoint](../../build/reference/dataset/Jdbc.md) dataset and a
360+
[Custom JDBC driver](https://trino.io/docs/current/client/jdbc.html).
361+
Please have a look at
362+
[Setup and use of JDBC Drivers](../../deploy-and-configure/configuration/dataintegration/jdbc/).
362363

363364

364365
- :black_large_square:{ .lg .middle } Virtuoso

docs/build/loading-jdbc-datasets-incrementally/index.md

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -2,6 +2,7 @@
22
icon: material/database-sync
33
tags:
44
- ExpertTutorial
5+
- JDBC
56
---
67
# Loading JDBC datasets incrementally
78

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,4 +1,5 @@
11
nav:
22
- Build (DataIntegration): index.md
3+
- Adding JDBC drivers: jdbc
34
- Activity Reference: activity-reference
45

Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
---
2+
tags:
3+
- Configuration
4+
- JDBC
5+
---
6+
# Setup and use of JDBC Drivers
7+
8+
Corporate Memory supports JDBC connections to database management systems (DBMSs).
9+
The platform includes several JDBC drivers by default.
10+
You can also add and use custom drivers.
11+
12+
!!! info "References"
13+
14+
For more technical details, see the following reference pages:
15+
16+
- [Remote SQL endpoint](../../../../build/reference/dataset/Jdbc.md) and
17+
- [Snowflake SQL endpoint](../../../../build/reference/dataset/SnowflakeJdbc.md).
18+
19+
## Bundled JDBC Drivers
20+
21+
The platform includes the following JDBC drivers:
22+
23+
- PostgreSQL (`postgresql v42.7.10`)
24+
- MariaDB (includes support for MySQL, `mariadb-java-client v3.5.7`)
25+
- Microsoft SQL Server (`mssql-jdbc v13.2.1.jre11`)
26+
- Snowflake (`snowflake-jdbc v3.28.0`)
27+
28+
## Custom JDBC Drivers
29+
30+
In addition to the bundled JDBC drivers, you can register custom JDBC drivers.
31+
The following sections describe the required configuration.
32+
33+
### Download Custom JDBC Driver
34+
35+
Download the JDBC driver for each database management system that you want to connect to.
36+
[Integrations](../../../../build/integrations/index.md) provides links for well-known systems and lists those that are actively used with Corporate Memory.
37+
38+
### Provide a Custom JDBC Driver
39+
40+
Consult your solutions manager or DevOps specialist for options to copy or inject the JDBC driver `jar` into a Corporate Memory deployment.
41+
Depending on the deployment model, suitable options include:
42+
43+
- The Docker Compose package `cmem-orchestration` mounts the folder `./conf/dataintegration/plugin/` into the DataIntegration container.
44+
The configuration snippets below assume this location, which maps to `/opt/cmem/eccenca-DataIntegration/dist/etc/dataintegration/conf/plugin/` inside the container.
45+
- A dedicated _Build project_ in which the driver JAR files are uploaded as project file resources.
46+
- Dedicated file or resource mounts in a Docker Compose or Helm/Kubernetes configuration.
47+
48+
## Driver Registration
49+
50+
A custom JDBC driver must be registered in the DataIntegration configuration file `dataintegration.conf`, in the `spark.sql.options` section.
51+
The following example shows how to register a custom JDBC driver for Databricks:
52+
53+
```conf
54+
55+
spark.sql.options {
56+
57+
# driver name
58+
jdbc.drivers = "databricks"
59+
# path to the jar in the docker container
60+
jdbc.databricks.jar = "/opt/cmem/eccenca-DataIntegration/dist/etc/dataintegration/conf/plugin/DatabricksJDBC.jar"
61+
# class name
62+
jdbc.databricks.name = "com.databricks.client.jdbc.Driver"
63+
64+
}
65+
66+
```
67+
68+
## Use the Driver
69+
70+
JDBC drivers are used through the **Remote SQL endpoint** or **Snowflake SQL endpoint** dataset type.
71+
72+
![](jdbc-dataset.png){ class="bordered" width="85%" }
73+
74+
Configure them in the dataset configuration dialog.
75+
For details about the JDBC connection string, consult your DBMS or JDBC driver documentation.
76+
77+
![](jdbc-config-databricks.png){ class="bordered" width="100%" }
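
Review note: the registration pattern documented in the Driver Registration section should carry over to the other custom drivers linked from the integrations page. Below is a minimal sketch for Trino, assuming the same `jdbc.<name>.jar` / `jdbc.<name>.name` key pattern shown for Databricks. The driver name `trino` and the file name `trino-jdbc.jar` are hypothetical placeholders; `io.trino.jdbc.TrinoDriver` is the driver class named in the Trino JDBC documentation.

```conf
spark.sql.options {

  # driver name (hypothetical; any identifier matching the keys below)
  jdbc.drivers = "trino"
  # path to the jar in the docker container (file name is a placeholder)
  jdbc.trino.jar = "/opt/cmem/eccenca-DataIntegration/dist/etc/dataintegration/conf/plugin/trino-jdbc.jar"
  # class name, as documented by Trino
  jdbc.trino.name = "io.trino.jdbc.TrinoDriver"

}
```

With a registration like this in place, the connection string entered in the dataset dialog would follow the form documented by Trino, `jdbc:trino://host:port/catalog/schema`. Whether several drivers can be registered side by side in `jdbc.drivers` is not covered by this commit, so the sketch registers only one.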
Two binary image files added (391 KB and 79.4 KB); previews not rendered.

mkdocs.yml

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -61,6 +61,7 @@ extra:
6161
Volume: fontawesome-solid-hard-drive
6262
"Load Balancer": simple-awselasticloadbalancing
6363
Variables: variables
64+
JDBC: jdbc
6465
# https://squidfunk.github.io/mkdocs-material/setup/setting-up-versioning/
6566
version:
6667
provider: mike
@@ -186,6 +187,7 @@ theme:
186187
simple-awselasticloadbalancing: simple/awselasticloadbalancing
187188
fontawesome-solid-hard-drive: fontawesome/solid/hard-drive
188189
variables: material/variable-box
190+
jdbc: material/database
189191
# https://squidfunk.github.io/mkdocs-material/reference/annotations/
190192
admonition:
191193
note: fontawesome/solid/note-sticky
