Skip to content

After COPY DATABASE FROM <pg>, the types from pg are improperly formatted in EXPORT commands #429

@NickCrews

Description

@NickCrews

What happens?

I have some custom enum and composite types in postgres. When I copy that entire database over to a duckdb instance, some things get garbled. First, let me show what I'm doind.

Repro using the @duckdb/node-api and @electric-sql/pglite-socket nodejs packages. mkdir repro && cd repro && pnpm init && pnpm add @duckdb/node-api==1.5.1-r.1 @electric-sql/pglite==0.4.3 @electric-sql/pglite-socket==0.1.3 and then node repro.js for the following js:

import { rm, mkdir } from 'node:fs/promises';
import duckdb from '@duckdb/node-api';
import { PGlite } from '@electric-sql/pglite';
import { PGLiteSocketServer } from '@electric-sql/pglite-socket';
import { join } from 'node:path';


async function create(repro_dir) {
    await rm(repro_dir, { recursive: true, force: true });
    await mkdir(repro_dir, { recursive: true });
    const path = join(repro_dir, 'dst.duckdb');

    const pgdb = await PGlite.create();
    pgdb.exec(`
            CREATE TYPE my_status_enum AS ENUM ('active', 'inactive');
            CREATE TYPE my_address AS (
                street TEXT,
                city TEXT,
                zip TEXT
            );
            CREATE TABLE t (id INTEGER, status my_status_enum, address my_address);
            INSERT INTO t VALUES (1, 'active', ROW('123 Main St', 'Anytown', '12345')), (2, 'inactive', ROW('456 Elm St', 'Othertown', '67890')), (3, NULL, NULL);
        `
    );
    const server = new PGLiteSocketServer({db: pgdb, host: 'localhost', port: 5432});
    await server.start();

    try {
        const dst = await duckdb.DuckDBInstance.create(path);
        const dstConn = await dst.connect();

        console.log('Attaching PGlite Postgres catalog and running COPY FROM DATABASE...');
        await dstConn.run('INSTALL postgres;');
        await dstConn.run('LOAD postgres;');
        await dstConn.run(
            'ATTACH \'host=localhost port=5432 dbname=postgres user=postgres sslmode=disable\' AS pg (TYPE postgres);'
        );

        const catalogReader = await dstConn.runAndReadAll('SELECT current_database();');
        const targetCatalog = String(catalogReader.getRows()[0][0]).replace(/"/g, '""');
        await dstConn.run(`COPY FROM DATABASE pg TO "${targetCatalog}";`);
        await dstConn.run(`EXPORT DATABASE '${join(repro_dir, 'postcreate')}' (FORMAT CSV);`);
        const results = await dstConn.runAndReadAll('SELECT * FROM duckdb_types WHERE type_name ILIKE \'my_%\';');
        console.log('Data in duckdb_types table after COPY FROM DATABASE:');
        console.table(results.getRows());
    } finally {
        await pgdb.close();
        await server.stop();
    }
};

async function reconnect(repro_dir){
    const path = join(repro_dir, 'dst.duckdb');
    console.log('Attempting to reopen destination database...');
    try {
        const instance = await duckdb.DuckDBInstance.create(path);
        const con = await instance.connect();
        await con.run(`EXPORT DATABASE '${join(repro_dir, 'after_reopen')}' (FORMAT CSV);`);
        const results = await con.runAndReadAll('SELECT * FROM duckdb_types WHERE type_name ILIKE \'my_%\';');
        console.log('Data in duckdb_types table after reopening database:');
        console.table(results.getRows());
        console.log('OK reopen succeeded');
    } catch (error) {
        console.log(`FAIL reopen failed: ${error instanceof Error ? error.message : String(error)}`);
    }
}

async function main() {
    const cmd = process.argv[2];
    const dir = process.argv[3] || 'repro';

    if (cmd === 'create') {
        await create(dir);
    } else if (cmd === 'reopen') {
        await reconnect(dir);
    } else {
        console.error(`Unknown command: ${cmd}`);
        process.exit(1);
    }
}

main();

After this runs, if I look at the repro/postcreate/schema.sql file, which was created from the EXPORT command, then I get this:

CREATE SCHEMA public;
CREATE TYPE public.my_address AS my_address
CREATE TYPE public.my_status_enum AS ENUM ( 'active', 'inactive' );
CREATE TABLE public.t(id INTEGER, status ENUM('active', 'inactive'), address STRUCT(street VARCHAR, city VARCHAR, zip VARCHAR));

Note how:

  • CREATE TYPE public.my_address AS my_address isn't legal. If you actually tried to IMPORT from this directory you would get errors.
  • CREATE TYPE public.my_status_enum AS ENUM ( 'active', 'inactive' ); is legal and worked.
  • The CREATE TABLE statement decided to redefine the column types with anonymous types, instead of using the stored types.

What I would expect:

CREATE SCHEMA public;
CREATE TYPE public.my_address AS STRUCT(street VARCHAR, city VARCHAR, zip VARCHAR)
CREATE TYPE public.my_status_enum AS ENUM ( 'active', 'inactive' );
CREATE TABLE public.t(id INTEGER, status public.my_status_enum, address public.my_address);

Or, if that's not possible, then it would be less ideal, but workable if this just used the anonymous types and didn't generate illegal CREATE TYPE statements:

CREATE SCHEMA public;
CREATE TABLE public.t(id INTEGER, status ENUM('active', 'inactive'), address STRUCT(street VARCHAR, city VARCHAR, zip VARCHAR));

I have not tested on main for any of the packages.

To Reproduce

see above.

OS:

macOS

PostgreSQL Version:

17

DuckDB Version:

1.5.1

DuckDB Client:

nodejs

Full Name:

Nick Crews

Affiliation:

Ship Creek Group

Have you tried this on the latest main branch?

  • I have NOT done this.

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions