Load and Replicate


Use data-load-and-replication mode to perform a one-time bulk load of source data and start continuous replication in a single command.

Tip:

You can also load and replicate separately using data-load and replicate-only.

Note:

This feature is in preview and subject to change. To share feedback and/or issues, contact Support.

Before you begin

Prerequisites

Oracle Instant Client

Install Oracle Instant Client on the machine that will run molt and replicator:

  • On macOS ARM machines, download the Oracle Instant Client. After installation, you should have a new directory at /Users/$USER/Downloads/instantclient_23_3 containing .dylib files. Set the LD_LIBRARY_PATH environment variable to this directory:

    export LD_LIBRARY_PATH=/Users/$USER/Downloads/instantclient_23_3
    
  • On Linux machines, install the Oracle Instant Client dependencies and set the LD_LIBRARY_PATH to the client library path:

    sudo apt-get install -yqq --no-install-recommends libaio1t64
    sudo ln -s /usr/lib/x86_64-linux-gnu/libaio.so.1t64 /usr/lib/x86_64-linux-gnu/libaio.so.1
    curl -o /tmp/ora-libs.zip https://replicator.cockroachdb.com/third_party/instantclient-basiclite-linux-amd64.zip
    unzip -d /tmp /tmp/ora-libs.zip
    sudo mv /tmp/instantclient_21_13/* /usr/lib
    export LD_LIBRARY_PATH=/usr/lib
    

Enable ARCHIVELOG

Enable ARCHIVELOG mode on the Oracle database. This is required for Oracle LogMiner, Oracle's built-in changefeed tool that captures DML events for replication.

SELECT log_mode FROM v$database;
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
SELECT log_mode FROM v$database;
LOG_MODE
--------
ARCHIVELOG

1 row selected.

Enable supplemental primary key logging for logical replication:

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database;
SUPPLEMENTAL_LOG_DATA_MIN SUPPLEMENTAL_LOG_DATA_PK
------------------------- ------------------------
IMPLICIT                  YES

1 row selected.

Enable FORCE_LOGGING to ensure that all data changes are captured for the tables to migrate:

ALTER DATABASE FORCE LOGGING;

Limitations

  • Running DDL on the source or target while replication is in progress can cause replication failures.
  • Migrations must be performed from a single Oracle schema. You must include --schema-filter so that MOLT Fetch only loads data from the specified schema. Refer to Schema and table filtering.

    • Specifying --table-filter is also strongly recommended to ensure that only necessary tables are migrated from the Oracle schema. A userscript is required to use --table-filter with an Oracle source.
  • Replication will not work for table or column names exceeding 30 characters. This is a limitation of Oracle LogMiner.

  • Oracle LogMiner does not support the following data types:

    • User-defined types (UDTs)
    • Nested tables
    • VARRAY
    • LONGBLOB/CLOB columns (over 4000 characters)
  • If your Oracle workload executes UPDATE statements that modify only LOB columns, these UPDATE statements are not supported by Oracle LogMiner and will not be replicated.

  • If you are using Oracle 11 and execute UPDATE statements on XMLTYPE or LOB columns, those changes are not supported by Oracle LogMiner and will be excluded from ongoing replication.

  • If you are migrating LOB columns from Oracle 12c, use AWS DMS Binary Reader instead of LogMiner. Oracle LogMiner does not support LOB replication in 12c.

  • Oracle advises against LONG RAW columns and recommends converting them to BLOB. LONG RAW can only store binary values up to 2GB, and only one LONG RAW column per table is supported.
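
    For example, a minimal sketch of converting a LONG RAW column to BLOB on the Oracle source (the table and column names are hypothetical; test the conversion on a copy of the table first):

    ALTER TABLE migration_schema.docs MODIFY (payload BLOB);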

Prepare the source database

Create migration user on source database

Create a dedicated migration user (e.g., MIGRATION_USER) on the source database. This user is responsible for reading data from source tables during the migration.

For PostgreSQL sources, create the user and grant it privileges to connect, view schema objects, and select from the tables you migrate:

CREATE USER MIGRATION_USER WITH PASSWORD 'password';
GRANT CONNECT ON DATABASE source_database TO MIGRATION_USER;
GRANT USAGE ON SCHEMA migration_schema TO MIGRATION_USER;
GRANT SELECT ON ALL TABLES IN SCHEMA migration_schema TO MIGRATION_USER;
ALTER DEFAULT PRIVILEGES IN SCHEMA migration_schema GRANT SELECT ON TABLES TO MIGRATION_USER;

For MySQL sources, create the user and grant it privileges to select only the tables you migrate:

CREATE USER MIGRATION_USER@'%' IDENTIFIED BY 'password';
GRANT SELECT ON source_database.* TO MIGRATION_USER@'%';
FLUSH PRIVILEGES;

For Oracle sources, create the user:

CREATE USER MIGRATION_USER IDENTIFIED BY '{password}';

Note:

When migrating from Oracle Multitenant (PDB/CDB), this should be a common user. Prefix the username with C## (e.g., C##MIGRATION_USER).

Grant the migration user privileges to connect, read metadata, and SELECT and FLASHBACK the tables you plan to migrate. The tables should all reside in a single schema (e.g., migration_schema). For details, refer to Schema and table filtering.

Oracle Multitenant (PDB/CDB) user privileges

Connect to the Oracle CDB as a DBA and grant the following:

-- Basic access
GRANT CONNECT TO C##MIGRATION_USER;
GRANT CREATE SESSION TO C##MIGRATION_USER;

-- General metadata access
GRANT EXECUTE_CATALOG_ROLE TO C##MIGRATION_USER;
GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER;

-- Access to necessary V$ views
GRANT SELECT ON V_$DATABASE TO C##MIGRATION_USER;

-- Direct grants to specific DBA views
GRANT SELECT ON ALL_USERS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_USERS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_OBJECTS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_SYNONYMS TO C##MIGRATION_USER;
GRANT SELECT ON DBA_TABLES TO C##MIGRATION_USER;

Connect to the Oracle PDB as a DBA and grant the following:

-- Allow C##MIGRATION_USER to connect to the PDB and see active transaction metadata
GRANT CONNECT TO C##MIGRATION_USER;
GRANT CREATE SESSION TO C##MIGRATION_USER;

-- General metadata access
GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER;

-- Access to necessary V$ views
GRANT SELECT ON V_$SESSION TO C##MIGRATION_USER;
GRANT SELECT ON V_$TRANSACTION TO C##MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT ON migration_schema.tbl TO C##MIGRATION_USER;
GRANT FLASHBACK ON migration_schema.tbl TO C##MIGRATION_USER;

Single-tenant Oracle user privileges

Connect to the Oracle database as a DBA and grant the following:

-- Basic access
GRANT CONNECT TO MIGRATION_USER;
GRANT CREATE SESSION TO MIGRATION_USER;

-- General metadata access
GRANT SELECT_CATALOG_ROLE TO MIGRATION_USER;
GRANT EXECUTE_CATALOG_ROLE TO MIGRATION_USER;

-- Access to necessary V$ views
GRANT SELECT ON V_$DATABASE TO MIGRATION_USER;
GRANT SELECT ON V_$SESSION TO MIGRATION_USER;
GRANT SELECT ON V_$TRANSACTION TO MIGRATION_USER;

-- Direct grants to specific DBA views
GRANT SELECT ON ALL_USERS TO MIGRATION_USER;
GRANT SELECT ON DBA_USERS TO MIGRATION_USER;
GRANT SELECT ON DBA_OBJECTS TO MIGRATION_USER;
GRANT SELECT ON DBA_SYNONYMS TO MIGRATION_USER;
GRANT SELECT ON DBA_TABLES TO MIGRATION_USER;

-- Grant these two for every table to migrate in the migration_schema
GRANT SELECT ON migration_schema.tbl TO MIGRATION_USER;
GRANT FLASHBACK ON migration_schema.tbl TO MIGRATION_USER;

Configure source database for replication

For PostgreSQL sources, enable logical replication by setting wal_level to logical in postgresql.conf or in the SQL shell. For example:

ALTER SYSTEM SET wal_level = 'logical';

For MySQL 8.0 and later sources, enable global transaction identifiers (GTID) consistency. Set the following values in mysql.cnf, in the SQL shell, or as flags in the mysql start command:

  • --enforce-gtid-consistency=ON
  • --gtid-mode=ON
  • --binlog-row-metadata=full
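
For example, a minimal sketch of passing these settings as startup flags (adapt this to however you manage your MySQL configuration; setting them in the config file or SQL shell works as well):

mysqld --enforce-gtid-consistency=ON --gtid-mode=ON --binlog-row-metadata=full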

For MySQL 5.7 sources, set the following values. Note that binlog-row-image is used instead of binlog-row-metadata. Set server-id to a unique integer that differs from any other MySQL server you have in your cluster (e.g., 3).

  • --enforce-gtid-consistency=ON
  • --gtid-mode=ON
  • --binlog-row-image=full
  • --server-id={ID}
  • --log-bin=log-bin

Create source sentinel table

Create a checkpoint table called _replicator_sentinel in the Oracle schema you will migrate:

CREATE TABLE migration_schema."_replicator_sentinel" (
  keycol NUMBER PRIMARY KEY,
  lastSCN NUMBER
);

Grant permissions to modify the checkpoint table. In Oracle Multitenant, grant the permissions on the PDB:

GRANT SELECT, INSERT, UPDATE ON migration_schema."_replicator_sentinel" TO C##MIGRATION_USER;

Grant LogMiner privileges

Grant LogMiner privileges. In Oracle Multitenant, grant the permissions on the CDB:

-- Access to necessary V$ views
GRANT SELECT ON V_$LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGFILE TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO C##MIGRATION_USER;
GRANT SELECT ON V_$ARCHIVED_LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOG_HISTORY TO C##MIGRATION_USER;

-- SYS-prefixed views (for full dictionary access)
GRANT SELECT ON SYS.V_$LOGMNR_DICTIONARY TO C##MIGRATION_USER;
GRANT SELECT ON SYS.V_$LOGMNR_LOGS TO C##MIGRATION_USER;
GRANT SELECT ON SYS.V_$LOGMNR_PARAMETERS TO C##MIGRATION_USER;
GRANT SELECT ON SYS.V_$LOGMNR_SESSION TO C##MIGRATION_USER;

-- Access to LogMiner views and controls
GRANT LOGMINING TO C##MIGRATION_USER;
GRANT EXECUTE ON DBMS_LOGMNR TO C##MIGRATION_USER;

The user is responsible for querying redo logs from LogMiner, querying active transaction information to obtain the starting point for ongoing replication, and updating the internal _replicator_sentinel table previously created on the Oracle source schema by the DBA.

Verify LogMiner privileges

Query the locations of redo files in the Oracle database:

SELECT
    l.GROUP#,
    lf.MEMBER,
    l.FIRST_CHANGE# AS START_SCN,
    l.NEXT_CHANGE# AS END_SCN
FROM
    V$LOG l
JOIN
    V$LOGFILE lf
ON
    l.GROUP# = lf.GROUP#;
   GROUP# MEMBER                                       START_SCN                END_SCN 
_________ _________________________________________ ____________ ______________________ 
        3 /opt/oracle/oradata/ORCLCDB/redo03.log         1232896    9295429630892703743 
        2 /opt/oracle/oradata/ORCLCDB/redo02.log         1155042                1232896 
        1 /opt/oracle/oradata/ORCLCDB/redo01.log         1141934                1155042 

3 rows selected.

Get the current snapshot System Change Number (SCN):

SELECT CURRENT_SCN FROM V$DATABASE;
CURRENT_SCN
-----------
2358840

1 row selected.

Load the redo logs into LogMiner, replacing {current-scn} with the SCN you queried:

EXEC DBMS_LOGMNR.START_LOGMNR(
  STARTSCN => {current-scn},
  ENDSCN   => {current-scn},
  OPTIONS  => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG
);
PL/SQL procedure successfully completed.

Tip:

If you receive ORA-01435: user does not exist, the Oracle user lacks sufficient LogMiner privileges. Refer to Grant LogMiner privileges.

Prepare the target database

Create the target schema

Convert the source schema into a CockroachDB-compatible schema. CockroachDB supports the PostgreSQL wire protocol and is largely compatible with PostgreSQL syntax.

  • The source and target schemas must match. Review Type mapping to understand which source types can be mapped to CockroachDB types.

    For example, a source table defined as CREATE TABLE migration_schema.tbl (pk INT PRIMARY KEY); must have a corresponding schema and table in CockroachDB:

    CREATE SCHEMA migration_schema;
    CREATE TABLE migration_schema.tbl (pk INT PRIMARY KEY);
    
    • By default, table and column names are case-insensitive in MOLT Fetch. If using the --case-sensitive flag, schema, table, and column names must match Oracle's default uppercase identifiers. Use quoted names on the target to preserve case. For example, the following CockroachDB SQL statement will error:

      CREATE TABLE co.stores (... store_id ...);
      

      It should be written as:

      CREATE TABLE "CO"."STORES" (... "STORE_ID" ...);
      

      When using --case-sensitive, quote all identifiers and match the case exactly (for example, use "CO"."STORES" and "STORE_ID").

  • Every table must have an explicit primary key. For more information, refer to Primary key best practices.

  • Review Transformations to understand how computed columns and partitioned tables can be mapped to the target, and how target tables can be renamed.

  • By default on CockroachDB, INT is an alias for INT8, which creates 64-bit signed integers. PostgreSQL and MySQL default to 32-bit integers. Depending on your source database or application requirements, you may need to change the default integer size to 4 bytes. For more information, refer to Considerations for 64-bit signed integers.
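
    For example, a minimal sketch of changing the default integer size for the current session (default_int_size is a CockroachDB session variable; you can also configure it as a default for all sessions):

    SET default_int_size = 4;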

Schema Conversion Tool

The MOLT Schema Conversion Tool (SCT) automates target schema creation. It requires a free CockroachDB Cloud account.

  1. Upload a source .sql file to convert the syntax and identify unimplemented features and syntax incompatibilities in the schema.

  2. Import the converted schema to a CockroachDB cluster.

Syntax that cannot be automatically converted is displayed in the Summary Report. Such items may include the following:

String case sensitivity

Strings are case-insensitive in MySQL and case-sensitive in CockroachDB. You may need to edit your MySQL data to get the results you expect from CockroachDB. For example, you may have been doing string comparisons in MySQL that will need to be changed to work with CockroachDB.

For more information about the case sensitivity of strings in MySQL, refer to Case Sensitivity in String Searches from the MySQL documentation. For more information about CockroachDB strings, refer to STRING.

Identifier case sensitivity

Identifiers are case-sensitive in MySQL and case-insensitive in CockroachDB. When using the Schema Conversion Tool, you can either keep case sensitivity by enclosing identifiers in double quotes, or make identifiers case-insensitive by converting them to lowercase.

AUTO_INCREMENT attribute

The MySQL AUTO_INCREMENT attribute, which creates sequential column values, is not supported in CockroachDB. When using the Schema Conversion Tool, columns with AUTO_INCREMENT can be converted to use sequences, UUID values with gen_random_uuid(), or unique INT8 values using unique_rowid(). Cockroach Labs does not recommend using a sequence to define a primary key column. For more information, refer to Unique ID best practices.
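
For example, a minimal sketch of a converted table that replaces AUTO_INCREMENT with a UUID primary key generated by gen_random_uuid() (the table and column names are hypothetical):

CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name STRING
);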

Note:

Changing a column type during schema conversion will cause MOLT Verify to identify a type mismatch during data validation. This is expected behavior.

ENUM type

MySQL ENUM types are defined in table columns. On CockroachDB, ENUM is a standalone type. When using the Schema Conversion Tool, you can either deduplicate the ENUM definitions or create a separate type for each column.
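
For example, a minimal sketch of a standalone ENUM type referenced by a table column (the names are hypothetical):

CREATE TYPE account_status AS ENUM ('active', 'suspended', 'closed');
CREATE TABLE accounts (
  id INT8 PRIMARY KEY,
  status account_status
);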

TINYINT type

TINYINT data types are not supported in CockroachDB. The Schema Conversion Tool automatically converts TINYINT columns to INT2 (SMALLINT).

Geospatial types

MySQL geometry types are not converted to CockroachDB geospatial types by the Schema Conversion Tool. They should be manually converted to the corresponding types in CockroachDB.

FIELD function

The MySQL FIELD function is not supported in CockroachDB. Instead, you can use the array_position function, which returns the index of the first occurrence of an element in an array.

Example usage:

SELECT array_position(ARRAY[4,1,3,2],1);
  array_position
------------------
               2
(1 row)

While MySQL returns 0 when the element is not found, CockroachDB returns NULL. As a result, if you use array_position in an ORDER BY clause, rows are still sorted even when the element is not found, because the NULL result participates in the sort. As a workaround, you can use the COALESCE operator.

SELECT * FROM table_a ORDER BY COALESCE(array_position(ARRAY[4,1,3,2],5),999);

Drop constraints and indexes

To optimize data load performance, drop all non-PRIMARY KEY constraints and indexes on the target CockroachDB database before migrating:

Warning:

Do not drop PRIMARY KEY constraints.

You can recreate the constraints and indexes after loading the data.
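
For example, a minimal sketch of dropping a secondary index and a foreign key constraint on the target (the names are hypothetical; list existing objects first with SHOW INDEXES and SHOW CONSTRAINTS):

DROP INDEX migration_schema.employees@employees_dept_idx;
ALTER TABLE migration_schema.employees DROP CONSTRAINT employees_dept_fk;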

Create the SQL user

Create a migration user in the CockroachDB cluster that has the necessary privileges.

To create a user crdb_user in the default database (you will pass this username in the target connection string):

CREATE USER crdb_user WITH PASSWORD '{password}';

Grant the necessary permissions to run either IMPORT INTO or COPY FROM on the target tables, depending on the MOLT Fetch data load mode you will use.

IMPORT INTO privileges

Grant SELECT, INSERT, and DROP (required because the table is taken offline during the IMPORT INTO) privileges on all tables in the target schema:

GRANT SELECT, INSERT, DROP ON ALL TABLES IN SCHEMA migration_schema TO crdb_user;

If you plan to use cloud storage with implicit authentication for data load, grant the EXTERNALIOIMPLICITACCESS system-level privilege:

GRANT EXTERNALIOIMPLICITACCESS TO crdb_user;

COPY FROM privileges

Make the user a member of the admin role:

GRANT admin TO crdb_user;

Replication privileges

Grant permissions to create the staging schema for replication:

GRANT admin TO crdb_user;

Configure data load

When you run molt fetch, you can configure the following options for data load:

Connection strings

Define the connection strings for the source and target databases, and keep them secure.

Source connection string

The --source flag specifies the connection string for the source database.

PostgreSQL:

--source 'postgres://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'

For example:

--source 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full'

MySQL:

--source 'mysql://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full&sslcert={path_to_client_crt}&sslkey={path_to_client_key}&sslrootcert={path_to_ca_crt}'

For example:

--source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt'

Oracle:

--source 'oracle://{username}:{password}@{host}:{port}/{service_name}'

In Oracle Multitenant, --source specifies the connection string for the PDB. --source-cdb specifies the connection string for the CDB. The username in both --source and --source-cdb is the common user that owns the tables you will migrate.

--source 'oracle://{username}:{password}@{host}:{port}/{PDB_service_name}'
--source-cdb 'oracle://{username}:{password}@{host}:{port}/{CDB_service_name}'

Escape the C## prefix in the Oracle Multitenant username. For example, write C##MIGRATION_USER as C%23%23MIGRATION_USER:

--source 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLPDB1'
--source-cdb 'oracle://C%23%23MIGRATION_USER:password@host:1521/ORCLCDB'

Target connection string

The --target flag specifies the connection string for the target CockroachDB database:

--target 'postgres://{username}:{password}@{host}:{port}/{database}?sslmode=verify-full'

For example:

--target 'postgres://crdb_user@localhost:26257/defaultdb?sslmode=verify-full'

For details, refer to Connect using a URL.

Secure connections

To keep your database credentials out of shell history and logs, follow these best practices when specifying your source and target connection strings:

  • Avoid plaintext connection strings.

  • URL-encode connection strings for the source database and CockroachDB so special characters in passwords are handled correctly.

    • Given a password a$52&, pass it to the molt escape-password command with single quotes:

      molt escape-password 'a$52&'
      

      Use the encoded password in your --source connection string. For example:

      --source 'postgres://postgres:a%2452%26@localhost:5432/replicationload'
      
  • Provide your connection strings as environment variables. For example:

    export SOURCE="postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full"
    export TARGET="postgres://root@localhost:26257/molt?sslmode=verify-full"
    

    Afterward, reference the environment variables as follows:

    --source $SOURCE
    --target $TARGET
    
  • If possible, use an external secrets manager to load the environment variables from stored secrets.

Intermediate file storage

MOLT Fetch can write intermediate files to either a cloud storage bucket or a local file server:

Cloud storage (--bucket-path): Specify an s3://bucket/path, gs://bucket/path, or azure-blob://bucket/path URL.
  • AWS S3: Set AWS_REGION, AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID as environment variables or use --use-implicit-auth, --assume-role, and/or --import-region. Refer to Cloud storage authentication.
  • GCS: Authenticate with Application Default Credentials or use --use-implicit-auth. Refer to Cloud storage authentication.
  • Azure Blob Storage: Set AZURE_ACCOUNT_NAME and AZURE_ACCOUNT_KEY as environment variables or use --use-implicit-auth. Refer to Cloud storage authentication.

Local file server (--local-path, --local-path-listen-addr, --local-path-crdb-access-addr): Write files to --local-path on a local file server at --local-path-listen-addr. If the target CockroachDB cluster cannot reach this address, specify a publicly accessible address with --local-path-crdb-access-addr. No additional authentication is required.
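
For example, a minimal sketch of the local file server flags (the path and addresses are placeholders):

--local-path /migration/data/cockroach
--local-path-listen-addr 'localhost:3000'
--local-path-crdb-access-addr '10.0.0.15:3000'
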
Tip:

Cloud storage is often preferable to a local file server, which can require considerable disk space.

Cloud storage authentication

Ensure that access control is properly configured for Amazon S3, Google Cloud Storage, or Azure Blob Storage.

Amazon S3
  • Set the following environment variables in the terminal running molt fetch:

    export AWS_REGION='us-east-1'
    export AWS_SECRET_ACCESS_KEY='key'
    export AWS_ACCESS_KEY_ID='id'
    
    • To run molt fetch in a containerized environment (e.g., Docker), pass the required environment variables using -e. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using -v. For example:

      docker run \
        -e AWS_ACCESS_KEY_ID='your-access-key' \
        -e AWS_SECRET_ACCESS_KEY='your-secret-key' \
        -v ~/.aws:/root/.aws \
        -it \
        cockroachdb/molt fetch \
        --bucket-path 's3://migration/data/cockroach' ...
      
  • Alternatively, set --use-implicit-auth to use implicit authentication. When using assume role authentication, specify the IAM role to assume with --assume-role. For example:

    --bucket-path 's3://migration/data/cockroach'
    --assume-role 'arn:aws:iam::123456789012:role/MyMigrationRole'
    --use-implicit-auth
    
  • Set --import-region to specify an AWS_REGION (e.g., --import-region 'ap-south-1').

  • Ensure the S3 bucket is created and accessible by authorized roles and users only.

Google Cloud Storage
  • Authenticate your local environment with Application Default Credentials:

    Using gcloud:

    gcloud init
    gcloud auth application-default login
    

    Using the environment variable:

    export GOOGLE_APPLICATION_CREDENTIALS={path_to_cred_json}
    
    • To run molt fetch in a containerized environment (e.g., Docker), pass the required environment variables using -e. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using -v. For example:

      docker run \
        -e GOOGLE_APPLICATION_CREDENTIALS='/root/.config/gcloud/application_default_credentials.json' \
        -v ~/.config/gcloud:/root/.config/gcloud \
        -it \
        cockroachdb/molt fetch \
        --bucket-path 'gs://migration/data/cockroach' ...
      
  • Alternatively, set --use-implicit-auth to use implicit authentication. When using assume role authentication, specify the service account with --assume-role. For example:

    --bucket-path 'gs://migration/data/cockroach'
    --use-implicit-auth
    --assume-role 'user-test@cluster-ephemeral.iam.gserviceaccount.com'
    
  • Ensure the Google Cloud Storage bucket is created and accessible by authorized roles and users only.

Azure Blob Storage
  • Set the following environment variables in the terminal running molt fetch:

    export AZURE_ACCOUNT_NAME='account'
    export AZURE_ACCOUNT_KEY='key'
    

    You can also specify client and tenant credentials as environment variables:

    export AZURE_CLIENT_SECRET='secret'
    export AZURE_TENANT_ID='id'
    
    • To run molt fetch in a containerized environment (e.g., Docker), pass the required environment variables using -e. If your authentication method relies on local credential files, you may also need to volume map the host path to the appropriate location inside the container using -v. For example:

      docker run \
        -e AZURE_ACCOUNT_NAME='account' \
        -e AZURE_ACCOUNT_KEY='key' \
        -e AZURE_CLIENT_SECRET='secret' \
        -e AZURE_TENANT_ID='id' \
        -v ~/.azure:/root/.azure \
        -it \
        cockroachdb/molt fetch \
        --bucket-path 'azure-blob://migration/data/cockroach' ...
      
  • Alternatively, set --use-implicit-auth to use implicit authentication. For example:

    --bucket-path 'azure-blob://migration/data/cockroach'
    --use-implicit-auth
    

    This mode supports Azure managed identities and workload identities.

  • Ensure the Azure Blob Storage container is created and accessible by authorized roles and users only.

Table handling mode

MOLT Fetch can initialize target tables on the CockroachDB database in one of three modes using --table-handling:

none (default; no flag needed)
  • Loads data into existing tables without altering schema or data.
  • Exits early if schemas mismatch in some cases.

truncate-if-exists (--table-handling truncate-if-exists)
  • Truncates target tables before loading data.
  • Exits early if schemas mismatch in some cases.

drop-on-target-and-recreate (--table-handling drop-on-target-and-recreate)
  • Drops and recreates target tables before loading data.
  • Automatically creates missing tables with PRIMARY KEY and NOT NULL constraints.

To choose a mode:

  • Use none when you need to retain existing data and schema.
  • Use --table-handling truncate-if-exists to clear existing data while preserving schema definitions.
  • Use --table-handling drop-on-target-and-recreate for initial imports or when source and target schemas differ, letting MOLT Fetch generate compatible tables automatically.

Note:

When using the drop-on-target-and-recreate option, only PRIMARY KEY and NOT NULL constraints are preserved on the target tables. Other constraints, such as FOREIGN KEY references, UNIQUE, or DEFAULT value expressions, are not retained.

To guide schema creation with drop-on-target-and-recreate, you can explicitly map source types to CockroachDB types. Refer to Type mapping.

Schema and table filtering

MOLT Fetch can restrict which schemas (or users) and tables are migrated by using the --schema-filter, --table-filter, and --table-exclusion-filter flags:

  • --schema-filter: POSIX regex matching schema names to include; all matching schemas and their tables are moved.
  • --table-filter: POSIX regex matching table names to include across all selected schemas.
  • --table-exclusion-filter: POSIX regex matching table names to exclude across all selected schemas.

Tip:

Use --schema-filter to migrate only the specified schemas, and refine which tables are moved using --table-filter or --table-exclusion-filter.

When migrating from Oracle, you must include --schema-filter to name an Oracle schema to migrate. This prevents Fetch from attempting to load tables owned by other users. For example:

--schema-filter 'migration_schema'
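
A combined sketch (the values are examples only; as described below, --table-filter with an Oracle source also requires a userscript):

--schema-filter 'migration_schema'
--table-filter 'employees|payments|orders'
--table-exclusion-filter 'audit_.*'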

Table filter userscript

When migrating only a subset of tables using --table-filter, you must supply a userscript that filters change events to those tables. For example, save the following as table_filter.ts:

import * as api from "replicator@v1";

// List the source tables (matching source names) to include in replication
const allowedTables = ["EMPLOYEES", "PAYMENTS", "ORDERS"];

// Update this to your target CockroachDB database and schema name
api.configureSource("defaultdb.migration_schema", {
  dispatch: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Replicate only if the table matches one of the allowed tables
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    // Ignore all other tables
    return null;
  },
  deletesTo: (doc: Document, meta: Document): Record<Table, Document[]> | null => {
    // Optionally filter deletes the same way
    if (allowedTables.includes(meta.table)) {
      let ret: Record<Table, Document[]> = {};
      ret[meta.table] = [doc];
      return ret;
    }
    return null;
  },
});

Pass the userscript to MOLT Fetch with the --userscript replication flag:

--replicator-flags "--userscript table_filter.ts"

Data load mode

MOLT Fetch can use either IMPORT INTO or COPY FROM to load data into CockroachDB:

IMPORT INTO (default; no flag needed)
  • Fastest option, but takes target tables offline during load.
  • Supports compression using the --compression flag to reduce storage used during export.
  • Executes as a distributed background job in CockroachDB, so is more efficient for large, wide, or heavily partitioned tables.

COPY FROM (--use-copy or --direct-copy)
  • Slower than IMPORT INTO, but keeps target tables online and queryable during load.
  • Does not support compression.
  • Runs on the MOLT host and streams data row-by-row, which can increase memory usage and limit concurrency for large tables (many rows) or wide tables (many columns or large values like JSONB).

To choose a mode:

  • Use IMPORT INTO (the default mode) for large datasets, wide rows, or partitioned tables.
  • Use --use-copy when tables must remain online during data load.
  • Use --direct-copy only when you cannot move data to a public cloud, or want to perform local testing without intermediate storage. In this case, no intermediate file storage is used.

Metrics

By default, MOLT Fetch exports Prometheus metrics at http://127.0.0.1:3030/metrics. You can override the address with --metrics-listen-addr '{host}:{port}', where the endpoint will be http://{host}:{port}/metrics.

Cockroach Labs recommends monitoring the following metrics during data load:

  • molt_fetch_num_tables: Number of tables that will be moved from the source.
  • molt_fetch_num_task_errors: Number of errors encountered by the fetch task.
  • molt_fetch_overall_duration: Duration (in seconds) of the fetch task.
  • molt_fetch_rows_exported: Number of rows that have been exported from a table. For example: molt_fetch_rows_exported{table="public.users"}
  • molt_fetch_rows_imported: Number of rows that have been imported from a table. For example: molt_fetch_rows_imported{table="public.users"}
  • molt_fetch_table_export_duration_ms: Duration (in milliseconds) of a table's export. For example: molt_fetch_table_export_duration_ms{table="public.users"}
  • molt_fetch_table_import_duration_ms: Duration (in milliseconds) of a table's import. For example: molt_fetch_table_import_duration_ms{table="public.users"}

You can also use the sample Grafana dashboard to view the preceding metrics.
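
For a quick check that the endpoint is serving data, you can scrape it directly (the address shown is the default; adjust it if you set --metrics-listen-addr):

curl -s http://127.0.0.1:3030/metrics | grep molt_fetch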

Note:

Metrics from the replicator process are enabled by setting the --metricsAddr replication flag, and are served at http://{host}:{port}/_/varz.

To view Oracle-specific metrics from replicator, import this Grafana dashboard.

Replication flags

In the molt fetch command, use --replicator-flags to pass options to the included replicator process that handles continuous replication. For details on all available flags, refer to the MOLT Fetch documentation.

  • --metricsAddr: Enable Prometheus metrics at a specified {host}:{port}. Metrics are served at http://{host}:{port}/_/varz.

Note:

When using --table-filter, you must also include --userscript. Refer to Table filter userscript.

Load data into CockroachDB

Start the initial load of data into the target database. Continuous replication of changes will start once the data load is complete.

  1. Issue the MOLT Fetch command to move the source data to CockroachDB, specifying --mode data-load-and-replication to perform an initial load followed by continuous replication. In this example, the --metricsAddr :30005 replication flag enables a Prometheus endpoint at http://localhost:30005/_/varz where replication metrics will be served. You can use these metrics to verify that replication has drained in a later step.

    For PostgreSQL sources, specify a replication slot name with --pglogical-replication-slot-name. This is required for replication after data load.

    molt fetch \
    --source $SOURCE \ 
    --target $TARGET \
    --schema-filter 'migration_schema' \
    --table-filter 'employees|payments|orders' \
    --bucket-path 's3://migration/data/cockroach' \
    --table-handling truncate-if-exists \
    --pglogical-replication-slot-name cdc_slot \
    --replicator-flags '--metricsAddr :30005' \
    --mode data-load-and-replication
    
    molt fetch \
    --source $SOURCE \ 
    --target $TARGET \
    --schema-filter 'migration_schema' \
    --table-filter 'employees|payments|orders' \
    --bucket-path 's3://migration/data/cockroach' \
    --table-handling truncate-if-exists \
    --replicator-flags '--metricsAddr :30005 --userscript table_filter.ts' \
    --mode data-load-and-replication
    

    For Oracle sources, the following command assumes an Oracle Multitenant (CDB/PDB) source. --source-cdb specifies the container database (CDB) connection string.

    molt fetch \
    --source $SOURCE \
    --source-cdb $SOURCE_CDB \
    --target $TARGET \
    --schema-filter 'migration_schema' \
    --table-filter 'employees|payments|orders' \
    --bucket-path 's3://migration/data/cockroach' \
    --table-handling truncate-if-exists \
    --replicator-flags '--metricsAddr :30005 --userscript table_filter.ts' \
    --mode data-load-and-replication
    
  2. Check the output to observe fetch progress.

    A starting fetch message indicates that the task has started:

    {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
    
    {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-28","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
    
    {"level":"info","type":"summary","num_tables":3,"cdc_cursor":"2358840","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
    

    data extraction messages are written for each table that is exported to the location in --bucket-path:

    {"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:11-05:00","message":"data extraction phase starting"}
    
    {"level":"info","table":"migration_schema.employees","type":"summary","num_rows":200000,"export_duration_ms":1000,"export_duration":"000h 00m 01s","time":"2025-02-10T14:28:12-05:00","message":"data extraction from source complete"}
    

    data import messages are written for each table that is loaded into CockroachDB:

    {"level":"info","table":"migration_schema.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"}
    
    {"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
    
    {"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
    
    {"level":"info","table":"migration_schema.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"2358840","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
    

    A fetch complete message is written when the fetch task succeeds:

    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"0/3F41E40","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
    
    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
    

    This message includes a cdc_cursor value. For MySQL sources, you must set --defaultGTIDSet to this value when starting replication-only mode:

    --defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29
    
    {"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":3,"tables":["migration_schema.employees","migration_schema.payments","migration_schema.payments"],"cdc_cursor":"2358840","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
    

Replicate changes to CockroachDB

  1. Continuous replication begins immediately after the fetch complete message.

  2. Check the output to observe replicator progress.

    A starting replicator message indicates that the task has started:

    {"level":"info","time":"2025-02-10T14:28:13-05:00","message":"starting replicator"}
    

    The staging database name message contains the name of the staging schema. The schema name contains a replication marker for streaming changes, which is used to resume replication or perform failback to the source database.

    {"level":"info","time":"2025-02-10T14:28:13-05:00","message":"staging database name: _replicator_1739215693817700000"}
    

    upserted rows log messages indicate that changes were replicated to CockroachDB:

    DEBUG  [Jan 22 13:52:40] upserted rows                                 conflicts=0 duration=7.620208ms proposed=1 target="\"molt\".\"migration_schema\".\"employees\"" upserted=1
    
    Tip:

    If replication is interrupted, you can resume replication.

Stop replication and verify data

Use MOLT Verify to confirm that the source and target data is consistent. This ensures that the data load was successful.

  1. Stop application traffic to your source database. This begins downtime.

  2. Wait for replication to drain, which means that all transactions that occurred on the source database have been fully processed and replicated to CockroachDB. There are two ways to determine that replication has fully drained:

    • When replication is caught up, you will not see new upserted rows logs.
    • If you set up the replication metrics endpoint with --metricsAddr in the preceding steps, metrics are available at:

      http://{host}:{port}/_/varz
      

      Use the following Prometheus alert expression to observe when the combined rate of upserts and deletes is 0 for each schema:

      sum by (schema) (rate(apply_upserts_total[$__rate_interval]) + rate(apply_deletes_total[$__rate_interval]))
      
  3. Cancel replication to CockroachDB by entering ctrl-c to issue a SIGINT signal. This returns an exit code 0.

  4. Issue the MOLT Verify command, specifying the source and target connection strings and the tables to validate.

    molt verify \
    --source $SOURCE \ 
    --target $TARGET \
    --table-filter 'employees|payments|orders'
    
    Note:

    With Oracle Multitenant deployments, --source-cdb is not necessary for verify.

  5. Check the output to observe verify progress.

    A verification in progress message indicates that the task has started:

    {"level":"info","time":"2025-02-10T15:35:04-05:00","message":"verification in progress"}
    

    starting verify messages are written for each specified table:

    {"level":"info","time":"2025-02-10T15:35:04-05:00","message":"starting verify on public.employees, shard 1/1"}
    

    A finished row verification message containing a summary is written after each table is compared. For details on the summary fields, refer to the MOLT Verify page.

    {"level":"info","type":"summary","table_schema":"public","table_name":"employees","num_truth_rows":200004,"num_success":200004,"num_conditional_success":0,"num_missing":0,"num_mismatch":0,"num_extraneous":0,"num_live_retry":0,"num_column_mismatch":0,"time":"2025-02-10T15:35:05-05:00","message":"finished row verification on public.employees (shard 1/1)"}
    

    A verification complete message is written when the verify task succeeds:

    {"level":"info","net_duration_ms":699.804875,"net_duration":"000h 00m 00s","time":"2025-02-10T15:35:05-05:00","message":"verification complete"}
    

Modify the CockroachDB schema

Add any constraints or indexes that you previously removed from the CockroachDB schema to facilitate data load.

Note:

If you used the --table-handling drop-on-target-and-recreate option for data load, only PRIMARY KEY and NOT NULL constraints are preserved. You must manually recreate all other constraints and indexes.

For the appropriate SQL syntax, refer to ALTER TABLE ... ADD CONSTRAINT and CREATE INDEX. Review the best practices for creating secondary indexes on CockroachDB.
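
For example, a minimal sketch of recreating a foreign key and a secondary index (the table, column, and constraint names are hypothetical):

ALTER TABLE migration_schema.employees ADD CONSTRAINT employees_dept_fk FOREIGN KEY (dept_id) REFERENCES migration_schema.departments (id);
CREATE INDEX employees_dept_idx ON migration_schema.employees (dept_id);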

Cutover

Perform a cutover by resuming application traffic, now to CockroachDB.

Troubleshooting

Fetch exits early due to mismatches

molt fetch exits early in the following cases, and will output a log with a corresponding mismatch_tag and failable_mismatch set to true:

  • A source table is missing a primary key.
  • A source and target table primary key have mismatching types.
  • A STRING primary key has a different collation on the source and target.
  • A source and target column have mismatching types that are not allowable mappings.
  • A target table is missing a column that is in the corresponding source table.
  • A source column is nullable, but the corresponding target column is not nullable (i.e., the constraint is more strict on the target).

molt fetch can continue in the following cases, and will output a log with a corresponding mismatch_tag and failable_mismatch set to false:

  • A target table has a column that is not in the corresponding source table.
  • A source column has a NOT NULL constraint, and the corresponding target column is nullable (i.e., the constraint is less strict on the target).
  • A DEFAULT, CHECK, FOREIGN KEY, or UNIQUE constraint is specified on a target column and not on the source column.

ORA-01950: no privileges on tablespace

If you receive ORA-01950: no privileges on tablespace 'USERS', the Oracle migration user does not have sufficient quota on the tablespace used to store its values (by default, this is USERS, but can vary). Grant a quota to the schema. For example:

-- change UNLIMITED to a suitable limit for the table owner
ALTER USER migration_schema QUOTA UNLIMITED ON USERS;

No tables to drop and recreate on target

If you expect a bulk load but the output reports no tables to drop and recreate on the target, ensure that the migration user has SELECT and FLASHBACK privileges on each table to be migrated. For example:

GRANT SELECT, FLASHBACK ON migration_schema.employees TO C##MIGRATION_USER;
GRANT SELECT, FLASHBACK ON migration_schema.payments TO C##MIGRATION_USER;
GRANT SELECT, FLASHBACK ON migration_schema.orders TO C##MIGRATION_USER;

Table or view does not exist

If the Oracle migration user lacks privileges on certain tables, you may receive errors stating that the table or view does not exist. Either use --table-filter to limit the tables to be migrated, or grant the migration user SELECT privileges on all objects in the schema. Refer to Create migration user on source database.

Missing redo logs or unavailable SCN

If the Oracle redo log files are too small or do not retain enough history, you may get errors indicating that required log files are missing for a given SCN range, or that a specific SCN is unavailable.

Increase the number and size of online redo log files, and verify that archived log files are being generated and retained correctly in your Oracle environment.
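
For example, a DBA might add a larger online redo log group. This is a sketch only; the group number, file path, and size are placeholders:

ALTER DATABASE ADD LOGFILE GROUP 4 ('/opt/oracle/oradata/ORCLCDB/redo04.log') SIZE 1G;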

Missing replicator flags

If required --replicator-flags are missing, ensure that the necessary flags for your mode are included. For details, refer to Replication flags.

Replicator lag

If the replicator process is lagging significantly behind the current Oracle SCN, you may see log messages like: replicator is catching up to the current SCN at 5000 from 1000…. This indicates that replication is progressing but is still behind the most recent changes on the source database.

Oracle sessions remain open after forcefully stopping molt or replicator

If you shut down molt or replicator unexpectedly (e.g., with kill -9 or a system crash), Oracle sessions opened by these tools may remain active.

  • Check your operating system for any running molt or replicator processes and terminate them manually.
  • Once both processes are confirmed stopped, ask a DBA to check for lingering Oracle sessions with:

    SELECT sid, serial#, username, status, osuser, machine, program
    FROM v$session
    WHERE username = 'C##MIGRATION_USER';
    

    Wait for any remaining sessions to show an INACTIVE status, then terminate them using:

    ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE;
    

See also
