> ## Documentation Index
> Fetch the complete documentation index at: https://docs2.zenskar.com/llms.txt
> Use this file to discover all available pages before exploring further.

# PostgreSQL

## Overview

[PostgreSQL](https://www.postgresql.org/) is a powerful, open-source relational database known for its reliability, extensibility, and standards compliance. In Zenskar, you can use PostgreSQL as a data source to import transactional, usage, or customer data for metering, billing, and analytical reporting.

***

## Prerequisites

Create a dedicated read-only PostgreSQL user for replicating data. Alternatively, you can use an existing user:

```sql theme={null}
CREATE USER <user_name> PASSWORD 'your_password_here';
```

Provide the user read-only access to relevant schemas and tables:

```sql theme={null}
GRANT USAGE ON SCHEMA <schema_name> TO <user_name>;
GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO <user_name>;
ALTER DEFAULT PRIVILEGES IN SCHEMA <schema_name> GRANT SELECT ON TABLES TO <user_name>;
```

<Callout icon="📖" theme="default">
  You must re-run the above command for each schema from which you expect to replicate data.
</Callout>

<Callout icon="🚧" theme="warn">
  You must allow inbound traffic from [Zenskar IP addresses](https://www.zenskar.com/docs/zenskar-ips) .
</Callout>

***

## Set up a PostgreSQL data-source connector via Zenskar dashboard

1. Log into your Zenskar dashboard.
2. In the left side bar, click **Usage > Data Sources**.
3. In the top-right corner, click **+ ADD DATA SOURCE**.
4. On the **Add New Data Source** page, configure the PostgreSQL connector.
5. Click on the **SAVE SOURCE** button.

### Connector configuration

#### General configuration

| Field           | Description                                   | Required |
| :-------------- | :-------------------------------------------- | :------- |
| **Source Name** | Enter a unique name for this data source.     | Yes      |
| **Source Type** | Select **PostgreSQL** from the dropdown menu. | Yes      |

#### Connector configuration

| Field                                                                                                   | Description                                                                                                                         | Required    |
| ------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| Host                                                                                                    | The hostname or IP address of the PostgreSQL server.                                                                                | Yes         |
| Port                                                                                                    | The port on which PostgreSQL listens for incoming connections.                                                                      | Yes         |
| Schemas                                                                                                 | The schema you want to replicate in Zenskar.                                                                                        | Yes         |
| Database name                                                                                           | The name of the PostgreSQL database to connect to.                                                                                  | Yes         |
| Username                                                                                                | The username of the PostgreSQL database.                                                                                            | Yes         |
| Password                                                                                                | The password for the database user.                                                                                                 | Yes         |
| [SSL Modes](https://docs.zenskar.com/docs/data-source-connector-for-postgresql#ssl-modes-in-postgresql) | SSL configuration mode: Available options are: **Disable**, **Allow**, **Prefer**, **Require**, **Verify-CA**, and **Verify-Full**. | Conditional |

***

## Set up a PostgreSQL data-source connector via API

<Cards columns={0}>
  <Card title="Refer to the create data-source connector API reference." href="https://www.zenskar.com/reference/create-data-source-connector" icon="fa-info" target="_blank" />
</Cards>

<Callout icon="💡" theme="default">
  The `connector_config` object is the only part of the request that differs across connector types (such as BigQuery, Snowflake, or Redshift). The [create data-source connector API reference](https://www.zenskar.com/reference/create-data-source-connector) provides a generic overview, while this document explains the PostgreSQL-specific structure of `connector_config`.
</Callout>

### Request example

```curl theme={null}
curl --location 'https://api.zenskar.com/datasources' \
  -H 'x-api-key: <your-api-key>' \
  -H 'apiversion: 20240301' \
  -H 'Content-Type: application/json' \
  -d '{
    "name":"test_postgresql",
    "connector_type":"Postgres",
    "destination":"Postgres",
    "status":"active",
    "connector_config":{
        "port":5432,
        "schemas":["public"],
        "ssl":true,
        "ssl_mode":{
            "mode":"require",
            "client_certificate":"certificate",
            "client_key":"key"},
        "database":"db",
        "username":"zenskar",
        "password":"password",
        "host":"acme"},
        "source_definition_id":"decd3333-5647-4c0b-adf4-da0e75f5a750",
        "remote_conn":true
  }'
```

#### &#x20;PostgreSQL-specific connector configuration

The `connector_config` object contains configuration fields specific to the PostgreSQL data source.

| Field                         | Description                                                                                                                | Required                                                 |
| ----------------------------- | -------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------- |
| `port`                        | The port on which the PostgreSQL server listens for incoming connections (default: `5432`).                                | Yes                                                      |
| `schemas`                     | List of schemas to be replicated from the PostgreSQL database.                                                             | Yes                                                      |
| `ssl`                         | Indicates whether SSL should be used for the connection (`true` or `false`).                                               | Yes                                                      |
| `ssl_mode.mode`               | SSL mode to use for the connection. Supported values: `disable`, `allow`, `prefer`, `require`, `verify-ca`, `verify-full`. | Conditional (required if `ssl` = `true`)                 |
| `ssl_mode.client_certificate` | The client certificate to use when connecting over SSL.                                                                    | Conditional (required for `verify-ca` and `verify-full`) |
| `ssl_mode.client_key`         | The private key associated with the client certificate.                                                                    | Conditional (required for `verify-ca` and `verify-full`) |
| `database`                    | The name of the PostgreSQL database to connect to.                                                                         | Yes                                                      |
| `username`                    | The username used to authenticate with the PostgreSQL database.                                                            | Yes                                                      |
| `password`                    | The password for the specified database user.                                                                              | Yes                                                      |
| `host`                        | The hostname or IP address of the PostgreSQL server.                                                                       | Yes                                                      |
| `source_definition_id`        | The unique identifier for the PostgreSQL connector definition in Zenskar.                                                  | Yes                                                      |
| `remote_conn`                 | Indicates whether a remote (SSH tunnel or bastion) connection should be used.                                              | No                                                       |

## Addendum

### SSL modes in PostgreSQL

| SSL mode        | Description                                                                                                                                                          |
| --------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **disable**     | SSL is disabled. The connection between the client (Zenskar) and the PostgreSQL server is unencrypted.                                                               |
| **allow**       | Tries to connect without SSL first; if that fails, retries with SSL. This mode allows either encrypted or unencrypted connections depending on server configuration. |
| **prefer**      | (Default behavior in many clients.) Tries to connect with SSL first, but falls back to a non-SSL connection if SSL isn’t available.                                  |
| **require**     | Always uses SSL to encrypt the connection, but does **not** validate the server certificate. Ensures encryption but not identity verification.                       |
| **verify-ca**   | Establishes an SSL connection and verifies that the server’s certificate is signed by a trusted Certificate Authority (CA). Protects against untrusted certificates. |
| **verify-full** | Establishes an SSL connection, verifies the certificate’s CA, **and** checks that the server hostname matches the certificate. This is the most secure mode.         |

<Callout icon="❗️" theme="error">
  If SSL mode is set to **Require**, the connection will fail if the PostgreSQL instance doesn’t support SSL.
</Callout>

### Data replication

#### Change Data Capture (CDC) replication

Zenskar uses [logical replication](https://www.postgresql.org/docs/10/logical-replication.html) of PostgreSQL’s write-ahead log (WAL) to incrementally capture data changes, including deletions.

You should configure the PostgreSQL data source with CDC when:

* You need a record of deletions.
* Your database is large (500 GB or more).
* Your table has a primary key but lacks a suitable incremental cursor field (for example, `updated_at`).

<Callout icon="⚠️" theme="warn">
  CDC must connect to the *primary* database. Configuring CDC on a replica host will result in failures.
</Callout>

#### Set up CDC

1. Grant `REPLICATION` privileges to the read-only user:

   ```sql theme={null}
   ALTER USER <user_name> REPLICATION;
   ```

2. Enable logical replication on your PostgreSQL database.

   * **For self-hosted or VM environments (bare metal, EC2, GCE, Docker):**
     Edit the [`postgresql.conf`](https://www.postgresql.org/docs/current/config-setting.html) file:

     | Parameter               | Description                                 | Set value to                                    |
     | ----------------------- | ------------------------------------------- | ----------------------------------------------- |
     | `wal_level`             | Type of logging in the WAL                  | `logical`                                       |
     | `max_wal_senders`       | Maximum number of WAL sender processes      | Minimum `1`                                     |
     | `max_replication_slots` | Maximum number of replication slots allowed | `1` (or higher if other services read from WAL) |

   * **For AWS RDS or Aurora:**

     * Go to the **Configuration** tab of your DB cluster.
     * Locate your **cluster parameter group** and set `rds.logical_replication = 1`.
     * Restart your instance manually or during the next maintenance window.

   * **For Azure Database for PostgreSQL:**

     * Change the replication mode to `logical` from the Azure portal, or run:

       ```bash theme={null}
       az postgres server configuration set --resource-group <group> \
         --server-name <server> \
         --name azure.replication_support \
         --value logical
       az postgres server restart --resource-group <group> --name <server>
       ```

3. Create a dedicated replication slot for Zenskar using the `pgoutput` plugin:

   ```sql theme={null}
   SELECT pg_create_logical_replication_slot('zenskar_slot', 'pgoutput');
   ```

   The output will include the slot name to be filled in the Zenskar setup page.

4. Define replication identities for all tables you want to replicate:

   ```sql theme={null}
   ALTER TABLE <table_name> REPLICA IDENTITY DEFAULT;
   ```

   Use `FULL` for tables with large or TOASTable fields:

   ```sql theme={null}
   ALTER TABLE <table_name> REPLICA IDENTITY FULL;
   ```

5. Create a publication including all tables to be replicated:

   ```sql theme={null}
   CREATE PUBLICATION zenskar_publication FOR TABLE <tbl1, tbl2, tbl3>;
   ```

   The publication name is customizable.

<Callout icon="💡" theme="default">
  The Zenskar UI allows table selection for CDC, but if a table is not part of the publication, it will not be replicated. If a table lacks a replication identity, Zenskar can create it automatically if the user has sufficient permissions.
</Callout>

#### Xmin replication

The **xmin** method is a cursorless replication mode that tracks inserts and updates using PostgreSQL’s internal `xmin` system column.

Use **xmin** when:

* There’s no suitable cursor field (for example, `updated_at`).
* You want to replace a full-refresh sync.
* You are replicating PostgreSQL tables smaller than 500 GB.
* You are not replicating non-materialized views (these are unsupported by xmin).

<br />
