Horje
How to Get Data from two Different Tables in Different Servers in PostgreSQL?

PostgreSQL is an open-source, object-relational database management system that stores data in rows, with columns representing different attributes. It enables secure data storage, processing, and retrieval. Developed by a global team of volunteers, PostgreSQL is a sophisticated database solution that is free to download and compatible with Linux, Windows, and macOS operating systems.

Accessing Data from Multiple Tables

Accessing data from multiple tables within the same PostgreSQL database is straightforward and can be accomplished with standard SQL JOIN operations. However, accessing data from tables in different servers’ databases requires additional steps. Let us look into some common ways we generally access data from a database.

1. Tables Present In Same Database

Accessing tables within the same database is simple, involving standard SQL queries with JOIN operations.

SELECT *
FROM table1
JOIN table2 ON table1.id = table2.id;

2. Tables In Different Databases but on the Same Server(machine)

When tables are in different databases on the same server, you can use the ‘dblink‘ extension to perform cross-database queries.

SELECT *
FROM dblink('dbname=otherdb', 'SELECT id, name FROM table_name') AS table_name(id INT, name TEXT)
JOIN local_table ON local_table.id = table_name.id;

3. Tables in Different Databases and also on Different Servers.

Accessing data from tables on different servers requires using Foreign Data Wrappers (FDWs), which we will discuss in detail in this article.

Hosting PostgreSQL on Two Different Servers

First, we need a basic setup to work on this topic. We need PostgreSQL to be installed on three different servers and perform actions on the databases present in them through our local machine. The diagram below illustrates the layout of our PostgreSQL setup on a local machine and two AWS EC2 instances, we will use this for querying data across different databases located on separate servers from local hosts.

Screenshot-2024-07-03-214514-min

PostgreSQL hosted on different Servers

I have installed PostgreSQL on three different machines(servers), one on my local machine and other two in AWS EC2 instances.

If you are trying to setup a similar environment that I have made, ensure that you make the following changes to Postgres config files and EC2 servers security groups I have mentioned below.

1. Modify the security group

We should change the security group to allow traffic on port 22 (SSH) and port 5432 (PostgreSQL) for the EC2 instances hosted on AWS. This step allows any externals connections to listen the Port range 5432 on which PostgreSQL runs.

Screenshot-2024-07-03-220324

Modify security groups for PostgreSQL port

2. Modify PostgreSQL config files

By default, PostgreSQL listens only to the localhost commands, so to allow external connections to make changes to the databases we need to make the following changes to PostgreSQL config files.

  • Update postgresql.conf to listen on the instance’s IP:
sudo nano /etc/postgresql/12/main/postgresql.conf

Change listen_addresses to ‘*’.

  • Update pg_hba.conf to allow connections:
sudo nano /etc/postgresql/12/main/pg_hba.conf

Add:

host    all            all            0.0.0.0/0             md5

Creating Tables in All Three Servers

In this setup, we have PostgreSQL databases hosted on a local machine and two AWS EC2 instances. The goal is to access and integrate data from three tables stored across these different servers. We need some data on the databases to perform actions so I have uploaded three different datasets on three different servers.

  • Localhost: Stores the sales table
Screenshot-2024-07-04-122046

Sales table in local server

  • EC2 Instance 1: Stores the store table.
Screenshot-2024-07-04-121105

Store table in server1

  • EC2 Instance 2: Stores the products table.
Screenshot-2024-07-04-122755

Products table in server2

To get clear picture of which servers hold which tables you see the below image how I uploaded the data across different servers.

we will be querying the store table in server-1 and products table in server-2 from our local host machine.

Screenshot-2024-07-03-222703

Data distribution in different servers

Add Servers to pgAdmin4

Note: This step is not mandatory , it just gives more clear picture on different servers which have different databases and tables.

Once You have done all the setup add both the EC2 instance servers to pgAdmin4 through the register server option available in the pdAdmin4.

Screenshot-2024-07-03-224638

Server connection into pgAdmin4

Integrating Data from Multiple PostgreSQL Servers

In this setup, we have PostgreSQL databases hosted on a local machine and two AWS EC2 instances. The goal is to access and integrate data from three tables stored across these different servers. Steps to Integrate Data

To access and integrate data from tables stored on different servers, we’ll use the `postgres_fdw` (Foreign Data Wrapper) extension. This allows us to create a connection to a remote PostgreSQL database and access its tables as if they were local.

1. Install the FDW Extension

The `postgres_fdw` extension is installed to enable foreign data wrapper functionality, which allows PostgreSQL to access tables on a remote server.

CREATE EXTENSION postgres_fdw;

2. Create a Server Connection

Next, create a foreign server connection to the first EC2 instance (Server 1) where the `store` table is located, here we have created a new server link called “link_to_server1” (you can give any name you want), along with it I have provided my Ip address and database name on which store table resides.

CREATE SERVER link_to_server1 

FOREIGN DATA WRAPPER postgres_fdw

OPTIONS (host '13.127.146.100', dbname 'postgres', port '5432');

3. Create a User Mapping

A user mapping is established, allowing the local PostgreSQL user (`postgres`) to authenticate with the remote server using the specified username and password.

CREATE USER MAPPING FOR postgres

SERVER link_to_server1

OPTIONS (user 'postgres', password 'postgres');

4. Import the Tables

The `IMPORT FOREIGN SCHEMA` command imports the schema from the remote server, making the remote tables available in the local database’s `public` schema. public schema is the default schema in all postgres servers.

IMPORT FOREIGN SCHEMA public 

FROM SERVER link_to_server1 INTO public;

5. Query the Remote Table

Once the foreign schema is imported, you can run SQL queries on the remote table (`store`) as if it were a local table.

SELECT * FROM public.store;

Tryout in pgAdmin4

Let us try all the steps in one place in pdAdmin4 playground, First, let me try querying the store table which is present in EC2 server-1 from my local machine to show that I do not have “store” table in my local machine.

Screenshot-2024-07-04-125702-min

No store table present in Local Host

Let me show you the overall steps in one place to get clearer picture

Screenshot-2024-07-04-130837-min

Querying store table from Local Host

Now create another server link similar to the above one to get the products table from the EC2 Instance server-2 into our local host server

Screenshot-2024-07-04-131913-min

Querying products table from EC2 server-2

Now we have three tables in total on our local host server, we have two foreign tables from two different servers. See the below Screenshot.

Screenshot-2024-07-04-134118

Tables in local host after all steps

Operations On the Tables

Now let us perform a join operation between sales and store tables and observe the results.

Screenshot-2024-07-04-133159

Join between sales and store tables

Now let us perform join operation between the store table and products table, we should remember that these two tables are present in two different servers and we are retrieving data from third server i.e. local host server.

Join

Join between store and products from local host server

By using foreign data wrappers, we can efficiently manage and operate on distributed tables across different servers. This setup also enables you to perform JOIN operations between tables located on different servers as shown above.

Conclusion

Thus, Using the postgres_fdw extension in PostgreSQL allows seamless access to data stored across different servers, making remote tables appear as local ones. This setup, demonstrated with PostgreSQL databases on a local machine and two AWS EC2 instances, simplifies data integration and querying. By leveraging foreign data wrappers, you can efficiently manage and operate on distributed data, enhancing your database’s flexibility and functionality.




Reffered: https://www.geeksforgeeks.org


Databases

Related
Best 50 PostgreSQL Interview Questions and Answers for 2024 Best 50 PostgreSQL Interview Questions and Answers for 2024
To 50 Oracle Interview Questions and Answers for 2024 To 50 Oracle Interview Questions and Answers for 2024
Fuzzy matching in Elasticsearch Fuzzy matching in Elasticsearch
DateTime2 vs DateTime in SQL Server DateTime2 vs DateTime in SQL Server
Troubleshooting Common Elasticsearch Problems Troubleshooting Common Elasticsearch Problems

Type:
Geek
Category:
Coding
Sub Category:
Tutorial
Uploaded by:
Admin
Views:
19