PostgreSQL Expert needed for large batch update

Completado Publicado hace 10 meses Pagado a la entrega
Completado Pagado a la entrega

I have a test table called names with about 1 billion records of fake names of people (first_name and last_name).

I created a table called last_name_distinct in which I have each last name with the number of records it has in the names table stored in the field last_name_distict.count. To populate last_name_distict from the names table took about 30 minutes and resulted in about 20 million unique last names.

In the names table I also have a field called page. This is a page number which will be used to display the names. I want to display the names for a specific last_name over multiple pages, with a maximum of 100 records per page. So if the last name "Doe" has 150 records, it will display over 2 pages, 100 on the first page and 50 on the second page.

So to do this I asked ChatGPT to make an sql script (see below) to loop on the last_name_distinct table and then loop on the names table for the current last_name and update the page field with a page number for each record in names.

But the process for doing this takes way too long, about 1 month to run, and I am using a very fast multi-core dedicated server.

I tried using both the sql script provided by ChatGPT and writing my own program in php to do the same job, but both run too long.

I am guessing that the most efficient way of doing this is to run an sql script from within psql (\i [login to view URL]), which is what I did with the scipt ChatGPT made for me.

But I am wondering if there is a faster way of doing this? Maybe there is someone out there who is a PostgeSQL expert and nows a trick or two to make this run faster? If you have the solution please let me know.

DO $$

DECLARE

name_row record;

counter integer := 0;

record_count integer := 0;

record_row names%ROWTYPE; -- Declare a record variable to hold the values from the names table

BEGIN

-- Loop through distinct last names where count > 100

FOR name_row IN (SELECT last_name, count FROM last_name_distinct WHERE count > 100) LOOP

-- Reset the counter for each distinct last name

counter := 0;

-- Retrieve records from names table matching the current last name, sorted by last name and first name

FOR record_row IN (SELECT * FROM names WHERE last_name = name_row.last_name ORDER BY last_name, first_name) LOOP

-- Increment the counter for each record

counter := counter + 1;

-- Calculate the page number

[login to view URL] := (counter - 1) / 100 + 1;

-- Update the page field in the names table using the name_id

UPDATE names SET page = [login to view URL] WHERE name_id = record_row.name_id;

-- Print the last_name_distinct.last_name field for every 100,000th record processed

record_count := record_count + 1;

IF record_count % 100000 = 0 THEN

RAISE NOTICE 'Processing last name: %', name_row.last_name;

END IF;

END LOOP;

END LOOP;

END $$;

PostgreSQL PostgreSQL Administration PostgreSQL Programming

Nº del proyecto: #36898796

Sobre el proyecto

8 propuestas Proyecto remoto Activo hace 10 meses

Adjudicado a:

muhammadosama181

Hello I would love to help you learn and gaining more knowledge about SQL and MYSQL I am a software developer with a good experience with different RDBMs like MYSQL , Postgres, Oracle and MS server I work daily with Más

$15 USD en 7 días
(0 comentarios)
0.0

8 freelancers están ofertando un promedio de $21 por este trabajo

Midzukaya

Hi, Andrew M. A Frontend React Developer with experience of 8+ years in full-stack web development. Reading title - PostgreSQL Expert needed for large batch update and thought that I am very suitable for this project. Más

$50 USD en 2 días
(0 comentarios)
0.0
megabait82

Hi, I went through your project description and you're looking for someone experienced in PostgreSQL, PostgreSQL Administration, PostgreSQL Programming. I've confirmed your requirements and it seems like I am a gre Más

$10 USD en 7 días
(0 comentarios)
0.0
solomiiahuliak

Hello Andrew M. Good morning! Thank you for considering my proposal for the "PostgreSQL Expert needed for large batch update" position. I am excited about the opportunity to work with you and deliver high-quality work Más

$10 USD en 4 días
(0 comentarios)
0.0
vvelicko032

Hi, Andrew M. Going through the job posting. I just noticed that you are looking for a talented dev who is good at PostgreSQL Administration, PostgreSQL and PostgreSQL Programming. I've confirmed your requirements and Más

$10 USD en 6 días
(0 comentarios)
0.0
Gaurav147

Hi There, The slow performance you're experiencing is likely due to the nested loops and individual UPDATE statements within the loop. Performing updates on a billion-row table with such a nested loop can be time-cons Más

$20 USD en 7 días
(0 comentarios)
0.0
freelancermauree

Greetings Dear Client. Welcome to my profile, Home to Professional and Quality services with 100% customer satisfaction guarantee. I'm a Certified & Experienced Expert in the respective project requirements. Dear Clie Más

$30 USD en 2 días
(0 comentarios)
0.0