Replace trino sql with postgresql on site SQL for on premise #5813

Conversation
Summary of Changes

Hello @ydayagi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly refactors the data access and processing mechanisms to support PostgreSQL as a primary data store for on-premise deployments. It introduces a flexible database accessor pattern, allowing the system to seamlessly adapt to either Trino or PostgreSQL environments. The changes primarily focus on abstracting database interactions, updating data ingestion pipelines for Parquet files, and providing conditional feature flag management, all aimed at enhancing deployment flexibility and maintainability for different environments.
Code Review
This pull request introduces a significant architectural change by abstracting database access to support both Trino and PostgreSQL, enabling on-premise deployments to use Postgres instead of a Trino/Hive/Parquet stack. The changes are well-structured, using an abstract base class and factory pattern to switch between database backends based on the ONPREM setting.
My review focuses on the correctness and security of the new implementation. I've identified a few critical issues:
- A potential SQL injection vulnerability in the PostgreSQL accessor.
- A bug in the Trino accessor that would cause SQL syntax errors.
- An incomplete (stubbed) implementation for data deletion in Postgres, which could lead to data inconsistencies.
- A date-handling bug in partition creation logic that would fail at the end of the year.
I've provided detailed comments and suggestions for each of these points. Addressing them will be crucial for the stability and security of the new on-premise data architecture.
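As a back-of-the-envelope illustration of the base-class-plus-factory arrangement described above: the names `ReportDBAccessor` and `get_report_db_accessor` appear in this PR's diffs, but the method bodies below are assumptions about their shape, not the actual implementation.

```python
# Illustrative sketch only: the class/factory names mirror identifiers seen in
# this PR's diffs, but the method bodies are assumptions about their shape.
from abc import ABC, abstractmethod

from django.conf import settings


class ReportDBAccessor(ABC):
    """Generates backend-specific SQL behind a common interface."""

    @abstractmethod
    def get_schema_check_sql(self, schema_name: str) -> str:
        ...


class TrinoReportDBAccessor(ReportDBAccessor):
    def get_schema_check_sql(self, schema_name):
        # Trino also exposes information_schema, but SHOW SCHEMAS is idiomatic.
        return f"SHOW SCHEMAS LIKE '{schema_name}'"


class PostgresReportDBAccessor(ReportDBAccessor):
    def get_schema_check_sql(self, schema_name):
        return f"SELECT 1 FROM information_schema.schemata WHERE schema_name = '{schema_name}'"


def get_report_db_accessor() -> ReportDBAccessor:
    # The factory keys off the deployment flavor, per the review summary above.
    return PostgresReportDBAccessor() if settings.ONPREM else TrinoReportDBAccessor()
```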
```python
def get_schema_check_sql(self, schema_name: str):
    """Return the SQL to check if a schema exists."""
    return f"SELECT 1 FROM information_schema.schemata WHERE schema_name = '{schema_name}'"

def get_table_check_sql(self, table_name: str, schema_name: str):
    return f"SELECT 1 FROM information_schema.tables WHERE table_name = '{table_name}' AND table_schema = '{schema_name}'"
```
These methods construct SQL queries using f-strings with values that could potentially be controlled by external factors, which is a SQL injection risk. While `schema_name` and `table_name` might currently come from trusted sources, it is a security best practice to always use parameterized queries.

A proper fix would involve a broader refactoring:
- The `ReportDBAccessor` interface (e.g., `get_schema_check_sql`) should be changed to return a tuple of `(sql_template, parameters)`.
- The calling method (`_execute_trino_sql` in `report_parquet_processor_base.py`) should be updated to accept these parameters and pass them to `cursor.execute()`.

This would make the database interaction safer and more robust against injection attacks.
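A hedged sketch of that refactoring; the `%s` placeholder style assumes a DB-API driver such as psycopg2 on the Postgres side:

```python
# Sketch of the parameterized variant the reviewer suggests; placeholder
# style (%s) assumes a DB-API driver such as psycopg2.
def get_schema_check_sql(self, schema_name: str):
    """Return (sql_template, params) instead of an interpolated string."""
    return (
        "SELECT 1 FROM information_schema.schemata WHERE schema_name = %s",
        (schema_name,),
    )

# The caller would unpack and pass the parameters through to the driver:
#   sql, params = accessor.get_schema_check_sql(schema_name)
#   cursor.execute(sql, params)
```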
```python
partition_values_lower = [f"'{self._provider_uuid}'", f"'{self._year}'", f"'{self._month}'"]
partition_values_upper = [f"'{self._provider_uuid}'", f"'{self._year}'", f"'{int(self._month)+1:02d}'"]
sql = get_report_db_accessor().get_partition_create_sql(self._schema_name, self._table_name, self._partition_name, partition_values_lower, partition_values_upper)
```
The calculation for `partition_values_upper` does not correctly handle the year-end rollover. When `self._month` is `'12'`, `int(self._month)+1` will become 13, which is an invalid month and will cause an error when creating partitions for December data.

You should use date arithmetic, for example with `dateutil.relativedelta`, to correctly calculate the start of the next month. You will also need to ensure `datetime` is imported in this file (`import datetime`).
Suggested change:

```python
start_date = datetime.date(int(self._year), int(self._month), 1)
next_month_date = start_date + relativedelta(months=1)
partition_values_lower = [f"'{self._provider_uuid}'", f"'{start_date.year}'", f"'{start_date.strftime('%m')}'"]
partition_values_upper = [f"'{self._provider_uuid}'", f"'{next_month_date.year}'", f"'{next_month_date.strftime('%m')}'"]
sql = get_report_db_accessor().get_partition_create_sql(self._schema_name, self._table_name, self._partition_name, partition_values_lower, partition_values_upper)
```
lcouzens left a comment:
So the initial plan for on-prem was OCP data only to start with. Is there a need to update all these cloud provider parts too (AWS, Azure, GCP)?
I'm concerned we are trying to change too much in one go rather than in incremental steps. I'd rather see OCP work exclusively first before we try anything beyond that. It would also help the review process, since it would be more focused.
In addition to that, were you able to run the full IQE integration tests on this locally and get passes? That's the first step I suggest we take: get the tests running and see what results we get compared to main.
```
@@ -0,0 +1,14 @@
SELECT
```
We have no plans today to do subs integration on-prem, so there shouldn't be any need to add Postgres versions of these files. (Everything under subs can be skipped.)
It was easier to convert everything than to pick the 100% necessary parts. I think the tests and CI do not have a clear distinction for OCP only.
@ydayagi testing the image with the on-prem chart raises an error in the cost-onprem-celery-worker-ocp deployment: the ONPREM env variable is set to True for this deployment.
Edit: The issue was solved by rebuilding the image. It seems the image was outdated and didn't contain the latest state of this PR.
```
@@ -0,0 +1,154 @@
CREATE TABLE IF NOT EXISTS {{schema | sqlsafe}}.managed_aws_openshift_daily_temp
```
So technically the SQL directory is also postgres_sql. Instead of postgres_sql as the directory name, can we change it to on_site, since this is SQL that will be running outside of the SaaS moving forward?
OK, though the change is about Postgres instead of Trino; we do not have to tie it to on-site/on-prem.

Hive and Django were never used together, and Django is unaware of Hive. The Django DB API is only used for Postgres. After my changes, the use of Hive and Postgres via Trino is replaced by using Django. Even if you see the word "hive" in a log, it is not really Hive; it is just a log message that was not changed.
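A minimal sketch of what "using Django" means here, assuming the backend-generated SQL is executed through Django's DB-API connection (which points at Postgres) rather than through a Trino cursor; `run_sql` is an illustrative name, not the PR's actual helper:

```python
# Minimal sketch, not the PR's actual helper: run backend-generated SQL
# through Django's default connection, which is Postgres on-prem.
from django.db import connection


def run_sql(sql, params=None):
    with connection.cursor() as cursor:
        cursor.execute(sql, params)
        # Only SELECT-style statements populate cursor.description.
        return cursor.fetchall() if cursor.description else None
```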
It was easier to convert everything; I think the tests (unit, IQE) do not have a 100% OCP-only set. In addition to that, I was told we are going to do OCP on cloud as well; otherwise, what's the point in cost mgmt? And if it is not going to be used, then there is no need to review or test it, but reverting it would be a great effort.
Sorry about that. I had a few fixes missing in the image.
FLPATH-2957 (https://issues.redhat.com/browse/FLPATH-2957)
- check_table_exists(): use get_table_check_sql() instead of SHOW TABLES (a sketch of this change follows below)
- find_expired_partitions(): use get_expired_data_ocp_sql() instead of the $partitions metadata query with date_parse()
- drop_expired_partitions(): use get_delete_by_month_sql() instead of a hardcoded DELETE with the hive. prefix
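A hedged sketch of the first bullet, reusing the accessor names from this commit message; the method body and return-value handling are assumptions:

```python
# Assumed shape of the refactored existence check. get_table_check_sql() is
# the accessor method named above; _execute_trino_sql is the executor
# mentioned earlier in this review thread.
def check_table_exists(self):
    sql = get_report_db_accessor().get_table_check_sql(self._table_name, self._schema_name)
    rows = self._execute_trino_sql(sql)
    return bool(rows)
```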
When ONPREM is True, use MockUnleashClient instead of real UnleashClient to avoid dependency on Unleash service.
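A sketch of how that selection might look; MockUnleashClient's constructor and the exact settings names (UNLEASH_URL, the app name) are assumptions:

```python
# Hedged sketch: pick the feature-flag client based on the ONPREM setting.
from django.conf import settings
from UnleashClient import UnleashClient


def build_unleash_client():
    if settings.ONPREM:
        # No Unleash service on-prem; the mock answers is_enabled() locally.
        return MockUnleashClient()  # assumed stub introduced in this PR
    return UnleashClient(url=settings.UNLEASH_URL, app_name="koku")
```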
Codecov Report

❌ Patch coverage is 60.6%.
❌ Your patch check has failed because the patch coverage (60.6%) is below the target coverage (90.0%). You can increase the patch coverage or adjust the target coverage.

```
@@           Coverage Diff            @@
##            main    #5813     +/-   ##
========================================
- Coverage    94.3%    93.7%    -0.6%
========================================
  Files         345      348       +3
  Lines       29740    30155     +415
  Branches     3239     3277      +38
========================================
+ Hits        28054    28260     +206
- Misses       1098     1298     +200
- Partials      588      597       +9
```
Jira Ticket
FLPATH-2822
Description
Replacing Trino+Hive+Parquet with Postgres for on-prem deployment
Testing
Release Notes