Implement Parallelized map and optimize Database search API #2669

ndegwamartin · 2024-09-07T10:43:27Z

IMPORTANT: All PRs must be linked to an issue (except for extremely trivial and straightforward changes).

Description
Optimizes the block in the DatabaseImpl search APIs that maps serialized FHIR Resources to HAPI FHIR Structures by introducing a parallelized implementation that uses async coroutines within each mapping iteration.
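
A minimal sketch of the idea, assuming the mapping step is a suspending function applied to each item: every element gets its own `async` coroutine and the results are collected in input order with `awaitAll`. The names `parallelMap` and `decode` are hypothetical stand-ins for illustration and are not taken from the PR's code.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope

// Hypothetical helper: starts one coroutine per element and waits for all of them.
// Results are returned in the same order as the input.
suspend fun <A, B> Iterable<A>.parallelMap(transform: suspend (A) -> B): List<B> =
  coroutineScope {
    map { item -> async { transform(item) } }.awaitAll()
  }

// Hypothetical stand-in for deserializing a stored resource string.
fun decode(serialized: String): Int = serialized.length
```

In this form the coroutines inherit the caller's dispatcher, so how much of the work actually runs in parallel depends on the context the function is called from.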

Alternative(s) considered
Have you considered any alternatives? And if so, why have you chosen the approach in this PR?

Type
Enhancement

Screenshots (if applicable)

Checklist

I have read and acknowledged the Code of conduct.
I have read the Contributing page.
I have signed the Google Individual CLA, or I am covered by my company's Corporate CLA.
I have discussed my proposed solution with code owners in the linked issue(s) and we have agreed upon the general approach.
I have run ./gradlew spotlessApply and ./gradlew spotlessCheck to check my code follows the style guide of this project.
I have run ./gradlew check and ./gradlew connectedCheck to test my changes locally.
I have built and run the demo app(s) to verify my change fixes the issue and/or does not break the demo app(s).

- Optimize Database search API

FikriMilano

The change looks great!

Additionally, could you provide a performance comparison between the old and new code? That would be good to know.

jingtang10

great work thanks @ndegwamartin!

FORK - With unmerged PR #9 - WUP #13 SDK - WUP google#2178 - WUP google#2650 - WUP google#2663 PERF - WUP google#2669 - WUP google#2565 - WUP google#2561 - WUP google#2535

jingtang10 · 2024-09-11T14:18:15Z

To summarise our discussion yesterday: I think there's still work to be done in this PR. @ndegwamartin to investigate the thread pool etc. Please comment when this is ready for the next round of review. @FikriMilano @aditya-07 @yigit @stevenckngaa @vorburger @kevinmost please also take a look at this.

ndegwamartin · 2024-09-17T16:15:02Z

Device: Physical, Samsung Galaxy Tab Active2
Mode: Benchmarking with Kotlin's measureTimeMillis (kotlin.system)
Scope: Database search API method search
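
For reference, a timing wrapper along these lines would produce such measurements. This is only a sketch: `timed` and `workUnderTest` are hypothetical names, and the actual benchmark harness is not shown in this thread.

```kotlin
import kotlin.system.measureTimeMillis

// Hypothetical stand-in for the code being measured (e.g. a search call).
fun workUnderTest(n: Int): Long {
  var sum = 0L
  for (i in 1..n) sum += i
  return sum
}

// Runs the block once and returns its result together with the elapsed milliseconds.
fun <T> timed(block: () -> T): Pair<T, Long> {
  val holder = ArrayList<T>(1)
  val elapsedMs = measureTimeMillis { holder.add(block()) }
  return holder[0] to elapsedMs
}
```

A call such as `val (result, ms) = timed { workUnderTest(1_000_000) }` yields the value and the wall-clock duration; a single measurement like this includes any warm-up cost, which may explain part of the variation between runs.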

Optimization: None

Run 1

Resource Type	Total Records	Time taken (seconds)	DB query (seconds)
Group	~1K	8	~0.2
Task	~17K	~24	~1.8
Patient	~11K	~456	~1.3

Run 2

Resource Type	Total Records	Time taken (seconds)	DB query (seconds)
Group	~1K	~2	~0.1
Task	~17K	~22	~1.7
Patient	~11K	~472	~1.2

Optimization: Using async with parent context (usually Dispatchers.IO)

Run 1

Resource Type	Total Records	Time taken (seconds)	DB query (seconds)
Group	~1K	4.8	~0.2
Task	~17K	~24	~1.7
Patient	~11K	~450	~1.3

Run 2

Resource Type	Total Records	Time taken (seconds)	DB query (seconds)
Group	~1K	~2	~0.1
Task	~17K	~24	~1.7
Patient	~11K	~455	~1.3

Optimization: Using async with Dispatchers.Default.
(Note: thread safety of the FHIR JsonParser is achieved by creating a new instance for each iteration.)
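
A sketch of this variant, assuming a parser that holds mutable state and so should not be shared across threads. `StubParser` and `parseAll` are hypothetical stand-ins; the real FHIR JsonParser and the PR's implementation are not reproduced here.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope

// Hypothetical stand-in for a parser that is not safe to share between threads.
class StubParser {
  private val buffer = StringBuilder() // mutable state: unsafe if shared

  fun parse(serialized: String): String {
    buffer.setLength(0)
    buffer.append(serialized.trim().uppercase())
    return buffer.toString()
  }
}

// Runs each parse on Dispatchers.Default, the dispatcher intended for CPU-bound work,
// and creates a new parser inside every coroutine so that no instance is shared.
suspend fun parseAll(serialized: List<String>): List<String> = coroutineScope {
  serialized.map { item -> async(Dispatchers.Default) { StubParser().parse(item) } }.awaitAll()
}
```

Creating a parser per item trades some allocation cost for not needing any locking around the parser.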

Run 1

Resource Type	Total Records	Time taken (seconds)	DB query (seconds)
Group	~1K	~5	~0.2
Task	~17K	~5.4	~1.8
Patient	~11K	~208	~1.4

Run 2

Resource Type	Total Records	Time taken (seconds)	DB query (seconds)
Group	~1K	~0.5	~0.1
Task	~17K	~5	~1.7
Patient	~11K	~204	~1.3

Note: The tests were carried out in a QA test environment. In the real world there would be more Patients than Groups (i.e. Patients = ~10 x No. of Groups) and even more Tasks than Patients (i.e. Tasks = ~30 x No. of Patients).
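
To put the Run 1 figures side by side, the speedup can be computed as the unoptimized time divided by the Dispatchers.Default time. The snippet below is only a worked calculation over the approximate numbers reported above; the names are hypothetical.

```kotlin
// Run 1 total times in seconds, copied from the tables above (approximate values):
// no optimization vs. async on Dispatchers.Default.
val baselineSeconds = mapOf("Group" to 8.0, "Task" to 24.0, "Patient" to 456.0)
val defaultDispatcherSeconds = mapOf("Group" to 5.0, "Task" to 5.4, "Patient" to 208.0)

// Speedup = baseline time / optimized time; values above 1.0 mean the optimized run was faster.
fun speedups(before: Map<String, Double>, after: Map<String, Double>): Map<String, Double> =
  before.mapValues { (type, seconds) -> seconds / after.getValue(type) }
```

For Run 1 this works out to roughly 1.6x for Group, 4.4x for Task, and 2.2x for Patient. The Group timings vary widely between runs, so that ratio is the least reliable of the three.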

ndegwamartin · 2024-09-17T16:24:32Z

Full specs of the device:

Samsung Galaxy Tab Active2
Android 9 (28)
3GB Memory

MJ1998 · 2024-09-19T09:38:57Z

engine/src/main/java/com/google/android/fhir/db/impl/DatabaseImpl.kt

@@ -460,6 +470,11 @@ internal class DatabaseImpl(
  }
 }

+/** Implementation of a parallelized map */
+suspend fun <A, B> Iterable<A>.pmap(f: suspend (A) -> B): List<B> = coroutineScope {


Can you rename pmap to something that signals it should be passed functions doing CPU-intensive work?
Maybe "pmapCPU"?

Yeah, that makes total sense because of the Dispatcher constraint.

I had restricted it for use in the DB search API class, but with the rename I could potentially move it out to the generic Utils class for reuse elsewhere.

ndegwamartin requested a review from a team as a code owner September 7, 2024 10:43

ndegwamartin requested a review from jingtang10 September 7, 2024 10:43

Implement Parallelized Map

2fc407c

- Optimize Database search API

ndegwamartin force-pushed the issue2668-opt-dbsearch branch from 0eb0187 to 2fc407c Compare September 7, 2024 10:46

FikriMilano reviewed Sep 9, 2024

jingtang10 reviewed Sep 10, 2024

Merge branch 'master' into issue2668-opt-dbsearch

edded37

ndegwamartin mentioned this pull request Sep 17, 2024

Optimize the Database Search API #2668

Open

ndegwamartin marked this pull request as draft September 17, 2024 16:41

Search API perfomance DB optimization - Default Dispatcher

209da12

ndegwamartin marked this pull request as ready for review September 18, 2024 10:46

MJ1998 reviewed Sep 19, 2024

MJ1998 Sep 19, 2024

ndegwamartin Sep 19, 2024

ndegwamartin Sep 19, 2024
