74 changes: 74 additions & 0 deletions site/posts/industry-talk-review/index.qmd
@@ -0,0 +1,74 @@
---
title: "Industry Talk Review"
author: "Rita Pecuch"
date: "2025-11-18"
categories: [review, git, pharmaceuticals]
toc: TRUE
toc-title: "Table of Contents"
toc-depth: 5
---

# Introduction

The usage of Git in statistical programming within the pharmaceutical industry has been steadily increasing over the last several years. Beyond the discussions of our PHUSE working group, several talks at recent industry events have showcased why and how various companies are leveraging this technology.

# Search Strategy

The search covered the 2025 PHUSE archive, chosen because pharmaceutical companies are customarily well represented at PHUSE events, along with one relevant paper that was presented at PharmaSUG 2025 and R/Pharma 2025.

# Major Themes

## Increasing Usage of Open-Source Programming

Open-source languages, particularly R, have emerged as powerful tools for reporting, statistical modeling, and enabling actionable insights within the pharmaceutical industry, capable of facilitating entire workflows from data cleaning to advanced analytics in a single environment [1,2]. In addition, frameworks such as R Shiny allow users to interactively visualize large-scale datasets without requiring knowledge of HTML, CSS, or JavaScript, thus reducing barriers to entry for open-source programming [1]. This accessibility, combined with R’s capacity to integrate data from multiple studies and leverage historical information, provides the flexibility and scalability needed to support diverse stakeholder needs and accommodate growing amounts of data, particularly from international, multicenter trials [1].

Large organizations such as GSK, Pfizer, and CDISC, as well as smaller biotech companies, have incorporated open-source tools into their strategies, highlighting the broader industry momentum toward transparency, collaboration, and freedom from the high costs, restrictions, and often steep learning curves of proprietary software [2,3,4,5,6]. The rise of open-source solutions has also begun to blur the boundaries between statistical and computer science programmers, emphasizing collaboration and the ability to customize, rerun, and update clinical analyses efficiently [2,6].

## Rise of Agile Methodologies

Agile methodologies are becoming increasingly common in the pharmaceutical industry due to their focus on real-time communication, iterative workflows, and dynamic approaches that foster continuous improvement [1,7]. Successfully leading and managing change within agile environments requires tailored strategies, including the use of sprints to drive ongoing progress and adaptability [7]. Key agile principles, such as iterative progress and continuous feedback, can also be valuable when applied to complex industry-specific efforts such as statistical computing environment (SCE) migration, ensuring flexibility and responsiveness throughout the transition [8].

## Need for Version Control

For teams using open-source technology in the pharmaceutical industry, Git provides robust support for agile and collaborative development, especially when managing complex projects that require frequent and coordinated contributions from multidisciplinary teams [1]. Its integration with platforms like GitHub enables transparent code management, collaborative planning, documentation, and discussion, all of which are essential for efficient development activities [4,5]. The transparency of code changes and the use of features such as pull requests further enhance collaboration, foster team learning, and provide opportunities to document why particular code changes were made. Templating and automation, such as GitHub Actions and Bash scripting, streamline project setup and repetitive tasks, reducing the risk of human error [9,10].

Additionally, Git’s ecosystem provides valuable opportunities for data-driven reporting and strategic oversight. Tools like the GitHub API allow organizations to automatically extract key metrics for continuous improvement and for reporting to senior leadership. For example, GSK extracts metrics such as the proportion and growth of R code versus SAS code, the total number of repositories using R, and usage trends over time [3]. Languages like R offer dedicated packages to interface with the GitHub API, making metric collection and analysis more accessible.
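
As a minimal sketch of this kind of metric collection, the Python snippet below aggregates per-repository language byte counts into an overall R versus SAS share and counts the repositories that contain R code. The input shape mirrors the language-to-bytes mapping returned by GitHub's REST endpoint `GET /repos/{owner}/{repo}/languages`. The repository names and byte counts are made-up illustration data, not figures from [3], and the function name `language_share` is hypothetical.

```python
from collections import Counter


def language_share(repo_languages, languages=("R", "SAS")):
    """Return each language's share of the bytes written in `languages`.

    `repo_languages` maps a repository name to a {language: bytes} dict,
    the shape returned by GitHub's "list repository languages" endpoint.
    """
    totals = Counter()
    for byte_counts in repo_languages.values():
        for language in languages:
            totals[language] += byte_counts.get(language, 0)

    grand_total = sum(totals.values())
    if grand_total == 0:
        # No code in any of the tracked languages: report zero shares.
        return {language: 0.0 for language in languages}
    return {language: totals[language] / grand_total for language in languages}


# Made-up byte counts for three hypothetical repositories.
example = {
    "study-a": {"R": 6000, "SAS": 2000},
    "study-b": {"SAS": 3000},
    "study-c": {"R": 1000, "HTML": 500},
}

shares = language_share(example)
repos_using_r = sum(1 for counts in example.values() if counts.get("R", 0) > 0)
print(shares, repos_using_r)
```

Running the same calculation on snapshots taken at regular intervals would give the growth trend over time. Byte counts are only a rough proxy for usage, so an organization may prefer a different unit.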

## GxP Compliance

For teams using open-source technology in the pharmaceutical industry, the use of Git can support compliance with key regulatory and auditing requirements by providing a complete, immutable history of code changes [1,10]. Compliance with regulations such as 21 CFR Part 11 requires that programming take place in controlled environments, preserving data privacy while allowing specific metadata sharing with regulatory authorities to facilitate reproducibility [2,6,9]. Git is integrated with several SCEs (e.g., Posit Workbench/Connect) [5].

Although Git is traditionally used for tracking code changes, a clear, easy-to-understand history of which code versions produced which data outputs is also extremely beneficial. Data outputs, even in development and testing environments, are traditionally stored in specific environments. The unique identifiers that Git assigns to each version of the code, known as commit hashes, can serve as helpful labels for different versions of data outputs. For example, a strategy employed by Graticule is to label different versions of data outputs stored in their data lake with the commit hash of the version of the code that produced those outputs [11].
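
The labeling idea can be sketched in a few lines of Python. This is an illustration only, not Graticule's implementation from [11]: the helper names, the `outputs/<dataset>/<hash>/` layout, the `adsl` dataset name, the Parquet file extension, and the 12-character abbreviation are all assumptions. The only Git command used is `git rev-parse HEAD`, which prints the full hash of the currently checked-out commit.

```python
import subprocess
from pathlib import PurePosixPath


def current_commit_hash(repo_dir="."):
    """Return the full hash of the commit currently checked out in `repo_dir`."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def versioned_output_path(base_dir, dataset, commit_hash, length=12):
    """Build an output location labeled with an abbreviated commit hash."""
    if len(commit_hash) < length:
        raise ValueError("commit hash is shorter than the requested label length")
    return PurePosixPath(base_dir) / dataset / commit_hash[:length] / f"{dataset}.parquet"


# Placeholder hash for illustration; inside a real repository,
# pass current_commit_hash() instead.
placeholder_hash = "0123456789abcdef0123456789abcdef01234567"
path = versioned_output_path("outputs", "adsl", placeholder_hash)
print(path)
```

A commit hash identifies committed code only, so a label like this is trustworthy only if the working tree had no uncommitted changes when the output was produced.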

## Training and Support

Despite the many advantages of using Git to support the emerging trends of open-source technology use and collaborative programming in the pharmaceutical industry, the technology can have a steep learning curve, with concepts that are new to many statistical programmers [7]. The open-source community is collaborative by nature and offers many resources and learning communities to support individuals who are learning Git [1,5]. Specific recommendations for fostering a culture that embraces change include clearly communicating the benefits and specific goals of the new approach, recognizing employees' efforts along the learning journey by celebrating small wins, embedding subject matter experts into adopting teams, providing ample notice in advance of the change, and providing ample opportunities for practice and repetition [7,8]. Metrics can also be gathered along the way through the GitHub API to identify trends across repositories that could be improved [3]. The metrics to track will vary by organization and by its specific goals, but examples could include the number of merge conflicts and commit message analysis.
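
As one hedged illustration of commit message analysis, the Python sketch below reports the share of commit subject lines that are very short, which may indicate where additional guidance on writing commit messages would help. The three-word threshold, the function name `short_message_share`, and the sample messages are assumptions chosen for illustration; none of them come from the cited talks.

```python
def short_message_share(messages, min_words=3):
    """Return (flagged, share) for commit subjects with fewer than `min_words` words."""
    flagged = [m for m in messages if len(m.split()) < min_words]
    share = len(flagged) / len(messages) if messages else 0.0
    return flagged, share


# Hypothetical commit subject lines, e.g. as listed by `git log --format=%s`.
subjects = [
    "Update",
    "Fix derivation of TRTSDT in ADSL",
    "wip",
    "Add unit tests for visit windowing",
]

flagged, share = short_message_share(subjects)
print(flagged, share)
```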

# Conclusion

Git can provide robust support for change tracking as open-source programming and agile methodologies become more prevalent in the pharmaceutical industry. Ways of using this technology to best support GxP practices continue to be discussed, and fostering a culture that welcomes change for the better is essential for a smooth implementation.

# References

1. Andhale, S.H., Sood, S. (2025). Harnessing the Power of R Shiny in a GxP compliant and validated manner for clinical trials use [White paper OS3464]. Sycamore Informatics India Private Limited. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS10.pdf

2. Patel, J. (2025). Evaluation of Open-Source Technologies in Clinical Trial Reporting [White paper PP15]. AstraZeneca US Biometric. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP15.pdf

3. Arancibia, B. (2025). What R We Counting? The Quest to Quantify R Usage in Biostatistics Organizations [White paper OS02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS02.pdf

4. Hume, S. (2025). A Technical Roadmap Defining CDISC's Path to End-to-End Automation [White paper DS05]. CDISC. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_DS05.pdf

5. Liao, C. (2025). Building Rome in a Few Months: How Open Source Technologies Empower Small Pharma [White paper OS07]. Recursion. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS07.pdf

6. Kim, J. (2025). The changes in the job of statistical programmer in the pharmaceutical industry [White paper ET07]. Pfizer Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET07.pdf

7. Ching, E. (2025). Learning to skate, pedal, and drive – Change leadership in the world of Agile delivery [White paper PD02]. GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PD02.pdf

8. Tahiliani, K., Jain, A. (2025). A Journey Through an Innovative Approach to Migration of Legacy System to New Multilingual Statistical Computing Environment (SCE). GSK. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_ET21.pdf

9. Shi, T., Song, L., Nomula, V.K. (2025). Implementation of a risk-governed, reproducible, and collaborative R workflow for Real World Evidence Projects [White paper PP27]. Merck & Co., Inc. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_PP27.pdf

10. Zhang, P., Chen, J., Yang, F., Chang, V., Lee, J. (2025). Generate Clinical Study Report (CSR) document using {quarto} and {shiny} [White paper OS06]. CIMS Global. https://phuse.s3.eu-central-1.amazonaws.com/Archive/2025/Connect/US/Orlando/PAP_OS06.pdf

11. Dusendang, J., Bath, S., Orozco, S., Asper, A., Koren, Y. (2025). Integrating Collaborative Programming with Automated Traceability and Reproducibility in Pharma Studies and Real-World Data Projects by Adapting DevOps Best-Practices [Paper OS-111]. https://pharmasug.org/proceedings/2025/OS/PharmaSUG-2025-OS-111.pdf