PharmaSUG India Single-Day Event
Innovation with Integrity: Shaping the Future of Clinical Data & Standards!
Saturday, April 11, 2026
PharmaSUG India SDE was a great success – many thanks to our sponsors, presenters, and full house of 150 attendees!
Check out our photo album!
Shashikant Kumar
Ephicacy
Single-Day Event Co-Chair
Ajay Gupta
Daiichi Sankyo
Single-Day Event Co-Chair
Balasubramanian L.
Eli Lilly & Company
Single-Day Event Co-Chair
Conference Committee:
Ajay Gupta (Daiichi Sankyo), Eric Larson (IQVIA), Venky Chakravarthy (Takeda), Natalie Martinez (Eli Lilly), Shashikant Kumar (Ephicacy), Balasubramanian L. (Lilly)
Social Media:
Amy Zhang, Alice Cheng, Jyoti Jo Agarwal
Questions? Contact us!
Registration and Rates
| Registration Type | Early Registration by 28MAR2026 | Regular Registration 29MAR-09APR2026 | Late/On-site Registration 10APR-11APR2026 |
| SDE | $40 | $50 | NA |
Event Schedule
Saturday, April 11, 2026 | Single-Day Event Presentations (click link for slides)
Posters
| Title | Author |
| Advancing Clinical Data Management Through Risk-Based Quality Management and AI-Driven Innovation | Renisha Robinson, SENSAN |
| From Raw Data to Regulatory Trust: Innovating Clinical Data Standards Without Compromising Integrity | Swaroop Kumar Koduri, Ephicacy |
| Shaping Our Submission Journey with D-pack | Priyadharshini Gopalakrishnan and Rajesh Kulal, Zifo |
| Enhancing Efficiency in aCRF Preparation Through Automated R Driven Annotation | Vaishali Govindarajan and Priyadharshini Rajasekaran, Zifo |
| Navigating the Fog: A Story Pointing Framework for Clinical Programming Effort | Sarita Singh, Veramed |
Presentation Descriptions
Keynote: Evolution: How AI is Changing Drug Development
Debiprasad Roy, VP & Head of Digital Strategy, Analytics, and Programming, Exelixis
AI is disrupting drug development! The increasing use of AI for target identification, in silico modelling, protocol optimization, feasibility assessment, clinical trial workflows, analysis, and submissions is changing the paradigm of drug development. The hope is we can shorten the time to get drugs to market, helping patients with critical needs. It also raises questions about hype versus actual benefits and about how to govern the AI process within a regulatory framework.
I will explore three use cases across Biostatistics, Data Management, and Statistical Programming where AI is mature, driving improved practices and reducing time, resourcing, and costs.
An AI-Powered Metadata-Driven CDISC ADaM Dataset Automation
Surendra Gunti, Eli Lilly
Background: The CDISC Analysis Data Model (ADaM) Implementation Guide emphasizes standardized dataset structures and clear traceability from SDTM to analysis results. However, traditional ADaM programming remains resource-intensive and prone to inconsistency due to varied interpretations of standards and derivation logic.
Objective: To develop an AI-enabled, metadata-driven automation framework aligned with ADaM Implementation Guide principles that minimizes manual effort and reduces implementation variability in CDISC ADaM dataset development.
Methods: An AI-powered framework was developed using the Standards Automation and Metadata Process ADaM foundation integrated with an AI-enabled programming assistant. Structured metadata formalizes analysis populations, parameters, analysis records, and derivation conventions. AI-assisted rule interpretation automates ADaM dataset generation and validation while enforcing controlled terminology, variable roles, and traceability from SDTM sources.
Engineering the Digital Thread: Provenance and Impact Analysis in Metadata Management
Mukul Goyal, Sycamore
In a modern Metadata Repository (MDR), the ability to innovate rapidly depends on the architectural integrity of the underlying system. As global standards are cloned, modified, and customized across clinical studies, maintaining a “Digital Thread” of provenance is no longer optional—it is a governance mandate. This paper explores a data standards-centric framework for ensuring the integrity of study data specifications through provenance and impact analysis:
Provenance to CDISC Standards: The MDR should show the implementation guide (IG) and version of the CDISC standards used in a study, including the CDISC foundational standards such as SDTM and ADaM, as well as Controlled Terminology. The “Archetype” is the immutable origin of metadata, and the system persists this relationship across multiple layers of cloning, ensuring absolute provenance. This allows a 1:1 trace back to the “Source of Truth”.
Provenance to Company Standards: Data Collection standards, such as Global CRFs, are managed in the Global Library and reused across multiple Therapeutic Area (TA) Libraries. These CRFs carry TA-specific adaptations. When used across studies, provenance is maintained to both the TA and Global Libraries, so one can easily identify which variant of a form is in use.
Such provenance and lineage enable easy reporting and impact analysis when changes are planned to these Standard CRFs. The relationship is crucial to standards governance and automatically managed by the system as study teams focus on their key deliverable, i.e., study specifications for CRF, SDTM, ADaM, etc. By shifting from manual oversight to an architecturally enforced digital thread, study teams work in an MDR system that reduces the manual validation burden and ensures that the future of clinical data standards is both agile and compliant.
Automation of CDISC Data Review Is No Longer Optional: It’s a Critical Need of the Hour
Mrityunjay Kumar, Ephicacy
Regulatory submissions require extensive and rigorous review of SDTM, ADaM, and TLF outputs to ensure accuracy, consistency, and compliance with CDISC standards and regulatory expectations. Current data review practices remain largely manual, checklist-driven, and highly dependent on individual reviewer expertise. These approaches are time-consuming, resource-intensive, and prone to variability, making them increasingly unsustainable as clinical trial designs grow more complex and development timelines continue to compress. Automating CDISC data review has therefore become a critical need of the hour.
This talk presents a practical and scalable framework for automating data review across the clinical programming lifecycle. The framework combines metadata-driven validation checks, automated cross-dataset consistency rules, programmable reviewer checklists, and visualization-based review dashboards developed using R Shiny. Through real-world case examples, the approach demonstrates systematic identification of common data quality issues, including variable mismatches, derivation inconsistencies, population flag errors, and misalignment between ADaM datasets and TLF outputs.
From a trend’s perspective, current automation efforts primarily focus on rule-based validations and standard compliance checks. In contrast, future data review models are expected to evolve toward intelligent, risk-based, and adaptive frameworks that incorporate AI-assisted anomaly detection, pattern recognition, and learning from historical submission and inspection feedback. Such advancements will enable proactive quality oversight, continuous monitoring, and greater consistency across studies.
Governance, documentation, and validation considerations are also discussed to support regulatory acceptability of automated review solutions. Attendees will gain actionable insights into designing future-ready automated CDISC data review frameworks for reliable and submission-ready clinical datasets.
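As a minimal illustration of the kind of cross-dataset consistency rule such a framework automates, the sketch below flags subjects that appear in ADAE but not in ADSL, plus AE records for subjects outside the safety population. It is written in Python for compactness (the talk's own dashboards are built in R Shiny); the dataset and flag names follow CDISC convention, and the data are toy values.

```python
# Toy stand-ins: ADSL keyed by subject with the safety population flag,
# ADAE as (subject, dictionary-derived term) records.
adsl = {"S01": "Y", "S02": "Y", "S03": "N"}
adae = [("S01", "HEADACHE"), ("S02", "NAUSEA"),
        ("S03", "DIZZINESS"), ("S04", "FATIGUE")]

def check_subject_consistency(adsl, adae):
    """Two cross-dataset rules: every ADAE subject must exist in ADSL,
    and subjects with AE records should be in the safety population."""
    issues = []
    for subj in sorted({s for s, _ in adae} - adsl.keys()):
        issues.append(f"{subj}: present in ADAE but not in ADSL")
    for subj in sorted({s for s, _ in adae if adsl.get(s) == "N"}):
        issues.append(f"{subj}: has AE records but SAFFL='N' in ADSL")
    return issues

for issue in check_subject_consistency(adsl, adae):
    print(issue)
```

In a real framework, rules like these would be generated from metadata rather than hand-coded per study, so the same check applies uniformly across datasets and compounds.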
Advanced Analytics and Automation with SAS Clinical Acceleration
Sudeshna Guhaneogi, SAS
“Clinical acceleration” refers to strategies, technologies, and platforms designed to speed up clinical trials and the overall drug development process while ensuring compliance, data quality, and patient safety. It has become a major focus in life sciences with decentralized trials, real-world evidence, biomarker data, and digital protocols increasing complexity.
SAS Clinical Acceleration, built on SAS Viya, aims to streamline the path from data capture to regulatory submission, reducing bottlenecks in drug development. In addition, it supports regulated, auditable, traceable workflows to avoid regulatory delays, integrates with open-source tools, no-code/low-code interfaces, and standard data models (e.g., CDISC), and helps organizations reduce submission timelines and collaborate across teams and CROs. SAS Viya is a cloud-native, end-to-end data and AI platform that makes development easier and boosts productivity with tools that are friendly for everyone on your team.
This paper discusses the key features of SAS Clinical Acceleration, namely the centralized clinical data repository, the Statistical Computing Environment (SCE), and the no-code/low-code SAS Studio IDE. SAS Clinical Acceleration is a cornerstone for genuine transformation, enabling agentic, AI-driven processes.
Patient-Reported Outcomes (PRO) and Electronic Clinical Outcome Assessments (eCOA) in Oncology Clinical Trials: Applying Regulatory Guidance from Study Initiation Through Analysis and Interpretation
Yogesh Sonawane & Roger Steven James, IQVIA
Background
In oncology, patient-reported outcomes (PRO) collected via electronic clinical outcome assessment (eCOA) provide critical evidence about symptoms, functioning, and health-related quality of life—often complementing survival- and tumor-based endpoints. However, the evidentiary value of PRO data is frequently limited by inconsistent endpoint strategy, suboptimal instrument selection, preventable missingness, and unclear interpretation. Regulatory agencies (e.g., FDA, EMA) increasingly expect PRO endpoints to be fit-for-purpose, well-defined, and supported by rigorous data integrity and analysis plans.
Objectives
This abstract outlines an end-to-end, oncology-focused framework for implementing PRO/eCOA in alignment with key regulatory expectations, spanning:
- Study initiation and endpoint strategy (e.g., context of use, estimands, meaningful within-patient change)
- eCOA setup and conduct (instrument selection, training, compliance monitoring, and data integrity)
- Data analysis (missing data handling, longitudinal and responder-based methods)
- Interpretation and reporting (clinical meaningfulness, triangulation with clinical endpoints, and submission-ready displays)
The Identity Shift in Clinical Data Management: How Data Science Is Reshaping CDM Workflows
Rishi Raj, SENSAN
As clinical trials evolve into highly digitized, data-intensive ecosystems, the role of Clinical Data Management (CDM) is undergoing a measurable and practical transformation. Traditional CDM activities focused on rule-based validation, retrospective data cleaning, and static listings are increasingly challenged by real-time data streams, decentralized trial designs, and the growing use of analytics across clinical operations.
This session examines how data science concepts are being embedded into everyday CDM workflows, reshaping how data are reviewed, reconciled, and acted upon without redefining CDM as a separate discipline. Modern trials generate large volumes of structured and semi-structured data from EDC, eCOA, wearables, laboratory systems, IRT/RTSM, and operational platforms. Managing this complexity requires CDM teams to move beyond completeness checks toward pattern recognition, anomaly detection, and trend-based oversight aligned with RBQM principles.
Drawing on practical implementation experience, the session demonstrates how AI-enabled data review, centralized monitoring, and automated signal detection are being operationalized within standards-driven CDM environments. Particular focus is placed on trial desk–style oversight tools that consolidate multi-source data into a single operational view, enabling continuous review and reconciliation across EDC, eCOA, and external data streams while maintaining traceability and audit readiness.
A detailed example highlights eCOA-based dosing oversight in trials with narrow therapeutic windows, where timing, compliance, and dose accuracy are critical. Real-time integration of dosing data with centralized monitoring and anomaly detection enables earlier identification of missed doses, delayed administration, and protocol deviations than traditional listings or periodic reviews. Similar approaches are discussed for wearables, laboratory data, and IRT/RTSM, where intelligent reconciliation and trend-based alerts reduce manual effort and support patient safety.
Beyond tools and analytics, the session addresses the practical implications of this shift for CDM teams, including evolving skill requirements, closer collaboration with programming, biostatistics, and clinical operations, and the governance needed to ensure regulatory confidence when analytics inform oversight decisions. Rather than replacing traditional CDM foundations, data science is presented as an augmentation enabling CDM to evolve from reactive data review to proactive, standards-aligned trial oversight.
Attendees will gain a clear, practical understanding of how data science concepts are already influencing CDM workflows today and how PharmaSUG practitioners can adopt these approaches responsibly within existing clinical data and standards frameworks.
From Static TLFs to Insightful Exploration: An Interactive-First Approach with Teal
Himani Narang, J&J
Traditional reporting workflows often involve creating a large number of static TLFs upfront, which can be time-consuming and lead to multiple ad-hoc requests during review. An Interactive-First approach provides a more efficient way of working by enabling early data exploration and collaboration before finalizing static outputs.
This presentation demonstrates how the Teal package in R can be used to implement an Interactive-First strategy for clinical data analysis. Instead of producing extensive TLFs at the start, Teal-based Shiny applications allow stakeholders to explore data dynamically and identify the most relevant outputs collaboratively. This reduces rework and minimizes downstream ad-hoc requests. The presentation also highlights how Teal supports auto-generation of R code from interactive analyses, allowing exploratory work to be seamlessly transitioned into reproducible deliverables. By shifting the focus from static outputs to interactive exploration, teams can improve efficiency, enhance communication, and accelerate decision-making while maintaining clarity and confidence in the results.
Explore More with dataviewer(): The Next Step Beyond View()
Madhan Kumar, ICON
Efficient and traceable data exploration is essential in clinical research, particularly during SDTM and ADaM dataset review, where programmers must rapidly understand large, metadata-rich datasets. The traditional R View() function for inspecting data frames is limited to static exploration, with no interactivity or reproducibility features, which makes reviewing large clinical datasets difficult.
To address these limitations, I have developed “dataviewR”, an R package that provides an interactive and feature-rich environment for exploring clinical datasets. The dataviewer() function enables users to filter, subset, and explore data interactively, and automatically produces the equivalent dplyr code for each operation. This bridges graphical exploration and programmatic workflows, allowing exploratory findings to be integrated directly into validated analysis pipelines.
The tool also displays dataset attributes, including variable labels and metadata, improving understanding of SDTM and ADaM structures during review. This presentation demonstrates how dataviewer() improves clinical data review efficiency, reduces manual coding effort, and enhances integrity in exploratory workflows. Installation and usage are documented on the package’s site.
The package is published on CRAN. For more details, please visit the package’s official website.
A Knowledge Graph Driven Approach to Semantic SDTM Validation
Gitika Kishor, Pfizer
Regulatory submissions require SDTM datasets that are technically compliant, internally consistent, traceable, and aligned with evolving CDISC standards. This poster demonstrates a knowledge graph (KG) based approach to SDTM validation that leverages the semantic relationships embedded in CDISC structures. By encoding CDISC structures, controlled terminology, and inter-domain constraints as an ontology and querying the resulting graph, the approach detects higher-order inconsistencies that rule-based checks systematically miss.
The method builds a KG representing SDTM domains, variables, controlled terminology, and cross-domain linkages, using packages such as pandas for data ingestion, networkx for semantic modelling, and py2neo for Neo4j dashboard integration. From SDTM inputs, it generates a dynamic KG whose nodes (subjects, events) and edges (timing, causation) follow standardized CDISC ontologies. Querying the graph enables automated detection of anomalies, including missing domain links, incorrect terminology, inconsistent identifiers, and dependency issues.
This graph-based validation strategy provides metadata-aware, adaptive, and reliably traceable validation, improving scalability, transparency, and regulatory readiness.
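The core idea can be sketched with a stdlib-only toy graph (the poster's actual stack is pandas, networkx, and py2neo/Neo4j; the node and relation names here are illustrative): encode records and terminology as typed nodes and edges, then query the graph for structural anomalies.

```python
# Minimal knowledge-graph sketch using plain dicts and sets.
# Nodes are (type, id) pairs; edges are (source, relation, target) triples.
nodes = {
    ("SUBJECT", "S01"), ("SUBJECT", "S02"),
    ("AE", "AE1"), ("AE", "AE2"),
    ("CT", "HEADACHE"),                       # controlled-terminology node
}
edges = {
    (("SUBJECT", "S01"), "HAS_EVENT", ("AE", "AE1")),
    (("AE", "AE1"), "CODED_AS", ("CT", "HEADACHE")),
    (("AE", "AE2"), "CODED_AS", ("CT", "MIGRANE")),   # invalid term, orphan AE
}

def find_anomalies(nodes, edges):
    """Graph queries: orphaned events and terms outside controlled terminology."""
    issues = []
    linked = {t for s, r, t in edges if r == "HAS_EVENT"}
    for n in sorted(nodes):
        if n[0] == "AE" and n not in linked:
            issues.append(f"{n[1]}: AE not linked to any subject")
    for s, r, t in sorted(edges):
        if r == "CODED_AS" and t not in nodes:
            issues.append(f"{s[1]}: coded term {t[1]!r} not in controlled terminology")
    return issues

for issue in find_anomalies(nodes, edges):
    print(issue)
```

The same pattern scales up in networkx or Neo4j, where Cypher-style queries replace the explicit loops and the graph spans full SDTM domains rather than a handful of records.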
Evolving Submission Standards: Transitioning from XPT to Dataset-JSON
Gomathi S, ICON
The SAS XPORT (XPT) format has served as the FDA‑mandated transport standard for clinical study data submissions for more than three decades. While reliable, its limitations present growing challenges for clinical data workflows. To address these issues, the FDA and CDISC have initiated an evaluation of Dataset‑JSON, a modern, robust, flexible, and metadata‑rich alternative. The successful 2023 CDISC pilot project, along with the FDA’s subsequent Federal Register notice in 2025 requesting feedback on adopting Dataset‑JSON as a potential future standard, suggests that it may soon become an accepted submission format.
This presentation focuses on preparing for this transition by explaining the structure of Dataset‑JSON, how it differs from XPT, and how it integrates with Define‑XML metadata. It will walk through converting SDTM/ADaM datasets to Dataset‑JSON and validating them using {datasetjson} and {jsonlite} in R, along with guidance on establishing dual‑format workflows (XPT and Dataset-JSON) during a transitional period.
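A toy sketch of the underlying idea, in Python for compactness (the talk itself uses {datasetjson} and {jsonlite} in R, and the field names below are simplified illustrations, not the exact Dataset-JSON schema): column-level metadata travels with the row values in one self-describing JSON document, which XPT cannot easily do.

```python
import json

# Column-level metadata (names, labels, types) rides alongside the row values.
columns = [
    {"name": "USUBJID", "label": "Unique Subject Identifier", "dataType": "string"},
    {"name": "AGE", "label": "Age", "dataType": "integer"},
]
rows = [["S01", 34], ["S02", 52]]

dataset = {"name": "DM", "label": "Demographics",
           "columns": columns, "rows": rows}

text = json.dumps(dataset, indent=2)    # serialize for transport
restored = json.loads(text)             # lossless round trip
assert restored == dataset
print(restored["columns"][1]["label"])  # -> Age
```

In a dual-format workflow, a dataset produced once from SDTM/ADaM sources can be exported both as XPT and as a JSON document like this, with the JSON side carrying richer metadata for validation against Define-XML.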
Taming Real World Data (RWD) with CDISC Standards: A Practical, Standards-Based Approach for Regulatory-Ready Evidence
Tamilselvi Narayanasamy, SENSAN
Real World Data (RWD) originates from diverse, non-traditional sources such as Electronic Health Records (EHRs), medical claims and billing systems, patient registries, and data captured through digital health technologies and mobile devices. Regulatory agencies, including the FDA and EMA, are increasingly recognizing the value of RWD to complement evidence generated from randomized controlled trials by supporting assessments of treatment effectiveness, safety, and long-term outcomes across the product lifecycle. Despite its potential, RWD is not collected primarily for research purposes, leading to challenges related to data heterogeneity, variable data quality, inconsistent terminologies, and limited traceability, all of which directly impact statistical programming, analysis, and regulatory reporting. These challenges complicate downstream analytics, cross-study comparisons, and reuse of data for submissions.
This session demonstrates how applying CDISC standards, particularly SDTM and ADaM, provides a practical, standards-based framework to transform raw RWD into analysis-ready datasets suitable for regulatory and exploratory use. Using publicly available datasets such as the FDA Sentinel Initiative and open EHR-based RWD examples, the session illustrates key steps in mapping real-world clinical data to CDISC models, addressing traceability, and supporting reproducible analyses. By standardizing RWD using CDISC, organizations can improve data transparency, interoperability, and programming efficiency, facilitate integration with clinical trial data, and enable scalable, regulator-ready analytics. This approach supports more informed decision-making, accelerates evidence generation, and strengthens the role of RWD/RWE in modern clinical development.
Poster Descriptions
Advancing Clinical Data Management Through Risk-Based Quality Management and AI-Driven Innovation
Renisha Robinson, SENSAN
The modernization of clinical research is driven by advancements in clinical data standards and risk-based quality oversight. Standardized frameworks such as CDISC have improved data interoperability, streamlined regulatory submissions, and enhanced global compliance. In parallel, automated validation protocols and real-time analytics powered by artificial intelligence (AI) and machine learning (ML) are transforming how data is monitored, reviewed, and analysed throughout the clinical trial lifecycle.
Risk-Based Quality Management (RBQM) represents a key innovation in this transformation. Moving away from traditional 100% Source Data Verification (SDV), RBQM focuses on critical-to-quality (CtQ) factors that directly impact patient safety and data integrity. Through risk assessment, centralized monitoring, Key Risk Indicators (KRIs), and Quality Tolerance Limits (QTLs), organizations can proactively identify and mitigate risks while optimizing resources. Emerging technologies—including AI-driven risk detection, centralized statistical monitoring, and real-world data integration—further enhance RBQM implementation. Together, these innovations enable faster trial execution, improved operational efficiency, and high-quality, regulatory-compliant data in modern clinical research.
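A minimal sketch of how KRI and QTL checks can be mechanized (in Python; the indicator, metric, and threshold values are hypothetical, chosen only to illustrate the pattern of site-level KRIs versus a study-level QTL):

```python
# Site-level operational data: open queries and completed CRF pages.
sites = {"Site01": {"queries": 12, "pages": 400},
         "Site02": {"queries": 45, "pages": 300}}

KRI_QUERY_RATE = 0.10    # hypothetical KRI: queries per page, per site
QTL_MISSING_PCT = 5.0    # hypothetical QTL: % missing primary endpoint, study level

def kri_flags(sites, threshold=KRI_QUERY_RATE):
    """Sites whose query rate exceeds the Key Risk Indicator threshold."""
    return sorted(s for s, d in sites.items()
                  if d["queries"] / d["pages"] > threshold)

def qtl_breached(missing, total, limit=QTL_MISSING_PCT):
    """Study-level Quality Tolerance Limit on missing primary-endpoint data."""
    return 100.0 * missing / total > limit

print(kri_flags(sites))       # Site02 exceeds 0.10 queries/page
print(qtl_breached(18, 250))  # 7.2% missing -> limit breached
```

In practice, such checks run continuously against centralized monitoring feeds, with breaches triggering targeted follow-up rather than 100% SDV.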
From Raw Data to Regulatory Trust: Innovating Clinical Data Standards Without Compromising Integrity
Swaroop Kumar Koduri, Ephicacy
As clinical trials grow in complexity and development timelines shorten, clinical programming teams are under increasing pressure to deliver faster results while maintaining regulatory trust. Health authorities expect consistent application of clinical data standards, transparent derivations, and end-to-end traceability across the data lifecycle.
This poster presents a standards-driven approach for transforming raw clinical data into regulatory-trusted, submission-ready outputs without compromising data integrity. The approach emphasizes early adoption of SDTM and ADaM principles, metadata alignment across datasets, and controlled, macro-light SAS programming to ensure clarity, reproducibility, and scalability. Built-in validation checkpoints and traceability mechanisms support consistency from raw data through analysis datasets and final tables, listings, and figures. A practical case illustration from pharmacokinetic and safety analyses demonstrates reduced downstream rework, improved audit readiness, and fewer late-stage validation findings while supporting efficient delivery timelines. This poster reinforces that embedding integrity throughout the data pipeline enables sustainable innovation, enhances regulatory confidence, and supports consistent delivery of high-quality clinical data.
Shaping Our Submission Journey with D-pack
Priyadharshini Gopalakrishnan and Rajesh Kulal, Zifo
Clinical data submission in the eCTD for FDA and PMDA involves a standard folder (m5) structure, which provides readability and clarity for the reviewing authorities. Assembling this structure can be tedious, especially when acting as the intermediate layer rather than being directly involved in the submission, and questions about next steps often arise because parts of the structure remain ambiguous.
Specific pain points include the format and naming conventions of additional documents, as well as which documents belong under miscellaneous. In addition, manual structuring of the package often introduces errors and inconsistencies. To address this, we have developed a SAS macro that not only compiles the essential m5 components but also performs automated validation, such as format and file-presence checks, datetime-stamp mismatch detection, and modlog verification of the correct Sponsor name and Study ID, leading to precise, timely, and efficient submissions. The presentation will probe the working principle of our D-pack macro and provide additional pointers for seamless delivery of the submission package.
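The file-presence and naming checks described above can be sketched as follows (in Python for illustration; the actual D-pack implementation is a SAS macro, and the paths and naming rule here are simplified assumptions, not the full eCTD m5 specification):

```python
import re

# Illustrative required files and a simplified lowercase naming rule.
required = [
    "m5/datasets/study001/tabulations/sdtm/define.xml",
    "m5/datasets/study001/analysis/adam/define.xml",
]
naming_rule = re.compile(r"^[a-z0-9][a-z0-9-]*\.(xml|xpt|pdf)$")

def validate_package(files):
    """Report missing required files and file names violating the convention."""
    issues = [f"missing: {p}" for p in required if p not in files]
    for path in sorted(files):
        fname = path.rsplit("/", 1)[-1]
        if not naming_rule.match(fname):
            issues.append(f"bad name: {fname}")
    return issues

files = {
    "m5/datasets/study001/tabulations/sdtm/define.xml",
    "m5/datasets/study001/tabulations/sdtm/dm.xpt",
    "m5/datasets/study001/analysis/adam/ADSL.xpt",  # uppercase violates rule
}
for issue in validate_package(files):
    print(issue)
```

Checks like datetime-stamp and modlog verification follow the same shape: a declarative rule applied uniformly over the package contents instead of a manual checklist.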
Enhancing Efficiency in aCRF Preparation Through Automated R‑Driven Annotation
Vaishali Govindarajan and Priyadharshini Rajasekaran, Zifo
In the world of clinical trials, adherence to the Metadata Submission Guidelines (MSG) is paramount before submitting data to regulatory bodies. One of the prerequisites is annotated Case Report Forms (aCRFs). However, manually annotating and bookmarking them in accordance with the guidelines is difficult, time-consuming, and error-prone.
To alleviate the tedious manual tasks associated with aCRF annotation and aesthetics handling, we propose an R-based automation tool. The tool complies with regulatory standards and allows programmers to focus solely on identifying SDTM domains and variables, supplying this information to the tool via a simple Excel input. It generates an aCRF that is roughly 85% complete, with annotations that use visual cues such as color-coded boxes and font sizes to distinguish domains and variables, leaving only minimal drag-and-drop adjustments to finalize the annotations quickly and consistently. By outlining the subsequent steps for completing aCRF annotation, the tool streamlines the process, improving productivity and accuracy and freeing programmers to focus on essential tasks.
Navigating the Fog: A Story Pointing Framework for Clinical Programming Effort
Sarita Singh, Veramed
Estimating clinical programming tasks in hours is often like predicting a journey’s duration without knowing the road conditions—smooth highways can instantly turn into steep hills or unexpected detours. In the production of SDTM, ADaM, and TLF deliverables, these “road conditions” manifest as messy raw data, complex derivations, and shifting Statistical Analysis Plan (SAP) requirements. Traditional hour-based estimates frequently fail because they cannot account for the inherent volatility of clinical data. Clinical programmers often note that “time guessing” leads to burnout and missed milestones.
To provide a clearer view of this terrain, this paper introduces a structured Story Pointing framework tailored specifically for the clinical trial environment. Rather than focusing on the clock, this method functions as a map of the journey. Using a customised Pharma-Fibonacci scale, programming tasks are evaluated based on three critical dimensions: Data Integration Complexity, Derivation Density, and Specification Stability.
By shifting the focus from “how long” to “how difficult,” the framework allows lead programmers to assess effort and uncertainty more objectively. This approach helps teams plan sprints realistically, anticipate technical debt earlier, and reduce the high-pressure “crunch” common during regulatory submissions. The discussion outlines how to implement this simple scoring system to align stakeholder expectations and navigate the complexities of clinical reporting with greater predictability and confidence.
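A minimal sketch of how such a scoring system might look (in Python; the dimension ratings, linear mapping, and scale values are hypothetical illustrations, not the paper's calibrated Pharma-Fibonacci scale):

```python
# A Fibonacci-style scale; the mapping below is a hypothetical illustration.
SCALE = [1, 2, 3, 5, 8, 13, 21]

def story_points(data_integration, derivation_density, spec_stability):
    """Each dimension rated 1 (simple/stable) to 5 (complex/volatile)."""
    raw = data_integration + derivation_density + spec_stability  # 3..15
    target = 1 + (raw - 3) * 20 / 12                              # map onto 1..21
    return min(SCALE, key=lambda p: abs(p - target))              # snap to scale

# A clean, stable SDTM domain vs. a volatile efficacy ADaM:
print(story_points(1, 1, 1))   # -> 1
print(story_points(4, 5, 5))   # -> 21
```

Snapping to a coarse scale is deliberate: the gaps between Fibonacci values discourage false precision, which is exactly the failure mode of hour-based estimates the paper describes.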