PharmaSUG 2024 Training Seminars
Enhance your PharmaSUG experience by attending optional pre- and post-conference training seminars taught by seasoned experts. Half-day courses are only $200 with a conference registration, or $300 without a conference registration. You can sign up for classes through the seminar registration system. Space is limited!
Note that the seminar registration system is separate from the conference registration system this year. You must register for the conference first in order to receive the $100 discount on the seminar registration fee.
Schedule
Sunday, May 19, 2024
| Course Title | Instructor(s) | Time | |
| #1 | End-to-End Electronic Submission Components for Regulatory Submission of Clinical Study Data | Prafulla Girase & Nate Freimark | 8:00 AM - 5:00 PM |
| #1-1 | Understanding and Creating Define-XML 2.1 using SAS® openCST | Lex Jansen | CANCELED |
| #1-2 | Introduction to the Production of Tables, Listings, and Figures (TLFs) Using Python, R, and SAS® | Kirk Lafler | 8:00 AM - 12:00 PM |
| #2-2 | Code Crunchers: Mastering Statistics for Programmers with Precision and Power | Jim Box | 8:00 AM - 12:00 PM |
| #2-1 | Navigating LLM / ChatGPT Revolution - From Its Potential to Practical Application | Kevin Lee | 1:00 PM - 5:00 PM |
| #1-3 | CDISC ARS Standards Training - Streamlining Analysis Results Reporting | Bhavin Busa & Bess LeRoy | 1:00 PM - 5:00 PM |
| #2-3 | Advanced ADaM Topics: Avoiding ADaM Pitfalls | Sandra Minjoe & Mario Widel | 1:00 PM - 5:00 PM |
Wednesday, May 22, 2024
| Course Title | Instructor(s) | Time | |
| #31 | Share Your Code with SAS Packages – Tutorial from 0 to Hero | Bartosz Jablonski | 1:00 PM - 5:00 PM |
| #32 | Hands-On Functions: How to Build Your Own User-Defined FCMP Functions and Macro Functions | Troy Martin Hughes | 1:00 PM - 5:00 PM |
| #33 | Mastering Statistical Hypothesis Testing Using R with Comparisons to SAS | Ryan Lafler & Daniela Nuñez | 1:00 PM - 5:00 PM |
| #34 | SDTM – A Deeper Dive Into the Basics and Beyond | Soumya Rajesh & Kristin Kelly | 1:00 PM - 5:00 PM |
Course Descriptions
End-to-End Electronic Submission Components for Regulatory Submission of Clinical Study Data
Prafulla Girase, Nate Freimark
Sunday, May 19, 2024, 8:00 AM – 5:00 PM
A regulatory submission of clinical study data also needs to be accompanied by various other electronic submission (eSUB) components such as Define-XML, annotated CRF, study data reviewer’s guide, analysis data reviewer’s guide etc. This seminar will take a deep dive into each of these components. It will educate attendees about various requirements, best practices, consistency checks etc. It will also go over key considerations related to preparation of a whole eSUB package for a submission such as folder structure considerations, PDF validation practices, final package checklist, regulatory hand-off etc.
Prerequisite: Very basic knowledge of eSUB components.
Introduction to the Production of Tables, Listings, and Figures (TLFs) Using Python, R, and SAS®
Kirk Lafler
Sunday, May 19, 2024, 8:00 AM – 12:00 PM
The streamlined production of Tables, Listings, and Figures (TLFs) in a clinical study represents essential elements of a Statistical Analysis Plan (SAP) by providing answers to regulatory questions along with the necessary support documentation for the clinical data contained in them. Tables are derived from source data and often involve the manipulation of data, listings represent the various reports that are generated in file formats like Rich Text Format (RTF), and figures representing graphical output that provides visual clarity and understanding of the data. This course provides attendees with an introduction to the production of TLFs using Python, R, and SAS® software.
Seminar Topics
- Introduction to Tables, Listings, and Figures (TLFs) in Clinical Studies.
- Terms, Abbreviations, Acronyms, and Definitions.
- Statistical Analysis Plan (SAP) Defined.
- Highlight a SAP for Clinical Studies.
- Review ICH E3 and Clinical Study Report (CSR) Template.
- Explore an Example Clinical Study Report (CSR).
- Explore the Types of Data: Nominal, Ordinal, Interval, and Ratio.
- A High-level Introduction to Python, R, and SAS software.
- Process of Data Preparation: Data Cleaning, Data Manipulation, Data Transformation, TLF Planning.
- Example Tables, Listings, and Figures (TLFs).
- Process of Producing Tables, Listings, and Figures (TLFs).
- Packages used to produce TLFs in Python: Rich Text Format (RTF) Output – Pandas, numpy, scikit, matplotlib.
- Packages used to produce TLFs in R: Rich Text Format (RTF) Output – Tidyverse (Data Manipulation), atable, r2rtf (Reporting in RTF Format).
- Procedures used to produce TLFs with SAS software: Categorical Data: PROCs – FREQ, SGPLOT, LOGISTIC, GENMOD, GLIMMIX; Continuous Data: PROCs – MEANS, UNIVARIATE, TTEST, NPAR1WAY, GLM, REG, MIXED, NLIN, NLMIXED; Graphing Data: PROCs – UNIVARIATE (Histograms, QQPlots), Statistical Graphics (SG) SGPLOT (Histograms, BoxPlots, BarCharts, ScatterPlots, RegressionPlots, ScatterPlot Matrix), Graph Template Language (GTL); Rich Text Format (RTF) Output: PROCs – SORT, FORMAT, SQL, TRANSPOSE, REPORT, TABULATE, Output Delivery System (ODS); Process of using ODS DOCUMENT in the Creation of TLFs.
- TLF Validation Activities.
Intended Audience: Programmers, Statistical Programmers, Clinical Scientists, Statisticians, Data Managers, and Others desiring to learn how to produce Tables, Listings, and Figures (TLFs) using Python, R, and SAS Software.
Prerequisites: Basic understanding of Statistics, but no previous Python, R, and SAS experience required
Delivery Method: Instructor-led tutorial with code examples
Seminar Material: e-Course Notes (PDF Format) and code are provided to Attendees.
Navigating LLM / ChatGPT Revolution - From Its Potential to Practical Application
Kevin Lee
Sunday, May 19, 2024, 1:00 PM – 5:00 PM
ChatGPT is at the forefront of the next revolution, and in this seminar, we’re about to embark on a journey that will demystify this remarkable technology. Imagine a virtual assistant that can comprehend and generate human-like text from Clinical Trial Data, a tool that can answer questions about specific patients, write SAS/R/Python codes, assist in content creation such as tables/listing/graphs, and even unleash its creativity in the realm of art. To truly harness its potential, you need to understand how to use ChatGPT in the art of prompt engineering, application development using ChatGPT API in Python/SAS and fine-tuning ChatGPT with your own data. In the seminar, we will embark on an exploration of ChatGPT that will equip you with the knowledge and skills to leverage its capabilities effectively while ensuring ethical and responsible use.
The seminar will cover various aspects of Large Language Models (LLM), with a focus on ChatGPT. It will begin with an introduction to LLM. The seminar then delves into ChatGPT, explaining its purpose, development history, and potential impact on organizations and individuals. The seminar will explore ChatGPT applications, including website prompts, API integration, Python coding and fine-tuning, and presents use cases ranging from simple inquiries, SAS coding, SAS migration to R/Python, to art generation. There’s an emphasis on how to effectively use ChatGPT through prompt engineering techniques. Concerns regarding ChatGPT, such as data privacy, bias, and ethical considerations, will be addressed. Finally, the seminar touches on enterprise-level ChatGPT implementation, discussing risk mitigation and regulatory compliance.
After attending the seminar, you will emerge with a comprehensive understanding of the LLM and ChatGPT phenomenon, including its architecture, practical use cases, prompt engineering, and API applications using Python/SAS. You will learn practical skills to effectively utilize ChatGPT in various domains, from effective prompts, content development, coding, art generation and more. Furthermore, you will gain insights into the ethical considerations surrounding LLM and ChatGPT, encompassing data privacy, bias, and regulatory compliance. You will also be equipped with knowledge about enterprise-level ChatGPT implementation and risk mitigation strategies. Ultimately, this seminar will empower you to leverage ChatGPT’s transformative potential while adhering to ethical and responsible ChatGPT practices in the Biometrics Department.
Back to top
CDISC ARS Standards Training – Streamlining Analysis Results Reporting
Bhavin Busa, Bess LeRoy
Sunday, May 19, 2024, 1:00 PM – 5:00 PM
CDISC Analysis Results Standard (ARS) is set to be a foundational standard and is scheduled to be released in Q1 2024. Join us for an in-depth seminar on the ARS logical model, an initiative to streamline analysis results, covering Tables, Figures, and Listings (TFL). Learn about the background and development of this logical model, geared towards enhancing automation, reproducibility, reusability, and traceability. This is your chance to be industry-ready. Discover insights into the current challenges analysts face in their workflows and our vision for the future state. We’ll introduce machine-readable analysis results metadata and structured analysis results data (ARD) representation, aiming to automate statistical outputs. Through practical examples, we’ll guide you through the model’s elements, demonstrating its potential to elevate traceability, reproducibility, and overall quality in clinical trial analysis and reporting. Reference the CDISC ARS model, project files, utilities, and docs on GitHub at: https://github.com/cdisc-org/analysis-results-standard.
For further information:
- ARS draft logical schema can be viewed at https://cdisc-org.github.io/analysis-results-standard/
- CDISC ARS User Guide is available at https://wiki.cdisc.org/display/ARSP/Analysis+Results+User+Guide
Advanced ADaM Topics: Avoiding ADaM Pitfalls
Sandra Minjoe, Mario Widel
Sunday, May 19, 2024, 1:00 PM – 5:00 PM
The seminar instructors have been working in the industry, volunteering on the CDISC ADaM team, and giving ADaM training for many years. This seminar pulls together common issues that they have seen and recommends ways to avoid them. Content will cover issues related to the ADaM classes of ADSL, BDS, OCCDS, and ADAM OTHER, including:
- When to use ADPL in addition to ADSL, vs. ADSL by itself
- When to use/not use DTYPE
- What content to put in PARAM vs. PARQUAL
- When to use BASETYPE and what can be used instead of BASETYPE
- How to incorporate FDA’s FMQs
- When to use class ADAM OTHER
Hands-on exercises will be included. Please bring a computer with Excel to get the most out of this seminar.
Share Your Code with SAS Packages – Tutorial from 0 to Hero
Bartosz Jablonski
Wednesday, May 22, 2024, 1:00 PM – 5:00 PM
When working with SAS code, especially when it becomes more and more complex, there is a point in time when a developer decides to break it into small pieces. The developer creates separate files for macros, formats/informats, and for functions or data too. Eventually the code is ready and tested and sooner or later you will want to share code with another SAS programmer. Maybe a friend has written a bunch of cool macros that will help you get your work done faster. Or maybe you have written a pack of functions that would be useful to your friend. There is a chance you have developed a program using local PC SAS, and you want to deploy it to a server, perhaps with a different OS. If your code is complex (with dependencies such as multiple macros, formats, datasets, etc.), it can be difficult to share. Often when you try to share code, the receiver will quickly encounter an error because of a missing helper macro, missing format, or whatever… Small challenge, isn’t it?
How nice it would be to have it all (i.e. the code and its structure) wrapped up in a single file – a SAS package – which could be copied and deployed, independent from OS, with a one-liner like: %loadPackage(MyPackage).
In the seminar an idea of how to create such a SAS package in a fast and convenient way, using the SAS Packages Framework, will be shared. We will discuss:
- concept of a package,
- the framework
- overview of the process, and
- how to build a package.
The intended audience for the presentation is intermediate or advanced SAS developers (i.e. with good knowledge of Base SAS and practice in macro programming) who want to learn how to share their code with others. All materials are publicly available at seminar’s GitHub: https://github.com/yabwon/HoW-SASPackages.
Hands-On Functions: How to Build Your Own User-Defined FCMP Functions and Macro Functions
Troy Martin Hughes
Wednesday, May 22, 2024, 1:00 PM – 5:00 PM
Attend and receive a FREE copy of the author’s 550-page book, SAS® Data-Driven Development: From Abstract Design to Dynamic Functionality, Second Edition, released in 2022! Students will receive the physical book at the training!
“User-defined” functions are those functions that are created by SAS users, as contrasted with “built-in” functions that are part of out-of-the-box Base SAS functionality. SAS provides two methods to build user-defined functions, including the SAS macro language and the SAS Function Compiler (aka PROC FCMP). This introductory course demonstrates how to build user-defined functions (and subroutines)—including both macro functions and FCMP functions. No prior experience with the SAS macro language or PROC FCMP syntax is required. User-defined functions improve software reusability—that is, the ability of code modules to be reused in future software projects, and to be reused by multiple SAS users within a team or organization. Reusability enables a function to be developed once but used repeatedly, which reduces the workload of the SAS users who are writing programs, by enabling us to rely on previously built (and fully tested) code modules. Thus, user-defined functions facilitate more flexible and configurable software, as well as a more productive, efficient SAS team.
This HANDS-ON workshop enables students to run all programs in real-time using SAS Display Manager, SAS Enterprise Guide, or SAS OnDemand for Academics. FCMP function topics comprise approximately 2/3 of the course, and include:
- Gentle introduction to PROC FCMP syntax and the construction of user-defined functions and subroutines (with the FUNCTION and SUBROUTINE statements, respectively)
- Use of OUTARGS option to modify multiple arguments (within the calling program)
- Passing character and/or numeric data types to functions
- Passing arrays to functions, and utilizing arrays within functions
- Declaring, initializing, and referencing hash objects within functions
- Calling functions and subroutines from the DATA step, and from %SYSFUNC and %SYSCALL
- Calling functions from PROC FORMAT
Macro function topics comprise approximately 1/3 of the course, and include:
- Gentle introduction to the SAS macro language, including differentiation between SAS macros and SAS macro functions
- Differentiation between positional and keyword parameters
- Defining optional parameters and default parameter values
- Passing macro lists and two-dimensional data structures to functions
- Use of the PARMBUFF option in the %MACRO statement to facilitate multi-element arguments
- Macro function argument validation, exception handling, and use of global macro variables as return values / return codes
Mastering Statistical Hypothesis Testing Using R with Comparisons to SAS
Ryan Lafler, Daniela Nuñez
Wednesday, May 22, 2024, 1:00 PM – 5:00 PM
This half-day course is open to all aspiring and experienced data scientists, statisticians, bioinformatics scientists, and clinical programmers interested in understanding, designing, and developing parametric and non-parametric statistical hypothesis tests for clinical experiments. This course leverages the R and SAS programming languages to conduct statistical hypothesis testing using real-world examples geared towards the pharmaceutical industry, clinical trials, and the biological and life sciences. Attendees are given a rigorous introduction to frequentist hypothesis testing including discussions about parametric statistical distributions, significance levels, error rates, effect sizes, statistical power, standard errors, confidence intervals, and p-values. Attendees also learn about strategies for successful experimental design, controlling for confounding and lurking covariates, handling missing values, and assessing causation against correlation.
Several parametric hypothesis tests including t-tests, Chi-Squared tests, One-Way ANOVA (Analysis of Variance), Factorial ANOVA, and One-Way MANOVA (Multivariate Analysis of Variance) are covered in R with comparisons to SAS, including a thorough discussion of each test’s assumptions, use-cases, output, and limitations. Frequently used non-parametric equivalents including the Mann-Whitney U test, the Wilcoxon Signed-Rank test, and the Kruskal-Wallis test are similarly investigated and developed in R.
By enrolling in this course, each attendee receives the documented R and SAS code files, their personal copy of the PDF version of the slides, and the confidence to successfully perform statistical hypothesis testing in their organization.
SDTM – A Deeper Dive into the Basics and Beyond
Soumya Rajesh, Kristin Kelly
Wednesday, May 22, 2024, 1:00 PM – 5:00 PM
While SDTM & the SDTMIG have been around for a while, we often need to refresh our memories about the nuances and nitty gritty details surrounding this fundamental standard. This interactive seminar is tailored to cover not only some of the basics of SDTM, but also those topics that, in our experience, pose as challenges to sponsors and programming teams. These include:
- “Why SDTM”: the background and purpose of SDTM
- “How to do SDTM”: high-level concepts about SDTM IG/Model, CT etc.
- “Ideas” – deeper dive into examples (assumptions, things from knowledge base, etc)
- New domains and variables introduced in the SDTMIG v3.4, and how to include some of these in the current (3.3) version per FDA request,
- How to handle visit occurrences in SV
- Collected versus derived Exposure data, keeping null permissable variables, etc.
The seminar will also include examples and exercises that highlight some of the topics listed above, that could be used to generate discussion, and get an assessment the overall understanding of SDTM by the attendees.
The audience level that this seminar is targeted for is beginner to intermediate – individuals who are new to the pharmaceutical industry: life science / data management professionals/ programmers / statisticians, etc. This would also help experienced SDTM programmers who have created submission datasets in the past who are looking for a refresher on recent changes to the standards.
Instructors












