Paper presentations are the heart of a SAS users group meeting. PharmaSUG 2017 will feature over 200 paper presentations, posters, and hands-on workshops. Papers are organized into 15 academic sections and cover a variety of topics and experience levels. Detailed schedule information will be added in May.
Note: This information is subject to change. Last updated 23-Mar-2017.
Beyond the Basics
|Paper No.||Author(s)||Paper Title (click for abstract)|
|CP01||Kelly Spak||Creating a Winning Resume Reflecting Your Skills and Experience With Honesty and Integrity.|
|CP02||Janet Stuelpner||The Road Not Often Taken|
|CP03||Edward Slezinger||What Makes a Candidate Stand Out?|
|CP04||Greg Nelson||The Elusive Data Scientist: Real-world analytic competencies|
|CP05||Kathy Bradrick||Strategies and decisions for marketing yourself as an independent contractor|
| ||& Sree Harsha Sreerama Reddy||The Road Not Taken: SAS for Pharmacometrics|
|CP07||Christine Young||Becoming a Successful Manager of SAS® Programmers from an Ex-Programmer's Perspective|
|CP08||Priscilla Gathoni||Schoveing Series 3: Living by Knowing When to STOP|
| ||& Kevin Lee||Show Me The Job!!!|
|CP10||Yingqiu Yvette Liu||Career development for SAS programmers in biopharmaceutical industry|
|Paper No.||Author(s)||Paper Title (click for abstract)|
|DA01||Ashok Gunuganti||Real Time Clinical Trial Oversight with SAS|
|DA02||Harivardhan Jampala||Let's Check the Data Integrity using Statistical Programmer with SAS|
| ||& Jameson Cai||Conversion of CDISC specifications to CDISC data - specifications driven SAS programming for CDISC data mapping|
|DA04||Eric Kammer||Data Change Report|
|DA05||Brian Armstrong||Clinical and Vendor Database Harmony; Can't we all just get along?|
|DA06||Paul Stutzman||Check Your Data: Tools for Automating Data Assessment|
| ||& Sonal Torawane||Data Integrity: One step before SDTM|
Data Visualizations & Graphics
|Paper No.||Author(s)||Paper Title (click for abstract)|
|HT01||Art Carpenter||Five Ways to Create Macro Variables: A Short Introduction to the Macro Language|
|HT02||Kirk Paul Lafler & Mira Shapiro & Ryan Paul Lafler||Point-and-Click Programming Using SAS® Enterprise Guide®|
| ||& Rebecca Ottesen||Survival 101 - Just Learning to Survive|
|HT04||Vince Delgobbo||New for SAS® 9.4: Including Text and Graphics in Your Microsoft Excel Workbooks, Part 2|
|HT05||Jim Box||SAS Studio - the next evolution of SAS programming environments|
| ||& Michael Digiantomasso||Usage of Pinnacle 21 Community Toolset 2.x.x for Clinical Programmers|
|HT07||Bill Coar||Single File Deliverables: Next Steps|
|Paper No.||Author(s)||Paper Title (click for abstract)|
|IB01||Greg Nelson||A Practical Guide to Healthcare Data: Tips, traps and techniques|
|IB02||Maria Dalton||Good Programming Practices at Every Level|
| ||& Wen Tan||How to define Treatment Emergent Adverse Event (TEAE) in crossover clinical trials|
|IB04||Kevin Lee||How to find the best MDR solution for your organization|
| ||& Mike Lozano||SDTM Cartography - Learn To Create SDTM Mapping Specifications|
| ||& Bill Coar||Data Monitoring Committee Report Programming: Considering a Risk-Based Approach to Quality Control|
| ||& Mario Widel & Richard Addy||Building a Fast Track for CDISC: Practical Ways to Support Consistent, Fast and Efficient SDTM Delivery|
Management & Support
Statistics & Pharmacokinetics
|Paper No.||Author(s)||Paper Title (click for abstract)|
| ||& Scott Kosten||Multiple Imputation: A Statistical Programming Story|
| ||& Hangtao Xu||Multiplicity Controlled Analyses Using SAS/IML|
|SP03||Ronald Smith||Using Prentice-Williams-Peterson Gap-Time Model and PROC PHREG to analyze recurrent events data in Clinical Trials|
|SP04||Sharmeen Reza||Population PK/PD Analysis - SAS® with R and NONMEM® Make Customization Easy|
|SP05||Jonathan L Moscovici & Bohdana Ratitch||Combining Survival Analysis Results after Multiple Imputation of Censored Event Times|
|SP06||Kriss Harris||Adverse Event Data over Time|
|SP07||Meda Sammanna||Modelling and analysis of recurrent event data|
Techniques & Tutorials
Applications Development
AD01 : Here Comes The Smart Mock Table: A Novel Way of Creating Clinical Summary Tables Without Any Table Programming (No Kidding!)
Joseph Hinson, inVentiv Health
The creation of a clinical summary table typically involves two main phases: the statistical analysis of data, and the presentation of the analysis results onto a layout predefined by a mock table. The first phase can be pretty straightforward, simply involving calls to SAS statistical procedures plus a few DATA steps. The second phase, a relatively time-consuming part, consists of taking those procedure outputs and programming their placement in specified positions to become the summary table. It turns out that this second phase might not even be necessary. The mock table can be made to populate itself with analysis results! Such a "smart" mock table can be made by embedding macro calls directly in the RTF mock document. These macros would contain SAS procedures or DATA step code wrapped inside a DOSUBL function and called by %SYSFUNC to generate single macro variables. Finally, the entire smart RTF mock table can be placed on the SAS Input Stack as a %INCLUDE for macro processing using the new SAS 9.4 STREAM procedure. Such an approach could potentially allow programmers to focus entirely on data analysis, significantly shortening turnaround times for deliverables. Table cosmetic changes could be done at the mock table level with no need for reprogramming. Data point repositioning could also be implemented directly on the mock table by simply relocating the macro calls. The technique in fact is applicable to any Word document, including the CONSORT Flow Diagram (see the author's other paper).
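The %SYSFUNC(DOSUBL(...)) pattern the abstract describes can be sketched roughly as follows; this is a minimal illustration rather than the author's actual macros, and the dataset and variable names are made up:

```sas
/* Run a complete analysis step mid-macro-resolution via DOSUBL.
   A macro variable created inside the submitted side session is
   visible afterwards, so one macro call can yield one cell value. */
%let rc = %sysfunc(dosubl(%str(
   proc means data=sashelp.class noprint;
      var age;
      output out=_stat mean=mean_age;
   run;
   data _null_;
      set _stat;
      call symputx('cell_value', put(mean_age, 5.1));
   run;
)));
%put The populated cell would read: &cell_value;
```

In the smart mock table, a call like this would sit inside the RTF document itself, which PROC STREAM then pushes onto the input stack for macro resolution.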
AD02 : Automated Generation of Clinical Study Reports using SAS® and RTF (Literate programming)
Rajaram Venkatesan, Cognizant Technology Solution
Julien Sauser, Nestle Research Center
Carlos Antonio De Castro, Nestle Research Center
The clinical study report (CSR) is the final milestone in any clinical trial. A CSR typically consists of a Table of Contents (TOC), background of the study, interpretation of results, and tables, listings and figures (TLFs). The process of populating TLFs, updating TLF numbers and cross-references, and incorporating results in a CSR is laborious and potentially error prone. An 'automated' technique is presented in this paper which uses SAS® and the literate programming concept to generate a CSR document populated with specified TLFs, text, and cross-references. A CSR written using literate programming saves a significant amount of time post database lock (DBL). Literate programming efficiently combines the text and the statistical analysis in one single SAS source file which can be run immediately after unblinding the data to automatically produce the CSR in MS Word and PDF format. The method can also be used to produce reports that need to be updated on an ongoing basis, such as a Data Quality Review report, as the single source document can be reused and updated multiple times without applying any change. The first draft of the CSR or any other document can then be ready in hours, as opposed to days, post unblinding.
AD03 : Three Issues and Corresponding Work-Around Solution for Generating Define.xml 2.0 Using Pinnacle 21 Enterprise
Jeff Xia, Merck
This paper discusses three technical issues that occur in define.xml 2.0 files generated by Pinnacle 21 Enterprise, explains their causes, and presents an integrated work-around solution to resolve them with a simple SAS® macro. As per the latest "Standards Catalog" released by the FDA, the agency will stop accepting define.xml 1.0 in NDA/BLA submissions starting from March 2018, which will require the industry to move to define.xml 2.0. The web-based application Pinnacle 21 Enterprise is emerging as a useful tool in generating define.xml 2.0 in a high quality format that meets the agency's expectation for submissions. We identified three technical issues during our submission process when we generated define.xml 2.0 using Pinnacle 21 Enterprise: 1) certain hyperlinks to external files do not work as expected and behave like broken links in Microsoft Internet Explorer (IE) when the files are saved in Windows NTFS system; 2) the order of domains in the table of contents does not follow CDISC recommendations and industry convention; 3) the text of the define.xml does not wrap properly in some windows-based text editors. This improper formatting causes difficulty when reviewing the xml syntax or performing minor updates using these editors. These issues can be resolved by the solution presented in this paper.
AD04 : A Vivid and Efficient Way to Highlight Changes in SAS Dataset Comparison
Jeff Xia, Merck
Lugang Larry Xie, Merck & Co.
Shunbing Zhao, Merck & Co.
How to identify the updates and changes between periodic data extractions in an ongoing study with the least effort is an important practical question. Proc Compare, a conventional method widely used in the industry, might not be good enough to meet the business needs because of its limitations and gray areas in dataset comparison. This paper presents a method to programmatically identify updates in a dataset between two different versions and produce a report in MS Excel spreadsheet format with the changes/updates vividly highlighted. Regardless of the number of data points in the dataset, the changes between transfers might be very limited. This method provides an efficient way to bring reviewers' attention only to these data changes, greatly cutting down unnecessary review burden.
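The paper's Excel-highlighting step is not reproduced here, but the starting point for programmatic change detection can be sketched with PROC COMPARE's output options; the library, dataset, and key names below are hypothetical:

```sas
/* Capture only the differing observations. In the OUT= dataset,
   _TYPE_='DIF' rows mark, character by character, where values
   changed between the two versions keyed by USUBJID. */
proc compare base=old.dm compare=new.dm
             out=diffs outnoequal outbase outcomp outdif noprint;
   id usubjid;
run;
```

The DIFFS dataset can then be post-processed into a reviewer-friendly report with only the changed cells flagged.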
AD05 : An Efficient Solution to Efficacy ADaM Design and Implementation
Chengxin Li, Pfizer
Zhongwei Zhou, Pfizer
This paper investigates some standardization methods for the design and implementation of ADSL and efficacy ADEFF datasets. For ADSL, based on rigid SDTM common domains and on ADSL components and functions, ADSL variables are designated into categories of global, project, and study (GPS). The global variables (approximately 80% of all ADSL variables) are specified, derived, and validated only once within a company; the project variables can be further managed within a therapeutic area or an indication; and study variables are handled at the specific study level. A global macro is developed to implement the ADSL processing, where the macro is called for deriving "G" and "P" level variables, respectively. The "S" level variables are added by the study programming team. Therefore, the programming team can focus mainly on study-specific variable derivations. For ADEFF, this paper introduces a two-layer ADaM design method for generating the efficacy endpoints dataset. The first layer is an interim dataset developed with timing windows and imputation rules. Derived from the first-layer dataset and used to support all the efficacy endpoint analyses, the second layer is an endpoints dataset holding either binary or continuous endpoints in a vertical or horizontal structure. In each layer, the derivation flows in sequential steps; the individual steps are maximally macrotized. With this approach, the complicated concepts are divided into simple manageable steps, which are assembled together and further polished (i.e., aligning metadata with specifications) in production. For traceability, the first-layer dataset is recommended for submission as well.
AD06 : Obtaining an Automated Summary of PROC COMPARE Results: &SYSINFO and VB Script
Umesh Gautam, Trial Runners
Jeff Roberts, Trial Runners
In the life cycle of a clinical analysis project it's often necessary to do complete validation of data sets and outputs multiple times as data sets and outputs are created more than once for draft, dry run, data cuts, etc. During program development there's no substitute for solid program construction and close attention to the comparison results of double programming validation. However, once robust production and validation programs are in place, the validation of programs and outputs against new data may be obtained much more efficiently with automated comparison and summary of the results. This paper will show how such a process may be implemented using SAS® automatic variables and VBScript.
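The &SYSINFO part of such a process might look like this sketch (library and dataset names are assumed); the automatic macro variable must be captured immediately after PROC COMPARE, before another step resets it:

```sas
/* Compare production and QC versions, then record the bit-coded
   result: 0 means the data sets are identical; nonzero bits encode
   specific kinds of differences (e.g., 4096 = unequal data values). */
proc compare base=prod.adsl compare=qc.adsl noprint;
run;
%let cmprc = &sysinfo;   /* grab it right away */

data compare_summary;
   length dataset $32 status $12;
   dataset = 'ADSL';
   rc      = &cmprc;
   status  = ifc(rc = 0, 'CLEAN', 'DIFFS');
run;
```

Accumulating one such row per comparison yields the automated summary the authors describe, which their VBScript step then distributes.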
AD07 : Work with me here...(or there or anywhere): Practical Cloud Collaboration with JMP Clinical for Clinical Trial Data Reviews
Kelci Miclaus, SAS
Drew Foglia, SAS Institute, Inc.
JMP Clinical is a solution built on powerful analytics and data visualization of both SAS and JMP software. It provides a rich environment for generating clinical trials reviews driven by CDISC data standards (SDTM/ADaM) for safety analysis, fraud detection and data integrity/quality, operational monitoring and oversight, and data management. Configuration options to enable study and review sharing allow the solution to be a part of collaborative architecture for clinical trials data analysis and review. A scenario of such sharing using mapped network drives will be shown for generating clinical trial safety analysis reports including patient profiles and auto-generated patient narratives for medical monitor review. With the increasing popularity and acceptance of cloud-based storage services, collaborating with colleagues, partners, regulatory agencies, and customers has never been easier. This presentation will also illustrate how, using Google Drive, JMP Clinical enables biostatisticians to author and instantly share safety, efficacy, and operational reviews with monitors, medical writers, and sponsor executives. Furthermore, with automatic syncing capabilities, reviewer-generated status updates and comments are collected even while working offline. Once reconnected, reviewers receive notifications that alert them to the presence of new reviews or comments. This infrastructure supports a more flexible working environment, providing easy access to up-to-the-minute reports for personnel working in the field, even when access to the internet is tenuous. We demonstrate a simple example for the scenario of centrally-located pharmaceutical research staff authoring content and distributing it to monitors engaged in audits at clinical sites, an undertaking with potential global reach.
AD08 : Building a Better Dashboard Using SAS® Base Software
Kirk Paul Lafler, Software Intelligence Corporation
Josh Horstman, Nested Loop Consulting
Roger Muller, Data To Events, Inc
Organizations around the world develop business intelligence dashboards to display the current status of "point-in-time" metrics and key performance indicators. Effectively designed dashboards often extract real-time data from multiple sources for the purpose of highlighting important information, numbers, tables, statistics, metrics, and other content on a single screen. This presentation introduces basic rules for "good" dashboard design, reviews the metrics frequently used in dashboards, and shows how to build a simple drill-down dashboard using the DATA step, PROC FORMAT, PROC PRINT, PROC MEANS, ODS, ODS Statistical Graphics, PROC SGPLOT, and PROC SGPANEL in SAS® Base software.
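As a small taste of the Base SAS pieces involved, a single dashboard panel might be sketched like this, using a SASHELP dataset rather than the presenters' examples:

```sas
/* One dashboard panel: summary statistics plus a bar chart,
   both routed to a single HTML page via ODS. */
ods html file='dashboard.html' style=htmlblue;

proc means data=sashelp.prdsale sum maxdec=0;
   class country;
   var actual;
run;

title 'Actual Sales by Country';
proc sgplot data=sashelp.prdsale;
   vbar country / response=actual stat=sum;
run;
title;

ods html close;
```

A full drill-down dashboard layers links and additional panels on top of this basic pattern.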
AD09 : Important and Valuable Things You Can Do with SAS® Metadata DICTIONARY Tables and SASHELP Views
Kirk Paul Lafler, Software Intelligence Corporation
SAS® users can easily and quickly access metadata content with a number of read-only SAS data sets called DICTIONARY tables or their counterparts, SASHELP views. During a SAS session, information (known as metadata) is captured including SAS system options along with their default values, assigned librefs, table names, column names and attributes, formats, indexes, and more. This presentation explores how metadata can be used as input into a SAS code generator or a SAS macro to produce the desired results, the application of specific DICTIONARY table and SASHELP view content, examples related to the creation of dynamic code, and the creation of a data dictionary.
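For example, a generic query like the following lists column metadata that could drive generated code; the same content is exposed to the DATA step through SASHELP views such as SASHELP.VCOLUMN:

```sas
/* DICTIONARY tables are queryable only in PROC SQL; this pulls the
   column-level metadata for one table in the SASHELP library. */
proc sql;
   select memname, name, type, length, format
      from dictionary.columns
      where libname = 'SASHELP' and memname = 'CLASS';
quit;
```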
AD10 : Laboratory Data Standardization with SAS
In clinical trials, lab data usually come from central labs and/or many different local labs. Different labs often use different lab test names, units, and ranges. In order to pool and analyze these data efficiently and correctly, central/local lab test names, units, and results have to be converted to standard test names, units, and results. We built a SAS-based lab standardization system. This system maps local lab test names to SDTM test names, maps various lab units to standard units, assigns standard units to lab tests, maintains conversion-factor tables, and performs numeric conversions for lab results and ranges. It can also standardize character lab results and create and maintain character ranges. Finally, a dozen lab reports are created from the standard data to facilitate lab data review and cleaning.
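The numeric-conversion step of such a system might be sketched as a join against a conversion-factor lookup; every table and variable name here is an assumption for illustration, not the actual system:

```sas
/* Hypothetical sketch: convert local lab results to standard units
   using a lookup keyed by test code and local unit. The ?? modifier
   suppresses notes for character results that are not numeric. */
proc sql;
   create table lb_std as
   select a.usubjid,
          a.lbtestcd,
          b.stdunit                                as lbstresu,
          input(a.lborres, ?? best32.) * b.factor  as lbstresn
   from lb_local    as a
        left join conv_factor as b
        on  a.lbtestcd = b.lbtestcd
        and a.lborresu = b.localunit;
quit;
```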
AD11 : Generating Homeomorphically Irreducible Trees (Solving the Blackboard Problem in the Movie "Good Will Hunting")
John R Gerlach, Dataceutics, Inc.
In the movie, Good Will Hunting (1997), a mathematics professor challenges his students to draw all Homeomorphically Irreducible Trees of Order Ten, that is, a collection of trees each having ten dots connected by lines. The well-known blackboard problem in the movie poses a formidable challenge, especially for larger trees having twenty or thirty nodes. It would require an extremely large blackboard to draw all the trees, as well as to erase those deemed redundant or incorrect. This paper explains a SAS® solution for generating Homeomorphically Irreducible Trees of order N.
AD12 : Using SAS® ODS EXCEL Destination with SAS University Edition® to send graphs to Excel
William E Benjamin Jr, Owl Computer Consultancy LLC
Students now have access to a SAS® learning tool called SAS University Edition®. This online tool is freely available to all for non-commercial use, which means it is essentially a free version of SAS that can be used to teach yourself, or someone else, how to use SAS. Since a large part of my body of writings has focused upon moving data between SAS and Excel, I thought I would take some time to highlight the tasks that permit movement of data between SAS and Excel using the SAS University Edition software. This paper focuses on sending graphs to Excel using the new ODS EXCEL destination.
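The core pattern is compact; this sketch sends one scatter plot to a worksheet, assuming the University Edition shared-folder path (adjust to your setup):

```sas
/* Route graph output to an .xlsx workbook via the ODS EXCEL destination. */
ods excel file='/folders/myfolders/graphs.xlsx'
          options(sheet_name='HeightWeight' embedded_titles='yes');

title 'Height vs. Weight';
proc sgplot data=sashelp.class;
   scatter x=height y=weight;
run;
title;

ods excel close;
```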
AD13 : Using SAS® for Application Programming Interface Requests
Mike Jadoo, SAS user
Application Programming Interface (API) methods of requesting data have been around for some time. Their primary users are front-end web developers who wish to use these data sets for charts, tables, and maps. However, this method of data request can also be useful for data processing and analysis, since fewer steps are needed in the data production process. In this paper we will discover how an API request can remove steps from the data production process and how to make a request from major statistical data producers. Moreover, some useful tips about using this method will be revealed.
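A request along these lines shows the basic PROC HTTP plus JSON-engine pattern; the endpoint URL is a placeholder, not one of the paper's actual sources:

```sas
filename resp temp;

/* Retrieve a JSON payload from a (hypothetical) statistics API. */
proc http
   url="https://api.example.org/series?id=ABC123"
   method="GET"
   out=resp;
run;

/* The JSON libname engine (SAS 9.4M4+) turns the response into data sets. */
libname apidata json fileref=resp;

proc print data=apidata.alldata(obs=10);
run;
```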
AD14 : Developing Standardized Clinical Review Tools Using Shiny In R
Jimmy Wong, FDA
At FDA/CDER's Office of Biostatistics, development of standardized, interactive tools for reviewers has begun with the implementation of routine analyses and programming tasks in the Shiny framework. Shiny is an R library that allows statistical programmers to develop web applications within the realm of R, either building from existing base code or starting from scratch. Shiny applications are inherently interactive, with controls for users to set inputs and outputs that display the end results. Along with Shiny, other relevant R libraries can be utilized to enhance the user experience. This approach supports the office's goals of increasing efficiency in clinical reviews and lightening the workload of reviewers, who are required to perform such routine work. Examples of the Shiny applications currently under development include producing visualizations for patient-reported outcomes data, analyzing patient perception of their health status in psychometrics studies, and generating subgroup forest plots. This presentation will focus on the significance of standardized tools in clinical reviews, with case study examples that illustrate the development process of such tools. The intended audience should have at least a basic to intermediate knowledge of R.
AD15 : Excel-VBA Tool to Auto-Create Validation log and Review Form using List of TLF's.
Balaji Ayyappan, inVentiv Clinical
When a clinical study starts, we begin with the list of reports (tables, listings, figures, and appendices) to be created for interim analyses (such as BDR, DMC, and SRT deliveries) and the final CSR delivery. Creating and maintaining the Validation Log and Review Form documents is vital to the submission process. We created a tool to auto-create the Validation Log and Client Review Form. The Validation Log file has pre-defined columns for the corresponding TLFs, where programmers/statisticians enter their work status, comments, validation comments, initials, and dates as work progresses. In the Review Log sheet, clients fill in their comments, issues/solutions discussed during review meetings, initials, and dates. Using this tool we can also create the framework for the shell development document by outputting the list of reports to a Word file with the corresponding table numbers and titles, with links that are used to create the TOC section automatically. This tool helps create these documents in a time-efficient way and avoids manual error. It was developed using Excel-VBA technology.
AD16 : Migration to SAS Grid: Steps, Successes, and Obstacles for Performance Qualification Script Testing
Amanda Lopuski, Chiltern
Yongxuan Mike Tan, Chiltern
Through detailed planning and preparation activities, we set our sights on a successful migration of our global macros and UNIX utilities to a Linux based system on SAS Grid. But how did our planning and preparations fare when it was go time? In this paper we will reflect back on the steps taken to prepare for sandbox and formal testing of our performance qualification macro and utility scripts. We will share the successes and obstacles faced in this new environment during sandbox testing, the methods we developed to mitigate issues as they occurred, and how we prepared for clean execution in the formal testing environment. Then, we will share feedback from our testing activities in the formal validation environment and conclude with takeaways from this experience. This paper will be useful to any group with sights similar to ours: migration to SAS Grid. It will provide valuable, first-hand insight into the migration with takeaway information to improve the process for others.
AD17 : Metadata integrated programming
Jesper Zeth, Novo Nordisk A/S
Jan Skowronski, Novo Nordisk A/S
With the growing complexity of pharmaceutical projects it is becoming increasingly relevant to address how we are programming, and how we ensure that our programming is as flexible as possible with regard to implementing changes and re-using programs across projects. It is essential to be able to implement requested changes as fast as possible, with certainty that the changes are implemented everywhere needed, and that we are able to scale changes across projects, trials, and subjects. One of the important tools that can be used in this context is integrating metadata in both data and output programming. When designed properly, the use of metadata in programming provides the ability to implement a variety of changes across datasets and outputs by changing only the actual metadata, making changes both very fast and uniform. The purpose of this paper is to illustrate how metadata can be integrated in both data and output programming tasks. The examples in the paper are based on experience acquired as part of a Novo Nordisk submission team. Metadata was to a large extent used in all data and output programming, and a number of tools were developed using these metadata. Finally, the paper elaborates on lessons learned from the chosen metadata model and the challenges that were experienced in the process.
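One common implementation of the idea, sketched here with made-up metadata column names and a hypothetical reporting macro, is to let a metadata data set generate the macro calls:

```sas
/* Each metadata row describes one output; CALL EXECUTE queues a macro
   call per row, so adding or changing outputs means editing metadata
   only. META.OUTPUTS and %make_output are illustrative assumptions. */
data _null_;
   set meta.outputs;
   call execute(cats('%nrstr(%make_output)',
                     '(data=', dataset,
                     ', out=', outname, ')'));
run;
```

The %NRSTR wrapper defers macro execution until after the DATA step finishes, which keeps the generated calls running in a predictable order.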
AD18 : Leveraging Centralized Programming Resources Using SAS Integration Technologies
Tony Chang, Global Statistical Programming, Amgen Inc
Remote access from multiple locations to centralized SAS services can greatly benefit the globalization of the pharmaceutical and bio-tech industry. SAS Integration Technologies allows client applications to interact with SAS services from any remote site where different systems and infrastructure may be used. This paper describes a custom client application that we developed using SAS Integration Technologies. The application was used by a subsidiary company to run medical dictionary coding for their clinical study data by leveraging SAS modules that invoke Amgen's centralized coding system. The tasks accomplished by our client application are explained in this paper, including dynamically generating SAS programs in the client application, running SAS programs on the remote server, and transferring data sets between the client machine and the SAS server. The benefits of a client application using SAS Integration Technologies are also discussed.
AD19 : SAS application to automate a comprehensive review of the define.xml and all of its components
Walter Hufford, Novartis Pharmaceuticals
Vincent Guo, Novartis Pharmaceuticals
Mijun Hu, Novartis Pharmaceuticals
The define.xml is a large electronic document comprised of many different but interrelated components such as the blankcrf.pdf, the Data Reviewer's Guide, and metadata. Further complicating the situation, most electronic submissions contain several studies plus an SCS and an SCE, each of which requires its own define. Reviewing the define.xml to ensure consistency, accuracy, and completeness within a single define as well as across defines is both time consuming and resource intensive (and often mind-numbing if you have a large submission with many defines to review). Automating review of the define.xml can be achieved with a few simple, easy-to-develop SAS® macros. The result is a much quicker review requiring substantially fewer resources. In addition, findings are noted in a standardized manner, allowing for quicker issue resolution and tracking. We will describe our define.xml review tool in detail and provide code snippets from our macros which should allow you to implement a similar tool with little effort.
AD20 : Challenge for Dramatic Streamlining of TLFs Creation Process with SAS and Java
Yohei Takanami, Takeda Pharmaceutical Company, Ltd.
Fumihiro Yamasaki, Takeda Pharmaceutical Company, Ltd.
Creating tables, listings and figures (TLFs) and conducting QC/verification of them in clinical trials requires a large amount of time and effort. In this paper, we will introduce a GUI-based system built with SAS and Java that enables users to generate over 90% of TLFs in typical clinical trials. The system has been successfully implemented in the actual TLF creation process at Takeda in Japan, which has resulted in reduced workload, time, and cost while maintaining high quality. Key factors and considerations for streamlining the TLF creation process will also be discussed.
AD21 : Integration of multiple files into a study report using Word VBA macros
Heather Wood, DCRI
Jack Shostak, DCRI
This paper provides an overview of a report integration process which uses Word VBA macros and takes multiple individual RTF/DOC/DOCX files that have been produced by SAS programs and compiles them all into a study report. The process begins with a TXT file which lists the names and paths of all the outputs that are desired in the study report, in the order in which they need to appear. The report integration VBA macros use this TXT file to identify and process each output for the study report. First, the section breaks which SAS PROC REPORT puts at the end of every page of an output are replaced with page breaks. Then, the outputs are all compiled together with a new section break separator between each output. The result is a study report compiled from several outputs with one section break between each output in the report, rather than one section break between each page of the report. After the report is compiled the report integration macros will update the page numbering and create and append a Table of Contents. Finally, bookmarks are added and a bookmarked PDF version of the study report is created. This paper is for anyone who creates and compiles outputs for study reporting, who uses SAS version 9 or above. Some familiarity with Visual Basic for Applications may help but is not necessary.
AD22 : A novel method to track mapping of all CRF variables into SDTM datasets.
Aishhwaryapriya Elamathivadivambigai, Seattle Genetics
A common approach to ensuring that all CRF variables are mapped into SDTM datasets is to manually compare annotated CRF pages with mapping documents. However, this approach is highly time consuming, error prone, and tedious, especially when new variables are added to CRFs as the study progresses. Hence, a user-friendly automated approach that could evaluate whether all CRF variables are successfully mapped to SDTM domains is highly desirable. In an attempt to provide a solution, this paper introduces a method that can automatically evaluate whether or not all CRF variables are properly mapped to SDTM datasets and report to the user a list of CRF variables that are yet to be mapped. This approach captures all the CRF variables from the CRF document and compares each variable with the SDTM program log files. The programming techniques used for this approach and a working example will be described in this paper.
AD23 : Supporting the CDISC Validation Life-Cycle with Microsoft Excel VBA
Eric Crockett, Chiltern International
Clinical research is increasingly based on standardized (CDISC) data. Evaluating whether the data conforms to the applicable CDISC standards is required and is often an iterative process done over the course of a study. Documenting the explanations for issues that cannot be resolved and tracking the trends in conformance findings over the life-cycle of a study can easily turn into a burdensome manual process repeated with each Pinnacle 21 report. Utilizing Microsoft Excel Visual Basic for Applications (VBA) macro code that will be provided and discussed, successive validation reports can be compared and explanatory comments can be migrated. Leveraging this automated application can dramatically reduce the time spent tracking, evaluating and documenting conformance findings.
AD24 : SAS Macro for Derivation of Best Overall Response per RECIST v. 1.1
Bob Zhong, Johnson & Johnson
Jiangfan Li, Johnson & Johnson
Hong Xie, Johnson & Johnson
Peter De Porre, Johnson & Johnson
Kenneth Maahs, Johnson & Johnson
Kyounghwa Bae, Johnson & Johnson
Response Evaluation Criteria in Solid Tumors version 1.1 (RECIST 1.1) has been quickly adopted since its publication in 2009 as an efficacy assessment standard in oncology clinical trials in solid tumors. However, the rules for determining best overall response (BOR) require practical considerations and certain detailed clarifications to handle real-world data. We addressed some ambiguities in RECIST 1.1 and provided the detailed handling required to develop a SAS macro for an accurate derivation of BOR. BOR is derived from time-point overall responses of CR, PR, SD, PD, NE, and Unknown. To derive BOR, non-critical responses that do not affect the derivation of BOR are ignored first. Then, only the first three time-point overall responses are included. Complicated cases such as PR-SD-PR and unconfirmed CR or PR are handled in more detail. Our macro can be applied to clinical trials with or without the requirement of CR/PR confirmation, and in both interim analyses (potential future data available) and the final analysis (no future data). This macro allows for both accurate and timely determination of best overall response via computer programming.
AD25 : Controlled Terminology Without Excel
Mike Molter, Wright Ave Partners
The world of clinical data standards brought with it plenty of rules about metadata. This domain must have these variables, those variable values must be no longer than 8 characters, these other variables are restricted to certain values, etc. What standards came up short on was how/where these rules could be stored and maintained, as well as any processes for implementing them. Nowhere is this more evident than controlled terminology. Standard metadata such as NCI controlled terminology has been made available through Excel, and the rules for implementing these standards are stored in PDF documents. Much of the industry has defaulted to Excel for the collection and maintenance of study metadata. While on the surface, copying and pasting codelists from the NCI spreadsheet to a study spreadsheet may seem straightforward, Excel doesn't have the capability to enforce rules such as codelist extensions, sponsor-defined codelists, allowable data types, and subsetting a codelist multiple ways for multiple variables. Additionally, controlling access to a centralized Excel file has its challenges. This paper presents a simple controlled terminology application which features storage and maintenance through a graph database. This paper will show similarities and differences between graph databases and SAS data sets. A web interface that guides a user through defining controlled terminology for a study, following CDISC rules, without copying and pasting, and without Excel, will be demonstrated. Users will discover the value of standards stored in a database and controlled access through web forms.
AD26 : Stop waiting - Get notification email at the end of SAS® code execution using SAS® EMAIL Engine
Nirav Darji, GCE Solutions pvt ltd
The DATA step and statements like FILENAME and FILE can be used to send custom e-mails automatically. This utility can help programmers in cases where SAS programs take a long time to execute and we are left waiting for completion. It can also help when teams work in different time zones or shifts, so that the team in the other time zone or on the next shift is notified once execution completes. We will see how to send email with or without attachments (HTML, Excel, SAS files, etc.), how to send personalized email, and how to use a macro to send email only when certain conditions are met.
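The FILENAME EMAIL engine the abstract mentions might be used along these lines. This is a minimal sketch, not the author's code; the recipient address and attachment path are placeholders, and an SMTP host is assumed to be configured (e.g. via the EMAILHOST system option).

```sas
/* Sketch: send a notification email when a SAS job finishes.
   Address and log path are placeholders. */
filename notify email
    to="programmer@example.com"
    subject="SAS job finished"
    attach="/myproject/logs/run1.log";   /* optional attachment */

data _null_;
    file notify;
    put "The nightly SAS job completed at %sysfunc(datetime(), datetime20.).";
run;

filename notify clear;
```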
Beyond the Basics
BB01 : One Project, Two Teams: The Unblind leading the Blind
Kristen Harrington, Rho, Inc.
In the pharmaceutical world, there are instances where multiple independent programming teams exist to ensure blinded treatments are maintained by the appropriate parties. For the purposes of this discussion, blinded corresponds to fake data and unblinded corresponds to actual data. Within these projects, blinded programmers use temporary fake data to create programs which produce both blinded and unblinded results. To ensure the blind is maintained and ethics are upheld, only the blinded programming team produces and modifies the programs. While robust programming is key, another major contributing factor for success is communication. This discussion will explore the process of initializing a project supported by blinded and unblinded teams, successful communication techniques when real data comes into play, and ways to effectively troubleshoot validation issues without providing unblinding information.
BB02 : Making Documents 'Intelligent' with Embedded Macro Calls, DOSUBL and Proc STREAM: An example with the CONSORT Flow Diagram
Joseph Hinson, inVentiv Health
Documents can be considered 'intelligent' if they can self-process parts of themselves. One way is to embed them with macro elements. Such macro-laden documents can then be placed on the SAS Input Stack for macro processing. However, documents like RTF, XML, and HTML tend to have extraneous codes that would violate SAS syntax if placed on the Input Stack as is. Placing non-SAS documents on the Input Stack therefore requires the STREAM procedure, which, by bypassing the SAS compiler, allows the document's codes to pass through intact alongside the macro elements. Once the Macro Facility has resolved all the macro elements, the document is streamed back to a file location. Having document templates embedded with macro variables is nothing new, but until now, their role had been limited to text substitution. By embedding documents with actual macro calls, self-processing of documents becomes possible. In such cases, the DATA steps and procedures within the macros need to be wrapped inside a DOSUBL function and called with %SYSFUNC, to force computation and prevent SAS code from being streamed out to the output file. Such an approach is ideal for documents like the Consolidated Standards Of Reporting Trials (CONSORT) Flow Diagram, which depicts the progress through the phases of a clinical trial (enrolment, intervention allocation, follow-up, and data analysis) by showing the counts of study participants for each phase. With an intelligent CONSORT template, the counts are replaced by macro calls such that the flow diagram undergoes self-processing when passed through PROC STREAM.
BB03 : The REPORT Procedure and ODS Destination for Microsoft Excel: The Smarter, Faster Way to Create First-Rate Excel Reports
Jane Eslinger, SAS Institute
Does your job require you to create reports in Microsoft Excel on a quarterly, monthly, or even weekly basis? Are you creating all or part of these reports by hand, referencing another sheet containing rows and rows and rows of data? If so, stop! There is a better way! The new ODS destination for Excel enables you to create native Excel files directly from SAS. Now you can include just the data you need, create great-looking tabular output, and do it all in a fraction of the time! This paper shows you how to use PROC REPORT to create polished tables that contain formulas, colored cells, and other customized formatting. Also presented in the paper are the destination options used to create various workbook structures, such as multiple tables per worksheet. Using these techniques to automate the creation of your Excel reports will save you hours of time and frustration, enabling you to pursue other endeavors.
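As a rough illustration of the combination the abstract describes, PROC REPORT can be routed to the ODS EXCEL destination to produce a native .xlsx file. This is a minimal sketch, not the author's code; the output path is a placeholder and SASHELP.CLASS stands in for real data.

```sas
/* Sketch: a PROC REPORT table written to a native Excel workbook. */
ods excel file="/myproject/output/report.xlsx"
    options(sheet_name="Demographics" embedded_titles="yes");

title "Class Listing";
proc report data=sashelp.class;
    column name sex age height weight;
    define name / display "Name";
    define age  / display "Age (years)";
run;

ods excel close;
```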
BB04 : Reporting Non-Printable and Special Characters for Review in Excel
Abhinav Srivastva, Gilead Sciences
Data in clinical trials can be transmitted in various formats such as Excel, CSV, tab-delimited, ASCII files or via any Electronic Data Capture (EDC) tool. A potential problem arises when data has embedded special characters or even non-printable characters which affects all downstream analysis and reporting, in addition to being non-compliant with CDISC standards. The paper will briefly present some ways to identify these characters but the emphasis will be on creating an excel report which summarizes these in a way that can be easily reviewed and appropriate action can be planned. Creating a summary report as this can be used by the programmers as initial steps in data cleaning activities with each data transfer. Some of the features of the excel report include traffic-lighting effects, hyperlinks and tool-tip for providing additional information.
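One simple way to identify such characters, as the abstract hints, is a Perl regular expression that flags anything outside the printable ASCII range. This is only an illustrative sketch under assumed data set and variable names (WORK.RAW, AETERM), not the author's approach.

```sas
/* Sketch: flag records containing non-printable or non-ASCII characters. */
data checked;
    set work.raw;
    /* 1 if any character falls outside printable ASCII (x20-x7E) */
    bad_fl = prxmatch('/[^\x20-\x7E]/', aeterm) > 0;
run;
```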
BB05 : I've Got to Hand It to You; Portable Programming Techniques
Art Carpenter, CA Occidental Consultants
As technology expands, we have the need to create programs that can be handed off - to clients, to regulatory agencies, to parent companies, or to other projects, and handed off with little or no modification by the recipient. Minimizing modification by the recipient often requires the program itself to self-modify. To some extent the program must be aware of its own operating environment and what it needs to do to adapt to it. There are a great many tools available to the SAS® programmer, which will allow the program to self-adjust to its own surroundings. These include location-detection routines, batch files based on folder contents, the ability to detect the version and location of SAS, programs that discern and adjust to the current operating system and the corresponding folder structure, the use of automatic and user defined environmental variables, and macro functions that use and modify system information. Need to create a portable program? We can hand you the tools.
BB06 : Good Artists Copy; Great Artists Steal (Implement Pseudo R and jQuery-Style Code in Base SAS without Installing R or Other Applications)
Hui Liu, MSD
There are syntax sugars in other languages that we dream of using in SAS. Why? Because they are more expressive. To borrow these syntax sugars, we usually install R or other languages and use them in the background, passing X commands to them or relying on modern SAS modules such as PROC IML to link them together. The downside of these approaches is that the code is no longer pure SAS code, and when we pass SAS code mixed with R code to the FDA, they have to use a computer with both SAS/IML and R installed to regenerate the results. We aren't supposed to carry a cow everywhere when we only want to drink a cup of milk. So, our goal is to STEAL these syntax features from other languages and implement them in pure Base SAS/macro as pseudo code. The philosophy of pseudo code is "what you see is what you get, almost". Examples of syntax we want to steal (including but not limited to): a. An R-like "apply" function to execute BY-group analysis with macros (without a BY option inside / without source code): %_(%str( apply(sashelp.class,sex,MacroWithNoByOption(sashelp.class)))); b. jQuery-style wildcards to manipulate variables: Data new; %_(%str( set old; /*Impute value for variables ended with "TERM"*/ if $("*term") ='' then $("*term") ='N/A'; /*drop all variables look like ae*flag*/ drop #("ae*flag"); /*rename all variables:* to old_* */ rename $("*")=old_$("*"); /*inherit format/label from variable in other dataset as objects oriented programming style*/ format money %_($(sashelp.cars.invoice.format)); Label size=%_($(sashelp.cars.engineSize.label)); )); run;
BB07 : Simplifying Your %DO Loop with CALL EXECUTE
Arthur Li, City of Hope
One often uses an iterative %DO loop to execute a section of a macro repetitively. An alternative method is to utilize the implicit loop in the DATA step with the EXECUTE routine to generate a series of macro calls. One of the advantages of the latter approach is eliminating the need for indirect referencing. To better understand the use of CALL EXECUTE, it is essential for programmers to understand the mechanism and the timing of macro processing in order to avoid programming errors. These technical issues will be discussed in detail in this paper.
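The CALL EXECUTE pattern the abstract contrasts with a %DO loop might look like the following sketch: one macro call is generated per observation of a driver data set, so no indirect (&&var&i) referencing is needed. The %report macro here is hypothetical.

```sas
/* Sketch: generate one call to a (hypothetical) %report macro per row.
   %NRSTR defers macro execution until after the DATA step finishes. */
data _null_;
    set sashelp.class;    /* driver data set */
    call execute(cats('%nrstr(%report)(name=', name, ')'));
run;
```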
BB08 : Using Hash tables for AE search strategies
Vinodita Bongarala, Seattle Genetics
Liz Thomas, Seattle Genetics
As part of adverse event safety analysis, adverse events of special interest (AESI) are identified within a study by a variety of means. Most commonly, AESIs can be identified with standardized MedDRA Queries (SMQs), MedDRA System Organ Classes (SOCs), or a customized collection of MedDRA preferred terms (PTs) or lower level terms (LLTs). Analysis of AESIs is similar, regardless of the search strategy used to identify them. Identifying AESIs may involve merging the AE dataset with multiple datasets on keys of different levels (LLT, PT or SOC). Using a hash table as opposed to multiple sorted or unsorted merges can simplify both the code and execution time. Faster processing with SAS® hash tables has considerable utility in large safety databases. The efficiency gain of using hash tables over other methods of merging (both sorted and unsorted) has been well characterized. This paper demonstrates how to use hash tables to identify AESIs and two methods of structuring the resulting data for rapid subsequent tabulation and analysis, with benchmark comparison to using unstructured SQL merges to obtain the same results.
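The hash lookup the abstract describes might be sketched as follows, flagging AEs whose preferred term appears in an AESI term list. Data set and variable names (WORK.AESI_TERMS, WORK.AE, AEDECOD) are illustrative, not the authors' code.

```sas
/* Sketch: flag adverse events of special interest via a hash lookup
   instead of a sort-and-merge. */
data ae_flagged;
    if _n_ = 1 then do;
        declare hash aesi(dataset: "work.aesi_terms");
        aesi.defineKey("aedecod");
        aesi.defineDone();
    end;
    set work.ae;
    aesi_fl = (aesi.find() = 0);   /* 1 if the PT is in the AESI list */
run;
```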
BB09 : Expansion of Opportunities in Programming: DS2 Features and Examples of Usage Object Oriented Programming in SAS
Serhii Voievutskyi, Experis Clinical
DS2 is a SAS programming language that allows us to process data in ways that we could not in the traditional DATA step. PROC DS2 is a powerful instrument for advanced problem solving and data manipulation. It also includes additional data types, ANSI SQL types, programming structure elements, threaded processing, and user-defined methods. The DS2 language is based on object-oriented concepts, which allows the programmer to use packages and methods. This presentation will demonstrate how to modify the traditional DATA step by using DS2 to resolve different issues, and how object-oriented programming can be used in statistical programming.
BB10 : A Macro to carry values through observations forwards or backwards over null values within a BY group or a SAS® dataset.
Timothy Harrington, SAS Programmer
This paper explains a macro that addresses the problem of having to impute missing values in a SAS® data set column with prior (or following) values, whether the data type is numeric or character, and whether over a whole data set or within BY groups. This is a relatively common situation in pharmaceutical programming, and one that is often cumbersome both to program and to verify. Also described is the structure of the macro, which uses DICTIONARY.COLUMNS to determine the data type, carries values over one or multiple successive observations with missing or user-defined 'null' amounts, and, when the column is numeric, calculates the differences between quantities in successive observations. Included in this paper are examples of practical applications, such as imputing missing dates and times of concentration sampling events since the first dose or most recent dose in PK data.
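The forward carry logic such a macro automates can be sketched for a single numeric column within BY groups as below. Data set and variable names (WORK.PK, USUBJID, CONC) are illustrative; the paper's macro generalizes this across types, directions, and columns.

```sas
/* Sketch: last-observation-carried-forward for one numeric variable. */
data pk_locf;
    set work.pk;
    by usubjid;
    retain _last_conc;
    if first.usubjid then _last_conc = .;   /* reset per subject */
    if conc ne . then _last_conc = conc;    /* remember last non-missing */
    else conc = _last_conc;                 /* impute forward */
    drop _last_conc;
run;
```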
BB11 : SAS Programmer's Guide to Life on the SAS Grid
Eric Brinsfield, Meridian Analytics
With the goal of utilizing computing power more efficiently and centralizing support, many organizations are moving all or large portions of their SAS computing to a SAS Grid platform. This often forces many SAS programmers to move from Windows to Linux, from local computing to server computing, and from personal resources to shared resources. This shift can be easy if you make slight changes in behavior, practice and expectation. This presentation offers many suggestions for not only adapting to the SAS Grid but taking advantage of the parallel processing for longer running programs. Sample programs will be provided to demonstrate best practices for developing programs on the SAS Grid. Program optimization and program performance analysis techniques will be discussed in detail.
BB12 : Programming LYRIC Response in Immunomodulatory Therapy Trials
Yang Wang, Seattle Genetics
The LYmphoma Response to Immunomodulatory therapy Criteria (LYRIC) is a newly proposed tumor response criteria for immunomodulatory therapies such as immune checkpoint inhibitors. The current Lugano criteria work well for traditional chemotherapeutic regimens and chemoimmunotherapeutic regimens. They include complete remission (CR), partial remission (PR), complete remission unconfirmed (CRu), stable disease (SD), relapsed disease (RD), and progressive disease (PD). However, lymphoma therapy with immune mechanisms may cause tumor flares. These tumor flares can be associated with clinical and imaging findings suggesting progressive disease (PD). Since tumor flares generally occur during the first two or three weeks of treatment, without a more flexible interpretation it is possible that some patients could be prematurely removed from a treatment they are benefiting from, leading to underestimation of the magnitude of the clinical benefit of the test agent. The LYRIC criteria introduce a new term, "Indeterminate Response (IR)". Adding this new term allows flare/pseudo-progression to be distinguished from true PD by biopsy or subsequent imaging. This paper explains how the IR term data is collected and derived. It also discusses the data collection and programming challenges.
BB13 : Harnessing the Power of the Manifest File in the SAS® Life Sciences Analytic Framework
Kevin Clark, SAS Institute
The job manifest file is a very important but poorly understood file to many users of the SAS® Life Sciences Analytic Framework (LSAF). Users often ask the consulting team at SAS why this file is created and how it can be used. The purpose of this paper is to describe what types of information are contained within the manifest file and demonstrate a step-by-step process for using the manifest file to re-create a job. In LSAF, a job refers to a set of one or more SAS programs, the inputs required to run these programs, and the locations in which to store the outputs of the programs. Jobs can be utilized by users to run tasks which must be performed on a regular basis. By parsing the manifest file for information about a job and its associated tasks, inputs, outputs, and parameters, it is relatively simple to re-create the job. The ability to re-create a job could be especially useful if a company is working on a submission for a clinical trial, and the FDA requests a certain set of displays from an earlier analysis of the data. If the job which created these analyses no longer exists or has been overwritten, the job manifest files can be used to re-create the job needed to run the analyses.
BB14 : A SAS®sy Study of eDiary Data
Amie Bissonett, inVentiv Health Clinical
Many sponsors are using electronic diaries (eDiaries) to allow subjects to enter study data themselves, such as daily events, concomitant medications taken, and symptoms that occur. Depending on the study, subjects may enter data at varying time increments, from weekly or monthly history up to a daily account of activities and events. The timeliness of the data entry as well as the cleanliness of the data make a big impact on deriving SDTM and ADaM data sets and how the analysis will be performed. This paper goes through different scenarios and gives some tips to help from data cleaning, setting up the variable derivations, and programming the analysis data sets.
BB15 : A Case of assessing adverse events of interest based on their grade changes
Sriramu Kundoor, Seattle Genetics
Adverse events play an important role in assessing the safety of a drug. While shift tables are commonly used to describe laboratory data and determine shifts in toxicity grades, a similar approach can also be taken to determine AE grade shifts when describing AE data. These shift tables provide insight into whether a subject is recovering/resolving or recovered/resolved, and the KM analysis provides further insight into the median duration of improvement or resolution, which in turn helps the safety assessment of the product. In order to do this, we have to filter the data for the adverse events of special interest and find the shift from the chosen (maximum) grade to the subsequent lowest grade. With suitable examples, this paper describes an approach to creating AE shift tables and the KM estimate for the median duration of improvement or resolution. This paper also discusses various challenges and complex scenarios encountered and an approach to overcoming the issues (and the logic behind them).
BB16 : Don't Get Lost! A High-efficiency, Low-tech Solution for Navigating Your Department's Many-Many-Many SOPs and Guidance Documents
Michael Hagendoorn, Amgen, Inc.
Tim Yerington, Amgen, Inc.
For programmers and biostatisticians alike, producing the statistical analyses for a clinical study report for any trial presents a treacherous odyssey through the choppy waters of dozens upon dozens of ever-changing operating procedures, guidance documents, best practices, templates, and silently understood general knowledge. To avoid needing to keep all this content loaded into our brain's RAM at all times, functions may create document lists or searchable-title trackers which are typically topic-oriented to help staff find documents related to a specific topic. While this goes a long way, it does not provide the answer to common questions like, "my database locks 2 months from now; which section in which of these documents should I review *today* to get ready for that?" - nor does it ensure everyone keeps up to date with the myriad of updates made to this impressive guidance library on an ongoing basis. As a result, teams may be working with outdated guidance; incorrect processes; or inefficient, long-since improved best practices, posing risks to functional compliance, quality, and efficiency. We will present a simple tool called the Statistical Project Plan which, at its core, is a centrally managed, departmental, interactive, scalable checklist that guides statisticians and programmers in performing just the right steps at just the right time using just the right sections in our many SOPs, manuals, template libraries, and guidelines from study start-up all the way through close-out.
BB17 : Writing Efficient Queries in SAS Using PROC SQL with Teradata
Mina Chen, Roche
The emergence of big data, as well as advancements in data science approaches and technology, is providing pharmaceutical companies with an opportunity to gain novel insights that can enhance and accelerate drug research and development. The pharmaceutical industry has seen an explosion in the amount of available data beyond that collected from traditional, tightly controlled clinical trial environments. Investing in data enrichment, integration, and management will allow the industry to combine real-time and historical information for deeper insight as well as competitive advantage. On the other hand, pharmaceutical companies are faced with unique big data challenges, collecting more data than ever before. There has never been a greater need for an efficient approach to data analysis. Together, SAS and Teradata provide a combined analytics solution that helps with big data analysis and reporting. This paper will introduce, based on real study experience, how to connect to Teradata from SAS and how to conduct analysis with SQL more efficiently.
BB18 : Clinical Data Visualization using TIBCO Spotfire® and SAS®
Ajay Gupta, PPD Inc
In the pharmaceutical/CRO industries, we often get requests from stakeholders for real-time access to clinical data so they can explore the data interactively and gain a deeper understanding. TIBCO Spotfire 7.6 is an analytics and business intelligence platform which enables data visualization in an interactive mode. Users can further integrate TIBCO Spotfire with SAS (used for data programming) and create visualizations with powerful functionality, e.g. data filters and data flags. These visualizations can help the user self-review the data in multiple ways and will save a significant amount of time. This paper will demonstrate some basic visualizations created using TIBCO Spotfire and SAS with raw and SDTM datasets. It will also discuss the possibility of creating quick visualizations to review third-party vendor (TPV) data in formats like Excel and CSV.
BB19 : Transition to an Efficient Efficacy Programmer
Siddharth Kumar Mogra, GCE Solutions Inc
During the initial years of clinical SAS programming, a SAS programmer is usually assigned to creating listings and safety outputs. Over the course of time the programmer becomes confident and comfortable creating summary statistics, shift tables, and descriptive statistics. But often a programmer who does not have a statistical background is not as comfortable taking on efficacy outputs or analysis dataset creation tasks. This paper will describe the challenges faced by a programmer with a non-statistical background and lists the steps and process for approaching the programming of efficacy outputs. The paper will discuss the following: 1. Challenges faced. 2. The process: table shell, SAP, reading SAS output, identifying the correct output data set and variables, communication, decimals - attention to detail. 3. Example: PROC MIXED.
BB20 : Automating Title and Footnote Extraction Using Visual Basic for Applications (VBA) and SAS®
Tony Cardozo, Spaulding Clinical
When developing tables, figures and listings (TFLs) for clinical trial data, we have to ensure we are providing outputs that are consistent with the Statistical Analysis Plan (SAP) shells. Titles and footnotes, among other key elements, need a high level of accuracy to minimize review cycles with the statistician and overall development time. Depending on your current process, this may involve copying and pasting every title and footnote into a centralized macro or individual programs. This may be done by multiple programmers and can be a time-consuming process that leaves opportunities for human error. When the TFL shells are created in Microsoft Word, title and footnote extraction can be done programmatically utilizing Visual Basic for Applications (VBA) and SAS®. In this paper we'll discuss what VBA is, and how to set it up and program title and footnote extraction. We'll also cover how to process the text streams to catch special characters (such as µ and other non-ASCII symbols) and provide SAS® Output Delivery System (ODS) friendly equivalents. Additionally, we will go over some assumptions made about the shells and how post-processing with SAS® can make the whole title/footnote process from shells to TFLs dynamic and automated.
BB21 : Beyond IF THEN ELSE: Techniques for Conditional Execution of SAS® Code
Josh Horstman, Nested Loop Consulting
Nearly every SAS® program includes logic that causes certain code to be executed only when specific conditions are met. This is commonly done using the IF-THEN/ELSE syntax. In this paper, we will explore various ways to construct conditional SAS logic, including some that may provide advantages over the IF statement. Topics will include the SELECT statement, the IFC and IFN functions, the CHOOSE and WHICH families of functions, as well as some more esoteric methods. We'll also make sure we understand the difference between a regular IF and the %IF macro statement.
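Two of the alternatives the abstract lists can be sketched briefly: the SELECT statement for multi-branch logic and the IFN function for inline numeric conditions. This is an illustrative sketch using SASHELP.CLASS, not the author's examples.

```sas
/* Sketch: SELECT and IFN as alternatives to IF-THEN/ELSE. */
data grades;
    set sashelp.class;
    select;
        when (age < 13) group = "child";
        when (age < 16) group = "teen";
        otherwise       group = "older";
    end;
    /* IFN returns one of two numeric values based on a condition */
    tall = ifn(height > 60, 1, 0);
run;
```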
Career Planning
CP01 : Creating a Winning Resume Reflecting Your Skills and Experience With Honesty and Integrity.
Kelly Spak, Chiltern International Ltd.
Kelly Spak, Senior Recruiter, Chiltern International Ltd., King of Prussia, PA. A great resume is important to securing the next step in your career. This presentation will identify the dos and don'ts of your resume submission, which will help set you above the competition. Topics will include personal data, education, certifications, present and previous employment (including ways to list employer/contractor roles), select accomplishments, social media, and references.
CP02 : The Road Not Often Taken
Janet Stuelpner, SAS
Is there a simple career path to working in the life science industry? Do you need to study programming in college? Do you need to be a statistics major? What kind of skills can you learn while still at university so that you can work in the life sciences? If you major in political science, can you still work in pharma or biotech? What if you are a history major - will that work? As with anything that you learn while you are in school, it is not always the content that will advance you to your final career. In this paper, I will discuss my path to a successful career as a programmer. You will see how the twists and turns of my life helped me to get where I wanted to go. My story is not unique or unusual. I will talk about how an unusual or unique path can bring you success, no matter what your background may be.
CP03 : What Makes a Candidate Stand Out?
Edward Slezinger, Fred Hutchinson Cancer Research Center - SCHARP
One of the most typical questions interviewers ask is, "What makes you stand out among your peers?" Hiring managers are interviewing many candidates that share the same qualifications you possess, and they want to hire someone that will bring something unique to the table. So what sets you apart? How do you prove to them that you meet their needs and the desire to bring something more to their organization? We will discuss and highlight areas where your talents, skills and "that thing" can open a door to a new and exciting role. We'll also discuss some potential pitfalls to your success to watch out for.
CP04 : The Elusive Data Scientist: Real-world analytic competencies
Greg Nelson, ThotWave
You've all seen the job posting which looks more like an advertisement for the ever-elusive unicorn. They begin by outlining the required skills, which include a mixture of tools, technologies, and masterful "things that you should be able to do." Unfortunately, many such postings begin with restrictions to those with advanced degrees in math, science, statistics, or computer science and experience in your specific industry. They must be able to perform predictive modeling and natural language processing and, for good measure, candidates should only apply if they know artificial intelligence, cognitive computing, and machine learning. The candidate should be proficient in SAS, R, Python, Hadoop, ETL, real-time, in-cloud, in-memory, in-database, and must be a master storyteller. I know of no one who would fit that description and still be able to hold a normal conversation with another human. In our work, we have developed a competency model for analytics which describes nine performance domains encompassing the knowledge, skills, behaviors, and dispositions that today's analytics professional should possess in support of a learning, analytically-driven organization. In this paper, we will describe the model and provide specific examples of job families and career paths that can be followed based on the domains which best fit your skills and interests. We will also share with participants a self-assessment tool where they can see where they stack up!
CP05 : Strategies and decisions for marketing yourself as an independent contractor
Kathy Bradrick, Triangle Biostatistics, LLC
Independent contractors have the responsibility of marketing themselves, in other words doing their own business development. This session will explore how to market yourself, what project aspects to consider, how to make yourself appealing to recruiters and employers, and things to think about as you consider going back to permanent employment.
CP06 : The Road Not Taken: SAS for Pharmacometrics
Vishak Subramoney, Certara Strategic Consulting
Sree Harsha Sreerama Reddy, Certara Strategic Consulting
This paper aims to introduce SAS specialists from diverse backgrounds to the exciting new world of pharmacometrics and explores the limitless possibilities in this cutting-edge field. Pharmacometrics is the science of using mathematical models to integrate knowledge of biology, pharmacology, and physiology to quantify the relationship between exposure and response. Model Based Drug Development (MBDD) holds the key to personalized healthcare and is the clear future of drug research and development. As you can imagine, data is the crux of any scientific endeavour. Preparing high-quality datasets for pharmacokinetic and pharmacodynamic modeling is the primary remit of a PK programmer. Other responsibilities include using innovative ways of data exploration (e.g. graphical analysis) and ensuring the smooth transition of knowledge from SDTM/ADaM datasets to a modeling-ready dataset. For a SAS programmer, pharmacometrics is not the first thing that comes to mind as a career choice. But with a basic understanding of pharmacology, an inclination towards mathematical methodologies, and some hard work, one can easily transition to the specialized role of a PK programmer. Are you ready to take "The Road Not Taken"?
CP07 : Becoming a Successful Manager of SAS® Programmers from an Ex-Programmer's Perspective
Christine Young, Chiltern International
Have you ever wondered how to take that next step from programmer to manager of programmers? Or even if you want to take that step? What are the challenges? What are the rewards? What are some techniques you can carry with you from your programming days to make you a more effective manager? Which do you need to scrap? Here is a look at all of these questions and answers from someone who has been there and successfully done that.
CP08 : Schoveing Series 3: Living by Knowing When to STOP
Priscilla Gathoni, AstraZeneca, Statistical Programming
Would changing the way you prioritize your life help ensure a successful and fruitful life? Would you like to renew the way you live, think, feel, and will? Is the senselessness of demanding more and exhausting yourself in the pursuit of what keeps you trapped in a vicious cycle of "striving and never arriving" keeping you awake? Do you know when to stop and determine what is important and valuable to your life? Do you have daily habits that keep you in a comfort zone that you would like to break? Would you like to stop working and start living? Does parenting make you nervous and question loving? How many of you procrastinate and never have time to start focusing? Are you so consumed in your mind that you take over all conversations and never listen to what the other person is saying? Do you live your life thinking that fate is all you have and nothing else can stop your circumstances from continuing? Is fear guiding you? Are you apprehensive of what will happen tomorrow and have not forgiven the past? This paper will unlock your ability to answer these very important questions. The goal is to start believing in yourself, your success and have the unwavering faith to achieve your life's purpose and unlock your unlimited potential! Everything you ever wished for will be brought forth as you have a harmonious relationship with yourself, the great architect of the universe, and the world. Practice knowing when to stop.
CP09 : Show Me The Job!!!
Richelle Serrano, Clindata Insight
Kevin Lee, Clindata Insight
The movie Jerry Maguire spawned several popular quotations, including "Show Me The Money!", shouted repeatedly in a phone exchange between Rod Tidwell and Jerry Maguire (played by Tom Cruise). It took Jerry Maguire almost a year to find his client (the job seeker) the best job without compromises. Finding the best job is not a short and easy process. This paper will speak to the challenges faced by Biometrics job seekers today and show how landing the best job can be far less painful, and more effective, by partnering with a technical recruiter. The paper begins with what not to fear in a technical resume. Statistical programming hiring managers in pharmaceutical organizations are looking for individuals with well-defined technical capabilities and a solid understanding of the clinical trial process. The paper will show how job seekers can get appropriate career guidance from a recruiter who understands their background and skills, and will discuss key qualifications to showcase in your resume that will motivate hiring managers to take notice. Examples will demonstrate how a resume can be made more powerful and convincing with descriptive and detailed explanations. The paper will also cover how the job seeker should prepare for a successful interview. Lastly, it will present suggestions for what to do now to leverage available tools and resources so you're prepared when the perfect opportunity arises.
CP10 : Career development for SAS programmers in biopharmaceutical industry
Yingqiu Yvette Liu, PA
SAS is a powerful commercial software suite for analytics, business intelligence, and data management, widely used in the biopharmaceutical industry. A career as a SAS programmer in this industry is promising and rewarding. In this paper, the author will illustrate career opportunities and development paths for SAS programmers.
Data Integrity
DA01 : Real Time Clinical Trial Oversight with SAS
Ashok Gunuganti, Trevena
A phase 3 clinical trial is an expensive undertaking with multiple players working together to accomplish study startup, conduct, data collection, and data monitoring. It is of great importance that the study is implemented as specified in the protocol and that the collected data are clean and of good quality. The paper presents a framework of tools developed in SAS, using periodic snapshots of the data available in the EDC system during the conduct phase of the trial, with the intention of providing the study team with real-time feedback on study conduct, timely data entry, and overall data quality. These tools/aids, comprising a combination of Excel workbooks and graphical aids, are set up to run at pre-specified times (weekly) throughout the conduct of the trial. They include: reports on EDC data entry metrics, to make sure data are entered into the database in a timely manner and to help generate queries and clean the data; a customized programmatic edit-check Excel workbook, with embedded links to the EDC system, used as an aid by the data monitors; Excel-based patient lifecycle listings, used as aids by the trial managers for data quality oversight; and custom clinical-event graphical patient profiles and other custom reports.
DA02 : Let's Check the Data Integrity using Statistical Programmer with SAS
Harivardhan Jampala, Chiltern International Pvt Ltd
The success of any clinical trial depends on the accuracy and integrity of the study process and the data produced from the trial. As in any experiment, data plays the central role, and almost everybody involved in a clinical trial generates, maintains, or explains data. Hence, we all agree it's vital that the data is clean and, more importantly, fully utilized to make everyone's job easier and more efficient. A plethora of software packages and tools are employed by various contributors in clinical research across the industry to help make sense of clinical and operational data. The scope and investment in such tools depends on the budget and organizational priorities; some organizations operate with moderate or minimalistic software resources. This paper will explore various ways in which SAS programmers (either statistical programmers or clinical programmers) can help other departments in such organizations. The paper will list some examples of such offerings, sometimes with a sample approach: data listings, CRF tracking metrics, patient profiles (all data points), site summary metrics, patient profiles with specific data points to help CRAs, patient safety summaries, and custom safety narrative templates. This paper also deals with graphical representation of the data before the actual statistical programming starts for tables, listings, and figures. It also describes unique ways of summarizing clinical trials data that make it easy to identify unintentional or intentional errors in data about individual subjects or clinical sites.
DA03 : Conversion of CDISC specifications to CDISC data - specifications driven SAS programming for CDISC data mapping
Yurong Dai, Eli Lilly
Jameson Cai, Eli Lilly
We'd like to introduce a metadata-driven approach utilizing SAS programming techniques for SDTM and ADaM data mapping. Metadata are extracted from the specifications and converted into dataset attributes, formats, variable names, variable order, and sort order for specification implementation in our reference code. This increases the reference code's reusability, efficiency, and consistency between data specifications and output data, and reduces re-work after data specification updates during code development for SDTM mapping and ADaM dataset derivation.
DA04 : Data Change Report
Eric Kammer, Novartis
This paper presents a data change concept, developed using SAS®, for comparing data between data milestones for clinical studies. Changed data are colored GREEN in an output Excel® file, indicating new, deleted, or changed records. While this program was developed on a Unix® system, it is applicable to any environment since it uses a metadata concept. Data can then be compared to see if any significant changes occurred between milestones that could affect safety or other important decisions applicable to analysis, monitoring, or clinical assessment.
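The milestone-comparison logic the abstract describes (classifying records as new, deleted, or changed between two data snapshots) can be sketched independently of the paper's SAS and Excel machinery. The Python sketch below uses invented record structures and a hypothetical key variable; it illustrates the idea only, not the author's implementation:

```python
def classify_changes(old, new, key="SUBJID"):
    """Compare two data snapshots (lists of dicts) keyed by a record ID
    and classify each record as new, deleted, or changed."""
    old_by_key = {r[key]: r for r in old}
    new_by_key = {r[key]: r for r in new}
    changes = {"new": [], "deleted": [], "changed": []}
    for k, rec in new_by_key.items():
        if k not in old_by_key:
            changes["new"].append(k)          # record appeared since last milestone
        elif rec != old_by_key[k]:
            changes["changed"].append(k)      # any field value differs
    changes["deleted"] = [k for k in old_by_key if k not in new_by_key]
    return changes

# Two hypothetical snapshots of the same study data
snap1 = [{"SUBJID": "001", "AGE": 34}, {"SUBJID": "002", "AGE": 51}]
snap2 = [{"SUBJID": "001", "AGE": 35}, {"SUBJID": "003", "AGE": 44}]
print(classify_changes(snap1, snap2))
```

In the paper's workflow, each classified record would then be highlighted (e.g., in green) in the Excel output rather than printed.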
DA05 : Clinical and Vendor Database Harmony; Can't we all just get along?
Brian Armstrong, QST Consultations, Ltd.
Clinical trial data collection is often comprised of multiple databases, in a hub-and-spoke network with the clinical database as the hub and external vendor databases as spokes (e.g. laboratory, electronic diary, electrocardiogram, magnetic resonance imaging, pharmacokinetic, etc.). The submission database is presented as a clean package of datasets with associated define documentation. However, behind the scenes, the creation of that neat and tidy submission package is often complicated by data collection inconsistencies and data reconciliation issues that cause inefficiencies for programmers. Spending time at the front end of a clinical trial to ensure consistent data values among the various databases and constructing checks to reconcile common data points will promote quality data and more efficient programming.
DA06 : Check Your Data: Tools for Automating Data Assessment
Paul Stutzman, Axio Research
This paper presents tools that can be employed to check the consistency and validity of data, and it discusses building blocks for automating these tasks. Individually or in combination, these components can be used to check data within data sets, across data sets, and over time. It is important to carefully examine data to ensure consistency and accuracy. This can be a time-consuming process, so automating these tasks can be of great value. These data checking tasks range from simple (e.g. making sure related variable pairs like --TEST/--TESTCD are consistent within a data set) to more complex (e.g. seeing if VISIT/VISITNUM pairs are consistent across all data sets, or looking at how the structure of a set of data sets and the information they contain change over time). Regardless of the complexity, there are some fundamental building blocks that can help automate these processes.
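One of the simple checks the abstract mentions, confirming that --TEST/--TESTCD pairs are consistent within a data set, reduces to verifying that each short code maps to exactly one label. Here is a minimal Python sketch of that logic (the paper's own tools are SAS-based; the data and variable names below are invented):

```python
from collections import defaultdict

def check_test_testcd(records, testcd_var="LBTESTCD", test_var="LBTEST"):
    """Return the --TESTCD values that map to more than one --TEST label
    within a data set, violating the expected one-to-one relationship."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec[testcd_var]].add(rec[test_var])
    return {code: labels for code, labels in seen.items() if len(labels) > 1}

# Hypothetical LB records with one deliberately inconsistent label
lb = [
    {"LBTESTCD": "GLUC", "LBTEST": "Glucose"},
    {"LBTESTCD": "GLUC", "LBTEST": "Glucose, Serum"},
    {"LBTESTCD": "ALT",  "LBTEST": "Alanine Aminotransferase"},
]
print(check_test_testcd(lb))
```

The same shape of check extends naturally to cross-dataset pairs such as VISIT/VISITNUM by accumulating the mapping over all data sets before reporting conflicts.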
DA07 : Data Integrity: One step before SDTM
Pavan Kathula, Anna University at Chennai
Sonal Torawane, University of Pune
In clinical research, errors occur despite careful study design, conduct, and implementation of error-prevention strategies. Data cleaning intends to identify and correct these errors, or at least to minimize their impact on study results. Little guidance is currently available in the peer-reviewed literature on how to set up and carry out cleaning efforts in an efficient and ethical way. With the growing importance of Good Clinical Practice guidelines and regulations, data cleaning and other aspects of data handling will emerge from being minor subjects to being the focus of comparative methodological studies and process evaluations. We would like to present an overall summary of the scattered information, integrated into a conceptual framework aimed at assisting investigators with planning and implementation. Our presentation will explain suggestions on using unique specifications, processes, and specific methods along with SAS to maintain data integrity and cleanliness. With these suggestions, scientific reports might describe data-cleaning methods, error types and rates, and error deletion and correction rates. Utilization of simple tips and techniques, along with proper documentation, will impact not only the study results but also the time, effort, and cost. This presentation will not, however, cover SDTM techniques.
Data Standards
DS01 : The Untapped Potential of the Protocol Representation Model
Frank Diiorio, CodeCrafters, Inc.
Jeffrey Abolafia, Rho, Inc.
Recent FDA guidances have established CDISC models such as ADaM and SDTM as submission standards. As a result, most organizations have focused CDISC implementation strategies on SDTM and ADaM, which have usually led to higher costs and longer timelines. Moving standards implementation "upstream" can maximize the value obtained from CDISC standards. One largely overlooked standard is the Protocol Representation Model (PRM), the beginning "end" of "end-to-end." The PRM has the content and potential to streamline research throughout the entire product life cycle. This paper is an overview of the PRM. It describes what it is; discusses the business case for its implementation; describes Rho's implementation strategy; demonstrates how its use at Rho has improved operations; and presents strategies for collecting and storing PRM metadata. The paper should give the reader an appreciation of the content and scope of the PRM and its use beyond simply storing protocol items as metadata.
DS02 : CDISC's CDASH and SDTM: Why You Need Both!
Kit Howard, CDISC
Shannon Labout, CDISC
Many clinical research organizations, especially in the biopharma industry, believe that CDISC's CDASH data capture standard is an unnecessary addition to the SDTM submission data standard. The rationale is that CDASH is very similar to SDTM, and the few differences merely create confusion and additional work. While it is true that CDASH is very similar to SDTM, these two standards solve different problems, and using them together can positively impact data capture, quality, usability, repurposing, and traceability, among other considerations. This paper explores the similarities and differences between CDASH and SDTM, how the standards relate to each other, and why both are crucial for high quality clinical research.
DS03 : SDTM - Just a Walk in the (Theme) Park, Exploring SDTM in the Most Magical Place on Earth
Christine Mcnichol, Chiltern International
By now, most in the industry know the basic concepts of SDTM, so let us infuse some Disney magic and take a virtual vacation to look at SDTM from a perspective inspired by one of the happiest places on earth. Some of the basic as well as more complicated pieces of SDTM can be shown through things encountered during a day at a theme park in Central Florida. With a little imagination, we can equate our virtual vacation to a clinical trial. Come along as our vacationers (subjects) take part in our vacation (study), where they will experience various rides and attractions (treatments). We will look at how these experiences could be mapped to SDTM, from describing the plan of the virtual vacation in the TDM through capturing the events of the day in SDTM domains. Within this trip, we will look at some common but important domain examples such as DM, AE, DS, VS, and more. Additionally, we will examine some more advanced SDTM concepts within our vacation example, such as the usage of EX vs. EC, basic trial design strategies to use in TDM creation, EDC and external data to support core study data, and a few applicable device domains. We will explore these concepts in a new and creative way as we look at SDTM with a magical twist.
DS04 : The CDISC Trial Design Model (TDM), the EPOCH variable, and the Treatment Emergent Flag: How to Leverage these to Improve Review
Thomas Guinter, Independent
Janet Reich, Sr. Manager Amgen
This session will provide recommendations on the appropriate granularity of the CDISC Trial Design Model (TDM) variables ARM, EPOCH, and ELEMENT from a reviewability perspective, and provide best-practice recommendations. The agency Study Data Technical Conformance Guide requests that the EPOCH variable be included in SDTM general-observation-class domains (e.g., AE, CM, EX, LB, VS, etc.) to help reviewers understand where in the study timeline things happened. Best practices for appropriate granularity of the EPOCH variable will be provided, and the EPOCH variable will be contrasted with the AE treatment-emergent flag (AETRTEM) so sponsors can provide reviewers with the most usable information to facilitate review.
DS05 : Implementation of STDM Pharmacogenomics/Genetics Domains on Genetic Variation Data
Linghui Zhang, Merck
Pharmacogenomics (PGx) explores how gene expression and genetic makeup affect individual responses to drugs. It has been recognized since the early 1960s that inherited variation contributes to drug responses. Currently, PGx studies are widely used at various stages of drug development and labeling, such as stratifying patients in clinical trials, improving drug safety, and optimizing doses. Although the Guidance for Industry: Pharmacogenomics Data Submissions was officially issued by FDA in 2005, due to the complexity and specificity of PGx data, the Clinical Data Interchange Standards Consortium (CDISC) did not publish the clinical data standard for genomics and biomarker data, the Study Data Tabulation Model Implementation Guide: Pharmacogenomics/Genetics (SDTMIG-PGx), until July 2015. Since PGx is still a new topic for clinical trial programmers, the type, quality, and analysis of PGx data, as well as biospecimen collection and handling techniques for PGx studies, are not broadly discussed in the programming community. Moreover, new domains and variables were developed to capture PGx data in SDTMIG-PGx. Therefore, it is a challenge for programmers to implement SDTM PGx domains. This paper will provide a high-level introduction to the types and applications of PGx data and the strategies and practical considerations for creating SDTM PGx domains. Using the single nucleotide polymorphism (SNP), a well-known type of genetic variation data, as an example, the mapping of human SNPs to SDTM PGx domains will be illustrated in detail.
DS06 : Harmonizing CDISC Data Standards across Companies: A Practical Overview with Examples
Keith Shusterman, Chiltern
Prathima Surabhi, AstraZeneca
Binoy Varghese, MedImmune
Whether due to the fact that standardized data are more useful, or that submission of CDISC data is now required, many companies have worked hard to establish local CDISC interpretation guides. Those guides support consistent application of CDISC SDTM and ADaM standards within a company. When companies merge or collaborate, company-specific data standards need to be harmonized to ensure a smooth hand-off of data for the purposes of analysis and reporting. We will share a general overview of the issues encountered in harmonization and present strategies on how to effectively plan and harmonize company-specific CDISC implementation standards. We will walk through a few representative examples from a recent harmonization effort.
DS07 : Where Is the Link Broken - Another Look at SDTM Oncology Tumor Packages
Hong Wang, Boehringer Ingelheim
Ke Xiao, Boehringer Ingelheim
CDISC published three oncology tumor-related domains, Tumor Identification (TU), Tumor Results (TR), and Disease Response (RS), in SDTMIG v3.2 (2013). This provides a general guideline for data regulation and standardization for oncology studies based on RECIST criteria and/or its modifications, as well as other assessment criteria (Cheson or Hallek). Essentially, the three domains do not function independently; instead they have inherent connections and are linked through --LNKID and --LNKGRP variables. For example, TRLNKID/TULNKID is used to link assessment records in the TR domain with corresponding identification records in the TU domain. However, using RSLNKGRP and TRLNKGRP to connect RS and TR might not be sufficient, nor is it clear how to group records within the RS domain. As an illustration, solely using --LNKGRP is feasible to connect all the measurements in the TR domain, including those for target/non-target/new lesion responses, with the respective overall assessment in the RS domain at a measurement point. However, it is not straightforward to identify within the RS domain which response records contribute to the overall response, especially when symptomatic deterioration is of interest; nor is it clear how measurements for target/non-target/new lesions in the TR domain link to corresponding responses in the RS domain at a particular visit. Being able to establish and clarify such link relations is not only crucial for keeping traceability within and between domains, but also important for future time-to-event analysis. In this paper, the authors come up with proposals and provide a more efficient and accurate way to link the TR and RS domains. Examples will also be given for illustration purposes.
DS08 : The CDISC SDTM Exposure Domains (EX & EC) Demystified. How EC Helps You Produce A Better (more compliant) EX.
Thomas Guinter, Independent
Janet Reich, Sr. Manager Amgen
This session will review the SDTM EC and EX domains and show how the EC domain should be leveraged to provide a clear audit trail to the EX domain and improve the reviewability of the EX domain. The EX domain is one of the most important domains in most submissions, and agency feedback has explained that it often does not meet reviewers' expectations. Clear examples will be provided to help explain how EC can help sponsors produce a compliant EX domain that meets reviewers' expectations.
DS09 : RELREC - SDTM Programmer's Bermuda Triangle
Charumathy Sreeraman, Ephicacy Lifescience Analytics
The CDISC Study Data Tabulation Model (SDTM) provides a framework for organizing and converting clinical trial data into standard formats. This supports easy interpretation and maintains consistency across trials. Sometimes it becomes vital to establish relationships between records/datasets in SDTM to facilitate the linking process at the time of conversion. The logic of a relationship is either identified by profile/outliers in the data (e.g., PC and PP) or by identifying the data link between domains to examine associated information from individual domains collectively (e.g., TU, TR, and RS). Related Records (RELREC), a special-purpose relationship domain, can capture these explicit and implicit relationships to aid further in-depth exploration of data collected during trials. Often perceived as a challenging zone, this beauty is yet to be explored to its maximum potential. This paper details the process of standardizing relationships within and between SDTM domains by using the concept of the group identifier (--GRPID, a variable used to link a block of related records within a subject in a domain), distinct requirements for assigning RELID, appropriate usage of RELTYPE, and the best variables to consider for populating IDVAR in the following scenarios: (i) the relationship between an intervention and its findings related to the efficacy endpoints of the clinical study; (ii) the relationship and control of an event record over intervention, exposure, and disposition of the subject involved in the trial; (iii) the relationship between oncology-specific domains; and (iv) the relationship between pharmacokinetics domains.
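For orientation, a dataset-level RELREC entry for a PC/PP relationship like the one mentioned above is essentially a pair of records sharing a RELID, one per domain, with IDVAR naming the linking variable. The Python sketch below builds such records; the column layout follows common SDTMIG conventions, but the study ID, RELID, and RELTYPE values are illustrative assumptions, not taken from this paper:

```python
def relrec_dataset_link(studyid, relid, links):
    """Build dataset-level RELREC records. `links` is a list of
    (RDOMAIN, IDVAR, RELTYPE) tuples; USUBJID and IDVARVAL stay blank
    because the relationship applies to whole datasets, not single rows."""
    return [
        {"STUDYID": studyid, "RDOMAIN": rdomain, "USUBJID": "",
         "IDVAR": idvar, "IDVARVAL": "", "RELTYPE": reltype, "RELID": relid}
        for rdomain, idvar, reltype in links
    ]

# Hypothetical link between PC concentrations and PP parameters via --GRPID
rows = relrec_dataset_link("ABC-101", "PCPP",
                           [("PC", "PCGRPID", "ONE"),
                            ("PP", "PPGRPID", "MANY")])
for row in rows:
    print(row)
```

Record-level relationships would instead populate USUBJID and IDVARVAL for each related record, which is where the RELID-assignment and IDVAR-selection questions the paper discusses become non-trivial.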
DS10 : Ahead of the Curve: Leading with Industry Data Requirements
Maria Dalton, GlaxoSmithKline
Nancy Haeusser, GlaxoSmithKline
Most programmers in the pharmaceutical industry are aware of the requirement to submit data in CDISC data format. However, the CDISC mandate is not always clear-cut as there are some nuances and ambiguities to this requirement. Also, there are many other data requirements for the pharmaceutical industry, such as data disclosure regulations, data sharing policies, and dictionary requirements. In addition, clinical trials may now include real world data or electronic health care data which follow new data standards. It can be very challenging for companies to stay "ahead of the curve" in this area. This paper discusses some of the recent developments in industry data requirements, and it also discusses the importance of an organized, cross-functional approach to managing and embedding industry data requirements within a pharmaceutical or CRO company. This paper covers industry data standards and regulatory data requirements, so it is not specific to any programming language. This paper will be helpful to all levels of programmers in the pharmaceutical industry, including lead programmers and managers.
DS11 : Leveraging metadata when mapping to CDISC Standards with SAS® machine learning in a Results as a Service plus Model (RAAS+)
Ben Bocchicchio, SAS Institute
Sandeep Juneja, SAS Institute
Preetesh Parikh, SAS Institute
Mapping raw EDC data to an SDTM or ADaM standard is a very time-consuming process. As standards continue to evolve, the ability to collect and reuse metadata on how raw variables get mapped into standardized variables can help reduce the time to standardize subsequent studies. The reduction in time and effort comes from the ability to reuse the mapping by analyzing the raw and standardized metadata. Using the power of machine learning algorithms can further help derive and predict possible mappings. Instead of customers having to install software locally and make the necessary configurations to run it, SAS will offer this as a hosted solution: customers can upload data and standards and run the process, and the result will be SAS code containing the mapping information that can be utilized in the customer environment. Process flow: Identify standards -> Add data source -> Review metadata -> Table mapping -> Variable mapping -> Auto mapping -> Smart mapping -> Generate SAS program shells -> Publish mapping to library (for reuse). This application uses Base SAS running on a Linux server; end users are expected to be familiar with SAS coding and CDISC models.
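As a toy stand-in for the auto/smart mapping step in a flow like the one described, variable-name similarity alone can already suggest candidate mappings. This Python sketch uses the standard library's difflib; it is not the SAS solution the abstract describes, and the raw and SDTM variable names are invented:

```python
from difflib import get_close_matches

def suggest_mappings(raw_vars, standard_vars, cutoff=0.6):
    """Suggest a standardized variable for each raw EDC variable by
    string similarity; unmatched variables map to None and would need
    manual (or smarter, metadata-driven) mapping."""
    return {
        raw: (get_close_matches(raw, standard_vars, n=1, cutoff=cutoff) or [None])[0]
        for raw in raw_vars
    }

# Hypothetical raw EDC names vs. target SDTM names
raw = ["SUBJID", "VISITDT", "SYSBP"]
sdtm = ["USUBJID", "SVSTDTC", "VSORRES", "VSTESTCD"]
print(suggest_mappings(raw, sdtm))
```

A real system would go beyond names, reusing accumulated raw-to-standard mapping metadata from prior studies as training signal, which is the reuse the abstract emphasizes.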
DS12 : Considerations and Conventions in the Submission of the SDTMIG Tumor and Response Domains
Jerry Salyers, Accenture Life Sciences
Fred Wood, Accenture Life Sciences
The SDTMIG tumor domains (TU, TR, and RS) were first published in conjunction with the release of SDTMIG v3.1.3. The original intended use of these domains was largely confined to oncology. The TU domain contains tumor identification data, and the TR domain contains subsequent tumor measurements over the course of a clinical trial. The measurements of target lesions and the continued evaluation of non-target lesions are used to support response assessment criteria such as those outlined in RECIST (Response Evaluation Criteria in Solid Tumors) and represented in the RS domain. With the development of Therapeutic Area User Guides (TAUGs), most notably the CV TAUG, the use of the TU and TR domains has been broadened to allow for the identification and assessment of other types of "lesions", not just those associated with cancer and its treatment. This paper will examine how these two domains are now used across the spectrum of clinical studies and indications. Along with this expansion in the use of the "tumor domains" is a similar broadening of the Response (RS) domain. In the oncology setting, RS is used to represent the subject's response to treatment, either specifically pertaining to target or non-target lesions or as an overall response. Starting with SDTMIG v3.3 (targeted for release in early 2017), RS will also be used to submit Clinical Classifications. Examples of this use would include the Child-Pugh hepatic assessment, the NYHA assessment of congestive heart failure, and the ECOG (Eastern Cooperative Oncology Group) performance status.
DS13 : Planning to Pool SDTM by Creating and Maintaining a Sponsor-Specific Controlled Terminology Database
Cori Kramer, Chiltern International
Ragini Hari, Chiltern International
When SDTM data are consistently standardized, data can be easily pooled across studies. To consistently standardize raw data to SDTM across studies, there must be an assigned place for each collection field and each collection field must be in its assigned place. Following the SDTM Implementation Guide and applying Controlled Terminology (CT) as specified is sufficient to support fairly consistent SDTM mapping across studies. To enhance the consistency, sponsor guidelines must be established and maintained. Establishing, maintaining and enforcing sponsor-defined controlled terminology offers a simple, effective way to ensure that SDTM domains are consistently mapped across studies. We will share an overview of the benefits of leveraging a sponsor-specific controlled terminology database using examples and will suggest how a sponsor-specific CT database can be efficiently developed and used.
DS14 : Considerations in Submitting Standardized Electronic Data Under the Animal Rule: The Use of Domains in the SDTMIG and the SENDIG
Fred Wood, Accenture Life Sciences
The Animal Rule provides a mechanism for the approval of drugs and biological products when efficacy studies in humans are not ethical or feasible. Examples of products that would be covered under the Rule include vaccines against diseases such as smallpox and anthrax. The Animal Rule states that "FDA may grant approval based on well-controlled animal studies, when the results of those studies establish that the product is reasonably likely to produce clinical benefit in humans." Clinical trials are still required to evaluate the safety of the product, and for determining the appropriate dose. As a result of the need for a combination of animal studies to demonstrate efficacy and human studies to demonstrate safety, a combination of domains from the SDTMIG and SENDIG will be needed for submissions to the FDA under the Animal Rule. It is also expected that domains in the SDTMIG will need to be applied to animals, and that new domains not currently in either of the implementation guides will be needed. Because no single standard or guide provides implementation advice down to the domain level for Animal Rule submissions, an effort has begun to create an implementation guide based on the SDTM. This paper will discuss initial considerations in developing such a document.
DS15 : Common Programming Errors in CDISC data
Sergiy Sirichenko, Pinnacle 21
Data in standardized format are now a required part of regulatory submissions. CDISC standards have achieved widespread adoption and have become a commodity skillset for thousands of clinical programmers. Nevertheless, there are still mapping and programming errors commonly observed in standardized data. These reduce the overall quality of submissions and should be avoided, especially since the majority of programming errors can be fixed even after the study data are locked, which is not the case with data management and data collection design issues. This presentation will share our experience of the most common mapping and programming errors observed across hundreds of regulatory submissions. We will provide examples and recommendations on how to detect these issues, how to evaluate their impact on regulatory review, and how each can be corrected.
DS16 : ADaM Compliance - Validating your Specifications
Trevor Mankus, PRA Health Sciences
Kent Letourneau, PRA International
As of December 17th, 2016, the FDA and PMDA require that all new studies included in submissions have their analysis datasets created in compliance with the CDISC ADaM standards. The possibility of having your submission rejected heightens the importance of having an effective process for ensuring that you are following this standard. At PRA Health Sciences, this process starts with a review of the metadata in our ADaM specifications. This paper will give some background on this topic and describe the process by which we create datasets that are fully compliant with the ADaM standards. We will discuss why determining ADaM compliance cannot be done with the available validation software tools alone and why a human component is needed. We will also discuss the benefits of beginning this review on the ADaM specifications rather than on the datasets.
DS17 : ADaM Grouping: Groups, Categories, and Criteria. Which Way Should I Go?
Jack Shostak, DCRI
ADaM has variables that allow you to cluster, group, or categorize information for analysis purposes. Sometimes it isn't entirely clear which variable you should be using and when. The goal of this paper is to provide some guidance around what ADaM grouping variables are available and when each is appropriate, and then to discuss when more than one technique will work for a given analysis situation. We will also look at problems where a single solution isn't entirely obvious. The paper will focus primarily on BDS grouping variables, although other non-BDS variables will be mentioned. The following ADaM BDS variables will be examined in detail: *GRy(*Gy), *CATy, CRITy, MCRITy, AVALCATy, and PARAM. These will be compared and contrasted for several use cases. The paper will conclude with suggested ADaM grouping strategies as well as ideas for where ADaM can be improved in this regard. This paper is targeted at CDISC users with some basic exposure to the CDISC ADaM model.
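As a concrete example of one of these grouping variables, AVALCATy simply buckets AVAL into analysis categories for a parameter. A Python sketch with invented cutoffs follows (a statistical analysis plan, not this paper, would define the real bands, and production work would be done in SAS):

```python
def avalcat1(aval):
    """Derive an analysis value category (AVALCAT1) from AVAL,
    using illustrative cutoffs for a hypothetical lab parameter."""
    if aval is None:
        return ""
    if aval < 100:
        return "<100"
    if aval <= 125:
        return "100-125"
    return ">125"

# Hypothetical ADaM BDS records for one parameter
adlb = [{"PARAMCD": "GLUC", "AVAL": 92},
        {"PARAMCD": "GLUC", "AVAL": 118},
        {"PARAMCD": "GLUC", "AVAL": 140}]
for rec in adlb:
    rec["AVALCAT1"] = avalcat1(rec["AVAL"])
print(adlb)
```

CRITy/MCRITy differ in that they record whether an analysis criterion is met (a flag plus its text) rather than partitioning AVAL into mutually exclusive bands, which is one of the distinctions the paper explores.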
DS18 : Clarifications About ADaM Implementation Provided in ADaMIG Version 1.1
John Troxell, Accenture
Since the publication of version 1.0 of the CDISC (Clinical Data Interchange Standards Consortium) ADaM (Analysis Data Model) Implementation Guide (ADaMIG) in December 2009, experience has shown that there are some misunderstandings among practitioners about ADaM implementation. The ADaM Team realized that the best way to address these misunderstandings was to clarify the ADaMIG. In fact, although additional variables were added in version 1.1, published in 2016, the primary motivation was to clarify version 1.0. This paper describes important clarifications contained in version 1.1, and what they mean for implementers of both ADaMIG versions.
DS19 : Standardized, Customized or Both? Defining and Implementing (MedDRA) Queries in ADaM Data Sets
Richann Watson, Experis
Karl Miller, inVentiv Health
Investigation of drug safety issues for clinical development will consistently revolve around the experience and impact of important medical occurrences throughout the conduct of a clinical trial. As a first step in the data analysis process, Standardized MedDRA Queries (SMQs), a unique feature of MedDRA, provide a consistent and efficient structure to support safety analysis and reporting, and also address important topics for regulatory and industry users. One nuance in working with SMQs is the ability to limit the scope for the analysis need (e.g., "Broad" or "Narrow"), but there is also the ability, outside of the specific SMQs, to develop Customized Queries (CQs). With the introduction of the ADaM Occurrence Data Structure (OCCDS) standard structure, the incorporation of these SMQs, along with potential CQs, solidified the need for consistent implementation, not only across studies, but across drug compounds and even within a company itself. Working with SMQs, one may have numerous questions: What differentiates an SMQ from a CQ, and which one should be used? Are there any other considerations in implementation of the OCCDS standards? Where does one begin? Right here.
DS20 : LBTEST/LBSTRESU and ADaM lab parameters: The dilemma of mapping one-to-many or many-to-one
Michelle Barrick, Eli Lilly and Company
Lab results for clinical trials need to be uniquely identified to allow them to be combined for analysis. Multiple source laboratories may have varying names or identifiers for the same test, so standardization is needed to identify lab tests that are the same. Industry-wide, there are efforts to standardize lab codes via LOINC, but that is still a work in progress and not a mandate. Many pharma companies have their own internal code system to uniquely identify lab tests. These internal codes are mapped to SDTM CDISC terms for LBTESTCD/LBTEST and associated units. Mapping those tests to ADaM parameters becomes a bit more complex because PARAMCD/PARAM must uniquely and unequivocally describe what is in AVAL or AVALC. It is not enough to just look at the SDTM LBTEST (and associated unit) to determine what your parameter should be. This paper will explain some of the levels of detail that need to be assessed to correctly define an ADaM lab PARAMCD/PARAM. Examples will be provided showing labs that appear to be the same but are not, as well as labs whose names are different and yet can be combined for analysis.
DS21 : Programming Efficiency in the Creation of ADaM BDS Datasets
Ellen Lin, Amgen Inc
The ADaM Basic Data Structure (BDS) has become one of the most prominent and widely implemented dataset structures in industry since the CDISC ADaM Implementation Guide V1.0 was published in 2009. The strictly vertical data design of the BDS brings two common challenges to statistical programming: (1) BDS datasets often get very large quickly, especially in bigger clinical trials. (2) Metadata for BDS datasets are more difficult to develop and understand due to the use of value-level metadata (VLM), for describing variables by PARAM/CD, and the use of multiple BASETYPEs within the same dataset. In this paper, we will describe these programming challenges and illustrate with examples on how to achieve high programming efficiency and quality specifically for BDS datasets. The approaches include reading metadata (e.g., VLM and Controlled Terminology) directly into dataset creation to ensure consistency and avoid error-prone hardcoding; designing streamlined programming steps before and after the construction of BASETYPE to maximize data processing efficiency; and using modular macros to standardize common data computations and imputations.
DS22 : Deriving Rows in CDISC ADaM BDS Datasets
Sandra Minjoe, Accenture
John Troxell, Accenture
The ADaM Basic Data Structure (BDS) can be used for many analysis needs. The SAS DATA step is a very flexible and powerful tool for data processing, and it is particularly useful in the creation of a non-trivial BDS dataset. This paper walks through a series of examples showing use of the SAS DATA step when deriving rows in BDS. These examples include creating new parameters, new timepoints, and changes from multiple baselines.
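As a flavor of the kind of derivation the abstract describes, here is a minimal, hypothetical sketch (not taken from the paper) of deriving a new-parameter row with the DATA step. The dataset name ADVS, the parameter codes SYSBP/DIABP, and the one-record-per-visit assumption are all illustrative:

```sas
/* Hypothetical sketch: derive a Mean Arterial Pressure parameter row
   from SBP and DBP rows in a BDS dataset. Assumes ADVS is sorted by
   USUBJID AVISITN with one record per subject/visit/parameter. */
data adnewparm;
  merge advs(where=(paramcd='SYSBP') rename=(aval=sbp))
        advs(where=(paramcd='DIABP') rename=(aval=dbp));
  by usubjid avisitn;
  paramcd = 'MAP';
  param   = 'Mean Arterial Pressure (mmHg)';
  aval    = dbp + (sbp - dbp) / 3;  /* standard MAP approximation */
  drop sbp dbp;
run;

/* Append the derived rows to the original BDS dataset */
data advs2;
  set advs adnewparm;
run;
```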
DS23 : The Benefits of Traceability Beyond Just From SDTM to ADaM in CDISC Standards
Maggie Ci Jiang, Teva Pharmaceuticals
Since FDA released the Analysis Data Model Implementation Guide (ADaMIG) for public review in 2014, traceability between SDTM and ADaM has become a well-known characteristic of CDISC ADaM data. However, in CDISC ADaM development, traceability is about more than just the connection between SDTM and ADaM. In a more complex analysis, an ADaM data value is not necessarily immediately related to an SDTM source variable; one or more derivation steps may be required, such as the derivation of an analysis-ready variable, parameter, or record. Should that derivation be done in the ADaM data, or programmed in the analytical report instead? It appears that programmers have an option, and the decision often ends up being the one the programmer favors. Are programmers really free to make this choice? This paper tackles this topic through a comprehensive discussion of the ADaM methodology using practical examples, and provides insight on the extensive benefits of traceability beyond just from SDTM data to ADaM data in CDISC ADaM development.
DS24 : ADQRS: Basic Principles for Building Questionnaire, Rating and Scale Analysis Datasets
Nancy Brucken, inVentiv Health Clinical
Karin Lapann, Shire
Questionnaires, ratings and scales (QRS) are frequently used as primary and secondary analysis endpoints in clinical trials. The Submission Data Standards (SDS) QRS sub-team has compiled a considerable library of SDTM supplements defining standards for the collection and storage of QRS data. The ADaM ADQRS sub-team has been formed to develop addenda to these supplements, which will define standards for corresponding analysis datasets. This paper represents the current thinking of the ADQRS sub-team regarding basic principles for building QRS analysis datasets.
DS25 : A Critique of Implementing the Submission Data Tabulation Model (SDTM) for Drugs and Medical Devices
Carey Smoak, DataCeutics, Inc.
The Clinical Data Interchange Standards Consortium (CDISC) encompasses a variety of standards for medical research. Amongst the several standards developed by the CDISC organization are standards for data collection (Clinical Data Acquisition Standards Harmonization - CDASH), data submission (Study Data Tabulation Model - SDTM) and data analysis (Analysis Data Model - ADaM). These standards were originally developed with drug development in mind. Therapeutic Area User Guides (TAUGs) have been a recent focus to provide advice, examples and explanations for collecting and submitting data for a specific disease. Non-subjects even have a way to collect data using the Associated Persons Implementation Guide (APIG). SDTM domains for medical devices were published in 2012. Interestingly, device domains are used in fourteen of the eighteen TAUGs, which provide examples of the use of various device domains. Drug-device studies also provide a contrast in the adoption of CDISC standards for drug submissions versus device submissions. Adoption of SDTM in general, and of the seven device SDTM domains in particular, by the medical device industry has been slow. Reasons for the slow adoption will be discussed in this paper.
Data Visualizations & Graphics
DV01 : Effective Ways to Perfect the Visualization of Clinical Trial Results
Amos Shu, AstraZeneca
A picture is worth a thousand words, which is why graphs are widely used for communicating clinical trial results. Well-designed graphs not only bring clarity to statistical results, but also add elegance to the report. How to make a perfect graph, however, is both technically and aesthetically challenging. This paper illustrates three different ways to perfect graphs in the oncology area. The first example shows that color adjustment helps bring the audience's attention to subjects you want them to focus on. The second one illustrates how to clearly present multiple categories in a graph by using legends and text boxes. The third example demonstrates adding information into a figure as much as you can without compromising its clarity and beauty.
DV02 : Layered Clinical Graphs using SAS
Sanjay Matange, SAS
The SGPLOT procedure supports layering of plot types to create more complex and unique graphs needed for clinical research. Over 30 different plot types can be combined in creative ways to support current or future needs. Often, such graphs can be created with just a few lines of code. Most clinical graphs can be broken down into layers that together build the graph; we can then combine the appropriate plot types to build the final graph. This paper will present this technique to build some new graphs requested more recently, such as the Spider Plot, Volcano Plot, and A1c Plot.
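The layering idea the abstract describes can be sketched in a few lines. This example is illustrative only, using a SASHELP dataset rather than clinical data: a regression fit with confidence band forms one layer, and grouped raw data is overlaid as a second layer.

```sas
/* Illustrative sketch of SGPLOT layering: each statement adds a layer,
   drawn in order, to build the final graph. */
proc sgplot data=sashelp.class;
  reg x=age y=height / clm;           /* layer 1: fit line + confidence band */
  scatter x=age y=height / group=sex; /* layer 2: raw data, grouped by sex   */
  xaxis grid;
  yaxis grid label="Height (in)";
run;
```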
DV03 : Multipage Adverse Events Reports Using PROC SGPLOT
Warren Kuhfeld, SAS
Mary Beth Herring, Rho, Inc.
Researchers and safety monitoring committees need to track adverse events (AEs) among clinical trial participants. Adverse events are commonly reported as a tabular summary by system organ class and preferred term for each treatment group. But it can be difficult to identify trends or potential safety signals simply by reviewing individual numbers in a summary table. Trends are much easier to identify when the frequency of each AE is displayed in a graph. The SGPLOT procedure can use axis tables to display tabular information next to scatter plots, but multipage printed reports cannot be produced using a single graph. This paper shows how to use PROC SGPLOT to create better multipage adverse events reports. Most of the work involves preparing the data so that page breaks occur in reasonable places and so that groups of adverse events and continuations onto a new page are both properly labeled. PROC SGPLOT along with a BY statement produces the final report. Groups of adverse events can be separated by blank lines or by reference lines. This paper explains all the steps needed to prepare and display multipage adverse events reports that are easier to interpret than summary tables.
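A rough sketch of the paging technique described above: a PAGE variable prepared in the data drives BY-group processing, so each BY group becomes one page. All dataset and variable names here are illustrative assumptions, not the authors':

```sas
/* Hypothetical sketch: one page per BY group, with an axis table
   showing SOC/preferred term next to the AE frequency plot.
   Assumes AE_SUMMARY has PAGE, SOC, AETERM, PCT, and TRT variables. */
ods graphics / width=7.5in height=10in;
proc sgplot data=ae_summary noautolegend;
  by page;                                      /* each PAGE value = one page */
  scatter y=aeterm x=pct / group=trt;
  yaxistable soc aeterm / location=outside position=left;
  xaxis label="Percent of Subjects" grid;
  yaxis display=(nolabel) discreteorder=data;
run;
```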
DV04 : Mean and Individual subject graphs of time vs. concentration data using PROC SGPLOT
Pooja Trivedi, Zydus Cadila Healthcare Limited
In the clinical research industry, the Clinical Study Report (CSR) is a very important document because it contains all trial-related activities, results, and conclusions of the trial. Graphs are also an important part of the CSR, as they are a visual representation of the data and help us understand the data better. The main purpose of this paper is to create highly presentable graphs which can be put directly into the CSR or its appendices. The mean and individual subject graphs of concentration vs. time data are the most common graphs showing the change in drug concentration in the body with respect to time.
DV05 : SAS vs Tableau: Creating Adverse Event/ Concomitant Medication Time line plot
Liling Wei, Pharmacyclics
Kathy Chen, Pharmacyclics, Inc. A Abbvie Company
The Adverse Event (AE) Concomitant Medication (CM) time line plot displays the events and medications for each subject by severity and time. In this paper, we use the data visualization software Tableau Desktop to create the graph using drag-and-drop steps to analyze the time-based data. The SGPLOT procedure HIGHLOW plot from SAS 9.4 also makes it easy to generate the combined AE and CM graph. This paper compares the two different approaches.
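The SAS side of the comparison can be sketched as follows; the data and variable names are invented for illustration, and a real AE/CM timeline would be driven from ADAE/ADCM-style data:

```sas
/* Illustrative sketch: horizontal HIGHLOW bars spanning each event's
   start and end study day, colored by severity. */
data timeline;
  input event $ 1-16 stday endday severity $;
  datalines;
Headache          1  5  Mild
Nausea            3 12  Moderate
Aspirin           1 20  .
;
run;

proc sgplot data=timeline;
  highlow y=event low=stday high=endday / group=severity
          type=bar barwidth=0.5;
  xaxis label="Study Day" grid;
  yaxis label="Event / Medication" discreteorder=data;
run;
```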
DV06 : Visualizing Enrollment Over Time
Laura Gruetzner, BioStat Solutions, Inc.
Taking a closer look at patient enrollment over time can provide valuable insight and support strategic decisions or further analysis for just about any type of clinical trial. Early-phase studies with small sample sizes may often be visualized on the patient level, resulting in a compact and comprehensive summary of enrollment and events like progression, crossover or AEs. For larger trials with multiple treatment arms, area charts or simple line plots of patients per cohort over time keep track of the developing trial population, periods of slow enrollment and balances between arms. This paper explores how to graphically inspect enrollment patterns over time for early- and late-phase clinical trials in an example-based approach. Focus will be placed on different ways to visualize enrollment for dose escalation, basket (or umbrella) and randomized controlled two-arm trials using SAS® 9.4 ODS Graphics.
DV07 : Using Animated Graphics to Show PKPD Relationships in SAS 9.4
Andrew McCarthy, Eli Lilly
The relationship between pharmacodynamics and pharmacokinetics is often complex and it can be difficult to conceptualize how drug induced changes relate to drug levels. Animated, multivariate visualizations can provide insight into these relationships. Using the features of SAS 9.4, spectacular animated plots can be produced from SAS Graph Template Language (GTL) procedures simply with the use of a by statement. This paper will demonstrate how to produce a graphic that shows multivariate changes in electroencephalographic (EEG) spectral activity and the relationship these have with drug plasma exposure.
DV08 : Bird's eye view of the data, a graphical exploration!
Hrideep Antony, Inventiv Health USA
Have you ever wondered if there is a better way to identify the outlier subjects on your summary graphs without going through individual listings or going back to your database filters? Have you wished for a more efficient means of identifying the outliers in your database and evaluating an individual subject's contribution to the overall database, without the repetitive task of comparing numerous pages of patient listing outputs against summary tables? This paper may provide the solution you are looking for! The number of listings and tables created for the purpose of understanding a subject's contribution or significance to an overall database could be reduced if we can "recognize" a subject or a group of subjects of interest in a broader summary picture. This paper will introduce practical techniques using the SAS SG procedures to gain a broader understanding of study databases, and ways to "spot" and "evaluate" the subjects of interest by taking a "bird's eye view" (recognizing and analyzing from an elevated angle) of the subjects on high-quality summary graphics. The primary focus is generating summary graphs that identify individual subjects of interest alongside the overall comparison statistics commonly used to analyze data in a clinical trial setting.
DV09 : Mapping Participants to the Closest Medical Center
David Franklin, Quintiles Real World Late Phase Research
"How far are patients from Clinics?" That was the question asked on a wintery afternoon. Recorded in the database were the ZIP codes of the participants, but how could we use that data to find the distance to the nearest clinic? "And would it not be nice if we could map it?" This paper looks at calculating the distance from a participant's ZIP code to a Medical Center using the GEODIST function, finding the smallest distance for each participant, and finally color-coding the plots by the calculated distance using the GMAP procedure. Also helpful is producing reports, one of which lists the participants who are farthest away and their distances.
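A minimal sketch of the GEODIST part of this approach, with assumed table and variable names (PARTICIPANTS, CLINICS, LAT, LONG are illustrative, not from the paper):

```sas
/* Hypothetical sketch: for each participant, compute the distance in
   miles to every clinic ('DM' = degree inputs, miles output) and keep
   the nearest one via a summary remerge. Ties would keep multiple rows. */
proc sql;
  create table nearest as
  select p.usubjid,
         c.clinic_id,
         geodist(p.lat, p.long, c.lat, c.long, 'DM') as miles
  from participants as p, clinics as c
  group by p.usubjid
  having calculated miles = min(calculated miles);
quit;
```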
DV10 : Automated DSMB Presentation in SAS: Yeah, I'd Submit That! How to auto populate PowerPoint presentations so you don't have to
Kaitie Lawson, Rho, Inc.
Data & Safety Monitoring Board (DSMB) members are increasingly requesting high-level overview presentations to assess the current status of a clinical trial in lieu of lengthy physical or digital binders. When multiple clinical trials report to the same DSMB, a certain level of standardization across presentations is desirable to facilitate ease in understanding and familiarity with the presentation format for the DSMB members. Manually entering text and data into a presentation is tedious and is subject to transcription errors. By utilizing ODS PowerPoint, ODS Text, ODS List, and a series of other familiar SAS procedures, we have created a suite of SAS macros that automatically populate presentations, therefore eliminating the need to manually enter data. These SAS macros allow for presentations conforming to a general template to be created in a standardized manner which saves time, ensures data accuracy and integrity, and provides continuity in data presentation to DSMB members.
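A hypothetical sketch of the ODS PowerPoint mechanism the abstract mentions; the file name, titles, and content are placeholders, not the authors' macro suite:

```sas
/* Illustrative sketch: auto-populating a slide from SAS output rather
   than typing data into PowerPoint by hand. */
ods powerpoint file="dsmb_update.pptx" layout=titleandcontent;
title "Study XYZ: Enrollment Status";
ods text="Data as of &sysdate9.";            /* text box on the slide */
proc print data=sashelp.class(obs=5) noobs;  /* placeholder table     */
run;
ods powerpoint close;
```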
DV11 : Translating Statistics into Knowledge by Examples Using SAS Graphic Procedures
Tao Shu, Eli Lilly and Company
Jianfei Jiang, Eli Lilly
In a clinical trial, sponsors are required to communicate safety and efficacy data clearly and efficiently to regulatory agencies and to the public. Application of report and exploratory graphic visualization is crucial in summarizing clinical trial data in ways that can facilitate the precise conveyance of the statistical analysis with compelling visual stories. Here, we employ three different examples from Lilly Rheumatoid Arthritis clinical trials to demonstrate how SAS 9.4 graphical procedures can translate statistical analysis into vivid graphical forms. In the first example, using the SGPLOT procedure, we efficiently present the p-value in line graph and bar chart graphical reports. The clinically important effect of the Lilly compound (Baricitinib), as revealed by changes in American College of Rheumatology 20% response and modified Total Sharp Score, is clearly demonstrated at multiple time points when compared to placebo. In examples 2 and 3, using the SGPANEL procedure, we generate exploratory multivariable graphics on two-dimensional surfaces, which help to identify significant factors associated with serious adverse events (Poisson regression model) or radiographic progression of structural joint damage (longitudinal linear model) in Rheumatoid Arthritis. In addition, using exploratory graphics, we are capable of looking into clinical data at both the subject and population level. In summary, use of SAS 9.4 graphical procedures results in efficient and effective data interpretation in clinical trials.
DV12 : Make it personal: Upgrade your figures by adding individual patient data to the common figure types
Yuliia Bahatska, inVentiv Health Clinical
Christiane Ahlers, Bayer AG
Figures are believed to be the most flexible way of presenting data in the world of clinical trials. Indeed, with the capabilities of GTL and the graphic procedures in SAS today, only the power of the statistician's imagination defines the complexity of the figures in the analysis. In clinical study reports, the reader is usually referred to summary statistics tables for a general overview, while more detailed information about a particular patient can be found in the listings. Figures can present both. In this paper we would like to share our experience creating figures that combine summary statistics with individual patient data. The main idea is to enable the reviewer to trace how the values for a single patient changed throughout the study, while putting them into context with the derived statistics or other patients' values, and also to identify outliers. This kind of data presentation provides all clinical functions, and ultimately all readers of the document, with a deeper and more detailed overview of the clinical trial results.
DV13 : The basics of Graphics Template Language and output by PROC DOCUMENT
Fangping Chen, UBC
The ODS Graph Template Language (GTL), available since SAS 9.2, is a powerful language. PROC TEMPLATE can be used to create customized graphs independently of the DATA step. The ODS DOCUMENT destination gives you a way to store and rearrange the individual components of a report, and to change styles and options for the output, with the help of the DOCUMENT procedure. This paper will present how to customize a figure with PROC TEMPLATE and rearrange a series of figures with ODS DOCUMENT and PROC DOCUMENT.
DV14 : When one is not enough or multi-celled plots: comparison of different approaches
Vladlen Ivanushkin, inVentiv Health Germany GmbH
Data visualization is a very powerful way to present data and therefore has long been used in clinical trials. However, sometimes the number of plots required for a project is huge, or maybe you just don't want to scroll through pages in a document, or open many files to compare a couple of plots. In such situations, it is common practice to put several plots on one page to save space, to allow better comparison, or simply for a better overview. Some years ago a programmer would have needed to write tons of code to create non-standard multi-celled graphics. Progress does not stand still, however, and many tasks which used to require much time and effort can now be done with just a few lines of code. With so many opportunities to create plots as sophisticated as one can imagine, which way should the programmer choose, and which one is the most efficient? In this paper I describe different approaches to creating multi-celled plots and give the pros and cons of each.
DV15 : The %NEWSURV Family of Macros: An Update on the Survival Plotting Macro %NEWSURV and an Introduction to Expansion Macros
Jeffrey Meyers, Mayo Clinic
Time-to-event endpoints such as overall survival are commonly used as outcomes in oncology clinical trials, and one of the best graphical displays of these outcomes is the Kaplan-Meier curve. The macro NEWSURV, which has been presented at SAS conferences in 2014 and 2015, has become popular as a powerful tool for creating highly customizable publication quality Kaplan-Meier plots. The positive feedback for the NEWSURV macro has led to the program being updated with additional features and improvements, and has led to additional macros being created for the situations that NEWSURV does not cover. A pair of macros, NEWSURV_ADJ_INVWTS and NEWSURV_ADJ_DIRECT, create adjusted Kaplan-Meier curves using two specific methods: inverse weights and direct adjustment. The macro NEWSURV_DATA allows the user to provide their own dataset with pre-calculated survival function estimates that the program will then use to create a high quality plot in the NEWSURV style. This opens up the flexibility for any adjusted or weighted curve to be plotted at publication quality. The following is a paper describing the updated functionality of NEWSURV and an overview on the methods used with the new expansion macros.
Hands-on Training
HT01 : Five Ways to Create Macro Variables: A Short Introduction to the Macro Language
Art Carpenter, CA Occidental Consultants
The macro language is both powerful and flexible. With this power, however, comes complexity, and this complexity often makes the language more difficult to learn and use. Fortunately, one of the key elements of the macro language is its use of macro variables, and these are easy to learn and easy to use. Macro variables can be created using a number of different techniques and statements. However, the five most common methods are not only the most useful, but also among the easiest to master. Since macro variables are used in so many ways within the macro language, learning how they are created can also serve as an excellent introduction to the language itself. These methods include: the %LET statement; macro parameters (named and positional); the iterative %DO statement; the INTO clause in PROC SQL; and the CALL SYMPUTX routine.
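The five methods listed above can be sketched in one generic illustration (these are not the paper's own examples):

```sas
/* 1. %LET statement */
%let study = ABC-123;

/* 2. Macro parameters (positional and named) */
%macro report(dsn, title=Default Title);
  %put Dataset=&dsn Title=&title;
%mend report;
%report(adsl, title=Demographics)

/* 3. Iterative %DO statement (creates the index macro variable I) */
%macro loop;
  %do i = 1 %to 3;
    %put Iteration &i;
  %end;
%mend loop;
%loop

/* 4. INTO clause in PROC SQL */
proc sql noprint;
  select count(*) into :nobs trimmed
  from sashelp.class;
quit;
%put &=nobs;

/* 5. CALL SYMPUTX routine in a DATA step */
data _null_;
  set sashelp.class end=eof;
  if eof then call symputx('lastname', name);
run;
%put &=lastname;
```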
HT02 : Point-and-Click Programming Using SAS® Enterprise Guide®
Kirk Paul Lafler, Software Intelligence Corporation
Mira Shapiro, Senior SAS® Consultant, Capacity Planner and SAS Programmer
Ryan Paul Lafler, High School Student and Software Enthusiast
SAS® Enterprise Guide® empowers organizations, programmers, business analysts, statisticians and end-users with all the capabilities that SAS has to offer. This presentation describes the built-in wizards to perform reporting and analytical tasks, access to multi-platform enterprise data sources, the delivery of data and results to a variety of mediums and outlets, data manipulation without the need to learn complex coding constructs, and support for data management and documentation requirements. Attendees see the graphical user interface (GUI) to access tab-delimited and Excel input files; subset and summarize data; join (or merge) two tables together; flexibly export results to HTML, PDF and Excel; and visually manage projects using flowcharts and diagrams.
HT03 : Survival 101 - Just Learning to Survive
Leanne Goldstein, City of Hope
Rebecca Ottesen, City of Hope
Analysis of time to event data is common in biostatistics and epidemiology but can be extended to a variety of settings such as engineering, economics and even sociology. While the statistical methodology behind time to event analysis can be quite complex and difficult to understand, the basic survival analysis is fairly easy to conduct and interpret. This workshop is designed to provide an introduction to time to event analyses, survival analysis and assumptions, appropriate graphics, building multivariable models, and dealing with time dependent covariates. The emphasis will be on applied survival analysis for beginners in the health sciences setting.
HT04 : New for SAS® 9.4: Including Text and Graphics in Your Microsoft Excel Workbooks, Part 2
Vince Delgobbo, SAS
A new ODS destination for creating Microsoft Excel workbooks is available starting in the third maintenance release of SAS® 9.4. This destination creates native Microsoft Excel XLSX files, supports graphic images, and offers other advantages over the older ExcelXP tagset. In this presentation you learn step-by-step techniques for quickly and easily creating attractive multi-sheet Excel workbooks that contain your SAS® output. The techniques can be used regardless of the platform on which SAS software is installed. You can even use them on a mainframe! Creating and delivering your workbooks on-demand and in real time using SAS server technology is discussed. Although the title is similar to previous presentations by this author, this presentation contains new and revised material not previously presented. Using earlier versions of SAS to create multi-sheet workbooks is also discussed.
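A minimal illustration of the ODS EXCEL destination described above (requires SAS 9.4M3 or later); the file name, sheet names, and data are placeholders:

```sas
/* Illustrative sketch: a native XLSX workbook with a listing sheet
   and a graph sheet. */
ods excel file="class.xlsx"
          options(sheet_name="Listing" embedded_titles="yes");
title "Class Listing";
proc print data=sashelp.class noobs;
run;

ods excel options(sheet_name="Plot");   /* start a new worksheet */
proc sgplot data=sashelp.class;
  vbar sex / response=height stat=mean;
run;
ods excel close;
```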
HT05 : SAS Studio - the next evolution of SAS programming environments
Jim Box, SAS Institute
SAS Studio is the newest SAS programming environment, and provides many tools to help you with your programming tasks. In this Hands-on Training session, we'll take a tour of this enhanced programming environment, highlighting the following features: the dataset browser, where we will build filters and change which columns are displayed; the snippet manager, where we will explore existing code snippets and learn how to create and manage our own code; the task manager, where we will see how to generate code using a GUI and then build our own tasks; and the visual query builder, where we will see how to combine datasets quickly and efficiently. SAS Studio is a web-based tool, so you will be able to code and interact with SAS from just a browser. Come see how this tool can help you be a more efficient programmer.
HT06 : Usage of Pinnacle 21 Community Toolset 2.x.x for Clinical Programmers
Sergiy Sirichenko, Pinnacle 21
Michael Digiantomasso, Pinnacle 21
All programmers have their own toolsets like a collection of macros, helpful applications, favorite books or websites. Pinnacle 21 Community (P21C) is a free and easy to use toolset which is useful for clinical programmers who work with CDISC standards. In this Hands-On Workshop (HOW) we'll provide an overview of installation, tuning, usage and automation of P21C applications, including: the Validator, to ensure your data is CDISC compliant and FDA submission ready; the Define.xml Generator, to create metadata in standardized define.xml v2.0 format; the Data Converter, to generate Excel, CSV or Dataset-XML formats from SAS XPT; and the ClinicalTrials.gov Miner, to find information across all existing clinical trials.
HT07 : Single File Deliverables: Next Steps
Bill Coar, Axio Research
Creating Tables, Listings, and Figures (TLFs) continues to be part of daily tasks in a statistical programming group in the Pharmaceutical Industry. Trends continue to move toward electronic review and distribution, and the end users of the TLFs are often requesting a single file rather than each output sent individually. In recent years, the use of item stores has been introduced as a viable option to create a single file deliverable. In 2016, a hands-on training was offered to demonstrate their usefulness in our industry setting. During this workshop (Combining TLFs into a Single File Deliverable, PharmaSUG 2016), attendees gained experience with creating, replaying, and restructuring item stores to obtain a single file containing a set of TLFs. Single File Deliverables: Next Steps will build upon what was presented in 2016, working towards more complex implementation and automation. After a brief refresher, Next Steps will focus on combining and restructuring item stores to obtain a single and well-structured document with hyperlinks/bookmarks for navigation. The use of by-group processing will also be introduced. Throughout the workshop it will become apparent that automation is extremely desirable. Basic automation steps will be introduced to demonstrate an application utilizing macro programming. This Hands-on Training will build upon what was introduced in 2016. It assumes that users have a (very) basic understanding of PROC REPORT, the SG procedures, item stores, and macros. The use of ODS is required in this application, using SAS 9.4 in a Windows environment.
Healthcare Analytics
HA01 : Developing Your Data Strategy
Greg Nelson, ThotWave
The ever-growing volume of data challenges us to keep pace in ensuring that we use it to its full advantage. Unfortunately, our response to new data sources, data types and applications is often somewhat reactionary. There exists a misperception that organizations have precious little time to consider a purposeful strategy without disrupting business continuity. Strategy is a phrase that is often misused and ill-defined. However, it is nothing more than a set of integrated choices that help position an initiative for future success. With that in mind, this presentation will cover the key elements defining data strategy. Key topics include: What data should we keep or toss? How should we structure data (warehouse vs. data lake vs. real-time/event stream)? How do we store data (cloud, virtualization, federation, Hadoop)? What approach should we use to integrate and cleanse data (ETL vs. cognitive/automated profiling)? How do we protect and share data? All of these topics ensure that the organization gets the most value from its data, and inform how we prioritize and adapt our strategy to meet unanticipated needs in the future. As with any strategy, we need to make sure that we have a roadmap or plan for execution, so we will talk specifically about the tools, technologies, methods and processes that are useful as we design a data strategy that is both relevant and actionable to your organization.
HA02 : Multinomial Logistic Regression Models With SAS® PROC SURVEYLOGISTIC
Marina Komaroff, Noven Pharmaceuticals
Proportional odds logistic regressions are popular models for analyzing data from complex population survey designs that include strata, clusters, and weights. However, when the proportional odds assumption is violated (p-value < .05 for the chi-square statistic), the use of multinomial logistic regression models for survey designs becomes challenging. This paper provides guidance in using multinomial logistic regression models to estimate and correctly interpret the relationships between a predictor and multiple levels of a nominal outcome, with and without an interaction term. The author developed a SAS macro utilizing PROC SURVEYLOGISTIC that will help researchers conduct these statistical analyses. The U.S. National Health and Nutrition Examination Survey (NHANES) is a probability sample of the US population; these data sets are used in examples that require application of multinomial logistic regression modeling techniques. Statistical analyses are conducted using the SAS® System for Windows (release 9.3; SAS Institute Inc., Cary, NC). The author is convinced that this paper will be useful to SAS-friendly researchers who analyze complex population survey data with multinomial logistic regression models.
HA03 : Topology-based Clinical Data Mining for Discovery of Hidden Patterns in Multidimensional Data
Sergey Glushakov, Intego Group
Iryna Kotenko, Intego Group / Experis Clinical, Site Lead
Andrey Rekalo, Intego Group, Senior Data Scientist
Clinical trial data is notoriously heterogeneous, incomplete, noisy, and multidimensional. These data may contain valuable information, and novel insights may be encapsulated in various patterns hidden deep within them. While the majority of approaches to mining clinical trial data focus on univariate relationships between a handful of variables, there is a lack of data integration and visualization tools that can improve our understanding of the entire data set. The aim of this paper is to describe the application of a holistic, topology-based clinical data mining (TCDM) methodology to discover multivariate patterns in clinical trial outcomes. This geometric, data-driven approach allows researchers to identify meaningful relationships in data that would otherwise be left unidentified by traditional biostatistical approaches. The TCDM methodology was used to develop a prototype software platform that facilitates the extraction and analysis of low-dimensional representations (data maps) of the full set of interdependent clinical outcomes. The prototype was developed using Python, R, and SAS, and combines state-of-the-art machine learning algorithms, statistical tools, and data visualization libraries. Computational experiments were performed on sample studies and included analyses of both publicly available and proprietary data sets. We discuss the key steps involved in the TCDM workflow: data integration, generation of topological data maps, visual inspection of interesting data maps, statistical analysis, and interpretation of discovered relationships. The paper concludes that TCDM can be used in all phases of clinical trials for the integrated assessment of drug safety and efficacy as well as for exploratory research.
HA04 : Two Roads Diverged in a Narrow Dataset...When Coarsened Exact Matching is More Appropriate than Propensity Score Matching
Aran Canes, Cigna
Coarsened Exact Matching (CEM) is a relatively new causal inference technique which allows the researcher to non-parametrically create a matched dataset to evaluate the effect of a treatment. Propensity Score Matching (PSM) is the older, more established technique in the literature. Both methods have been turned into SAS macros which are used by many SAS data scientists. In one particular instance, I was tasked with a dataset generated from pharmacy data where it seemed that, because PSM would not work due to the ratio of cases to controls, the only viable alternative would be the use of regression. However, this alternative was unappealing because it imposes a linear model on the data. Remarkably, I found that N:N Coarsened Exact Matching was able to match more than 90% of the case population even though the ratio of cases to controls was close to 1:1. In this paper, I show why PSM was not feasible in this particular instance and why CEM provided a non-parametric alternative to regression. The paper contributes to the growing literature on non-parametric observational studies, both inside and outside the SAS community, by providing a real-life example of a problem seemingly intractable from the perspective of PSM which could in fact be solved through the use of CEM. Readers of the paper will be introduced to the techniques and, instead of being given general advice about their applicability, provided a real-world example where one method (once implemented in SAS) was clearly preferable to the other.
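The core CEM idea described in this abstract can be sketched in a few lines. The following is a minimal illustration only (in Python rather than the SAS macros the paper discusses); the function name, binning choice, and data are hypothetical, and real CEM implementations offer far more control over coarsening.

```python
# Minimal CEM sketch: coarsen continuous covariates into bins, then
# exact-match treated and control subjects within each coarsened stratum.
# Strata lacking either a treated or a control subject are pruned.

def cem_strata(subjects, age_bin=10):
    """subjects: list of (treated_flag, age, sex) tuples.
    Returns sorted indices of subjects retained in matched strata."""
    strata = {}
    for i, (treated, age, sex) in enumerate(subjects):
        key = (age // age_bin, sex)          # coarsened stratum
        strata.setdefault(key, {0: [], 1: []})[treated].append(i)
    # Keep only strata containing both treated and control subjects.
    matched = [i for groups in strata.values()
               if groups[0] and groups[1]
               for side in (0, 1) for i in groups[side]]
    return sorted(matched)

subjects = [(1, 34, "F"), (0, 38, "F"), (1, 61, "M"), (0, 22, "M")]
matched = cem_strata(subjects)   # only the first stratum has both arms
```

Because matching is exact on the coarsened values, no model (parametric or otherwise) is imposed on the data, which is the property the author contrasts with regression.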
HA05 : I See de Codes: Using SAS® to Process and Analyze ICD-9 and ICD-10 Diagnosis Codes Found in Administrative Healthcare Data
Kathy Fraeman, Evidera
Administrative healthcare data - including insurance claims data, electronic medical records (EMR) data, and hospitalization data - contain standardized diagnosis codes used to identify diseases and other medical conditions. These codes come from the "International Classification of Diseases," known by the short form ICD. Much of the currently available healthcare data contain the 9th version of the clinical modification of these codes, referred to as ICD-9-CM, while the more recent 10th version, ICD-10-CM, is becoming more common in healthcare data. These diagnosis codes are typically saved as character variables, often stored in arrays of multiple codes representing primary and secondary diagnoses, and can be associated with either outpatient medical visits or inpatient hospitalizations. The character components of ICD-9 and ICD-10 diagnosis codes are different, and the SAS programming required to process and analyze these different versions of ICD diagnosis codes should reflect those differences. SAS text-processing functions, array processing, and the SAS colon modifier can be used to analyze the text of both types of ICD diagnosis codes and identify similar codes, or ranges of ICD diagnosis codes, as found in administrative healthcare data.
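The prefix-matching idea behind the SAS colon modifier (e.g., `IF DIAG =: "410"`) can be shown compactly. The sketch below is illustrative only, written in Python for brevity rather than the SAS the paper uses; the function, records, and code lists are hypothetical, though "410" (ICD-9) and "I21" (ICD-10) are the real prefix families for acute myocardial infarction.

```python
# Flag records whose diagnosis arrays contain any code matching a target
# prefix, mimicking the SAS colon-modifier comparison across an array of
# primary and secondary diagnosis variables.

def has_matching_code(diagnosis_codes, prefixes):
    """Return True if any diagnosis code starts with any target prefix."""
    return any(code.startswith(p) for code in diagnosis_codes for p in prefixes)

# Acute myocardial infarction: ICD-9 codes begin "410", ICD-10 codes "I21".
ami_prefixes = ("410", "I21")

# Each record holds primary and secondary diagnoses, as in claims data.
records = [
    {"id": 1, "codes": ["4109", "25000"]},   # ICD-9 AMI
    {"id": 2, "codes": ["I2101", "E119"]},   # ICD-10 AMI
    {"id": 3, "codes": ["7802", "R55"]},     # no AMI
]

flagged = [r["id"] for r in records if has_matching_code(r["codes"], ami_prefixes)]
# flagged -> [1, 2]
```

Note how a single prefix list handles both code versions, even though ICD-9 prefixes are numeric strings and ICD-10 prefixes begin with a letter.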
HA06 : An extrapolation algorithm for estimating national inpatient discharges based on a patient sample
Hannah Fan, Boston Strategic Partners, Inc
Victor Khangulov, Boston Strategic Partners, Inc.
David Hayashida, Boston Strategic Partners, Inc.
Despite the large increase in the comprehensiveness of International Classification of Diseases (ICD) coding over time, clinically differentiated diseases such as those that are refractory, rare, or poorly described are lost to a lack of coding specificity. These complex etiologies and pathophysiologies are often embedded within too-broadly defined ICD codes or are classified as "unspecified" or "other". Furthermore, there may be a lack of medical consensus over the clinical criteria for patients destined for a particular code. Be it a lack of classification, a lack of consensus, or a combination thereof, the end result complicates the estimation of disease burdens, because these uncertainties have a cascading effect on the accuracy of downstream economic assessments. A new extrapolation algorithm which produces a national discharge estimate based on any inpatient sample defined by clinical or coding criteria has been developed by Boston Strategic Partners, Inc. In this algorithm, we stratify inpatient discharges by hospital characteristics and apply an empirically calculated discharge weight to each stratum to derive a national estimate. We evaluate the precision of an estimate by comparing the results of our algorithm to data previously published in peer-reviewed publications. This algorithm enables researchers and policymakers to leverage a small, clinically relevant patient sample that may not be well defined using standard coding to identify, track, and analyze national trends in health care utilization and outcomes.
HA07 : Analyzing data from multiple clinical trials using SAS® Real World Evidence
Jay Paulson, SAS Institute
David Olaleye, SAS Institute
Real-world evidence (RWE) is the Health & Life Sciences equivalent of big data. This paper explains how the SAS® Real World Evidence solution makes use of very large data sets obtained from multiple clinical trial sources to derive intelligence beyond the scope of a typical clinical trial. Using prostate cancer trial data from Project Data Sphere as an example, the paper explains the concepts of Workspaces, Index Cohorts, Population Cohorts, and Expressions, and shows how users can create them within the SAS® Real World Evidence framework to perform analysis. The paper also explains the challenges of integrating data from multiple clinical trials and the tools and techniques applied to address those challenges, as well as how Index and Population Cohorts can be used to perform analyses and derive insights, including cohort characteristics, mortality analysis, and cohort comparison.
HA08 : Put the "K" in Your KAB: Know How to Efficiently Program with Knowledge, Attitudes, and Behavior Survey Data
Cara Lacson, United BioSource Corporation
Jasmeen Hirachan, United BioSource Corporation
Data from Knowledge, Attitudes, and Behavior surveys can present challenges, but these can be alleviated by a solid, efficient programming strategy. Producing KAB tables, listings, and figures can be very tedious to program, from the number of SAS data set variables that the survey data requires, to skip logic within the survey, and the volume and detail of question and response text that must be reported. With lots of typing involved, this is certainly something you want to do only once within your project, and in one location, to reduce the chance for error. This paper presents the basic concepts of a KAB survey from a programming perspective: what to expect in your data, recommendations for your analysis data set structure, and how to handle permitted skip questions. This paper also explores a solution to storing long question text, responses, and correct answers in one central location for use throughout all of your SAS programs in a KAB study. Software Products Used: SAS, Microsoft Excel. Skill Level/Background: SAS programmers of any experience level.
HA10 : Removing the Mask of Average Treatment Effects in Chronic Lyme Disease Research Using Big Data and Sub-Group Analysis
Mira Shapiro, Analytic Designers LLC
Lorraine Johnson, LymeDisease.org
There is much controversy in the medical community surrounding Chronic Lyme Disease (CLD). CLD sufferers have persistent Lyme symptoms as a result of being untreated, under-treated, or unresponsive to their antibiotic treatment protocol. Many past Lyme disease studies that reported on average treatment effects were unable to identify treatment successes. Using patient-reported outcome data collected by LymeDisease.org via their online registry MyLymeData, we will show that sub-group analysis techniques can unmask valuable information about treatment efficacy.
HA11 : Making Sense of Complex Exposure Patterns - a SAS macro for creating counting process data structures.
Jeremy Smith, Dartmouth College
Longitudinal healthcare data allow us to capture the elaborate interplay of exposures to prescription drugs, hospital settings, medical procedures, health outcomes, and the like, but structuring data in a way that best reveals these patterns can be a major challenge. There are a number of common methods, but the so-called 'counting process' structure may be both the most powerful and the most difficult to create. In simple terms, counting process data has one row for each block of person-time during which the binary state of all time-varying covariates remains unchanged. Each row contains, at least, an ID variable, start-time and end-time variables, and an arbitrary number of (generally binary) time-varying variables. This structure affords a number of advantages over the more common method of dividing longitudinal data into equal bins of pre-determined length. In this paper, I describe the use of a macro implemented in Base SAS to convert event-level data into a counting process structure, as well as built-in options to account for index and censoring events, carry-forward of fixed values such as those from lab tests, calculation of cumulative dose, and varying methods for handling drug supply carry-over. Further, I describe various options for summarization and analysis of the resulting dataset, including basic time-varying Cox regression. Implementation is simple enough even for novices to the field of longitudinal data analysis, and the macro has been extensively tested on datasets of over 100M records.
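The counting-process structure described above (one row per block of person-time with all time-varying covariates constant) can be sketched directly. This is a simplified illustration, not the paper's Base SAS macro: it is written in Python, uses hypothetical names and data, and omits the macro's handling of censoring, cumulative dose, and supply carry-over.

```python
# Convert event-level exposure spans into counting-process rows: interval
# boundaries occur wherever any binary covariate switches on or off, and
# each resulting row carries the 0/1 state of every covariate.

def counting_process(person_id, follow_up, exposures):
    """exposures: {covariate_name: [(start, stop), ...]} in study time."""
    cuts = {0, follow_up}
    for spans in exposures.values():
        for start, stop in spans:
            cuts.update((start, stop))
    cuts = sorted(t for t in cuts if 0 <= t <= follow_up)

    rows = []
    for start, stop in zip(cuts, cuts[1:]):
        state = {name: int(any(s <= start < e for s, e in spans))
                 for name, spans in exposures.items()}
        rows.append({"id": person_id, "start": start, "stop": stop, **state})
    return rows

rows = counting_process(1, 10, {"drug_a": [(2, 6)], "hospitalized": [(4, 5)]})
# Produces blocks [0,2), [2,4), [4,5), [5,6), [6,10), each with the
# corresponding drug_a / hospitalized indicators.
```

Each row is a (start, stop] risk interval, which is exactly the form time-varying Cox regression routines expect.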
HA12 : A Business Solution for Improved Drug Supply Chain Visibility
David Butler, Teradata Corporation
Joy King, Teradata Corporation
Ronald Chomiuk, Teradata Corporation
Richard Neafus, Teradata Corporation
Global increases in supply chain regulations are creating additional burden on manufacturers, distributors, and pharmacies; the Drug Supply Chain Security Act (DSCSA) is the U.S. version. It is vital for regulatory purposes to be able to see every product in the supply chain at a unit-of-sale level, but the data can be used for business gains as well. Predictive analytics provide this capability. Examples of opportunities include: predicting what each customer is going to order next; identifying which point in the supply chain will fail to meet its SLA; anticipating where stock deficiencies or surpluses will occur and what can be done in advance to avoid these variations; identifying when a customer's needs will not be met and implementing steps to attenuate concerns and losses before they happen; and knowing how to move all available products, and groups of products, to specific anticipated needs at the lowest price. Predictive analytics are stronger when they are integrated analytics, using data sources from across the industry and analytical models of various types and capabilities. On top of analytics, the best ROI is gained from making analytics results available in a usable, actionable format for each person, process, and point in the system, and from receiving feedback from each in order to further refine the business, with the end result becoming a sentient enterprise.
Industry Basics
IB01 : A Practical Guide to Healthcare Data: Tips, traps and techniques
Greg Nelson, ThotWave
Healthcare is weird. Healthcare data is even more so. The digitization of healthcare data that describes the patient experience is a modern phenomenon, with most healthcare organizations still in their infancy. While the business of healthcare is already a century old, most organizations have focused their efforts on the financial aspects of healthcare and not on stakeholder experience or clinical outcomes. Think of the workflow you may have experienced, from scheduling an appointment through doctor visits, obtaining lab tests, or prescriptions for interventions such as surgery or physical therapy. As we traverse the modern healthcare system, we are left with a digital footprint of administrative, process, quality, epidemiological, financial, clinical, and outcome measures that range in size, cleanliness, and usefulness. Whether you are new to healthcare or are looking to advance your knowledge of healthcare data and the techniques used to analyze it, this paper will serve as a practical guide to understanding and utilizing healthcare data. We will explore common methods for structuring and accessing data, and discuss common challenges such as aggregating data into episodes of care, reverse engineering real-world events, and dealing with the myriad of unstructured data found in nursing notes. Finally, we will discuss the ethical uses of healthcare data and the limits of informed consent, which is critically important for those of us in analytics.
IB02 : Good Programming Practices at Every Level
Maria Dalton, GlaxoSmithKline
Programming in the pharmaceutical industry focuses on transforming and analyzing clinical data. It is critical for pharmaceutical programs to be accurate since decisions about the safety and efficacy of drugs are made based on their results. In addition, programs must be well-documented, efficient, and reusable - this is necessary to meet tight timelines and resource constraints and to ensure that programs can be understood by other programmers and regulators. Thus, it is critical that pharmaceutical programmers follow good programming practices (GPP). This paper explores the topic of GPP in the pharmaceutical industry at different levels - at the level of individual programs, at the study level, and at the compound or therapeutic area level. At each level, there are different yet complementary practices that a programmer should follow to produce accurate and robust programs. The SAS programming language is commonly used in the pharmaceutical industry, so some recommendations in this paper are SAS-specific, but the general principles can be applied to any programming language. This paper is most useful for beginner programmers, but it will be helpful to experienced programmers as well, especially for lead programmers and managers who are trying to embed good programming practices within their teams.
IB03 : How to define Treatment Emergent Adverse Event (TEAE) in crossover clinical trials
Mengya Yin, Ultragenyx Pharmaceutical
Wen Tan, Ultragenyx Pharmaceutical
A TEAE is defined as an event that emerges during treatment, having been absent at pretreatment, or that worsens relative to the pretreatment state. For crossover studies, things get a little more complicated: we need to specify in which period the TEAE occurs in order to figure out which treatment really triggered it. Meanwhile, what if adverse events occur during the washout/rest period of a crossover trial? Do those AEs count as TEAEs? If so, which treatment should we consider to have caused them? This paper discusses these scenarios further and suggests some options for defining TEAEs in each of them.
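One commonly used attribution rule for this situation can be sketched as follows. This is a hedged illustration of one possible option, not the paper's recommendation, and it is written in Python for clarity rather than the SAS used in practice; the function, day numbers, and washout window are all hypothetical.

```python
# One possible rule: attribute an AE to the most recently started treatment
# period on or before its onset, so an AE arising during washout is assigned
# to the preceding treatment; AEs before any dosing are not TEAEs.

def attribute_teae(ae_onset_day, periods):
    """periods: list of (start_day, treatment) pairs, sorted by start_day.
    Returns the attributed treatment, or None for a pretreatment AE."""
    treatment = None
    for start_day, trt in periods:
        if ae_onset_day >= start_day:
            treatment = trt
    return treatment

# Two-period crossover: treatment A starts day 1, washout spans days 22-28,
# treatment B starts day 29 (all days illustrative).
periods = [(1, "A"), (29, "B")]
pretreatment = attribute_teae(0, periods)   # None: not treatment-emergent
washout_ae = attribute_teae(25, periods)    # attributed to "A"
period2_ae = attribute_teae(30, periods)    # attributed to "B"
```

Other options the paper weighs, such as excluding washout AEs from per-treatment summaries entirely, would replace the carry-forward logic above with an explicit washout check.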
IB04 : How to find the best MDR solution for your organization
Kevin Lee, Clindata Insight
Are you satisfied with the current Metadata Repository (MDR) solution strategy at your organization? Are you looking for a more efficient, beneficial MDR platform, or to become better educated on the crucial factors of an MDR solution? A Metadata Repository has become essential for life sciences companies to store and manage standards (e.g., CDISC and company-specific standards) and terminologies. However, many organizations are struggling with their current MDR solutions because of version control of standards, instability within systems, resistance from study teams, and other issues. The paper will elucidate common challenges, concerns, and typical user complications encountered during the MDR selection and implementation process, and will demonstrate various solutions to these frequent issues. The paper will also introduce some factors to consider in the MDR selection process - defining the business objectives of the MDR solution, Proof of Concept (POC), and evaluation of the MDR functionalities of different solutions. Finally, the paper will introduce good practices for MDR implementation, including the System Development Life Cycle (SDLC), integration of the MDR with in-house systems (e.g., SAS and EDC), and change management.
IB05 : SDTM Cartography - Learn To Create SDTM Mapping Specifications
Donna Sattler, Eli Lilly
Mike Lozano, Eli Lilly and Company
This paper will explore the data integration mapping process, offering a best-practice discussion to ensure that the intent of the data collection arrives at its intended destination, and finally addressing the most commonly asked questions that arise when creating and utilizing a mapping specification.
IB06 : Data Monitoring Committee Report Programming: Considering a Risk-Based Approach to Quality Control
Amber Randall, Axio Research
Bill Coar, Axio Research
Quality control is fundamental to ensuring both correct results and sound interpretation of clinical trial data. Most QC procedures are a function of regulatory requirements, industry standards, and corporate philosophies. However, no one should underestimate the importance of independent, thoughtful consideration of relevance and impact at each step in the process from data collection through analysis. Good QC goes far beyond just reviewing individual results and should also consider monitoring data throughout the course of a study. In particular, QC is essential when supporting a Data Monitoring Committee (DMC). Given the nature of interim and incomplete data, inherent challenges exist when it comes to generation of DMC reports. Many of the usual practices associated with quality control need to be adapted to accommodate the repetitive nature of DMC review on accumulating data that may have outstanding queries. This presentation will explore adaptations to a typically rigid QC process that are necessary when reviewing interim/incomplete data. Such adaptations focus on a risk-based approach to QC to ensure that a DMC can make informed decisions with more confidence in the data and programming.
IB07 : Building a Fast Track for CDISC: Practical Ways to Support Consistent, Fast and Efficient SDTM Delivery
Steve Kirby, Chiltern International Ltd
Mario Widel, Eli Lilly
Richard Addy, Chiltern International Ltd
Standardized data are so useful that sponsors are now required to provide study data to the FDA (CBER and CDER) using the standards, formats, and terminologies specified in the Data Standards Catalog. In practice that means following CDISC SDTM for tabulations content. Good planning is required to make sure that SDTM data are ready as needed to support regulatory submission; better planning is needed to have SDTM data available as needed to support all internal and external data consumers. We will share some effective strategies that we have used to provide the study data as collected to data consumers in SDTM format from first patient first visit through database lock. Our presentation will focus on three key areas: Planning to collect what we will submit with CDASH and SDTM-friendly protocols; preparing to consistently implement SDTM with metadata standards; and designing robust, reusable mapping code that is validated early and used often.
Management & Support
MS01 : A Review of "Free" Massive Open Online Content (MOOC) for SAS® Learners
Kirk Paul Lafler, Software Intelligence Corporation
Leading online providers now offer SAS® users "free" access to content for learning how to use and program in SAS. This content is available to anyone in the form of massive open online content (or courses) (MOOC). Not only is all the content offered for "free", but it is designed with the distance learner in mind, empowering users to learn using a flexible and self-directed approach. As noted on Wikipedia.org, "A MOOC is an online course or content aimed at unlimited participation and made available in an open access forum using the web." This presentation illustrates how anyone who wants to learn SAS programming techniques can access a wealth of learning technologies including SAS University Edition software, SAS technical support, published "white" papers, code examples, PowerPoint slides, comprehensive student notes, instructor lesson plans, hands-on exercises, webinars, audio files, and videos.
MS02 : The Cross Border Program - Strengthening the Sponsor-Partner Offshore Experience
Kenneth Bauer, Merck
Girish Havildar, Merck
Pharmaceutical statistical programming outsourcing is a comparatively late entrant to the India offshore market. Merck Statistical Programming initiated the outsourcing of limited statistical programming deliverables in 2007, partnering with small companies in India and China. Lessons learned during this early phase allowed us to develop a more effective model as we expanded scope in subsequent years with a new partner. One of the programs we established in 2015 was the Cross Border Program (CBP), in which partner staff visit Merck US sites for a period of time. Objectives of this program include keeping intact the strong India team we have invested in, providing team members with a sense of a global team concept, assisting the partner in managing attrition, and providing growth opportunities for partner staff. This is a joint commitment program between Merck and our Partner, including budget sharing and onsite logistics support. This paper will cover details of the program and present lessons learned from both the Merck and Partner perspective.
MS03 : Stakeholder Management: How to be an effective lead SAS® programmer
Aakar Shah, Pfizer
As a new lead programmer, the traditional focus is to work with a given programming team, the study statistician, and the study data manager to deliver outputs per given timelines. Eventually the experienced lead engages in timeline, resource, and CRO-management discussions. This traditional approach has its limitations in terms of perception of success, burned bridges with other functional groups, lack of communication, and lack of satisfaction in some areas. This paper takes a holistic view of stakeholder management and how to use available processes, tools, and techniques to ensure success, both tangible and intangible. We will learn about creating a Stakeholder Management Plan, which covers the following four processes along with their inputs, tools and techniques, and outputs: Identify Stakeholders, Plan Stakeholder Management, Manage Stakeholder Engagement, and Control Stakeholder Engagement. This paper is intended for lead SAS programmers; however, its contents are useful to SAS programmers at any level.
MS04 : Managing conflicts across Cross Functional and Global Virtual Teams
Arun Raj Vidhyadharan, inVentiv Health
Sunil Jairath, Inventiv Health
Almost all of us work with people situated around the globe, some home-based and some office-based. For various reasons, such as improving efficiency, process flow, and decision making, organizations are now moving to teams made up of representatives from various functions; these are called cross-functional teams. The representatives from each function work with people from other functions and are expected to blend their collective expertise to come up with well-thought-out solutions, ideas, and decisions. Quite often such teams run into problems and are unable to work to their full potential because of inherent relationship issues. The lack of cooperation soon becomes apparent when the team members work at cross purposes. In reality, functions or departments gradually become so full of themselves that they become a mini-organization within an organization. They compartmentalize themselves and start operating as distinct groups. This division brings about a feeling of 'us versus them'. Cross-functional conflict, and the failure to work well together, arises when departmental functions operate in a manner that isolates them from the problems and concerns of their fellow workers. The authors of this paper have experience working in cross-functional teams and global virtual teams and would like to share some lessons on how to manage conflicts, some rules that may help cross-functional teams, and some ideas on conflict management across global virtual teams.
MS05 : Basic tracking skills and tools for managing day to day activities.
Kiran Kundarapu, Inventiv Health
The statistical analyst/programmer role has evolved over the years from a purely technical focus to one that includes basic tracking skills for better managing project and individual tasks. Tracking and managing tasks at the study and individual level is no longer just within the scope of the supervisor/lead/manager but is now part of all roles and levels. Industry demands and expectations are pushing programmers to acquire some level of project and individual management skill for tracking, managing, documenting, and tracing work, and for quality, efficiency, and productivity. Basic skills could range from maintaining a programming/issue log or tracking individual daily actions to managing projects and resourcing. The focus of this paper is to provide awareness, templates, and applications that can assist in managing day-to-day activities and project tasks for those seeking basic tracking skills. The most common tracking and managing activities encountered in the field at all levels include: project/study tasks, assignments, timelines/milestones, issues/risks, actions/decisions, meeting agendas and minutes, follow-up items, and emails. Common software applications on the market include MS Project, MS Excel, MS Word, and Outlook. Conventional tracking tools/logs that are widely used include the programming log, risk/issue log, validation log, meeting/agenda template, action/decision log, study planning chart, resourcing chart, management log, and xxx_tracker. The choice of tracking tools, and of the applications used to develop them, is completely a matter of individual preference. In this paper, some of the common tools leveraged for tracking will be covered, with sample templates and the available applications used to develop them.
MS06 : Insight into Offsites: creating a productive workflow with remote employees.
Maddy Wilks, Agility Clinical, Inc
As the proportion of on-site to off-site employees shifts within a workplace, it becomes increasingly important to have tools in place to make communication between the two groups efficient. This paper prompts discussion about what to consider when working with remote employees, and answers questions about how to create a productive and healthy workspace from a "blended" employee population. Careful planning of communication tools, working hours, time off, and availability of each employee is important. Expectations for both on-site and remote programmers should be clearly defined. Time zone differences should be addressed. Timely recording and dissemination of project information is very important. Verbal and non-verbal communication that is second nature on-site needs to be intentionally extended to off-site employees. Some remote employees may not immediately absorb the culture or feel like part of the team, so socialization and interaction should be addressed in order to heighten job satisfaction. When all these issues are thought through and taught to both sets of employees, the team can work well together. A company equipped to get the best of both worlds will be well positioned for success in a global marketplace.
MS07 : Woops, I Didn't Know! An Elegant Solution to Let Your Entire Department Benefit from Individual Lessons Learned
Michael Hagendoorn, Amgen, Inc.
Annia Way, Amgen, Inc.
Tim Yerington, Amgen, Inc.
Rachel Bowman, Amgen, Inc.
After completing a clinical trial analysis, programming and biostatistics teams often produce lessons learned (LLs) based on the project they just finished. Such LLs are typically captured in slide decks or Word files, shared once as a presentation within the small team, and then stored in the individual study directory to soon be forgotten. As many LLs may actually apply to the entire department, the result is a treasure trove of wisdom lost: after a while, the audience won't remember these lessons or where to find them; new hires into the team may not know these lessons were ever presented; people in other product teams may not know an issue was already seen before; and people in other functions may not know they can improve processes that impact stats and programming. If such LLs don't translate to concrete, written changes to the departmental SOPs and manuals routinely referenced by all staff, then not only will the same mistakes be made all over again, but also what worked well will not be generalized. We will show how we successfully fixed this broken cycle via a small, lean departmental Continuous Process Improvement group which proactively seeks LLs from individual teams after completing each study report. The group then works hand in hand with each team to translate those into updates to departmental guidance documents such as SOPs, manuals, checklists, and templates. This way, staff in the entire department immediately benefit from the LLs encountered in any one study team!
MS08 : Mentoring and Oversight of Programmers across Cultures and Time Zones
Chad Melson, Experis
Positions that provide SAS® programmers the opportunity to do both programming and managing are not always available in the world of consulting. Fortunately, I was offered the role of liaison between clinical SAS® programmers located in Ukraine and a client located on the US West Coast, while also maintaining my own project responsibilities as a programmer. I also had the opportunity to mentor students from a University training program for Clinical SAS® programmers in Ukraine. These positions offered me a chance to mentor Ukrainian colleagues and apply and develop my technical and managerial skills. The liaison role was a new model for the client and there was some apprehension about the time difference and the unknowns of working with a team from a global region they had no prior experience with. Through specific examples, this paper will identify the skills needed to manage any type of client relationship, regardless of the geographical distance between stakeholders, and the value provided by mentoring to both the mentors and mentees. Also provided will be support for the ideas that in our virtual world, time zone differences can be much less of an issue than anticipated and that there are more similarities than differences between Ukraine- and US-based teams. I will also address how the oversight of the Ukraine programmers has evolved and encourage other programmers to pursue opportunities to mentor other programmers.
MS09 : Beyond "Just fix it!" Application of Root Cause Analysis Methodology in SAS® Programming.
Nagadip Rao, Eliassen Group
"Just fix it, we have too much to do" is a common phrase heard in organizations, and especially in SAS® programming departments. It is relatively easy to fix a programming issue each time it occurs rather than look into why it occurred in the first place. The problem with this approach is that there is a good chance a similar issue may occur and not be caught next time, compromising the validity of deliverables. This is analogous to treating a symptom every time it occurs without looking deeper to find the actual cause. In order to prevent systemic programming issues from occurring, it is essential to identify what, how, and why an undesirable event happened before taking steps to prevent it in the future. Root cause analysis (RCA) provides a systematic process for investigating and identifying root causes that can be controlled by the programming/management team with the intention of preventing recurrence. This paper explains how we utilized principles of RCA in statistical programming and reporting to identify root cause(s) and develop a corrective and preventive action plan (CAPA) after an undesirable event: an adverse event report with incorrect numbers was produced, passed validation and multiple study team reviews, and was submitted to a regulatory authority before the issue was found.
MS11 : Lead Programmer Needs Help: Dedicated Programming Project Manager to the Rescue!
Gloria Boye, Vita Data Sciences (a division of Softworld, Inc.)
Aparna Poona, Softworld, Inc. (Life Sciences)
Bhavin Busa, Vita Data Sciences (a division of Softworld, Inc.)
The Statistical Programming team functions as the engine that drives major components of statistical deliverables for a study, such as CDISC compliant datasets (SDTM/ADaM), TLFs, and associated submission documents. A typical late-stage clinical study could require a sizeable programming team, i.e., anywhere from 6 to 10 full-time resources. For such a study, there is an assigned Lead Programmer who is responsible for monitoring progress of the programming tasks and for ensuring the quality of the deliverables. In addition, he/she has to direct and distribute programming activities to the study team, who are either full-time or contract hires. Essentially, the Lead Programmer becomes responsible for both the technical and the resource management aspects of the project. However, coupling these responsibilities with managing a team which is largely virtual can be overwhelming for a single resource. This could result in mismanagement of the team resources and communication gaps, which affect the timeliness and quality of the programming deliverables. Having a distinct programming Project Manager to support the Lead Programmer will help offset the workload and can increase the oversight and the productivity of the study team programmers. In this paper, we will detail the distinct roles of a Lead Programmer and a programming Project Manager, and the importance of having both for successful management of the study deliverables and the resources.
MS12 : Sponsor Oversight of CROs Data Management and Biostatistical Abilities
Lois Lynn, Noven Pharmaceuticals, Inc.
Sponsors of phase I, II and III clinical trials who partner with full-service CROs for the management of a clinical trial are responsible for data quality and integrity; choosing a CRO is a critical decision on the way to FDA submission. In my experience, full-service CROs tend to be chosen based on the competence of their clinical operations department, which recruits study sites and cares for patients. The mission-critical oversight of and connection with the biostatistics and data management departments tend to be an afterthought, yet these teams provide the study documents and data submission package required for pharmaceutical compound submission to the FDA. This paper highlights key considerations for data management and biostatistical services from a full-service CRO, starting with the bid defense meeting through study conduct and final deliverables.
Posters
PO01 : CDISC Compliant NCA PK Parameter Analysis When Using Phoenix® WinNonlin®
Renfang Hwang, Celgene Corporation
Per FDA requirements, clinical and nonclinical studies that start after December 17, 2016 must use the data standards developed by the Clinical Data Interchange Standards Consortium (CDISC). To meet compliance with CDISC data standards and obtain submission-ready PK parameter domains, PP and ADPP*, the Pharmacokinetics group at Celgene Clinical Pharmacology has developed a process to ensure that, for NCA (Non-Compartmental Analysis) PK parameter analysis using WinNonlin®, the default NCA PK parameter names are accurately mapped and compliant with the PK parameter codelist and controlled terminology defined by CDISC. Using their pharmacokinetics expertise along with knowledge of CDISC data standards, the PK scientists have carefully built a complete mapping worksheet between default NCA PK parameter names and CDISC-compliant PK parameter codes. This PK parameter mapping Excel worksheet has been put into production for use when performing CDISC-compliant NCA PK parameter analysis with WinNonlin®. *PP: PK Parameter file in CDISC SDTM (Study Data Tabulation Model) format - the source dataset. ADPP: Analysis Data of PK Parameters in CDISC ADaM (Analysis Data Model) format - an analysis-ready dataset that contains treatment and demographic data in addition to PK parameter information.
PO02 : Takeaways from Integrating Studies Conducted by Bristol-Myers Squibb (BMS) and ONO
Yan Wang, Bristol-Myers Squibb
In addition to the mergers and acquisitions happening in the pharmaceutical industry, more international companies have been connected by working together to seek NDA or sNDA approvals. BMS and Japan's ONO have been working collaboratively on an oncology drug for cancer treatments, and the two companies have formed a strategic partnership that includes co-development, co-commercialization and co-promotion of multiple immunotherapies for patients with cancer. As a programmer, I have been heavily involved in preparing the submissions of the integrated study results to the FDA and European agencies. Throughout this data integration submission process, I have developed a deep understanding of the differences in presenting SDTM/ADaM datasets between the two organizations, and I am very pleased to share our approach to harmonizing the differences, along with our lessons learned and suggestions on preparing data integration from different sponsors.
PO04 : Implementing Patient-Reported Outcome data in clinical trials analysis
Qi Wang, Amgen
Over recent decades, Health-Related Quality of Life (HRQoL) end points have been increasingly adopted in oncology and hematology clinical trials. This paper will focus on analyzing HRQoL data accumulated from the European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC QLQ-C30) and the Acute Lymphoblastic Leukemia Symptom Scale (ALLSS). This paper will introduce various statistical methods to analyze the HRQoL data. The paper will also present some challenges and considerations in developing ADaM data specifications and performing statistical analysis.
PO05 : Data Transparency, de-identification strategies, and platforms available for sharing data
Arun Raj Vidhyadharan, inVentiv Health
Sunil Jairath, inVentiv Health
Many regulatory authorities now require clinical trial sponsors to make clinical trial data available. Data sharing from clinical trials can lead to better understanding and faster development and approval of medicines for rare diseases. One such initiative led to the launch of Project Data Sphere (PDS), a platform for sharing clinical trial data from cancer studies. Other available platforms include ClinicalStudyDataRequest.com (CSDR), Yale Open Data Access (YODA), and ClinicalTrials.gov. Data transparency in the pharmaceutical industry requires data de-identification to protect patient confidentiality and comply with governing laws. Before data can be shared for research and other purposes, it must first be de-identified. At a basic level, de-identification is a process by which a dataset is modified in such a way that someone reviewing the data cannot match an individual patient's data to a real-life individual. But the rules for de-identifying a dataset can be confusing: although we remove identifiers, dates, site IDs, etc., the data must still make sense when an analysis is run against it. This paper will discuss the concepts and techniques used for de-identifying data, data transparency, a few platforms for uploading clinical trial data, and future strategies for de-identification.
PO06 : Statistician's secret weapon: 20 ways of detecting raw data issues
Lixiang Liu, Eli Lilly
Unclean clinical raw data is always a nightmare for statisticians and statistical programmers in all the downstream SDTM, ADaM, and TFL development work. Raw data issues can break programming logic and trigger OpenCDISC reject, error, and warning messages; worst of all, if incorrect data is analyzed, the study team could draw wrong or inaccurate conclusions regarding a drug's safety and efficacy, which could put patients' safety in jeopardy and have a significant impact on the company's financial status. This paper will review 20 effective ways of detecting raw data issues. Since they apply to drug dispensing, labs, and safety-related CRF data (including adverse events, medical history, concomitant therapy, drug exposure, etc.), which are common to all clinical trials, these methods and their associated SAS programs can easily be used for clinical trial studies across different therapeutic areas.
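As one generic illustration of such a check (a sketch in the spirit of the abstract, not taken from the paper; dataset and variable names are assumptions), a single sort plus DATA step can flag duplicate lab records for review:

```sas
/* Illustrative raw-data check: flag duplicate subject/test/visit
   records in a raw lab dataset (names assumed, not from the paper) */
proc sort data=raw_lb out=lb_sorted;
  by usubjid lbtestcd visit;
run;

data lb_dups;
  set lb_sorted;
  by usubjid lbtestcd visit;
  /* keep only records that are not unique within subject/test/visit */
  if not (first.visit and last.visit);
run;
```

Any non-empty lb_dups dataset would then be queried back to data management.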
PO07 : ClinicalTrials.gov Results: an End of Study Deliverable That Should Be Considered at Study Startup
Maya Barton, Rho
Elizabeth Paynter, Rho
Results submission to ClinicalTrials.gov is a deliverable required by law for most clinical trials. The deadline occurs one year from reaching the primary endpoint Last Patient Last Visit (LPLV) or the study LPLV. If the primary endpoint LPLV occurs prior to the study LPLV, this leads to two submissions to ClinicalTrials.gov. Since this milestone differs from study to study, the optimum time point to start working on this deliverable needs to be considered. Additionally, the submission typically pulls in data from multiple sources which are populated at different time points during the study lifecycle. We propose using early planning and standardization at study startup so that, with the press of a button, results will be generated for ClinicalTrials.gov for review and upload to the website. With this in mind, we created a template program using SAS Enterprise Guide 7.1 that enables the study team to map the data from multiple sources into one. Using the published standard template from ClinicalTrials.gov, we developed generic, standard datasets and tables that incorporate the fields and formats specified in the ClinicalTrials.gov template. This eliminates manual entry of study protocol information that may not be used as titles, headers, or footnotes in everyday data cleaning and biostatistical analysis datasets and displays. When working on multiple studies, setting up email alert programs for upcoming ClinicalTrials.gov milestones is also essential. This monitors the clinical data status and helps in planning the generation and review for correct and timely submission.
PO08 : Programming Role in In-licensing process
Asif Karbhari, AstraZeneca
Over the past few years, the pharmaceutical industry has undergone several acquisitions/divestments and/or product partnerships as companies focus on strategic niches. For "in-licensing", there are two sides to the coin: it helps the selling company strategize its portfolio and the acquiring company expand its own. When a company acquires another compound, it is because it sees potential for long-term growth to build or enhance the foundation of the product. To make that a success, it is key to understand what clinical data is already available and what can be done with it, and who better to know these important aspects than Programming? I plan to elaborate on how Programming plays a valuable role in making sure that inherited data is workable and reusable, and in assessing how closely it aligns with CDISC standards to help plan for future deliverables. I am a Lead Programmer who oversees two in-licensed compounds acquired from different companies, each at a different stage of transition. I plan to share some of my experiences, challenges faced, and lessons learned in the "in-licensing" process from a programming perspective.
PO09 : AIR Binder: An Automatic Reporting and Data Analysis SAS Application for Cytochrome P450 Inhibition Assay to Investigate DDI
Hao Sun, Covance, Inc.
Kristen Cardinal, Covance, Inc.
Carole Kirby, Covance, Inc.
Richard Voorman, Covance, Inc.
We developed a SAS-based application, called AIR Binder (Automatic Inhibition Reporting Binder), which integrates a series of SAS macros to perform data analysis and visualization, and generate report-ready tables and figures for the cytochrome P450 (CYP) inhibition assay, a widely-used in vitro ADME assay that provides key supporting data for drug discovery and development. The dataset from this assay was designed to have only 3 columns (assay/sample identifier, concentration, and activity) to simplify data management/storage and QC process. AIR Binder has been successfully used to analyze this type of dataset, generate required statistics, make report-ready comprehensive tables, and plot high-quality grouped figures. The program calculates IC50 values using non-linear fitting algorithms based on several variations of a 4-parameter logistic (4PL) model. AIR Binder is a standardized and customized program for efficiently reporting CYP inhibition data, significantly reducing turnaround time, eliminating unnecessary errors due to data transfer, minimizing labor cost, and enhancing productivity.
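For reference, the textbook form of the 4-parameter logistic model mentioned above is shown below (this is the standard parameterization; the exact variants AIR Binder fits may differ):

```latex
% Standard 4PL concentration-response model:
%   y = observed enzyme activity, x = inhibitor concentration
%   A = top asymptote, D = bottom asymptote, B = Hill slope
y = D + \frac{A - D}{1 + \left( x / \mathrm{IC}_{50} \right)^{B}}
```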
PO10 : Charting Your Path to Using the "New" SAS® ODS and SG Graphics Successfully - Interactively Generate the Code
Roger Muller, Data To Events, Inc
SAS® Output Delivery System (ODS) Graphics started appearing in SAS® 9.2. Collectively these new tools were referred to as "ODS Graphics," "SG Graphics" and "Statistical Graphics". When first starting to use these tools, the traditional SAS/GRAPH® software user might come upon some very significant challenges in learning the new way to do things. This is further complicated by the lack of simple demonstrations of capabilities. Most graphs in training materials and publications are rather complicated graphs that, while useful, are not good teaching examples for starting purposes. This paper contains many examples of very simple ways to get very simple things accomplished. Many different graphs are developed using only a few lines of code each, using data from the SASHELP data sets. The use of the SGPLOT, SGPANEL, and SGSCATTER procedures is shown. In addition, the paper addresses those situations in which the user must alternatively use a combination of the TEMPLATE and SGRENDER procedures to accomplish the task at hand. Most importantly, the use of the "ODS Graphics Designer" as a teaching tool and a generator of sample graphs and code is covered. This tool makes use of the TEMPLATE and SGRENDER procedures, generating Graphics Template Language (GTL) code. Users become productive extremely quickly. The emphasis in this paper is the simplicity of the learning process. Users will be able to take the generated code and run it immediately on their personal machines.
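In the spirit of the simple examples the abstract describes (a generic sketch, not code from the paper), a minimal SGPLOT call against a SASHELP dataset looks like:

```sas
/* Minimal ODS Graphics example using a built-in SASHELP dataset */
ods graphics on;
proc sgplot data=sashelp.class;
  scatter x=height y=weight / group=sex;  /* color points by sex */
run;
```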
PO11 : Facebook Data Analysis with SAS® Visual Analytics
Prasoon Sangwan, TATA Consultancy Services Ltd
Vikrant Bisht, TATA Consultancy Services Ltd
Piyush Singh, TCS
Ghiyasuddin Mohammed Faraz Khan, Sapphire Software Solutions Inc.
Nowadays, social media data plays a key role in understanding market trends and user sentiment. Organizations want to know users' choices, needs, and feedback, and social media platforms like Facebook are a very common source of such data. This paper demonstrates how SAS® Visual Analytics can be used to analyze Facebook data and create new reports from it, which helps management understand the market and users' preferences. It also demonstrates word cloud analysis on Facebook data using SAS® Visual Analytics. This paper contains some techniques which can be very helpful for SAS coders, who can write their own SAS code for analytics. Users can bring Facebook data into a client such as SAS® Enterprise Guide and analyze it however they want with their own SAS code.
PO12 : It's time to Time-to-Event Analysis!
Arun Raj Vidhyadharan, inVentiv Health
Sunil Jairath, inVentiv Health
Using time-to-event analysis methodology requires careful definition of the event, censored observations, provision of adequate follow-up, number of events, and independence or "noninformativeness" of the censoring mechanisms relative to the event. Design and Analysis of Clinical Trials with Time-to-Event Endpoints provides a thorough presentation of the design, monitoring, analysis, and interpretation of clinical trials in which time-to-event is of critical interest. This paper talks about the various efficacy endpoints in clinical trials that contribute to time-to-event analysis and how they are organized in the ADaM dataset ADTTE.
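As a generic illustration of how an ADTTE-style dataset feeds such an analysis (a sketch, not from the paper; dataset and variable names follow common ADaM conventions but are assumptions here), a Kaplan-Meier comparison across treatment groups is one PROC away:

```sas
/* Illustrative Kaplan-Meier analysis of an ADTTE-style dataset */
proc lifetest data=adtte plots=survival;
  time aval*cnsr(1);   /* CNSR=1 flags a censored observation */
  strata trtp;         /* compare planned treatment groups */
run;
```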
PO13 : Developing ADaM Dataset for Cardiovascular Outcome Studies
Rakesh Kumar, Mr.
David Wade, Mr.
David Chen, Mr.
CDISC released the Therapeutic Area Data Standards User Guide for Cardiovascular Studies (TAUG-CV) Version 1.0 (Provisional) in July 2014. The TAUG-CV describes the most common data needed for ACS or for reporting cardiovascular endpoints, and how to store these data in SDTM datasets. This paper presents a method to create an ADaM dataset in the Basic Data Structure for clinical events reporting following the TAUG-CV.
PO14 : Mindfulness at Work: Handling Stress and Changes Gracefully and Be the Leader You Were Born To Be
Helena Ho, AstraZeneca
In the pharmaceutical industry, change has become the norm and the pace is getting faster. Based on a PwC National survey, chances are you are part of the 1/3 of Americans who feel burnt out at work. But considering we spend about half of our waking life at work, why not change the attitude? In this paper, we discuss some of the latest research surrounding Mindfulness and how we can use this method to help us break unhelpful stress patterns, deal with changes gracefully and to let the leader within you shine forth in even the most stressful times.
PO15 : Pilot Meta-Analysis of HPA Axis Suppression Studies on Topical Corticosteroids using ADaM Datasets derived from Legacy Data
Lillian Qiu, FDA
Hon-Sum Ko, FDA
Topical corticosteroids are widely used in dermatologic therapy because of their anti-inflammatory and anti-pruritic effects. They are categorized into "potency" classes according to their effect on dermal erythema, which is based on vasoconstrictive activity. There is suggestion that such "potency" may be related to systemic effects such as the suppression of hypothalamic-pituitary-adrenal (HPA) axis. Six HPA axis suppression studies have been included for a pilot safety meta-analysis for topical corticosteroid products. These studies involved the use of topical corticosteroids across a spectrum of "potencies" in patients with psoriasis or atopic dermatitis. ACTH stimulation and plasma cortisol measurements were conducted before and after a period of use of the topical corticosteroid. To perform the pilot safety meta-analysis, ADaM ADLB datasets, derived via SAS programming for the conversion of legacy data of these studies, were combined, also using SAS programming. HPA axis suppression was defined either by plasma cortisol level after ACTH stimulation alone (one criterion) or by a combination of basal, post-stimulation, and rise in plasma cortisol levels (3 criteria). The 90% and 95% confidence intervals for the proportion of subjects showing HPA axis suppression were evaluated. This pilot study illustrates the power of using standardized datasets across studies for the comparison of corticosteroid activity, and the results present a consistency of relationship between dermal vasomotor "potency" and systemic effect (HPA axis suppression).
PO16 : A Precision-strike Approach to Submission Readiness: How to Prepare Your Filing Teams for Consistent Excellence
Michael Hagendoorn, Amgen, Inc.
Tony Chang, Amgen, Inc.
Getting an electronic submission through the door at FDA used to be a Fantastic Journey of Discovery for our programming and biostatistics teams. We faced several challenges: lack of a consistent process around electronic filing deliverables across products; varying levels of filing experience between teams, causing painful re-learning for each submission; beneficial best practices and lessons kept within each filing team and not proactively shared with other teams; and variable understanding and engagement from other functions in planning programming-related filing deliverables. To address these issues, we established a small Submission Consultancy Group (SCG) consisting of 2-3 experienced filing lead programmers. The remit of SCG is to: define straightforward standards around submission deliverable preparation to be referenced by all filing teams; proactively advise filing teams on planning and assembling all statistical programming filing deliverables, and review status and content of these deliverables to help optimize package reviewability at FDA; work with data standards groups to prospectively plan standards adoption for each filing; continually take best practices and lessons from each filing team and proactively share those with all other teams; and communicate in a higher-level capacity with other functions such as Clinical and Regulatory to clarify stat-programming roles and contributions in filings to promote cross-functional understanding and engagement. Importantly, SCG only advises; decision-making itself remains squarely within each filing team. Our approach has proven to be highly successful over the years. Here, we will share what worked well and what did not!
PO17 : The Power of Interleaving Technique in Data Manipulation
Zongming Pan, ConnectiveRx
Two or more sorted datasets can be combined into a new dataset by means of a SET statement, instead of a MERGE statement, and an accompanying BY statement. This process is called interleaving. It has been suggested that interleaving a dataset with itself is a very useful technique for certain kinds of tasks (Schreier, H.), the most popular being processing based on group summarization or analysis. This 'looking forward' characteristic can be applied to efficiently clean and prepare data. Like many other datasets, medicine prescription data from pharmacies is usually 'dirty', containing, for example, missing or extreme values. It must be cleaned at the patient (higher) level rather than the script level before any statistical analyses are carried out in medical marketing studies. In addition, it is quite common that, because of the specific requirements of a project, many other kinds of cleaning/manipulation are needed. Here it is shown that the interleaving technique together with the RETAIN statement has advantages over the conventional 'sort and merge' process in cleaning and preparing medicine prescription data: (1) it is much neater, requiring just one DATA step, and (2) it is more efficient in terms of CPU time and usage. If the dataset is larger and more items need to be cleaned, the difference in efficiency between the interleaving technique and the conventional 'sort and merge' process is even more significant.
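As a generic sketch of the self-interleaving idea (dataset and variable names are assumptions, not code from the paper), a per-patient summary can be computed and attached in a single DATA step by listing the same sorted dataset twice on the SET statement:

```sas
/* Interleave rx with itself: within each patient, all records from
   the first copy (in=first) arrive before the second copy, so the
   first pass accumulates a per-patient count and the second pass
   outputs each record with that count attached - no PROC MEANS
   plus MERGE needed (names assumed, not from the paper) */
data rx_counted;
  set rx(in=first) rx;          /* same sorted dataset listed twice */
  by patient_id;
  retain n_scripts;
  if first.patient_id then n_scripts = 0;
  if first then n_scripts + 1;  /* count only on the first pass   */
  else output;                  /* output only on the second pass */
run;
```

The same pattern, with RETAIN holding other per-group statistics, supports the 'looking forward' cleaning the abstract describes.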
PO18 : A custom ADaM domain for time to event analysis in adverse events
John Saida Shaik, Seattle Genetics Inc.
One of the primary objectives of first-in-human oncology clinical trials is to evaluate the safety of a drug and assess its maximum tolerated dose (MTD). To evaluate safety and the MTD, it is important to study all the adverse events occurring during the trial. In particular, adverse events of severity grade 3 and above and adverse events that recur need to be analyzed in detail. In addition, when an AE first occurred in a patient and how long it took to improve to a better grade and/or resolve completely are endpoints of interest. In this paper, I propose a custom ADaM domain to capture adverse event onset, improvement, and resolution information following the Basic Data Structure (BDS) as described in the CDISC ADaM Implementation Guide (v1.0). This approach facilitates preparing Kaplan-Meier and time-to-event analysis TLFs in a 'one proc away' fashion and reduces the time and effort spent on validation.
PO19 : Analyzing Singly Imputed Utility Based Weighted Scores Using EQ-5D for Determining Patients' Quality of Life in Oncology Studies
Vamsi Krishna Medarametla, Seattle Genetics
Liz Thomas, Seattle Genetics
Gokul Vasist, Seattle Genetics
Quality of Life (QoL) analysis provides complementary information to standard safety and efficacy analysis in clinical trials. This analysis allows regulators to take patient-reported outcomes (PROs) on quality of life into account when making decisions about drug approvals and labeling. A standard technique in QoL analysis is to combine patient-reported outcome data collected on a questionnaire into a single utility score. Missingness in PRO data is common, and must be addressed and accounted for in analysis using an imputation method. The EQ-5D is a standardized instrument for use as a measure of health outcome. Analysis of the EQ-5D requires computation of a weighted utility score. Country-specific time trade-off (TTO) weights are commonly used as well. Calculating the index weighted scores of the EQ-5D profiles under several different single imputation schemes will help in understanding the state of a patient's health. This paper demonstrates computation of the US and UK weighted TTO EQ-5D scores using LOCF, WOCF, Time point average and death only single imputation methods for a variety of time point indexed analyses.
PO20 : Dealing with alignment of statistics: Beyond the decimal point.
Eric Pedraza, INC Research
The issue: when summarizing continuous variables, we need to take several factors into account, such as dynamically assessing the required formats to obtain the desired alignment, values comprised of 2 statistics, a text to display in case the statistic cannot be computed, and what is going to be aligned and in relation to what (e.g., a comma in a value comprised of 2 statistics aligned to the decimal point of single-value statistics). The solution: these are time-consuming tasks that can be simplified by using a macro to take care of the issues mentioned above, while taking very little of the programmer's time to parametrize. This allows customizing the look and alignment of each single statistic independently, so it can be used for different studies with different display requirements. The objective of this poster: I intend to explain the methods used to resolve the issues mentioned above and the way the macro was created, its pros and cons, things to consider prior to implementing it, as well as how this method can help decrease the time spent generating tables of analyses of continuous variables. The code: below I present the code I used to create the macro, along with 2 examples of how it can be implemented. (Please note I still need to polish things like the parameter names, to be more self-descriptive, and add comments for easier reading of the macro.)
PO21 : Metadata of Titles, Footnotes, and Proc Report Details
Julius Kirui, SCRI
Biostatisticians devote a substantial amount of research time to developing and reviewing mock shells for tables, figures, and listings (TFLs). The information provided by the mock shells includes titles, footnotes, and TFL column widths and header labels. Detailing these mock shells into a properly aligned, grammar- and spelling-error-free document is quite involved and tedious. Several SAS users have written and shared their thoughts on how titles and footnotes can be adopted and used for TFL production with very minimal manual involvement. Repetitive manual transfer of mock shell information is labor-intensive and error-prone. This paper will discuss and provide a SAS 9.4 macro program to generate ready-to-use metadata of titles, footnotes, and PROC REPORT details from a standard set of TFL mock shells housed in an Excel workbook. The metadata will be stored as a SAS data set, or a macro parameter can be enabled to produce an editable version stored in an Excel workbook. The ability to duplicate this information in the final data summaries will eliminate the majority of programming source errors such as typos, missing or irrelevant titles and footnotes, and shifted PROC REPORT details. Keywords: TFL, tables, figures, and listings
PO22 : SAS and R Playing Nice Together
David Edwards, Amgen, Inc.
Bella Feng, Amgen, Inc.
Brian Schultheiss, Amgen, Inc.
Like many other statistical programming departments in the pharma/biotech industry, Amgen primarily uses SAS for data manipulation, data analysis, and reporting. In recent years, other statistical software, such as R, has matured significantly and in some situations provides a compelling alternative to SAS. To provide statistical programmers with seamless integration between SAS and R, Amgen has considered several different approaches, including SAS/IML, MineQuest's software A Bridge to R for SAS Users, and even simply using a SAS X command to shell out and execute R from the command line. However, because at Amgen our SAS Grid and qualified R environments are hosted on different physical servers, none of these techniques offered an ideal solution. To address this problem, Amgen decided to leverage the Microsoft DeployR package to embed the results of R routines within the SAS programs we use to analyze clinical trial data. This paper will provide an overview of how Amgen implemented the integration between SAS and R and touch briefly on other use cases for DeployR. References: Edwards D., Nelson G., and Wang, S.: Modern SAS Programming: Using SAS Grid Manager and Enterprise Guide in a Global Pharmaceutical Environment. Paper presented at PharmaSUG, 2013. Muenchen R. A.: R for SAS and SPSS Users, Springer, 2008.
PO23 : Utilizing a standard program template to aid in easy program maintenance.
Kurtis Cowman, PRA Health Sciences
It's been said that only 10% of your job is done once 90% of a program has been written. With studies ranging in duration from months to decades, nowhere is this more apparent than in programs written for the pharmaceutical industry. Over these long periods, programming leads will often see support programmers come and go, each with their own programming style, leaving their own marks on programs. Because of this, maintaining programs can become quite burdensome and confusing, especially in the absence of good notes and comments. By utilizing simple programming techniques and good programming practices, this presentation aims to show that with a standard project program template you can maximize your efficiency by giving a familiar feeling, layout, and style to all the programs on your project. All this without jeopardizing the industry-standard independent double programming.
PO24 : A Guide To Programming Patient Narratives
Renuka Tammisetti, PRA Health Sciences
Karthika Bhavadas, PRA Health Sciences
The ICH-E3 guidelines require patient narratives, which are targeted patient profiles of clinical importance. Patient narratives describe death, other serious adverse events, and certain other significant adverse events judged to be of special interest, collected for a subject over the course of a clinical trial. The SAS programmer is expected to provide key data information to the medical writer. The medical writer will review patient profiles that coincide with an event of interest and address the safety concerns of interest at the patient level. The poster will provide helpful insight into the traditional process of narrative generation and how to gather the information required to program narratives.
PO25 : A unique method of deriving time variables in PK/PD studies
Yingqiu Yvette Liu, PA
In PK/PD studies, deriving time variables such as Nominal Time Since First Dose (NTSFD), Nominal Time Since Last Dose (NTSLD), actual Time Since First Dose (TSFD), actual Time Since Last Dose (TSLD), Time Deviation to TSFD (TSFDDEV), and Time Deviation to TSLD (TSLDDEV) is a very common task in the creation of analysis datasets that facilitate PK/PD modeling and clinical study reports. The author will introduce a unique and efficient method of deriving time variables in the paper. The SAS code demonstrated can be used in SAS software after version 9.1. The paper is intended for audiences of all skill levels.
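As a rough illustration of the kind of derivation this abstract describes (the paper itself uses SAS; the dosing times and units below are invented for the sketch), actual time since first and last dose can be computed as:

```python
from datetime import datetime

# Hypothetical dosing times and one sampling time for a single subject
doses = [datetime(2017, 1, 1, 8, 0), datetime(2017, 1, 2, 8, 0)]
sample = datetime(2017, 1, 2, 14, 0)

# Actual Time Since First Dose (TSFD), in hours
tsfd = (sample - doses[0]).total_seconds() / 3600

# Actual Time Since Last Dose (TSLD): measured from the latest dose at or before the sample
prior = [d for d in doses if d <= sample]
tsld = (sample - prior[-1]).total_seconds() / 3600

print(tsfd, tsld)  # 30.0 6.0
```

The nominal counterparts (NTSFD, NTSLD) would come from the protocol-scheduled times rather than the recorded ones, and the deviation variables are simply the differences between actual and nominal values.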
PO26 : REDCAP Ins and Outs
Leanne Goldstein, City of Hope
REDCap, which stands for Research Electronic Data Capture, is making waves around the world as the latest way to capture data at academic institutions. Its biggest benefit is that it provides multi-user, web-based databases with audit trails and reporting. It is an improvement over the MS Access databases and MS Excel workbooks that investigators have historically used to capture data in academic settings. Application programming interfaces (APIs) have been developed for programmers to get data in and out of REDCap, but the SAS community has been somewhat left behind. This paper provides an introduction to REDCap for the SAS programmer and discusses methods of getting data into and out of REDCap databases.
PO27 : Roadmap for Managing Multiple CRO Vendors
Veena Nataraj, Shire
Karin Lapann, Shire
When working with CROs, the relationship established between the two organizations is critical for successful outcomes. Increasingly, sponsors and CROs exist in the same ecosystem, looking to share standards, processes, and people. Many times, a sponsor works with multiple CROs, partnering with some while having a study-to-study relationship with others. Based on the type of relationship, a CRO can provide certain functions to the sponsor. As with any relationship, investing time to understand and establish ground rules will help in building a healthy relationship. This paper discusses the relationship of a sponsor to a CRO as a partner, with high-level information on how information flows back and forth. Also discussed are implementations of CDISC standards using available tools such as published standards, sponsor interpretation guides, and document templates. It also notes advantages and pain points, and how these can be managed to ensure a successful partnership.
Quick Tips
QT01 : Remove the Error: Variable length is too long for actual data
Eric Larson, inVentiv Health
Have you ever seen the error from Pinnacle 21 output 'Variable length is too long for actual data'? I have created a macro that will go through the character variables and identify the minimum length to hold the longest value of the data, and apply that length to the dataset just before creating a SAS transport file from it.
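The core of such a macro is computing, for each character variable, the length of its longest value. A minimal sketch of that step (in Python, as an analogue of the SAS macro the abstract describes; the records and variable names are invented):

```python
# Invented records; in SAS these would be a data set's character variables
records = [
    {"AETERM": "HEADACHE", "AESEV": "MILD"},
    {"AETERM": "NAUSEA", "AESEV": "MODERATE"},
]

# Minimum length needed for each variable = length of its longest observed value
min_lengths = {
    var: max(len(rec[var]) for rec in records)
    for var in records[0]
}
print(min_lengths)  # {'AETERM': 8, 'AESEV': 8}
```

In the SAS setting, these computed lengths would then be applied with LENGTH statements (or ATTRIB) before writing the transport file, which is what clears the Pinnacle 21 finding.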
QT02 : Importing CSV Data to All Character Variables
Art Carpenter, CA Occidental Consultants
Have you ever needed to import data from a CSV file and found that some of the variables have been incorrectly assigned to be numeric? When this happens to us we may lose information and our data may be incomplete. When using PROC IMPORT on an EXCEL file we can avoid this problem by specifying the MIXED=YES option to force all the variables to be character. This option is not available when using IMPORT to read a CSV file. Increasing GUESSINGROWS can help, but what if it is not sufficient? It is possible to force IMPORT to only create character variables. Although there is no option to do this, you can create a process that only creates character variables, and the process is easily automated so that no intervention is required on the user's part.
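For comparison, reading every field as text is the default behavior of Python's csv module; this is a rough analogue of the all-character result the SAS process described here achieves (the data are invented):

```python
import csv
import io

# A small in-memory CSV; note "007" would lose its leading zeros if read as numeric
data = io.StringIO("id,dose\n007,10.5\n008,20\n")
rows = list(csv.DictReader(data))

# Every value arrives as a string, so nothing is coerced or truncated
print(rows[0])  # {'id': '007', 'dose': '10.5'}
```

Keeping values as character preserves leading zeros and mixed-type columns, which is exactly the information loss the abstract warns about when IMPORT guesses a numeric type.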
QT03 : What Are Occurrence Flags Good For Anyway?
Nancy Brucken, inVentiv Health Clinical
The ADaM Structure for Occurrence Data (OCCDS) includes a series of permissible variables known as occurrence flags. These are optional Y/null flags indicating the first occurrence of a particular type of record within a subject. This paper shows how occurrence flags can be used with PROC SQL to easily produce tables summarizing adverse events (AEs) by System Organ Class (SOC) and dictionary preferred term.
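As a toy illustration of what such flags enable (a Python stand-in for the PROC SQL step; the flag name follows ADaM conventions, but the records are invented): because the flag marks only the first qualifying record per subject, counting flagged records directly gives subject counts without any deduplication logic.

```python
# Invented AE records; AOCCSFL = 'Y' marks the first occurrence of each SOC per subject
records = [
    {"USUBJID": "001", "AESOC": "Nervous system disorders", "AOCCSFL": "Y"},
    {"USUBJID": "001", "AESOC": "Nervous system disorders", "AOCCSFL": ""},
    {"USUBJID": "002", "AESOC": "Nervous system disorders", "AOCCSFL": "Y"},
]

# Number of subjects with at least one AE in the SOC = number of flagged records
n_subjects = sum(1 for r in records if r["AOCCSFL"] == "Y")
print(n_subjects)  # 2
```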
QT04 : Is There a Date In That Comment? Use SAS To Find Out.
Keith Hibbetts, Eli Lilly and Company
When anonymizing data from clinical trials, it's critical to identify any dates contained in the data. This can be a challenge when dealing with data that came from free form text entry, such as comments, where a date could be one small piece of a larger text string. This paper will show how utilizing Perl Regular Expressions in SAS code can identify records that likely contain a date.
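A minimal illustration of the idea (Python's re module here, standing in for the SAS PRX functions the paper uses; the pattern is deliberately simplified and would need extension for the many real-world date formats):

```python
import re

# Matches ISO-style dates such as 2017-03-23 embedded anywhere in free text
date_pattern = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

comments = [
    "Subject rescheduled visit to 2017-03-23 due to travel.",
    "No issues reported.",
]

# Flag records that likely contain a date and need review before anonymization
flagged = [c for c in comments if date_pattern.search(c)]
print(len(flagged))  # 1
```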
QT05 : My bag of SAS lifehacks
Dmytro Hasan, Experis
Have you ever come across anything special in another person's code that made your brain freeze? If your answer is yes, then I hope this article will be of interest to you. Sometimes it can be very difficult to grasp the message the programmer wanted to convey in their code. In this article, I would like to share my remarks about the issues that I find the most interesting. I have caught sight of some of them in other people's SAS programs or read about them in different books, and have used them in my own programs. We will consider a missing RUN statement at the end of the data step and calling procedures without specifying the dataset, which shortens the code. Some features of PROC SQL will be analysed which may seem non-trivial to programmers less experienced with this procedure. We will go back to the beginning and try to find out what COMMENT does and in what ways it can be helpful. I would also like to draw the reader's attention to two functions, namely ifn() and input(), and how they handle missing values, and to how byte() allows you to make the output more refined. Moreover, I would like to outline the utility of the -rcl options of put(). I will demonstrate how to make a musical notification for data issues, compilation errors, and the end of compiling with sound().
QT06 : Generating Colors from the Viridis Color Scale with a SAS® Macro
Shane Rosanbalm, Rho, Inc
In this paper we present a SAS® macro capable of producing a list of RGB color values from the Viridis color scale. The Viridis color scale was originally designed for Matplotlib as an open-source alternative to Matlab's proprietary Parula color scale, which itself was a replacement for the oft-criticized Jet color scale. Viridis is designed to be: colorful, spanning as wide a palette as possible so as to make differences easy to see; perceptually uniform, meaning that values close to each other have similar-appearing colors and values far away from each other have more different-appearing colors, consistently across the range of values; robust to colorblindness, so that the above properties hold true for people with common forms of colorblindness, as well as in grey-scale printing; and pretty, oh so pretty.
QT07 : PROC DOC III: Self-generating Codebooks Using SAS®
Louise Hadden, Abt Associates Inc.
This paper will demonstrate how to use good documentation practices and SAS® to easily produce attractive, camera-ready data codebooks (and accompanying materials such as label statements, format assignment statements, etc.) Four primary steps in the codebook production process will be explored: use of SAS metadata to produce a master documentation spreadsheet for a file; review and modification of the master documentation spreadsheet; import and manipulation of the metadata in the master documentation spreadsheet to self-generate code to be included to generate a codebook; and use of the documentation metadata to self-generate other helpful code such as label statements. Full code for the example shown (using the SASHELP.HEART data base) will be provided.
QT09 : Basic SDTM House-Keeping
Emmy Pahmer, inVentiv Health
When creating SDTM datasets, there are a few simple checks that can be performed to help spot potential problems. These checks will be presented here. They are generally for common programming errors, not for checking the validity of source values or cross-checking values against others. In other words, basic house-keeping.
QT10 : Everyone can use a little Currency - when dependent data set updates silently make your analysis data set out of date.
Scott Worrell, PAREXEL
Nearly every analysis data set has dependencies on multiple other data sets. The dependent data sets may be raw data sets or other analysis data sets. At times a dependent data set is updated, but the person in charge of updating the analysis data set is not notified of the modification to the dependent data set. This makes the analysis data set with the data dependency invalid. In turn, this condition produces data integrity issues in every Table, Listing, or Figure (TLF) that uses the analysis data set. Before running TLFs, it is a good idea to verify that all dependent data sets are older than or the same age as the analysis data set(s) being updated. The Data Currency Utility detects data set currency issues and reports them so that they can be corrected through an update run of the analysis data set creation program(s). This presentation is appropriate for persons with all levels of SAS experience, but assumes a basic understanding of SAS's Dictionary Tables. The utility was written and tested using SAS 9.4 on the Linux operating system.
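The core check such a utility performs, verifying that no dependency is newer than the analysis data set, can be sketched as follows (a Python stand-in using file modification times; the SAS utility itself works from Dictionary Tables, and the file names here are invented):

```python
import os
import tempfile
import time

# Create a "dependent" file, then an "analysis" file slightly later
tmp = tempfile.mkdtemp()
dep = os.path.join(tmp, "raw_ae.csv")
adam = os.path.join(tmp, "adae.csv")
open(dep, "w").close()
time.sleep(0.01)
open(adam, "w").close()

# The analysis data set is current if every dependency is older than or the same age as it
def is_current(analysis, dependencies):
    mtime = os.path.getmtime(analysis)
    return all(os.path.getmtime(d) <= mtime for d in dependencies)

print(is_current(adam, [dep]))  # True
```

A real utility would report each stale dependency by name so the analysis data set creation program can be rerun.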
QT11 : Multi-Dimensional Arrays: Add Derived Parameters
Siddharth Kumar Mogra, GCE Solutions Inc
Multidimensional arrays are useful when data needs to be grouped into a table-like arrangement for processing. Often we are required to derive and add records in analysis datasets. When there are large numbers of parameters to be added, the use of repeated if-then statements is error-prone and time-consuming. Effective use of multi-dimensional (nested) arrays can increase a program's efficiency and minimize code. This paper demonstrates situations where the use of multi-dimensional arrays is practical and appropriate in clinical trials.
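A rough analogue of the idea in Python (the paper's technique uses SAS nested arrays; the parameter names and rules below are invented): instead of one if-then branch per derived parameter, the derivation rules sit in a table that a single loop walks through.

```python
# Invented derivation rules: (parameter code, source variables to sum)
rules = [
    ("BMI_NUM", ["WEIGHT"]),
    ("TOTDOSE", ["DOSE1", "DOSE2"]),
]

record = {"WEIGHT": 70, "DOSE1": 10, "DOSE2": 20}

# One loop derives every parameter, replacing repeated if-then blocks
derived = {param: sum(record[v] for v in sources) for param, sources in rules}
print(derived)  # {'BMI_NUM': 70, 'TOTDOSE': 30}
```

Adding a new derived parameter then means adding a row to the rule table rather than another block of conditional code, which is where the error reduction comes from.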
QT12 : Same statistical method different results? Don't panic the reason might be obvious.
Iryna Kotenko, Experis Clinical, A Manpower Group Company
Modern statisticians and statistical programmers can use a variety of analytics tools to perform a statistical analysis. When performing quality control checks or verification analysis using different statistical packages for the same statistical method, one may receive unexpectedly unequal results. So the question comes: what is wrong with the calculations? The answer might disconcert: the calculations are valid, and the root of the discrepancies is the difference in computational methods and default settings implemented in each statistical package. The aim of this article is to raise awareness among the audience about known inconsistencies in computational methods in commonly used analytics tools: SAS, Python, R, and SPSS.
QT13 : Insignias Inside your Reports
Salmanali Momin, Mr
In clinical trial data analysis, presentation of data plays a vital role. This is achieved by creating Tables, Listings, and Figures. The CRO (Contract Research Organization) industry deals with various sponsors and their variety of requirements, which often include representation of company logos or other study-specific watermarks on the reports. Furthermore, for studies planned to have more than one stage of submission, it may be advised to augment the intermediate reports with badges symbolizing the respective study stage, e.g., draft, interim analysis, or a DSMB (Data Safety Monitoring Board) submission. This paper will discuss methods to imprint company logos and watermarks on study reports and explore the useful procedures.
QT14 : Visual Basic for Applications: A Solution for Handling Non-ASCII Removal and Replacement
Eric Crockett, Chiltern International
CDISC submission data and documentation should not include non-ASCII or non-printable characters. However, the source information used to support the data and documentation often (and unavoidably) includes such characters. It is common for companies to leverage SAS macros to address this issue; but when the source information is present in the mapping specification, as, for example, it often would be for Trial Design domains such as TI, the best approach may be to programmatically address the issue within the mapping specification. An example of how special characters can be removed and/or replaced at the specification level using a Microsoft Office Visual Basic for Applications (VBA) macro will be shared, and the practical advantages of using that approach will be discussed.
QT15 : How to Create a Journal-Quality Forest Plot with SAS® 9.4
John O'Leary, Department of Veterans Affairs
Warren Kuhfeld, SAS
This paper offers a solution for creating a high quality forest plot that is befitting even the most prestigious scientific journals. Building on code developed by Sanjay Matange and Prashant Hebbar, this paper provides intermediate SAS® programmers further versatility through a few simple enhancements. In particular, SG annotation is used to display text and handle special characters through its Unicode support (see Heath 2011). Additionally, error bar markers are sized based on subgroup population sizes to show treatment effectiveness, and hazard ratios (HR) are properly displayed on a log scale. These enhancements became the basis for a forest plot figure that was published in the New England Journal of Medicine (NEJM, see Kernan et al., 2016) for the Insulin Resistance Intervention after Stroke (IRIS) study, a double-blind randomized control trial (RCT) in 3,876 participants designed to prevent secondary stroke or myocardial infarction in patients with insulin resistance. Although our plot sought to visually demonstrate the effectiveness of the treatment (prescription drug Pioglitazone) within subgroup populations compared to placebo, forest plots have many applications including displaying meta-analyses of related research studies. The simplicity of the Statistical Graphics (SG) procedures in SAS® 9.4 augmented by the power of both the Graph Template Language (GTL) and SG annotation will provide you with a powerful tool for forest plot construction.
QT16 : Common mistakes by programmers & Remedies
Sairam Veeramalla, GCE SOLUTIONS
We all know that programmers are very busy with many activities: programming, weekly and monthly meetings, one-on-ones, filling in time sheets, meeting tight timelines, and so on. In the process, we make a few simple, common mistakes. I would like to share some common mistakes I have observed in my experience (as a programmer and validator) and remedies to avoid them. If we avoid these mistakes, we can reduce 60-70% of errors and ultimately deliver good quality deliverables, as well as increase the First Time Right (FTR) rate, which helps improve both individual and company performance.
Statistics & Pharmacokinetics
SP01 : Multiple Imputation: A Statistical Programming Story
Chris Smith, Cytel Inc.
Scott Kosten, DataCeutics Inc.
Multiple imputation (MI) is a technique for handling missing data. MI is becoming an increasingly popular method for sensitivity analyses in order to assess the impact of missing data. The statistical theory behind MI is a very intense and evolving field of research for statisticians. It is important, as statistical programmers, to understand the technique in order to collaborate with statisticians on the recommended MI method. In SAS/STAT® software, MI is done using the MI and MIANALYZE procedures in conjunction with other standard analysis procedures (e.g. the FREQ, GENMOD, or MIXED procedures). We will describe the 3-step process used to perform MI analyses. Our goal is to remove some of the mystery behind these procedures and to address typical misunderstandings of the MI process. We will also illustrate how multiply imputed data can be represented using ADaM standards and principles through an example-driven discussion. Lastly, we will do a run-time simulation in order to determine how the number of imputations influences the MI process. SAS® 9.4 M2 and SAS/STAT 13.2 software were used in the examples presented, but we will call out any version dependencies throughout the text. This paper is written for all levels of SAS users. While we present a statistical programmer's perspective, an introductory-level understanding of statistics, including p-values, hypothesis testing, confidence intervals, mixed models, and regression, is beneficial.
SP02 : Multiplicity Controlled Analyses Using SAS/IML
Xingxing Wu, Eli Lilly and Company
Hangtao Xu, Eli Lilly and Company
Multiplicity controlled analyses are very important for clinical trials with more than one objective (endpoint). The graphical approach proposed recently is becoming popular because of its many advantages. Related tools implemented in the R programming language have been developed to apply this approach to clinical trials. Even though code developed in SAS/IML is available to apply this approach in some straightforward cases, there is still a lack of a comprehensive tool to fully apply this approach to multiplicity controlled analyses in clinical trials. In this paper, a macro implemented in SAS/IML is presented, given the popularity of SAS. This macro can not only be applied in the case when the p-value of each endpoint is available; it can also estimate the endpoint p-values and predict the success rate (power) of each endpoint through simulation to help early decision-making. Furthermore, an implementation of the family-wise gatekeeping approach in SAS/IML is integrated into this macro for the purpose of comparing different approaches to multiplicity controlled analyses. This paper first introduces the graphical approach briefly. Then the implementation of the graphical approach in SAS/IML is introduced for the case when the p-value of each endpoint is available. In addition, an approach to estimate the p-value and predict the success rate of each endpoint is proposed. Finally, this paper presents the implementation of the family-wise gatekeeping approach in SAS/IML.
SP03 : Using Prentice-Williams-Peterson Gap-Time Model and PROC PHREG to analyze recurrent events data in Clinical Trials
Ronald Smith, Clinical Health Biostatistics
In some clinical trials that study deleterious events, the event of interest can occur more than once for each participant in the trial. Examples of these recurrent events include admissions to hospitals, falls in elderly patients, migraines, cancer recurrences, bacterial infections, and epileptic seizures. A goal of many clinical trials is to find a means to eliminate these recurrent events and effect relief. In survival time-to-event analyses, the focus usually is on time to the first event, ignoring any subsequent events. Commonly these recurrent events have an intrinsic correlation (dependency) among events occurring in the same subject. A standard approach used to analyze survival data is the Cox Proportional Hazards model. However, due to its independence assumption, the original Cox model is only appropriate for modelling the time to the first event. In this paper, we examine, using a simulated data set, the Prentice, Williams and Peterson Total Time and Gap Time models (PWP-TT and PWP-GT). The PWP-TT and PWP-GT models are Markovian extensions of the Cox model and require only slight modifications to PROC PHREG to obtain results. Results from the PWP-TT and PWP-GT tests are informative about treatment-level covariates and allow comparison of the efficacy of these covariates. Additionally, we extract additional data from the analysis of recurrent event data to apply an ANOVA comparing treatment methods and the location of the longest gap time, the assumption being that increasing gap times occurring as the clinical trial progresses might indicate the utility of a treatment.
SP04 : Population PK/PD Analysis - SAS® with R and NONMEM® Make Customization Easy
Sharmeen Reza, Cytel Inc.
Population pharmacokinetics/pharmacodynamics (pop-PK/PD) modeling and analysis are typically exploratory in nature. Different levels of customization are needed for various pieces in any analysis, where the core is driven by NONMEM software requiring that a structured file be passed onto it. Development and validation methods of such a NONMEM-ready data file rely heavily on the firmness of PK specs. Multiple tools including SAS are considered in this paper for creating a data set. An interface-based solution reduces programming dependencies, but has limitations as its backend logic is constructed on a single version of PK specs. Other parts of the analysis, pre-processing of data file and post-processing of NONMEM software output, are performed in SAS and/or R; the choice of tools is determined by availability and user competency. Since SAS is well-established in the drug industry for regulatory submissions (Rickert, 2013), it is prudent to utilize SAS for executing all the pieces - making data investigation easy especially for large data, allowing flexibility with programming, connecting with R and NONMEM software, and delivering reports after fine-tuning. SAS can be utilized for a fully customizable solution, or to take advantage of a compatible interface that is highly adaptable to frequently changing requirements of pop-PK/PD world.
SP05 : Combining Survival Analysis Results after Multiple Imputation of Censored Event Times
Jonathan L Moscovici, QuintilesIMS
Bohdana Ratitch, QuintilesIMS
Multiple Imputation (MI) is an effective and increasingly popular solution for the handling of missing covariate data as well as missing continuous and categorical outcomes in clinical studies. However, in many therapeutic areas, interest has also risen in multiple imputation of censored time-to-event data, since in many cases the Missing at Random (MAR) assumption is not clinically plausible for all subjects. MI is possible through SAS PROC MI, procedures implementing Bayesian analysis (e.g., MCMC, PHREG), or a user-implemented approximate Bayesian bootstrap. In MI, the missing values are replaced and several imputed datasets are created with differing values swapped in for the missing ones. Each of those imputed datasets is analyzed separately using the methods that compute the statistics of interest (Kaplan-Meier survival estimates, hazard ratios, etc.). Once these estimates are calculated for each imputed dataset, they are combined, or pooled, using Rubin's rules. These rules assume the estimates to be combined are asymptotically normally distributed. In many cases, such as survival probabilities, this assumption does not hold and normalizing transformations must be applied beforehand. In this paper, we cover these combining methods and present SAS code that implements the necessary data transformations and manipulations for combining various survival analysis estimates such as Kaplan-Meier survival curves (including percentiles), logrank, Wilcoxon, and Tarone test statistics, and Cox regression estimates of the hazard ratios after multiple imputation of censored time-to-event data. Demonstration is provided using an example imputed data set.
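The pooling step the abstract refers to can be sketched numerically. This is the standard form of Rubin's rules for m imputed estimates, applied here to invented numbers and shown before any normalizing transformation (the paper applies such transformations first for quantities like survival probabilities):

```python
# Invented point estimates and their variances from m = 3 imputed datasets
estimates = [1.8, 2.0, 2.2]
variances = [0.10, 0.12, 0.11]
m = len(estimates)

# Rubin's rules: pooled estimate, within- and between-imputation variance
q_bar = sum(estimates) / m
w = sum(variances) / m                                   # within-imputation variance
b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
total_var = w + (1 + 1 / m) * b

print(round(q_bar, 3), round(total_var, 4))  # 2.0 0.1633
```

The total variance exceeds the average within-imputation variance by the (1 + 1/m)B term, which is how the uncertainty due to the missing data itself is carried into the pooled inference.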
SP06 : Adverse Event Data over Time
Kriss Harris, SAS Specialists Ltd
Commonly in clinical trials Adverse Events (AEs) are captured over time, and the incidence and time to first occurrence of an event are presented descriptively. The duration of the event is also often calculated in the subset of patients who have experienced the event. However, better methods are needed to combine and present AE incidence and duration information over time. In oncology, methods for displaying tumor response and duration of response were developed by Temkin (1978). This paper will investigate the applicability, with modification as necessary, of these methods, in particular the Probability of Being in Response method, to the display and meaningful interpretation of AE data over time in clinical trials. Recommendations for the presentation of AE data in future trials will be made.
SP07 : Modelling and analysis of recurrent event data
Meda Sammanna, GCE solutions
The analysis of time-to-event data has become more common in clinical trials. In general, time-to-event data is modeled using Cox regression, where each sampling unit experiences at most one event. In many cases it may be more useful to analyze recurrent events data. This presentation covers analysis of recurrent events data using the robust sandwich variance estimator and other recurrent event models. Results from recurrent events models will be presented in addition to time-to-first-event models, to explore the treatment effect on the number of occurrences of events over time. As this type of data is being analyzed more frequently, there is a great need to study these types of structures and events in order to analyze the data efficiently and provide robust results.
Submission Standards
SS01 : How will FDA Reject non-CDISC submission?
Kevin Lee, Clindata Insight
Beginning Dec 18, 2016, all clinical trial and nonclinical trial studies must use standards (e.g., CDISC) for submission data, and beginning May 5, 2017, NDA, ANDA, and BLA submissions must follow the eCTD format for submission documents. In order to enforce these standards mandates, the FDA also released "Technical Rejection Criteria for Study Data" and implemented a rejection process for submissions that do not conform to the required study data standards. The paper is intended for programmers who prepare CDISC compliant submissions and must respond to FDA questions and rejections. The paper will begin with how these new FDA mandates impact electronic submission to FDA. The paper will show what sponsors should prepare for a CDISC and eCTD compliant submission package, such as SDTM, ADaM, Define.xml, SDTM annotated eCRF, SDRG, ADRG, and SAS programs. The paper will introduce the current FDA submission process, especially the current FDA rejection processes: "Technical Rejection" and "Refuse-to-File". The paper will discuss how FDA will use "Technical Rejection" and "Refuse-to-File" to reject CDISC non-compliant data. The paper will show how FDA rejection of CDISC non-compliant data will impact the sponsor's submission process, and it will specifically show how sponsors should respond to FDA rejections. The paper will also show how sponsors respond to or answer FDA questions during the whole submission process. The paper will introduce use cases to show how the FDA's new technical rejection criteria will impact the submission process, what sponsors should prepare, and how sponsors should respond to FDA rejection.
SS02 : Preparing Analysis Data Model (ADaM) Data Sets and Related Files for FDA Submission
Sandra Minjoe, Accenture
John Troxell, Accenture
This paper compiles information from documents produced by the U.S. Food and Drug Administration (FDA), the Clinical Data Interchange Standards Consortium (CDISC), and Computational Sciences Symposium (CSS) workgroups to identify what analysis data and other documentation is to be included in submissions and where it all needs to go. It not only describes requirements, but also includes recommendations for things that aren't so cut-and-dried. It focuses on the New Drug Application (NDA) submissions and the subset of Biologic License Application (BLA) submissions that are covered by the FDA binding guidance documents. Where applicable, SAS® tools are described and examples given.
SS03 : Use of Traceability Chains in Study Data and Metadata for Regulatory Electronic Submission
Tianshu Li, Celldex
Traceability is one of the fundamental requirements for electronic submission. It helps the FDA or other regulatory agencies to understand the data's lineage and the relationships among the process of the data collection, SDTM and ADaM data generation, Metadata, and analyses results. The establishment of a clear and unambiguous traceability chain will show the transparency of the electronic submission (e-sub) package and build confidence in the quality of the analyses results and statistical conclusions. Ultimately, it will help expedite the review and approval process. Based on oncology data as an illustration, this paper describes the type, elements and relationships of good traceability chains, and some important considerations in the process.
SS04 : Good versus better SDTM: Including Screen Failure Data in the Study SDTM?
Henry Winsor, Relypsa Inc.
Mario Widel, Eli Lilly
In the FDA Study Data Technical Conformance Guide, first released in December, 2014, there are a couple of paragraphs on the topic of the two subject identifier variables defined in SDTM and some details of how FDA would like to see them implemented. Implicit in this section is the request that data from screen failure subjects be included in the study SDTM data. Although not a formal requirement (yet), FDA is very interested in receiving information about a study's screen failures/re-screen attempts and has been quite persistent in continuing to ask to be provided that data by sponsors. This paper discusses steps taken at different companies to meet that request. It may provide some guidance of its own if you are thinking about but have not yet taken any steps to capture and report information about your screen failure subjects.
SS05 : Good Data Validation Practice
Sergiy Sirichenko, Pinnacle 21
Max Kanevsky, Pinnacle 21
According to FDA and PMDA guidance for regulatory submissions, sponsors are expected to perform study data validation and explain any issues that were not fixed. This is a new requirement introduced only a few years ago, which is why the industry is still struggling to achieve full implementation. In many cases, data validation is not performed correctly with sponsors having the wrong interpretation of validation results, or having invalid, confusing, or non-relevant issue explanations. The major reason for these problems is a lack of explicit and detailed documentation about the data validation process. The purpose of this presentation is to address this gap. We will summarize the regulatory requirements for data validation, discuss basic concepts and methodology, provide instructions on correct installation and usage of Pinnacle 21 Community open source tool, and review best practices for how to explain issues in study data.
SS06 : Awareness from Electronic Data Submission to FDA and PMDA
Yuichi Nakajima, Novartis
Takashi Kitahara, Novartis
PMDA (the Pharmaceuticals and Medical Devices Agency) began accepting electronic data submissions in October 2016, with a 3.5-year transitional period. The PMDA requirement is similar to that of the FDA (U.S. Food and Drug Administration) for electronic data submission, but there are some differences between the two health authorities. Because recent drug development is conducted in an increasingly globally integrated way, and the same clinical data package is often used for submission to both PMDA and FDA, sponsors need to understand the differences precisely and design efficient processes for a new drug application that meet every health authority's requirements. In November 2016, Novartis submitted one set of electronic study data, in CDISC formats, from a global study including Japanese subjects to both FDA and PMDA. Because of the differing health authority requirements, there were several challenges in preparing the electronic study data and in communicating with global colleagues and health authorities. This presentation describes experiences and challenges from that actual electronic data submission and proposes a comprehensive internal process for electronic data submission, focusing on the differences between PMDA and FDA requirements.
SS07 : Overview and Application of the HCV Vertical Resistance Analysis Template
Yan Xie, Abbvie
FDA proposed a new hepatitis C (HCV) vertical resistance draft template in March 2016, which is quite different from the existing guidance for submitting HCV resistance data. The new vertical template is not yet published and is still under review, but FDA encourages and expects sponsors to submit resistance data using the new template. Compared to the previous horizontal template, the new vertical template is more advanced because: 1. it is compatible with current SDTM and ADaM standards; 2. it reduces numerous variables by applying a streamlined and simplified vertical format; 3. it can hold Next Generation Sequencing (NGS) data; and 4. it can hold multiple targets and HCV subtypes in one dataset. The variables in the new vertical template fall into three categories: subject-level characteristics, pharmacogenomics results, and phenotypic results. The subject-level characteristic variables are derived from the ADaM datasets (e.g. ADSL), and the pharmacogenomics and phenotypic information can be found in the PF and MS domains in SDTM, respectively. Only variants which differ from the prototypic reference are included in the resistance dataset, and the data structure in the new vertical template is one record per subject, per visit sequenced, per genetic region of interest, per location, and per variant. The HCV resistance data derived from the new vertical template for Abbvie's pilot study were reviewed by FDA in March 2016. The new vertical template was implemented in Abbvie's recent HCV drug submission.
SS08 : Creating Define-XML version 2 including Analysis Results Metadata with the SAS® Clinical Standards Toolkit
Lex Jansen, SAS Institute Inc.
In 2015 CDISC published the Analysis Results Metadata extension to the Define-XML 2.0.0 model for the purpose of submissions to regulatory agencies such as the FDA as well as for the exchange of analysis datasets and key results between other parties. Analysis Results Metadata provide traceability for a given analysis result to the specific ADaM data that were used as input to generating the analysis result; they also provide information about the analysis method used and the reason the analysis was performed. Analysis Results Metadata will assist the reviewer by identifying the critical analyses, providing links between results, documentation, and datasets, and documenting the analyses performed. This presentation will show how Define-XML v2 including Analysis Results Metadata can be created with the SAS Clinical Standards Toolkit.
SS09 : Leveraging Study Data Reviewer's Guide (SDRG) in Building FDA's Confidence in Sponsor's Submitted Datasets
Xiangchen (Bob) Cui, Alkermes, Inc
Min Chen, Alkermes
Letan (Cleo) Lin, Alkermes Inc.
FDA issued the Study Data Technical Conformance Guide in October 2016, which stipulates "The SDRG should describe any special considerations or directions that may facilitate an FDA reviewer's use of the submitted data and may help the reviewer understand the relationships between the study report and the data." Hence the SDRG not only supports regulatory review and analysis, but also establishes the traceability between tabulation datasets (SDTM) and source data (raw data). FDA reviewers consider traceability an important component of a regulatory review. Confidence in submitted datasets (SDTM and ADaM) can be established through traceability from the datasets, their define files, and the reviewer's guides (SDRG and ADRG). PhUSE released SDRG Package v1.2 on January 26, 2015, which provides a step-by-step template that helps sponsors prepare the SDRG. This paper shows readers how to build traceability into the SDRG and thereby build FDA's confidence in a sponsor's submitted datasets. The examples in this paper come from working experience with FDA requests and from NDA submission preparations of more than twenty recently developed SDRGs from Phase I-III clinical study data.
SS10 : The Do's/Don'ts, An SDTM Validation Perspective. The should/shouldn't when explaining issues in the Study Data Reviewers Guide
Thomas Guinter, Independent
Janet Reich, Sr. Manager, Amgen
This session will provide an overview of some of the obvious and not-so-obvious things sponsors do in the SDTM and/or the Study Data Reviewers Guide when explaining validation/conformance issues that they probably should not do, and, vice versa, things that should be done but are often overlooked. This will be presented from a reviewability and compliance perspective, with the goal of enhancing sponsors' compliance with agency validation expectations and thereby improving the agency reviewer's experience.
SS11 : Documenting Traceability for the FDA: Clarifying the Legacy Data Conversion Plan & Introducing the Study Data Traceability Guide
David Izard, Chiltern
Kristin Kelly, Merck
Jane Lozano, Eli Lilly
Traceability from data collection through to the presentation of analysis results has always been a concern of the FDA. The introduction of electronic data as part of submissions added additional steps to confirm the provenance of information. Now the requirement to provide clinical and non-clinical data based on a set of FDA-endorsed data standards adds exponentially to the challenge, especially if legacy format data structures were utilized when the study was originally executed and reported but data meeting FDA requirements must be present in your submission. The PhUSE organization, made up of volunteers across industry, has worked closely with the FDA to develop tools to support the organization, presentation and interpretation of clinical and non-clinical data to regulatory bodies. Examples include the Study & Analysis Data Reviewer's Guides and the Study Data Standardization Plan. These documents describe routine situations where FDA-endorsed data standards are deployed at the time a study is initiated; additional support is needed when the provenance of the data is not as straightforward. The FDA's Study Data Technical Conformance Guide calls out the need to provide a Legacy Data Conversion Plan & Report when legacy data is the source of deliverables based on FDA-endorsed data standards, but it is not very clear about when you must provide one. This paper will leverage recent PhUSE efforts to develop a template and completion guidelines for this document, clarify when it must be provided, and introduce the concept of the Study Data Traceability Guide.
SS12 : Quality Check your CDISC Data Submission Folder Before It Is Too Late!
Bhavin Busa, Vita Data Sciences (a division of Softworld, Inc.)
Standardized clinical study datasets are required in submissions for clinical and non-clinical studies that start on or after December 17, 2016. FDA has added a technical rejection criterion to the existing eCTD validation criteria to enforce the deadlines. The FDA may refuse to file an NDA electronic submission that does not have study data in conformance with the required standards specified in the FDA Data Standards Catalog. This means that all studies going forward must utilize CDISC SDTM/ADaM standards and should include the associated submission documents (aCRF.pdf, define.xml/define.pdf, cSDRG.pdf, ADRG.pdf) per the Study Data Technical Conformance Guide (SDTCG). The submission of these study datasets and documents should be organized into a specific file directory structure per the eCTD requirements. As sponsors prepare for their NDA submissions, it is critical to verify the content and validity of the dataset folder against the FDA submission requirements, i.e., that the datasets meet the technical specifications per the SDTCG and eCTD validation criteria. In addition, it is in their best interest to check whether the datasets provided for regulatory publishing are truly the 'final' version. In this paper, we provide an overview of a SAS-based tool that performs a final quality check on your CDISC data submission package (the 'm5' folder), incorporating checks per the SDTCG and eCTD validation criteria that are not typically covered by existing CDISC dataset compliance tools (e.g., Pinnacle 21) or by commercially available eCTD publishing software.
Techniques & Tutorials
TT01 : SAS® and ISO8601: A practical approach
Derek Morgan, Mallinckrodt Pharmaceuticals
The ISO 8601 standard for dates and times has long been adopted by regulatory agencies around the world for clinical data. While there are many homemade solutions for working in this standard, SAS has many built-in solutions, from formats and informats that even take care of time zone specification, to the IS8601_CONVERT routine, which painlessly handles durations and intervals.
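The built-in solutions the abstract mentions (the E8601 formats/informats and the IS8601_CONVERT call routine) are SAS-specific, but their core task, rendering native date and duration values as ISO 8601 strings, is language-agnostic. As a hedged illustration of that core idea only (the function names below are my own, not from the paper), here is a minimal Python sketch:

```python
from datetime import date, timedelta

def to_iso_date(d: date) -> str:
    """Render a date in the ISO 8601 extended format (YYYY-MM-DD)."""
    return d.isoformat()

def to_iso_duration(delta: timedelta) -> str:
    """Render a timedelta as an ISO 8601 duration string
    (days/hours/minutes/seconds only, e.g. P1DT2H0M0S)."""
    total = int(delta.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"P{days}DT{hours}H{minutes}M{seconds}S"
```

In SAS itself, the equivalent work is done declaratively with formats such as E8601DA. and the CALL IS8601_CONVERT routine, which is the paper's subject.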
TT02 : Building Intelligent Macros: Using Metadata Functions with the SAS® Macro Language
Art Carpenter, CA Occidental Consultants
The SAS macro language gives us the power to create tools that to a large extent can think for themselves. How often have you used a macro that required your input and you thought to yourself "Why do I need to provide this information when SAS already knows it?" SAS may well already know what you are being asked to provide, but how do we direct our macro programs to self-discern the information that they need? Fortunately there are a number of functions and other tools within SAS that can intelligently provide our programs with the ability to find and utilize the information that they require. If you provide a variable name, SAS should know its type and length; given a data set name, the list of variables should be known; given a library or libref, the full list of data sets that it contains should be known. In each of these situations there are functions that can be utilized by the macro language to determine and return these types of information. Given a libref these functions can determine the library's physical location and the list of all the data sets it contains. Given a data set they can return the names and attributes of any of the variables that it contains. These functions can read and write data, create directories, build lists of files in a folder, and even build lists of folders. Maximize your macro's intelligence; learn and use these functions.
TT03 : SAS® Debugging 101
Kirk Paul Lafler, Software Intelligence Corporation
SAS® users are always surprised to discover their programs contain bugs (or errors). In fact, when asked, users will emphatically stand by their programs and logic, saying they are error free. But the vast body of experience, along with the realities of writing code, says otherwise. Errors can appear anywhere in program code, often accidentally introduced by users as they write it. No matter where an error occurs, the overriding sentiment among most users is that debugging SAS programs can be a daunting, and humbling, task. Attendees learn about the various error types, identification techniques, their symptoms, and how to repair program code to work as intended.
TT04 : SAS® Studio: We Program
Jim Box, SAS Institute
Have you investigated SAS® Studio? From the 1980s into the 2010s I used SAS Display Manager (the PC SAS front end) for all of my clinical table, listing, figure and database program development. I became accustomed to the program editor, log window, output window and being able to view my working and saved datasets via this programming IDE. I resisted new coding editors through the years UNTIL SAS® Studio came to fruition. SAS Studio is a web-based application that accesses your SAS environment - cloud, local server(s), grid or PC. Within the environment, you can access your data libraries, files and existing programs - and write new programs! Additionally, SAS Studio contains pre-defined tasks that generate code for you. Have a specific set of clinical programming code you always use? Snippets! Want to define a personal or global task for AE table summarization? Define it within SAS Studio!
TT05 : Special Symbols in Graphs: Multiple Solutions
Abhinav Srivastva, Gilead Sciences
It is not uncommon for graphs to include special symbols in various places like axes, legends, titles and footnotes, and practically anywhere in the plot area. This paper discusses multiple ways in which special symbols can be inserted, as applicable, in SAS/GRAPH®, Graph Annotations, ODS Graphics® - SG Procedures, SG Annotations and Graph Template Language (GTL). Some of the examples presented leverage the power of formats to put these into action. The techniques will vary depending on the version of SAS® and the type of procedure used. The ODS Graphics text statements such as INSET and DRAWTEXT (in GTL) allow additional ways to embed special symbols. Later releases of SAS® 9.4 provide the SYMBOLCHAR and SYMBOLIMAGE statements, which offer even more advanced ways to include symbols in graphs.
TT06 : Check Please: An Automated Approach to Log Checking
Richann Watson, Experis
In the Pharmaceutical industry, we find ourselves having to re-run our programs repeatedly for each deliverable. These programs can be run individually in an interactive SAS® session which will allow us to review the logs as we execute the programs. We could run the individual program in batch and open each individual log to review for unwanted log messages, such as ERROR, WARNING, uninitialized, have been converted to, etc. Both of these approaches are fine if there are only a handful of programs to execute. But what do you do if you have hundreds of programs that need to be re-run? Do you want to open every single one of the programs and search for unwanted messages? This manual approach could take hours and is prone to accidental oversight. This paper will discuss a macro that will search through a specified directory and check either all the logs in the directory or check only logs with a specific naming convention or check only the files listed. The macro then produces a report that lists all the files checked and indicates whether or not issues were found.
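The abstract describes a SAS macro; the underlying idea (scan a directory, filter files by name, flag lines with unwanted messages) is language-neutral. Here is a rough Python sketch of that idea only; the function name, pattern handling, and message list below are illustrative assumptions, not the paper's macro:

```python
import fnmatch
import os
import re

# Log messages the abstract flags as unwanted (an illustrative subset).
UNWANTED = re.compile(r"\b(ERROR|WARNING|uninitialized|have been converted to)\b",
                      re.IGNORECASE)

def check_logs(directory: str, pattern: str = "*.log") -> dict:
    """Scan every file matching `pattern` in `directory` and return a
    report: {filename: [offending lines]} for all files checked,
    with an empty list meaning no issues were found."""
    report = {}
    for name in sorted(os.listdir(directory)):
        if not fnmatch.fnmatch(name, pattern):
            continue
        with open(os.path.join(directory, name)) as fh:
            hits = [line.rstrip() for line in fh if UNWANTED.search(line)]
        report[name] = hits
    return report
```

Like the macro described, the report lists every file checked, so a clean file is positively confirmed rather than silently skipped.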
TT07 : Converting Non-Imputed Dates for SDTM Data Sets With PROC FCMP
Noory Kim, CROS NT
SDTM (Study Data Tabulation Model) data sets are required to store date values with ISO8601 formats, which accommodate both complete dates (e.g. YYYY-MM-DD) and partial dates (e.g. YYYY-MM). On the other hand, raw data sets may come with non-ISO8601 date formats (e.g. DDMMMYYYY). Converting complete date values to an ISO8601 format can be as simple as applying a SAS® date format to a numeric version of the date value. Conversion of partial date values is trickier. How may we convert, say, "UNOCT2016" and "UNUNK2016" into "2016-10" and "2016" respectively? This paper provides examples of how this can be done using PROC FCMP, the SAS Function Compiler procedure. This paper also gives examples of how to avoid the output of nonexistent dates such as "2016-01-99".
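The paper's implementation uses PROC FCMP, but the conversion logic itself fits in a few lines. The following Python sketch mirrors the worked examples in the abstract (the function name and the "UN"/"UNK" placeholder handling are my assumptions, not the paper's code):

```python
# Month abbreviations as they appear in DDMMMYYYY raw values.
MONTHS = {m: f"{i:02d}" for i, m in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

def to_iso8601(raw: str) -> str:
    """Convert a DDMMMYYYY value whose unknown parts are masked with
    'UN'/'UNK' into a (possibly partial) ISO 8601 date string."""
    day, mon, year = raw[:2], raw[2:5], raw[5:]
    if mon not in MONTHS:          # month unknown -> year only
        return year
    if not day.isdigit():          # day unknown -> YYYY-MM
        return f"{year}-{MONTHS[mon]}"
    return f"{year}-{MONTHS[mon]}-{day}"
```

A production version would also validate the day against the month (the nonexistent-date problem, e.g. "2016-01-99", that the paper addresses).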
TT08 : Clinical Trials Data: It's a Scary World Out There! or Code that Helps You Sleep at Night
Scott Horton, United BioSource
Clinical trials data can be tough to manipulate and summarize. Sometimes the problems that data can cause are not obvious when SAS code runs clean and perusal of the large data sets does not reveal any anomalies. Topics covered include text truncation, merging data, coded data, unexpected dates, duplicate data, data set naming, macro parameters, the SELECT statement, and the ARRAY statement. Most of these tricks are a few lines of code - very little effort that continues to pay dividends without too much up-front cost. All of the tricks are straightforward as well - even beginning programmers can implement them.
TT09 : Hashtag #Efficiency! An Exploration of Hash Tables and Other Techniques
Lakshmi Nirmala Bavirisetty, Independent SAS User
Kaushal Chaudhary, Independent SAS User
Deanna Schreiber-Gregory, Henry M Jackson Foundation for the Advancement of Military Medicine
Have you ever had to walk away from your computer during an analysis? Have you wondered if there is a way to increase your efficiency, save time, and be able to answer more questions? Hash tables to the rescue! This paper covers a brief introduction to the use of hash tables, their definition, benefits, concept, and theory. It also includes a review of some more applied approaches to hash table usage through code examples and applications that illustrate how using hash tables can help improve performance time and coding efficiency. This paper will wrap up by providing a comparison of performance times between hash tables and traditional lookup and join/merge methods. Instances in which hash tables may not be the most effective method will also be discussed. This paper is intended for any level of SAS® user who would like to learn how hash tables can help process efficiency!
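The lookup-versus-merge comparison the abstract promises rests on a simple idea: a hash table turns a join into one constant-time lookup per row, instead of sorting both tables and merging. A generic Python sketch of that idea (Python dicts are hash tables; the data below are made up for illustration):

```python
# A small reference table keyed by subject id (the hash lookup table)
# and a fact table to enrich with site names.
sites = {"001": "Boston", "002": "Dallas"}
visits = [("001", "SCREENING"), ("002", "WEEK 4"), ("001", "WEEK 8")]

# Hash-style join: one O(1) lookup per row, no sort of either table.
joined = [(subj, visit, sites.get(subj, "UNKNOWN"))
          for subj, visit in visits]
```

In SAS the same pattern is expressed with the DATA-step hash object's `find()` method, which is what the paper benchmarks against PROC SORT plus MERGE.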
TT10 : Data Quality Control: Using High Performance Binning to Prevent Information Loss
Deanna Schreiber-Gregory, Henry M Jackson Foundation for the Advancement of Military Medicine
Lakshmi Nirmala Bavirisetty, Independent SAS User
Kaushal Chaudhary, Independent SAS User
It is a well-known fact that the structure of real-world data is rarely complete and straightforward. Keeping this in mind, we must also note that the quality, assumptions, and base state of the data we are working with have a very strong influence on the selection and structure of the statistical model chosen for analysis and/or data maintenance. If the structure and assumptions of the raw data are altered too much, then the integrity of the results as a whole is grossly compromised. The purpose of this paper is to provide programmers with a simple technique which will allow the aggregation of data without losing information. This technique will also check the quality of binned categories in order to improve the performance of statistical modeling techniques. The SAS high-performance analytics procedure, HPBIN, gives us a basic idea of syntax as well as various methods, tips, and details on how to bin variables into comprehensible categories. We will also learn how to check whether these categories are reliable and realistic by reviewing the WOE (Weight of Evidence) and IV (Information Value) for the binned variables. This paper is intended for any level of SAS® user interested in quality control and/or SAS high-performance analytics procedures.
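The WOE and IV checks the abstract mentions follow standard definitions: a bin's WOE is the log of the ratio of its share of "goods" to its share of "bads", and IV sums (share difference) times WOE across all bins. A minimal Python sketch of those formulas (a generic illustration, not the HPBIN procedure itself):

```python
import math

def woe_iv(bins):
    """Weight of Evidence per bin and total Information Value.
    `bins` is a list of (goods, bads) counts, one pair per bin;
    every bin is assumed to contain at least one good and one bad."""
    total_good = sum(g for g, b in bins)
    total_bad = sum(b for g, b in bins)
    woes, iv = [], 0.0
    for g, b in bins:
        pct_good = g / total_good        # bin's share of all goods
        pct_bad = b / total_bad          # bin's share of all bads
        woe = math.log(pct_good / pct_bad)
        woes.append(woe)
        iv += (pct_good - pct_bad) * woe
    return woes, iv
```

Higher IV means the binned variable separates the outcome classes more strongly; zero-count bins need smoothing in practice, which this sketch omits.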
TT11 : Tips and Best Practices using SAS® Analytics
Tho Nguyen, Teradata
Paul Segal, colleague
Come learn some tips and best practices with SAS/ACCESS, SAS formats, data quality, DS2, model development, model scoring, Hadoop and Visual Analytics - all integrated with the data warehouse.
TT12 : Application of Deming Regression in Molecular Diagnostics using a SAS® Macro
Merlin Njoya, Roche
Pari Hemyari, Roche
In molecular diagnostics, method comparison studies are conducted to estimate possible systematic differences between measures from a new investigational assay and an approved assay. The U.S. Food and Drug Administration (FDA) recommends in such cases the use of Deming regression analysis to establish that the new assay measures the target value as accurately as the gold-standard method. While extensive literature is available on the estimation of Deming regression parameters using SAS, there is limited information on how to construct confidence intervals for these parameters when there is correlation among the data, due, for instance, to repeated measurements. This paper will present a SAS macro that efficiently calculates the slope and intercept of the Deming regression and also uses the bootstrap method to estimate the confidence intervals of these parameters.
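The point estimates such a macro computes follow the standard Deming regression closed form, where delta is the assumed ratio of the y-measurement error variance to the x-measurement error variance (delta = 1 gives orthogonal regression). A minimal Python sketch of the slope/intercept calculation, omitting the bootstrap confidence intervals the paper adds:

```python
import math

def deming(x, y, delta=1.0):
    """Deming regression slope and intercept.
    `delta` is the ratio of y-error variance to x-error variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    slope = (syy - delta * sxx
             + math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)
             ) / (2 * sxy)
    intercept = my - slope * mx
    return slope, intercept
```

Bootstrap intervals would resample the (x, y) pairs, or whole subjects when measurements are repeated, recompute these estimates on each resample, and take percentiles.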
TT13 : The Proc Transpose Cookbook
Doug Zirbel, Wells Fargo and Co.
Proc Transpose rearranges columns and rows of SAS datasets, but its documentation and behavior can be difficult to comprehend. For common input files, this paper will show a variety of desired output files, plus code and explanations.
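What PROC TRANSPOSE does with its BY, ID, and VAR statements - pivoting a long dataset into a wide one - can be sketched generically. The following Python illustration is my own, not from the paper:

```python
def transpose(records, id_key, var_key, val_key):
    """Pivot long records into wide form: one output row per `id_key`
    value, one column per distinct `var_key` value (roughly BY, ID,
    and VAR in PROC TRANSPOSE terms)."""
    out = {}
    for rec in records:
        out.setdefault(rec[id_key], {})[rec[var_key]] = rec[val_key]
    return out
```

As in PROC TRANSPOSE, ids missing a given variable simply lack that column, and a duplicate (id, variable) pair silently keeps the last value, one of the behaviors the cookbook format is well suited to demonstrating.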
TT14 : Defensive Programming -- Tips and Techniques for Producing that Dataset, Table, Listing or Figure First Time
David Franklin, Quintiles Real World Late Phase Research
If you have ever bought one of those 'put together' furniture kits, you know it is not usually an easy process. There is setting up the work area, taking the kit out of the box, reading the instructions, getting the tools needed that were not included, construction, and finally looking it over to check what you have done and that it is going to work as intended. A similar process applies to writing programs - we should first do some planning, then construct the program, and finally check it over to see that it produces what was asked. This paper takes a brief and lighthearted look at each of these stages, providing a few tips for avoiding some of the many pitfalls and giving a few pieces of SAS® code that are useful in the development of your program, hopefully avoiding having to say that dreaded phrase, "Oh Clanger".
TT15 : Merge With Caution: How to Avoid Common Problems when Combining SAS Datasets
Josh Horstman, Nested Loop Consulting
Although merging is one of the most frequently performed operations when manipulating SAS datasets, there are many problems which can occur, some of which can be rather subtle. This paper examines several common issues, provides examples to illustrate what can go wrong and why, and discusses best practices to avoid unintended consequences when merging.
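One of the subtle problems such a paper typically covers is the many-to-many merge: when both inputs repeat a key value, a relational join pairs every left row with every right row for that key. A small Python illustration of the row multiplication (a generic join sketch; note that the SAS DATA-step MERGE handles many-to-many overlap differently, which is exactly why it surprises people):

```python
# Both tables repeat key "A"; a join multiplies the rows.
left = [("A", 1), ("A", 2)]
right = [("A", "x"), ("A", "y")]

# Cartesian pairing within the matching key:
merged = [(k1, v, w) for k1, v in left for k2, w in right if k1 == k2]
# 2 left rows x 2 right rows on key "A" -> 4 output rows, not 2.
```

Checking key uniqueness on at least one side before merging is the usual defense against this class of bug.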
TT16 : Dear Dave, Please See the .LST File for Our Validation Differences. Thanks, Bad Validation Programmer
Tracy Sherman, Ephicacy
David Carr, Ephicacy
Brian Fairfield-Carter, InVentiv Health
Good validation programmers are hard to come by these days. This paper will give you examples of the good, the bad and the ugly validation practices in statistical programming. It will also ensure that the next time you are validating a dataset or TFL (table, figure or listing), you will understand the importance of and the reasons behind these good practices. The sole responsibility lies with the validation programmer to investigate the differences between the production and validation sides. We will demonstrate the how, what, where, when and why of investigating differences and then provide details on how to effectively communicate your findings to the production programmer.