Forecasting Archives - Tiger Analytics

A Pharma Leader’s Guide to Driving Effective Drug Launches with Commercial Analytics
https://www.tigeranalytics.com/perspectives/blog/a-pharma-leaders-guide-to-driving-effective-drug-launches-with-commercial-analytics/

Learn how pharma leaders can leverage Tiger Analytics’ Commercial Analytics engine to successfully launch new drugs in the market through enhanced data-driven insights and decision-making.

For a Pharmaceutical company, launching a drug represents the culmination of extensive research and development efforts. Across the typical stages of a drug launch – planning the launch, the launch itself, and post-launch drug lifecycle management – Data Analytics can help pharmaceutical companies leverage the power of data-driven insights and strategic analysis. Why does this matter? According to research, the product trajectory is set in the first six months for 85% of pharmaceutical launches.

Real-time analytics enables informed decision-making, improves patient outcomes, and creates a competitive edge for the drug in the ever-evolving Healthcare industry. A data-driven approach across the drug lifecycle ensures that the drug launch is not just a milestone, but a stepping stone towards improved healthcare and a brighter future.

5 Benefits of a Data-Driven Drug Launch

How can Pharma leaders benefit from a data-driven launch? We’ve put together a few of our observations here:

1. Precise Patient Targeting
Begin by identifying the most promising patient segments through comprehensive data analysis. By integrating electronic health records, prescription data, and demographic information, you can pinpoint the specific patient populations that will benefit most from your drug. Tailor your messaging and outreach to address their unique needs and preferences.

2. Segmented Marketing Strategies
Develop personalized marketing strategies for each identified patient segment. Utilize commercial analytics to understand the distinct characteristics of these segments and create tailored campaigns that resonate with their concerns. This approach enhances engagement and encourages a deeper connection between patients and your product.

3. Tactical Pricing Optimization
Determine the optimal pricing strategy for your drug by analyzing market dynamics, competitor pricing, and patient affordability. Commercial analytics helps strike the right balance between maximizing revenue and ensuring accessibility. Data-driven pricing decisions also enhance negotiations with payers and reimbursement discussions.

4. Multi-channel Engagement
Leverage commercial analytics to identify the most effective communication channels for reaching healthcare professionals and patients. Analyze historical prescription patterns and physician preferences to allocate resources to the channels that yield the highest impact. This approach ensures that your message reaches the right stakeholders at the right time.

5. Continuous Performance Monitoring
The launch doesn’t end on the day of the launch — it’s a continuous process. Utilize real-time data analytics to monitor your drug’s performance in the market. Track metrics such as prescription volume, market share, and patient feedback. This information helps you adapt your strategies as needed and capitalize on emerging opportunities.

Enabling a 360-Degree View of Pharma Drug Launch with Commercial Analytics

At Tiger Analytics, we developed a Data Analytics solution, tailored to meet the specific requirements of our clients in the Pharmaceutical industry. Our Commercial Analytics engine powers a suite of data-driven analytical interventions throughout the lifecycle of a drug. It serves as a bridge between goals and actionable insights, effectively transforming raw data into strategic decisions. The solution supports pre-launch patient segmentation and provider engagement. It also aids in launch-stage payer analytics and pharmacy optimization. Lastly, it enables post-launch patient journey analysis and outcomes assessment – giving Pharma leaders a 360-degree view of the entire launch cycle.

Here’s how it works:

Pre-Launch: Setting the Stage for Success

In this stage, the goal is to lay a strong foundation for success by developing the value proposition of the drug. Clinical teams, data strategists, and market researchers collaborate to assess the drug’s commercial potential and create a strategy to realize it. To begin, comprehensive surveys and market research are conducted to gain insights into healthcare professional (HCP) behavior, competitor analysis, patient profiles, packaging analysis, price comparison, and sales benchmarks. These analyses shape the roadmap for the drug’s performance and enable the exploration of various scenarios through forecasting exercises. Patient profiling and segmentation strategies are also implemented to devise effective marketing and engagement strategies.

From Action to Impact

To drive tangible results, at Tiger Analytics we orchestrated a series of targeted initiatives with specific outcomes:

What did we do?

  • Conducted a comprehensive analysis of analog drugs in the market and performed market scoping along with other forecasting exercises to understand the potential impact of the new drug once launched.
  • Analyzed survey results and developed a tool to assess the possible effectiveness of the drug in real-world scenarios.
  • Formulated multiple scenario analyses to account for unprecedented events and their potential impact on the demand for the drug.

How did the solutions help?

  • Provided a clear view of the expected market landscape through market sizing.
  • Prepared the pharma company for unknown events through scenario analysis.
  • Facilitated target adjustment and improved planning by forecasting numbers.

Launch: Strategic Payer Engagement in a Complex Landscape

During the drug launch, the focus shifts to accelerating drug adoption and reducing the time it takes to reach peak sales. At this juncture, analytics plays a crucial role in optimizing market access and stakeholder engagement (payers, prescribers, and patients). By analyzing payer data, claims information, and reimbursement policies, pharmaceutical companies gain insights for strategic decision-making, including formulary inclusion, pricing strategies, and reimbursement trends. These insights enable effective negotiations with payers, ensuring optimal coverage and patient access to the medication.

Monitoring sales and identifying early and late adopters among HCPs and patients enables targeted marketing activities and tailored promotional initiatives. This approach effectively propels successful market penetration.

From Action to Impact

To drive tangible results, we, at Tiger Analytics, orchestrated a series of targeted initiatives with specific outcomes:

What did we do?

  • Implemented a robust email marketing campaign, targeting the identified early adopter HCPs.
  • Monitored HCP engagement and response to emails using advanced analytics and tracking tools.
  • Leveraged predictive models to conduct real-time analysis of promotional activities, optimizing their effectiveness and making data-driven adjustments.

How did the solutions help?

  • Achieved a 15% increase in HCP engagement and response rates.
  • Real-time analysis led to a 10% improvement in the effectiveness of promotional activities.

Post-Launch: Empowering Patient-Centric Care

Post-launch analytics focuses on monitoring the market and adapting to market dynamics (competition, regulations, reimbursements, etc.) to extend the drug’s lifecycle. Advanced analytics also enables understanding a patient’s journey and optimizing the person’s medication adherence. By leveraging real-world data, electronic health records, and patient-reported outcomes, pharmaceutical companies gain invaluable insights into patient behavior, adherence rates, and treatment patterns. These insights facilitate the development of personalized interventions, patient support programs, and targeted educational campaigns to enhance patient adherence and improve treatment outcomes. Additionally, continuous tracking of the medication’s performance, market share, and patient-reported outcomes enables pharmaceutical companies to make data-driven decisions, generate evidence for stakeholders, and drive ongoing improvements in patient care.

From Action to Impact

To drive tangible results, we, at Tiger Analytics, orchestrated a series of targeted initiatives with specific outcomes:

What did we do?

  • Utilized real-world data and electronic health records to track patient behavior and medication adherence.
  • Conducted in-depth analysis of patient-reported outcomes to gain insights into treatment patterns and efficacy.
  • Developed personalized interventions and patient support programs, based on the identified patterns and behaviors.

How did the solutions help?

  • Improved medication adherence by 25%.
  • Achieved a 30% increase in patient satisfaction and treatment compliance.

For Pharmaceutical companies, a successful drug launch is not only about accelerating the medicine’s time to market, but also about ensuring patient awareness of and access to life-saving drugs. By leveraging the power of data to fuel AI-enabled drug launches, we’ll continue to see better medication adherence, more satisfied patients, and stronger compliance with treatments – which will ultimately lead to better health outcomes.

CECL in Loss Forecasting – Practical Approaches for Credit Cards
https://www.tigeranalytics.com/perspectives/blog/cecl-loss-forecasting-practical-approaches-credit-cards/

Discover how a combination of account-level forecasting, segmentation analysis, and rigorous model validation techniques can help credit card issuers address the unique challenges posed by CECL while reducing compliance costs and improving loss prediction accuracy.

Credit card companies face the challenge of accurately forecasting expected credit losses over the lifetime of their loans to comply with the Current Expected Credit Loss (CECL) standard. Developing an effective loss forecasting model is crucial for these companies to maintain financial stability, make informed business decisions, and meet regulatory requirements. In this post, we discuss an approach that offers a practical way to simplify the CECL modeling process for unsecured consumer bankcard portfolios, helping credit card companies reduce the cost of regulatory compliance while improving the accuracy of their loss predictions. By leveraging a combination of account-level forecasting, segmentation analysis, and rigorous model validation techniques, this methodology enables credit card issuers to address the unique challenges posed by CECL and enhance their overall risk management strategies.

Developing a CECL-Compliant Loss Forecasting Model for a Midsize US Bank

A midsize US bank wants to create a statistical loss forecasting model for its unsecured consumer and small-business bankcard portfolios to calculate current expected credit losses (CECL) over the life of the loan, for its internal business planning and CECL reporting requirements.

Under CECL, the expected lifetime losses of loans are recognized at the time a loan is recorded. The model suite and its components forecast the current expected credit loss as an aggregation of the account-level forecasts for the unsecured lending (bankcard) portfolios.

Model Size

Design Objective: The primary goal for this custom model development is to forecast losses under CECL for an unsecured credit card portfolio. The model provides a credit loss forecast for the life of a loan at a loan level that can be aggregated ‘bottom-up’ to create a portfolio loss forecast. A segment-level ‘top-down’ aggregated model is created for certain segments with a short time frame and very predictable performance because of prepayments or charge-off. The custom forecasting model is intended to have the following features: 

  • Predict current expected credit losses on existing, active credit card members with the balance outstanding through the life of the loan
  • Output a monthly loss forecast that can be used for internal business requirements and allowance calculation
  • Leverage FICO Score as the risk score in the model
  • Provide clear guidelines for model performance monitoring and validation to allow model users to explain the root cause of forecast error

Success Criteria: The custom model is gauged by both in-time and out-of-time validations based on the following guidelines:

  • Model performance – Back-testing on both in-time and out-of-time validation data, with forecast error typically allowed up to a maximum of 10% variation, measured incrementally over time
    • The mean absolute deviation can be much larger, since such a long forecast horizon can compound errors
    • For segments and sub-populations where volumes are very small, the variation expressed in percentage could be higher due to a smaller denominator. In such cases, forecast errors are evaluated as dollar differences
  • Aggregate risk estimation – Back-testing performed on the entire portfolio to assess the fit of the model in its entirety. The aggregate risk assessment is the total portfolio expected credit loss (ECL) for all accounts and for significant portfolios and segments over the duration of the loss forecast horizon
  • Model sensitivity – The aggregated forecast and the component models should be sensitive to changes in internal portfolio characteristics and external macroeconomic factors
  • Model implementability – The model suite and the scoring equations generated as its output can be implemented in the production system

Data Sources: The portfolio data consists of origination and portfolio characteristics, expressed as monthly snapshots at month-end and cycle-end at an account level. Apart from this, various demographic as well as macroeconomic factors were used for model development. A representative list of account-level characteristics is below:

  • Origination characteristics like sourcing channel, FICO score at the time of origination, etc.
  • Underwriting actions such as initial credit line, interest rate (APR), balance transfer, and their changes over time 
  • Demographic information like state of residence
  • Credit usage like balance, payment, fee, purchase, utilization 
  • Derogatory behavior such as days past due, max delinquency 
  • Macroeconomic indicators like local unemployment, income, GDP, etc 

Data Processing and Exploration

Monthly data spanning several years is considered for model development. The data is chosen to cover portfolio performance during recessionary, recovery, and growth cycles. Additionally, around three years of data were used for out-of-time validation. The following data processing steps were taken to make the data appropriate for modeling (a minimal pandas sketch of these steps follows the list):

  • Different types of duplicate records were identified and treated
  • Defaults related to fraud and stolen accounts were removed, since they are not true indicators of credit default
  • Records were examined and aligned for account number transfers, thus avoiding misclassification of transferred accounts as “paid down” or otherwise closed
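
To make these steps concrete, here is a minimal pandas sketch of the cleanup logic. All file names and column names (account_id, snapshot_month, chargeoff_reason, and the transfer mapping) are hypothetical placeholders, not the bank’s actual schema.

```python
import pandas as pd

# Hypothetical file and column names; the bank's actual snapshot schema will differ.
snapshots = pd.read_parquet("portfolio_month_end_snapshots.parquet")

# 1. Identify and treat duplicate records (keep one row per account per month).
snapshots = snapshots.drop_duplicates()
snapshots = snapshots.drop_duplicates(subset=["account_id", "snapshot_month"], keep="last")

# 2. Remove defaults driven by fraud or lost/stolen accounts, which are not
#    true credit defaults and would otherwise bias the PD estimates.
snapshots = snapshots[~snapshots["chargeoff_reason"].isin(["FRAUD", "LOST_STOLEN"])]

# 3. Re-link transferred account numbers so a transfer is not misclassified
#    as a paydown or closure (transfer_map: old_account_id -> new_account_id).
transfers = pd.read_csv("account_transfers.csv")
transfer_map = dict(zip(transfers["old_account_id"], transfers["new_account_id"]))
snapshots["account_id"] = snapshots["account_id"].replace(transfer_map)
```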

Candidate Variable Selection:

The models were built based on portfolio and macroeconomic attributes. Except for the FICO score, no other bureau attributes were considered for model development. To gain confidence in the data, descriptive statistics on the macroeconomic variables were computed and compared with publicly available sources before using them in model development.

Modeling Approach / Definition:

The default event can be described in terms of competing risk events, or terminal events, namely prepayment and charge-off occurring over a period of time. The discrete-time hazard modeling approach determines the probability of such an event occurring within a specific timeframe. For each of the competing risks, an account is considered to have ‘survived’ at time t (between 1 and 39 months) when the risk event does not happen. The dependent variable for the PD models is defined as 1 when the account reaches one of the terminal events, i.e., the account is charged off (CO) or the account is paid down (pre-pay). Otherwise, the dependent variable is set to 0.

For this, we define the survivor function at any given time tj as the probability that the month of survival T is at least tj.

$$ S_j = S(t_j) = \Pr(T \geq t_j) $$

Therefore, the hazard at time tj is defined as a conditional probability of a terminal event at that time, given that the account survived up to that point.

$$ h_j = h(t_j) = \Pr(T = t_j \mid T \geq t_j) $$

Hence, the odds of the terminal event at each discrete time tj, conditional on survival Sj up to that point, can be expressed as:

$$ \frac{h_j}{1 - h_j} = \frac{\Pr(T = t_j \mid T \geq t_j)}{\Pr(T > t_j \mid T \geq t_j)} $$

Here,

hj denotes the hazard (the conditional probability of the terminal event) at time tj, Sj denotes the probability of surviving up to tj, and T denotes the month in which the terminal event occurs.

Taking logs, we obtain a model for the logit of the hazard, i.e., the conditional probability of a terminal event at tj given survival up to that time. Expressed as an equation, the model is:

$$ \log\!\left(\frac{h_{ij}}{1 - h_{ij}}\right) = \alpha_j + \boldsymbol{\beta}^{\prime}\mathbf{x}_{ij} $$

Here, the model essentially treats time as a discrete factor by introducing one parameter αj for each possible terminal-event time. Interpretation of the parameters β associated with the other covariates follows along the same lines as in logistic regression. Thus, one can fit the discrete-time proportional-hazards model by running a logistic regression on a set of pseudo-observations, generated by creating a terminal-event indicator for an account as eventJ = 1 in month j, and 0 otherwise. It was observed that there is no significant difference between the complementary log-log (c-log-log) link function and the logit transformation; hence, the PD is estimated via logistic regression. Lastly, the binary probabilities are converted to multinomial probabilities using the Begg-Gray (1984) transformation method[1], which is defined as below:

$$ P_j = \frac{p_j / (1 - p_j)}{1 + \sum_{k} p_k / (1 - p_k)} $$

where j = 1, 2 index the prepay and charge-off events, respectively, and pj is the binary probability estimated from the logistic regression for event j.
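
As a simplified illustration of this setup (not the bank’s production code), the sketch below expands account histories into account-month pseudo-observations, fits separate binary logistic hazard models for charge-off and prepayment with one dummy per month on book, and converts the two binary probabilities into competing-risk probabilities using the odds-based transformation above. The input schema (accounts_df with account_id, months_observed, terminal_event) is a hypothetical placeholder, and a real model would include FICO, utilization, delinquency, and macro covariates.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def expand_to_pseudo_obs(accounts: pd.DataFrame) -> pd.DataFrame:
    """One pseudo-observation per account per month on book (1..39), with
    indicators for the two terminal events in the final observed month."""
    rows = []
    for a in accounts.itertuples():
        for j in range(1, int(a.months_observed) + 1):
            is_last = j == a.months_observed
            rows.append({
                "account_id": a.account_id,
                "month": j,
                "event_co": int(is_last and a.terminal_event == "chargeoff"),
                "event_pp": int(is_last and a.terminal_event == "prepay"),
            })
    return pd.DataFrame(rows)

def to_competing_risk(p_co: np.ndarray, p_pp: np.ndarray):
    """Odds-based conversion of two binary hazards into competing-risk probabilities."""
    odds_co, odds_pp = p_co / (1 - p_co), p_pp / (1 - p_pp)
    denom = 1 + odds_co + odds_pp
    return odds_co / denom, odds_pp / denom

pseudo = expand_to_pseudo_obs(accounts_df)                        # accounts_df: hypothetical input frame
X = pd.get_dummies(pseudo["month"].astype(str), prefix="alpha")   # one alpha_j per month on book
m_co = LogisticRegression(max_iter=1000).fit(X, pseudo["event_co"])
m_pp = LogisticRegression(max_iter=1000).fit(X, pseudo["event_pp"])
p1, p2 = to_competing_risk(m_co.predict_proba(X)[:, 1], m_pp.predict_proba(X)[:, 1])
```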

The LGD can be determined by the percentage of loss by facility or collateral type. LGD estimates could also be driven or influenced by product type, industry, or geography. For the exercise, the LGD was considered to be 100% due to the unsecured nature of the credit card loans.

The EAD is calculated based on the following formulas,

[Equation: EAD calculation – formula image not reproduced]

where P1 and P2 represent the conditional probabilities of charge-off and prepayment, respectively, calculated using the above Begg-Gray transformation, and the ratio rt is calculated as a monthly ratio between the balance survival curve and the account-count survival curve.

Segmentation Analysis

Segmentation analysis was performed separately for each of the dependent events, namely charge-off and pre-pay, to determine whether there are sub-populations within the development dataset that would benefit from separate scorecards, and if so, how many scorecards and which specific segmentation schemes would be optimal. The segmentation analysis is done using non-parametric survival analysis with censored effects in SAS, combined with business intuition, keeping implementability of the model in mind.

These initial segments reflected delinquency status, transition states, payment activity (payment ratio), and tenure at the observation points. The segment distribution is then examined at different snapshots to ensure that the segments are stable over time. A loan-level ‘bottom-up’ methodology was identified for major segments, whereas a ‘top-down’ approach was selected for minor segments where loan-level data may not provide extra discrimination – for example, transactor segments.
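
The analysis above was performed in SAS; for readers who want to prototype the same idea in Python, here is a minimal sketch using the lifelines package to compare Kaplan-Meier charge-off survival curves across candidate segments (file and column names are hypothetical). Clearly separated curves suggest a segment may warrant its own scorecard.

```python
import pandas as pd
from lifelines import KaplanMeierFitter
import matplotlib.pyplot as plt

# Hypothetical account-level frame: months on book until charge-off or censoring,
# an event flag (1 = charge-off observed, 0 = censored), and a candidate segment.
df = pd.read_parquet("accounts_with_candidate_segments.parquet")

ax = plt.subplot(111)
for segment, grp in df.groupby("delinquency_segment"):
    kmf = KaplanMeierFitter(label=str(segment))
    kmf.fit(durations=grp["duration_months"], event_observed=grp["chargeoff_flag"])
    kmf.plot_survival_function(ax=ax)

ax.set_xlabel("Months on book")
ax.set_ylabel("Survival probability (no charge-off)")
plt.show()
```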

Model Development

Based on the insights derived from the segmentation process, the models were developed at the portfolio level.
Model development includes the following components:

  • Probability of Default (PD), which gives the average percentage of accounts, or borrowers, that experience a default event;
  • Loss Given Default (LGD), which gives the percentage of exposure the bank might lose if the borrower defaults; and
  • Exposure At Default (EAD), which gives an estimate of the outstanding amount (drawn amounts) in case the borrower defaults.

The loss projections are derived from the PD models applied to the monthly LGD and EAD estimates that yield expected losses for each month. Summing across all months gives each account’s total expected loss and summing across all accounts gives the total portfolio expected loss.

Finally, the three components are combined to give an expected loss (ECL) for an account. This framework is described in the following equation:

$$ ECL = \sum_{t} PD_t \times LGD_t \times EAD_t $$
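
As a simple illustration of this roll-up (with hypothetical file and column names), the monthly component estimates can be combined and aggregated as follows:

```python
import pandas as pd

# Hypothetical frame: one row per account per forecast month with the three
# component estimates already scored: pd_m, lgd_m, ead_m.
components = pd.read_parquet("monthly_pd_lgd_ead.parquet")

components["expected_loss"] = components["pd_m"] * components["lgd_m"] * components["ead_m"]

account_ecl = components.groupby("account_id")["expected_loss"].sum()   # lifetime ECL per account
portfolio_ecl = account_ecl.sum()                                       # total portfolio ECL
print(f"Portfolio ECL: {portfolio_ecl:,.0f}")
```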

Standard logistic and linear regression are used to develop and estimate the PD models. Each model is put through a series of stepwise logistic regressions and development tests to build and refine the initial models, evaluating the variable significance levels (p-values), variance inflation factors (VIF) to thwart multicollinearity and improve model parsimony, and the signs of the parameter estimates in each model iteration. To prevent over-fitting, linear regression is run alongside the primary logistic regression to assess cross-variable correlation as indicated by VIF. Bivariate charts and weight-of-evidence patterns are also examined to help ensure that each variable utilized exhibits both a clearly discernible trend and a solid business rationale. Each dataset is divided into an estimation sample, upon which the model is built, and a validation, or hold-out, sample to ensure model stability. The hold-out sample is kept at 30%.
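
A minimal sketch of this workflow in Python, assuming a candidate predictor frame X and event indicator y are already prepared (the names and the VIF cut-off of 5 are illustrative choices, not the bank’s actual ones):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from sklearn.model_selection import train_test_split

# X: candidate predictors, y: terminal-event indicator (both hypothetical inputs).
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.30, random_state=42)

# Screen for multicollinearity: drop candidates with high variance inflation factors.
vif = pd.Series(
    [variance_inflation_factor(X_train.values, i) for i in range(X_train.shape[1])],
    index=X_train.columns,
)
keep = vif[vif < 5].index

model = sm.Logit(y_train, sm.add_constant(X_train[keep])).fit(disp=0)
print(model.summary())                                       # p-values and signs of estimates
hold_pred = model.predict(sm.add_constant(X_hold[keep]))     # 30% hold-out scores for stability checks
```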

Model Validation

Each model is evaluated on several qualitative and quantitative performance measures such as model parsimony, model lift (Kolmogorov-Smirnov or KS), measures of statistical dispersion (Gini), event capture rate, and measures of accuracy. The KS and Gini statistics are the primary targets for the individual model optimizations. All variables are tested and considered during model building not only for their statistical significance but also for the theoretical or intuitive explanation, relevance and materiality, and redundancy. Where candidate variables failed on these criteria, they are dropped from the model, even at the expense of KS and Gini.

Model components were also tested for their performance over short-term forecast windows of 6 and 12 months, in addition to the long-term forecast windows of 24 and 39 months. The errors are calculated as absolute values of errors, expressed in percentages. Examining the absolute value, as well as the direction, of percentage errors across the twentiles provides an indication of the model’s fit accuracy across the score range.
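
For reference, the two headline discrimination metrics mentioned above can be computed as follows (a generic sketch; y_true and score stand for any event flag and model score, for example the hold-out scores from the previous sketch):

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.metrics import roc_auc_score

def ks_and_gini(y_true: np.ndarray, score: np.ndarray) -> tuple[float, float]:
    """KS: maximum separation between the score distributions of events and
    non-events. Gini: 2*AUC - 1 (one common convention)."""
    ks = ks_2samp(score[y_true == 1], score[y_true == 0]).statistic
    gini = 2 * roc_auc_score(y_true, score) - 1
    return float(ks), float(gini)

# Example: ks, gini = ks_and_gini(np.asarray(y_hold), np.asarray(hold_pred))
```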

Population Stability Index:

The PSI report is used to identify population change over time, as compared with the model development sample. PSI reports offer useful insights for checking data quality and evaluating the effects of credit policy on the portfolio. A significant population shift is often an early indicator that the model assumptions may no longer hold and that the model may require fine-tuning.
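
One common way to compute PSI on a score or variable is sketched below; the decile binning and the 0.1/0.25 rules of thumb are conventional choices, not thresholds prescribed by the bank or by CECL.

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, n_bins: int = 10) -> float:
    """Population Stability Index between the development-sample distribution
    (`expected`) and a recent-snapshot distribution (`actual`)."""
    cuts = np.quantile(expected, np.linspace(0, 1, n_bins + 1))
    cuts[0], cuts[-1] = -np.inf, np.inf
    e_pct = np.histogram(expected, bins=cuts)[0] / len(expected)
    a_pct = np.histogram(actual, bins=cuts)[0] / len(actual)
    e_pct, a_pct = np.clip(e_pct, 1e-6, None), np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

# Conventional reading: < 0.1 stable, 0.1-0.25 moderate shift, > 0.25 significant shift.
```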

Sensitivity Analysis:

Sensitivity analysis is performed during model development to check the impact of small changes in inputs on model outputs and to ensure the outputs fall within an expected range. Another use of sensitivity analysis in the context of loss estimation is to demonstrate that the model is suitably conservative in its forecasts.
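
A simple input-shock check along these lines is sketched below; the model object, data frame, and driver name are placeholders carried over from the earlier hypothetical fit.

```python
import pandas as pd
import statsmodels.api as sm

def sensitivity_check(fitted, X: pd.DataFrame, feature: str, shock: float) -> tuple[float, float]:
    """Average predicted PD before and after shocking a single input.
    `fitted` is a fitted statsmodels Logit result; `feature` and `shock`
    are whichever driver and magnitude the analyst wants to stress."""
    base = fitted.predict(sm.add_constant(X)).mean()
    shocked = X.copy()
    shocked[feature] = shocked[feature] + shock
    stressed = fitted.predict(sm.add_constant(shocked)).mean()
    return float(base), float(stressed)

# Example: a +1 ppt unemployment shock (hypothetical driver name) should not lower PD.
# base, stressed = sensitivity_check(model, X_hold[keep], "unemployment_rate", 1.0)
# assert stressed >= base
```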

Model Back-testing and Out of Time Validation

Model back-testing is one form of outcome analysis that involves comparing actual outcomes with model forecasts in a historical period that matches the model’s forecast horizon or performance window at different snapshots of time.

Back-testing entails analyzing a large number of forecasts over different conditions at a point in time or over multiple periods. This process may reveal significant errors or inaccuracies in model development, in which case model adjustment, recalibration, or redevelopment is warranted. The purpose is to test the overall loss prediction rather than individual forecast values.
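
In practice this can be as simple as tabulating forecast versus realized losses at each historical snapshot and window; the sketch below assumes a hypothetical results file with one row per snapshot and back-test window.

```python
import pandas as pd

# Hypothetical columns: snapshot, window_months (6/12/24/39), forecast_loss, actual_loss.
bt = pd.read_csv("backtest_results.csv")

bt["abs_pct_error"] = (bt["forecast_loss"] - bt["actual_loss"]).abs() / bt["actual_loss"]
print(bt.groupby("window_months")["abs_pct_error"].describe())
# Persistently large errors for a window signal recalibration or redevelopment.
```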

Model Documentation

The CECL standard requires institutions to be more involved in the entire allowance process, especially at the management/executive level. Therefore, explanations, justifications, and rationales must be discussed, understood, and documented. The CECL loss forecast model intended for regulatory submission should also be used by the bank for its internal loan loss reserve calculation process.

Tiger Analytics applies a “document-as-you-go” principle so that the assumptions and discussions around the modeling process and decisions are captured as they happen. The model documentation relies on several contributors supplying information in a pre-determined format and template. However, it is necessarily authored by a risk and modeling specialist to ensure accountability and completeness. For this exercise, the Tiger Analytics team created around 300 pages of main documentation for the bank’s regulatory submission. Several addenda and appendices running to hundreds of pages were also created in a standard format to aid model validation and subsequent internal and external audits.

The FASB’s CECL standards require timely, forward-looking measurement of risk using “reasonable and supportable” forecasts over the lifetime of the loan. This presents a unique challenge for credit cards, since issuers have to estimate loss from the current outstanding balance and ignore future draws. An important part of the modeling approach is calculating paydown balance curves at an account level and then rolling them up to the segment and then to the portfolio. A practical approach for credit cards should balance the need for output accuracy, model sensitivity, and ease of implementation. Model documentation is an important part of the exercise and should be undertaken by risk and modeling specialists to ensure completeness and assign accountability.

References:

[1] Begg C, Gray R. “Calculation of polychotomous logistic regression parameters using individualized regressions.” Biometrika 1984; 71: 11–18.

Maximizing Efficiency: Redefining Predictive Maintenance in Manufacturing with Digital Twins
https://www.tigeranalytics.com/perspectives/blog/ml-powered-digital-twin-predictive-maintenance/

Tiger Analytics leverages ML-powered digital twins for predictive maintenance in manufacturing. By integrating sensor data and other inputs, we enable anomaly detection, forecasting, and operational insights. Our modular approach ensures scalability and self-sustainability, yielding cost-effective and efficient solutions.

Historically, manufacturing equipment maintenance has been done during scheduled service downtime. This involves periodically stopping production to carry out routine inspections, maintenance, and repairs. Unexpected equipment breakdowns disrupt the production schedule, require expensive part replacements, and delay the resumption of operations due to long procurement lead times.

Sensors that measure and record operational parameters (temperature, pressure, vibration, RPM, etc.) have been affixed on machinery at manufacturing plants for several years. Traditionally, the data generated by these sensors was compiled, cleaned, and analyzed manually to determine failure rates and create maintenance schedules. But every equipment downtime for maintenance, whether planned or unplanned, is a source of lost revenue and increased cost. The manual process was time-consuming, tedious, and hard to handle as the volume of data rose.

The ability to predict the likelihood of a breakdown can help manufacturers take pre-emptive action to minimize downtime, keep production on track, and control maintenance spending. Recognizing this, companies are increasingly building both reactive and predictive computer-based models based on sensor data. The challenge these models face is the lack of a standard framework for creating and selecting the right one. Model effectiveness largely depends on the skill of the data scientist. Each model must be built separately, model selection is constrained by time and resources, and models must be updated regularly with fresh data to sustain their predictive value.

As more equipment types come under the analytical ambit, this approach becomes prohibitively expensive. Further, the sensor data is not always leveraged to its full potential to detect anomalies or provide early warnings about impending breakdowns.

In the last decade, the Industrial Internet of Things (IIoT) has revolutionized predictive maintenance. Sensors record operational data in real-time and transmit it to a cloud database. This dataset feeds a digital twin, a computer-generated model that mirrors the physical operation of each machine. The concept of the digital twin has enabled manufacturing companies not only to plan maintenance but to get early warnings of the likelihood of a breakdown, pinpoint the cause, and run scenario analyses in which operational parameters can be varied at will to understand their impact on equipment performance.

Several eminent ‘brand’ products exist to create these digital twins, but the software is often challenging to customize, cannot always accommodate the specific needs of each and every manufacturing environment, and significantly increases the total cost of ownership.

ML-powered digital twins can address these issues when they are purpose-built to suit each company’s specific situation. They are affordable, scalable, self-sustaining, and, with the right user interface, are extremely useful in telling machine operators the exact condition of the equipment under their care. Before embarking on the journey of leveraging ML-powered digital twins, certain critical steps must be taken:

1. Creation of an inventory of the available equipment, associated sensors and data.

2. Analysis of the inventory in consultation with plant operations teams to identify the gaps. Typical issues may include missing or insufficient data from the sensors; machinery that lacks sensors; and sensors that do not correctly or regularly send data to the database.

3. Coordination between the manufacturing operations and analytics/technology teams to address some gaps: installing sensors if lacking (‘sensorization’); ensuring that sensor readings can be and are being sent to the cloud database; and developing contingency approaches for situations in which no data is generated (e.g., equipment idle time).

4. A second readiness assessment, followed by a data quality assessment, must be performed to ensure that a strong foundation of data exists for solution development.

This creates the basis for a cloud-based, ML-powered digital twin solution for predictive maintenance. To deliver the most value, such a solution should:

  • Use sensor data in combination with other data as necessary
  • Perform root cause analyses of past breakdowns to inform predictions and risk assessments
  • Alert operators of operational anomalies
  • Provide early warnings of impending failures
  • Generate forecasts of the likely operational situation
  • Be demonstrably effective to encourage its adoption and extensive utilization
  • Be simple for operators to use, navigate and understand
  • Be flexible to fit the specific needs of the machines being managed

[Figure: Predictive maintenance cycle]

When model-building begins, the first step is to account for the input data frequency. As sensors take readings at short intervals, timestamps must be regularized and the readings resampled for all connected parameters where required. At this stage, data with very low variance or too few observations may be excised. Model data sets containing sensor readings (the independent variables, or predictors) and event data such as failures and stoppages (the dependent variables, or outcomes) are then created for each machine.
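
A minimal pandas sketch of this preparation step is shown below, assuming hypothetical file and column names, a single machine, a 5-minute resampling grid, and a 24-hour failure-label window; real pipelines will differ.

```python
import pandas as pd

# Hypothetical inputs: raw readings (timestamp, machine_id, sensor, value) and an
# event log of failures/stoppages (machine_id, event_start, event_type).
raw = pd.read_parquet("sensor_readings.parquet")
events = pd.read_parquet("event_log.parquet")

machine = "M-101"                                   # illustrative single machine
raw = raw[raw["machine_id"] == machine]
events = events[events["machine_id"] == machine]

# Regularize timestamps: one column per sensor, resampled to a 5-minute grid.
wide = (raw.pivot_table(index="timestamp", columns="sensor", values="value")
           .resample("5min").mean()
           .interpolate(limit=3))                   # bridge small gaps only

# Drop signals with very low variance or too many missing observations.
wide = wide.loc[:, (wide.std() > 1e-3) & (wide.notna().mean() > 0.5)]

# Outcome: does a failure event start within the next 24 hours?
wide["failure_next_24h"] = 0
for e in events.itertuples():
    wide.loc[e.event_start - pd.Timedelta(hours=24):e.event_start, "failure_next_24h"] = 1
```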

To select the right model for anomaly detection, multiple models are tested and scored on the full data set and validated against history. To generate a short-term forecast, gaps related to machine testing or idle time must be accounted for, and a range of models evaluated to determine which one performs best.

Tiger Analytics used a similar approach when building these predictive maintenance systems for an Indian multinational steel manufacturer. Here, we found that regression was the best approach to flag anomalies. For forecasting, the accuracy of Random Forest models was higher than that of ARIMA, ARIMAX, and exponential smoothing.
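
The kind of comparison described above can be prototyped along the following lines; this is an illustrative sketch rather than the engagement’s actual code, with an assumed sensor series, an arbitrary ARIMA order, simple lag features, and a one-step-ahead evaluation for the Random Forest.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_percentage_error
from statsmodels.tsa.arima.model import ARIMA

y = wide["bearing_temp"].dropna()                 # any regularly sampled series (hypothetical name)
train, test = y.iloc[:-288], y.iloc[-288:]        # hold out the last day of 5-minute readings

# ARIMA baseline (order chosen for illustration; tune in practice).
arima_fc = ARIMA(train, order=(2, 1, 2)).fit().forecast(steps=len(test))

# Random Forest on simple lag features, evaluated one step ahead.
def lag_frame(s: pd.Series, lags: int = 12) -> pd.DataFrame:
    return pd.concat({f"lag_{k}": s.shift(k) for k in range(1, lags + 1)}, axis=1)

X_all = lag_frame(y).dropna()
y_all = y.loc[X_all.index]
X_train, X_test = X_all.loc[X_all.index.intersection(train.index)], X_all.loc[test.index]
rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(X_train, y_all.loc[X_train.index])
rf_fc = rf.predict(X_test)

print("ARIMA MAPE:", mean_absolute_percentage_error(test, arima_fc))
print("RF    MAPE:", mean_absolute_percentage_error(test, rf_fc))
```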

[Figure: Predictive maintenance analysis flow]

Using a modular paradigm to build an ML-powered digital twin makes it straightforward to implement and deploy. It is self-sustaining, requiring no frequent manual recalibration, and it is scalable, so it can be rolled out across a wide range of equipment with minimal additional effort and time.

Careful execution of the preparatory actions is as important as strong model-building to the success of this approach and its long-term viability. To address the challenge of low-cost, high-efficiency predictive maintenance in the manufacturing sector, the sustainable solution is a combination of technology, business intelligence, data science, user-centric design, and the operational expertise of manufacturing employees.

This article was first published in Analytics India Magazine.
