4.1 Pharmaceutical development and manufacturing activities generate a significant amount of data, and the interpretation of such data is becoming increasingly difficult. Individual examination of univariate process variables is relevant but can be significantly complemented by multivariate data analysis (MVDA). MVDA may be particularly appropriate for exploring and handling large sets of heterogeneous data, mapping data of high dimensionality onto lower-dimensional representations, and exposing significant correlations among variables within a single data set or across data sets. MVDA may extract statistically significant information that may enhance process understanding, decision making in process development, process monitoring and control (including product release), product life-cycle management, and continuous improvement.
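As an illustration of mapping data of high dimensionality onto a lower-dimensional representation, the following minimal sketch extracts the first principal component of a small data set by power iteration on the covariance matrix. Principal component analysis is only one of several MVDA techniques and is chosen here for illustration; the data values, variable layout, and function names are hypothetical assumptions and are not taken from this guide.

```python
import math

# Hypothetical batch records: each row is one batch, each column one
# process variable. Values are illustrative only, not real process data.
batches = [
    [1.0, 2.1, -0.9],
    [2.0, 3.9, -2.1],
    [3.0, 6.2, -2.9],
    [4.0, 7.8, -4.1],
    [5.0, 10.1, -5.0],
]


def mean_center(rows):
    """Subtract the column mean from every value; return (centered, means)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    return centered, means


def covariance(centered):
    """Sample covariance matrix (n - 1 denominator) of mean-centered rows."""
    n = len(centered)
    p = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    cv = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return v, eigenvalue


centered, means = mean_center(batches)
cov = covariance(centered)
loading, eigenvalue = first_component(cov)

# Fraction of total variance captured by the single component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigenvalue / total_variance

# One score per batch: three variables summarized by one number.
scores = [sum(x * w for x, w in zip(row, loading)) for row in centered]
```

Because the three hypothetical variables move together, a single score per batch retains nearly all of the variance. Real data would normally also require scaling and a justified choice of the number of components.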
4.2 MVDA is widely used in various industries including the pharmaceutical industry. To achieve a valid outcome, an MVDA model/application should incorporate the following:
4.2.1 A predefined risk-based objective incorporating one or more relevant scientific hypotheses specific to the application;
4.2.2 Sufficient relevant data of requisite quality covering the variance space encountered during intended use, that is, pharmaceutical development, or pharmaceutical manufacturing, or both;
4.2.3 Appropriate data analysis and model utilization practices, including consideration of testing, validation, and qualification of all new data before a model is used to analyze them;
4.2.4 Appropriately trained staff;
4.2.5 Appropriate standard operating procedures; and
4.2.6 Life-cycle management.
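Items 4.2.2 and 4.2.3 can be illustrated with a simple screen that checks whether a new observation lies within the variance space covered by the reference data before a model is applied to it. The sketch below uses the squared Mahalanobis distance for two variables. The data, the variable names, and the acceptance limit are hypothetical assumptions: the limit is the 0.99 quantile of a chi-square distribution with two degrees of freedom, which is only an approximation for a small, roughly bivariate-normal reference set, and it is not a requirement of this guide.

```python
import math

# Hypothetical reference (calibration) observations of two correlated
# process variables, for example temperature and pressure.
reference = [
    (50.0, 1.00), (51.0, 1.03), (52.0, 1.03), (53.0, 1.07),
    (54.0, 1.07), (55.0, 1.10), (52.5, 1.06), (51.5, 1.02),
]


def fit_reference(rows):
    """Return the mean vector and sample covariance terms of 2-D data."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def squared_mahalanobis(point, mean, cov):
    """Squared Mahalanobis distance using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


mean, cov = fit_reference(reference)

# Assumed limit: chi-square 0.99 quantile with 2 degrees of freedom,
# which has the closed form -2 * ln(1 - 0.99).
limit = -2.0 * math.log(1.0 - 0.99)


def within_reference(point):
    """True if the point lies inside the assumed acceptance region."""
    return squared_mahalanobis(point, mean, cov) <= limit


typical = (52.0, 1.04)   # follows the correlation seen in the reference set
atypical = (54.5, 1.01)  # inside each univariate range, breaks the correlation

typical_ok = within_reference(typical)
atypical_ok = within_reference(atypical)
```

The second observation falls inside the range of each variable taken individually, so a univariate check would not flag it. The multivariate distance does, which is the sense in which MVDA complements univariate examination.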
4.3 This guide can be used to support data analysis activities associated with pharmaceutical development and manufacturing, process performance and product quality monitoring in manufacturing, and troubleshooting and investigation events. Technical details of data analysis can be found in the scientific literature, and standard practices in data analysis are already available (such as Practices E1655 and E1790 for spectroscopic applications, Practice E2617 for model validation, and Practice E2474 for utilizing process analytical technology).