Statistics & Data Mining
Third Millennium Analytics, Inc. can provide prompt and accurate analysis and can assist with clear and meaningful interpretation of the results. We take a methodical approach to analysis, have diverse programming capabilities, and are experienced in applying a wide variety of statistical and data mining techniques. It is important to us that you fully understand why specific techniques are applied, and that the results are informative enough to help you plan strategically or make adjustments that improve your organization’s effectiveness, productivity, and efficiency. We strive to apply the most appropriate techniques, those whose results readily support effective, evidence-based decision-making.
Statistical Techniques. We have experience applying a wide variety of statistical techniques, ranging from non-parametric tests to multivariate techniques, including the general linear model (e.g., analysis of variance, ordinary least squares & multiple regression), models with non-normally distributed outcomes (logistic regression, ordinal & multinomial logit modeling, Poisson & negative binomial regression, Cox regression, probit regression), causal modeling techniques (partial & canonical correlation analysis, path analysis, structural equation modeling), and techniques designed to establish dimensionality (exploratory & confirmatory factor analysis, latent trait analysis, item response theory). Other techniques include, but are not limited to, multilevel or hierarchical linear modeling, event history analysis, hazards modeling, longitudinal data analysis (generalized estimating equations & mixed/random effects modeling), survival analysis, life table analysis, missing data analysis (including multiple imputation), data mining, and classification and regression tree analysis.
We begin by learning the details of your research project, i.e., its design and implementation, and any issues that occurred during data collection, coding, or entry. We then carefully consider the level of measurement and univariate distribution of each variable, and the bivariate associations among them. Next, we determine whether a multivariate analysis, which allows for an examination of more complex associations, is appropriate. Before conducting any such analysis, we carefully examine whether any of its required assumptions are violated. If violations are present, we either address them or opt for a more appropriate technique. Though meticulous, this process is necessary to avoid the misapplication of statistical techniques, which can lead to misinterpreted or misleading results.
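As a rough illustration of the univariate and bivariate screening described above, the following Python sketch summarizes a single variable and computes a Pearson correlation between two variables. It is a minimal sketch under stated assumptions, not a description of our actual tooling: the function names `describe` and `pearson_r` and the sample values are hypothetical, and a real project would involve many more checks.

```python
from statistics import mean, pstdev


def describe(values):
    """Univariate summary: count, mean, population SD, and skewness."""
    n = len(values)
    m = mean(values)
    s = pstdev(values)
    # Skewness near 0 suggests symmetry; large positive values suggest a long right tail.
    skew = sum((v - m) ** 3 for v in values) / (n * s ** 3) if s > 0 else 0.0
    return {"n": n, "mean": m, "sd": s, "skewness": skew}


def pearson_r(x, y):
    """Bivariate association: Pearson correlation between two equal-length lists."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / (sxx * syy) ** 0.5


# Hypothetical example data, for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]
summary = describe(hours)
r = pearson_r(hours, score)
```

A strongly skewed distribution or an unexpectedly weak bivariate association flagged at this stage is the kind of finding that can steer the choice of technique before any multivariate model is fit.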
While complex multivariate statistical techniques often yield informative and useful results, it is important to note that simpler yet still powerful techniques may be more appropriate for your project, as they may yield more meaningful and readily interpretable results. Given possible data constraints, such as a small sample size, simpler techniques may also be necessary for statistically valid analyses.
Data Mining Techniques. Data mining can be applied to unlock value in enterprise data, improve processes and operations, and support more informed decisions. Drawing from statistics, machine learning, pattern recognition, database technology, information retrieval, and artificial intelligence, these techniques are a means of knowledge discovery and are used to derive useful information from large amounts of data. Data mining broadly includes techniques or algorithms for predictive modeling (e.g., classification and regression), descriptive modeling (e.g., cluster analysis/market segmentation, density estimation, dependency modeling), association analysis (e.g., similarity matching, co-occurrence grouping), and profiling (e.g., behavior description, outlier analysis/anomaly detection).
Programming Capabilities. Our programming skills span a variety of statistical and analytical software packages, including IBM® SPSS Statistics® and SPSS Modeler®, Stata®, SAS®, Mplus®, HLM®, LISREL®, AMOS®, MICE®, and CART®, among others.
Statistical Power Analysis. Statistical considerations ideally begin at the initial stages of a research project, so that the design will allow for adequate statistical power while balancing considerations regarding precision and resource constraints. When these issues are insufficiently addressed at early stages, clients are often left with disappointing, misleading, or otherwise uninterpretable results.
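To make the trade-off between power and resources concrete, the following Python sketch approximates the sample size needed per group for a two-sided comparison of two group means. It is a minimal sketch under stated assumptions: it uses the normal approximation (an exact t-test calculation gives slightly larger values), it assumes equal group sizes and a standardized effect size (Cohen's d), and the function name `sample_size_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must be between 0 and 1")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile corresponding to the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller expected effects require substantially larger samples.
n_medium = sample_size_per_group(0.5)  # 63 per group
n_small = sample_size_per_group(0.2)   # 393 per group
```

Under these assumptions, detecting a medium effect (d = 0.5) with 80% power at the 5% significance level takes roughly 63 participants per group, while a small effect (d = 0.2) takes roughly 393, which is why the expected effect size is worth discussing before data collection begins.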
Data Security. Data security and the confidentiality and privacy of research participants are extremely high priorities for us. We adhere to rigorous data security protocols, store data only on encrypted hard drives, require secure data transfers, and strictly follow human subjects protocols. We respect the fact that some clients do not wish to have it publicly known that an outside organization has assisted them with their research. Without our clients’ express written consent, we do not disclose their identity, their research projects, or their findings to anyone outside of our firm.