Stats Models
Overview
The Stats Models service provides statistical modeling and analysis capabilities for time series data and general statistical computations. It offers classical statistical methods including smoothing algorithms, forecasting models (ARIMA, SARIMA), regression analysis, and data transformations.
This model is designed for data scientists, analysts, and engineers who need proven statistical methods for data analysis, hypothesis testing, and forecasting.
Key Capabilities
Time Series Forecasting
- ARIMA Models — AutoRegressive Integrated Moving Average for trend-based forecasting
- SARIMA Models — Seasonal ARIMA for data with seasonal patterns
- Polynomial Regression — Fit polynomial trends to time series
- Forecast Intervals — Confidence bounds on predictions
Data Smoothing
- Moving Averages — Simple, weighted, and exponential moving averages
- Savitzky-Golay Filter — Polynomial smoothing that preserves peaks
- Lowess Smoothing — Locally weighted regression for non-parametric smoothing
- Seasonal Decomposition — Separate trend, seasonal, and residual components
Statistical Analysis
- Descriptive Statistics — Mean, median, variance, quantiles
- Distribution Fitting — Fit probability distributions to data
- Hypothesis Testing — T-tests, chi-square, ANOVA
- Correlation Analysis — Measure relationships between variables
Data Transformations
- Differencing — Remove trends and seasonality
- Log Transformations — Stabilize variance
- Normalization — Scale data to standard ranges
- Detrending — Remove systematic trends
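The transformations listed above can be sketched in a few lines of plain Python. This is an illustration only: the function names are hypothetical and are not the service's API, and the detrending shown assumes a straight-line trend.

```python
import math
from statistics import mean


def difference(series, lag=1):
    """Return series[t] - series[t - lag].

    lag=1 removes a linear trend; lag=<season length> removes a stable
    seasonal pattern. The result is `lag` points shorter than the input.
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def log_transform(series):
    """Natural log of each value; only defined for strictly positive data."""
    if any(x <= 0 for x in series):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(x) for x in series]


def min_max_normalize(series):
    """Rescale values to the range [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [(x - lo) / (hi - lo) for x in series]


def detrend_linear(series):
    """Subtract the least-squares straight line fitted against the index."""
    n = len(series)
    x_mean = (n - 1) / 2.0
    y_mean = mean(series)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series)) / sum(
        (x - x_mean) ** 2 for x in range(n)
    )
    intercept = y_mean - slope * x_mean
    return [y - (intercept + slope * x) for x, y in enumerate(series)]
```

For example, `difference([1, 3, 6, 10])` gives `[2, 3, 4]`: the upward trend in the levels becomes a much flatter series of increments.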
Use Cases
Time Series Forecasting
Scenario: A supply chain team needs to forecast monthly demand using historical sales data with seasonal patterns.
Workflow:
- Load historical monthly sales data
- Detect seasonality and trend
- Fit SARIMA model
- Generate 12-month forecast with confidence intervals
- Use forecasts for procurement planning
Value: Make data-driven inventory and procurement decisions based on statistical predictions.
Data Smoothing for Trend Analysis
Scenario: A finance team needs to identify underlying trends in noisy revenue data.
Workflow:
- Import daily revenue data (high variability)
- Apply exponential smoothing
- Visualize smoothed trend
- Identify inflection points and changes
- Present to leadership for strategic planning
Value: See the signal through the noise for better strategic insights.
Quality Control Monitoring
Scenario: A manufacturing plant monitors product quality metrics and needs to detect when the process is out of control.
Workflow:
- Collect quality measurements over time
- Calculate rolling mean and standard deviation
- Identify when measurements exceed control limits
- Flag batches for investigation
- Trigger corrective actions
Value: Detect quality issues early before they affect customers.
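The rolling-statistics step in this workflow can be sketched as follows. The window length and the three-sigma multiplier are assumptions chosen for illustration (a common control-chart convention), not documented service defaults, and `flag_out_of_control` is a hypothetical helper.

```python
from statistics import mean, stdev


def flag_out_of_control(values, window=5, k=3.0):
    """Return indices of points outside mean +/- k * std of the preceding window.

    Each point is compared with limits computed from the `window`
    observations immediately before it, so the point itself never
    influences its own limits.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        center = mean(history)
        spread = stdev(history)  # sample standard deviation
        if abs(values[i] - center) > k * spread:
            flagged.append(i)
    return flagged


measurements = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 14.0, 10.0]
print(flag_out_of_control(measurements))  # index 6 (the 14.0 reading) is flagged
```

One design caveat: an outlier enters the history of the following points and widens their limits. Computing the limits once from a known in-control baseline period avoids that, at the cost of not adapting to slow drift.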
Seasonality Decomposition
Scenario: A retail analyst wants to understand how much of sales variation is due to trends vs seasonal effects vs random noise.
Workflow:
- Load multi-year sales time series
- Decompose into trend, seasonal, and residual components
- Analyze each component separately
- Quantify contribution of each to total variation
- Use insights to improve forecasting
Value: Understand drivers of variation for better planning and forecasting.
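To make the decomposition step concrete, here is a minimal sketch of classical additive decomposition (series = trend + seasonal + residual). It is one textbook approach and may differ from the method the service uses; `decompose_additive` is a hypothetical name.

```python
def centered_moving_average(series, period):
    """Trend estimate from a centered moving average.

    For an even period the two end points of each window get half weight
    (a "2 x period" average) so the window stays centered. Positions where
    the window does not fit are returned as None.
    """
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
            trend[t] = total / period
    return trend


def decompose_additive(series, period):
    """Split a series into trend, seasonal, and residual lists."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal cycles")
    trend = centered_moving_average(series, period)

    # Average the detrended values at each position within the cycle.
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]

    # Center the pattern so the seasonal effects sum to zero over one cycle.
    offset = sum(raw) / period
    pattern = [s - offset for s in raw]
    seasonal = [pattern[t % period] for t in range(len(series))]

    residual = [
        None if tr is None else y - tr - s
        for y, tr, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual
```

Comparing the variance of each returned component with the variance of the original series gives a rough answer to the "how much is trend vs. seasonality vs. noise" question in the scenario.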
Model Inputs
The Stats Models service accepts:
- Time Series Data — Observations with timestamps or sequence indices
- Model Parameters — Configuration for statistical methods (ARIMA orders, smoothing windows)
- Analysis Options — Which statistics or transformations to compute
- Confidence Levels — For forecast intervals (e.g., 80%, 95%)
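As a rough illustration of how a confidence level translates into interval width, the sketch below uses the textbook normal approximation. It assumes forecast errors are normally distributed with a known standard deviation; how the service actually computes its intervals is not described here, and `normal_interval` is a hypothetical helper.

```python
from statistics import NormalDist


def normal_interval(point_forecast, forecast_std, confidence=0.95):
    """Two-sided interval around a point forecast under a normal error model."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Quantile of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * forecast_std
    return point_forecast - margin, point_forecast + margin


low_80, high_80 = normal_interval(100.0, 10.0, confidence=0.80)
low_95, high_95 = normal_interval(100.0, 10.0, confidence=0.95)
print(round(low_80, 1), round(high_80, 1))
print(round(low_95, 1), round(high_95, 1))
```

A higher confidence level always yields a wider interval for the same forecast and error spread, which is why 80% intervals are often preferred for operational planning and 95% for risk assessment.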
Model Outputs
The model produces:
- Forecasts — Predicted future values with confidence intervals
- Smoothed Data — Noise-reduced time series
- Statistical Metrics — Descriptive statistics and model fit measures
- Decomposition Results — Separated trend, seasonal, and residual components
- Transformed Data — Differenced, logged, or normalized series
Available Methods
Forecasting Methods
ARIMA (AutoRegressive Integrated Moving Average)
- Best for: Trend-based forecasting without seasonality
- Parameters: AR order (p), differencing (d), MA order (q)
- Use when: Data has trends but no strong seasonal patterns
SARIMA (Seasonal ARIMA)
- Best for: Forecasting with both trend and seasonality
- Parameters: ARIMA orders + seasonal period and orders
- Use when: Data has recurring seasonal patterns (daily, weekly, yearly)
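For reference, the orders named above map onto the standard textbook form of these models as follows. The notation is an assumption, since this page does not define symbols: B is the backshift operator (B y_t = y_{t-1}), s is the seasonal period, and ε_t is white noise.

```latex
\begin{aligned}
\text{ARIMA}(p,d,q):\quad
  & \phi(B)\,(1-B)^{d}\,y_t = c + \theta(B)\,\varepsilon_t \\
\text{SARIMA}(p,d,q)(P,D,Q)_s:\quad
  & \Phi(B^{s})\,\phi(B)\,(1-B)^{d}\,(1-B^{s})^{D}\,y_t
    = c + \Theta(B^{s})\,\theta(B)\,\varepsilon_t \\[4pt]
\text{where}\quad
  & \phi(B) = 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
    \theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}, \\
  & \Phi(B^{s}) = 1 - \Phi_1 B^{s} - \dots - \Phi_P B^{sP}, \qquad
    \Theta(B^{s}) = 1 + \Theta_1 B^{s} + \dots + \Theta_Q B^{sQ}.
\end{aligned}
```

Setting P = D = Q = 0 reduces the SARIMA form to plain ARIMA, which is why SARIMA is described as ARIMA orders plus a seasonal period and seasonal orders.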
Polynomial Regression
- Best for: Fitting smooth trends
- Parameters: Polynomial degree
- Use when: Trend is smooth and well-approximated by polynomials
Smoothing Methods
Simple Moving Average (SMA)
- Average of last N observations
- Best for: Quick smoothing, trend identification
Exponential Moving Average (EMA)
- Weighted average favoring recent observations
- Best for: Responsive smoothing, recent trend tracking
Weighted Moving Average (WMA)
- Custom weights for different observations
- Best for: Emphasizing specific periods
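The three moving averages described above differ only in how they weight past observations. A minimal sketch, with hypothetical function names and the common convention that the EMA starts at the first observation:

```python
def simple_moving_average(series, window):
    """Mean of the last `window` observations (output is window - 1 shorter)."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_moving_average(series, alpha):
    """s[0] = y[0]; s[t] = alpha * y[t] + (1 - alpha) * s[t - 1].

    alpha close to 1 tracks recent values closely; alpha close to 0
    smooths more heavily.
    """
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed


def weighted_moving_average(series, weights):
    """Weighted mean over a sliding window.

    weights[-1] applies to the most recent observation in each window;
    the weights are normalized by their sum.
    """
    window, total = len(weights), sum(weights)
    return [
        sum(w * y for w, y in zip(weights, series[i - window + 1:i + 1])) / total
        for i in range(window - 1, len(series))
    ]
```

For example, `simple_moving_average([1, 2, 3, 4, 5], 3)` returns `[2.0, 3.0, 4.0]`, while `weighted_moving_average` with weights `[1, 2, 3]` pulls each value toward the most recent observation in its window.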
Savitzky-Golay Filter
- Polynomial smoothing in a moving window
- Best for: Smoothing while preserving peaks and features
Lowess (Locally Weighted Scatterplot Smoothing)
- Non-parametric smoothing
- Best for: Complex non-linear trends
Statistical Methods
Descriptive Statistics
- Mean, median, mode, variance, standard deviation
- Quartiles, percentiles, range
- Skewness, kurtosis
Correlation Analysis
- Pearson correlation (linear relationships)
- Spearman correlation (monotonic relationships)
- Cross-correlation (time-lagged relationships)
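The three correlation measures can be sketched in plain Python as follows. The helper names are hypothetical, and the cross-correlation shown handles only non-negative lags on equal-length series.

```python
import math


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cross_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag] for lag >= 0 (x leading y)."""
    if lag < 0 or len(x) != len(y):
        raise ValueError("expects lag >= 0 and equal-length series")
    return pearson(x[:len(x) - lag], y[lag:])
```

On `x = [1, 2, 3, 4]` and `y = [1, 4, 9, 16]`, Spearman is 1 because the relationship is perfectly monotonic, while Pearson is slightly below 1 because it is not linear.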
Configuration Options
Key parameters you can configure:
- Model Type — Which statistical method to use
- Model Orders — AR, MA, differencing parameters for ARIMA
- Seasonal Period — Length of seasonal cycle (e.g., 7 for weekly, 12 for monthly)
- Smoothing Window — Number of points for moving averages
- Confidence Level — For forecast intervals (typically 80% or 95%)
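One way to keep these options consistent before running an analysis is to collect them in a small validated structure. The sketch below is purely illustrative: `StatsModelConfig` and its field names are hypothetical and do not reflect the service's actual configuration schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StatsModelConfig:
    """Hypothetical container mirroring the options listed above."""

    model_type: str = "arima"                 # which statistical method to use
    order: Tuple[int, int, int] = (1, 1, 1)   # (p, d, q) for ARIMA
    seasonal_period: Optional[int] = None     # e.g. 7 for weekly, 12 for monthly
    smoothing_window: Optional[int] = None    # points per moving-average window
    confidence_level: float = 0.95            # for forecast intervals

    def __post_init__(self):
        if any(v < 0 for v in self.order):
            raise ValueError("model orders must be non-negative")
        if self.model_type == "sarima" and self.seasonal_period is None:
            raise ValueError("SARIMA needs a seasonal period")
        if self.seasonal_period is not None and self.seasonal_period < 2:
            raise ValueError("seasonal period must be at least 2")
        if self.smoothing_window is not None and self.smoothing_window < 1:
            raise ValueError("smoothing window must be at least 1")
        if not 0.0 < self.confidence_level < 1.0:
            raise ValueError("confidence level must be between 0 and 1")


monthly = StatsModelConfig(model_type="sarima", seasonal_period=12)
```

Validating up front catches mismatches such as a seasonal model without a seasonal period before any fitting time is spent.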
Integration with Other Models
The Stats Models service works well with:
- Tiny Time Mixers — Compare ML forecasts with statistical methods
- Data Loader — Prepare time series data for analysis
- Linear Systems — Combine statistical and physics-based modeling
- AI Agent Python — Automate statistical analysis and interpretation
Statistical vs Machine Learning Forecasting
Use Statistical Models When:
- You have smaller datasets (dozens to hundreds of observations)
- You need interpretable models with explainable parameters
- You understand the data generation process (trend, seasonality)
- You need confidence intervals and statistical rigor
- Compliance requires traditional statistical methods
Use Machine Learning (Tiny Time Mixers) When:
- You have larger datasets (hundreds to thousands of observations)
- You're forecasting many series simultaneously
- Patterns are complex and hard to specify manually
- You need zero-shot forecasting without parameter tuning
- Speed matters more than interpretability
Best Approach: Combine Both
- Use statistical methods for interpretation and understanding
- Use ML for accurate multi-series forecasting
- Compare both for robustness
- Ensemble predictions for better reliability
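The simplest form of ensembling is a weighted average of the forecasts from each approach. A minimal sketch, assuming both models forecast the same horizon; `ensemble_forecast` is a hypothetical helper and the numbers are made up for illustration.

```python
def ensemble_forecast(forecasts, weights=None):
    """Weighted average of several equal-length forecast lists.

    With no weights given, every forecast counts equally.
    """
    horizon = len(forecasts[0])
    if any(len(f) != horizon for f in forecasts):
        raise ValueError("all forecasts must cover the same horizon")
    if weights is None:
        weights = [1.0] * len(forecasts)
    if len(weights) != len(forecasts):
        raise ValueError("need exactly one weight per forecast")
    total = sum(weights)
    return [
        sum(w * f[h] for w, f in zip(weights, forecasts)) / total
        for h in range(horizon)
    ]


statistical = [100.0, 110.0]   # illustrative values, e.g. from SARIMA
ml_based = [104.0, 106.0]      # illustrative values, e.g. from an ML model
print(ensemble_forecast([statistical, ml_based]))  # [102.0, 108.0]
```

Weights can be set from each model's backtest error, giving more influence to whichever approach has been more accurate on held-out data.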
Performance Notes
- Sample Size — Larger datasets take longer but produce more reliable estimates
- Model Complexity — Higher-order ARIMA models are slower to fit
- Smoothing Windows — Larger windows slow computation slightly
- Use Caching — Cache results for repeated analysis
Getting Started
Basic Workflow
- Prepare Data — Format time series with timestamps
- Select Method — Choose statistical technique for your goal
- Configure Parameters — Set model orders, windows, or other options
- Add to Workflow — Drag Stats Models into workflow canvas
- Run Analysis — Execute and interpret results
Example: SARIMA Forecasting
[Load Time Series] → [Stats Models: SARIMA] → [Visualize Forecast]
This workflow loads seasonal data, fits a SARIMA model, and produces a forecast.
Example: Data Smoothing Pipeline
[Load Noisy Data] → [Stats Models: Exponential Smoothing] → [Trend Analysis]
This workflow smooths noisy data to reveal underlying trends for analysis.
Best Practices
Model Selection
- Plot Your Data First — Visualize to understand patterns
- Check for Seasonality — Use SARIMA if seasonal patterns exist
- Start Simple — Begin with low-order models, add complexity if needed
- Validate — Test on held-out data before trusting forecasts
Parameter Tuning
- Use Information Criteria — AIC/BIC for model selection
- Check Residuals — Residuals should look like white noise
- Grid Search — Try multiple parameter combinations
- Domain Knowledge — Use known seasonal periods (7 for weekly, 12 for monthly)
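Two of these checks are easy to sketch by hand. The code below computes AIC and BIC in their Gaussian least-squares form (valid up to an additive constant, so only differences between models fitted to the same data are meaningful) and lists the residual lags whose autocorrelation falls outside the approximate 95% white-noise band of ±1.96/√n. Function names are hypothetical, and fitted libraries typically report these values directly.

```python
import math


def aic_bic(residuals, n_params):
    """AIC and BIC from residuals, using n * ln(RSS / n) as the fit term."""
    n = len(residuals)
    rss = sum(e * e for e in residuals)
    fit_term = n * math.log(rss / n)
    return fit_term + 2 * n_params, fit_term + n_params * math.log(n)


def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag."""
    n = len(values)
    m = sum(values) / n
    denom = sum((v - m) ** 2 for v in values)
    num = sum((values[t] - m) * (values[t - lag] - m) for t in range(lag, n))
    return num / denom


def significant_lags(residuals, max_lag=10):
    """Lags whose autocorrelation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(residuals))
    return [
        lag for lag in range(1, max_lag + 1)
        if abs(autocorrelation(residuals, lag)) > bound
    ]
```

An empty list from `significant_lags` is consistent with white-noise residuals. With many lags, an occasional marginal exceedance is expected by chance, so treat this as a screening check rather than a formal test.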
Forecast Validation
- Backtesting — Test on historical data
- Rolling Forecasts — Validate with realistic update frequency
- Comparison — Benchmark against naive methods
- Monitor — Track actual vs predicted performance
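Backtesting with a rolling origin and benchmarking against naive methods can be sketched as below. The forecasters shown are the usual baselines (last value, and same point one season ago); any model can be plugged in as long as it takes a history and a horizon. All names are hypothetical.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, period):
    """Repeat the value observed at the same position in the last full cycle."""
    start = len(history) - period
    return [history[start + (h % period)] for h in range(horizon)]


def rolling_origin_mae(series, forecaster, initial_train, horizon=1):
    """Mean absolute error over successive forecast origins.

    At each origin the forecaster sees only series[:origin], predicts
    `horizon` steps ahead, and is scored against the actual values.
    """
    errors = []
    for origin in range(initial_train, len(series) - horizon + 1):
        predicted = forecaster(series[:origin], horizon)
        actual = series[origin:origin + horizon]
        errors.extend(abs(a - p) for a, p in zip(actual, predicted))
    return sum(errors) / len(errors)


series = [1, 2, 3, 4] * 3  # perfectly seasonal toy data, period 4
print(rolling_origin_mae(series, naive_forecast, initial_train=4))
print(rolling_origin_mae(
    series, lambda hist, h: seasonal_naive_forecast(hist, h, 4), initial_train=4
))
```

A candidate model is only worth deploying if its rolling-origin error beats these baselines; on strongly seasonal data the seasonal naive forecast is often a surprisingly hard benchmark.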
Troubleshooting
Model Won't Converge
- Reduce model complexity (lower AR/MA orders)
- Increase data sample size
- Check for stationarity (use differencing if needed)
- Remove outliers from data
Poor Forecast Accuracy
- Try different model orders
- Check if seasonality is properly specified
- Ensure sufficient historical data
- Consider whether the data is predictable at all (some series are mostly noise)
Residuals Show Patterns
- The model is missing structure (trend, seasonality, or autocorrelation)
- Try a higher-order model or add seasonal terms
- Check for regime changes or structural breaks
- Consider whether a linear model is appropriate
Next Steps
- Build a workflow: Building and Configuring Workflows
- Understand orchestration: Workflow Execution Manager
- Explore other models: Modelling Library
