Examples & Tutorials

HPCSeries Core includes 12 comprehensive Jupyter notebooks demonstrating real-world use cases.

📓 All notebooks are located in: notebooks/ directory in the repository.

Getting Started

To run these notebooks locally:

# Clone the repository
git clone https://github.com/hpcseries/HPCSeriesCore.git
cd HPCSeriesCore

# Install HPCSeries with examples dependencies
pip install -e ".[examples]"

# Launch Jupyter
jupyter notebook notebooks/

Alternatively, install just the examples dependencies:

pip install jupyter matplotlib pandas seaborn scikit-learn

Notebook Catalog

All notebooks showcase real or realistic datasets and demonstrate practical use cases including: - Time-series analytics (rolling operations, reductions, transformations) - Robust statistics (median, MAD, outlier-resistant methods) - Anomaly detection (z-score and robust methods) - Multi-series processing (batched operations) - Missing data handling (masked operations)

00. Getting Started

Notebook: 00_getting_started.ipynb

Quick introduction to HPCSeries Core

Learn:
  • Basic statistical operations (sum, mean, std, etc.)

  • Rolling window operations

  • Auto-tuning calibration

  • CLI commands (hpcs cpuinfo)

  • Performance comparison with NumPy

Key Functions: hpcs.calibrate(), hpcs.rolling_mean(), hpcs.zscore()

Runtime: ~5 minutes

01. Rolling Mean vs Rolling Median

Notebook: 01_rolling_mean_vs_median.ipynb

Dataset: Daily temperature readings from a weather station (1 year)

Learn:
  • How rolling mean and rolling median differ

  • Why rolling median is more robust to sensor spikes/outliers

  • Performance comparison (50-100x faster than Pandas)

Key Functions: hpcs.rolling_mean(), hpcs.rolling_median(), hpcs.rolling_std()

Performance: 50-100x faster than pandas.rolling()

02. Robust Anomaly Detection (Climate Data)

Notebook: 02_robust_anomaly_climate.ipynb

Dataset: Historical climate data with extreme weather events

Learn:
  • Detect anomalies using z-scores

  • Robust detection using MAD (Median Absolute Deviation)

  • Why robust methods are better for real-world data

Key Functions: hpcs.detect_anomalies(), hpcs.detect_anomalies_robust(), hpcs.mad()

Use Case: Climate monitoring, weather stations, environmental sensors

03. Batched IoT Rolling Operations

Notebook: 03_batched_iot_rolling.ipynb

Dataset: Multiple IoT sensor streams with data gaps

Learn:
  • Process multiple time-series simultaneously

  • Handle sensor offline periods

  • Batched rolling operations for efficiency

Key Functions: hpcs.rolling_mean_batched(), hpcs.rolling_mean_masked()

Use Case: IoT sensor networks, industrial monitoring, smart buildings

04. Axis Reductions (Column Statistics)

Notebook: 04_axis_reductions_column_stats.ipynb

Dataset: Multi-feature sensor array (1000 samples × 10 sensors)

Learn:
  • Compute per-column statistics efficiently

  • Process 2D arrays (samples × features)

  • Alternative to pandas.DataFrame.agg()

Key Functions: hpcs.axis_mean(), hpcs.axis_median(), hpcs.axis_mad()

Use Case: Feature engineering, multi-sensor analysis, data preprocessing

05. Masked Operations (Missing Data)

Notebook: 05_masked_missing_data.ipynb

Dataset: Sensor data with missing values (NaN gaps)

Learn:
  • Handle missing data without imputation

  • Compute statistics ignoring invalid values

  • Masked rolling operations

Key Functions: hpcs.mean_masked(), hpcs.median_masked(), hpcs.rolling_mean_masked()

Use Case: Real-world sensor data, incomplete datasets, telemetry with dropouts

06. Performance Calibration

Notebook: 06_performance_calibration.ipynb

Auto-tuning HPCSeries for your hardware

Learn:
  • How calibration works

  • Benchmarking before/after calibration

  • Understanding SIMD and OpenMP thresholds

  • Configuration persistence

Key Functions: hpcs.calibrate(), hpcs.save_calibration_config(), hpcs.simd_info()

Use Case: First-time setup, performance optimization

07. C-Optimized Operations

Notebook: 07_c_optimized_operations.ipynb

Deep dive into SIMD acceleration

Learn:
  • How SIMD vectorization works

  • AVX2 vs AVX-512 vs SSE2 performance

  • When SIMD provides benefits

  • Cache-friendly algorithms

Key Functions: hpcs.get_cpu_info(), hpcs.simd_info()

Use Case: Understanding performance characteristics, optimization

08. NumPy/Pandas Migration Guide

Notebook: 08_numpy_pandas_migration_guide.ipynb

Step-by-step migration examples

Learn:
  • Drop-in replacements for NumPy functions

  • Converting Pandas rolling operations

  • Performance comparisons

  • When to use NumPy/Pandas vs HPCSeries

Use Case: Migrating existing code, evaluating HPCSeries

09. Real-World Applications

Notebook: 09_real_world_applications.ipynb

Complete examples across multiple domains

Learn:
  • Financial data analysis (stock prices, Sharpe ratios)

  • Sensor stream processing (real-time monitoring)

  • Scientific computing (large-scale simulations)

Use Case: End-to-end workflows, production deployments

Case Studies

Kaggle Competition: Store Sales Forecasting

Notebooks:

Competition: Kaggle Store Sales - Time Series Forecasting

Learn:
  • Feature engineering with rolling statistics

  • Lag features and time-based aggregations

  • Performance optimization for Kaggle kernels

  • HPCSeries vs Pandas for competition work

Results: ~10x faster feature engineering

Running in the Cloud

Google Colab

Upload notebooks to Colab and install HPCSeries:

!pip install hpcs

import hpcs
print(hpcs.__version__)

Binder

Launch an interactive environment (if configured):

# Add binder configuration to your repo
# Create environment.yml with dependencies

Contributing Notebooks

We welcome notebook contributions! See Contributing for guidelines.

Good notebook topics:
  • Domain-specific applications (finance, biology, physics)

  • Integration with other libraries (Dask, PyTorch, etc.)

  • Performance comparisons and benchmarks

  • Advanced use cases and optimizations

Support

  • Questions? Open an issue on GitHub

  • Want to share your notebook? Submit a pull request!

  • Found a bug in a notebook? Please report it

Download All Notebooks

All notebooks are included in the repository:

git clone https://github.com/hpcseries/HPCSeriesCore.git
cd HPCSeriesCore/notebooks

Or download as ZIP from GitHub.