Automated daily electricity data collection and analysis for Nanjing University dormitories.
This project automates the collection, processing, and aggregation of electricity consumption data from the NJU epay system. It provides:
✅ Automated daily data collection at 2 AM UTC
✅ Auto-login with captcha recognition - no manual cookie updates needed
✅ Atomic batch operations with rollback on failure
✅ Cookie-based authentication with validation
✅ Monthly data archiving (tar.gz)
✅ Pre-computed statistics for frontend
✅ File-based JSON storage
✅ GitHub Actions automation
✅ Data persistence via Git repository - survives between workflow runs
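The "atomic batch operations with rollback" feature listed above can be pictured with a minimal sketch. It assumes a staging-directory approach in which every file of a batch is written to a temporary directory first and only moved into place once all writes succeed. The name `write_batch_atomically` and the staging strategy are illustrative assumptions, not the project's actual logic in scripts/rollback_failed_run.py.

```python
import json
import shutil
import tempfile
from pathlib import Path


def write_batch_atomically(target_dir: Path, records: dict[str, dict]) -> None:
    """Write every record as JSON under target_dir, or write none of them.

    Hypothetical helper: records maps a relative file path to its payload.
    This sketch assumes the batch only creates new files.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    # Stage next to the target so the final moves stay on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=".staging-", dir=target_dir))
    moved: list[Path] = []
    try:
        # Phase 1: serialize everything into the staging area.
        for rel_path, payload in records.items():
            staged = staging / rel_path
            staged.parent.mkdir(parents=True, exist_ok=True)
            staged.write_text(
                json.dumps(payload, ensure_ascii=False), encoding="utf-8"
            )
        # Phase 2: move staged files into place, remembering each one.
        for rel_path in records:
            final = target_dir / rel_path
            final.parent.mkdir(parents=True, exist_ok=True)
            (staging / rel_path).replace(final)
            moved.append(final)
    except Exception:
        # Rollback: remove anything this batch already moved into place.
        for path in moved:
            path.unlink(missing_ok=True)
        raise
    finally:
        # The staging directory is removed on success and on failure.
        shutil.rmtree(staging, ignore_errors=True)
```

If any payload fails to serialize, the exception is raised during phase 1, so no file from the failed batch ever reaches the target directory.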
How does data persist between GitHub Actions runs?
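Each GitHub Actions run starts in a fresh environment, so data is persisted as follows:
database/summaries/
Data flow:
Run starts → check out the repository (includes the full historical summaries)
↓
Query new data → write to the raw database/ (temporary)
↓
Merge data → load old summary + new data → generate new summary
↓
Commit and push → commit only summaries/ (raw data is discarded)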
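The merge step in the flow above can be sketched as follows. The summary schema used here (`{"readings": {"YYYYMMDD": value}}`) and the function name `merge_summary` are assumptions made for illustration; the real files under database/summaries/ may be shaped differently.

```python
import json
from pathlib import Path


def merge_summary(summary_path: Path, new_readings: dict[str, float]) -> dict:
    """Load the committed summary (if any), fold in new readings, and save it.

    Assumed, hypothetical schema: {"readings": {"YYYYMMDD": value, ...}}.
    """
    if summary_path.exists():
        summary = json.loads(summary_path.read_text(encoding="utf-8"))
    else:
        # First run: there is no history in the repository yet.
        summary = {"readings": {}}
    # Newer values win when the same date is queried twice.
    summary["readings"].update(new_readings)
    # Keep dates sorted so Git diffs between runs stay small.
    summary["readings"] = dict(sorted(summary["readings"].items()))
    summary_path.parent.mkdir(parents=True, exist_ok=True)
    summary_path.write_text(
        json.dumps(summary, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return summary
```

Because the old summary is read from the checked-out repository and the merged result is written back to the same path, only the summary file needs to be committed for history to survive the next run.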
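Storage estimate (500 rooms):
Key configuration:
- database/{campus}/ is ignored by .gitignore (raw data is not committed)
- database/summaries/ is not ignored (aggregated data is committed)
Quickly verify that the frontend displays data: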
# Start a local server
python serve_frontend.py
# Open in a browser
# http://localhost:8000/frontend/
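For readers curious what a local preview server of this kind involves, here is a minimal sketch built on the standard library. It is not the contents of serve_frontend.py; the function names `make_server` and `run` are hypothetical, and the only details taken from the commands above are port 8000 and the /frontend/ path.

```python
import functools
import http.server


def make_server(root: str = ".", port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Build (but do not start) a static-file server rooted at `root`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=root
    )
    # Bind to localhost only so the data is not exposed on the network.
    return http.server.ThreadingHTTPServer(("localhost", port), handler)


def run(root: str = ".", port: int = 8000) -> None:
    """Serve until interrupted with Ctrl+C."""
    with make_server(root, port) as server:
        print(f"Serving {root!r} at http://localhost:{port}/frontend/")
        server.serve_forever()
```

Calling `run()` from the repository root would serve both frontend/ and database/summaries/ from the same origin, which is presumably why the URL points at a subdirectory rather than the site root.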
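Frontend feature demo:
GitHub Actions auto-login (recommended):
The workflow logs in automatically before each query to obtain a cookie, so no manual updates are needed.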
| Secret Name | Description | Example |
|---|---|---|
| NJU_USERNAME | Your student ID | 201250000 |
| NJU_PASSWORD | Unified identity authentication password | your_password |
| YUNMA_TOKEN | Yunma API token | TA6djdhm0NC... |
To get a Yunma token: register at zhuce.jfbym.com → User Center → Token
Configure the room list (optional):
# Edit config/room_ids.txt
echo "53463" > config/room_ids.txt
echo "53464" >> config/room_ids.txt
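As the commands above suggest, config/room_ids.txt holds one room ID per line. A small sketch of how such a file could be read is shown below. The function name `load_room_ids`, the support for `#` comments, and the numeric-only check are assumptions for illustration, not documented behaviour of the query script.

```python
from pathlib import Path


def load_room_ids(path: Path) -> list[str]:
    """Return room IDs from a one-ID-per-line file, skipping blanks and comments."""
    room_ids: list[str] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        # Comment support ("#") is an assumption, not a documented feature.
        if not entry or entry.startswith("#"):
            continue
        # The example IDs are numeric, so this sketch rejects anything else.
        if not entry.isdigit():
            raise ValueError(f"room ID must be numeric, got {entry!r}")
        # De-duplicate while preserving file order.
        if entry not in room_ids:
            room_ids.append(entry)
    return room_ids
```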
Cost: Yunma captcha recognition costs about 0.01-0.03 CNY per call, so the monthly cost is under 1 CNY.
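The monthly figure follows from simple arithmetic, sketched here under the assumption of one captcha call per scheduled daily run and a 31-day month. The call count is an assumption; retries or manual runs would raise it.

```python
def monthly_captcha_cost(
    calls_per_day: int, price_per_call: float, days: int = 31
) -> float:
    """Estimated monthly captcha spend in CNY."""
    return calls_per_day * price_per_call * days


# One login per daily run, at the low and high ends of the quoted price range.
low = monthly_captcha_cost(1, 0.01)
high = monthly_captcha_cost(1, 0.03)
print(f"{low:.2f}-{high:.2f} CNY per month")  # prints "0.31-0.93 CNY per month"
```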
See docs/github-actions-setup.md for details.
Local manual login (fallback):
If you need to obtain a cookie manually:
# Install dependencies
pip install -r requirements.txt
# Configure login credentials
echo "your_username" > /tmp/username
echo "your_password" > /tmp/password
echo "your_yunma_token" > /tmp/token
# Log in automatically
python scripts/nju_auto_login.py
# The cookie is saved to /tmp/cookie.json
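After a manual login it can be useful to confirm that the cookie file looks sane before using it. The sketch below performs a purely structural check and assumes the file holds a flat JSON object mapping cookie names to values. That layout and the function name `load_cookie` are assumptions; the real file may differ, and a real validity check would need a request against the epay system.

```python
import json
from pathlib import Path


def load_cookie(path: Path = Path("/tmp/cookie.json")) -> dict[str, str]:
    """Load a saved cookie file and run a purely structural sanity check.

    Assumes a flat JSON object of cookie name -> value (hypothetical layout).
    """
    data = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(data, dict) or not data:
        raise ValueError("cookie file must contain a non-empty JSON object")
    for name, value in data.items():
        if not isinstance(value, str) or not value:
            raise ValueError(f"cookie {name!r} has an empty or non-string value")
    return data
```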
.
├── .github/workflows/ # GitHub Actions automation
│ ├── daily-query.yml # Scheduled daily collection
│ ├── manual-query.yml # Manual trigger workflow
│ └── data-cleanup.yml # Monthly cleanup/archival
│
├── scripts/ # Processing scripts
│ ├── validate_cookie.py # Cookie validation
│ ├── rollback_failed_run.py # Rollback on failure
│ ├── cleanup_archives.py # Archive management
│ └── aggregate_data.py # Summary generation
│
├── config/
│ └── room_ids.txt # List of room IDs to query
│
├── database/ # Data storage (raw daily data git-ignored; summaries/ committed)
│ ├── [campus]/[building]/[room-id]/[date].json # Daily data
│ ├── archives/ # Monthly archives
│ └── summaries/ # Hierarchical aggregated summaries
│ ├── overview.json # All campuses overview
│ └── campuses/ # Campus → Building → Room hierarchy
│
├── logs/
│ └── query_runs/ # Workflow execution logs
│
├── tests/ # Test suite
│ ├── unit/ # Unit tests
│ └── integration/ # Integration tests
│
├── nju_electric_query.py # Existing query script (unchanged)
└── list_room_ids.py # Existing room ID script (unchanged)
# Find today's file
find database -name "$(date +%Y%m%d).json"
# View data
cat database/仙林校区/19幢/19栋第16层1613-53463/$(date +%Y%m%d).json | jq
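The same lookup can be done from Python when the files need further processing. This sketch relies only on the `[campus]/[building]/[room-id]/[date].json` layout shown in the directory tree; the function name `find_daily_files` is hypothetical.

```python
from datetime import date
from pathlib import Path


def find_daily_files(database: Path, day: date) -> list[Path]:
    """Return every raw data file for the given day, across all rooms."""
    # Mirrors: find database -name "$(date +%Y%m%d).json"
    return sorted(database.rglob(f"{day:%Y%m%d}.json"))
```

For example, `find_daily_files(Path("database"), date.today())` lists today's files for every room that has been queried.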
# View overview (all campuses)
cat database/summaries/overview.json | jq
# View specific campus
cat database/summaries/campuses/仙林校区/summary.json | jq
# View specific building
cat database/summaries/campuses/仙林校区/buildings/19幢/summary.json | jq
# View specific room
cat database/summaries/campuses/仙林校区/buildings/19幢/rooms/53463.json | jq
# Extract specific month
cd database/archives
tar -xzf 2026-05.tar.gz
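When only a quick look inside an archive is needed, Python's standard `tarfile` module can list or unpack it without leaving files in database/archives/. The helper names below are hypothetical, and the sketch assumes the archives are the project's own trusted gzip-compressed tarballs.

```python
import tarfile
from pathlib import Path


def list_archive(archive: Path) -> list[str]:
    """Return the names of all regular files stored in a .tar.gz archive."""
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


def extract_archive(archive: Path, dest: Path) -> None:
    """Unpack a trusted archive into dest, creating dest if needed."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Only use extractall on archives you produced yourself.
        tar.extractall(dest)
```

Extracting into a scratch directory rather than database/archives/ keeps unpacked files away from the archive directory itself.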
See docs/troubleshooting.md for common issues and solutions.
# Install dev dependencies
pip install -r requirements.txt
# Run all tests
pytest tests/
# Run specific test file
pytest tests/unit/test_validate_cookie.py -v
# Format code
black scripts/
# Lint code
ruff check scripts/
This project follows the Data-Business Separation principle:
- Data layer: nju_electric_query.py (unchanged)
- Business layer: scripts/aggregate_data.py, scripts/cleanup_archives.py
Data Flow:
Daily Query → Raw JSON Files → Hierarchical Aggregation
↓
database/summaries/
├── overview.json (all campuses)
└── campuses/
└── {campus}/
├── summary.json
└── buildings/
└── {building}/
├── summary.json
└── rooms/{id}.json
See docs/hierarchical-aggregation.md for detailed usage.
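To make the room → building → campus → overview roll-up concrete, here is a minimal sketch of one way such totals could be computed. The input fields (`campus`, `building`, `room_id`, `kwh`), the `total_kwh` key, and the function name `roll_up` are all assumptions chosen for illustration; scripts/aggregate_data.py may compute different statistics.

```python
def roll_up(rooms: list[dict]) -> dict:
    """Aggregate room-level values into building, campus, and overview totals.

    Each room dict is assumed to carry "campus", "building", "room_id", "kwh".
    """
    overview: dict = {"total_kwh": 0.0, "campuses": {}}
    for room in rooms:
        # Create the campus and building nodes on first sight.
        campus = overview["campuses"].setdefault(
            room["campus"], {"total_kwh": 0.0, "buildings": {}}
        )
        building = campus["buildings"].setdefault(
            room["building"], {"total_kwh": 0.0, "rooms": {}}
        )
        building["rooms"][room["room_id"]] = room["kwh"]
        # Propagate the room's value up every level of the hierarchy.
        for level in (building, campus, overview):
            level["total_kwh"] += room["kwh"]
    return overview
```

Each level of the returned structure corresponds to one file in the tree above: the top level to overview.json, each campus and building node to its summary.json, and each room entry to rooms/{id}.json.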
- logs/query_runs/
- query_success_rate metric
- EPAY_COOKIE secret in GitHub
- dry_run: true
MIT
Built with: