
Added first draft of OLS model with sample data and helper tools #3

Open
ArneTR wants to merge 30 commits into main from OLS-model

Conversation


ArneTR commented Sep 9, 2025

This PR re-works the current model implementation, using an OLS model to estimate the weights.

The implementation is done in Python with the statsmodels library, which makes the formulas used more readable than sklearn's OLS implementation.

OLS was chosen as the model because it has the highest interpretability, allowing direct conclusions about component energy factors.

This PR also adds sample data recorded on my machine (Framebook) to support project development.
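As an illustration of the statsmodels-based approach, here is a minimal sketch of such a fit on synthetic data (the column names cpu_ns, wakeups, and energy_uj and their magnitudes are assumptions for illustration, not the real sample data):

```python
# Minimal OLS fit in the style described above, on synthetic data.
# Column names (cpu_ns, wakeups, energy_uj) are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cpu_ns": rng.uniform(0, 1e9, 200),
    "wakeups": rng.uniform(0, 500, 200),
})
# Synthetic target: the intercept models idle energy, the weights model
# per-component contributions.
df["energy_uj"] = (1e6 + 0.002 * df["cpu_ns"] + 4.5 * df["wakeups"]
                   + rng.normal(0, 1e4, 200))

model = smf.ols("energy_uj ~ cpu_ns + wakeups", data=df).fit()
print(model.summary())           # full diagnostics table
print(model.params["cpu_ns"])    # directly interpretable component weight
```

The formula string makes the model specification readable at a glance, which is the interpretability argument made above.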

Suggested TODOs for further exploration

  • Validate the OLS assumptions
  • Incorporate a ridge (L2) model for testing
  • Consider alternative models, e.g. logarithmic models, which correlate more closely with the energy consumption of a CPU
  • Generally, record a power curve of the system, check whether it is linear, logarithmic, or something else, and draw conclusions from that
  • Generally validate how well the mixed model holds up across repetitions. Currently this was only done once


ArneTR commented Sep 9, 2025

I also want to share some experiences from playing around with different models:

Workload design experiences

Computer systems draw power very differently under different workloads. Thus it was explored whether separate, disparate workloads should be used, or one large mixed workload.

Advantages of the separate workloads could be:

  • Better fit and better prediction
  • Might work especially well in edge cases
  • Might need less sample data to fit

The respective disadvantages:

  • When conditions change, the fit might be unusable. This was tried with a fit on a compute workload: the intercept was already 16 W, although the system draws 4 W in idle. Thus the model had no way of knowing what would happen if the CPU was idle.
  • For later prediction one must know which model to choose when. This is hard without domain knowledge, as a rule like "switch to the compute model when instructions are > 10,000,000" might be misleading on a system with a higher base frequency, or on systems that only have one core and cannot fully sleep it.

Model design experiences

  • Fitting on all variables leads to high collinearity. This means that repeated runs produce highly different weights. This can be combated by higher sample sizes, by changing the sampling interval, or by simply dropping variables or combining them.

    • Needs to be explored further
  • Idle workloads are dominated by wakeups. If a simple model with only wakeups is fitted, idle is properly estimated: around 1 W for the intercept and about 2-3 W contributed by the wakeups.
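The collinearity problem described above can be quantified with variance inflation factors; here is a sketch on synthetic data (the near-collinear cpu_ns/instructions pair is constructed on purpose for illustration):

```python
# Quantifying collinearity with variance inflation factors (VIF).
# A common rule of thumb flags VIF > 10 as problematic; such variables
# produce unstable weights across repeated fits. The data is synthetic,
# with instructions nearly collinear to cpu_ns by construction.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
cpu = rng.uniform(0, 1, 500)
X = pd.DataFrame({
    "const": 1.0,
    "cpu_ns": cpu,
    "instructions": 0.99 * cpu + rng.normal(0, 0.01, 500),  # nearly collinear
    "wakeups": rng.uniform(0, 1, 500),                      # independent
})
vifs = {col: variance_inflation_factor(X.values, i)
        for i, col in enumerate(X.columns)}
print(vifs)  # cpu_ns and instructions get very large VIFs, wakeups stays near 1
```

Dropping or combining one of the offending variables, as suggested above, brings the remaining VIFs back down.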


ArneTR commented Sep 9, 2025

@ribalba ping for you


ArneTR commented Sep 17, 2025

I just upgraded the PR and added some more transformations.

Try it on your box and tell me the values.

Here is what I do:

sudo pkill -f energy-logger.sh
sudo ./energy-logger.sh & # will output the file it will write to
python3 run-workload.py mixed
sudo pkill -f energy-logger.sh

# now the logger will have printed the name of the file it wrote to, e.g. /tmp/energy-0yb45hR0.log

# now you run the exact same stuff again

sudo pkill -f energy-logger.sh
sudo ./energy-logger.sh & # will output the file it will write to
python3 run-workload.py mixed
sudo pkill -f energy-logger.sh

# this second run will have printed a different file name, e.g. /tmp/energy-890342da.log

# Then you can run:

python3 model.py /tmp/energy-0yb45hR0.log --no-validate --fit OLS --predict /tmp/energy-890342da.log

This will effectively fit a model on the first benchmark run and then make an out of sample prediction with the second run.

Also, please run the last command again with --log appended.

Eager for the results :)


ArneTR commented Sep 17, 2025

@ribalba


ArneTR commented Oct 27, 2025

@ribalba I added a prediction stage that now also back-transforms the data from the logarithm space.

It can now be iterated on quite quickly. I added examples for using model.py and its predict stage, together with energy sample data from the newly added endpoint in /sys/kernel/debug/energy/sys.

This can be merged now if you feel the functionality contributes to the tool.

Happy for your feedback.

TODOs (TBD in a different PR though):

  • Automatic validation of model assumptions (condition number, F-statistic, normal distribution of errors, etc.)
    • Statsmodels already calculates all these numbers and outputs them in a statistics table. They currently must be interpreted by the reader, but conditions could be introduced to make this automatic
  • Implementing more and better models
    • The model chokes on numerical errors. A log transformation helps quite a bit, but sacrifices explainability and needs a log transformation in the kernel.
    • Thus other models and other transformations should be explored to make the model more viable
    • Alternatively, models could be situational: a compute model for when compute is happening and an idle model for when the system is idle.
  • Reducing overhead
    • At a 99 ms sampling interval the kernel module requires 10% of one core. That is a negligible load on the 10-core system I am testing on, but combined with 10 Hz sampling from userspace, the 5% margin of system load (above which this PR assumes the system is no longer idle) is quickly reached

@ArneTR ArneTR marked this pull request as ready for review October 27, 2025 06:50

ArneTR commented Jan 10, 2026

I just pushed another big chunk of implementations to the branch:

Models

  • Implemented XGBoost Model
  • Implemented Ridge Model
  • Implemented Huber Regressor

Data Preparation

  • Implemented Standard Scaler
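A sketch of how a Standard Scaler can feed one of the new models, here using a scikit-learn pipeline (feature names, scales, and weights are illustrative assumptions):

```python
# Sketch of a Standard Scaler feeding a Ridge model via a scikit-learn
# pipeline. Feature names, scales, and weights are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.uniform(0, 1e9, size=(400, 3))     # e.g. cpu_ns, instructions, wakeups
y = X @ np.array([2e-3, 3e-4, 4.5]) + rng.normal(0, 1e4, 400)

# Scaling first keeps the Ridge penalty from being dominated by whichever
# feature happens to have the largest raw magnitude.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)
print(model.score(X, y))  # R² on the training data
```

The same pipeline shape works for the Huber Regressor by swapping the final estimator.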

Results

python3 model.py ../sample_data/energy_logs/mixed_workload.log --dump-raw --dump-diff --predict ../sample_data/energy_logs/mixed_workload_2.log --add-intercept --features extra --model xgboost --dump-predictions --no-validate
=> 82% Accuracy

python3 model.py ../sample_data/energy_logs/mixed_workload.log --dump-raw --dump-diff --predict ../sample_data/energy_logs/mixed_workload_2.log --add-intercept --features normal --model ols --dump-predictions --no-validate
=> 66% Accuracy

Summary

  • Huber and Ridge provide no benefit over a scaled OLS
  • XGBoost far outperforms all other models on mixed data. However, the model becomes uninterpretable.
    • If we want to use it in the model output, we need to put a user-space conversion step after the kernel output

Possible next steps

  • Train separate OLS models for idle, compute etc. and see if we can get OLS to > 80% accuracy
  • Massage the dataset even more, removing outliers, apply different scaling strategies etc.

Call me :)


ArneTR commented Apr 3, 2026

@ribalba

This model is now in very good shape.

  • Brought over the benchmark.sh script with more versatile network etc. workloads
  • Made the memory workload more versatile to also support --stream
  • Better output of accuracy, including WMAPE and sMAPE (plain MAPE can collapse due to singularities and report falsely negative accuracy)
  • Added MCP integration (See Mcp granularity green-coding-solutions/green-metrics-tool#1630 for high resolution energy output)
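For reference, the accuracy metrics mentioned above can be sketched as follows (standard definitions; the near-zero sample illustrates why plain MAPE can collapse):

```python
# WAPE and sMAPE vs. plain MAPE. MAPE divides by each individual true
# value and blows up on near-zero samples; WAPE normalizes by the total,
# sMAPE by the symmetric mean of prediction and truth.
import numpy as np

def mape(y_true, y_pred):
    return 100 * np.mean(np.abs((y_true - y_pred) / y_true))

def wape(y_true, y_pred):
    return 100 * np.sum(np.abs(y_true - y_pred)) / np.sum(np.abs(y_true))

def smape(y_true, y_pred):
    return 100 * np.mean(2 * np.abs(y_pred - y_true)
                         / (np.abs(y_true) + np.abs(y_pred)))

y_true = np.array([100.0, 200.0, 0.5])   # one near-zero sample
y_pred = np.array([110.0, 190.0, 1.0])
print(mape(y_true, y_pred))   # dominated by the 0.5 sample
print(wape(y_true, y_pred))   # robust: errors weighted by total energy
```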

Results

The results are on par with the model from #4 ... thus I would favor closing #4 and #5, as they add complexity with no gain

To reproduce:

$ python3 model.py ../sample_data/energy_logs/M2-fixed-frequency-mixed-workload-new.log --no-validate --target rapl_core_sum_uj --add-intercept --features all --predict ../sample_data/energy_logs/M2-fixed-frequency-mixed-workload-new-2.log --model ols --scale
                            OLS Regression Results
==============================================================================
Dep. Variable:       rapl_core_sum_uj   R-squared:                       0.920
Model:                            OLS   Adj. R-squared:                  0.920
Method:                 Least Squares   F-statistic:                     3216.
Date:                Fri, 03 Apr 2026   Prob (F-statistic):               0.00
Time:                        17:29:00   Log-Likelihood:                -30414.
No. Observations:                2244   AIC:                         6.085e+04
Df Residuals:                    2235   BIC:                         6.090e+04
Df Model:                           8
Covariance Type:            nonrobust
================================================================================
                   coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
const         1.058e+06   3938.888    268.579      0.000    1.05e+06    1.07e+06
cpu_ns        4.688e+05   6511.658     71.990      0.000    4.56e+05    4.82e+05
mem           1.216e+04   3959.779      3.072      0.002    4399.143    1.99e+04
instructions  1.976e+05   6016.728     32.842      0.000    1.86e+05    2.09e+05
wakeups       4.503e+04   4489.838     10.028      0.000    3.62e+04    5.38e+04
diski        -1087.3669   4157.814     -0.262      0.794   -9240.948    7066.214
disko         1.692e+04   4193.800      4.033      0.000    8691.217    2.51e+04
rx            5.319e+04   4.68e+04      1.137      0.256   -3.86e+04    1.45e+05
tx           -2.092e+04   4.68e+04     -0.447      0.655   -1.13e+05    7.08e+04
==============================================================================
Omnibus:                      238.860   Durbin-Watson:                   0.813
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1114.578
Skew:                          -0.406   Prob(JB):                    9.39e-243
Kurtosis:                       6.356   Cond. No.                         26.2
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Rescaled params:
 const           274158.747531
cpu_ns               0.002019
mem                  0.000135
instructions         0.000322
wakeups              4.524398
diski               -0.044344
disko                0.081460
rx                 211.375596
tx                -189.914460
dtype: float64
MAE: 127893.85497056974
MAPE (%): 13.415981823943687
WAPE (%): 12.35187010382711
sMAPE (%): 12.744775805994854
R²: 0.9156642472220207

As seen, the average error is around 10%, sometimes more, sometimes less, depending on fluctuation.

With XGBoost, which is the gold standard, we get down to 5%.
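The "Rescaled params" block above converts coefficients from the scaled feature space back into raw units; a sketch of the equivalent arithmetic on synthetic data (all values are assumptions for illustration):

```python
# With z = (x - mean) / std, a weight b_z learned on z corresponds to
# b_x = b_z / std in raw units, and the intercept shifts by -b_z * mean / std.
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 1e9, 1000)
y = 0.002 * x + 1e6 + rng.normal(0, 1e4, 1000)   # true weight 0.002, idle 1e6

mean, std = x.mean(), x.std()
z = (x - mean) / std
b_z, b0_z = np.polyfit(z, y, 1)   # fit in the scaled space

b_x = b_z / std                   # back to raw units: ~0.002
b0_x = b0_z - b_z * mean / std    # raw intercept: ~1e6
print(b_x, b0_x)
```

With more than one feature the same per-column division applies, which is what the rescaling step above does for every coefficient.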

Next steps

To advance further I now need more variables in the procpower output:

  • Unhalted Clock Cycles
    • Is this available per core?
  • Halted Clock Cycles (Might not be available)
    • Is this available per core?
  • Frequency (Only if available directly. Otherwise I will produce it from ClockCycles / WallTime )
    • Is this available per core?
  • APERF
    • Is this available per core?
  • MPERF
    • Is this available per core?
  • DRAM-serviced LLC-miss loads
  • TLB Miss
  • DMA Access
  • Interrupts (is that available distinctly from wakeups?)
  • Other instruction values. What are we using at the moment? Unhalted instructions? Retired instructions?
    • Is this available per core?
  • uOPs (sub-Instructions)
    • Is this available per core?
  • Temperature (Should be in IA32_PACKAGE_THERM_STATUS and can be Intel only for now)
