- enumeration of idle states
- how long it takes to enter+exit a particular idle state
- [ perhaps information about how destructive to CPU caches that particular idle state is. ]
- new driver entry point that allows the scheduler to enter any of the enumerated idle states. Platform code will not change this state; all policy decisions, including the choice of idle state, are made at the power saving policy level.
All of this combines into a 'cost to enter and exit an idle state' estimation plus a way to enter idle states. It should be presented to the scheduler in a platform independent fashion, but without policy embedded: a low level platform driver interface in essence.
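To make the shape of such an interface concrete, here is a rough sketch in C. Every name in it (struct idle_state_desc, struct idle_driver, deepest_state_within, the stub states and their numbers) is made up for illustration and is not an existing kernel API. The driver only describes states and offers an entry point; the selection helper stands in for whatever policy the caller implements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one idle state, as exported by platform code. */
struct idle_state_desc {
    const char *name;
    uint32_t enter_exit_us;          /* estimated time to enter + exit */
    uint32_t cache_damage;           /* 0 = caches kept, higher = more destructive */
    int (*enter)(unsigned int cpu);  /* driver entry point, no policy inside */
};

/* Hypothetical per-platform table, ordered from shallowest to deepest. */
struct idle_driver {
    const struct idle_state_desc *states;
    size_t nr_states;
};

/*
 * Caller-side policy example: the deepest state whose enter+exit cost fits
 * in the given budget. Returns NULL if no state fits.
 */
static const struct idle_state_desc *
deepest_state_within(const struct idle_driver *drv, uint32_t budget_us)
{
    const struct idle_state_desc *best = NULL;

    for (size_t i = 0; i < drv->nr_states; i++) {
        if (drv->states[i].enter_exit_us <= budget_us)
            best = &drv->states[i];
    }
    return best;
}

/* Stub platform driver with invented numbers, for illustration only. */
static int last_entered = -1;

static int stub_enter_shallow(unsigned int cpu)
{
    (void)cpu;
    last_entered = 0;
    return 0;
}

static int stub_enter_deep(unsigned int cpu)
{
    (void)cpu;
    last_entered = 1;
    return 0;
}

static const struct idle_state_desc example_states[] = {
    { "shallow", 2,   0, stub_enter_shallow },
    { "deep",    200, 3, stub_enter_deep },
};

static const struct idle_driver example_driver = { example_states, 2 };
```

The point of the split is that the table and the enter() hooks carry no policy: a different caller could weigh cache_damage or anything else without the platform code changing.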
You're missing an aspect. Deeper idle states on one core allow (on Intel and AMD at least) the other cores to go faster. So it's not as simple as "if I want more performance, go less deep". By going less deep you also reduce the overall performance of the system... as well as increase the power usage.
This aspect really really cannot be ignored, it's quite significant today, and going forward is only going to get more and more significant.
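A toy model of that coupling, in C. This is only a sketch under loud assumptions: the frequency table is invented, the names are hypothetical, and the rule "a core counts against the shared ceiling unless it is in a deep idle state" is a simplification, not a description of any specific CPU.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_core_state { TOY_BUSY, TOY_SHALLOW_IDLE, TOY_DEEP_IDLE };

#define TOY_MAX_CORES 4

/*
 * Invented frequency ceilings in MHz, indexed by the number of cores that
 * count as active. Index 0 means no core is active.
 */
static const uint32_t toy_ceiling_mhz[TOY_MAX_CORES + 1] = {
    0, 3600, 3400, 3200, 3000
};

/*
 * Frequency ceiling available to busy cores. Assumption: cores in shallow
 * idle still count as active, only deep idle frees up headroom.
 */
static uint32_t toy_busy_ceiling_mhz(const enum toy_core_state *cores, size_t n)
{
    size_t active = 0;

    if (n > TOY_MAX_CORES)
        n = TOY_MAX_CORES;   /* the toy table only covers 4 cores */

    for (size_t i = 0; i < n; i++) {
        if (cores[i] != TOY_DEEP_IDLE)
            active++;
    }
    return toy_ceiling_mhz[active];
}
```

With one busy core and three cores parked in shallow idle, the model gives the busy core the lowest ceiling; put the same three cores in deep idle and the busy core gets the highest one. So "go less deep" on the idle cores costs the busy core performance.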