Rapl node capping daemon #21

Merged
ridv merged 18 commits from raplDaemon into master 2020-01-19 19:52:32 +00:00
ridv commented 2020-01-04 21:29:49 +00:00 (Migrated from github.com)
  • Initial work on RAPL daemon.
  • Python code has been ported over.
  • Tests for daemon have been added.

Left to do in separate PRs:

  • Systemd unit file needs to be added.
  • Scheduler code when calling capper should be rewired to send an HTTP payload instead of starting an expensive SSH connection.
* Initial work on RAPL daemon. * Python code has been ported over. * Tests for daemon have been added. Left to do in separate PRs: * Systemd unit file needs to be added. * Scheduler code when calling capper should be rewired to send an HTTP payload instead of starting an expensive SSH connection.
ridv commented 2020-01-05 17:18:36 +00:00 (Migrated from github.com)

Maybe we can put this in a separate repo since there's no dependencies on the main project.

Maybe we can put this in a separate repo since there's no dependencies on the main project.
pradykaushik (Migrated from github.com) requested changes 2020-01-13 19:55:42 +00:00
pradykaushik (Migrated from github.com) left a comment

I agree that we can place the rapl-daemon code in a separate repo.
I have commented on a few things. Apart from that, everything looks okay to me.

I agree that we can place the rapl-daemon code in a separate repo. I have commented on a few things. Apart from that, everything looks okay to me.
pradykaushik (Migrated from github.com) commented 2020-01-13 19:09:54 +00:00

capable of changing

capable **of** changing
pradykaushik (Migrated from github.com) commented 2020-01-13 19:11:14 +00:00

missing period at the end of the sentence.

missing period at the end of the sentence.
pradykaushik (Migrated from github.com) commented 2020-01-13 19:13:15 +00:00

missing period at end of sentence.

missing period at end of sentence.
@ -0,0 +24,4 @@
func main() {
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Unsupported endpoint %s", html.EscapeString(r.URL.Path))
pradykaushik (Migrated from github.com) commented 2020-01-13 19:12:52 +00:00

error messages should be lowercase.

error messages should be lowercase.
pradykaushik (Migrated from github.com) commented 2020-01-13 19:22:44 +00:00

constraint_1 is the shorter window.

constraint_1 is the shorter window.
pradykaushik (Migrated from github.com) commented 2020-01-13 19:23:24 +00:00

maybe just say cap percentage must be in the range (0, 100].

maybe just say **cap percentage must be in the range (0, 100]**.
pradykaushik (Migrated from github.com) commented 2020-01-13 19:29:52 +00:00

constraint_0 is the longer window.
Going by this we are intending to use the longer window for running average calculations.
So, change the variable name to maxPowerFileLongWindow?

constraint_0 is the longer window. Going by [this](https://github.com/spdfg/elektron/blob/master/scripts/RAPL_PKG_Throttle.py#L48) we are intending to use the longer window for running average calculations. So, change the variable name to `maxPowerFileLongWindow`?
pradykaushik (Migrated from github.com) commented 2020-01-13 19:32:00 +00:00

ditto.

ditto.
pradykaushik (Migrated from github.com) commented 2020-01-13 19:32:28 +00:00

overflows

overflow~s~
pradykaushik (Migrated from github.com) commented 2020-01-13 19:38:12 +00:00

fits on a single line.

fits on a single line.
@ -0,0 +44,4 @@
maxPower, err := maxPower(filepath.Join(base, file.Name(), maxPowerFileLongWindow))
if err != nil {
failed = append(failed, file.Name())
fmt.Println("unable to retreive max power for zone ", err)
pradykaushik (Migrated from github.com) commented 2020-01-13 19:40:01 +00:00

any reason for just printing the error and not returning it?
Basically, the question is whether we need to continue to attempt to powercap if an error occurs anywhere in the process.

any reason for just printing the error and not returning it? Basically, the question is whether we need to continue to attempt to powercap if an error occurs anywhere in the process.
@ -0,0 +53,4 @@
if err := capZone(filepath.Join(base, file.Name(), powerLimitFileLongWindow), powercap); err != nil {
failed = append(failed, file.Name())
fmt.Println("unable to write powercap value: ", err)
pradykaushik (Migrated from github.com) commented 2020-01-13 19:41:13 +00:00

same here regarding whether the error to continue if error occurs or should we just return it.

same here regarding whether the error to continue if error occurs or should we just return it.
pradykaushik (Migrated from github.com) commented 2020-01-13 19:45:13 +00:00

I think it's safe to use spaces ("bad percentage") as the words are hyphenated by default.
Although, I do not mind the camelcasing.

I think it's safe to use spaces ("bad percentage") as the words are hyphenated by default. Although, I do not mind the camelcasing.
@ -0,0 +83,4 @@
}
func TestCapZone(t *testing.T) {
const percentage float64 = .50
pradykaushik (Migrated from github.com) commented 2020-01-13 19:53:17 +00:00

The payload declares percentage as an integer. So, we would never hit this scenario.
Should the percentage in the payload be an integer?

The payload declares `percentage` as an integer. So, we would never hit this scenario. Should the percentage in the payload be an integer?
ridv (Migrated from github.com) reviewed 2020-01-17 17:25:50 +00:00
@ -0,0 +24,4 @@
func main() {
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Unsupported endpoint %s", html.EscapeString(r.URL.Path))
ridv (Migrated from github.com) commented 2020-01-17 17:25:49 +00:00

In this case this is some http content being served back to to the client that made the request. I did make it lower case though.

In this case this is some http content being served back to to the client that made the request. I did make it lower case though.
ridv (Migrated from github.com) reviewed 2020-01-17 17:34:54 +00:00
ridv (Migrated from github.com) commented 2020-01-17 17:34:53 +00:00

Can you expand on this comment?

Can you expand on this comment?
ridv (Migrated from github.com) reviewed 2020-01-17 17:39:01 +00:00
@ -0,0 +44,4 @@
maxPower, err := maxPower(filepath.Join(base, file.Name(), maxPowerFileLongWindow))
if err != nil {
failed = append(failed, file.Name())
fmt.Println("unable to retreive max power for zone ", err)
ridv (Migrated from github.com) commented 2020-01-17 17:39:01 +00:00

Agreed, I don't know what the right path to take is here. If the file write fails, we could be left in an inconsistent state. The only options here are to bail and return an error right away, or to continue trying to cap the node and then returning an array of zones where were unable to cap the node in.

Agreed, I don't know what the right path to take is here. If the file write fails, we could be left in an inconsistent state. The only options here are to bail and return an error right away, or to continue trying to cap the node and then returning an array of zones where were unable to cap the node in.
ridv (Migrated from github.com) reviewed 2020-01-17 17:42:53 +00:00
@ -0,0 +83,4 @@
}
func TestCapZone(t *testing.T) {
const percentage float64 = .50
ridv (Migrated from github.com) commented 2020-01-17 17:42:53 +00:00

The payload does indeed take an integer, but when we calculate powercap, I'd rather multiply by a fraction than to multiply by a (0,100] integer and then divide by 100. This is done in the next line and it's just for testing.

The payload does indeed take an integer, but when we calculate powercap, I'd rather multiply by a fraction than to multiply by a (0,100] integer and then divide by 100. This is done in the next line and it's just for testing.
pradykaushik (Migrated from github.com) reviewed 2020-01-17 22:15:38 +00:00
@ -0,0 +44,4 @@
maxPower, err := maxPower(filepath.Join(base, file.Name(), maxPowerFileLongWindow))
if err != nil {
failed = append(failed, file.Name())
fmt.Println("unable to retreive max power for zone ", err)
pradykaushik (Migrated from github.com) commented 2020-01-17 22:15:38 +00:00

Any chance we can roll back the power limits set to the previous values if and when any of the file writes fail?

Any chance we can roll back the power limits set to the previous values if and when any of the file writes fail?
pradykaushik (Migrated from github.com) reviewed 2020-01-17 22:19:21 +00:00
pradykaushik (Migrated from github.com) commented 2020-01-17 22:19:20 +00:00

Actually, your way of representing the ranges is more explicit and clear. One is not well-versed with range expressions would still be able to understand your explanation. So, ignore my previous statement and leave it as is.

Actually, your way of representing the ranges is more explicit and clear. One is not well-versed with range expressions would still be able to understand your explanation. So, ignore my previous statement and leave it as is.
pradykaushik (Migrated from github.com) reviewed 2020-01-17 22:20:19 +00:00
pradykaushik (Migrated from github.com) commented 2020-01-17 22:20:18 +00:00

Either change overflows to overflow, or get rid of the "an".

Either change **overflows** to **overflow**, or get rid of the "an".
ridv (Migrated from github.com) reviewed 2020-01-18 01:21:05 +00:00
@ -0,0 +44,4 @@
maxPower, err := maxPower(filepath.Join(base, file.Name(), maxPowerFileLongWindow))
if err != nil {
failed = append(failed, file.Name())
fmt.Println("unable to retreive max power for zone ", err)
ridv (Migrated from github.com) commented 2020-01-18 01:21:04 +00:00

If the write fails, we're sort of left in limbo. If the write to modify it failed, then most likely the write to restore the value will fail too. We can try reading it and seeing if that's the same value as before.

If the write fails, we're sort of left in limbo. If the write to modify it failed, then most likely the write to restore the value will fail too. We can try reading it and seeing if that's the same value as before.
ridv (Migrated from github.com) reviewed 2020-01-18 01:21:47 +00:00
ridv (Migrated from github.com) commented 2020-01-18 01:21:46 +00:00

Good catch, fixed.

Good catch, fixed.
ridv (Migrated from github.com) reviewed 2020-01-18 01:22:14 +00:00
ridv (Migrated from github.com) commented 2020-01-18 01:22:14 +00:00

Fixed this, definitely a mistake on my part. Thanks for catching it!

Fixed this, definitely a mistake on my part. Thanks for catching it!
ridv (Migrated from github.com) reviewed 2020-01-18 01:22:28 +00:00
ridv (Migrated from github.com) commented 2020-01-18 01:22:28 +00:00

I'll revert it

I'll revert it
ridv (Migrated from github.com) reviewed 2020-01-18 01:22:53 +00:00
ridv (Migrated from github.com) commented 2020-01-18 01:22:53 +00:00

Good point, fixed.

Good point, fixed.
pradykaushik (Migrated from github.com) approved these changes 2020-01-19 19:36:54 +00:00
This repository is archived. You cannot comment on pull requests.
No description provided.