Describing PCP data metrics and PCP logs.

Pradyumna Kaushik 2018-10-05 17:27:53 -04:00
parent bcc2b4e4cc
commit 1902d1a3e8

@ -0,0 +1,58 @@
# Performance Co-Pilot (PCP) Data
_Elektron_ uses PCP to collect the performance metrics listed in the [pcp config](../../config) file.
[pmdumptext](https://pcp.io/man/man1/pmdumptext.1.html) retrieves the data. The command used to retrieve the performance metrics is shown below (it can also be found [here](https://gitlab.com/spdf/elektron/blob/master/pcp/pcp.go#L15)).
```commandline
pmdumptext -m -l -f '' -t 1.0 -d , -c <config file>
```
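As a rough illustration of how this command could be assembled programmatically, the sketch below builds the same argument list in Python. The function name `build_pmdumptext_args` and its parameters are hypothetical and are not part of _Elektron_, which launches the command from Go.

```python
from typing import List


def build_pmdumptext_args(config_file: str, interval_seconds: float = 1.0) -> List[str]:
    """Return the argument list for the pmdumptext command shown above.

    Hypothetical helper for illustration only.
    """
    return [
        "pmdumptext",
        "-m",                          # show metric names in the header
        "-l",                          # show the metric source (host) in the header
        "-f", "",                      # empty timestamp format string
        "-t", str(interval_seconds),   # sampling interval in seconds
        "-d", ",",                     # delimiter between columns
        "-c", config_file,             # PCP config file listing the metrics
    ]


args = build_pmdumptext_args("config")
print(" ".join(repr(a) if a in ("", ",") else a for a in args))
```

Passing the arguments as a list (for example to `subprocess.Popen`) avoids shell quoting problems with the empty format string and the comma delimiter.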
The logs are written to a file named _\<logFilePrefix\>\_\<timestamp\>.pcplog_, where
* _logFilePrefix_ is the prefix provided using the `-logPrefix` option.
* _timestamp_ corresponds to the time when _Elektron_ was run.
Use `pminfo` to obtain information about the different performance metrics that can be monitored through Performance Co-Pilot. Please see the [pminfo doc](https://pcp.io/man/man1/pminfo.1.html) for usage and options.
#### Example PCP log
Assume we want to retrieve the following performance metrics from a single host, _myhost_.
* User CPU time
* System CPU time

The PCP config file would then be as shown below.
```
myhost:kernel.all.cpu.user
myhost:kernel.all.cpu.sys
```
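For more than a handful of hosts or metrics, the config entries can be generated rather than typed by hand. The following is a minimal sketch, assuming only the `<hostname>:<metric>` line format shown above; `pcp_config_lines` is a hypothetical helper name.

```python
from typing import Iterable, List


def pcp_config_lines(hosts: Iterable[str], metrics: Iterable[str]) -> List[str]:
    """Return one '<host>:<metric>' config line per host and metric pair."""
    metric_list = list(metrics)  # materialize so it can be reused for every host
    return [f"{host}:{metric}" for host in hosts for metric in metric_list]


lines = pcp_config_lines(
    ["myhost"],
    ["kernel.all.cpu.user", "kernel.all.cpu.sys"],
)
print("\n".join(lines))
```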
When we run the `pmdumptext` command mentioned above for 5 seconds, the PCP log from _Elektron_ would be as shown below.
```
<yyyy/mm/dd> <hh:mm:ss> myhost:kernel.all.cpu.user,myhost:kernel.all.cpu.sys
<yyyy/mm/dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
<yyyy/mm/dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
<yyyy/mm/dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
<yyyy/mm/dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
<yyyy/mm/dd> <hh:mm:ss> <myhost user cpu time>,<myhost system cpu time>
```
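A log in this layout can be read back with a few lines of Python. The sketch below assumes exactly the layout shown above (date, time, then comma-separated fields, with the first line carrying the column names); `parse_pcp_log` is a hypothetical helper, and the sample values are made up for illustration. Values are kept as strings so that no assumption is made about how missing readings are printed.

```python
from typing import Dict, List, Optional


def parse_pcp_log(text: str) -> List[Dict[str, str]]:
    """Parse a PCP log into one dict per data row, keyed by column header."""
    columns: Optional[List[str]] = None
    records: List[Dict[str, str]] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # The first two whitespace-separated tokens are the date and the time.
        date, time, payload = line.split(maxsplit=2)
        fields = payload.split(",")
        if columns is None:
            columns = fields  # the first line holds the metric names
            continue
        record = {"date": date, "time": time}
        record.update(zip(columns, fields))
        records.append(record)
    return records


# Made-up sample values, for illustration only.
sample = """\
2018/10/05 17:27:53 myhost:kernel.all.cpu.user,myhost:kernel.all.cpu.sys
2018/10/05 17:27:54 0.25,0.10
2018/10/05 17:27:55 0.30,0.12
"""
records = parse_pcp_log(sample)
print(records[0]["myhost:kernel.all.cpu.user"])
```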
## Power Measurements
It is also possible to measure the power consumption of the CPU, DRAM, etc. through the use of RAPL (Running Average Power Limit) hardware counters.
When running the power capping strategies, [Extrema](../PowerCappingStrategies.md) and [Progressive Extrema](../PowerCappingStrategies.md), the following performance metrics MUST be included in the PCP config file.
```
#RAPL CPU PKG
<hostname1>:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value
<hostname2>:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value
...
#RAPL DRAM
<hostname1>:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value
<hostname2>:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value
...
```
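Because these entries are mandatory for the power capping strategies, it can help to check a config file before starting a run. The sketch below is one possible check, not something _Elektron_ itself provides; `missing_rapl_entries` is a hypothetical name, and treating lines that start with `#` as comments is an assumption based on the example above.

```python
from typing import Iterable, List

# The two RAPL metrics required by the power capping strategies.
RAPL_METRICS = (
    "perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value",
    "perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value",
)


def missing_rapl_entries(config_text: str, hosts: Iterable[str]) -> List[str]:
    """Return the required '<host>:<metric>' entries absent from the config."""
    present = set()
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line and not line.startswith("#"):  # assumption: '#' marks a comment
            present.add(line)
    return [
        f"{host}:{metric}"
        for host in hosts
        for metric in RAPL_METRICS
        if f"{host}:{metric}" not in present
    ]


config = """\
#RAPL CPU PKG
host1:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value
host2:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value
#RAPL DRAM
host1:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value
"""
missing = missing_rapl_entries(config, ["host1", "host2"])
print(missing)
```

In this example the DRAM entry for `host2` is reported as missing.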
Note that the power readings are retrieved for each processor on each worker node. For example, if you have two processors on a machine (hostname = _myhost_), then the PCP log for CPU and DRAM power readings would contain the following headers.
* `myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value["cpux"]`
* `myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value["cpuy"]`
* `myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value["cpux"]`
* `myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_DRAM.value["cpuy"]`
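
When post-processing a log, headers of this form can be split into host, metric, and per-processor instance. The sketch below assumes the `<host>:<metric>["<instance>"]` shape shown above, with the instance part optional; `Column` and `parse_column` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Column(NamedTuple):
    host: str
    metric: str
    instance: Optional[str]  # None when the header has no ["..."] suffix


# <host>:<metric> optionally followed by ["<instance>"]
_HEADER_RE = re.compile(
    r'^(?P<host>[^:]+):(?P<metric>[^\[]+)(?:\["(?P<instance>[^"]*)"\])?$'
)


def parse_column(header: str) -> Column:
    """Split a PCP log column header into host, metric, and instance."""
    match = _HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"unrecognized column header: {header!r}")
    return Column(match["host"], match["metric"], match["instance"])


column = parse_column(
    'myhost:perfevent.hwcounters.rapl__RAPL_ENERGY_PKG.value["cpux"]'
)
print(column.host, column.metric, column.instance)
```

Grouping columns by `(host, metric)` then makes it straightforward to combine the readings of all processors on one machine.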
Use `pminfo` and search for RAPL to get the list of RAPL packages from which data can be read.