File oprofile-fixes-for-skylake-event-lists.patch of Package oprofile
From: Andi Kleen <ak@linux.intel.com>
Date: Mon Jul 6 16:48:25 2015 -0700
Subject: Fixes for Skylake event lists
Git-commit: ccc38adf33e3ae845e0b7c4f8fe77beceaa7b930
References: FATE#318979
Signed-off-by: Tony Jones <tonyj@suse.de>
oprofile: Fixes for Skylake event lists
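This addresses the review feedback for the Skylake event list:
- Fix event codes for INST_RETIRED (0x00 -> 0xc0) and CPU_CLK_UNHALTED (0x00 -> 0x3c).
- Fix OFFCORE_REQUESTS_OUTSTANDING events: drop cmask=1 from demand_code_rd and demand_rfo.
- Add br_inst_retired.all_branches_pebs.
- Fill in the correct default event (cpu_clk_unhalted) for CPU_SKYLAKE.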
diff --git a/events/i386/skylake/events b/events/i386/skylake/events
index 28d6654..9a04a86 100644
--- a/events/i386/skylake/events
+++ b/events/i386/skylake/events
@@ -6,8 +6,6 @@
# Note the minimum counts are not discovered experimentally and could be likely
# lowered in many cases without ill effect.
#
-event:0x00 counters:1 um:inst_retired minimum:2000003 name:inst_retired :
-event:0x00 counters:cpuid um:cpu_clk_unhalted minimum:2000003 name:cpu_clk_unhalted :
event:0x03 counters:cpuid um:ld_blocks minimum:100003 name:ld_blocks :
event:0x07 counters:cpuid um:ld_blocks_partial minimum:100003 name:ld_blocks_partial_address_alias :
event:0x08 counters:cpuid um:dtlb_load_misses minimum:2000003 name:dtlb_load_misses :
@@ -16,6 +14,7 @@ event:0x0e counters:cpuid um:uops_issued minimum:2000003 name:uops_issued :
event:0x14 counters:cpuid um:arith minimum:2000003 name:arith_divider_active :
event:0x24 counters:cpuid um:l2_rqsts minimum:200003 name:l2_rqsts :
event:0x2e counters:cpuid um:longest_lat_cache minimum:100003 name:longest_lat_cache :
+event:0x3c counters:cpuid um:cpu_clk_unhalted minimum:2000003 name:cpu_clk_unhalted :
event:0x3c counters:cpuid um:cpu_clk_thread_unhalted minimum:2000003 name:cpu_clk_thread_unhalted :
event:0x48 counters:cpuid um:l1d_pend_miss minimum:2000003 name:l1d_pend_miss :
event:0x49 counters:cpuid um:dtlb_store_misses minimum:2000003 name:dtlb_store_misses :
@@ -44,6 +43,7 @@ event:0xb0 counters:cpuid um:offcore_requests minimum:100003 name:offcore_reques
event:0xb1 counters:cpuid um:uops_executed minimum:2000003 name:uops_executed :
event:0xb2 counters:cpuid um:offcore_requests_buffer minimum:2000003 name:offcore_requests_buffer_sq_full :
event:0xbd counters:cpuid um:tlb_flush minimum:100007 name:tlb_flush :
+event:0xc0 counters:1 um:inst_retired minimum:2000003 name:inst_retired :
event:0xc1 counters:cpuid um:other_assists minimum:100003 name:other_assists_any :
event:0xc2 counters:cpuid um:uops_retired minimum:2000003 name:uops_retired :
event:0xc3 counters:cpuid um:machine_clears minimum:100003 name:machine_clears :
diff --git a/events/i386/skylake/unit_masks b/events/i386/skylake/unit_masks
index 98ed65c..b505769 100644
--- a/events/i386/skylake/unit_masks
+++ b/events/i386/skylake/unit_masks
@@ -37,16 +37,6 @@ name:offcore_requests_buffer type:mandatory default:0x1
0x1 extra: sq_full Offcore requests buffer cannot take more entries for this thread core.
name:other_assists type:mandatory default:0x3f
0x3f extra: any Number of times a microcode assist is invoked by HW other than FP-assist. Examples include AD (page Access Dirty) and AVX* related assists.
-name:inst_retired type:exclusive default:any
- 0x1 extra: any Instructions retired from execution.mem
- 0x0 extra: any_p Number of instructions retired. General Counter - architectural event
- 0x1 extra:pebs prec_dist Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution
-name:cpu_clk_unhalted type:exclusive default:thread
- 0x2 extra: thread Core cycles when the thread is not in halt state
- 0x3 extra: ref_tsc Reference cycles when the core is not in halt state.
- 0x0 extra: thread_p Thread cycles when thread is not in halt state
- 0x2 extra:any thread_any Core cycles when at least one thread on the physical core is not in halt state
- 0x0 extra:any thread_p_any Core cycles when at least one thread on the physical core is not in halt state
name:ld_blocks type:exclusive default:0x2
0x2 extra: store_forward loads blocked by overlapping with store buffer that cannot be forwarded .
0x8 extra: no_sr The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use
@@ -85,6 +75,12 @@ name:l2_rqsts type:exclusive default:0x21
name:longest_lat_cache type:exclusive default:0x41
0x41 extra: miss Core-originated cacheable demand requests missed L3
0x4f extra: reference Core-originated cacheable demand requests that refer to L3
+name:cpu_clk_unhalted type:exclusive default:thread
+ 0x2 extra: thread Core cycles when the thread is not in halt state
+ 0x3 extra: ref_tsc Reference cycles when the core is not in halt state.
+ 0x0 extra: thread_p Thread cycles when thread is not in halt state
+ 0x2 extra:any thread_any Core cycles when at least one thread on the physical core is not in halt state
+ 0x0 extra:any thread_p_any Core cycles when at least one thread on the physical core is not in halt state
name:cpu_clk_thread_unhalted type:exclusive default:ref_xclk
0x1 extra: ref_xclk Reference cycles when the thread is unhalted (counts at 100 MHz rate)
0x2 extra: one_thread_active Count XClk pulses when this thread is unhalted and the other thread is halted.
@@ -119,8 +115,8 @@ name:rs_events type:exclusive default:empty_cycles
0x1 extra:cmask=1,inv,edge empty_end Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.
name:offcore_requests_outstanding type:exclusive default:demand_data_rd
0x1 extra: demand_data_rd Offcore outstanding Demand Data Read transactions in uncore queue.
- 0x2 extra:cmask=1 demand_code_rd Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
- 0x4 extra:cmask=1 demand_rfo Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle
+ 0x2 extra: demand_code_rd Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore.
+ 0x4 extra: demand_rfo Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle
0x8 extra: all_data_rd Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore
0x10 extra: l3_miss_demand_data_rd Counts number of Offcore outstanding Demand Data Read requests who miss L3 cache in the superQ every cycle.
0x1 extra:cmask=1 cycles_with_demand_data_rd Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore
@@ -217,6 +213,10 @@ name:uops_executed type:exclusive default:thread
name:tlb_flush type:exclusive default:0x1
0x1 extra: dtlb_thread DTLB flush attempts of the thread-specific entries
0x20 extra: stlb_any STLB flush attempts
+name:inst_retired type:exclusive default:any
+ 0x1 extra: any Instructions retired from execution.mem
+ 0x0 extra: any_p Number of instructions retired. General Counter - architectural event
+ 0x1 extra:pebs prec_dist Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution
name:uops_retired type:exclusive default:retire_slots
0x2 extra: retire_slots Retirement slots used.
0x1 extra:cmask=1,inv stall_cycles Cycles without actually retired uops.
@@ -231,6 +231,7 @@ name:br_inst_retired type:exclusive default:all_branches
0x1 extra:pebs conditional_pebs Conditional branch instructions retired.
0x2 extra: near_call Direct and indirect near call instructions retired.
0x2 extra:pebs near_call_pebs Direct and indirect near call instructions retired.
+ 0x0 extra:pebs all_branches_pebs All (macro) branch instructions retired.
0x8 extra: near_return Return instructions retired.
0x8 extra:pebs near_return_pebs Return instructions retired.
0x10 extra: not_taken Not taken branch instructions retired.
diff --git a/libop/op_events.c b/libop/op_events.c
index f58d243..25f010e 100644
--- a/libop/op_events.c
+++ b/libop/op_events.c
@@ -1200,7 +1200,6 @@ void op_default_event(op_cpu cpu_type, struct op_default_event_descr * descr)
case CPU_NEHALEM:
case CPU_HASWELL:
case CPU_BROADWELL:
- case CPU_SKYLAKE:
case CPU_SILVERMONT:
case CPU_WESTMERE:
case CPU_SANDYBRIDGE:
@@ -1213,6 +1212,10 @@ void op_default_event(op_cpu cpu_type, struct op_default_event_descr * descr)
descr->count = 1024;
break;
+ case CPU_SKYLAKE:
+ descr->name = "cpu_clk_unhalted";
+ break;
+
case CPU_P4:
case CPU_P4_HT2:
descr->name = "GLOBAL_POWER_EVENTS";