PCI Express (PCIe) has emerged as a cornerstone technology, enabling efficient and reliable data transfer between devices. One aspect of PCIe endpoint design is the optional implementation of Completion Timeout Range Select (CTRS) and Completion Timeout Value (CTV). These parameters play a pivotal role in system stability, performance, and compliance with the PCIe specification, yet the PCIe standard makes them optional. Although design engineers may prefer not to implement them, they are necessary for design for test (DFT) in post-silicon validation.
Completion Timeout is the maximum time an endpoint device waits for a completion to a non-posted request before treating the request as failed. The PCIe specification defines a set of timeout values to accommodate various system requirements and use cases. These values are grouped into the predefined Completion Timeout Ranges shown below:
Figure 1 Completion Timeout Ranges
Figure 2 Completion Timeout Ranges defined for a DEVCAP2 register
These values are part of the bit fields in the DEVCAP2 (Device Capabilities 2) register, provided the design enables them. Otherwise the field is wired to 0x0.
The completion timeout range select is set in the Device Control 2 register (DEVCTL2). The DEVCTL2.CTRS value should be set to one of the ranges advertised in the Completion Timeout Ranges field. If the design does not implement CTRS/CTV, the endpoint is fixed to a single range. This is where issues can arise for validation engineers doing post-silicon validation at scale.
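As a rough illustration of how validation software might interpret these two registers, the Python sketch below decodes the ranges an endpoint advertises and checks whether a requested timeout encoding falls inside them. The bit positions (ranges supported in DEVCAP2 bits [3:0], timeout value in DEVCTL2 bits [3:0]) and the encoding table are assumptions taken from the PCIe Base Specification as commonly documented, so check them against the specification revision in use. The function and table names are hypothetical.

```python
# Completion Timeout Value encodings -> (range letter, min_us, max_us).
# Assumed from the PCIe Base Specification; verify against your revision.
CTV_ENCODINGS = {
    0b0000: ("default", 50, 50_000),
    0b0001: ("A", 50, 100),
    0b0010: ("A", 1_000, 10_000),
    0b0101: ("B", 16_000, 55_000),
    0b0110: ("B", 65_000, 210_000),
    0b1001: ("C", 260_000, 900_000),
    0b1010: ("C", 1_000_000, 3_500_000),
    0b1101: ("D", 4_000_000, 13_000_000),
    0b1110: ("D", 17_000_000, 64_000_000),
}

# Assumed: DEVCAP2 bits [3:0] advertise Ranges A..D, one bit per range.
RANGE_BITS = {"A": 0, "B": 1, "C": 2, "D": 3}


def supported_ranges(devcap2: int) -> set:
    """Return the set of range letters advertised in a DEVCAP2 value."""
    return {letter for letter, bit in RANGE_BITS.items() if devcap2 & (1 << bit)}


def ctv_is_allowed(devcap2: int, ctv: int) -> bool:
    """True if the CTV encoding is defined and lies in an advertised range."""
    if ctv not in CTV_ENCODINGS:
        return False  # reserved encoding
    letter = CTV_ENCODINGS[ctv][0]
    return letter == "default" or letter in supported_ranges(devcap2)


# An endpoint with the field wired to 0x0 accepts only the default encoding.
print(ctv_is_allowed(0x0, 0b0000), ctv_is_allowed(0x0, 0b0110))
```

An endpoint that wires the capability field to 0x0 leaves the validation team with only the default timeout, which is the situation the rest of this article argues against.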
For reference, the following is sample Verilog for implementation of CTRS and CTV.
For CTRS Range and CTV in DEVCTL2 register:
module devctl2_config (
    input  wire        clk,
    input  wire        reset,          // Asynchronous, active-high reset
    input  wire [3:0]  timeout_range,  // 4 bits for timeout range (e.g., Range A to D)
    input  wire [3:0]  timeout_value,  // 4 bits for timeout value within the range
    output reg  [15:0] devctl2         // DevCtl2 is a 16-bit register
);
    // Illustrative field layout for this sample:
    //   Bits [7:4]: Completion Timeout Range
    //   Bits [3:0]: Completion Timeout Value
    // Note: the PCIe Base Specification itself defines Device Control 2
    // bits [3:0] as Completion Timeout Value and bit 4 as Completion
    // Timeout Disable, so confirm the layout against the specification
    // before reusing this sample.
    always @(posedge clk or posedge reset) begin
        if (reset) begin
            devctl2 <= 16'b0; // Reset the DevCtl2 register to default
        end else begin
            // Set Completion Timeout Range and Completion Timeout Value;
            // bits [15:8] are not driven here and hold their reset value of 0
            devctl2[7:4] <= timeout_range; // Map timeout range to bits [7:4]
            devctl2[3:0] <= timeout_value; // Map timeout value to bits [3:0]
        end
    end
endmodule
The case for and against implementing CTRS and CTV
Completion Timeout Range Select (CTRS) and Completion Timeout Value (CTV) are optional features in the PCIe standard because not all systems or devices require the same level of configurability for timeout mechanisms. Design teams have several reasons to decide against implementing CTRS and CTV:
Application-Specific Needs: Different PCIe devices and systems have varying requirements for timeout behavior. Some devices may operate in environments where default timeout settings are sufficient, while others may need more precise control over timeout ranges and values.
Simplified Design for Certain Devices: For simpler devices or systems with less stringent performance or reliability requirements, implementing CTRS and CTV might add unnecessary complexity. Making these features optional allows designers to focus on essential functionalities.
Cost and Resource Optimization: Implementing CTRS and CTV requires additional logic and resources in the hardware design. By making these features optional, the PCIe standard allows manufacturers to optimize costs and resources for devices that do not need advanced timeout configurations.
However, for design for test (DFT), it is necessary to implement CTRS and CTV for PCIe endpoints in an SoC environment. Let’s examine the following I/O stack:
Figure 3 Non-posted reads from slow MMIO space on PCIe Test Card
In this example, PCIe Endpoint 1 (EP1) and PCIe Endpoint 2 (EP2) are under test in SoC-level validation at scale. A PCIe test card exposes an MMIO space that responds slowly to non-posted read requests from the endpoints. If EP1 is fixed to a completion timeout range that is too restrictive, EP1 could experience undesired completion timeouts, while EP2, which is less restrictive (perhaps because it implements CTRS and CTV), would not. This would cause false failures in validation.
A more important need is testing for Unexpected Completions (UCs) at the PCIe endpoint. Consider the following flow of non-posted reads from EP1 in the above diagram to the MMIO space of the PCIe test card:
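The false-failure scenario can be sketched in a few lines of Python. The timeout windows and the MMIO latency below are hypothetical numbers chosen only for illustration, and the model assumes that a completion timer never expires before the lower bound of its window and always expires by the upper bound.

```python
def classify(latency_us: int, window_us: tuple) -> str:
    """Predict the outcome of one non-posted read for a given timeout window."""
    lower_us, upper_us = window_us
    if latency_us < lower_us:
        return "no timeout"      # completion arrives before the timer may expire
    if latency_us > upper_us:
        return "timeout"         # timer has certainly expired
    return "may time out"        # depends on where the timer sits in the window


# Hypothetical setup: EP1 fixed to a short window, EP2 set to a longer one.
endpoints = {"EP1": (50, 100), "EP2": (16_000, 55_000)}
slow_mmio_latency_us = 5_000  # assumed latency of the slow test card

report = {name: classify(slow_mmio_latency_us, window)
          for name, window in endpoints.items()}
print(report)
```

With these numbers EP1 is flagged as timing out and EP2 is not, even though both endpoints are behaving correctly; the difference comes entirely from the timeout setting.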
Figure 4 Flow of non-posted read to slow MMIO space resulting in UC AER error
In this example, the host driver directs EP1 to issue a series of non-posted reads, resulting in an Unexpected Completion (UC) as well as one or more Completion Timeouts (CTOs):
A) The host driver directs the PCIe endpoint to issue a series of non-posted reads to the MMIO space of a slow PCIe test card.
B) The first non-posted read request reaches the MMIO space of the test card. Its latency exceeds the completion timeout set in DEVCTL2.CTV in the endpoint.
C) Many more non-posted reads reach the MMIO space of the test card. Each TLP carries a Transaction ID, but the PCIe endpoint has only a finite number of them.
D) The PCIe endpoint experiences a completion timeout for the non-posted read in (B) because of the slow MMIO space. Later, the completion with data reaches the PCIe endpoint carrying the Transaction ID from (B), but that Transaction ID (in the form of requester identifier and tag) has been reused for a later non-posted read.
E) The PCIe endpoint logs AER errors for both the UC and the CTO. The UC could cause unstable behavior.
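Steps (A) through (E) can be modeled with a toy tag tracker. This Python sketch is a simplification under stated assumptions, not a description of how any particular endpoint is built: the class name, the method names, and the `seq` bookkeeping number are all hypothetical. Real hardware matches completions on the Transaction ID only, so `seq` exists purely to let the model notice when a late completion lands on a reused tag.

```python
from dataclasses import dataclass, field


@dataclass
class TagTracker:
    """Toy model of an endpoint's table of outstanding non-posted reads."""
    num_tags: int
    timeout_us: int
    outstanding: dict = field(default_factory=dict)  # tag -> (seq, deadline_us)
    next_seq: int = 0

    def issue(self, now_us: int):
        """Allocate the lowest free tag; return (tag, seq) or None if none is free."""
        for tag in range(self.num_tags):
            if tag not in self.outstanding:
                seq = self.next_seq
                self.next_seq += 1
                self.outstanding[tag] = (seq, now_us + self.timeout_us)
                return tag, seq
        return None

    def expire(self, now_us: int) -> list:
        """Free every tag whose deadline has passed; each one is a CTO."""
        timed_out = sorted(tag for tag, (_, deadline) in self.outstanding.items()
                           if deadline <= now_us)
        for tag in timed_out:
            del self.outstanding[tag]
        return timed_out

    def complete(self, tag: int, seq: int) -> str:
        """Classify an arriving completion and retire the tag if it matches."""
        if tag not in self.outstanding:
            return "unexpected"          # no request is waiting on this tag: UC
        owner_seq, _ = self.outstanding.pop(tag)
        return "ok" if owner_seq == seq else "stale"  # stale: hit a reused tag


ep1 = TagTracker(num_tags=1, timeout_us=100)
tag_b, seq_b = ep1.issue(now_us=0)       # step (B): read to the slow MMIO space
ctos = ep1.expire(now_us=150)            # step (D): completion timeout frees the tag
tag_c, seq_c = ep1.issue(now_us=160)     # step (C): a later read reuses the tag
late = ep1.complete(tag_b, seq_b)        # late completion from (B) consumes the tag
real = ep1.complete(tag_c, seq_c)        # completion for the later read has no owner
print(ctos, late, real)
```

In this model the late completion from (B) is absorbed by the newer request, and the completion that actually belongs to the newer request then arrives with nothing outstanding, which is the Unexpected Completion reported in step (E).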
In the TLP shown below, the circled bytes carry the requester identifier and tag. Reuse of bytes 8 to 10 in another non-posted read TLP causes the UC.
Figure 5 TLP for non-posted read, highlighting the requester identifier
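To make the Transaction ID concrete, the Python sketch below pulls the requester identifier and tag out of bytes 8 to 10 of a header, following the byte positions cited above. It assumes bytes 8 and 9 hold the 16-bit requester ID (bus in the upper byte, then a 5-bit device and 3-bit function number) and byte 10 holds an 8-bit tag, which is the placement used in a completion header. The function name and the sample header are hypothetical.

```python
def transaction_id(header: bytes) -> tuple:
    """Return (bus, device, function, tag) from bytes 8..10 of a TLP header."""
    if len(header) < 11:
        raise ValueError("header too short to contain bytes 8..10")
    requester_id = int.from_bytes(header[8:10], "big")
    tag = header[10]                       # assumed 8-bit tag
    bus = requester_id >> 8
    device = (requester_id >> 3) & 0x1F
    function = requester_id & 0x7
    return bus, device, function, tag


# Hypothetical 12-byte header: requester 03:01.0, tag 0x2A, other bytes zeroed.
sample = bytes(8) + bytes([0x03, 0x08, 0x2A, 0x00])
print(transaction_id(sample))
```

Two requests from the same function that produce the same tuple are indistinguishable to the endpoint, which is why a tag recycled after a timeout can be matched by a late completion.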
Validation needs for the DEVCTL2.CTRS and DEVCTL2.CTV knobs
Two aspects of validation are in focus in this discussion. The first is validation at scale, where multiple different PCIe endpoints may be under test, each with different latency expectations for PCIe non-posted read requests. The validation and automation teams need to adjust the CTRS and CTV knobs as necessary in such an environment.
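For the at-scale case, an automation script typically needs a read-modify-write of the timeout field that leaves the rest of the register untouched. The Python sketch below shows only the bit manipulation, assuming the timeout value occupies DEVCTL2 bits [3:0]; the per-endpoint register values and encodings are hypothetical, and the actual configuration-space read and write are left out because they depend on the test platform.

```python
CTV_MASK = 0x000F  # assumed: timeout value field occupies DEVCTL2 bits [3:0]


def with_ctv(devctl2: int, ctv: int) -> int:
    """Return a 16-bit DEVCTL2 value with only the timeout field replaced."""
    if not 0 <= ctv <= 0xF:
        raise ValueError("timeout encoding must fit in 4 bits")
    return (devctl2 & ~CTV_MASK & 0xFFFF) | ctv


# Hypothetical plan: endpoint -> (current DEVCTL2 value, desired encoding).
plan = {"EP1": (0x0400, 0b0110), "EP2": (0x0009, 0b0010)}
updated = {name: with_ctv(current, ctv) for name, (current, ctv) in plan.items()}
for name, value in updated.items():
    print(f"{name}: DEVCTL2 <- 0x{value:04X}")
```

Keeping the field update in one helper makes it easy to apply a different timeout to each endpoint in the same run while preserving every other control bit.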
The second aspect is enabling tests for UCs and CTOs in validation. This allows robust negative testing of the PCIe endpoint, where multiple UCs and CTOs can be generated by targeting multiple MMIO spaces with non-posted reads.
Given these design-for-test needs, design teams should implement CTRS and CTV.
Conclusion
This has been one example of innovative verification. Here at Approaching Zero Escapes Validation and Development, LLC, we strive to keep increasing verification and validation coverage and to bring that new coverage to our clients. You want zero bug escapes to both internal and external customers. At Approaching Zero Escapes, we want to partner with you to achieve that.
References
3.9.1. PCI Express and PCI Capabilities ; Arria® 10 Avalon® Streaming with SR-IOV IP for PCIe* User Guide
Repo for code referenced in this article:
https://github.com/bobwalker1-ship-it/UC_CTO_article.git