sysadmin/ipxe - ipxe - codex.r10x.net

mirror of https://github.com/ipxe/ipxe synced 2026-05-18 10:00:30 +03:00

Author	SHA1	Message	Date
Michael Brown	0d15d7f0a5	[ena] Record supported device features Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-16 16:36:29 +01:00
Michael Brown	e5e371f485	[ena] Cancel uncompleted transmit buffers on close Avoid spurious assertion failures by ensuring that references to uncompleted transmit buffers are not retained after the device has been closed. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-16 16:36:29 +01:00
Michael Brown	dcc5d36ce5	[ena] Map the on-device memory, if present Newer generations of the ENA hardware require the use of low latency transmit queues, where the submission queues and the initial portion of the transmitted packet are written to on-device memory via BAR2 instead of being read from host memory. Prepare for this by mapping the on-device memory BAR. As with the register BAR, we may need to steal a base address from the upstream PCI bridge since the BIOS on some instance types (observed with an m8i.metal-48xl instance in eu-south-2) will fail to assign an address to the device. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-15 15:55:57 +01:00
Michael Brown	510f3e5e17	[ena] Add descriptive messages for any admin queue command failures Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-15 12:00:42 +01:00
Michael Brown	3538e9c39a	[pci] Record prefetchable memory window for PCI bridges Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-14 18:38:08 +01:00
Michael Brown	04a61c413d	[ena] Use pci_bar_set() to place device within bridge memory window Use pci_bar_set() when we need to set a device base address (on instance types such as c6i.metal where the BIOS fails to do so), so that 64-bit BARs will be handled automatically. This particular issue has so far been observed only on 6th generation instances. These use 32-bit BARs, and so the lack of support for handling 64-bit BARs has not caused any observable issue. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-14 15:57:02 +01:00
Michael Brown	94902ae187	[pci] Handle sizing of 64-bit BARs Provide pci_bar_set() to handle setting the base address for a potentially 64-bit BAR, and rewrite pci_bar_size() to correctly handle sizing of 64-bit BARs. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-14 14:43:50 +01:00
Michael Brown	e80818e4f6	[tls] Disable renegotiation unless extended master secret is used RFC 7627 states that renegotiation becomes no longer secure under various circumstances when the non-extended master secret is used. The description of the precise set of circumstances is spread across various points within the document and is not entirely clear. Avoid a superset of the circumstances in which renegotiation apparently becomes insecure by refusing renegotiation completely unless the extended master secret is used. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-12 23:25:09 +01:00
Michael Brown	57504353fe	[tls] Refuse to resume sessions with mismatched master secret methods RFC 7627 section 5.3 states that the client must abort the handshake if the server attempts to resume a session where the master secret calculation method stored in the session does not match the method used for the connection being resumed. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-12 23:25:09 +01:00
Michael Brown	ab64bc5b8d	[tls] Add support for the Extended Master Secret RFC 7627 defines the Extended Master Secret (EMS) as an alternative calculation that uses the digest of all handshake messages rather than just the client and server random bytes. Add support for negotiating the Extended Master Secret extension and performing the relevant calculation of the master secret. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-12 23:25:04 +01:00
Michael Brown	d6656106e9	[tls] Generate master secret only after sending Client Key Exchange The calculation for the extended master secret as defined in RFC 7627 relies upon the digest of all handshake messages up to and including the Client Key Exchange. Facilitate this calculation by generating the master secret only after sending the Client Key Exchange message. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-12 22:20:13 +01:00
Michael Brown	4f44f62402	[gve] Rearm interrupts unconditionally on every poll Experimentation suggests that rearming the interrupt once per observed completion is not sufficient: we still see occasional delays during which the hardware fails to write out completions. As described in commit `d2e1e59` ("[gve] Use dummy interrupt to trigger completion writeback in DQO mode"), there is no documentation around the precise semantics of the interrupt rearming mechanism, and so experimentation is the only available guide. Switch to rearming both TX and RX interrupts unconditionally on every poll, since this produces better experimental results. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-10 13:12:19 +01:00
Michael Brown	f5ca1de738	[gve] Use raw DMA addresses in descriptors in DQO-QPL mode The DQO-QPL operating mode uses registered queue page lists but still requires the raw DMA address (rather than the linear offset within the QPL) to be provided in transmit and receive descriptors. Set the queue page list base device address appropriately. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-10 12:49:26 +01:00
Michael Brown	1cc1f1cd4f	[gve] Report only packet completions for the transmit ring The hardware reports descriptor and packet completions separately for the transmit ring. We currently ignore descriptor completions (since we cannot free up the transmit buffers in the queue page list and advance the consumer counter until the packet has also completed). Now that transmit completions are written out immediately (instead of being delayed until 128 bytes of completions are available), there is no value in retaining the descriptor completions. Omit descriptor completions entirely, and reduce the transmit fill level back down to its original value. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-09 17:29:20 +01:00
Michael Brown	d2e1e591ab	[gve] Use dummy interrupt to trigger completion writeback in DQO mode When operating in the DQO operating mode, the device will defer writing transmit and receive completions until an entire internal cacheline (128 bytes) is full, or until an associated interrupt is asserted. Since each receive descriptor is 32 bytes, this will cause received packets to be effectively delayed until up to three further packets have arrived. When network traffic volumes are very low (such as during DHCP, DNS lookups, or TCP handshakes), this typically induces delays of up to 30 seconds and results in a very poor user experience. Work around this hardware problem in the same way as for the Intel 40GbE and 100GbE NICs: by enabling dummy MSI-X interrupts to trick the hardware into believing that it needs to write out completions to host memory. There is no documentation around the interrupt rearming mechanism. The value written to the interrupt doorbell does not include a consumer counter value, and so must be relying on some undocumented ordering constraints. Comments in the Linux driver source suggest that the authors believe that the device will automatically and atomically mask an MSI-X interrupt at the point of asserting it, that any further interrupts arriving before the doorbell is written will be recorded in the pending bit array, and that writing the doorbell will therefore immediately assert a new interrupt if needed. In the absence of any documentation, choose to rearm the interrupt once per observed completion. This is overkill, but is less impactful than the alternative of rearming the interrupt unconditionally on every poll. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-09 17:12:20 +01:00
Michael Brown	c2d7ddd0c2	[gve] Add missing memory barriers Ensure that remainder of completion records are read only after verifying the generation bit (or sequence number). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-09 16:42:20 +01:00
Michael Brown	5438299649	[intelxl] Use default dummy MSI-X target address Use the default dummy MSI-X target address that is now allocated and configured automatically by pci_msix_enable(). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-09 16:37:14 +01:00
Michael Brown	4224f574da	[pci] Map all MSI-X interrupts to a dummy target address by default Interrupts as such are not used in iPXE, which operates in polling mode. However, some network cards (such as the Intel 40GbE and 100GbE NICs) will defer writing out completions until the point of asserting an MSI-X interrupt. From the point of view of the PCI device, asserting an MSI-X interrupt is just a 32-bit DMA write of an opaque value to an opaque target address. The PCI device has no way to know whether or not the target address corresponds to a real APIC. We can therefore trick the PCI device into believing that it is asserting an MSI-X interrupt, by configuring it to write an opaque 32-bit value to a dummy target address in host memory. This is sufficient to trigger the associated write of the completions to host memory. Allocate a dummy target address when enabling MSI-X on a PCI device, and map all interrupts to this target address by default. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-09 16:29:29 +01:00
Michael Brown	ce30ba14fc	[gve] Select preferred operating mode Select a preferred operating mode from those advertised as supported by the device, falling back to the oldest known mode (GQI-QPL) if no modes are advertised. Since there are devices in existence that support only QPL addressing, and since we want to minimise code size, we choose to always use a single fixed ring buffer even when using raw DMA addressing. Having paid this penalty, we therefore choose to prefer QPL over RDA since this allows the (virtual) hardware to minimise the number of page table manipulations required. We similarly prefer GQI over DQO since this minimises the amount of work we have to do: in particular, the RX descriptor ring contents can remain untouched for the lifetime of the device and refills require only a doorbell write. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-06 14:04:18 +01:00
Michael Brown	74c9fd72cf	[gve] Add support for out-of-order queues Add support for the "DQO" out-of-order transmit and receive queue formats. These are almost entirely different in format and usage (and even endianness) from the original "GQI" in-order transmit and receive queues, and arguably should belong to a completely different device with a different PCI ID. However, Google chose to essentially crowbar two unrelated device models into the same virtual hardware, and so we must handle both of these device models within the same driver. Most of the new code exists solely to handle the differences in descriptor sizes and formats. Out-of-order completions are handled via a buffer ID ring (as with other devices supporting out-of-order completions, such as the Xen, Hyper-V, and Amazon virtual NICs). A slight twist is that on the transmit datapath (but not the receive datapath) the Google NIC provides only one completion per packet instead of one completion per descriptor, and so we must record the list of chained buffer IDs in a separate array at the time of transmission. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-06 14:04:12 +01:00
Michael Brown	0d1ddfe42c	[gve] Cancel pending transmissions when closing device We cancel any pending transmissions when (re)starting the device since any transmissions that were initiated before the admin queue reset will not complete. The network device core will also cancel any pending transmissions after the device is closed. If the device is closed with some transmissions still pending and is then reopened, this will therefore result in a stale I/O buffer being passed to netdev_tx_complete_err() when the device is restarted. This error has not been observed in practice since transmissions generally complete almost immediately and it is therefore unlikely that the device will ever be closed with transmissions still pending. With out-of-order queues, the device seems to delay transmit completions (with no upper time limit) until a complete batch is available to be written out as a block of 128 bytes. It is therefore very likely that the device will be closed with transmissions still pending. Fix by ensuring that we have dropped all references to transmit I/O buffers before returning from gve_close(). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-10-06 13:16:22 +01:00
Joseph Wong	cf53497541	[bnxt] Handle link related async events Handle async events related to link speed change, link speed config change, and port phy config changes. Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>	2025-10-01 16:20:23 +01:00
Michael Brown	4508e10233	[gve] Allow for descriptor and completion lengths to vary by mode The descriptors and completions in the DQO operating mode are not the same sizes as the equivalent structures in the GQI operating mode. Allow the queue stride size to vary by operating mode (and therefore to be known only after reading the device descriptor and selecting the operating mode). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-30 12:17:22 +01:00
Michael Brown	20a489253c	[gve] Rename GQI-specific data structures and constants Rename data structures and constants that are specific to the GQI operating mode, to allow for a cleaner separation from other operating modes. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-30 11:10:20 +01:00
Michael Brown	86b322d999	[gve] Allow for out-of-order buffer consumption We currently assume that the buffer index is equal to the descriptor ring index, which is correct only for in-order queues. Out-of-order queues will include a buffer tag value that is copied from the descriptor to the completion. Redefine the data buffers as being indexed by this tag value (rather than by the descriptor ring index), and add a circular ring buffer to allow for tags to be reused in whatever order they are released by the hardware. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-30 11:09:45 +01:00
Michael Brown	b8dd3c384b	[gve] Add support for raw DMA addressing Raw DMA addressing allows the transmit and receive descriptors to provide the DMA address of the data buffer directly, without requiring the use of a pre-registered queue page list. It is modelled in the device as a magic "raw DMA" queue page list (with QPL ID 0xffffffff) covering the whole of the DMA address space. When using raw DMA addressing, the transmit and receive datapaths could use the normal pattern of mapping I/O buffers directly, and avoid copying packet data into and out of the fixed queue page list ring buffer. However, since we must retain support for queue page list addressing (which requires this additional copying), we choose to minimise code size by continuing to use the fixed ring buffer even when using raw DMA addressing. Add support for using raw DMA addressing by setting the queue page list base device address appropriately, omitting the commands to register and unregister the queue page lists, and specifying the raw DMA QPL ID when creating the TX and RX queues. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-29 15:13:55 +01:00
Michael Brown	9f554ec9d0	[gve] Add concept of a queue page list base device address Allow for the existence of a queue page list where the base device address is non-zero, as will be the case for the raw DMA addressing (RDA) operating mode. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-29 15:13:55 +01:00
Michael Brown	91db5b68ff	[gve] Set descriptor and completion ring sizes when creating queues The "create TX queue" and "create RX queue" commands have fields for the descriptor and completion ring sizes, which are currently left unpopulated since they are not required for the original GQI-QPL operating mode. Populate these fields, and allow for the possibility that a transmit completion ring exists (which will be the case when using the DQO operating mode). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-29 15:13:55 +01:00
Michael Brown	048a346705	[gve] Add concept of operating mode The GVE family supports two incompatible descriptor queue formats (GQI: in-order descriptor queues; DQO: out-of-order descriptor queues) and two addressing modes (QPL: pre-registered queue page list addressing; RDA: raw DMA addressing). All four combinations (GQI-QPL, GQI-RDA, DQO-QPL, and DQO-RDA) are theoretically supported by the Linux driver, which is essentially the only public reference provided by Google. The original versions of the GVE NIC supported only GQI-QPL mode, and so the iPXE driver is written to target this mode, on the assumption that it would continue to be supported by all models of the GVE NIC. This assumption turns out to be incorrect: Google does not deem it necessary to retain backwards compatibility. Some newer machine types (such as a4-highgpu-8g) support only the DQO-RDA operating mode. Add a definition of operating mode, and pass this as an explicit parameter to the "configure device resources" admin queue command. We choose a representation that subtracts one from the value passed in this command, since this happens to allow us to decompose the mode into two independent bits (one representing the use of DQO descriptor format, one representing the use of QPL addressing). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-29 15:13:55 +01:00
Michael Brown	610089b98e	[gve] Remove separate concept of "packet descriptor" The Linux driver occasionally uses the terminology "packet descriptor" to refer to the portion of the descriptor excluding the buffer address. This is not a helpful separation, and merely adds complexity. Simplify the code by removing this artificial separation. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-29 15:12:54 +01:00
Michael Brown	ee9aea7893	[gve] Parse option list returned in device descriptor Provide space for the device to return its list of supported options. Parse the option list and record the existence of each option in a support bitmask. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-09-26 12:02:03 +01:00
Joseph Wong	6464f2edb8	[bnxt] Add error recovery support Add support to advertise adapter error recovery support to the firmware. Implement error recovery operations if adapter fault is detected. Refactor memory allocation to better align with probe and open functions. Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>	2025-09-18 13:25:07 +01:00
Michael Brown	969ce2c559	[efi] Use current boot option as a fallback for obtaining the boot URI Some systems (observed with a Lenovo X1) fail to populate the loaded image device path with a Uri() component when performing a UEFI HTTP boot, instead creating a broken loaded image device path that represents a DHCP+TFTP boot that has not actually taken place. If no URI is found within the loaded image device path, then fall back to looking for a URI within the current boot option. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-29 12:34:17 +01:00
Michael Brown	c10da8b53c	[efi] Add ability to extract device path from an EFI load option An EFI boot option (stored in a BootXXXX variable) comprises an EFI_LOAD_OPTION structure, which includes some undefined number of EFI device paths. (The structure is extremely messy and awkward to parse in C, but that's par for the course with EFI.) Add a function to extract the first device path from an EFI load option, along with wrapper functions to read and extract the first device path from an EFI boot variable. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-29 12:34:17 +01:00
Michael Brown	5bec2604a3	[libc] Add wcsnlen() Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-28 15:12:41 +01:00
Michael Brown	61b4585e2a	[efi] Drag in MNP driver whenever SNP driver is present The chainloaded-device-only "snponly" driver already drags in support for driving SNP, NII, and MNP devices, on the basis that the user generally doesn't care which UEFI API is used and just wants to boot from the same network device that was used to load iPXE. The multi-device "snp" driver already drags in support for driving SNP and NII devices, but does not drag in support for MNP devices. There is essentially zero code size overhead to dragging in support for MNP devices, since this support is always present in any iPXE application build anyway (as part of the code to download "autoexec.ipxe" prior to installing our own drivers). Minimise surprise by dragging in support for MNP devices whenever using the "snp" driver, following the same reasoning used for the "snponly" driver. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-27 13:12:11 +01:00
Joseph Wong	a53ec44932	[bnxt] Update CQ doorbell type Update completion queue doorbell to a non-arming type, since polling is used. Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>	2025-08-13 12:36:20 +01:00
Michael Brown	8460dc4e8f	[dwgpio] Use fdt_reg() to get GPIO port numbers DesignWare GPIO port numbers are represented as unsized single-entry regions. Use fdt_reg() to obtain the GPIO port number, rather than requiring access to a region cell size specification stored in the port group structure. This allows the field name "regs" in the port group structure to be repurposed to hold the I/O register base address, which then matches the common usage in other drivers. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-07 15:49:12 +01:00
Michael Brown	88ba011764	[fdt] Provide fdt_reg() for unsized single-entry regions Many region types (e.g. I2C bus addresses) can only ever contain a single region with no size cells specified. Provide fdt_reg() to reduce boilerplate in this common use case. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-07 15:49:09 +01:00
Michael Brown	9d4a2ee353	[cmdline] Show commands in alphabetical order Commands were originally ordered by functional group (e.g. keeping the image management commands together), with arrays used to impose a functionally meaningful order within the group. As the number of commands and functional groups has expanded over the years, this has become essentially useless as an organising principle. Switch to sorting commands alphabetically (using the linker table mechanism). Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-06 16:34:45 +01:00
Michael Brown	332241238e	[digest] Treat inability to acquire an image as a fatal error The "md5sum" and "sha1sum" commands were originally intended solely as debugging utilities, and would return success (with a warning message) even if the specified images did not exist. To minimise surprise and to be consistent with other commands, treat the inability to acquire an image as a fatal error. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-06 15:21:14 +01:00
Michael Brown	6fa901530a	[digest] Add "--set" option to store digest value in a setting Allow the result of a digest calculation to be stored in a named setting. This allows for digest verification in scripts using e.g.: set expected:hexraw cb05def203386f2b33685d177d9f04e3e3d70dd4 sha1sum --set actual 1mb iseq ${expected} ${actual} || goto checksum_bad Note that digest verification alone cannot be used to set the trusted execution status of an image. The only way to mark an image as trusted is to use the "imgverify" command. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-06 14:07:00 +01:00
Michael Brown	f5467d69db	[github] Extend sponsorship link Add Christian Nilsson <nikize@gmail.com> as a project sponsorship recipient, to reflect the enormous amount of time invested in responding to issues and pull requests. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-06 13:31:00 +01:00
Michael Brown	f45782f9f3	[digest] Add commands for all enabled digest algorithms Add "sha256sum", "sha512sum", and similar commands. Include these new commands only when DIGEST_CMD is enabled in config/general.h and the corresponding algorithm is enabled in config/crypto.h. Leave "md5sum" and "sha1sum" included whenever only DIGEST_CMD is enabled, to avoid potentially breaking backwards compatibility with builds that disabled MD5 or SHA-1 as a TLS or X.509 digest algorithm, but would still have expected those commands to be present. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-06 13:17:25 +01:00
Michael Brown	2e4e1f7e9e	[dwgpio] Add driver for the DesignWare GPIO controller Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-05 14:39:56 +01:00
Michael Brown	90fe3a2924	[gpio] Add a framework for GPIO controllers Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-05 13:54:27 +01:00
Michael Brown	5f10b74555	[fdt] Use phandle as device location Consumption of phandles will be in the form of locating a functional device (e.g. a GPIO device, or an I2C device, or a reset controller) by phandle, rather than locating the device tree node to which the phandle refers. Repurpose fdt_phandle() to obtain the phandle value (instead of searching by phandle), and record this value as the bus location within the generic device structure. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-08-04 14:52:00 +01:00
Michael Brown	f7a1e9ef8e	[dwmac] Show core version in debug messages Read and display the core version immediately after mapping the MMIO registers, to provide a basic sanity check that the registers have been correctly mapped and the core is not held in reset. Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-07-30 15:59:38 +01:00
Michael Brown	01b1028d4e	[bnxt] Remove unnecessary test_if macro Signed-off-by: Michael Brown <mcb30@ipxe.org>	2025-07-30 14:08:25 +01:00
Joseph Wong	6ca7a560a4	[bnxt] Remove unnecessary I/O macros Remove unnecessary driver specific macros. Use standard pci_read_config_xxxx, pci_write_config_xxx, writel/q calls. Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>	2025-07-30 14:03:51 +01:00
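
The operating mode representation described in commit 048a346705 above can be illustrated with a short Python sketch. The wire values 1 to 4 are an assumption based on the queue format enumeration in the Linux gve driver; the commit message does not list them. The constant and function names are hypothetical and are not taken from the iPXE source.

```python
# Hypothetical wire values for the "configure device resources" command.
# Assumption: these follow the Linux gve driver's queue format enumeration
# (GQI-RDA=1, GQI-QPL=2, DQO-RDA=3, DQO-QPL=4); the commit does not state them.
WIRE_FORMATS = {"GQI-RDA": 1, "GQI-QPL": 2, "DQO-RDA": 3, "DQO-QPL": 4}

MODE_QPL = 0x01  # hypothetical bit: queue page list addressing in use
MODE_DQO = 0x02  # hypothetical bit: out-of-order (DQO) descriptor format in use


def mode_from_wire(wire):
    """Subtract one so that the mode splits into two independent bits."""
    return wire - 1


def wire_from_mode(mode):
    """Recover the value that would be passed in the admin queue command."""
    return mode + 1


def describe(mode):
    """Name the mode by testing each bit independently."""
    fmt = "DQO" if mode & MODE_DQO else "GQI"
    addr = "QPL" if mode & MODE_QPL else "RDA"
    return f"{fmt}-{addr}"


for name, wire in WIRE_FORMATS.items():
    mode = mode_from_wire(wire)
    print(name, wire, mode, describe(mode))
```

Under these assumed values, each of the four combinations maps to a distinct two-bit mode, so a driver could test the descriptor format and the addressing mode separately instead of switching on four enumerated values.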


7199 Commits
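
The difference between the two master secret calculations mentioned in commits ab64bc5b8d and d6656106e9 above can be sketched in Python. This is a minimal illustration and not the iPXE implementation: it assumes TLS 1.2 with the SHA-256 based PRF from RFC 5246, and the function names and the placeholder inputs are hypothetical.

```python
import hashlib
import hmac


def p_sha256(secret, seed, length):
    """P_hash expansion from RFC 5246 section 5, using HMAC-SHA256."""
    out = b""
    a = seed  # A(0) is the seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def prf(secret, label, seed, length):
    """TLS 1.2 PRF: P_hash over the label concatenated with the seed."""
    return p_sha256(secret, label + seed, length)


def legacy_master_secret(pre_master, client_random, server_random):
    """Original calculation: depends only on the two random values."""
    return prf(pre_master, b"master secret", client_random + server_random, 48)


def extended_master_secret(pre_master, handshake_messages):
    """RFC 7627 calculation: depends on a digest of the handshake messages
    up to and including the Client Key Exchange."""
    session_hash = hashlib.sha256(handshake_messages).digest()
    return prf(pre_master, b"extended master secret", session_hash, 48)


# Placeholder inputs for illustration only (not real handshake data).
pre_master = b"\x03\x03" + bytes(46)
client_random = bytes(range(32))
server_random = bytes(range(32, 64))
handshake = b"ClientHello|ServerHello|...|ClientKeyExchange"

legacy = legacy_master_secret(pre_master, client_random, server_random)
extended = extended_master_secret(pre_master, handshake)
print(legacy.hex())
print(extended.hex())
```

Because the session hash covers the Client Key Exchange message, the extended calculation cannot be performed until that message has been constructed, which is consistent with the reordering described in commit d6656106e9.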