Commit Graph

1569 Commits

Author SHA1 Message Date
Michael Brown ec38e98d40 [disklog] Generalise CONSOLE_INT13 to CONSOLE_DISKLOG
The name "int13" is intrinsically specific to a BIOS environment.
Generalise the build configuration option CONSOLE_INT13 to
CONSOLE_DISKLOG, in preparation for adding EFI disk log console
support.

Existing configurations using CONSOLE_INT13 will continue to work.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-04-07 13:21:59 +01:00
Michael Brown 6d82975059 [undi] Pad transmit buffer length to work around vendor driver bugs
The workaround used for UEFI in commit 926816c ("[efi] Pad transmit
buffer length to work around vendor driver bugs") is also applicable
to the BIOS UNDI driver.

Apply the same workaround of padding the transmit I/O buffers to the
minimum Ethernet frame length before passing them to the underlying
UNDI driver's transmit function.

Reported-by: Alexander Patrakov <patrakov@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-04-07 00:40:38 +01:00
Michael Brown 646d4b0a09 [build] Allow binaries to request a log partition from genfsimg
For UEFI, the USB disk image is constructed from the built EFI binary
(e.g. bin-x86_64-efi/ipxe.efi) by genfsimg, which does not itself have
any way to access the build configuration.  We therefore need a way to
annotate the binary such that genfsimg can determine whether or not to
include a log partition within the USB disk image.

The "OEM ID" and "OEM information" fields within the PE header can be
used for this, since they are easily accessed and serve no other
purpose.  We define bit 0 of "OEM information" as a flag indicating
that a log partition should be included.  If this bit is set, genfsimg
will create a log partition with a layout matching that of the BIOS
build (i.e. using partition 3 and at an offset of 16kB from the start
of the disk).

The PE header is constructed by elf2efi.c, which takes as an input the
linked ELF form of the binary.  We use an ELF .note section to allow
any linked-in object to communicate the log partition request through
to elf2efi.c, which then populates the OEM information field
accordingly.

We choose to use the same field locations within the BIOS bzImage
header, since this allows genfsimg to use the same logic for both BIOS
and UEFI binaries.  In a BIOS build, there is no external processing
equivalent to elf2efi.c, and so we construct the field value directly
using absolute symbols and explicit relocation records.

(Note that the bzImage header is relevant only when using genfsimg to
construct a combined BIOS/UEFI image.  In the common case of building
a BIOS-only image such as bin/ipxe.usb, the partition table is
manually constructed by usbdisk.S and genfsimg is not involved.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-03-26 15:12:52 +00:00
Michael Brown 7b3eb3c86a [undi] Drag in PCI-specific configuration
The undionly.kpxe binary does not need the full PCI bus support.
However, the overwhelming majority of UNDI devices are PCI-based and
we already end up dragging in PCI configuration space support in order
to be able to test for devices with broken interrupts.

Dragging in the PCI configuration allows the PCI settings mechanism to
also be present, which is often useful for end users.  The total cost
is around 200 bytes in the final binary, which is acceptable for a
generally very useful feature.

Users wanting to minimise the binary size can choose to explicitly
disable PCI_SETTINGS via config/settings.h.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-03-18 15:12:11 +00:00
Michael Brown 2012ab71de [pxeprefix] Add a minimal iPXE NBP metadata header
There is no fixed structure for a PXE NBP: the format is just an
opaque block of executable code that is loaded into memory verbatim
and executed by jumping to the first byte.  It is consequently
impossible for external code to unambiguously identify a PXE NBP, or
to inspect any metadata about the NBP's functionality.

The first five bytes of an iPXE NBP are already fixed as being an ljmp
instruction that resets the code segment to 0x7c0 and continues
execution from the following byte.  We can extend this to include a
minimal header as follows:

    Offset    Content
    ------    -------
    0         ljmp instruction (0xea)
    1-2       ljmp offset (and therefore length of header)
    3-4       ljmp segment (0x07c0)
    5+        Metadata fields
    \_ 5      CPU architecture (0x32=i386, 0x64=x86_64)
    \_ 6-7    Magic value (0x18ae)

This is backwards-compatible to existing binaries (which effectively
have zero bytes of metadata following the ljmp instruction), and
allows for future expansion by appending metadata fields (with the
ljmp offset used to determine the overall header length and therefore
the presence of further fields).

In this initial version of the header, define a magic value (used to
differentiate an iPXE NBP from other binaries that happen to start
with an ljmp instruction), and a single-byte value that encodes
whether this binary is built for 32-bit or 64-bit CPUs.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-22 23:23:37 +00:00
Michael Brown 2161e976cd [build] Include USB drivers in the all-drivers build by default
Including USB drivers has some unavoidable side effects.  With a BIOS
firmware, attaching the host controller drivers will necessarily
disable the SMM-based USB legacy support which emulates a PS/2
keyboard.  With a UEFI firmware, loading the host controller drivers
may disconnect some of the less compliant vendor USB device drivers.

We have historically erred on the side of caution and avoided
including any USB drivers in the all-drivers build.  Time has moved
on, USB NICs have become more common (especially for laptops, which
now rarely include physical Ethernet ports), and the UEFI Secure Boot
model makes it prohibitively difficult for users to compile their own
binaries to add support for non-default drivers.

Switch to including USB drivers by default in the all-drivers build.
Provide a fallback build target that matches the existing driver set
(i.e. excluding any USB drivers) and can be built using e.g.:

   make bin/ipxe-legacy.iso

   make bin-x86_64-efi/ipxe-legacy.efi

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-13 18:36:14 +00:00
Michael Brown ae8e23a452 [build] Handle all driver list construction via parserom.pl
Handle construction of the EFI, Linux, Xen, and VMBus driver build
rules via parserom.pl to ensure consistency.  In particular, this
allows those drivers to appear in the DRIVERS_SECBOOT list used to
filter out non-permitted drivers in a Secure Boot build.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-13 14:16:44 +00:00
Michael Brown c9158cb32c [build] Mark Xen HVM files as permitted for UEFI Secure Boot
The Xen netfront driver and the core architecture-independent files
such as xenstore.c and xenbus.c are already marked as permitted for
UEFI Secure Boot, but the x86-specific HVM driver (which attaches to
the PCI device and instantiates the Xen devices) is not.

Review the HVM-specific files and mark them as permitted for UEFI
Secure Boot.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-13 14:06:00 +00:00
Michael Brown 25429d952d [build] Include PCI drivers only in BIOS and UEFI builds
We currently have no PCI bus abstractions for Linux userspace or for
RISC-V SBI.  Limit PCI drivers to being included in the all-drivers
build only for BIOS and UEFI platforms.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-12 13:29:06 +00:00
Michael Brown 3f12b8b1cf [build] Include devicetree drivers in the SBI all-drivers build
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-12 13:29:06 +00:00
Michael Brown 5669c4d52e [build] Include Xen and Hyper-V drivers only in x86 BIOS and UEFI builds
The Xen and Hyper-V drivers cannot be included in the Linux userspace
build since they require MMIO accesses.  Limit these drivers to being
included in the all-drivers build only for BIOS and UEFI platforms.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-12 13:08:06 +00:00
Michael Brown 8a1dd58502 [build] Include ISA drivers only in 32-bit BIOS builds
ISA hardware is vanishingly unlikely to be encountered in anything
other than pre-64-bit x86 hardware with a BIOS firmware.  Exclude the
ISA drivers from all other builds.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-12 11:44:37 +00:00
Michael Brown 7a2817bbd7 [build] Drag in Xen and Hyper-V support via network device drivers
Include Xen and Hyper-V support in the all-drivers build by dragging
in the netfront and netvsc drivers, since these are the functional
drivers that provide network interfaces.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-11 22:02:23 +00:00
Michael Brown 481e043116 [librm] Work around two errata in the 386's "popal" instruction
Detailed experiments show that at least one model of 386 CPU has a
previously undocumented errata in the "popal" instruction.
Specifically: when the stack-address size is 16 bits and the operand
size is 32 bits, the "popal" instruction will erroneously load the
high 16 bits of %esp from the value stored on the stack.

The "movl -20(%esp), %esp" instruction near the end of virt_call()
currently relies on the assumption that the high 16 bits of %esp will
already be zero, since they were set to zero by the "movzwl %bp, %esp"
instruction at the end of prot_to_real() and will not have been
subsequently modified by the "popal".  This 386 CPU errata invalidates
that assumption, with the result that we end up loading the stack
pointer from an essentially undefined memory location.

Fix by inserting a "movzwl %sp, %esp" after the "popal" to explicitly
zero the high 16 bits of %esp.

Inserting this instruction also happens to work around another (known
and documented) errata in the 386, in which the CPU may malfunction if
"popal" is followed immediately by an instruction that uses a base
address register to form an effective address.

Debugged-by: Jaromir Capik <jaromir.capik@email.cz>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-10 10:29:54 +00:00
Michael Brown 18fab8dd84 [loong64] Fix error identifier generation for LoongArch64
The initial code contribution from Loongson defined ASM_NO_PREFIX as
being "a" for this architecture.  This seems to result in small values
such as error line numbers being rendered as "$r0, <value>" rather
than just "<value>".

This seems to hit an undocumented behaviour path in the GNU assembler.
For some reason ".long $r0" is not treated as a syntax error but will
instead be treated as a zero value.  The net effect is therefore that
an extra zero value is emitted before the line number in the einfo
structure, which in turn causes the error information parser to see
all source code line numbers as zero.  (The overall structure remains
valid since the length and all string offsets are encoded within the
structure itself, so nothing breaks when a spurious extra integer
field is appended.)

Fix by setting ASM_NO_PREFIX to the empty string (as for RISC-V),
since there are no literal value prefixes anyway in LoongArch64
assembly.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-02-05 13:45:19 +00:00
Michael Brown 6b17d320db [build] Mark core arm64 files as permitted for UEFI Secure Boot
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-01-28 15:44:58 +00:00
Michael Brown 40c2db9d67 [build] Mark direct kernel loading as forbidden for UEFI Secure Boot
Our long-standing policy for EFI platforms is that we support invoking
binary executables only via the LoadImage() and StartImage() boot
services calls, so that all security policy decisions are delegated to
the platform firmware.

Most binary executable formats that we support are BIOS-only and
cannot in any case be linked in to an EFI executable.  The only
cross-platform format is the generic Linux kernel image format as used
for RISC-V (and potentially also for AArch64).

Mark all files associated with direct loading of a kernel binary as
explicitly forbidden for UEFI Secure Boot.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-01-28 13:38:20 +00:00
Michael Brown 4db03054d5 [build] Mark GDB stub as forbidden for UEFI Secure Boot
Enabling the GDB debugger functionality would provide an immediate and
trivial Secure Boot exploit.  Mark all GDB-related files as explicitly
forbidden for UEFI Secure Boot.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-01-28 13:20:38 +00:00
Jaromir Capik 641ea020f1 [prefix] Make unlzma.S compatible with 386 class CPUs
Replace the bswap instruction with xchgb and roll and change the
module architecture from i486 to i386 to be consistent with the rest
of the project.

Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-01-25 16:15:32 +00:00
Michael Brown adcaaf9b93 [build] Mark known reviewed files as permitted for UEFI Secure Boot
Some past security reviews carried out for UEFI Secure Boot signing
submissions have covered specific drivers or functional areas of iPXE.
Mark all of the files comprising these areas as permitted for UEFI
Secure Boot.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-01-14 16:10:29 +00:00
Michael Brown 6cccb3bdc0 [build] Mark core files as permitted for UEFI Secure Boot
Mark all files used in a standard build of bin-x86_64-efi/snponly.efi
as permitted for UEFI Secure Boot.  These files represent the core
functionality of iPXE that is guaranteed to have been included in
every binary that was previously subject to a security review and
signed by Microsoft.  It is therefore legitimate to assume that at
least these files have already been reviewed to the required standard
multiple times.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-01-14 13:25:34 +00:00
Michael Brown 7c39c04a53 [crypto] Allow for zero-length big integer literals
Ensure that zero-length big integer literals are treated as containing
a zero value.  Avoid tests on every big integer arithmetic operation
by ensuring that bigint_required_size() always returns a non-zero
value: the zero-length tests can therefore be restricted to only
bigint_init() and bigint_done().

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-12-29 14:01:46 +00:00
Michael Brown 9c1ac48bcf [pci] Allow probing permission to vary by range
Make pci_can_probe() part of the runtime selectable PCI I/O API, and
defer this check to the per-range API.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-11-24 23:16:32 +00:00
Michael Brown ff1a17dc7e [pci] Use linker tables for runtime selectable PCI APIs
Use the linker table mechanism to enumerate the underlying PCI I/O
APIs, to allow PCIAPI_CLOUD to become architecture-independent code.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-11-24 20:54:01 +00:00
Michael Brown 0cf2f8028c [pci] Allow PCI configuration space access mechanism to vary by range
On some systems (observed on an AWS EC2 c7a.medium instance in
eu-west-2), there is no single PCI configuration space access
mechanism that covers all PCI devices.  For example: the ECAM may
provide access only to bus 01 and above, with type 1 accesses required
to access bus 00.

Allow the choice of PCI configuration space access mechanism to be
made on a per-range basis, rather than being made just once in an
initialisation function.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-11-24 19:10:49 +00:00
Michael Brown 81496315f2 [arm] Avoid unaligned accesses for memcpy() and memset()
iPXE runs only in environments that support unaligned accesses to RAM.
However, memcpy() and memset() are also used to write to graphical
framebuffer memory, which may support only aligned accesses on some
CPU architectures such as ARM.

Restructure the 64-bit ARM memcpy() and memset() routines along the
lines of the RISC-V implementations, which split the region into
pre-aligned, aligned, and post-aligned sections.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-11-19 22:20:38 +00:00
Michael Brown bd3982b630 [ioapi] Allow iounmap() to be called for port I/O addresses
Allow code using the combined MMIO and port I/O accessors to safely
call iounmap() to unmap the MMIO or port I/O region.

In the virtual offset I/O mapping API as used for UEFI, 32-bit BIOS,
and 32-bit RISC-V SBI, iounmap() is a no-op anyway.  In 64-bit RISC-V
SBI, we have no concept of port I/O and so the issue is moot.

This leaves only 64-bit BIOS, where it suffices to simply do nothing
for any pages outside of the chosen MMIO virtual address range.

For symmetry, we implement the equivalent change in the very closely
related RISC-V page management code.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-11-05 19:33:53 +00:00
Michael Brown a786c8d231 [uart] Support 16550 UARTs accessed via either MMIO or port I/O
Use the combined accessors ioread8() and iowrite8() to read and write
16550 UART registers, to allow the decision between using MMIO and
port I/O to be made at runtime.

Minimise the increase in code size for x86 by ignoring the register
shift, since this is essentially used only for non-x86 SoCs.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-11-04 21:14:41 +00:00
Michael Brown f7de1b53dc [ioapi] Provide combined MMIO and port I/O accessors
Some devices (such as a 16550 UART) may be accessed via either MMIO or
port I/O.  This is currently forced to be a compile-time decision.
For example: we currently access a 16550 UART via port I/O on x86 and
via MMIO on any other platform.

PCI UARTs with MMIO BARs do exist but are not currently supported in
an x86 build of iPXE.  Some AWS EC2 systems (observed on a c6i.metal
instance in eu-west-2) provide only a PCI MMIO UART, and it is
therefore currently impossible to get serial output from iPXE on these
instance types.

Add ioread8(), ioread16(), etc accessors that will select between MMIO
and port I/O at the point of use.  For non-x86 platforms where we
currently have no port I/O support, these simply become wrappers
around the corresponding readb(), readw(), etc MMIO accessors.  On
x86, we use the fairly well-known trick of treating any 16-bit address
(below 64kB) as a port I/O address.

This trick works even in the i386 BIOS build of iPXE (where virtual
addresses are offset from physical addresses by a runtime constant),
since the first 64kB of the virtual address space will correspond to
the iPXE binary itself (along with its uninitialised-data space), and
so must be RAM rather than a valid MMIO address range.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-11-04 21:14:41 +00:00
Michael Brown 0ddd830693 [riscv] Correct page table stride calculation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-27 14:22:16 +00:00
Michael Brown 426c721e32 [librm] Correct page table stride calculation
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-10-27 14:22:16 +00:00
Michael Brown 9d4a2ee353 [cmdline] Show commands in alphabetical order
Commands were originally ordered by functional group (e.g. keeping the
image management commands together), with arrays used to impose a
functionally meaningful order within the group.

As the number of commands and functional groups has expanded over the
years, this has become essentially useless as an organising principle.
Switch to sorting commands alphabetically (using the linker table
mechanism).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-08-06 16:34:45 +01:00
Michael Brown a814c46059 [riscv] Place explicitly zero-initialised variables in the .data section
Variables in the .bss section cannot be relied upon to have zero
values during early initialisation, before we have relocated ourselves
to somewhere suitable in RAM and zeroed the .bss section.

Place any explicitly zero-initialised variables in the .data section
rather than in .bss, so that we can rely on their values even during
this early initialisation stage.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-30 13:15:11 +01:00
Michael Brown 5bda1727b4 [riscv] Allow for poisoning .bss section before early initialisation
On startup, we may be running from read-only memory, and therefore
cannot zero the .bss section (or write to the .data section) until we
have parsed the system memory map and relocated ourselves to somewhere
suitable in RAM.  The code that runs during this early initialisation
stage must be carefully written to avoid writing to the .data section
and to avoid reading from or writing to the .bss section.

Detecting code that erroneously writes to the .data or .bss sections
is relatively easy since running from read-only memory (e.g. via
QEMU's -pflash option) will immediately reveal the bug.  Detecting
code that erroneously reads from the .bss section is harder, since in
a freshly powered-on machine (or in a virtual machine) there is a high
probability that the contents of the memory will be zero even before
we explicitly zero out the section.

Add the ability to fill the .bss section with an invalid non-zero
value to expose bugs in early initialisation code that erroneously
relies upon variables in .bss before the section has been zeroed.  We
use the value 0xeb55eb55eb55eb55 ("EBSS") since this is immediately
recognisable as a value in a crash dump, and will trigger a page fault
if dereferenced since the address is in a non-canonical form.

Poisoning the .bss can be done only when the image is known to already
reside in writable memory.  It will overwrite the relocation records,
and so can be done only on a system where relocation is known to be
unnecessary (e.g. because paging is supported).  We therefore do not
enable this behaviour by default, but leave it as a configurable
option via the config/fault.h header.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-30 12:31:15 +01:00
Michael Brown e3a6e9230c [undi] Assume that legacy interrupts are broken for any PCIe device
PCI Express devices do not have physical INTx output signals, and on
modern motherboards there is unlikely to be any interrupt controller
with physical interrupt input signals.  There are multiple levels of
abstraction involved in emulating the legacy INTx interrupt mechanism:
the PCIe device sends Assert_INTx and Deassert_INTx messages, PCIe
bridges and switches must collate these virtual wires, and the root
complex must map the virtual wires into messages that can be
understood by the host's emulated 8259 PIC.

This complex chain of emulations is rarely tested on modern hardware,
since operating systems will invariably use MSI-X for PCI devices and
the I/O APIC for non-PCI devices such as the real-time clock.  Since
the legacy interrupt emulation mechanism is rarely tested, it is
frequently unreliable.  We have encountered many issues over the years
in which legacy interrupts are simply not raised as expected, even
when inspection shows that the device believes it is asserting an
interrupt and the controller believes that the interrupt is enabled.

We already maintain a list of devices that are known to fail to
generate legacy interrupts correctly.  This list is based on the PCI
vendor and device IDs, which is not necessarily a fair test since the
root cause may be a board-level misconfiguration rather than a
device-level fault.

Assume that any PCI Express device has a high chance of not being able
to raise legacy interrupts reliably.  This is a relatively intrusive
change since it will affect essentially all modern network devices,
but should hopefully fix all future issues with non-functional legacy
interrupts, without needing to constantly grow the list of known
broken devices.

If some PCI Express devices are found to fail when operated in polling
mode, then this change will need to be revisited.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-24 14:14:41 +01:00
Michael Brown 65b8a6e459 [pxeprefix] Display PCI vendor and device ID in PXE startup banner
In the case of a misbehaving PXE stack, it is often useful to know the
PCI vendor and device IDs (e.g. for adding the device to the list of
devices with known broken support for generating interrupts).

The PCI vendor and device ID is already available to the prefix code,
and so can trivially be printed out.  Add this information to the PXE
prefix startup banner.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-23 16:11:09 +01:00
Michael Brown 1e3fb1b37e [init] Show initialisation function names in debug messages
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-15 14:10:33 +01:00
Michael Brown 434462a93e [riscv] Ensure coherent DMA allocations do not cross cacheline boundaries
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 13:50:41 +01:00
Michael Brown d539a420df [riscv] Support the standard Svpbmt extension for page-based memory types
Set the appropriate Svpbmt type bits within page table entries if the
extension is supported.  Tested only in QEMU so far, due to the lack
of availability of real hardware supporting Svpbmt.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 12:24:02 +01:00
Michael Brown 2aacb346ca [riscv] Create coherent DMA mapping of 32-bit address space on demand
Reuse the code that creates I/O device page mappings to create the
coherent DMA mapping of the 32-bit address space on demand, instead of
constructing this mapping as part of the initial page table.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 12:23:51 +01:00
Michael Brown 0611ddbd12 [riscv] Use 1GB pages for I/O device mappings
All 64-bit paging schemes support at least 1GB "gigapages".  Use these
to map I/O devices instead of 2MB "megapages".  This reduces the
number of consumed page table entries, increases the visual similarity
of I/O remapped addresses to the underlying physical addresses, and
opens up the possibility of reusing the code to create the coherent
DMA map of the 32-bit address space.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-11 12:05:52 +01:00
Michael Brown bbabde8ff8 [riscv] Invalidate data cache on completed RX DMA buffers
The data cache must be invalidated twice for RX DMA buffers: once
before passing ownership to the DMA device (in case the cache happens
to contain dirty data that will be written back at an undefined future
point), and once after receiving ownership from the DMA device (in
case the CPU happens to have speculatively accessed data in the buffer
while it was owned by the hardware).

Only the used portion of the buffer needs to be invalidated after
completion, since we do not care about data within the unused portion.

Update the DMA API to include the used length as an additional
parameter to dma_unmap(), and add the necessary second cache
invalidation pass to the RISC-V DMA API implementation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-10 14:39:07 +01:00
Michael Brown 634d9abefb [riscv] Add optimised TCP/IP checksumming
Add a RISC-V assembly language implementation of TCP/IP checksumming,
which is around 50x faster than the generic algorithm.  The main loop
checksums aligned xlen-bit words, using almost entirely compressible
instructions and accumulating carries in a separate register to allow
folding to be deferred until after all loops have completed.

Experimentation on a C910 CPU suggests that this achieves around four
bytes per clock cycle, which is comparable to the x86 implementation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-10 13:32:45 +01:00
Michael Brown 101ef74a6e [riscv] Provide a DMA API implementation for RISC-V bare-metal systems
Provide an implementation of dma_map() that performs cache clean or
invalidation as required, and an implementation of dma_alloc() that
returns virtual addresses within the coherent mapping of the 32-bit
physical address space.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-09 11:07:37 +01:00
Michael Brown e223b32511 [riscv] Support explicit cache management operations on I/O buffers
On platforms where DMA devices are not in the same coherency domain as
the CPU cache, it is necessary to be able to explicitly clean the
cache (i.e. force data to be written back to main memory) and
invalidate the cache (i.e. discard any cached data and force a
subsequent read from main memory).

Add support for cache management via the standard Zicbom extension or
the T-Head cache management operations extension, with the supported
extension detected on first use.

Support cache management operations only on I/O buffers, since these
are guaranteed to not share cachelines with other data.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-07 16:38:23 +01:00
Michael Brown 6a75115a74 [riscv] Add support for detecting T-Head vendor extensions
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-07 16:38:23 +01:00
Michael Brown d75d10df16 [riscv] Create coherent DMA mapping for low 4GB of address space
Use PTEs 256-259 to create a mapping of the 32-bit physical address
space with attributes suitable for coherent DMA mappings.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-04 16:10:51 +01:00
Michael Brown 3fd54e4f3a [riscv] Construct invariant portions of page table outside the loop
The page table entries for the identity map vary according to the
paging level in use, and so must be constructed within the loop used
to detect the maximum supported paging level.  Other page table
entries are invariant between paging levels, and so may be constructed
just once before entering the loop.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-07-04 16:10:51 +01:00
Michael Brown 8df3b96402 [build] Allow for the existence of small-data sections
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 14:40:57 +01:00
Michael Brown 97f40c5fcc [pxe] Use a weak symbol for isapnp_read_port
Use a weak symbol for isapnp_read_port used in pxe_preboot.c, rather
than relying on a common symbol.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-06-24 13:34:41 +01:00