Commit Graph

7070 Commits

Author SHA1 Message Date
Michael Brown
56f5845b36 [riscv] Include carriage returns in libprefix.S debug messages
Support debug consoles that do not automatically convert LF to CRLF by
including the CR character within the debug message strings.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-26 00:10:30 +01:00
Michael Brown
09140ab2c1 [memmap] Allow explicit colour selection for memory map debug messages
Provide DBGC_MEMMAP() as a replacement for memmap_dump(), allowing the
colour used to match other messages within the same message group.

Retain a dedicated colour for output from memmap_dump_all(), on the
basis that it is generally most useful to visually compare full memory
dumps against previous full memory dumps.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-25 12:06:53 +01:00
Michael Brown
8d88870da5 [riscv] Support older SBI implementations
Fall back to attempting the legacy SBI console and shutdown calls if
the standard calls fail.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-25 10:43:39 +01:00
Michael Brown
036e43334a [memmap] Rename addr/last fields to min/max for clarity
Use the terminology "min" and "max" for addresses covered by a memory
region descriptor, since this is sufficiently intuitive to generally
not require further explanation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 16:55:42 +01:00
Michael Brown
cd38ed4fab [lkrn] Support initrd construction for RISC-V bare-metal kernels
Use the shared initrd reshuffling and CPIO header construction code
for RISC-V bare-metal kernels.  This allows for files to be injected
into the constructed ("magic") initrd image in exactly the same way as
is done for bzImage and UEFI kernels.

We append a dummy image encompassing the FDT to the end of the
reshuffle list, so that it ends up directly following the constructed
initrd in memory (but excluded from the initrd length, which was
recorded before constructing the FDT).

We also temporarily prepend the kernel binary itself to the reshuffle
list.  This is guaranteed to be safe (since reshuffling is designed to
be unable to fail), and avoids the requirement for the kernel segment
to be available before reshuffling.  This is useful since current
RISC-V bare-metal kernels tend to be distributed as EFI zboot images,
which require large temporary allocations from the external heap for
the intermediate images created during archive extraction.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 16:14:45 +01:00
Michael Brown
c713ce5c7b [initrd] Squash and shuffle only initrds within the external heap
Any initrd images that are not within the external heap (e.g. embedded
images) do not need to be copied to the external heap for reshuffling,
and can just be left in their original locations.

Ignore any images that are not already within the external heap (or,
more precisely, that are wholly outside of the reshuffle region within
the external heap) when squashing and swapping images.

This reduces the maximum additional storage required by squashing and
swapping to zero, and so ensures that the reshuffling step is
guaranteed to succeed under all circumstances.  (This is unrelated to
the post-reshuffle load region check, which is still required.)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 14:39:17 +01:00
Michael Brown
4a39b877dd [initrd] Split out initrd construction from bzimage.c
Provide a reusable function initrd_load_all() to load all initrds
(including any constructed CPIO headers) into a contiguous memory
region, and support functions to find the constructed total length and
permissible post-reshuffling load address range.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-23 12:31:46 +01:00
Michael Brown
11929389e4 [initrd] Allow for images straddling the top of the reshuffle region
It is hypothetically possible for external heap memory allocated
during driver startup to have been freed before an image was
downloaded, which could therefore leave an image straddling the
address recorded as the top of the reshuffle region.

Allow for this possibility by skipping squashing for any images
already straddling (or touching) the top of the reshuffle region.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
Michael Brown
029c7c4178 [initrd] Rename bzimage_align() to initrd_align()
Alignment of initrd lengths is applicable to all Linux kernels, not
just those in the x86 bzImage format.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
Michael Brown
9231d8c952 [initrd] Swap initrds entirely in-place via triple reversal
Eliminate the requirement for free space when reshuffling initrds by
swapping adjacent initrds using an in-place triple reversal.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
Michael Brown
11e01f0652 [uheap] Expose external heap region directly
We currently rely on implicit detection of the external heap region.
The INT 15 memory map mangler relies on examining the corresponding
in-use memory region, and the initrd reshuffler relies on performing a
separate detection of the largest free memory block after startup has
completed.

Replace these with explicit public symbols to describe the external
heap region.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:28:15 +01:00
Michael Brown
e056041074 [uheap] Prevent allocation of blocks with zero physical addresses
If the external heap ends up at the top of the system memory map then
leave a gap after the heap to ensure that no block ends up being
allocated with either a start or end address of zero, since this is
frequently confusing to both code and humans.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:16:14 +01:00
Michael Brown
b9095a045a [fdtmem] Allow iPXE to be relocated to the top of the address space
Allow for relocation to a region at the very end of the physical
address space (where the next address wraps to zero).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-22 16:16:14 +01:00
Michael Brown
a534563345 [riscv] Speed up memmove() when copying in forwards direction
Use the word-at-a-time variable-length memcpy() implementation when
performing an overlapping copy in the forwards direction, since this
is guaranteed to be safe and likely to be substantially faster than
the existing bytewise copy.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 16:12:56 +01:00
Michael Brown
20d2c0f787 [lkrn] Shut down devices before jumping to kernel entry point
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 15:07:55 +01:00
Michael Brown
969e8b5462 [lkrn] Allow a single initrd to be passed to the booted kernel
Allow a single initrd image to be passed verbatim to the booted RISC-V
kernel, as a proof of concept.

We do not yet support reshuffling to make optimal use of available
memory, or dynamic construction of CPIO headers, but this is
sufficient to allow iPXE to start up the Fedora 42 kernel with its
matching initrd image.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 14:56:10 +01:00
Michael Brown
9bc559850c [fdt] Allow an initrd to be specified when creating a device tree
Allow an initrd location to be specified in our constructed device
tree via the "linux,initrd-start" and "linux,initrd-end" properties.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 14:31:18 +01:00
Michael Brown
c1cd54ad74 [initrd] Move initrd reshuffling to be architecture-independent code
There is nothing x86-specific in initrd.c, and a variant of the
reshuffling logic will be required for executing bare-metal kernels on
RISC-V and AArch64.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-21 12:12:16 +01:00
Michael Brown
d15a11f3a4 [image] Use image replacement when executing extracted images
Use image_replace() to transfer execution to the extracted image,
rather than calling image_exec() directly.  This allows the original
archive image to be freed immediately if it was marked as an
automatically freeable image (e.g. via "chain --autofree").

In particular, this ensures that in the case of an archive image
containing another archive image (such as an EFI zboot kernel wrapper
image containing a gzip-compressed kernel image), the intermediate
extracted image will be freed as early as possible, since extracted
images are always marked as automatically freeable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-20 15:34:49 +01:00
Michael Brown
e2f4dba2b7 [lkrn] Add support for EFI zboot compressed kernel images
Current RISC-V and AArch64 kernels found in the wild tend not to be in
the documented kernel format, but are instead "EFI zboot" kernels
comprising a small EFI executable that decompresses and executes the
inner payload (which is a kernel in the expected format).

The EFI zboot header includes a recognisable magic value "zimg" along
with two fields describing the offset and length of the compressed
payload.  We can therefore treat this as an archive image format,
extracting the payload as-is and then relying on our existing ability
to execute compressed images.

This is sufficient to allow iPXE to execute the Fedora 42 RISC-V
kernel binary as currently published.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-20 14:29:57 +01:00
Michael Brown
ecac4a34c7 [lkrn] Add basic support for the RISC-V Linux kernel image format
The RISC-V and AArch64 bare-metal kernel images share a common header
format, and require essentially the same execution environment: loaded
close to the start of RAM, entered with paging disabled, and passed a
pointer to a flattened device tree that describes the hardware and any
boot arguments.

Implement basic support for executing bare-metal RISC-V and AArch64
kernel images.  The (trivial) AArch64-specific code path is untested
since we do not yet have the ability to build for any bare-metal
AArch64 platforms.  Constructing and passing an initramfs image is not
yet supported.

Rename the IMAGE_BZIMAGE build configuration option to IMAGE_LKRN,
since "bzImage" is specific to x86.  To retain backwards compatibility
with existing local build configurations, we leave IMAGE_BZIMAGE as
the enabled option in config/default/pcbios.h and treat IMAGE_LKRN as
a synonym for IMAGE_BZIMAGE when building for x86 BIOS.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-20 13:08:38 +01:00
Michael Brown
d0c35b6823 [bios] Use generic external heap based on the system memory map
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 20:47:21 +01:00
Michael Brown
140ceeeb08 [riscv] Use generic external heap based on the system memory map
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:36:25 +01:00
Michael Brown
4d560af2b0 [uheap] Add a generic external heap based on the system memory map
Add an implementation of umalloc() using the generalised model of a
heap, placing the external heap in the largest usable region obtained
from the system memory map.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:36:25 +01:00
Michael Brown
490f1ecad8 [malloc] Allow heap to specify block and pointer alignments
Size-tracked pointers allocated via umalloc() have historically been
aligned to a page boundary, as have the edges of the hidden memory
region covering the external heap.

Allow the block and size-tracked pointer alignments to be specified as
heap configuration parameters.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:36:23 +01:00
Michael Brown
c6ca3d3af8 [malloc] Allow for the existence of multiple heaps
Create a generic model of a heap as a list of free blocks with
optional methods for growing and shrinking the heap.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-19 19:35:56 +01:00
Michael Brown
83449702e0 [memmap] Remove now-obsolete get_memmap()
All memory map users have been updated to use the new system memory
map API.  Remove get_memmap() and its associated definitions.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 18:16:41 +01:00
Michael Brown
624d76e26d [bios] Use memmap_describe() to find an external heap location
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 18:04:27 +01:00
Michael Brown
79c30b92a3 [settings] Use memmap_describe() to construct memory map settings
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:39:36 +01:00
Michael Brown
c8d64ecd87 [bios] Use memmap_describe() to find a relocation address
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:39:29 +01:00
Michael Brown
dbc86458e5 [comboot] Use memmap_describe() to obtain available memory
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:02:55 +01:00
Michael Brown
d0adf3b4cc [multiboot] Use memmap_describe() to construct Multiboot memory map
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 17:02:55 +01:00
Michael Brown
25ab8f4629 [image] Use memmap_describe() to check loadable image segments
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:55:35 +01:00
Michael Brown
a353e70800 [memmap] Use memmap_dump_all() to dump debug memory maps
There are several places where get_memmap() is called solely to
produce debug output.  Replace these with calls to memmap_dump_all()
(which will be a no-op unless debugging is enabled).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
Michael Brown
3812860e39 [bios] Describe umalloc() heap as an in-use memory area
Use the concept of an in-use memory region defined as part of the
system memory map API to describe the umalloc() heap.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
Michael Brown
4c4c94ca09 [bios] Update to use the generic system memory map API
Provide an implementation of the system memory map API based on the
assorted BIOS INT 15 calls, and a temporary implementation of the
legacy get_memmap() function using the new API.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
Michael Brown
3f6ee95737 [fdtmem] Update to use the generic system memory map API
Provide an implementation of the system memory map API based on the
system device tree, excluding any memory outside the size of the
accessible physical address space and defining an in-use region to
cover the relocated copy of iPXE and the system device tree.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:18:36 +01:00
Michael Brown
bab3d76717 [memmap] Define an API for managing the system memory map
Define a generic system memory map API, based on the abstraction
created for parsing the FDT memory map and adding a concept of hidden
in-use memory regions as required to support patching the BIOS INT 15
memory map.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-16 16:12:15 +01:00
Michael Brown
f6f11c101c [tests] Remove prehistoric umalloc() test code
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-15 15:47:08 +01:00
Michael Brown
e0c4cfa81e [fdtmem] Record size of accessible physical address space
The size of accessible physical address space will be required for the
runtime memory map, not just at relocation time.  Make this size an
additional parameter to fdt_register() (matching the prototype for
fdt_relocate()), and record the value for future reference.

Note that we cannot simply store the limit in fdt_relocate() since it
is called before .data is writable and before .bss is zeroed.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-14 22:09:51 +01:00
Michael Brown
64ad1d03c3 [bios] Rename memmap.c to int15.c
Create namespace for an architecture-independent memmap.c by renaming
the BIOS-specific memmap.c to int15.c.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-14 22:02:46 +01:00
Joseph Wong
1dd9ac13fd [bnxt] Use updated DMA APIs
Replace malloc_phys with dma_alloc, free_phys with dma_free, alloc_iob
with alloc_rx_iob, free_iob with free_rx_iob, virt_to_bus with dma or
iob_dma.  Replace dma_addr_t with physaddr_t.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-05-14 14:21:02 +01:00
Joseph Wong
08edad7ca3 [bnxt] Return proper error codes in probe
Return the proper error codes in bnxt_init_one, to indicate the
correct return status upon completion.  Failure paths could
incorrectly indicate a success.  Correct assertion condition to check
for non-NULL pointer.

Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
2025-05-14 14:08:27 +01:00
Michael Brown
4d39b2dcc6 [crypto] Remove redundant null pointer check
Coverity reports a spurious potential null pointer dereference in
cms_decrypt(), since the null pointer check takes place after the
pointer has already been dereferenced.  The pointer can never be null,
since it is initialised to point to cipher_null at the point that the
containing structure is allocated.

Remove the redundant null pointer check, and for symmetry ensure that
the digest and public-key algorithm pointers are similarly initialised
at the point of allocation.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-14 12:46:23 +01:00
Michael Brown
d1c1e578af [riscv] Add a .pf32 build target for padded parallel flash images
QEMU's -pflash option requires an image that has been padded to the
exact expected size (32MB for all of the supported RISC-V virtual
machines).

Add a .pf32 build target which is simply the equivalent .sbi target
padded to 32MB in size, to simplify testing.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 18:25:24 +01:00
Michael Brown
6fd927f929 [riscv] Perform a writability test before applying relocations
If paging is not supported, then we will attempt to apply dynamic
relocations to fix up the runtime addresses.  If the image is
currently executing directly from flash memory, this can result in
effectively sending an undefined sequence of commands to the flash
device, which can cause unwanted side effects.

Perform an explicit writability test before applying relocations,
using a write value chosen to be safe for at least any devices
conforming to the JEDEC Common Flash Interface (CFI01).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 17:42:53 +01:00
Michael Brown
4566f59757 [riscv] Avoid potentially overwriting the scratch area during relocation
We do not currently describe the temporary page table or the temporary
stack as areas to be avoided during relocation of the iPXE image to a
new physical address.

Perform the copy of the iPXE image and zeroing of the .bss within
libprefix.S, after we have no futher use for the temporary page table
or the temporary initial stack.  Perform the copy and registration of
the system device tree in C code after relocation is complete and the
new stack (within .bss) has been set up.

This provides a clean separation of responsibilities between the
RISC-V libprefix.S and the architecture-independent fdtmem.c.  The
prefix is responsible only for relocating iPXE to the new physical
address returned from fdtmem_relocate(), and doesn't need to know or
care where fdtmem.c is planning to place the copy of the device tree.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 14:00:34 +01:00
Michael Brown
8e38af800b [riscv] Add a .lkrn build target resembling a Linux kernel binary
On x86 BIOS, it has been useful to be able to build iPXE to resemble a
Linux kernel, so that it can be loaded by programs such as syslinux
which already know how to handle Linux kernel binaries.

Add an equivalent .lkrn build target for RISC-V SBI, allowing for
build targets such as:

  make bin-riscv64/ipxe.lkrn

  make bin-riscv64/cgem.lkrn

The Linux kernel header format allows us to specify a required length
(including uninitialised-data portions) and defines that the image
will be loaded at a fixed offset from the start of RAM.  We can
therefore use known-safe areas of memory (within our own .bss) for the
initial temporary page table and stack.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-13 13:03:08 +01:00
Michael Brown
17fd67ce03 [riscv] Relocate to a safe physical address on startup
On startup, we may be running from read-only memory.  We need to parse
the devicetree to obtain the system memory map, and identify a safe
location to which we can copy our own binary image along with a
stashed copy of the devicetree, and then transfer execution to this
new location.

Parsing the system memory map realistically requires running C code.
This in turn requires a small temporary stack, and some way to ensure
that symbol references are valid.

We first attempt to enable paging, to make the runtime virtual
addresses equal to the link-time virtual addresses.  If this fails,
then we attempt to apply the compressed relocation records.

Assuming that one of these has worked (i.e. that either the CPU
supports paging or that our image started execution in writable
memory), then we call fdtmem_relocate() to parse the system memory map
to find a suitable relocation target address.

After the copy we disable paging, jump to the relocated copy,
re-enable paging, and reapply relocation records (if needed).  At this
point, we have a full runtime environment, and can transfer control to
normal C code.

Provide this functionality as part of libprefix.S, since it is likely
to be shared by multiple prefixes.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-12 13:59:42 +01:00
Michael Brown
3dfc88158c [riscv] Construct page tables based on link-time virtual addresses
Always construct the page tables based on the link-time address values
even if relocations have already been applied, on the assumption that
relocations will be reapplied after paging has been enabled.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2025-05-12 13:59:42 +01:00