Use the tighter provable constraint
    carry.2^n + x <= (2^n - 1) + (2^n - 1)
                  <= 2^n + (2^n - 2)
and so
    x + carry <= (2^n - 2) + 1
              <= (2^n - 1)
to eliminate some unnecessary folding steps, and hold the folded value
in the most significant bits of the register rather than the least
significant bits so that the final one's complement negation can be
accomplished naturally without requiring an explicit 0xffff constant.
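As an illustration only (this is plain C, not the assembly changed by this commit, and the function name is hypothetical), the two ideas can be sketched for n = 16. The folded value is kept in the upper half of a 32-bit register, one end-around carry is added, and the final negation needs no 0xffff mask:

```c
#include <stdint.h>

/*
 * Illustrative sketch only: fold a 32-bit partial sum to a 16-bit
 * one's complement checksum, holding the folded value in the upper
 * half of the register.
 */
static uint16_t fold_negate ( uint32_t sum ) {
    uint32_t hi = ( sum & 0xffff0000U );  /* upper half, left in place */
    uint32_t lo = ( sum << 16 );          /* lower half, moved to the top */
    uint32_t top = ( hi + lo );           /* may wrap: carry.2^n + x */

    /* Add back the end-around carry.  Since x + carry <= 2^n - 1,
     * this addition cannot itself carry, so no second fold is needed.
     */
    if ( top < lo )
        top += 0x10000U;

    /* The low 16 bits of "top" are zero, so inverting and shifting
     * yields the negated checksum without masking against 0xffff.
     */
    return ( ( uint16_t ) ( ~top >> 16 ) );
}
```

For example, a partial sum of 0x0001ffff folds to 0x0001 after the single end-around carry, and the returned checksum is 0xfffe.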
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The current implementation of the optimised string operations appears
to have been ported from the (old) arm64 implementation, and does not
cleanly match the LoongArch64 instruction set.
Replace with code derived from the riscv64 implementation, modified to
use indexed load and store instructions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Discarding neighbour cache entries for active connections is known to
be extremely disruptive, and is therefore done only as a last resort
when attempting to free up memory for a new allocation attempt.
There is currently no way to discard the deferred packet queue
separately from discarding the complete neighbour cache entry. Under
some conditions (such as a sustained ICMP echo request packet flood
from an IP address that will never complete neighbour resolution),
this can lead to the deferred packet queue growing without limit,
which will eventually lead to complete neighbour cache entries being
discarded.
Split out the logic in neighbour_destroy() for dropping deferred
packets to a separate neighbour_drop() function, and add a separate
cache discarder that will use this to free up memory without requiring
the complete neighbour cache entry to be discarded.
Reported-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The existing virtio network driver has been somewhat hacked together
over the past two decades by multiple contributors, and includes a
substantial amount of logic that is almost but not quite duplicated
between the "legacy" and "modern" code paths.
Rip out the existing driver and replace with a completely new driver
written based on the Virtual I/O Device specification document, not
derived from the Linux kernel driver.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 3d43789 ("[lacp] Detect and ignore erroneously looped back LACP
packets") added protection against LACP packet storms that arise when
our own transmitted packets are somehow looped back to the same port,
but does not protect against a situation in which we have two
different ports that are externally bridged to each other.
This situation is unlikely to arise in practice since a properly
configured link partner should not be both sending and forwarding LACP
packets. Triggering this situation essentially requires our two ports
to be connected to a non-LACP-capable switch, while another port on
the same switch is connected to a separate device that is sending out
LACP packets.
Guard against this situation by using the MAC address of the first
network device as the LACP system identifier, thereby allowing the
loopback detection to reject any packets that were sent from any of
our ports.
Since the system identifier is no longer unique between ports, use the
guaranteed-unique network device scope ID as the group key to indicate
that we do not support aggregation.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The RSA-PSS signature scheme is crowbarred somewhat awkwardly into TLS
version 1.2. Certificates with the standard rsaEncryption OID in the
public key may be used with either PKCS#1 or RSA-PSS, which breaks the
straightforward mapping between the OID and the signature algorithm.
Extend the definition of a TLS signature hash algorithm to include a
required OID-identified algorithm in the certificate's public key.
This allows us to define signature schemes such as rsa_pss_rsae_sha256
where the signature scheme uses an algorithm that differs from the
algorithm identified in the certificate's public key.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for the RSA-PSS signature scheme as defined in RFC 8017
and required for TLS version 1.3.
Signature verification is deliberately implemented by first deriving
the salt value and then reconstructing the entire expected signature.
This is arguably inefficient since it involves two invocations of the
mask generation function when only one is required. However, this
implementation approach keeps the code size minimal (since there is no
need to implement separate verification logic), and makes it provably
impossible to accidentally omit a verification step (such as checking
the leading zero bits or the fixed 0x01 or 0xbc bytes). Since
signature verification is not a fast-path operation, the guaranteed
correctness is more valuable than a marginally faster execution.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The RSA-PSS signature scheme has the same basic structure as the
existing PKCS#1 signature scheme, with a difference only in how the
digest value is encoded before being enciphered.
Abstract out the digest encoding from the signature and verification
methods, and add an explicit "pkcs1" to the relevant method names.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Make the public key self-tests fully deterministic by temporarily
overriding the function used to obtain random data for RSA encryption.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Modify bnxt_hwrm_run() to accept a flag indicating whether to abort
immediately upon a command failure. During the initialization path,
the driver will continue to abort on the first error. During teardown,
the sequence will continue executing subsequent cleanup commands even
if one fails. This ensures a best-effort cleanup.
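A minimal sketch of the pattern, assuming a table of command handlers that return zero on success. The names run_cmds, cmd_fn, cmd_ok and cmd_fail are hypothetical and do not reflect the actual bnxt_hwrm_run() signature:

```c
#include <stddef.h>

/* Hypothetical command handler type: returns 0 on success */
typedef int ( * cmd_fn ) ( unsigned int *calls );

/* Illustrative handlers that count how often they are invoked */
static int cmd_ok ( unsigned int *calls ) {
    ( *calls )++;
    return 0;
}

static int cmd_fail ( unsigned int *calls ) {
    ( *calls )++;
    return -1;
}

/*
 * Run each command in order.  If abort_on_fail is set, stop at the
 * first failure (initialization); otherwise keep going and report
 * the first error seen (best-effort teardown).
 */
static int run_cmds ( cmd_fn *cmds, size_t count, unsigned int *calls,
                      int abort_on_fail ) {
    int first_rc = 0;
    size_t i;

    for ( i = 0 ; i < count ; i++ ) {
        int rc = cmds[i] ( calls );

        if ( rc == 0 )
            continue;
        if ( abort_on_fail )
            return rc;
        if ( first_rc == 0 )
            first_rc = rc;
    }
    return first_rc;
}
```

With the sequence { cmd_ok, cmd_fail, cmd_ok }, the aborting mode runs two commands and the best-effort mode runs all three; both report the failure to the caller.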
Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
Improve code readability in the completion queue servicing logic by
using an explicit function call for each case statement, rather than
falling through to the next statement. Add a debug print in the ring
allocation path. Fix a typo in the PCI ROM entry.
Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
The regparm function attribute is meaningful only for i386, not for
x86_64, and is reported as a build error by GCC 16.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The EFI device path settings are currently registered as the
"netX.dhcp" settings block, in order that they will be automatically
overridden if a real DHCP configuration takes place. This does not
work as expected in an IPv6-only network, since the IPv6 configurator
will register "netX.ndp" rather than "netX.dhcp".
Fix by registering the EFI device path settings as either "netX.dhcp"
or "netX.ndp" based on the first address family encountered within the
device path.
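The selection rule might be sketched as follows. The node type enumeration and function name are hypothetical simplifications of a parsed device path, and the fallback to "dhcp" when no address family is present is an assumption rather than something stated above:

```c
#include <stddef.h>

/* Hypothetical, simplified view of the node types in a device path */
enum path_node {
    NODE_MAC,
    NODE_IPV4,
    NODE_IPV6,
    NODE_OTHER,
};

/*
 * Choose the settings block name from the first address family
 * encountered within the device path.
 */
static const char * path_settings_name ( const enum path_node *nodes,
                                         size_t count ) {
    size_t i;

    for ( i = 0 ; i < count ; i++ ) {
        if ( nodes[i] == NODE_IPV4 )
            return "dhcp";
        if ( nodes[i] == NODE_IPV6 )
            return "ndp";
    }
    /* Assumption: fall back to "dhcp" if no address family is found */
    return "dhcp";
}
```

A path containing an IPv6 node before any IPv4 node therefore yields "ndp", which is the block that the IPv6 configurator would later override.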
Signed-off-by: Michael Brown <mcb30@ipxe.org>
RFC 5246 defines the signature_algorithm extension values for TLS
version 1.2 as being tuples of {HashAlgorithm, SignatureAlgorithm}
pairs. RFC 8446 redefines the signature_algorithm extension values
for TLS version 1.3 in a backwards-compatible way as opaque 16-bit
SignatureScheme values, and RFC 8447 updates RFC 5246 to allow these
values to be used with TLS version 1.2.
Redefine our concept of a signature algorithm identifier to remove the
internal structure that no longer exists.
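The backwards compatibility can be seen in a small sketch. The structure and macro names below are hypothetical; the numeric values are the code points assigned in RFC 5246 and RFC 8446:

```c
#include <stdint.h>

/* TLS 1.2 view (RFC 5246): a pair of single-byte identifiers */
struct tls12_sig_hash {
    uint8_t hash;       /* e.g. sha256 = 4 */
    uint8_t signature;  /* e.g. rsa = 1 */
};

/* TLS 1.3 view (RFC 8446): opaque 16-bit SignatureScheme values */
#define SIG_RSA_PKCS1_SHA256    0x0401
#define SIG_RSA_PSS_RSAE_SHA256 0x0804

/* Combine a TLS 1.2 pair into the equivalent 16-bit value */
static uint16_t sigscheme_from_pair ( struct tls12_sig_hash pair ) {
    return ( ( uint16_t ) ( ( pair.hash << 8 ) | pair.signature ) );
}
```

The pair {sha256, rsa} maps to the same value as rsa_pkcs1_sha256, whereas a value such as rsa_pss_rsae_sha256 has no meaningful decomposition into an RFC 5246 hash and signature pair, which is why treating the identifier as a single opaque value is the simpler model.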
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The null crypto algorithms are intended to do nothing: the null digest
algorithm accepts all input and generates a zero-length digest, and
the null cipher algorithm simply copies the input unmodified to the
output.
The null public-key algorithm currently does nothing successfully.
Unlike the null digest and cipher algorithms, the null public-key
algorithm's methods are never called.
Change the null public-key algorithm to fail all operations, thereby
allowing its methods to be used as stubs by algorithms such as ECDSA
that do not implement all of the possible public-key operations.
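A rough sketch of how a failing null method can serve as a stub. The operations table is heavily simplified, all names are hypothetical, and the choice of -ENOTSUP as the error code is an assumption:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified public-key operations table */
struct pubkey_ops {
    int ( * encrypt ) ( const void *key, const void *in, size_t len );
    int ( * verify ) ( const void *key, const void *sig, size_t len );
};

/* Null operation: fail unconditionally (error code is an assumption) */
static int null_encrypt ( const void *key, const void *in, size_t len ) {
    ( void ) key;
    ( void ) in;
    ( void ) len;
    return -ENOTSUP;
}

/* Illustrative stand-in for a real verification routine */
static int demo_verify ( const void *key, const void *sig, size_t len ) {
    ( void ) key;
    ( void ) sig;
    return ( ( len > 0 ) ? 0 : -EINVAL );
}

/* A signature-only algorithm reuses the null stub for encryption */
static const struct pubkey_ops demo_sigonly_ops = {
    .encrypt = null_encrypt,
    .verify = demo_verify,
};
```

Callers that attempt an unsupported operation through such a table then receive an error instead of a silent success.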
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The != operator has higher precedence than = in C, so expressions of
the form:
    rc = imgacquire ( ..., image ) != 0
are parsed as:
    rc = ( imgacquire ( ..., image ) != 0 )
This assigns the boolean result (0 or 1) to rc instead of the actual
return code from imgacquire(). As a result, strerror(rc) reports an
incorrect error message when debugging is enabled.
Add parentheses around each assignment to ensure rc captures the
actual return value, matching the pattern already used in
efi_autoexec_filesystem() within the same file.
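The difference can be reproduced in isolation. In this sketch, acquire_stub() is a hypothetical stand-in for imgacquire() that always fails with -ENOENT:

```c
#include <errno.h>

/* Hypothetical stand-in for imgacquire(): always fails with -ENOENT */
static int acquire_stub ( void ) {
    return -ENOENT;
}

/* Buggy form: "!=" binds more tightly than "=", so rc holds 0 or 1 */
static int rc_without_parens ( void ) {
    int rc;

    rc = acquire_stub() != 0;
    return rc;
}

/* Fixed form: the assignment is parenthesised, so rc holds the
 * actual return value and the comparison applies to that value.
 */
static int rc_with_parens ( void ) {
    int rc;

    if ( ( rc = acquire_stub() ) != 0 )
        return rc;
    return 0;
}
```

Here rc_without_parens() returns 1, while rc_with_parens() returns -ENOENT, which is the value that strerror() needs in order to report the real error.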
Modified-by: Michael Brown <mcb30@ipxe.org>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for the HMAC-based Extract-and-Expand Key Derivation
Function (HKDF) as used in TLS version 1.3 and defined in RFC 5869.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 988243c ("[virtio] Add virtio-net 1.0 support") erroneously
placed the code to unmap the device regions before the code to
unregister the network device. In the common case that the network
device is still open at the time that we shut down to boot the OS,
this results in the regions being accessed after having been unmapped.
For 32-bit BIOS or for UEFI with no IOMMU enabled, the iounmap()
operation is a no-op and so the driver still happens to work despite
the ordering bug. For 64-bit BIOS or for UEFI with an IOMMU enabled,
the iounmap() operation is not a no-op, and the driver will trigger a
page fault.
Fix by moving the call to unregister_netdev() to before the code that
unmaps the device regions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The unused RX I/O buffers are currently freed without being deleted
from the list, with the list head being reinitialised only after all
buffers have been deleted. This triggers assertion failures due to
the list integrity checks when debugging is enabled.
Fix by deleting each buffer individually, so that the list structure
remains valid at all times.
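The corrected pattern, sketched with a minimal stand-alone doubly linked list. The list helpers and the rx_buffer structure are illustrative and are not the driver's own types:

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal doubly linked list; all names here are illustrative */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Illustrative receive buffer; the list node must be the first member */
struct rx_buffer {
    struct list_head list;
    size_t len;
};

static void list_init ( struct list_head *head ) {
    head->next = head;
    head->prev = head;
}

static void list_add_tail ( struct list_head *entry,
                            struct list_head *head ) {
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

static void list_del ( struct list_head *entry ) {
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
}

static int list_empty ( const struct list_head *head ) {
    return ( head->next == head );
}

/* Free all unused buffers, unlinking each one before freeing it so
 * that the list never contains a pointer to freed memory.
 */
static unsigned int free_rx_buffers ( struct list_head *head ) {
    unsigned int count = 0;

    while ( ! list_empty ( head ) ) {
        /* Valid because "list" is the first member of rx_buffer */
        struct rx_buffer *buf = ( struct rx_buffer * ) head->next;

        list_del ( &buf->list );
        free ( buf );
        count++;
    }
    return count;
}
```

Because every buffer is unlinked before it is freed, the list head is already empty and self-consistent when the loop ends, and no separate reinitialisation is needed.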
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit b9d68b9 ("[ethernet] Use standard 1500 byte MTU unless
explicitly overridden") added code to explicitly set the MTU for
virtio-net devices, but only on the legacy probe path.
Make the behaviour consistent by setting the MTU on both legacy and
modern probe paths.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a workflow to build and import the official iPXE images for
Alibaba Cloud. As with the AWS and Google Cloud imports, treat this
as a workflow that must be triggered manually.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for a disk log partition console, using the same on-disk
structures as for the BIOS INT13 console.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Split out the generic portions of the INT13 disk log console support
to a separate file that can be shared between BIOS and UEFI platforms.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The name "int13" is intrinsically specific to a BIOS environment.
Generalise the build configuration option CONSOLE_INT13 to
CONSOLE_DISKLOG, in preparation for adding EFI disk log console
support.
Existing configurations using CONSOLE_INT13 will continue to work.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The workaround used for UEFI in commit 926816c ("[efi] Pad transmit
buffer length to work around vendor driver bugs") is also applicable
to the BIOS UNDI driver.
Apply the same workaround of padding the transmit I/O buffers to the
minimum Ethernet frame length before passing them to the underlying
UNDI driver's transmit function.
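A sketch of the padding step on a plain byte buffer. The function name and the buffer handling are hypothetical simplifications; 60 bytes is the minimum Ethernet frame length excluding the frame check sequence:

```c
#include <stddef.h>
#include <string.h>

/* Minimum Ethernet frame length, excluding the frame check sequence */
#define ETH_ZLEN 60

/*
 * Zero-pad a frame up to the minimum Ethernet frame length.
 *
 * Returns the (possibly increased) frame length, or 0 if the buffer
 * has insufficient space for the padding.
 */
static size_t pad_frame ( unsigned char *buf, size_t len,
                          size_t max_len ) {
    if ( len >= ETH_ZLEN )
        return len;
    if ( max_len < ETH_ZLEN )
        return 0;
    memset ( ( buf + len ), 0, ( ETH_ZLEN - len ) );
    return ETH_ZLEN;
}
```

A 42-byte ARP frame, for example, would be extended to 60 bytes with zero padding before being handed to the underlying driver, while frames that are already long enough are passed through unchanged.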
Reported-by: Alexander Patrakov <patrakov@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit cb95b5b ("[efi] Veto the Dhcp6Dxe driver on all platforms")
vetoed the Dhcp6Dxe driver to work around the bug described at
https://github.com/tianocore/edk2/issues/10506 that results in
EfiDhcp6Stop() getting stuck in a tight loop waiting for an event that
will never occur.
Since we now call UnloadImage() at TPL_APPLICATION, we no longer
trigger the bug in Dhcp6Dxe, and so the veto may be removed.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
As of commit c3376f8 ("[efi] Drop to external TPL for calls to
ConnectController()"), the veto mechanism will drop to TPL_APPLICATION
for calls to DisconnectController().
Match this behaviour for calls to UnloadImage(), since that is likely
to result in calls to DisconnectController(). For example, any EDK2
driver using NetLibDefaultUnload() as its unload handler will call
DisconnectController() to disconnect itself from all handles.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
On Ubuntu/Debian, syslinux-common installs mbr.bin to
/usr/lib/syslinux/mbr/mbr.bin. This path is not currently searched by
find_syslinux_file(), causing USB disk image generation to fail with
"could not find mbr.bin".
Add /usr/lib/syslinux/mbr, /usr/share/syslinux/mbr, and
/usr/local/share/syslinux/mbr to the search paths.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For UEFI, the USB disk image is constructed from the built EFI binary
(e.g. bin-x86_64-efi/ipxe.efi) by genfsimg, which does not itself have
any way to access the build configuration. We therefore need a way to
annotate the binary such that genfsimg can determine whether or not to
include a log partition within the USB disk image.
The "OEM ID" and "OEM information" fields within the PE header can be
used for this, since they are easily accessed and serve no other
purpose. We define bit 0 of "OEM information" as a flag indicating
that a log partition should be included. If this bit is set, genfsimg
will create a log partition with a layout matching that of the BIOS
build (i.e. using partition 3 and at an offset of 16kB from the start
of the disk).
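As a sketch of how a consumer might test this flag, assuming that the fields in question are the e_oemid and e_oeminfo fields of the MZ (DOS) header at the start of a PE file, at offsets 0x24 and 0x26 respectively. The function and macro names are hypothetical, and any check of the OEM ID value is omitted since its expected value is not stated here:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed field location: e_oeminfo within the MZ (DOS) header */
#define DOS_OEMINFO_OFFSET 0x26

/* Bit 0 of "OEM information": a log partition is requested */
#define OEMINFO_LOG_PARTITION 0x0001

/*
 * Check whether a PE image requests a log partition.
 *
 * Returns 1 if the flag is set, or 0 if it is clear or if the
 * buffer does not look like the start of an MZ/PE image.
 */
static int wants_log_partition ( const uint8_t *image, size_t len ) {
    uint16_t oeminfo;

    if ( len < ( DOS_OEMINFO_OFFSET + 2 ) )
        return 0;
    if ( ( image[0] != 'M' ) || ( image[1] != 'Z' ) )
        return 0;

    /* Header fields are little-endian */
    oeminfo = ( ( uint16_t ) ( image[DOS_OEMINFO_OFFSET] |
                               ( image[DOS_OEMINFO_OFFSET + 1] << 8 ) ) );
    return ( ( oeminfo & OEMINFO_LOG_PARTITION ) ? 1 : 0 );
}
```

Reading a fixed-offset field in this way requires no knowledge of the build configuration, which is the property that the annotation is intended to provide.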
The PE header is constructed by elf2efi.c, which takes as an input the
linked ELF form of the binary. We use an ELF .note section to allow
any linked-in object to communicate the log partition request through
to elf2efi.c, which then populates the OEM information field
accordingly.
We choose to use the same field locations within the BIOS bzImage
header, since this allows genfsimg to use the same logic for both BIOS
and UEFI binaries. In a BIOS build, there is no external processing
equivalent to elf2efi.c, and so we construct the field value directly
using absolute symbols and explicit relocation records.
(Note that the bzImage header is relevant only when using genfsimg to
construct a combined BIOS/UEFI image. In the common case of building
a BIOS-only image such as bin/ipxe.usb, the partition table is
manually constructed by usbdisk.S and genfsimg is not involved.)
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The syslinux function check_fat_bootsect() performs some sanity checks
to ensure that the filesystem type string (e.g. "FAT12") is correct
for the total number of clusters in the FAT. There is unfortunately a
bug in its calculation of the number of sectors occupied by the root
directory, which causes it to underestimate the number of sectors by a
factor of 32.
When the total number of clusters is close to the FAT12 limit of 4096,
this bug can cause syslinux to erroneously report that the filesystem
has "more than 4084 clusters but claims FAT12".
Work around this bug by selecting an explicit cluster size in order to
avoid potentially problematic cluster counts. We default to using 4kB
clusters, doubling to 8kB if using 4kB would result in a total cluster
count near 4096 (the FAT12 limit) or near 65536 (the FAT16 limit).
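The arithmetic can be sketched in C (genfsimg itself is a shell script, so this is purely illustrative). Each FAT directory entry occupies 32 bytes, which is the factor that the buggy calculation omits. The width of the "near the limit" window below is a hypothetical value chosen for the example, as are all function names:

```c
#include <stdint.h>

/* Size of a FAT directory entry in bytes */
#define FAT_DIRENT_SIZE 32

/* Hypothetical width of the window treated as "near" a limit */
#define CLUSTER_MARGIN 64

/* Number of sectors occupied by the root directory, rounded up */
static uint32_t root_dir_sectors ( uint32_t entries,
                                   uint32_t sector_size ) {
    return ( ( ( entries * FAT_DIRENT_SIZE ) + sector_size - 1 ) /
             sector_size );
}

/* Check whether a cluster count lies within the margin of a limit */
static int cluster_count_near ( uint32_t clusters, uint32_t limit ) {
    return ( ( ( clusters + CLUSTER_MARGIN ) >= limit ) &&
             ( clusters <= ( limit + CLUSTER_MARGIN ) ) );
}

/*
 * Choose the cluster size in 512-byte sectors: default to 4kB (8
 * sectors), doubling to 8kB (16 sectors) if the resulting cluster
 * count would be near the FAT12 or FAT16 limit.
 */
static uint32_t choose_cluster_sectors ( uint32_t data_sectors ) {
    uint32_t sectors_per_cluster = 8;
    uint32_t clusters = ( data_sectors / sectors_per_cluster );

    if ( cluster_count_near ( clusters, 4096 ) ||
         cluster_count_near ( clusters, 65536 ) )
        sectors_per_cluster = 16;
    return sectors_per_cluster;
}
```

For a root directory of 512 entries with 512-byte sectors, the correct size is 32 sectors, whereas a calculation that omits the entry size arrives at a single sector. The difference is only a few clusters, which matters only when the cluster count is already close to a limit.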
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The calculations around the FAT filesystem layout currently use a
mixture of kilobytes and sector counts. Switch to using sector counts
throughout the calculation, to make the code easier to read.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The USB disk image constructed by util/genfsimg is currently a raw FAT
filesystem, with no containing partition. This makes it incompatible
with the use of CONSOLE_INT13, since there is no way to add a
dedicated log partition without a partition table.
Add a partition table when building a non-ISO image, using the mbr.bin
provided by syslinux (since we are already using syslinux to invoke
the ipxe.lkrn within the FAT filesystem).
The BIOS .usb targets are built using a manually constructed partition
table with C/H/S geometry x/64/32. Match this geometry to minimise
the differences between genfsimg and non-genfsimg USB disk images.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We use mformat to ensure that the FAT filesystem starts as empty.
However, formatting the filesystem can still leave old data blocks
present (though unreferenced) within the disk image.
Truncate the image to a zero length before extending, to ensure that
no stale content is retained.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a condition to invoke the short command logic when firmware
indicates that it is required. Replace the 100ms delay with wmb() to
ensure that the DMA buffer is ready when the short command is invoked.
Signed-off-by: Joseph Wong <joseph.wong@broadcom.com>
The undionly.kpxe binary does not need the full PCI bus support.
However, the overwhelming majority of UNDI devices are PCI-based and
we already end up dragging in PCI configuration space support in order
to be able to test for devices with broken interrupts.
Dragging in the PCI configuration allows the PCI settings mechanism to
also be present, which is often useful for end users. The total cost
is around 200 bytes in the final binary, which is acceptable for a
generally very useful feature.
Users wanting to minimise the binary size can choose to explicitly
disable PCI_SETTINGS via config/settings.h.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Since October 2025, the Microsoft UEFI Signing Requirements have
included a clause stating that "submissions must contain a valid
signed SPDX SBOM in a custom '.sbom' PE section". A list of required
fields is provided, and a link is given to "the Microsoft SBOM tool to
aid SBOM generation". So far, so promising.
The Microsoft SBOM tool has no support for handling a .sbom PE
section. There is no published document that specifies what is
supposed to appear within this PE section. An educated guess is that
it should probably contain the raw JSON data in the same format that
the Microsoft SBOM tool produces.
The list of required fields does not map to identifiable fields within
the JSON. In particular:
- "file name / software"
This might be the top-level "name" field. It's hard to tell. The
SPDX SBOM specification is not particularly informative either: the
only definition it appears to give for "name" is "This field
identifies the name of an Element as designated by the creator",
which is a spectacularly useless definition.
- "software version / component generation (shim)"
This may refer to the "packages[].versionInfo" field. There is no
obvious relevance for the words "component", "generation", or
"shim". The proximity of "generation" and "shim" suggests that this
might be related in some way to the SBAT security generation, which
is absolutely not the same thing as the software version.
- "vendor / company name (this must exactly match the verified company
name in the submitter's EV certificate on the Microsoft HDC partner
center account)"
This is clearly written as though it has some significance for the
UEFI signing submission process. Unfortunately there is no obvious
map to any defined SBOM field. An educated guess is that this might
be referring to "packages[].supplier", since experiments show that
the Microsoft SBOM tool will fail validation unless this field is
present.
- "product-name"
This might also be the top-level "name" field. There is no
indication given as to how this might differ from "file name /
software".
- "OEM Name" and "OEM ID"
These seem to be terms made up on the spur of the moment. The
three-letter sequence "OEM" does not appear anywhere within the
codebase of the Microsoft SBOM tool.
In the absence of any meaningful specification, we choose not to
engage in good faith with this requirement. Instead, we construct a
best guess at the contents of a .sbom section that has some chance of
being accepted by the UEFI signing submission process. We assume that
anything that passes "sbom-tool validate" will probably be accepted,
with the only actual check being that the supplier name must match the
registered EV code signing certificate.
To anyone who actually cares about the arguably valuable benefits of
having a software bill of materials: please stop creating junk
requirements. If you want people to actually make the effort to
produce useful SBOM data, then make it clear what data you want.
Provide unambiguous specifications. Provide example files. Provide
tools that actually do the job they are claimed to do. Don't just
throw out another piece of "MUST HAS THING BECAUSE IS MORE SECURITY"
garbage and call it a day.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 19dffdc ("[efi] Allow for creating devices with no EFI parent
device") relaxed the restriction on attempting to create SNP devices
when no EFI parent device is available, with the result that the test
network devices created when running the IPv4 tests are now registered
as SNP devices.
Since the dummy EFI parent device path is fixed and the test network
device MAC addresses are empty, the SNP devices end up with identical
constructed device paths and registration of the second and subsequent
devices will fail since device paths must be unique.
Fix by assigning MAC addresses to the test network devices.
Reported-by: Miao Wang <shankerwangmiao@gmail.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Most TPL manipulation is handled by efi_raise_tpl()/efi_restore_tpl()
pairs. The exceptions are the places where we need to temporarily
drop to a lower TPL in order to allow a timer interrupt to occur.
These currently assume that they are called only from code that is
already running at the internal TPL (generally TPL_CALLBACK). This
assumption is not always correct. In particular, the call from
_efi_start() to efi_driver_reconnect_all() takes place after the SNP
devices have been released and so will be running at the external TPL.
Create an efi_drop_tpl()/efi_undrop_tpl() pair to abstract away the
temporary lowering of the TPL, and ensure that the TPL is always
raised back to its original level rather than being unconditionally
raised to the internal TPL.
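The intended behaviour can be modelled without any firmware by simulating the current TPL in a variable. Everything below is a hypothetical model: the function and structure names are invented for the sketch, and only the numeric TPL values are the standard UEFI ones:

```c
/* Standard UEFI task priority levels */
#define TPL_APPLICATION 4
#define TPL_CALLBACK 8

/* Simulated current TPL (real code would use RaiseTPL/RestoreTPL) */
static unsigned long current_tpl = TPL_APPLICATION;

/* Hypothetical record of the TPL in effect before a temporary drop */
struct saved_tpl {
    unsigned long original;
};

/* Temporarily drop to a lower TPL, remembering the original level */
static void drop_tpl ( struct saved_tpl *saved, unsigned long lower ) {
    saved->original = current_tpl;
    if ( lower < current_tpl )
        current_tpl = lower;
}

/* Return to whatever TPL was in effect before the drop, rather than
 * unconditionally raising to the internal TPL.
 */
static void undrop_tpl ( const struct saved_tpl *saved ) {
    current_tpl = saved->original;
}
```

A caller running at TPL_CALLBACK is returned to TPL_CALLBACK, while a caller that was already running at TPL_APPLICATION stays there after the pair completes.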
Signed-off-by: Michael Brown <mcb30@ipxe.org>