Almost all image consumers do not need to modify the content of the
image. Now that the image data is a pointer type (rather than the
opaque userptr_t type), we can rely on the compiler to enforce this at
build time.
Change the .data field to be a const pointer, so that the compiler can
verify that image consumers do not modify the image content. Provide
a transparent .rwdata field for consumers who have a legitimate (and
now explicit) reason to modify the image content.
We do not attempt to impose any runtime restriction on checking
whether or not an image is writable. The only existing instances of
genuinely read-only images are the various unit test images, and it is
acceptable for defective test cases to result in a segfault rather
than a runtime error.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Not all images are allocated via alloc_image(). For example: embedded
images, the static images created to hold a runtime command line, and
the images used by unit tests are all static structures.
Using image_set_cmdline() (via e.g. the "imgargs" command) to set the
command-line arguments of a static image will succeed but will leak
memory, since nothing will ever free the allocated command line.
There are no code paths that can lead to calling image_set_len() on a
static image, but there is no safety check against future code paths
attempting this.
Define a flag IMAGE_STATIC to mark an image as statically allocated,
generalise free_image() to also handle freeing dynamically allocated
portions of static images (such as the command line), and expose
free_image() for use by static images.
Define a related flag IMAGE_STATIC_NAME to mark the name as statically
allocated. Allow a statically allocated name to be replaced with a
dynamically allocated name since this is a potentially valid use case
(e.g. if "imgdecrypt --name <name>" is used on an embedded image).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The BOFM tests are not part of the standard unit test suite, since
they are designed to allow for exercising real BOFM driver code
outside of the context of a real IBM blade server.
Allow for the BOFM tests to be run without a real BOFM driver, by
providing a dummy driver for the specified PCI test device.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Use standard void pointers for umalloc(), urealloc(), and ufree(),
with the "u" prefix retained to indicate that these allocations are
made from external ("user") memory rather than from the internal heap.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the ACPI table parsing code by assuming that all table
content is fully accessible via pointer dereferences.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the deflate, zlib, and gzip decompression code by assuming
that all content is fully accessible via pointer dereferences.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The memcpy_user(), memmove_user(), memcmp_user(), memset_user(), and
strlen_user() functions are now just straightforward wrappers around
the corresponding standard library functions.
Remove these redundant wrappers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add fdt_cells() to read scalar values encoded within a cell array,
reimplement fdt_u64() as a wrapper around this, and add fdt_u32() for
completeness.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Refactor device tree traversal to operate on the basis of describing
the token at a given offset, with no separate notion of a device tree
cursor.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow for the possibility of creating empty directories (without
having to include a dummy file inside the directory) using a
zero-length image and a CPIO filename with a trailing slash, such as:
initrd emptyfile /usr/share/oem/
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Commit 12ea8c4 ("[cpio] Allow for construction of parent directories
as needed") introduced a regression in constructing CPIO archive
headers for relative paths (e.g. simple filenames with no leading
slash).
Fix by counting the number of path components rather than the number
of path separators, and add some test cases to cover CPIO header
construction.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for the EFI signature list image format (as produced by
tools such as efisecdb).
The parsing code does not require any EFI boot services functions and
so may be enabled even in non-EFI builds. We default to enabling it
only for EFI builds.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The only remaining use case for direct reduction (outside of the unit
tests) is in calculating the constant R^2 mod N used during Montgomery
multiplication.
The current implementation of direct reduction requires a writable
copy of the modulus (to allow for shifting), and both the modulus and
the result buffer must be padded to be large enough to hold (R^2 - N),
which is twice the size of the actual values involved.
For the special case of reducing R^2 mod N (or any power of two mod
N), we can run the same algorithm without needing either a writable
copy of the modulus or a padded result buffer. The working state
required is only two bits larger than the result buffer, and these
additional bits may be held in local variables instead.
Rewrite bigint_reduce() to handle only this use case, and remove the
no longer necessary uses of double-sized big integers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Classic Montgomery reduction involves a single conditional subtraction
to ensure that the result is strictly less than the modulus.
When performing chains of Montgomery multiplications (potentially
interspersed with additions and subtractions), it can be useful to
work with values that are stored modulo some small multiple of the
modulus, thereby allowing some reductions to be elided. Each addition
and subtraction stage will increase this running multiple, and the
following multiplication stages can be used to reduce the running
multiple since the reduction carried out for multiplication products
is generally strong enough to absorb some additional bits in the
inputs. This approach is already used in the x25519 code, where
multiplication takes two 258-bit inputs and produces a 257-bit output.
Split out the conditional subtraction from bigint_montgomery() and
provide a separate bigint_montgomery_relaxed() for callers who do not
require immediate reduction to within the range of the modulus.
Modular exponentiation could potentially make use of relaxed
Montgomery multiplication, but this would require R>4N, i.e. that the
two most significant bits of the modulus be zero. For both RSA and
DHE, this would necessitate extending the modulus size by one element,
which would negate any speed increase from omitting the conditional
subtractions. We therefore retain the use of classic Montgomery
reduction for modular exponentiation, apart from the final conversion
out of Montgomery form.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reduce the number of parameters passed to bigint_montgomery() by
calculating the inverse of the modulus modulo the element size on
demand. Cache the result, since Montgomery reduction will be used
repeatedly with the same modulus value.
In all currently supported algorithms, the modulus is a public value
(or a fixed value defined by specification) and so this non-constant
timing does not leak any private information.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is no further need for a standalone modular multiplication
primitive, since the only consumer is modular exponentiation (which
now uses Montgomery multiplication instead).
Remove the now obsolete bigint_mod_multiply().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Speed up modular exponentiation by using Montgomery reduction rather
than direct modular reduction.
Montgomery reduction in base 2^n requires the modulus to be coprime to
2^n, which would limit us to requiring that the modulus is an odd
number. Extend the implementation to include support for
exponentiation with even moduli via Garner's algorithm as described in
"Montgomery reduction with even modulus" (Koç, 1994).
Since almost all use cases for modular exponentation require a large
prime (and hence odd) modulus, the support for even moduli could
potentially be removed in future.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Montgomery reduction is substantially faster than direct reduction,
and is better suited for modular exponentiation operations.
Add bigint_montgomery() to perform the Montgomery reduction operation
(often referred to as "REDC"), along with some test vectors.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Montgomery reduction requires only the least significant element of an
inverse modulo 2^k, which in turn depends upon only the least
significant element of the invertend.
Use the inverse size (rather than the invertend size) as the effective
size for bigint_mod_invert(). This eliminates around 97% of the loop
iterations for a typical 2048-bit RSA modulus.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
With a slight modification to the algorithm to ignore bits of the
residue that can never contribute to the result, it is possible to
reuse the as-yet uncalculated portions of the inverse to hold the
residue. This removes the requirement for additional temporary
working space.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Direct modular reduction is expected to be used in situations where
there is no requirement to retain the original (unreduced) value.
Modify the API for bigint_reduce() to reduce the value in place,
(removing the separate result buffer), impose a constraint that the
modulus and value have the same size, and require the modulus to be
passed in writable memory (to allow for scaling in place). This
removes the requirement for additional temporary working space.
Reverse the order of arguments so that the constant input is first,
to match the usage pattern for bigint_add() et al.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Expose the effective carry (or borrow) out flag from big integer
addition and subtraction, and use this to elide an explicit bit test
when performing x25519 reduction.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Montgomery multiplication requires calculating the inverse of the
modulus modulo a larger power of two.
Add bigint_mod_invert() to calculate the inverse of any (odd) big
integer modulo an arbitrary power of two, using a lightly modified
version of the algorithm presented in "A New Algorithm for Inversion
mod p^k (Koç, 2017)".
The power of two is taken to be 2^k, where k is the number of bits
available in the big integer representation of the invertend. The
inverse modulo any smaller power of two may be obtained simply by
masking off the relevant bits in the inverse.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Faster modular multiplication algorithms such as Montgomery
multiplication will still require the ability to perform a single
direct modular reduction.
Neaten up the implementation of direct reduction and split it out into
a separate bigint_reduce() function, complete with its own unit tests.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The big integer shift operations are misleadingly described as
rotations since the original x86 implementations are essentially
trivial loops around the relevant rotate-through-carry instruction.
The overall operation performed is a shift rather than a rotation.
Update the function names and descriptions to reflect this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An n-bit multiplication product may be added to up to two n-bit
integers without exceeding the range of a (2n)-bit integer:
(2^n - 1)*(2^n - 1) + (2^n - 1) + (2^n - 1) = 2^(2n) - 1
Exploit this to perform big integer multiplication in constant time
without requiring the caller to provide temporary carry space.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Big integer multiplication currently performs immediate carry
propagation from each step of the long multiplication, relying on the
fact that the overall result has a known maximum value to minimise the
number of carries performed without ever needing to explicitly check
against the result buffer size.
This is not a constant-time algorithm, since the number of carries
performed will be a function of the input values. We could make it
constant-time by always continuing to propagate the carry until
reaching the end of the result buffer, but this would introduce a
large number of redundant zero carries.
Require callers of bigint_multiply() to provide a temporary carry
storage buffer, of the same size as the result buffer. This allows
the carry-out from the accumulation of each double-element product to
be accumulated in the temporary carry space, and then added in via a
single call to bigint_add() after the multiplication is complete.
Since the structure of big integer multiplication is identical across
all current CPU architectures, provide a single shared implementation
of bigint_multiply(). The architecture-specific operation then
becomes the multiplication of two big integer elements and the
accumulation of the double-element product.
Note that any intermediate carry arising from accumulating the lower
half of the double-element product may be added to the upper half of
the double-element product without risk of overflow, since the result
of multiplying two n-bit integers can never have all n bits set in its
upper half. This simplifies the carry calculations for architectures
such as RISC-V and LoongArch64 that do not have a carry flag.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For some 32-bit CPUs, we need to provide implementations of 64-bit
shifts as libgcc helper functions. Add test cases to cover these.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise CMS self-test data structure and macro names to refer to
"messages" rather than "signatures", in preparation for adding image
decryption tests.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Instances of cipher and digest algorithms tend to get called
repeatedly to process substantial amounts of data. This is not true
for public-key algorithms, which tend to get called only once or twice
for a given key.
Simplify the public-key algorithm API so that there is no reusable
algorithm context. In particular, this allows callers to omit the
error handling currently required to handle memory allocation (or key
parsing) errors from pubkey_init(), and to omit the cleanup calls to
pubkey_final().
This change does remove the ability for a caller to distinguish
between a verification failure due to a memory allocation failure and
a verification failure due to a bad signature. This difference is not
material in practice: in both cases, for whatever reason, the caller
was unable to verify the signature and so cannot proceed further, and
the cause of the error will be visible to the user via the return
status code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Generalise the existing support for performing RSA public-key
encryption, decryption, signature, and verification tests, and update
the code to use okx() for neater reporting of test results.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Asymmetric keys are invariably encountered within ASN.1 structures
such as X.509 certificates, and the various large integers within an
RSA key are themselves encoded using ASN.1.
Simplify all code handling asymmetric keys by passing keys as a single
ASN.1 cursor, rather than separate data and length pointers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is some exploitable similarity between the data structures used
for representing CMS signatures and CMS encryption keys. In both
cases, the CMS message fundamentally encodes a list of participants
(either message signers or message recipients), where each participant
has an associated certificate and an opaque octet string representing
the signature or encrypted cipher key. The ASN.1 structures are not
identical, but are sufficiently similar to be worth exploiting: for
example, the SignerIdentifier and RecipientIdentifier data structures
are defined identically.
Rename data structures and functions, and add the concept of a CMS
message type.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The cms_signature() and cms_verify() functions currently accept raw
data pointers. This will not be possible for cms_decrypt(), which
will need the ability to extract fragments of ASN.1 data from a
potentially large image.
Change cms_signature() and cms_verify() to accept an image as an input
parameter, and move the responsibility for setting the image trust
flag within cms_verify() since that now becomes a more natural fit.
Signed-off-by: Michael Brown <mcb30@ipxe.org>