Simplify the deflate, zlib, and gzip decompression code by assuming
that all content is fully accessible via pointer dereferences.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the CMS code by assuming that all content is fully accessible
via pointer dereferences. This avoids the need to use fragment loops
for calculating digests and decrypting (or reencrypting) data.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Simplify the ASN.1 code by assuming that all objects are fully
accessible via pointer dereferences. This allows the concept of
"additional data beyond the end of the cursor" to be removed, and
simplifies parsing of all ASN.1 image formats.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The memcpy_user(), memmove_user(), memcmp_user(), memset_user(), and
strlen_user() functions are now just straightforward wrappers around
the corresponding standard library functions.
Remove these redundant wrappers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
We currently disable all external trust sources (such as the UEFI
TlsCaCertificate variable) if an explicit TRUST=... parameter is
provided on the build command line.
Define an explicit TRUST_EXT build parameter that can be used to
explicitly disable external trust sources even if no TRUST=...
parameter is provided, or to explicitly enable external trust sources
even if an explicit TRUST=... parameter is provided. For example:
  # Default trusted root certificate, disable external sources
  make TRUST_EXT=0
  # Explicit trusted root certificate, enable external sources
  make TRUST=custom.crt TRUST_EXT=1
If no TRUST_EXT parameter is specified, then continue to default to
disabling external trust sources if an explicit TRUST=... parameter is
provided, to maintain backwards compatibility with existing build
command lines.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The allocation of memory for the certificate chain link may cause the
certificate itself to be freed by the cache discarder, if the only
current reference to the certificate is held by the certificate store
and the system runs out of memory during the call to malloc().
Ensure that this cannot happen by taking out a temporary additional
reference to the certificate within x509_append(), rather than
requiring the caller to do so.
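The hazard and the fix can be illustrated with a toy reference-counting
model. Everything in this sketch is hypothetical (the toy_ names, the
single-slot store, and an allocator that always triggers the discarder):
it mirrors only the ordering described above and is not iPXE's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted certificate (hypothetical, not iPXE's structure) */
struct toy_cert {
	int refcnt;
	int freed;
};

/* Toy chain link holding its own reference to a certificate */
struct toy_link {
	struct toy_cert *cert;
};

/* Single-slot "certificate store" holding one reference */
struct toy_cert *toy_store;

void toy_get ( struct toy_cert *cert ) {
	cert->refcnt++;
}

void toy_put ( struct toy_cert *cert ) {
	assert ( cert->refcnt > 0 );
	if ( --cert->refcnt == 0 )
		cert->freed = 1; /* stand-in for actually freeing */
}

/* Allocator modelling memory pressure: the cache discarder runs
 * first and drops the store's reference.
 */
void * toy_alloc ( size_t len ) {
	if ( toy_store ) {
		toy_put ( toy_store );
		toy_store = NULL;
	}
	return malloc ( len );
}

/* Append a certificate, protecting it across the allocation */
struct toy_link * toy_append ( struct toy_cert *cert ) {
	struct toy_link *link;

	toy_get ( cert );		/* temporary reference */
	link = toy_alloc ( sizeof ( *link ) );
	if ( link ) {
		toy_get ( cert );	/* reference held by the link */
		link->cert = cert;
	}
	toy_put ( cert );		/* drop temporary reference */
	return link;
}
```

Without the temporary reference, the toy_put() inside toy_alloc() would
drop the count to zero before the link could take its own reference.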
Signed-off-by: Michael Brown <mcb30@ipxe.org>
UEFI's built-in HTTPS boot mechanism requires the trusted CA
certificates to be provided via the TlsCaCertificates variable.
(There is no equivalent of the iPXE cross-signing mechanism, so it is
not possible for UEFI to automatically use public CA certificates.)
Users who have configured UEFI HTTPS boot to use a custom root of
trust (e.g. a private CA certificate) may find it useful to have iPXE
automatically pick up and use this same root of trust, so that iPXE
can seamlessly fetch files via HTTPS from the same servers that were
trusted by UEFI HTTPS boot, in addition to servers that iPXE can
validate through other means such as cross-signed certificates.
Parse the TlsCaCertificates variable at startup, add any certificates
to the certificate store, and mark these certificates as trusted.
There are no access restrictions on modifying the TlsCaCertificates
variable: anybody with access to write UEFI variables is permitted to
change the root of trust. The UEFI security model assumes that anyone
able to run code prior to ExitBootServices(), or to modify UEFI
variables from within a loaded operating system, is entitled to change
the system's root of trust for TLS.
Any certificates parsed from TlsCaCertificates will show up in the
output of "certstat", and may be discarded using "certfree" if
unwanted.
Support for parsing TlsCaCertificates is enabled by default in EFI
builds, but may be disabled in config/general.h if needed.
As with the ${trust} setting, the contents of the TlsCaCertificates
variable will be ignored if iPXE has been compiled with an explicit
root of trust by specifying TRUST=... on the build command line.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The ANS X9.82 specification implicitly assumes that the RBG_Startup
function will be called before it is needed, and includes checks to
make sure that Generate_function fails if this has not happened.
However, there is no well-defined point at which the RBG_Startup
function is to be called: it's just assumed that this happens as part
of system startup.
We currently call RBG_Startup to instantiate the DRBG as an iPXE
startup function, with the corresponding shutdown function
uninstantiating the DRBG. This works for most use cases, and avoids
an otherwise unexpected user-visible delay when a caller first
attempts to use the DRBG (e.g. by attempting an HTTPS download).
The download of autoexec.ipxe for UEFI is triggered by the EFI root
bus probe in efi_probe(). Both the root bus probe and the RBG startup
function run at STARTUP_NORMAL, so there is no defined ordering
between them. If the base URI for autoexec.ipxe uses HTTPS, then this
may cause random bits to be requested before the RBG has been started.
Extend the logic in rbg_generate() to automatically start up the RBG
if startup has not already been attempted. If startup fails
(e.g. because the entropy source is broken), then do not automatically
retry since this could result in extremely long delays waiting for
entropy that will never arrive.
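A minimal sketch of the "start on first use, never retry" logic is
shown below. The toy_ names, the three-state enum, and the placeholder
output are assumptions made for illustration and do not reflect the
actual rbg_generate() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical startup states */
enum toy_rbg_state {
	TOY_RBG_UNTRIED = 0,	/* startup not yet attempted */
	TOY_RBG_STARTED,	/* startup succeeded */
	TOY_RBG_FAILED,		/* startup failed: do not retry */
};

struct toy_rbg {
	enum toy_rbg_state state;
	int ( * startup ) ( void );	/* returns 0 on success */
	unsigned int attempts;		/* number of startup attempts */
};

/* Generate bits, starting the generator automatically if needed */
int toy_rbg_generate ( struct toy_rbg *rbg, void *data, size_t len ) {

	assert ( rbg->startup != NULL );

	/* Attempt startup only if it has never been attempted */
	if ( rbg->state == TOY_RBG_UNTRIED ) {
		rbg->attempts++;
		rbg->state = ( ( rbg->startup() == 0 ) ?
			       TOY_RBG_STARTED : TOY_RBG_FAILED );
	}

	/* A failed startup is sticky */
	if ( rbg->state != TOY_RBG_STARTED )
		return -1;

	/* Placeholder output only: this is NOT random data */
	memset ( data, 0, len );
	return 0;
}

/* Example startup functions */
int toy_startup_ok ( void ) {
	return 0;
}

int toy_startup_broken ( void ) {
	return -1;
}
```

Recording the failure as a distinct state is what prevents a broken
entropy source from being polled again on every request.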
Reported-by: Michael Niehaus <niehaus@live.com>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The only remaining use case for direct reduction (outside of the unit
tests) is in calculating the constant R^2 mod N used during Montgomery
multiplication.
The current implementation of direct reduction requires a writable
copy of the modulus (to allow for shifting), and both the modulus and
the result buffer must be padded to be large enough to hold (R^2 - N),
which is twice the size of the actual values involved.
For the special case of reducing R^2 mod N (or any power of two mod
N), we can run the same algorithm without needing either a writable
copy of the modulus or a padded result buffer. The working state
required is only two bits larger than the result buffer, and these
additional bits may be held in local variables instead.
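As a rough illustration of why a power of two can be reduced without a
padded buffer, the following sketch computes 2^k mod N for a single
64-bit element by repeated doubling. It is a simpler (and not
constant-time) variant rather than the algorithm used by
bigint_reduce(); the name toy_pow2_mod() is hypothetical. The only
state beyond the result is the single bit shifted out by each doubling,
which is held in a local variable.

```c
#include <assert.h>
#include <stdint.h>

/* Compute 2^k mod n by repeated doubling (hypothetical helper) */
uint64_t toy_pow2_mod ( unsigned int k, uint64_t n ) {
	uint64_t r;
	unsigned int msb;

	assert ( n != 0 );

	/* Start from 2^0 mod n */
	r = ( 1 % n );

	/* Double k times, keeping r < n throughout */
	while ( k-- ) {
		msb = ( ( unsigned int ) ( r >> 63 ) );	/* bit shifted out */
		r <<= 1;
		/* True value is (msb * 2^64 + r), which is below 2n */
		if ( msb || ( r >= n ) )
			r -= n;
	}
	return r;
}
```

With R = 2^64, the Montgomery constant R^2 mod N would be
toy_pow2_mod ( 128, N ) in this model.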
Rewrite bigint_reduce() to handle only this use case, and remove the
no longer necessary uses of double-sized big integers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The NIST elliptic curves are Weierstrass curves and have the form
  y^2 = x^3 + ax + b
with each curve defined by its field prime, the constants "a" and "b",
and a generator base point.
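For illustration, the curve equation can be checked directly over a
small prime field. The helper name and the textbook-sized parameters
(p = 17, a = 2, b = 2) are assumptions chosen for readability and have
nothing to do with the real NIST curve constants.

```c
#include <assert.h>
#include <stdint.h>

/* Check whether (x,y) satisfies y^2 = x^3 + ax + b (mod p) */
int toy_on_curve ( uint32_t x, uint32_t y, uint32_t a, uint32_t b,
		   uint32_t p ) {
	uint64_t lhs;
	uint64_t x3;
	uint64_t rhs;

	/* All inputs are assumed to be already reduced modulo p */
	assert ( ( x < p ) && ( y < p ) && ( a < p ) && ( b < p ) );

	lhs = ( ( ( uint64_t ) y * y ) % p );
	x3 = ( ( ( ( ( uint64_t ) x * x ) % p ) * x ) % p );
	rhs = ( ( x3 + ( ( ( uint64_t ) a * x ) % p ) + b ) % p );
	return ( lhs == rhs );
}
```

For example, toy_on_curve ( 5, 1, 2, 2, 17 ) is true, since
5^3 + 2*5 + 2 = 137 = 1 (mod 17) = 1^2.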
Implement a constant-time algorithm for point addition, based upon
Algorithm 1 from "Complete addition formulas for prime order elliptic
curves" (Joost Renes, Craig Costello, and Lejla Batina), and use this
as the commutative operation for a Montgomery ladder in order to
perform constant-time point multiplication.
The code for point addition is implemented using a custom bytecode
interpreter with 16-bit instructions, since this results in
substantially smaller code than compiling the somewhat lengthy
sequence of arithmetic operations directly. Values are calculated
modulo small multiples of the field prime in order to allow for the
use of relaxed Montgomery reduction.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The Montgomery ladder may be used to perform any operation that is
isomorphic to exponentiation, i.e. to compute the result
  r = g^e = g * g * g * g * .... * g
for an arbitrary commutative operation "*", base or generator "g", and
exponent "e".
Implement a generic Montgomery ladder for use by both modular
exponentiation and elliptic curve point multiplication (both of which
are isomorphic to exponentiation).
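The idea of a ladder parameterised by its operation can be sketched
with plain 64-bit values. The function names and signature here are
hypothetical, the caller supplies an explicit identity element, and
indexing by the exponent bit means this sketch makes no attempt at
constant-time behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A commutative operation on values reduced modulo m */
typedef uint64_t ( * toy_ladder_op_t ) ( uint64_t x, uint64_t y, uint64_t m );

/* Modular multiplication (operands and modulus must fit in 32 bits) */
uint64_t toy_mul_mod ( uint64_t x, uint64_t y, uint64_t m ) {
	return ( ( x * y ) % m );
}

/* Modular addition (operands and modulus must fit in 32 bits) */
uint64_t toy_add_mod ( uint64_t x, uint64_t y, uint64_t m ) {
	return ( ( x + y ) % m );
}

/* Compute g "*" g "*" ... "*" g (e times) for an arbitrary operation */
uint64_t toy_ladder ( toy_ladder_op_t op, uint64_t identity, uint64_t g,
		      uint64_t e, uint64_t m ) {
	uint64_t r[2] = { identity, g };
	unsigned int bit;
	int i;

	assert ( op != NULL );

	/* Invariant: r[1] == r[0] "*" g */
	for ( i = 63 ; i >= 0 ; i-- ) {
		bit = ( ( unsigned int ) ( ( e >> i ) & 1 ) );
		r[ bit ^ 1 ] = op ( r[0], r[1], m );
		r[bit] = op ( r[bit], r[bit], m );
	}
	return r[0];
}
```

With toy_mul_mod() the ladder computes g^e mod m; with toy_add_mod()
the same loop computes e*g mod m, which is the shape of elliptic curve
point multiplication.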
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The elliptic curve point representation for the x25519 curve includes
only the X value, since the curve is designed such that the Montgomery
ladder does not need to ever know or calculate a Y value. There is no
curve point format byte: the public key data is simply the X value.
The pre-master secret is also simply the X value of the shared secret
curve point.
The point representation for the NIST curves includes both X and Y
values, and a single curve point format byte that must indicate that
the format is uncompressed. The pre-master secret for the NIST curves
does not include both X and Y values: only the X value is used.
Extend the definition of an elliptic curve to allow the point size to
be specified separately from the key size, and extend the definition
of a TLS named curve to include an optional curve point format byte
and a pre-master secret length.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Split out the portion of tls_send_client_key_exchange_ecdhe() that
actually performs the elliptic curve key exchange into a separate
function ecdhe_key().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
In debug messages, big integers are currently printed as hex dumps.
This is quite verbose and cumbersome to check against external
sources.
Add bigint_ntoa() to transcribe big integers into a static buffer
(following the model of inet_ntoa(), eth_ntoa(), uuid_ntoa(), etc).
Abbreviate big integers that will not fit within the static buffer,
showing both the most significant and least significant digits in the
transcription. This is generally the most useful form when visually
comparing against external sources (such as test vectors, or results
produced by high-level languages).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Calculating the Montgomery constant (R^2 mod N) is done in our
implementation by zeroing a double-width big integer, subtracting N
once to give (R^2 - N) (a positive value, since the subtraction wraps
modulo R^2), then reducing this value modulo N.
Extract this logic from bigint_mod_exp() to a separate function
bigint_reduce_supremum(), to allow for reuse by other code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Classic Montgomery reduction involves a single conditional subtraction
to ensure that the result is strictly less than the modulus.
When performing chains of Montgomery multiplications (potentially
interspersed with additions and subtractions), it can be useful to
work with values that are stored modulo some small multiple of the
modulus, thereby allowing some reductions to be elided. Each addition
and subtraction stage will increase this running multiple, and the
following multiplication stages can be used to reduce the running
multiple since the reduction carried out for multiplication products
is generally strong enough to absorb some additional bits in the
inputs. This approach is already used in the x25519 code, where
multiplication takes two 258-bit inputs and produces a 257-bit output.
Split out the conditional subtraction from bigint_montgomery() and
provide a separate bigint_montgomery_relaxed() for callers who do not
require immediate reduction to within the range of the modulus.
Modular exponentiation could potentially make use of relaxed
Montgomery multiplication, but this would require R>4N, i.e. that the
two most significant bits of the modulus be zero. For both RSA and
DHE, this would necessitate extending the modulus size by one element,
which would negate any speed increase from omitting the conditional
subtractions. We therefore retain the use of classic Montgomery
reduction for modular exponentiation, apart from the final conversion
out of Montgomery form.
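The split between relaxed and classic reduction can be sketched for a
single 32-bit element with R = 2^32. The function names are
hypothetical, the caller is assumed to supply nneg = -N^-1 mod R, and
the modulus is restricted to N < 2^30 so that R > 4N holds as discussed
above.

```c
#include <assert.h>
#include <stdint.h>

/* Relaxed reduction: returns a value congruent to t * R^-1 (mod n)
 * in the range [0, 2n), for any t < n * R.
 */
uint32_t toy_redc_relaxed ( uint64_t t, uint32_t n, uint32_t nneg ) {
	uint32_t m;

	assert ( ( n & 1 ) && ( n < ( 1UL << 30 ) ) );

	/* Choose m such that (t + m*n) is an exact multiple of R */
	m = ( ( uint32_t ) t * nneg );
	return ( ( uint32_t ) ( ( t + ( ( uint64_t ) m * n ) ) >> 32 ) );
}

/* Classic reduction: relaxed reduction plus the conditional
 * subtraction, giving a result in the range [0, n).
 */
uint32_t toy_redc ( uint64_t t, uint32_t n, uint32_t nneg ) {
	uint32_t r = toy_redc_relaxed ( t, n, nneg );

	return ( ( r >= n ) ? ( r - n ) : r );
}
```

Multiplying two relaxed values a, b < 2N gives t = a*b < 4N^2, which
stays below N*R (and so reduces back into [0, 2N)) only when R > 4N.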
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reduce the number of parameters passed to bigint_montgomery() by
calculating the inverse of the modulus modulo the element size on
demand. Cache the result, since Montgomery reduction will be used
repeatedly with the same modulus value.
In all currently supported algorithms, the modulus is a public value
(or a fixed value defined by specification) and so this non-constant
timing does not leak any private information.
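As an illustration of computing the inverse on demand and caching it,
the sketch below uses a 32-bit element and Newton iteration. Both the
structure and the choice of Newton iteration are assumptions made for
the example; they are not taken from bigint_montgomery().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical modulus context with a lazily calculated inverse */
struct toy_modulus {
	uint32_t n;		/* least significant element (odd) */
	uint32_t ninv;		/* cached n^-1 mod 2^32, or zero */
	unsigned int calcs;	/* number of times inverse was calculated */
};

/* Get n^-1 mod 2^32, calculating it only on first use */
uint32_t toy_modulus_inverse ( struct toy_modulus *mod ) {
	uint32_t x;
	unsigned int i;

	assert ( mod->n & 1 );

	/* The inverse of an odd number is odd, so zero means "not cached" */
	if ( mod->ninv == 0 ) {
		/* n*n == 1 (mod 8), so n is its own inverse to 3 bits */
		x = mod->n;
		/* Each Newton step doubles the number of correct bits:
		 * 3 -> 6 -> 12 -> 24 -> 48 (at least 32)
		 */
		for ( i = 0 ; i < 4 ; i++ )
			x *= ( 2U - ( mod->n * x ) );
		mod->ninv = x;
		mod->calcs++;
	}
	return mod->ninv;
}
```

The first call takes longer than later calls, which is the kind of
timing variation that is harmless when the modulus is a public value.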
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is no further need for a standalone modular multiplication
primitive, since the only consumer is modular exponentiation (which
now uses Montgomery multiplication instead).
Remove the now obsolete bigint_mod_multiply().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Speed up modular exponentiation by using Montgomery reduction rather
than direct modular reduction.
Montgomery reduction in base 2^n requires the modulus to be coprime to
2^n, which would restrict us to odd moduli. Extend the implementation
to include support for
exponentiation with even moduli via Garner's algorithm as described in
"Montgomery reduction with even modulus" (Koç, 1994).
Since almost all use cases for modular exponentiation require a large
prime (and hence odd) modulus, the support for even moduli could
potentially be removed in future.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Montgomery reduction is substantially faster than direct reduction,
and is better suited for modular exponentiation operations.
Add bigint_montgomery() to perform the Montgomery reduction operation
(often referred to as "REDC"), along with some test vectors.
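For readers unfamiliar with REDC, a single-element sketch (R = 2^32)
is given below. It uses the variant that subtracts m*N using the
positive inverse N^-1 mod R, which may differ from the form used by
bigint_montgomery(); the function names are hypothetical and the
inverse is assumed to be supplied by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* REDC for one 32-bit element: returns t * R^-1 mod n.
 * Requires odd n, t < n * R, and ninv == n^-1 mod R.
 */
uint32_t toy_montgomery ( uint64_t t, uint32_t n, uint32_t ninv ) {
	uint32_t m;
	uint32_t t_hi;
	uint32_t mn_hi;

	assert ( ( uint32_t ) ( n * ninv ) == 1 );

	/* Choose m such that m*n == t (mod R) */
	m = ( ( uint32_t ) t * ninv );

	/* The low halves of t and m*n cancel exactly, so (t - m*n) / R
	 * is just the difference of the high halves.
	 */
	t_hi = ( ( uint32_t ) ( t >> 32 ) );
	mn_hi = ( ( uint32_t ) ( ( ( uint64_t ) m * n ) >> 32 ) );

	/* Difference lies in (-n, n): add n back if negative */
	return ( ( t_hi >= mn_hi ) ? ( t_hi - mn_hi ) : ( t_hi - mn_hi + n ) );
}

/* Modular multiplication via Montgomery form */
uint32_t toy_mod_multiply ( uint32_t a, uint32_t b, uint32_t n,
			    uint32_t ninv ) {
	/* Convert to Montgomery form using direct reduction */
	uint32_t am = ( ( uint32_t ) ( ( ( uint64_t ) a << 32 ) % n ) );
	uint32_t bm = ( ( uint32_t ) ( ( ( uint64_t ) b << 32 ) % n ) );
	/* Product in Montgomery form: a*b*R mod n */
	uint32_t pm = toy_montgomery ( ( ( uint64_t ) am * bm ), n, ninv );

	/* Convert out of Montgomery form */
	return toy_montgomery ( pm, n, ninv );
}
```

The benefit appears only in long chains of multiplications (such as
exponentiation), where the conversions are paid once and every
intermediate reduction avoids a division.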
Signed-off-by: Michael Brown <mcb30@ipxe.org>
With a slight modification to the algorithm to ignore bits of the
residue that can never contribute to the result, it is possible to
reuse the as-yet uncalculated portions of the inverse to hold the
residue. This removes the requirement for additional temporary
working space.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Direct modular reduction is expected to be used in situations where
there is no requirement to retain the original (unreduced) value.
Modify the API for bigint_reduce() to reduce the value in place
(removing the separate result buffer), impose a constraint that the
modulus and value have the same size, and require the modulus to be
passed in writable memory (to allow for scaling in place). This
removes the requirement for additional temporary working space.
Reverse the order of arguments so that the constant input is first,
to match the usage pattern for bigint_add() et al.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Expose the effective carry (or borrow) out flag from big integer
addition and subtraction, and use this to elide an explicit bit test
when performing x25519 reduction.
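A sketch of multi-element addition that exposes its carry out is shown
below. The 32-bit element size, the little-endian element order, and
the function name are assumptions for the example and do not describe
the architecture-specific bigint_add() implementations.

```c
#include <assert.h>
#include <stdint.h>

/* Add addend to value in place, returning the carry out (0 or 1).
 * Elements are stored least significant first.
 */
int toy_bigint_add ( const uint32_t *addend, uint32_t *value,
		     unsigned int size ) {
	uint64_t sum;
	unsigned int carry = 0;
	unsigned int i;

	assert ( size > 0 );
	for ( i = 0 ; i < size ; i++ ) {
		sum = ( ( uint64_t ) value[i] + addend[i] + carry );
		value[i] = ( ( uint32_t ) sum );
		carry = ( ( unsigned int ) ( sum >> 32 ) );
	}
	return ( ( int ) carry );
}
```

A caller can then branch (or mask) on the returned flag instead of
separately testing a bit of the result.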
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add a dedicated bigint_msb_is_set() to reduce the amount of open
coding required in the common case of testing the sign of a two's
complement big integer.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Montgomery multiplication requires calculating the inverse of the
modulus modulo a larger power of two.
Add bigint_mod_invert() to calculate the inverse of any (odd) big
integer modulo an arbitrary power of two, using a lightly modified
version of the algorithm presented in "A New Algorithm for Inversion
mod p^k" (Koç, 2017).
The power of two is taken to be 2^k, where k is the number of bits
available in the big integer representation of the invertend. The
inverse modulo any smaller power of two may be obtained simply by
masking off the relevant bits in the inverse.
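For the special case p = 2, the bit-by-bit structure of this style of
algorithm can be sketched on a single 64-bit value (so k = 64). This
is an illustrative reduction to one machine word with a hypothetical
function name, not the multi-element bigint_mod_invert() code.

```c
#include <assert.h>
#include <stdint.h>

/* Calculate the inverse of an odd value modulo 2^64 */
uint64_t toy_mod_invert ( uint64_t a ) {
	uint64_t b = 1;		/* residue */
	uint64_t x = 0;		/* inverse, built up one bit at a time */
	uint64_t bit;
	unsigned int i;

	assert ( a & 1 );

	/* Invariant (modulo 2^64): 1 - a*x == b * 2^i */
	for ( i = 0 ; i < 64 ; i++ ) {
		/* Next bit of the inverse is the low bit of the residue */
		bit = ( b & 1 );
		x |= ( bit << i );
		/* Subtract a (if bit is set) and halve.  High bits lost
		 * by the wrapping subtraction and logical shift can
		 * never reach bit 0 within the remaining iterations.
		 */
		b = ( ( b - ( a & ( 0 - bit ) ) ) >> 1 );
	}
	return x;
}
```

Truncating the result gives the inverse modulo a smaller power of two:
for example, the low 32 bits of toy_mod_invert ( 7 ) are the inverse
of 7 modulo 2^32.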
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Faster modular multiplication algorithms such as Montgomery
multiplication will still require the ability to perform a single
direct modular reduction.
Neaten up the implementation of direct reduction and split it out into
a separate bigint_reduce() function, complete with its own unit tests.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The big integer shift operations are misleadingly described as
rotations since the original x86 implementations are essentially
trivial loops around the relevant rotate-through-carry instruction.
The overall operation performed is a shift rather than a rotation.
Update the function names and descriptions to reflect this.
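The distinction can be seen in a minimal sketch of a one-bit left
shift across multiple elements. The 32-bit element size,
least-significant-first ordering, and function name are assumptions
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a multi-element integer left by one bit, in place */
void toy_bigint_shl ( uint32_t *value, unsigned int size ) {
	uint32_t carry = 0;
	uint32_t msb;
	unsigned int i;

	assert ( size > 0 );
	for ( i = 0 ; i < size ; i++ ) {
		/* Bit shifted out of this element feeds the next one */
		msb = ( value[i] >> 31 );
		value[i] = ( ( value[i] << 1 ) | carry );
		carry = msb;
	}
	/* The final carry is discarded.  A true rotation would feed it
	 * back into bit 0 of the first element.
	 */
}
```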
Signed-off-by: Michael Brown <mcb30@ipxe.org>
An n-bit multiplication product may be added to up to two n-bit
integers without exceeding the range of a (2n)-bit integer:
  (2^n - 1)*(2^n - 1) + (2^n - 1) + (2^n - 1) = 2^(2n) - 1
Exploit this to perform big integer multiplication in constant time
without requiring the caller to provide temporary carry space.
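One way to exploit this bound is a schoolbook multiplication in which
each step adds both the existing result element and the running carry
to the element product. The sketch below assumes 32-bit elements
stored least significant first and uses a hypothetical function name;
the loop structure of the real bigint_multiply() may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Multiply a (a_size elements) by b (b_size elements), writing
 * (a_size + b_size) elements to result.
 */
void toy_bigint_multiply ( const uint32_t *a, unsigned int a_size,
			   const uint32_t *b, unsigned int b_size,
			   uint32_t *result ) {
	uint64_t acc;
	uint32_t carry;
	unsigned int i;
	unsigned int j;

	assert ( ( a_size > 0 ) && ( b_size > 0 ) );
	memset ( result, 0, ( ( a_size + b_size ) * sizeof ( result[0] ) ) );

	/* The iteration count depends only on the sizes, never on
	 * the values being multiplied.
	 */
	for ( i = 0 ; i < a_size ; i++ ) {
		carry = 0;
		for ( j = 0 ; j < b_size ; j++ ) {
			/* product + element + carry <= 2^64 - 1 */
			acc = ( ( ( uint64_t ) a[i] * b[j] ) +
				result[ i + j ] + carry );
			result[ i + j ] = ( ( uint32_t ) acc );
			carry = ( ( uint32_t ) ( acc >> 32 ) );
		}
		/* This element has not yet been written by any row */
		result[ i + b_size ] = carry;
	}
}
```

No separate carry buffer is needed, since the carry out of each step
always fits in a single element.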
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Big integer multiplication currently performs immediate carry
propagation from each step of the long multiplication, relying on the
fact that the overall result has a known maximum value to minimise the
number of carries performed without ever needing to explicitly check
against the result buffer size.
This is not a constant-time algorithm, since the number of carries
performed will be a function of the input values. We could make it
constant-time by always continuing to propagate the carry until
reaching the end of the result buffer, but this would introduce a
large number of redundant zero carries.
Require callers of bigint_multiply() to provide a temporary carry
storage buffer, of the same size as the result buffer. This allows
the carry-out from the accumulation of each double-element product to
be accumulated in the temporary carry space, and then added in via a
single call to bigint_add() after the multiplication is complete.
Since the structure of big integer multiplication is identical across
all current CPU architectures, provide a single shared implementation
of bigint_multiply(). The architecture-specific operation then
becomes the multiplication of two big integer elements and the
accumulation of the double-element product.
Note that any intermediate carry arising from accumulating the lower
half of the double-element product may be added to the upper half of
the double-element product without risk of overflow, since the result
of multiplying two n-bit integers can never have all n bits set in its
upper half. This simplifies the carry calculations for architectures
such as RISC-V and LoongArch64 that do not have a carry flag.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add support for decrypting images containing detached encrypted data
using a cipher key obtained from a separate CMS envelope image (in DER
or PEM format).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some ASN.1 OID-identified algorithms require additional parameters,
such as an initialisation vector for a block cipher. The structure of
the parameters is defined by the individual algorithm.
Extend asn1_algorithm() to allow these additional parameters to be
returned via a separate ASN.1 cursor.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Reduce the number of dynamic allocations required to parse a CMS
message by retaining the ASN.1 cursor returned from image_asn1() for
the lifetime of the CMS message. This allows embedded ASN.1 cursors
to be used for parsed objects within the message, such as embedded
signatures.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Instances of cipher and digest algorithms tend to get called
repeatedly to process substantial amounts of data. This is not true
for public-key algorithms, which tend to get called only once or twice
for a given key.
Simplify the public-key algorithm API so that there is no reusable
algorithm context. In particular, this allows callers to omit the
error handling currently required to handle memory allocation (or key
parsing) errors from pubkey_init(), and to omit the cleanup calls to
pubkey_final().
This change does remove the ability for a caller to distinguish
between a verification failure due to a memory allocation failure and
a verification failure due to a bad signature. This difference is not
material in practice: in both cases, for whatever reason, the caller
was unable to verify the signature and so cannot proceed further, and
the cause of the error will be visible to the user via the return
status code.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Asymmetric keys are invariably encountered within ASN.1 structures
such as X.509 certificates, and the various large integers within an
RSA key are themselves encoded using ASN.1.
Simplify all code handling asymmetric keys by passing keys as a single
ASN.1 cursor, rather than separate data and length pointers.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
There is some exploitable similarity between the data structures used
for representing CMS signatures and CMS encryption keys. In both
cases, the CMS message fundamentally encodes a list of participants
(either message signers or message recipients), where each participant
has an associated certificate and an opaque octet string representing
the signature or encrypted cipher key. The ASN.1 structures are not
identical, but are sufficiently similar to be worth exploiting: for
example, the SignerIdentifier and RecipientIdentifier data structures
are defined identically.
Rename data structures and functions, and add the concept of a CMS
message type.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Extend the definition of an ASN.1 OID-identified algorithm to include
a potential cipher suite, and add identifiers for AES-CBC and AES-GCM.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The cms_signature() and cms_verify() functions currently accept raw
data pointers. This will not be possible for cms_decrypt(), which
will need the ability to extract fragments of ASN.1 data from a
potentially large image.
Change cms_signature() and cms_verify() to accept an image as an input
parameter, and move the responsibility for setting the image trust
flag within cms_verify() since that now becomes a more natural fit.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Allow passing a NULL value for the certificate list to all functions
used for identifying an X.509 certificate from an existing set of
certificates, and rename function parameters to indicate that this
certificate list represents an unordered certificate store (rather
than an ordered certificate chain).
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Centralise all current mechanisms for identifying an X.509 certificate
(by raw content, by subject, by issuer and serial number, and by
matching public key), and remove the certstore-specific and
CMS-specific variants of these functions.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Handling large ASN.1 objects such as encrypted CMS files will require
the ability to use the asn1_enter() and asn1_skip() family of
functions on partial object cursors, where a defined additional length
is known to exist after the end of the data buffer pointed to by the
ASN.1 object cursor.
We already have support for partial object cursors in the underlying
asn1_start() operation used by both asn1_enter() and asn1_skip(), and
this is used by the DER image probe routine to check that the
potential DER file comprises a single ASN.1 SEQUENCE object.
Add asn1_enter_partial() to formalise the process of entering an ASN.1
partial object, and refactor the DER image probe routine to use this
instead of open-coding calls to the underlying asn1_start() operation.
There is no need for an equivalent asn1_skip_partial() function, since
only objects that are wholly contained within the partial cursor may
be successfully skipped.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Calling asn1_skip_if_exists() on a malformed ASN.1 object may
currently leave the cursor in a partially-updated state, where the tag
byte and one of the length bytes have been stripped. The cursor is
left with a valid data pointer and length and so no out-of-bounds
access can arise, but the cursor no longer points to the start of an
ASN.1 object.
Ensure that each ASN.1 cursor manipulation code path leads to the
cursor being either fully updated, left unmodified, or invalidated,
and update the function descriptions to reflect this.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Successfully reaching the end of a well-formed ASN.1 object list is
arguably not an error, but the current code (dating back to the
original ASN.1 commit in 2007) will explicitly check for and report
this as an error condition.
Remove the explicit check for reaching the end of a well-formed ASN.1
object list, and instead return success along with a zero-length (and
hence implicitly invalidated) cursor.
Almost every existing caller of asn1_skip() or asn1_skip_if_exists()
currently ignores the return value anyway. Skipped objects are (by
definition) not of interest to the caller, and the invalidation
behaviour of asn1_skip() ensures that any errors will be safely caught
on a subsequent attempt to actually use the ASN.1 object content.
Since these existing callers ignore the return value, they cannot be
affected by this change.
There is one existing caller of asn1_skip_if_exists() that does check
the return value: in asn1_skip() itself, an error returned from
asn1_skip_if_exists() will cause the cursor to be invalidated. In the
case of an error indicating only that the cursor length is already
zero, invalidation is a no-op, and so this change affects only the
return value propagated from asn1_skip().
This leaves only a single call site within ocsp_request() where the
return value from asn1_skip() is currently checked. The return status
here is moot since there is no way for the code in question to fail
(absent a bug in the ASN.1 construction or parsing code).
There are therefore no callers of asn1_skip() or asn1_skip_if_exists()
that rely on an error being returned for successfully reaching the end
of a well-formed ASN.1 object list. Simplify the code by redefining
this as a successful outcome.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
For unknown reasons, miscellaneous versions of gcc seem to struggle
with the static assertions used to ensure the correct layout of the
GCM structures.
Adjust the assertions to use offsetof() rather than direct pointer
comparison, on the basis that offsetof() must be a compile-time
constant value.
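The difference between the two styles of check can be sketched as
follows. The structure is a hypothetical stand-in for the GCM context,
and C11 _Static_assert is used here in place of iPXE's own assertion
macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit block */
struct toy_block {
	uint8_t byte[16];
};

/* Hypothetical context whose members must be contiguous */
struct toy_context {
	struct toy_block hash;
	struct toy_block len;
	struct toy_block key;
};

/* Compile-time checks: offsetof() is an integer constant expression */
_Static_assert ( offsetof ( struct toy_context, len ) ==
		 sizeof ( struct toy_block ),
		 "len must immediately follow hash" );
_Static_assert ( offsetof ( struct toy_context, key ) ==
		 ( 2 * sizeof ( struct toy_block ) ),
		 "key must immediately follow len" );

/* The same layout expressed as direct pointer comparisons, which a
 * compiler is not obliged to evaluate at build time.
 */
void toy_check_layout ( const struct toy_context *ctx ) {
	assert ( ( const void * ) &ctx->len ==
		 ( const void * ) ( &ctx->hash + 1 ) );
	assert ( ( const void * ) &ctx->key ==
		 ( const void * ) ( &ctx->len + 1 ) );
}
```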
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Add an implementation of the authentication portions of the MS-CHAPv2
algorithm as defined in RFC 2759, along with the single test vector
provided therein.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Downloading a cross-signed certificate chain to partially replace
(rather than simply extend) an existing chain will require the ability
to discard all certificates after a specified link in the chain.
Extract the relevant logic from x509_free_chain() and expose it
separately as x509_truncate().
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Some versions of gcc (observed with gcc 4.8.5 in CentOS 7) will report
spurious build_assert() failures for some assertions about structure
layouts. There is no clear pattern as to what causes these spurious
failures, and the build assertion does succeed in that no unresolvable
symbol reference is generated in the compiled code.
Adjust the assertions to work around these apparent compiler issues.
Signed-off-by: Michael Brown <mcb30@ipxe.org>
The DES block cipher dates back to the 1970s. It is no longer
relevant for use in TLS cipher suites, but it is still used by the
MS-CHAPv2 authentication protocol, which unfortunately remains common
for 802.1X port authentication.
Add an implementation of the DES block cipher, complete with the
extremely comprehensive test vectors published by NBS (the precursor
to NIST) in the form of an utterly adorable typewritten and hand-drawn
paper document.
Signed-off-by: Michael Brown <mcb30@ipxe.org>