# ML Practitioners Introduction to CKKS.

This is part 1.5 in a series entitled “A Machine Learning Based Introduction to PALISADE and CKKS”

# Tutorial 1.5: CKKS for the curious

This tutorial is meant for ML practitioners who are interested in CKKS and Homomorphic Encryption in general. We attempt to make this section digestible but will also provide links for those who want in-depth reading material.

We use the following acronyms:

• Partially homomorphic encryption (PHE)

• Somewhat homomorphic encryption (SHE)

• Levelled Somewhat homomorphic encryption (LeveledSHE)

• Fully homomorphic encryption (FHE)

• Cheon-Kim-Kim-Song (CKKS) which was the new name after HEAAN.

## Homomorphic Encryption

There are several different levels of homomorphic encryption, a few of which we discuss briefly, but we focus on two primary algorithm styles: the SHE and the LeveledSHE because the PALISADE library supports them.

### Partially Homomorphic Encryption

The PHE encryption style-algorithm allows either addition or multiplication operations on elements within its group. One unique property is that these algorithms are unbounded in that we can carry out any number of either additions or multiplications. Many popular encryption schemes fall into this category: for example, RSA and ElGamal, both of which support multiplications on their ciphertexts. Another PHE scheme is the Paillier encryption scheme which supports addition on its ciphertexts.

### Somewhat Homomorphic Encryption

Somewhat homomorphic encryption supports both addition and multiplication logic gates instead of one-or-the-other as in the case of PHE. However, it still only supports a subset of circuits as it is now bounded in the number of operations it can do. As we carry out these operations, there is a non-zero amount of noise intentionally added for security. Note that homomorphic addition and homomorphic multiplication adds different levels of noise: addition scales the noise linearly while multiplication squares the noise. Eventually, it is no longer possible to decrypt the ciphertext and obtain the expected result. One method would be to “refresh” the ciphertext, but this was not possible until the work of Gentry which allowed for FHE. One example of a somewhat homomorphic encryption scheme is the BGV Scheme which was produced after the paper which allowed for FHE (we discuss why below)

We refer the interested reader to Acar et al. A Survey on Homomorphic Encryption Schemes: Theory and Implementation for more information about SHE and HE in general. P.s click on the “ereader” button, then on the download link at the top of the next page.

### Fully Homomorphic Encryption

Fully homomorphic encryption was not possible until the work of Craig Gentry in his thesis. In the thesis Gentry showed that it was possible to evaluate circuits of arbitrary depth via a process he termed bootstrapping. The fundamental idea of bootstrapping is that it is possible to “refresh” the ciphertext (thus reducing the noise). Gentry showed that by taking the decryption algorithm and decrypting both the ciphertext and the encrypted secret key ( encrypted using the public key ), it is possible to obtain a new ciphertext which has the same underlying plaintext but has the noise removed.

This process is equivalent to the following equation:

The equation describes the ciphertext at time i+1 and how it is derived from the original ciphertext. Nevertheless, the process of bootstrapping was (and still is) an expensive operation which can significantly slow down any application that requires a large number of additions or multiplications (such as in machine learning). Work following Gentry’s thesis focused on reducing the need for bootstrapping. In the “Accompanying Code” section, we show how this can be done in the PALISADE library.

## What is CKKS?

If we were to get pedantic, CKKS is not a homomorphic encryption scheme but is instead an approximate homomorphic encryption scheme.

CKKS is an encryption scheme that allows for operations on real numbers as opposed to just integers as is the case for BGV, and BFV which are other H.E. schemes (that we do not discuss in this ML practitioners guide).

Machine learning with homomorphic encryption was not fully supported until the CKKS scheme. Although various homomorphic encryption schemes emerged before CKKS, and it was possible to do some forms of machine learning, all of them dealt with integers. However, work by Cheon et al. in their paper entitled “Homomorphic Encryption for Arithmetic of Approximate Numbers” (also known as the CKKS scheme after the authors) general forms of machine learning became viable. CKKS operates on complex numbers, but via the use of an embedding, we can work on real numbers.

There are two sources of noise in CKKS. The first source of noise is in the encoding process that we discuss below. The second source is in the “relinearization” process, which we touch on briefly after.

### The Encoding Process

Earlier, we had mentioned that CKKS operates on the complex numbers and that these values can be converted to the real numbers via an encoding. We refer to the left half of the following image, the encoding-decoding step:

*image taken from Simons Institute Talk

The definition of tau has been left out, but tau is essentially

which is a mapping from the rational polynomials of at most degree $X^n$ where n is the dimensionality specified by the user. Considering that our message is an element in the complex numbers, we take the inverse mapping, $\tau^{-1}$. The use of $\tau^{-1}$ maps our message to the rational polynomials which we then scale by $\Delta$ before rounding to the nearest integer. The scaling-by-$\Delta$ is to control for precision loss later in the pipeline.

### The Relinearization Process

To discuss the intuition behind this process, we first need to discuss how numbers are stored in computers. For every floating number, there is a significand and a scaling factor (comprised of a base and exponent)

For example, in a (truncated) representation of $\sqrt{2}$, we have:

### Batch size

The term “Batch size” in the context of PALISADE (and homomorphic encryption libraries in general) is not the same as a batch size in machine learning. A batch size in PALISADE describes the number of messages that are encoded in a single slot of our ciphertext. The message-packing allows the use of SIMD, which enables parallelism as we are no longer bound to calculating data sequentially.

## Code Discussion

There are a few takeaways from the accompanying code:

Be sure to check out the accompanying code for this section: Tutorial1.5 Cpp code before continuing to following the discussion.

1) if we carry out too many multiplications, we exceed our noise threshold, and our results become meaningless. This is illustrated in the code provided and we encourage users to try different parameters.

2) We can avoid the noise issue by “refreshing” the ciphertext, a process called the interactive scheme. However, this comes with certain drawbacks. One drawback is that it is considered to be “less secure” as mentioned in the previous tutorial. The reason that it is “less secure” is because we need to be able to transmit the ciphertext between our private-key holder (a server) and our user (a client), and a user could attempt to structure their weights in such a way that they can determine some underlying information.

3) It is implied in the code, but we explicitly state that PALISADE does not support FHE, and you will need to refresh the ciphertext. Ideally, instead of re-encrypting via the interactive process, we would be able to use bootstrapping to keep the noise in check.