A Machine Learning oriented introduction to PALISADE and CKKS.

Special thanks to Dr. David Cousins for his contributions to this post, including, but not limited to, reviewing the content, offering feedback on the code, and suggesting changes to make the material more digestible.

Overview

1) We train on a single vector of the Iris dataset. The Iris dataset is typically framed as a classification problem, but here we frame it as a linear regression because it is straightforward enough that anyone with a basic understanding of machine learning can follow along.

2) We discuss the PALISADE cryptographic parameters that we need to specify and what they mean for our application.

3) We discuss ciphertext refreshing, a.k.a. the “interactive” approach, and why we need it.

Note: check the link at the very bottom for the complete source code. Sections have been omitted in this page to reduce clutter.


We first need to include all the libraries we need:

#include "palisade.h"
#include <iomanip>
using namespace lbcrypto;

Instructions to install PALISADE can be found here: PALISADE-Dev build instructions. If you are new to PALISADE and C++, I highly recommend bookmarking the PALISADE Doxygen page which contains the documentation for the library.

Training on the Iris dataset:

In this section we show

  • how to create a cryptocontext, which configures PALISADE to perform encrypted computation within a specific encryption scheme
  • code for training on a single vector of the Iris dataset.

One important thing to note is that your results may differ slightly from a plaintext or NumPy version, because CKKS accumulates noise as more and more operations are executed.

We briefly introduce the parameters used below but defer further discussion to later.

CryptoContext<DCRTPoly> cc = 
    CryptoContextFactory<DCRTPoly>::genCryptoContextCKKS(
        multDepth, scalingFactorBits, batchSize
    );

std::cout << "Ring Dimension: " << cc->GetRingDimension() << '\n';

cc->Enable(ENCRYPTION);
cc->Enable(SHE);
cc->Enable(LEVELEDSHE);
auto keys = cc->KeyGen();
cc->EvalMultKeyGen(keys.secretKey);
cc->EvalSumKeyGen(keys.secretKey);

We create a cryptocontext object which takes our chosen parameters:

multDepth - the maximum number of sequential multiplications we can perform before our data becomes too noisy and decryption becomes meaningless. We discuss this later.

scalingFactorBits - the bit length of the scaling factor, detailed later.

batchSize - how many data points (think: a vector of data) we pack into a single ciphertext. Homomorphic encryption is slow, but it can be sped up by operating on whole batches of data at once (via SIMD).

Training setup

You may have some questions while reading through this section such as:

  • why do we operate on complex numbers?
  • Why do we have to convert from our raw data to a “packedplaintext” before encrypting?

These are valid questions, but given the goals of this post we defer them; interested readers should see my next (optional) post, which is already up: Tutorial 1.5: CKKS for the Curious.

// assume the existence of a label vector, labels, a feature
// matrix, features, as well as the variables epochs, refreshEvery,
// and verbose used in the training loop below.

/////////////////////////////////////////////////////////////////
//Setup the weights, biases and encrypt all 
/////////////////////////////////////////////////////////////////
using cMatrix = std::vector<std::complex<double>>;

cMatrix singleLabelAsVector = {(double) labels[0]};

// we add a bias to our features which allows us to scale the line 
// of best fit by some offset
cMatrix biasFeatures = features[0];
auto it = biasFeatures.begin();
std::cout << "Vec 0 before adding bias: " << biasFeatures << '\n';
biasFeatures.insert(it, std::complex<double>(1, 0));
std::cout << "Vec 0 after adding bias: " << biasFeatures << '\n';

// the IRIS dataset has 4 features but since we added a bias term
// above, we generate a vector of 5 elements
cMatrix weights = {1, 1, -0.1, -1.5, 10.01};
// Up to now, all the steps we have taken should be familiar to a  
// reader with a machine learning background

// Below we pack our data into a plaintext. To pack the data means 
// that we encode multiple data points into a single ciphertext. The 
// advantage of this is that it allows us to operate on multiple data
// at once as opposed to single values of data; think vector operations
// as opposed to scalar ops in numpy.
auto singleFeature = cc->MakeCKKSPackedPlaintext(
    biasFeatures
);
auto singleLabel = cc->MakeCKKSPackedPlaintext(
    singleLabelAsVector
);
auto packedWeights = cc->MakeCKKSPackedPlaintext(weights);

// Above we packed the values but now we must encrypt them
auto encFeature = cc->Encrypt(keys.publicKey, singleFeature);
auto encLabel = cc->Encrypt(keys.publicKey, singleLabel);
auto encWeights = cc->Encrypt(keys.publicKey, packedWeights);

double alpha = 0.01;
Plaintext ptResidual;
Plaintext ptGradient;
if (verbose) {
    std::cout << "Label: " << singleLabel << '\n';
}
std::vector<double> costs;

Training Loop

The following section should be familiar to you if you’ve taken an introductory ML class.

/////////////////////////////////////////////////////////////////
//Start training
/////////////////////////////////////////////////////////////////
for (int i = 1; i < epochs + 1; i++) {
    // offset the index by 1 for the modulus check below
    // when refreshing our ciphertexts
    if (verbose) {
        std::cout << "Epoch : " << i << '\n';
    }
    // We have two vectors so an inner product involves a 
    // hadamard product and then a sum of the values

    auto innerProd = cc->EvalSum(
        cc->EvalMult(encFeature, encWeights), batchSize
    );

    // to determine how close we are to the correct value,
    // we find the difference
    auto residual = cc->EvalSub(innerProd, encLabel);

    // The residual variable is a ciphertext that contains our
    // true error value. Thus, we must decrypt it and store
    // the result in a plaintext
    cc->Decrypt(keys.secretKey, residual, &ptResidual);

    // we set the length below because we are only interested in the
    // first value. The remaining elements are 0s (approximately) 
    // and are a result of the packing.

    if (verbose) {

        ptResidual->SetLength(1);
        auto decResidual = ptResidual->GetCKKSPackedValue()[0].real();
        auto cost = decResidual * decResidual;

        std::cout << "Residual: " << decResidual << '\t' 
            <<"cost: " << cost << '\n';
    }

    // derivative of the squared-error loss:
    // J(w) = (Wx - y)^2
    // dJ/dw = x * (Wx - y)   (dropping the constant factor of 2;
    // equivalent to a dot product with the residual)

    auto err_partial_w = cc->EvalSum(
        cc->EvalMult(encFeature, residual), batchSize
    );
    // Scale by learning rate
    auto update = cc->EvalMult(alpha, err_partial_w);

    // Since this is meant as a tutorial, we print out the gradient
    // after every training step.
    if (verbose) {
        cc->Decrypt(keys.secretKey, update, &ptGradient);
        ptGradient->SetLength(5);  // bias + 4 features
        std::cout << "Update: ";
        for (auto &v: ptGradient->GetCKKSPackedValue()){
            std::cout << v.real() << ',';
        }
        std::cout << std::endl;
    }
    encWeights = cc->EvalSub(encWeights, update);

The following section is within the epoch loop (like most of the code above). You should notice a few things:

  • it seems like we are refreshing something every few epochs.
  • we are decrypting our weights with our cryptocontext object then immediately encrypting the weights again. What gives?
if ((refreshEvery != 0) && (i % refreshEvery == 0)) {
    if (verbose) {
        std::cout <<
            "Refreshing the weights for epoch: " << i << " \n";
    }
    // decrypt into a temporary plaintext so we can re-encrypt
    // a fresh ciphertext from it
    Plaintext clearWeights;
    cc->Decrypt(keys.secretKey, encWeights, &clearWeights);
    clearWeights->SetLength(5);  // bias + 4 features
    auto oldWeights = clearWeights->GetCKKSPackedValue();
    encWeights = cc->Encrypt(
        keys.publicKey,
        cc->MakeCKKSPackedPlaintext(oldWeights)
    );
    }
}  // end of the epoch loop


The reason for decrypting and re-encrypting has to do with the multDepth parameter we specified earlier. As mentioned above, operations on ciphertexts accumulate noise. If this noise grows too large, decryption begins to fail, producing random bits that are interpreted as (usually huge) random numbers. By decrypting and re-encrypting our results, we refresh the ciphertext and reset its noise to zero.

However, there is a caveat: only the party holding the secret key can do the re-encrypting. Consider a data-enclave/client setup in which one party performs all the computation on encrypted data while another holds the keys. There is a limit to how large a multDepth one can set before CKKS becomes too unwieldy, so computations that exceed that depth need either interactive re-encryption (as shown here) or bootstrapping to securely refresh the data. Bootstrapping resets the noise, and thus the multiplicative depth, without requiring the secret key; however, bootstrapping for CKKS is not yet available in PALISADE as of this writing. The interactive re-encryption approach is considered less secure than a fully homomorphic setup, but we defer further discussion to Tutorial 1.5: CKKS for the Curious.

Cryptographic Parameter Discussion

multDepth

uint32_t multDepth = 5;

Depth of multiplication supported:

  • (a * b) + (c * d) has a multiplication depth of 1
  • a * b * c has a multiplication depth of 2

You may come upon the term “towers” later, so just know that multDepth is one less than the number of towers, i.e. multDepth == (numTowers - 1). Each multiplication consumes one tower of a ciphertext until there is only one left.

Note: there is no 1-size-fits-all setup here so you’ll need to know your underlying algorithm to use this effectively. It is possible to set an extremely high multDepth but this means that your computations will run slower and require much more space.

scalingFactorBits

uint32_t scalingFactorBits = 40;

In the original CKKS paper, the authors discuss a scaling factor which they multiply by to prevent rounding errors from destroying the significant figures during encoding. In short, approximate floating point numbers are represented as integers: roughly as an int with scalingFactor* approxFloat (more on this in the next optional blog post).

Note: this specifies the bit LENGTH of the scaling factor, not the scaling factor itself. I tend to use values between 30 and 50 for most of my applications.

batchSize

The batchSize is a tricky parameter to set correctly. The issue is that the batch size must be no larger than half the ring dimension, and the ring dimension itself depends on your other parameters. Unfortunately, this means you need to set multDepth, check the resulting ring dimension, and then do it all over again with batchSize set to at most half the ring dimension. It's a little hairy, yes, but this is the price we pay for privacy.

Closing words

Thank you for taking the time to read this! I hope that you leave this post with a rough understanding of how to use PALISADE for your real-number applications. To see the full source code visit my palisade-tutorial on Gitlab. Also, visit PALISADE - PKE for further examples of how to use PALISADE (one of which I contributed to!)

I have a full project that is slated to be released before the end of the year where I use Python bindings and train a linear regression on the Ames dataset comparing a plaintext numpy implementation and the encrypted linear regression. In that project, I provide a more in-depth discussion of the details of CKKS and Linear Regression while keeping the focus on showing the efficacy of homomorphic encryption and machine learning. Follow or add me on LinkedIn - ianquahtc where I will share the project once it is ready.

In the following months, I hope to share more work that I have been doing, including, but not limited to, creating a matrix interface (likely unoptimized) that will allow users to work in a format akin to numpy.

Again, If you plan to use the PALISADE library I recommend bookmarking their documentation page which I found to be extremely useful.

Written on December 1, 2020