May 2, 2025

The last digits of primes (C++)

A dramatic deviation from expectation

This is simple, satisfying code that demonstrates an unlikely-looking result. It uses the file of the first 50 million primes, which you can find on the site here.

We sort the last digits of consecutive primes into categories. For a given prime ending in 1, in theory there should be an equal chance that the next prime ends in 1, 3, 7 or 9 (above 5, these are the only possible last digits). That gives 4 × 4 = 16 ordered pairs, so each category should come out at 6.25% if the chances are equal.

In fact, this is what happens:

There is a significant skew away from the repeated-digit outcomes {1,1}, {3,3}, {7,7} and {9,9}.

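{9,1} clearly dominates, at least within the first 50 million primes that the code covers. Why? Under the equal-chance assumption it should not look like this. The bias was reported by Lemke Oliver and Soundararajan in 2016, and their conjectural explanation rests on the Hardy-Littlewood prime k-tuples conjecture rather than on the Riemann zeta function directly.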

#include <iostream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <iomanip>

using namespace std;

int main() {
    string filename = "/Users/50mprime.txt";
    ifstream file(filename);

    if (!file.is_open()) {
        // Report errors on the error stream, not on standard output.
        cerr << "Failed to open the file." << endl;
        return 1;
    }

    unordered_map<string, int> sequence_counts;
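    int prev_prime;
    if (!(file >> prev_prime)) {
        cerr << "The file contains no primes." << endl;
        return 1;
    }

    // Counts every consecutive pair. If the file starts at 2, this includes
    // the three early pairs involving 2 and 5, which is negligible here.
    int total_count = 0;
    int curr_prime;

    // Loop while extraction succeeds; this stops cleanly at end of file
    // and also on a malformed token, instead of looping forever.
    while (file >> curr_prime) {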

        int prev_digit = prev_prime % 10;
        int curr_digit = curr_prime % 10;

        string sequence = to_string(prev_digit) + "," + to_string(curr_digit);
        sequence_counts[sequence]++;
        total_count++;

        prev_prime = curr_prime;
    }

    file.close();

    string sequences[] = {"1,1", "1,3", "1,7", "1,9",
                          "3,1", "3,3", "3,7", "3,9",
                          "7,1", "7,3", "7,7", "7,9",
                          "9,1", "9,3", "9,7", "9,9"};

    cout << "Sequence\tCount\tPercentage" << endl;

    for (const string& sequence : sequences) {
        int count = sequence_counts[sequence];
        double percentage = (count * 100.0) / total_count;

        cout << "(" << sequence << ")\t" << count << "\t";
        cout << fixed << setprecision(2) << percentage << "%" << endl;
    }

    return 0;
}