Primes to One Trillion The Gutenberg text "The First 100,000 Prime Numbers", EBook #65, lists the primes up to 1,318,699. This somewhat more ambitious version lists the primes up to one trillion (1,000,000,000,000 or 1E12). Introduction I became interested in prime numbers after hearing about Goldbach's Conjecture, "Every even integer greater than 2 can be expressed as the sum of two primes". Verifying this requires a source of primes. Short lists (or programs to generate them) are widely available. Really long lists are scarce, except for primos.mat.br. To make these lists more accessible, I have reformatted them to a size easily manageable by ordinary text editors and viewers--about 55MB. The file names correspond to the range of primes the file contains: 00000000000_to_00100000000.txt Zero to 100 Million 00100000000_to_00200000000.txt 100 Million to 200 Million etc. The leading zero digits cause the file names to collate in order of their content. Longer lists can be composed with the DOS copy command. Move the required prime txt files to a temporary directory and use: copy *.txt longList.txt Prime Text Files This is a collection of 10,000 files, occupying about 486GB of disk space in their unzipped native txt format. Since adjacent primes have about 90% identical leading digits, the compressed (zip) versions total 61GB. Each zip file contains 100 txt files. Do not use Windows Explorer to copy or move large numbers of files at a time. Use DOS copy or xcopy for large copies. I find Beyond Compare (Scooter Software) handy for keeping track of large numbers of files. 0000.zip 000G to 010G (0 to 1E10) 0010.zip 010G to 020G 0020.zip 020G to 030G 0030.zip 030G to 040G 0040.zip 040G to 050G 0050.zip 050G to 060G 0060.zip 060G to 070G 0070.zip 070G to 080G 0080.zip 080G to 090G 0090.zip 090G to 100G 0100.zip 100G to 110G 0110.zip 110G to 120G 0120.zip 120G to 130G 0130.zip 130G to 140G 0140.zip 140G to 150G 0150.zip 150G to 160G 0160.zip 160G to 170G 0170.zip 170G to 180G 0180.zip 180G to 190G 0190.zip 190G to 100G 0200.zip 200G to 210G 0210.zip 210G to 220G 0220.zip 220G to 230G 0230.zip 230G to 240G 0240.zip 240G to 250G 0250.zip 250G to 260G 0260.zip 260G to 270G 0270.zip 270G to 280G 0280.zip 280G to 290G 0290.zip 290G to 200G 0300.zip 300G to 310G 0310.zip 310G to 320G 0320.zip 320G to 330G 0330.zip 330G to 340G 0340.zip 340G to 350G 0350.zip 350G to 360G 0360.zip 360G to 370G 0370.zip 370G to 380G 0380.zip 380G to 390G 0390.zip 390G to 300G 0400.zip 400G to 410G 0410.zip 410G to 420G 0420.zip 420G to 430G 0430.zip 430G to 440G 0440.zip 440G to 450G 0450.zip 450G to 460G 0460.zip 460G to 470G 0470.zip 470G to 480G 0480.zip 480G to 490G 0490.zip 490G to 400G 0500.zip 500G to 510G 0510.zip 510G to 520G 0520.zip 520G to 530G 0530.zip 530G to 540G 0540.zip 540G to 550G 0550.zip 550G to 560G 0560.zip 560G to 570G 0570.zip 570G to 580G 0580.zip 580G to 590G 0590.zip 590G to 500G 0600.zip 600G to 610G 0610.zip 610G to 620G 0620.zip 620G to 630G 0630.zip 630G to 640G 0640.zip 640G to 650G 0650.zip 650G to 660G 0660.zip 660G to 670G 0670.zip 670G to 680G 0680.zip 680G to 690G 0690.zip 690G to 600G 0700.zip 700G to 710G 0710.zip 710G to 720G 0720.zip 720G to 730G 0730.zip 730G to 740G 0740.zip 740G to 750G 0750.zip 750G to 760G 0760.zip 760G to 770G 0770.zip 770G to 780G 0780.zip 780G to 790G 0790.zip 790G to 700G 0800.zip 800G to 810G 0810.zip 810G to 820G 0820.zip 820G to 830G 0830.zip 830G to 840G 0840.zip 840G to 850G 0850.zip 850G to 860G 0860.zip 860G to 870G 0870.zip 870G to 880G 0880.zip 880G to 890G 0890.zip 890G to 800G 0900.zip 900G to 910G 0910.zip 910G to 920G 0920.zip 920G to 930G 0930.zip 930G to 940G 0940.zip 940G to 950G 0950.zip 950G to 960G 0960.zip 960G to 970G 0970.zip 970G to 980G 0980.zip 980G to 990G 0990.zip 990G to 1000G Additional prime files will be posted here. PrimeC File Format and Miscellaneous C++ Programs While working with primes, I developed the primec format, a file or array representation for primes that is roughly the same size of the compressed (zipped) txt representation, and supports fast access, both sequential and direct. The exact location of the primality specification of any number in the file (or memory array) is computed with a few instructions and no search. If you wish to examine and experiment with the C++ programs used to reformat these prime lists and test the Goldbach Conjecture, download the "programs.zip" package. It contains Generating and Analyzing Prime Numbers, a description of the content and use of these files, including the primec file format. Primec Format The primec format exploits the fact that all primes greater than 5 end in the decimal digits 1, 3, 7, or 9. Thus, the primality of 20 successive numbers can be specified in one 8 bit byte. The file begins with the complete binary representation of: The beginning of the sequence The end of the sequence A check sum of all data bytes (All three are 8 bytes for this implementation). The first and last values are a multiple of twenty, thus are never primes. There is no overlap of primes between successive files that use the same number for the upper boundary of the first file, and the lower boundary of the second file. The primality of any number in the range of the file is determined as follows: If the number ends in 0, 2, 4, 5, 6, or 8, it is not prime. Otherwise, the location of the specifying byte is at offset: ( value - start ) / 20 Within that byte, the primality of the value is specified by the bit as shown in the following table. The only tedious programming tasks were: Special case code for values less than 20, which include 2 and 5, and exclude 1 and 9. All larger values follow the same simple pattern. The increment and decrement operators for the corresponding iterators must search forward (or backward) for the next true bit, specifying the next prime number. This table shows the layout and content for a file containing 20 to 60. The first 24 bytes (start value, end value, check sum) are not shown. Byte 0------------------------------| 1---------------------------- Bit 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 Value 21 23 27 29 31 33 37 39 41 43 47 49 51 53 57 59 Prime F T F T T F T F T T T F F T F T Hex 5A E5 As primes become larger, the density of primes becomes smaller as 1/ln(n). Thus the density of true bits also falls off. The number of digits (binary or decimal) to represent the primes grows as ln(n). Thus, a sequence of primes represented as primec is always competitive in size with the corresponding sequence in ASCI text or binary, besides providing fast direct access by value: bool isPrime(value). The results for the sequence of the largest 64 bit primes (18446744073707000000 to 18446744073709551558) is: Format Size (KB) 64 Bit Binary 445 Txt 1150 Zip Txt 114 PrimeC 125 Programs Among the programs in the "program.zip" package are: BuildTxtPrime Create file of primes, txt format TxtToPrimeC Convert txt to primec format. Goldbach Verify Goldbach's conjecture for zero to 1E12 Among the more than 15 classes and utilities are: PrimeGenerator Create prime numbers in a given range. Directory A vector of strings containing the names of files in a file directory. Progress A class to manage the periodic reporting of program activity. PrimeCVector Abstract class providing the algorithms to access primec data. PrimeCFileWriter Create a primec file. PrimeCFileReader Read a primec file. I hope you find them useful. If you have any questions, observations or bug reports concerning the C++ programming or the content of the prime files, send an email (after changing "at" to "@". primes1e12 at earthlink.net I embarked on this project as a programming challenge. I am not a mathematician. I have no deep insight into prime number theory. Please confine messages to programming issues. Here are some references: Prime Numbers: http://www.primos.mat.br/indexen.html Wikipedia: List of Prime Numbers (with numerous references): https://en.wikipedia.org/wiki/List_of_prime_numbers The Math Forum: http://mathforum.org/dr.math/faq/faq.prime.num.html The Prime Pages https://primes.utm.edu The program files are also posted on http://home.earthlink.net/~primes1E12. Corrections and additions will be posted there as they occur. Don Kostuch October, 2018.