Empirical Discovery of Formulae Connecting Sequences in the OEIS
YANGCHEN PAN MAX A. ALEKSEYEV
Department of Computer Science Department of Mathematics
EXAMPLES OF DISCOVERED CONNECTIONS
While we have not yet performed a systematic analysis of the
newly discovered matches, we list below a few arbitrarily
chosen examples of such matches.
Example 1. The LAH transform (cf. A103194)
of a sequence A is the sequence B whose exponential
generating functions satisfy the relation
$$E_B(x) = \frac{1}{1-x} \cdot E_A\!\left(\frac{x}{1-x}\right).$$
The LAH transform of A014500 (Number of
graphs with unlabeled non-isolated nodes and n
labeled edges): 1, 1, 2, 9, 70, 794, 12055, 233238,
5556725, 158931613, 5350854707, ... matches
A020558 (Number of ordered multigraphs on
n labeled edges without loops): 1, 1, 4, 27,
274, 3874, 71995, 1682448, 47840813, 1615315141,
63566760077, ... This match can be easily verified using
the known e.g.f.'s of these sequences.
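The listed terms can be checked numerically. The sketch below is an assumption about the convention, not the PARI/GP implementation used in the project: it applies the unsigned Lah numbers L(n, k) = n!/k! · C(n−1, k−1) with offset 0, which corresponds to the relation E_B(x) = E_A(x/(1−x)), that is, without the prefactor 1/(1−x). Under that convention the listed terms of A014500 map to the listed terms of A020558. The function name `lah_transform` is hypothetical.

```python
from math import comb, factorial


def lah_transform(a):
    """Return b with b(n) = sum_k L(n, k) * a(k), L = unsigned Lah numbers.

    Assumes offset 0 and L(n, k) = n!/k! * C(n-1, k-1), with L(0, 0) = 1.
    """
    b = []
    for n in range(len(a)):
        if n == 0:
            b.append(a[0])  # L(0, 0) = 1
            continue
        total = 0
        for k in range(1, n + 1):
            # n!/k! is an exact integer because k <= n
            total += factorial(n) // factorial(k) * comb(n - 1, k - 1) * a[k]
        b.append(total)
    return b


a014500 = [1, 1, 2, 9, 70, 794, 12055, 233238]
print(lah_transform(a014500))
# → [1, 1, 4, 27, 274, 3874, 71995, 1682448]
```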
Example 2. The inverse characteristic transform
produces the sequence of indices of the nonzero
elements of an input sequence.
The inverse characteristic transform of
A007253 (McKay-Thompson series of class 5a for
Monster): 1, 0, -6, 20, 15, 36, 0, -84, 195, 100, 240, 0,
-461, 1020, 540, 1144, 0, -1980, ... matches A020558
(Numbers that are congruent to {0, 2, 3, 4} mod 5):
0, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18, 19, 20,
22, 23, 24, 25, 27, ... This match extends to at least
the first 25 terms, but it is not immediately clear whether
it holds for the entire sequences.
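A minimal sketch of this transform, assuming indices start at 0 (the offset is an assumption chosen because it reproduces the listed indices; the function name is hypothetical):

```python
def inverse_characteristic(seq, offset=0):
    """Indices (counted from `offset`) of the nonzero terms of seq."""
    return [i for i, term in enumerate(seq, start=offset) if term != 0]


# Terms of A007253 exactly as listed above.
a007253 = [1, 0, -6, 20, 15, 36, 0, -84, 195, 100, 240, 0,
           -461, 1020, 540, 1144, 0, -1980]
indices = inverse_characteristic(a007253)
print(indices)
# → [0, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17]
print(all(i % 5 in {0, 2, 3, 4} for i in indices))
# → True
```

This only confirms the pattern on the 18 listed terms; it says nothing about whether the match holds for all terms.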
Example 3. The record transform produces the
record values of an input sequence.
The record transform of A003959 (If $n = \prod_k p_k^{e_k}$,
then $a(n) = \prod_k (p_k + 1)^{e_k}$, $a(1) = 1$): 1,
3, 4, 9, 6, 12, 8, 27, 16, 18, 12, 36, 14, 24, 24, 81, 18,
48, 20, 54, 32, 36, 24, 108, ... matches A211221 (For
any partition of n consider the product of the σ
of each element. Sequence gives the maximum of
such values): 1, 3, 4, 9, 12, 27, 36, 81, 108, 243, 324,
729, 972, 2187, 2916, 6561, 8748, 19683, ... Namely,
the first 10,000 terms of A003959 contain 26 record
values, which all match A211221.
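The record values can be recomputed directly from the definition of A003959. The sketch below uses plain trial division and hypothetical helper names (`a003959`, `records`); it is an illustration, not the code used in the project.

```python
def a003959(n):
    """a(n) = product of (p + 1)^e over the prime powers p^e dividing n."""
    result, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            result *= p + 1
            n //= p
        p += 1
    if n > 1:  # one prime factor larger than sqrt(n) may remain
        result *= n + 1
    return result


def records(seq):
    """Strictly increasing record values of seq, in order of appearance."""
    out = []
    for term in seq:
        if not out or term > out[-1]:
            out.append(term)
    return out


terms = [a003959(n) for n in range(1, 25)]
print(records(terms))
# → [1, 3, 4, 9, 12, 27, 36, 81, 108]
```

Running `records` on `a003959(n)` for n = 1, ..., 10000 should give the 26 record values mentioned above.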
Example 4. The partial sum transform produces
the sequence of partial sums of an input sequence.
The partial sum transform of A231560
($\sum_{i=2}^{n} \frac{1}{i \cdot \log i}$): 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2 ... matches
A032521 (Sum of the integer parts of the 143rd
roots of integers less than n): 0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
... The match extends to the first 169 terms, after which
the sequences diverge.
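The partial sum transform is a running total. A minimal sketch on the first 24 listed terms only (the function name is hypothetical):

```python
from itertools import accumulate


def partial_sums(seq):
    """Running totals: b(n) = a(0) + a(1) + ... + a(n)."""
    return list(accumulate(seq))


prefix = [0] + [1] * 23  # first 24 terms of A231560 as listed above
print(partial_sums(prefix) == list(range(24)))
# → True
```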
Example 5. The n-th term of the least inverse
transform of a sequence $\{a_k\}$ equals the smallest
index m such that $a_m = n$.
The least inverse transform of A004445 (Nim-
sum n + 4): 4, 5, 6, 7, 0, 1, 2, 3, 12, 13, 14, 15, 8,
9, 10, 11, 20, 21, 22, 23, 16, 17, 18, 19, 28, 29, ...
matches itself. In fact, it is easy to see that
A004445 represents a permutation of the nonnegative
integers that is an involution and is thus a fixed point
of the least inverse transform.
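The fixed-point property can be checked on a finite prefix. The sketch below assumes offset 0 and computes the nim-sum n + 4 as the bitwise XOR `n ^ 4`. The helper `least_inverse` is hypothetical and assumes that every value below `count` occurs somewhere in the given prefix.

```python
def least_inverse(a, count):
    """b(n) = smallest index m with a[m] == n, for n = 0 .. count - 1."""
    first_index = {}
    for m, value in enumerate(a):
        first_index.setdefault(value, m)  # keep only the smallest index
    return [first_index[n] for n in range(count)]


# XOR with 4 permutes each block of 8 consecutive integers,
# so a prefix of length 32 contains every value 0 .. 31.
a004445 = [n ^ 4 for n in range(32)]
print(a004445[:8])
# → [4, 5, 6, 7, 0, 1, 2, 3]
print(least_inverse(a004445, 32) == a004445)
# → True
```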
Example 6. The labeled cycle transform applies cycle
structure to a sequence of labeled objects.
The labeled cycle transform of A046912 (Number
of irreducible quasiorders with n labeled
points): 1, 1, 2, 11, 147, 3412, 121553, 6353629,
476850636, ... matches A001929 (Number of connected
topologies on n labeled points): 1, 1,
3, 19, 233, 4851, 158175, 7724333, 550898367,
... Similarly, the labeled cycle transform
of A046908 (Number of irreducible posets with
n labeled points): 1, 1, 1, 7, 97, 2251, 80821,
4305127, 332273257, ... matches A001927 (Number
of connected partially ordered sets with n labeled
points): 1, 1, 2, 12, 146, 3060, 101642, 5106612,
377403266, ... The correctness of these matches follows
easily from the definitions of the sequences.
INTRODUCTION
The On-Line Encyclopedia of Integer Sequences
(OEIS) is a rich and constantly growing source of
numerical data from various areas of science. As
such, it often helps to establish unexpected connections
between seemingly unrelated results and to gain a
deeper understanding of the underlying problems.
In our project, we mined the numerical data
in the OEIS to discover as yet unknown formulae that
may connect different sequences.
We assume that a connection between sequences
A and B in the OEIS is known if the description of A
mentions B or the description of B mentions A;
otherwise, we assume that the connection between
A and B is unknown.
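A minimal sketch of this "known connection" test, assuming that a mention means an A-number (the letter A followed by six digits) appearing in the entry text. The entry texts and the helper names `mentioned` and `is_known` are hypothetical.

```python
import re

A_NUMBER = re.compile(r"\bA\d{6}\b")


def mentioned(text, own_id):
    """A-numbers that occur in an entry's text, excluding the entry itself."""
    return set(A_NUMBER.findall(text)) - {own_id}


def is_known(id_a, text_a, id_b, text_b):
    """True if the entry for A mentions B or the entry for B mentions A."""
    return id_b in mentioned(text_a, id_a) or id_a in mentioned(text_b, id_b)


# Made-up entry texts, for illustration only.
print(is_known("A000001", "Cf. A000002.", "A000002", "No cross-references."))
# → True
print(is_known("A000001", "Cf. A000002.", "A000003", "No cross-references."))
# → False
```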
Many of the newly discovered matches represent
a situation where one transformed sequence
matches a large number of existing sequences.
However, the most interesting results arise when
a transformed sequence matches only one existing
sequence, since this makes the match somewhat special
and thus deserving of further investigation.
MINING SEQUENCE CONNECTIONS
Diagram 1: Project Outline
We stored a local copy of the OEIS database.
For each sequence in the database, we performed
the "standard" transforms from
http://oeis.org/transforms.html to obtain
transformed sequences. Each transformed sequence
was then searched for in the database to discover
matches with existing sequences. The overall
mining process is outlined in Diagram 1.
Step 1. Data preprocessing: First, matrix sequences and
short sequences (with fewer than 5 terms) were
deleted. Second, from each sequence database entry
(excluding the Sequence in context: and Adjacent
sequences: fields), we extracted references to other
sequences, which are treated as known connections.
Step 2. Handling transforms: We ignored matrix
transforms, which left us with 106 transforms. We
further skipped any transform that resulted in an
error for a particular sequence.
Step 3. Sequence processing pipeline: The sequence
processing pipeline is shown in Diagram 2.
The Apache Lucene library was used to build an
index for efficient search of sequences in the local
database.
Step 4. Distributed execution: The project used
195 cores running individual sequence pipelines.
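Steps 1 to 3 can be sketched on a toy database. Everything below is an illustrative assumption, not the project's implementation: a Python dictionary keyed by the first five terms stands in for the Lucene index, a match means agreement on the common prefix of two term lists, and the toy sequences and the transform names `PSUM` and `BAD` are made up.

```python
from itertools import accumulate

MIN_TERMS = 5  # Step 1: sequences with fewer terms are dropped


def preprocess(db):
    """Keep only sequences with at least MIN_TERMS terms."""
    return {sid: terms for sid, terms in db.items() if len(terms) >= MIN_TERMS}


def build_index(db):
    """Map the first MIN_TERMS terms to the ids of sequences starting with them."""
    index = {}
    for sid, terms in db.items():
        index.setdefault(tuple(terms[:MIN_TERMS]), []).append(sid)
    return index


def mine(db, transforms):
    """Return (transform name, source id, matched id) triples."""
    db = preprocess(db)
    index = build_index(db)
    matches = []
    for sid, terms in db.items():
        for name, transform in transforms.items():
            try:
                image = list(transform(terms))
            except Exception:
                continue  # Step 2: skip a transform that fails on this sequence
            if len(image) < MIN_TERMS:
                continue
            for other in index.get(tuple(image[:MIN_TERMS]), []):
                common = min(len(image), len(db[other]))
                if image[:common] == db[other][:common]:
                    matches.append((name, sid, other))
    return matches


toy_db = {
    "T1": [1, 1, 1, 1, 1, 1, 1, 1],
    "T2": [1, 2, 3, 4, 5, 6, 7, 8],
    "T3": [1, 2],  # too short, removed in preprocessing
}
toy_transforms = {"PSUM": lambda s: accumulate(s), "BAD": lambda s: 1 // 0}
print(mine(toy_db, toy_transforms))
# → [('PSUM', 'T1', 'T2')]
```

Because each sequence is processed independently inside the outer loop, the work can be split across cores by partitioning the sequence ids.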
Diagram 2: Sequence Processing Pipeline
MATCH STATISTICS
Match Type     Known     New          Total
Pairwise       56,725    15,286,111   15,342,836
One-to-Many    518       492,910      493,428
One-to-One     13,777    70,677       84,454
One-to-Any     14,295    563,587      577,882
Self           ?         ?            202,777
For a sequence A and a transformation T such that
T(A) matches k ≥ 1 sequences in the database,
the match is classified as one-to-one if k = 1 and
one-to-many if k > 1. The One-to-Any row is the sum
of the One-to-One and One-to-Many rows. Sequences
invariant under a transform are classified as self matches.
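The classification rule can be written as a small function. How self matches interact with the count k is not specified above, so the sketch assumes that a match is labeled "self" only when the transformed sequence matches nothing but its own source. The function name and ids are hypothetical.

```python
def classify(source_id, matched_ids):
    """Label the match of T(source) given the ids of the sequences it matched."""
    if not matched_ids:
        return None  # no match at all
    if set(matched_ids) == {source_id}:
        return "self"  # assumption: the only match is the source itself
    return "one-to-one" if len(matched_ids) == 1 else "one-to-many"


print(classify("A000001", ["A000002"]))
# → one-to-one
print(classify("A000001", ["A000002", "A000003"]))
# → one-to-many
print(classify("A000001", ["A000001"]))
# → self

# Consistency check on the table: One-to-Any = One-to-One + One-to-Many.
print(13_777 + 518 == 14_295, 70_677 + 492_910 == 563_587, 84_454 + 493_428 == 577_882)
# → True True True
```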
ACKNOWLEDGMENTS
We are thankful to Neil Sloane and Charles
Greathouse for providing us with the snapshot of
the OEIS database (dated March 26, 2014). We
are also indebted to Christian G. Bower for the
PARI/GP implementation of transforms.
The computations were performed on the ColonialOne
high-performance cluster at the George
Washington University.
The project is supported by the National Science
Foundation under the grant No. IIS-1462107.
