1. Empirical Discovery of Formulae Connecting Sequences in the OEIS
YANGCHEN PAN MAX A. ALEKSEYEV
Department of Computer Science Department of Mathematics
EXAMPLES OF DISCOVERED CONNECTIONS
While we have not yet performed a systematic
analysis of the newly discovered matches, we
list below a few arbitrarily chosen examples of
such matches.
Example 1. The LAH transform (cf. A103194)
of a sequence A is the sequence B, whose expo-
nential generating functions satisfy the relation
$$E_B(x) = \frac{1}{1-x} \cdot E_A\left(\frac{x}{1-x}\right).$$
The LAH transform of A014500 (Number of
graphs with unlabeled non-isolated nodes and n
labeled edges): 1, 1, 2, 9, 70, 794, 12055, 233238,
5556725, 158931613, 5350854707, ... matches
A020558 (Number of ordered multigraphs on
n labeled edges without loops): 1, 1, 4, 27,
274, 3874, 71995, 1682448, 47840813, 1615315141,
63566760077, ... which can be easily verified using
known e.g.f.’s of these sequences.
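As a sanity check on Example 1, the following Python sketch applies a Lah transform to the listed terms of A014500. It assumes the coefficient form b(n) = sum over k = 1..n of C(n-1, k-1) · (n!/k!) · a(k), with b(0) = a(0), which corresponds to the e.g.f. relation E_B(x) = E_A(x/(1-x)). This variant, without the prefactor 1/(1-x) that appears in the relation as printed, is the one that reproduces the terms of A020558 listed above. The function name `lah_transform` is illustrative and is not taken from the PARI/GP implementation used in the project.

```python
from math import comb, factorial


def lah_transform(a):
    """Lah transform via unsigned Lah numbers L(n, k) = C(n-1, k-1) * n! / k!.

    Assumption: b(0) = a(0) and b(n) = sum_{k=1..n} L(n, k) * a(k),
    i.e. the e.g.f. relation E_B(x) = E_A(x / (1 - x)).
    """
    b = []
    for n in range(len(a)):
        if n == 0:
            b.append(a[0])
            continue
        b.append(sum(comb(n - 1, k - 1) * (factorial(n) // factorial(k)) * a[k]
                     for k in range(1, n + 1)))
    return b


# First terms of A014500 and A020558 as listed in Example 1.
a014500 = [1, 1, 2, 9, 70, 794, 12055]
a020558 = [1, 1, 4, 27, 274, 3874, 71995]

print(lah_transform(a014500) == a020558)
```

Only the first seven terms are compared here; a term-by-term check says nothing about the full sequences, for which the e.g.f. argument is needed.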
Example 2. The inverse characteristic transform
produces the sequence of indices of the nonzero
elements of an input sequence.
The inverse characteristic transform of
A007253 (McKay-Thompson series of class 5a for
Monster): 1, 0, -6, 20, 15, 36, 0, -84, 195, 100, 240, 0,
-461, 1020, 540, 1144, 0, -1980, ... matches A020558
(Numbers that are congruent to {0, 2, 3, 4} mod 5):
0, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14, 15, 17, 18, 19, 20,
22, 23, 24, 25, 27, ... This match holds for at least
the first 25 terms, but it is not immediately clear
whether it holds for the entire sequences.
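The transform in Example 2 can be sketched in a few lines of Python. The sketch assumes that positions are counted from 0 in the listed terms (the actual offset handling in the project's implementation is not stated here), and the name `inverse_characteristic` is illustrative.

```python
def inverse_characteristic(seq):
    """Return the 0-based positions of the nonzero terms of seq."""
    return [i for i, value in enumerate(seq) if value != 0]


# Terms of A007253 as listed in Example 2.
a007253 = [1, 0, -6, 20, 15, 36, 0, -84, 195, 100, 240, 0,
           -461, 1020, 540, 1144, 0, -1980]

print(inverse_characteristic(a007253))
```

On these 18 terms the output consists of the numbers below 18 that are not congruent to 1 mod 5, in agreement with the listed match.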
Example 3. The record transform produces the
record values of an input sequence.
The record transform of A003959 (If $n = \prod_k p_k^{e_k}$,
then $a(n) = \prod_k (p_k + 1)^{e_k}$, $a(1) = 1$): 1,
3, 4, 9, 6, 12, 8, 27, 16, 18, 12, 36, 14, 24, 24, 81, 18,
48, 20, 54, 32, 36, 24, 108, ... matches A211221 (For
any partition of n consider the product of the σ
of each element. Sequence gives the maximum of
such values): 1, 3, 4, 9, 12, 27, 36, 81, 108, 243, 324,
729, 972, 2187, 2916, 6561, 8748, 19683, ... Namely,
the first 10,000 terms of A003959 contain 26 record
values, which all match A211221.
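The count in Example 3 can be reproduced with a short Python sketch. It computes A003959 directly from the definition by trial division and keeps each term that strictly exceeds all earlier terms. Treating a record as a strict new maximum is an assumption, and the function names are illustrative.

```python
def a003959(n):
    """prod_k (p_k + 1)^(e_k) over the prime factorization of n; a(1) = 1."""
    result = 1
    p = 2
    while p * p <= n:
        while n % p == 0:
            result *= p + 1
            n //= p
        p += 1
    if n > 1:  # remaining prime factor
        result *= n + 1
    return result


def record_transform(seq):
    """Keep each term that is strictly larger than every earlier term."""
    records = []
    for value in seq:
        if not records or value > records[-1]:
            records.append(value)
    return records


records = record_transform(a003959(n) for n in range(1, 10001))
print(len(records), records[:9])
```

Comparing `records` with the listed terms of A211221 then only requires a list comparison on the common prefix.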
Example 4. The partial sum transform produces
the sequence of partial sums of an input sequence.
The partial sum transform of A231560
($\sum_{i=2}^{n} \frac{1}{i \cdot \log i}$): 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, ... matches
A032521 (Sum of the integer part of 143-th
roots of integers less than n): 0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
... This match holds for the first 169 terms, after
which the sequences diverge.
Example 5. The n-th term of the least inverse
transform of a sequence {a_k} equals the smallest
index m such that a_m = n.
The least inverse transform of A004445 (Nim-
sum n + 4): 4, 5, 6, 7, 0, 1, 2, 3, 12, 13, 14, 15, 8,
9, 10, 11, 20, 21, 22, 23, 16, 17, 18, 19, 28, 29, ...
matches itself. In fact, it is easy to see that
A004445 is a permutation of the nonnegative
integers that is an involution, and thus a fixed
point of the least inverse transform.
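The fixed-point property in Example 5 is easy to check on a finite prefix. The Python sketch below assumes 0-based indexing and represents the Nim-sum n + 4 as the bitwise XOR `n ^ 4`; the function name `least_inverse` is illustrative.

```python
def least_inverse(seq):
    """b(n) = smallest 0-based index m with seq[m] == n, for n = 0, 1, 2, ...

    The output stops at the first n that does not occur in seq.
    """
    first_index = {}
    for m, value in enumerate(seq):
        first_index.setdefault(value, m)  # keep only the first occurrence
    result = []
    n = 0
    while n in first_index:
        result.append(first_index[n])
        n += 1
    return result


# A004445 on a prefix whose length is a multiple of 8, so that the
# prefix is itself a permutation of 0..31.
a004445 = [n ^ 4 for n in range(32)]

print(a004445[:8])
print(least_inverse(a004445) == a004445)
```

The prefix length matters: XOR with 4 permutes each block of eight consecutive integers, so a prefix that cuts a block would omit some values and truncate the output.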
Example 6. The labeled cycle transform applies cy-
cle structure to a sequence of labeled objects.
The labeled cycle transform of A046912 (Num-
ber of irreducible quasiorders with n labeled
points): 1, 1, 2, 11, 147, 3412, 121553, 6353629,
476850636, ... matches A001929 (Number of con-
nected topologies on n labeled points): 1, 1,
3, 19, 233, 4851, 158175, 7724333, 550898367,
... and, similarly, the labeled cycle transform
of A046908 (Number of irreducible posets with
n labeled points): 1, 1, 1, 7, 97, 2251, 80821,
4305127, 332273257, ... matches A001927 (Number
of connected partially ordered sets with n labeled
points): 1, 1, 2, 12, 146, 3060, 101642, 5106612,
377403266, ... The correctness of these matches
follows easily from the definitions of the sequences.
INTRODUCTION
The On-Line Encyclopedia of Integer Sequences
(OEIS) is a rich and constantly growing source of
numerical data from various areas of science. As
such, it often helps to establish unexpected
connections between seemingly unrelated results
and to gain a deeper understanding of the underlying
problems. In our project, we mined the numerical data
in the OEIS to discover previously unknown formulae
that possibly connect different sequences.
We consider a connection between sequences
A and B in the OEIS to be known if the description
of A mentions B or the description of B mentions A;
otherwise we consider the connection between A
and B to be unknown.
Many of the discovered new matches rep-
resent a situation where one transformed se-
quence matches a large number of existing se-
quences. However, the most interesting results
arise from the situation when a transformed se-
quence matches only one existing sequence, since
this makes the match somewhat special and thus
deserving further investigation.
MINING SEQUENCE CONNECTIONS
Diagram 1: Project Outline
We stored a local copy of the OEIS database.
For each sequence in the database, we applied
the “standard” transforms listed at
http://oeis.org/transforms.html to obtain
transformed sequences. We then searched the
database for each transformed sequence to discover
matches with existing sequences. The overall
mining process is outlined in Diagram 1.
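The mining loop described above can be sketched in Python as follows. This is a toy version under several assumptions: the database is a plain dictionary, a match is taken to mean agreement on the common prefix of at least `min_terms` terms, and the search is a linear scan rather than an index lookup. The identifiers `S1`, `S2`, `mine`, and `is_match` are hypothetical, and the two transforms stand in for the full list.

```python
from itertools import accumulate


def partial_sums(seq):
    return list(accumulate(seq))


def nonzero_indices(seq):
    return [i for i, value in enumerate(seq) if value != 0]


def is_match(candidate, target, min_terms=5):
    """Assumed match criterion: equal on the common prefix, long enough."""
    n = min(len(candidate), len(target))
    return n >= min_terms and candidate[:n] == target[:n]


def mine(database, transforms):
    """Apply every transform to every sequence and search for matches."""
    matches = []
    for src_id, terms in database.items():
        for name, transform in transforms.items():
            try:
                image = transform(terms)
            except Exception:
                continue  # a transform that fails on this sequence is skipped
            for dst_id, dst_terms in database.items():
                if is_match(image, dst_terms):
                    matches.append((name, src_id, dst_id))
    return matches


# Toy database with hypothetical identifiers.
database = {"S1": [1, 1, 1, 1, 1, 1], "S2": [1, 2, 3, 4, 5, 6]}
transforms = {"partial_sums": partial_sums, "nonzero_indices": nonzero_indices}

print(mine(database, transforms))
```

The linear scan is quadratic in the number of sequences, which is why a real run over the whole database needs an index for the search step.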
Step 1. Data preprocessing: First, matrix sequences
and short sequences (with fewer than 5 terms) were
deleted. Second, from each sequence database entry
(excluding the Sequence in context: and Adjacent
sequences: fields), we extracted references to other
sequences, which are treated as known connections.
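The reference extraction in Step 1 might look like the Python sketch below. It assumes that each entry is available as a mapping from field name to text and that references appear as A-numbers (the letter A followed by six digits); the field layout and the names `known_connections` and `is_known` are hypothetical.

```python
import re

A_NUMBER = re.compile(r"\bA\d{6}\b")
EXCLUDED_FIELDS = {"Sequence in context", "Adjacent sequences"}


def known_connections(entry_id, fields):
    """Collect A-numbers mentioned in an entry, skipping excluded fields."""
    refs = set()
    for name, text in fields.items():
        if name in EXCLUDED_FIELDS:
            continue
        refs.update(A_NUMBER.findall(text))
    refs.discard(entry_id)  # an entry mentioning itself is not a connection
    return refs


def is_known(a, b, entries):
    """A connection is known if either entry mentions the other."""
    return (b in known_connections(a, entries[a])
            or a in known_connections(b, entries[b]))


# Toy entries with made-up field contents.
entries = {
    "A000001": {"Comments": "Cf. A000002.", "Adjacent sequences": "A000003"},
    "A000002": {"Comments": "No cross-references."},
    "A000003": {"Comments": "Unrelated."},
}

print(is_known("A000001", "A000002", entries))
print(is_known("A000001", "A000003", entries))
```

In the toy data the second pair is reported as unknown because the only mention sits in an excluded field.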
Step 2. Handling transforms: We ignored matrix
transforms, which left us with 106 transforms. We
also skipped any transform that resulted in an
error for a particular sequence.
Step 3. Sequence processing pipeline: The sequence
processing pipeline is shown in Diagram 2. The
Apache Lucene library was used to build an index
for efficient search of sequences in the local
database.
Step 4. Distributed execution: The project used
195 cores running individual sequence pipelines.
Diagram 2: Sequence Processing Pipeline
MATCH STATISTICS
Match Type     Known     New          Total
Pairwise       56,725    15,286,111   15,342,836
One-to-Many    518       492,910      493,428
One-to-One     13,777    70,677       84,454
One-to-Any     14,295    563,587      577,882
Self           ?         ?            202,777
For a sequence A and transformation T such that
T(A) matches k ≥ 1 sequences in the database,
this match is classified as one-to-one if k = 1 and
one-to-many if k > 1. Sequences invariant w.r.t.
transforms are classified as self matches.
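The classification rule can be written down directly, and the table rows can be checked for internal consistency. The Python sketch below assumes that the One-to-Any row is the sum of the One-to-One and One-to-Many rows, which the table's numbers support; the function name `classify` is illustrative, and self matches are left out because their treatment in the count k is not specified.

```python
def classify(k):
    """Classify a match of T(A) against k >= 1 database sequences."""
    if k < 1:
        raise ValueError("a match requires k >= 1")
    return "one-to-one" if k == 1 else "one-to-many"


# (known, new, total) per row, as given in the table.
rows = {
    "pairwise": (56_725, 15_286_111, 15_342_836),
    "one-to-many": (518, 492_910, 493_428),
    "one-to-one": (13_777, 70_677, 84_454),
    "one-to-any": (14_295, 563_587, 577_882),
}

# Every row should satisfy known + new == total.
rows_consistent = all(known + new == total for known, new, total in rows.values())

# One-to-Any should equal One-to-Many plus One-to-One, column by column.
any_consistent = all(
    rows["one-to-many"][i] + rows["one-to-one"][i] == rows["one-to-any"][i]
    for i in range(3)
)

print(classify(1), classify(7), rows_consistent, any_consistent)
```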
ACKNOWLEDGMENTS
We are thankful to Neil Sloane and Charles
Greathouse for providing us with a snapshot of
the OEIS database (dated March 26, 2014). We
are also indebted to Christian G. Bower for the
PARI/GP implementation of transforms.
The computations were performed on the
ColonialOne high-performance cluster at the George
Washington University.
The project is supported by the National Science
Foundation under grant No. IIS-1462107.