The minimum circuit size problem: DIMACS REU Blog

Monday, August 23, 2021


$$ % math operators \newcommand{\poly}{\operatorname{poly}} \newcommand{\polylog}{\operatorname{polylog}} \newcommand{\rank}{\operatorname{rank}} \newcommand{\sign}{\operatorname{sign}} % functions \newcommand{\And}{\textsf{And}} \newcommand{\Or}{\textsf{Or}} \newcommand{\Xor}{\textsf{Xor}} \newcommand{\Sat}{\textsf{Sat}} \newcommand{\qed}{\blacksquare} % some letters \newcommand{\BA}{\mathbb A} \newcommand{\BB}{\mathbb B} \newcommand{\BC}{\mathbb C} \newcommand{\BD}{\mathbb D} \newcommand{\BE}{\mathbb E} \newcommand{\BF}{\mathbb F} \newcommand{\BG}{\mathbb G} \newcommand{\BH}{\mathbb H} \newcommand{\BI}{\mathbb I} \newcommand{\BJ}{\mathbb J} \newcommand{\BK}{\mathbb K} \newcommand{\BL}{\mathbb L} \newcommand{\BM}{\mathbb M} \newcommand{\BN}{\mathbb N} \newcommand{\BO}{\mathbb O} \newcommand{\BP}{\mathbb P} \newcommand{\BQ}{\mathbb Q} \newcommand{\BR}{\mathbb R} \newcommand{\BS}{\mathbb S} \newcommand{\BT}{\mathbb T} \newcommand{\BU}{\mathbb U} \newcommand{\BV}{\mathbb V} \newcommand{\BW}{\mathbb W} \newcommand{\BX}{\mathbb X} \newcommand{\BY}{\mathbb Y} \newcommand{\BZ}{\mathbb Z} \newcommand{\CA}{\mathcal A} \newcommand{\CB}{\mathcal B} \newcommand{\CC}{\mathcal C} \newcommand{\CD}{\mathcal D} \newcommand{\CE}{\mathcal E} \newcommand{\CF}{\mathcal F} \newcommand{\CG}{\mathcal G} \newcommand{\CH}{\mathcal H} \newcommand{\CI}{\mathcal I} \newcommand{\CJ}{\mathcal J} \newcommand{\CK}{\mathcal K} \newcommand{\CL}{\mathcal L} \newcommand{\CM}{\mathcal M} \newcommand{\CN}{\mathcal N} \newcommand{\CO}{\mathcal O} \newcommand{\CP}{\mathcal P} \newcommand{\CQ}{\mathcal Q} \newcommand{\CR}{\mathcal R} \newcommand{\CS}{\mathcal S} \newcommand{\CT}{\mathcal T} \newcommand{\CU}{\mathcal U} \newcommand{\CV}{\mathcal V} \newcommand{\CW}{\mathcal W} \newcommand{\CX}{\mathcal X} \newcommand{\CY}{\mathcal Y} \newcommand{\veca}{\mathbf a} \newcommand{\vecb}{\mathbf b} \newcommand{\vecc}{\mathbf c} \newcommand{\vecd}{\mathbf d} \newcommand{\vece}{\mathbf e} 
\newcommand{\vecf}{\mathbf f} \newcommand{\vecg}{\mathbf g} \newcommand{\vech}{\mathbf h} \newcommand{\veci}{\mathbf i} \newcommand{\vecj}{\mathbf j} \newcommand{\veck}{\mathbf k} \newcommand{\vecl}{\mathbf l} \newcommand{\vecm}{\mathbf m} \newcommand{\vecn}{\mathbf n} \newcommand{\veco}{\mathbf o} \newcommand{\vecp}{\mathbf p} \newcommand{\vecq}{\mathbf q} \newcommand{\vecr}{\mathbf r} \newcommand{\vecs}{\mathbf s} \newcommand{\vect}{\mathbf t} \newcommand{\vecu}{\mathbf u} \newcommand{\vecv}{\mathbf v} \newcommand{\vecw}{\mathbf w} \newcommand{\vecx}{\mathbf x} \newcommand{\vecy}{\mathbf y} \newcommand{\vecz}{\mathbf z} \newcommand{\veczero}{\mathbf 0} \newcommand{\vecone}{\mathbf 1} % math \renewcommand{\tilde}{\widetilde} \renewcommand{\hat}{\widehat} \newcommand{\eqdef}{\;\stackrel{\small \text{def}}{=}} \newcommand{\tpose}{\top} % probability \newcommand{\Exp}{\mathop{\mathbb E}} % complexity \newcommand{\P}{\mathbf P} \newcommand{\NP}{\mathbf{NP}} \newcommand{\NC}{\mathbf{NC}} \newcommand{\AC}{\mathbf{AC}} % algebra \newcommand{\im}{\operatorname{im}} \newcommand{\dist}{\operatorname{dist}} \newcommand{\wt}{\operatorname{wt}} $$
$$ \newcommand{\MCSP}{\textsf{MCSP}} \newcommand{\MKTP}{\textsf{MKTP}} \newcommand{\NISZK}{\textbf{NISZK}} \newcommand{\NISZKL}{\textbf{NISZK}_{\textbf{L}}} \newcommand{\Parity}{\textsf{Parity}} \newcommand{\Maj}{\textsf{Majority}} $$

July 2022 update: Below is the blog I kept for my project on the minimum circuit size problem at the DIMACS REU at Rutgers over Summer 2021, advised by Eric Allender, and in collaboration with Vishal Ramesh and Sasha Sami. The original project page is hosted on the REU site. The project ran from May 23 to July 24, 2021. In this version, I’ve slightly changed the MathJax formatting and reformatted the references. I should mention that last fall, I realized that the goal I mention below of proving that Ilango’s reduction [7] can be implemented in \(\AC^0\) turns out not to be as interesting as hoped, because that result depends on \(\NP\)-hardness-of-approximation results for DNF minimization, which even in their weakest forms (i.e., [3]) do not have known unconditional \(\AC^0\)-hardness analogues. See my Theory StackExchange question on \(\AC^0\)-hardness of “gapped optimization” problems.

Week 1

This week we began our investigations into the complexities of \(\MCSP\) and \(\MKTP\) (a version of \(\MCSP\) which models computation via Turing machines instead of circuits) and discussed potential research directions. I read the papers [10], [4], and [1] below and participated in several joint meetings with Prof. Allender, Sasha, Vishal, and Harsha (one of Prof. Allender’s graduate students, who will also be part of the team this summer). Personally, I focused on trying to assemble a coherent picture of the “landscape” of the complexity of \(\MCSP\). Most results we are studying are of the form “Problem \(P\) does/does not reduce to \(\MCSP\) variant \(V\) under \(\CC\)-type reductions”, and we have little understanding of how robust these results are when we slightly vary the definitions. Notions of reduction we’ve discussed include \(\AC^0\) (both uniform and non-uniform variants), \(\NC^0\) (we typically get hardness for \(\NC^0\) reductions “for free” from hardness for \(\AC^0\) reductions), local reductions (i.e., those such that individual output bits can be computed in very small time and/or space), and projections, in which each output bit is an input literal or a constant (equivalently, circuits built only from wires, \(\mathsf{Not}\) gates, and constants). Surprisingly, most natural \(\NP\)-complete problems (e.g., \(\Sat\)) are \(\NP\)-complete even under projections, whereas [10] shows that \(\MCSP\) cannot be \(\NP\)-complete under reductions which can calculate an individual output bit in \(O(n^{1/2-\epsilon})\) time.

We have decided to (at least initially) focus on last year’s REU paper [1], which shows that \(\MKTP\) is hard for a complexity class called \(\NISZKL\) (non-interactive statistical zero-knowledge proofs with log-space verifiers) under uniform projections. This is a result which is not yet known to hold for \(\MCSP\) and we are thinking about why. Also, we are preparing for our introductory presentation next week.

Week 2

On Tuesday we gave our introductory presentation to the rest of the REU students. It was pretty amazing to see the variety of different projects people are working on (lots of biology!).

Getting back to complexity, this week we continued studying last year’s REU paper [1] and tried to identify the limitations which prevent us from extending its results to \(\MCSP\). [1] uses ideas from an earlier paper [2] which bounds the KT complexity of independent samples drawn from a distribution in terms of its entropy. Roughly, they show that if \(t\) is a large polynomial, then with high probability the KT complexity of \(t\) samples from a random variable \(X\) is close to \(tH(X)\), where the deviation is a polynomial smaller than \(t\). This allows the construction of a simple projection from the “entropy approximation” promise problem (distinguishing circuits whose output distribution on a uniform input has entropy at most \(k\) from those with entropy at least \(k+1\)), which is complete for \(\NISZK\). Unfortunately, the analysis for KT complexity relies on the fact that we have very tight bounds for the KT complexity of the “hardest” string \(x \in \{0,1\}^n\) (it is \(\approx n\)), and we lack comparably tight bounds for the circuit complexity of the hardest function.
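To build intuition for the entropy side of this bound (not the KT-complexity machinery itself), here is a toy numerical sketch of the underlying concentration phenomenon: the self-information of \(t\) independent samples concentrates around \(tH(X)\). The parameter choices below are our own illustration.

```python
import math
import random

def entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) source."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

random.seed(0)
p, t = 0.3, 100_000
samples = [1 if random.random() < p else 0 for _ in range(t)]

# Self-information of the whole sequence: -log2 Pr[x_1, ..., x_t].
info = sum(-math.log2(p if b else 1 - p) for b in samples)

# Concentration: info is t*H(X) plus lower-order fluctuations,
# so info/t should be very close to H(X).
print(info / t, entropy(p))
```

Of course, the actual argument in [2] must relate this ideal description length to KT complexity, which is where the tight bounds for hardest strings come in.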

Given this, we read another paper [6] which manages to prove \(\AC^0[p]\) lower bounds against \(\MCSP\) — a result which was previously known for \(\MKTP\). We are thinking about how this result circumvented the issues with lack of tight bounds for circuit complexity.

Week 3

This week we continued to examine Golovnev et al.’s paper [6], which shows that \(\MCSP\) is hard for \(\AC^0[p]\) circuits. The “meat” of the paper is a reduction through the coin problem: the distributional problem of distinguishing strings of unbiased coin flips from strings of slightly biased coin flips. This problem is known to be hard for \(\AC^0[p]\), so the paper constructs a reduction from the coin problem to \(\MCSP\), which ultimately uses the fact that random functions are substantially more complex than random biased functions.
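As a sanity check on why \(\Maj\) solves the coin problem, here is a small exact computation (our own illustration, with arbitrary parameters): for odd \(n\), a majority vote over \(n\) fair coins accepts with probability exactly \(1/2\), while over \((1/2+\delta)\)-biased coins it accepts with probability approaching \(1\) once \(\delta\) is large relative to \(1/\sqrt{n}\).

```python
from math import comb

def maj_accept_prob(n, p):
    """Pr[more than half of n independent p-biased coin flips are heads]."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

n, delta = 501, 0.05
fair = maj_accept_prob(n, 0.5)            # exactly 1/2 for odd n
biased = maj_accept_prob(n, 0.5 + delta)  # near 1, since delta >> 1/sqrt(n)
print(fair, biased)
```

The gap between the two acceptance probabilities is exactly the kind of advantage that an \(\AC^0[p]\) circuit provably cannot achieve, which is what the reduction exploits.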

We began discussing whether it may be possible to make further use of this reduction to build a simple \(m\)-reduction from other problems of interest (e.g., \(\Maj\) or \(\Parity\)) to \(\MCSP\). There is a known reduction from \(\Maj\) to the coin problem due to Shaltiel and Viola [11] (it is cited in [6]), but this reduction is not necessarily an \(m\)-reduction.

Week 4

This week began with a continued examination of the idea to extend the Shaltiel and Viola reduction [11] to give an \(m\)-reduction from some problem of interest to \(\MCSP\). However, we are lacking a crucial ingredient: a “gadget” that lets us take the \(\And\) or \(\Or\) of two \(\MCSP\) instances (even simple ones, with constant thresholds, created by projection, etc.). We considered instead trying to build a reduction to the minimum formula size problem (MFSP), on the theory that we could construct such gadgets. (Formulas are circuits in which every gate has fan-out 1, so they cannot reuse computations. This makes it easier to reason about the complexity of various combinations of functions.)

Along these lines, we first looked at §1.4 in Jukna’s book [8] which discusses the best known bounds for the formula sizes of the hardest \(n\)-bit Boolean function. These are even weaker than those which are known for circuit size, and so the reduction of [2] does not seem to go through directly. (If it did, it would supersede any attempt to directly show that \(\mathbf{TC}^0\) problems reduce to MFSP.) Next, I looked at §10.2 of Wegener’s book [12] which discusses the formula complexity of combinations of functions. Indeed, given Boolean functions \(f,g : \{0,1\}^n \to \{0,1\}\), it is easy to show that \(L(f \vee g) = L(f) + L(g) \pm 1\), where \(L(\cdot)\) denotes formula size and \(f \vee g\) is the \(2n\)-bit function mapping \(x,y\) to \(f(x) \vee g(y)\). However, this “gadget” does not suffice for our purposes, since it increases the size of truth tables quadratically, which means that it could only be applied a constant number of times.
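A toy illustration (our own code, not from [12]) of why the truth-table blowup is the obstacle: combining \(f\) and \(g\) on disjoint variable sets multiplies the truth-table lengths rather than adding them, since the combined function has \(2^{n+m}\) inputs.

```python
def or_combine(tt_f, tt_g):
    """Truth table of h(x, y) = f(x) OR g(y) on disjoint variable sets.

    tt_f and tt_g are 0/1 lists of lengths 2^n and 2^m; the result has
    length 2^(n+m), so table sizes multiply instead of adding.
    """
    return [fx | gy for fx in tt_f for gy in tt_g]

tt_f = [0, 1, 1, 0]   # XOR on 2 variables (4 entries)
tt_g = [0, 0, 0, 1]   # AND on 2 variables (4 entries)
tt_h = or_combine(tt_f, tt_g)
print(len(tt_h))      # 16 = 4 * 4: quadratic blowup when n = m
```

Since each application squares the instance size, a polynomial-size reduction can afford only constantly many applications of the gadget.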

Given this issue, we decided to focus on a different problem; specifically, we will study the recent papers of Fu [5] and Ilango [7] which prove hardness, or barriers towards proving hardness, for some seemingly related variants of circuit minimization.

Week 5

This week, I studied Ilango’s paper [7] in more detail, attempting to understand the techniques he uses to show that the depth-\(d\) formula size problem is \(\NP\)-complete under quasipolynomial-time randomized reductions. The heart of his reduction is a depth-reduction step for the variant of the problem in which the top gate is an \(\Or\). I attempted to understand how simply this reduction could be implemented (in particular, can it be implemented as a quasipolynomial-size, nonuniform (or randomized?) \(\AC^0\) reduction with oracle gates?). The current construction uses some \(\Maj\) operations, but we have some ideas as to how to work around this.

I also had some insights this week with regard to our earlier discussions on showing hardness of \(\MCSP\) for \(\Maj\). Using the framework of Golovnev et al. [6] for analyzing the circuit complexity of random strings, I believe that I was able to construct an \(m\)-projection from \(\Maj\) to \(\MCSP\). It has both a “randomized” and “nonuniform” mode. One error in the argument for nonuniformity came up during our Thursday group meeting, but it appears to be fixable by slightly adjusting some of the concentration bounds that we use. Sasha and I also had several discussions over email which helped clarify the construction.

I have further ideas which may enable projections from \(\Parity\), or even maybe all of \(\mathbf{TC}^0\), to \(\MCSP\). I plan to explore these ideas over the weekend/next week, in addition to continuing to study the Ilango paper [7].

Week 6

Eric was traveling this week and so our discussions were solely conducted over email. This week I mostly continued focusing on the papers of Ilango [7] and Fu [5], while still thinking through the projection I constructed last week.

A formula is \(\Or\)-top if the final output gate is an \(\Or\). The reduction in the Ilango paper [7] consists of several parts:

  1. A reduction from depth-\(d\) formula complexity to depth-\(d\) \(\Or\)-top formula complexity.
  2. A reduction from depth-\(d\) \(\Or\)-top formula complexity to depth-\((d-1)\) \(\Or\)-top formula complexity (for \(d \geq 2\)).
  3. A reduction from depth-\(2\) \(\Or\)-top formula complexity to some \(\NP\)-complete problem.

The depth-\(2\) \(\Or\)-top formulas in (3) are simply DNFs.

In order for us to understand how simply Ilango’s reduction can be implemented, we have to analyze each of these three components separately. (2) is the “technical heart” of the paper; (1) is relatively simple and follows from the existence of explicit functions with large gaps between depth-\(d\) and depth-\((d-1)\) formula complexity; and (3) is based on earlier work using PCPs. I hope to “drill down” next week into the details of how each might be implemented.

Week 7

Eric was again out of town this week, so we had our first video meeting on Friday.

Unfortunately, in the middle of the week, I realized that my earlier idea for a \(\Parity\)-to-\(\MCSP\) reduction was wrong. I had planned to combine several randomized projections from \(\mathsf{Wt}_i\) (the indicator function for whether a Boolean string has Hamming weight exactly \(i\)) to \(\MCSP\) via \(\Xor\) gates. Each projection would be designed to yield strings of independently sampled bits, where the bias of the bits is slightly different in the \(\mathbf{Yes}\) and \(\mathbf{No}\) cases. However, I forgot that the loss in distinguishability under the \(\Xor\) of random bits is multiplicative; hence the gap in bias between the \(\mathbf{Yes}\) and \(\mathbf{No}\) cases for \(\Parity\) would be too small to be detected using \(\MCSP\) (or indeed any function: we’d require an exponential number of samples to notice a difference when the gap in biases is exponentially small!). Still, we retain the reduction from \(\Maj\) to \(\MCSP\).
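The multiplicative loss here is the standard “piling-up” phenomenon: if each bit independently equals \(1\) with probability \(1/2+\epsilon\), the \(\Xor\) of \(k\) such bits has advantage only \(2^{k-1}\epsilon^k\) over a fair coin, because the correlations \(1-2p_i\) multiply. A brute-force check (our own illustration):

```python
from itertools import product

def xor_one_prob(ps):
    """Exact Pr[x_1 xor ... xor x_k = 1] for independent bits with
    Pr[x_i = 1] = ps[i], by enumerating all 2^k outcomes."""
    total = 0.0
    for bits in product([0, 1], repeat=len(ps)):
        pr = 1.0
        for b, p in zip(bits, ps):
            pr *= p if b else 1 - p
        if sum(bits) % 2 == 1:
            total += pr
    return total

eps = 0.01
for k in [1, 2, 4, 8]:
    adv = abs(xor_one_prob([0.5 + eps] * k) - 0.5)
    # Advantage decays multiplicatively: |adv| = 2^(k-1) * eps^k.
    print(k, adv, 2 ** (k - 1) * eps ** k)
```

Once \(k\) is polynomially large and \(\epsilon\) is small, this advantage is exponentially small, which is exactly why the planned combination of projections fails.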

Given this, in our meeting with Eric on Friday we mostly discussed the Ilango and Fu papers [7], [5]. I read the relevant sections of [5] in depth on Wednesday and Thursday, and we discussed why we may be able to show that depth-\(d\) formula complexity cannot be hard for \(\NP\) (indeed, even \(\mathsf{ZPP}\)) under quasipolynomial-size \(\AC^0\) reductions unless \(\mathsf{ZPP}=\mathsf{EXP}\); the key observation is that in padding arguments, it is typically okay to relax from polynomial to quasipolynomial size bounds.

On Friday we also discussed aspects of the plan to implement Ilango’s reduction [7] as a uniform, quasipolynomial-size \(\AC^0\) reduction. As I wrote above in Week 6, Ilango’s reduction has three parts. Parts 1 and 2 seem very plausibly implementable in \(\AC^0\), although some work will be required to derandomize Part 2. Part 3, on the other hand, seems trickier. Recall that Part 3 is the \(\NP\)-hardness-of-approximation result for minimizing DNF formula complexity. Ilango [7] uses a hardness result due to Khot and Saket [9]; their reduction, which involves PCPs, seems too sophisticated for our purposes. But to make the proof work we only need hardness of \(O(1)\)-approximation, while [9] proves hardness of \(O(n^{1-\epsilon})\)-approximation for every \(\epsilon>0\). Hence we will read another hardness-of-approximation result for DNF minimization, due to Allender, Hellerstein, McCabe, Pitassi, and Saks [3], which seems to use much simpler tools.

Week 8

This week we looked at the Allender et al. paper [3] which proves hardness of approximation for DNF minimization. One key ingredient in this paper is an (approximation-preserving) reduction to DNF minimization from partial DNF minimization (i.e., finding the smallest DNF formula consistent with a partially-specified truth table). This reduction uses \(\Parity\) as a gadget to guarantee the presence of certain clauses in a minimal DNF, but we believe it may be possible to use some other gadget.

We also discussed another possible problem: trying to better understand the class \(\NISZKL\). Eric conjectures that it is equal to \(\mathbf{NISZK_{\NC^0}}\), which seems plausible, since we know that entropy approximation for \(\NC^0\) circuits is a complete problem. However, the natural protocol for this problem (see e.g., [1]) requires 2-universal hash functions, which do not appear to be implementable in \(\NC^0\).
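For concreteness, a standard 2-universal family is the set of affine maps \(x \mapsto Ax + b\) over \(\mathrm{GF}(2)\); each output bit of \(Ax+b\) can depend on all \(n\) input bits, which is why no constant-locality (\(\NC^0\)) implementation is apparent. A brute-force check of the collision property on a tiny example (our own illustration, not the protocol of [1]):

```python
from itertools import product

def affine_hash(A, b, x):
    """h(x) = Ax + b over GF(2); A is an m x n bit matrix, b in {0,1}^m."""
    return tuple((sum(A[i][j] & x[j] for j in range(len(x))) + b[i]) % 2
                 for i in range(len(b)))

n = m = 2
x, y = (0, 1), (1, 0)   # any fixed pair of distinct inputs
collisions = total = 0
# Enumerate every matrix A and offset b, i.e. the entire hash family.
for bits in product([0, 1], repeat=m * n + m):
    A = [list(bits[i * n:(i + 1) * n]) for i in range(m)]
    b = list(bits[m * n:])
    total += 1
    if affine_hash(A, b, x) == affine_hash(A, b, y):
        collisions += 1
print(collisions, total)  # collision probability is exactly 2^-m = 1/4
```

The 2-universal property is that this collision probability is \(2^{-m}\) for every fixed pair \(x \neq y\), which the enumeration confirms for this pair.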

Finally, on Tuesday there was also a very interesting graduate school panel which we got to attend.

Week 9

I spent most of this week working on our final writeup for the REU, which is due Friday, as well as our final presentation. Unfortunately, in the course of writing up our work, we discovered several issues in our proofs that we hadn’t caught earlier in the summer. Firstly, our \(\Maj\)-to-\(\MCSP\) reduction relied on some assumptions about the monotonicity (as a function of \(p\)) of the expected circuit complexity of \(p\)-biased random functions; we contacted Rahul Ilango about this issue, and he confirmed that this problem is actually still open. Moreover, in the course of writing up our version of Fu [5]’s proof that \(\NP\)-hardness of \(\MCSP\) and its variants under adaptive reductions implies new (and likely very difficult) lower bounds, Sasha discovered that we don’t actually seem to be able to extend it to the constant-depth formula case. Finally, Eric now thinks that pieces of the Ilango reduction [7] cannot be derandomized. Regardless, our goal is to carefully write up everything we have and note where we rely on unproven assumptions.


[1] Eric Allender, John Gouwar, Shuichi Hirahara, and Caleb Robelle. "Cryptographic hardness under projections for time-bounded Kolmogorov complexity".
[2] Eric Allender, Joshua A. Grochow, Dieter van Melkebeek, Cristopher Moore, and Andrew Morgan. "Minimum circuit size, graph isomorphism, and related problems". In: SIAM Journal on Computing 47.4 (2018), pp. 1339–1372. DOI: 10.1137/17M1157970.
[3] Eric Allender, Lisa Hellerstein, Paul McCabe, Toniann Pitassi, and Michael Saks. "Minimizing disjunctive normal form formulas and \(\mathsf{AC}^0\) circuits given a truth table". In: SIAM Journal on Computing 38.1 (2008), pp. 63-84. DOI: 10.1137/060664537.
[4] Eric Allender, Rahul Ilango, and Neekon Vafa. "The non-hardness of approximating circuit size". In: Theory of Computing Systems 65.3 (2021), pp. 559–578. DOI: 10.1007/s00224-020-10004-x.
[5] Bin Fu. "Hardness of sparse sets and minimal circuit size problem". In: COCOON 2020. DOI: 10.1007/978-3-030-58150-3_39.
[6] Alexander Golovnev, Rahul Ilango, Russell Impagliazzo, Valentine Kabanets, Antonina Kolokolova, and Avishay Tal. "\(\mathsf{AC}^0[p]\) lower bounds against MCSP via the coin problem". In: ICALP 2019. DOI: 10.4230/LIPIcs.ICALP.2019.66.
[7] Rahul Ilango. "Constant depth formula and partial function versions of MCSP are hard". In: FOCS 2020. DOI: 10.1109/FOCS46700.2020.00047.
[8] Stasys Jukna. Boolean Function Complexity: Advances and Frontiers. Springer, 2012.
[9] Subhash Khot and Rishi Saket. "Hardness of minimizing and learning DNF expressions". In: FOCS 2008. DOI: 10.1109/FOCS.2008.37.
[10] Cody D. Murray and R. Ryan Williams. "On the (non) NP-hardness of computing circuit complexity". In: Theory of Computing 13.4 (2017), pp. 1–22. DOI: 10.4086/toc.2017.v013a004.
[11] Ronen Shaltiel and Emanuele Viola. "Hardness amplification proofs require majority". In: SIAM Journal on Computing 39.7 (2010), pp. 3122-3154. DOI: 10.1145/1374376.1374461.
[12] Ingo Wegener. The Complexity of Boolean Functions. John Wiley & Sons, Ltd., and B. G. Teubner, Stuttgart, 1987.