Abstract:
Abstract: We will consider the fundamental trade-off between the compactness of information representation and the efficiency of information extraction in the context of the set membership problem. In the set membership problem, a set S of size at most $n$ from a universe of size $m$ is to be represented as a short bit vector in order to answer membership queries of the form ``Is x in S?'' by probing the bit vector at $t$ places. The bit-probe complexity $s(m,n,t)$ is the minimum number of bits of storage needed for such a scheme. Improving on the works of Buhrman, Milterson, Radhakrishnan and Venkatesh (2002) and Alon and Feige (2009) we provide better bounds on $s(m,n,t)$ for a range of $m$, $n$ and $t$.
For two probes:
$s(m,n,2) =O(m^{1-\frac{1}{4n+1}})$; this improves on a result of Alon and Feige that states that for $n \leq \lg m$, $s(m,n,2) = O( m n \lg ((\lg m) / n) / \lg m)$
$s(m,n,2)= \Omega(m^{1-\frac{1}{\lfloor n/4 \rfloor}})$; in particular, $s(m,n,2)=\Omega(m)$ for $n \geq 4\lg m$.
For three probes:
$s(m,n,3) = \tilde{O}(\sqrt{mn}).$ This improves a result of Alon and Feige that states that $s(m,n,2)=O(m^{\frac{2}{3}} n^{\frac{1}{3}})$. For $n\approx \lg m$, our general lower bound implies a lower bound of $\Omega(\sqrt{m})$ which comes close to the upper bound.
The best non-adaptive schemes (when probes are made in parallel) for $t\geq 4$ use majority decoding to answer membership queries. For three non-adaptive probes, we show a lower bound for majority decoding schemes: $s(m,n,t)=\Omega(m^{1-\frac 1{cn}})$ for some $c>0$. Lower bounds are also obtained for other functions.
In general: We exhibit non-adaptive and adaptive schemes for odd $t\geq 5$ that use significantly less space than the schemes of Buhrman et al. We also obtain slight improvement over their general lower bound.
(This is joint work with Jaikumar Radhakrishnan.)