Exponential Mechanisms  What's Wrong with Sensitivity?
A private mechanism is often designed as a slight perturbation of what is truly wanted. This is straightforward if — just add Gaussian (or your favorite) noise. What if is a finite, unstructured set? The classical tool is exponential mechanism. Since is no longer equipped with a metric, we need an additional structure: the quality score . The score measures how good the item fits dataset . The better fits the higher is. Traditionally, the sensitivity of the quality score plays a crucial role in the theory.
Definition A randomized algorithm is called the exponential mechanism with quality score and parameter , if the outcome is sampled with probability proportional to , i.e.
Theorem 0 [McSherryTalwar ‘07] is DP.
Neat, huh? Here’s an example. Let and . Clearly and . On the other hand, you can easily verify that they lead to the same mechanism, i.e. . Intuitively, and are not that different – they just start with different grounds. Two items and look equally different from the perspective of and , because the difference of the quality scores is independent of the variable .
So even though Alice and Bob are running exactly the same algorithm, they may end up with arbitrarily different privacy conclusions using Theorem 0. Mathematically this already makes no sense. On the practice side, Bob can severely underestimate the privacy that he already has. In that unfortunate case, he either reports very weak privacy guarantee, or increases the noise level and ruins the accuracy.
To the rescue, you may try to find the least sensitive quality score that leads to the same mechanism. A first guess could be that is always the least sensitive score that realizes a given mechanism , but it’s not true.
Root of All Evil
In analogy to sensitivity, we define a quantity called range to resolve the problem.
\begin{align}
\mathrm{sens}(q)&:= \sup_{x\sim x’} \max_{y} q(x’,y)q(x,y)\\
\mathrm{range}(q)&:= \sup_{x\sim x’} (\max_y  \min_y) [q(x’,y)q(x,y)].
\end{align}
Here is the new privacy theorem in terms of range. We also put Theorem 0 above it for ease of comparison.
Theorem 0 [McT ‘07] is DP.
Theorem 1 [DDR ‘19] is DP.
Unlike its predecessor, this theorem concludes privacy about the mechanism instead of the quality score, because
Proposition If then .
The new theorem is better not only semantically but also quantitatively.
 .
 if is monotone, i.e. if ^{1}. In particular, counting queries are monotone, with .
In summary, Theorem 1 is never worse than Theorem 0, and two times better for a very general class of queries.
Practical Implications
The reasoning above was motivated by a practical question, which is discussed by my collaborator Ryan in detail here. Basically LinkedIn wants to answer queries from advertisers like “What are the most popular articles in the last 30 days among data scientists working in the San Francisco Bay area?” And yes, they decide to use differential privacy. These “top ” queries often come in batch^{2}, and are most naturally answered by repeated use of exponential mechanisms (with twists that we ignore here). Accumulative effect on privacy, as usual, is handled by composition theorems.
So we need to analyze composition of exponential mechanisms. Before our work, the best analysis would use Theorem 0 and the optimal DP composition theorem (Opt DP Compo) from [KOV ‘15]. Our improvements come in two ways: 1) use range instead of sensitivity, 2) with the additional knowledge that each component is an exponential mechanism, it’s possible to improve on “Opt DP Compo.” In fact, we proved optimal composition theorems for exponential mechanisms (Opt Exp Comp).
Counting queries are the most common, so let’s assume that. For the same algorithm, i.e. fold composition of exponential mechanisms with parameter , the two analyses yield
\begin{align}
\text{Thm 0 + Opt DP Compo} &\Rightarrow \,\, \big(\varepsilon_{\mathrm{g}},\delta = \delta_{\mathrm{DP}}(\varepsilon,k,\varepsilon_{\mathrm{g}})\big)\text{DP}\\
\text{Thm 1 + Opt Exp Compo} &\Rightarrow \,\, \big(\varepsilon_{\mathrm{g}},\delta = \delta_{\mathrm{Exp}}(\varepsilon,k,\varepsilon_{\mathrm{g}})\big)\text{DP}
\end{align}
You can find the expressions of and here and here, but I always find it challenging to understand their asymptotic behaviors. Here is something you can do if you are ok with asymptotics ( as it should, and ) and Gaussian Differential Privacy (GDP, see an introduction here).
\begin{align}
\text{Thm 0 + DP CLT} &\Rightarrow 2\varepsilon\sqrt{k}\text{GDP}\\
\text{Thm 1 + Exp CLT} &\Rightarrow \tfrac{1}{2}\varepsilon \sqrt{k}\text{GDP}
\end{align}
These asymptotic results make it clear that finegrained analysis can buy a factor of 4 in terms of privacy^{3}. If we want to reach the same level of final privacy, then the improved (in fact, optimal) analysis allows us to answer 16 more queries. This can be crucial for DP at LinkedIn: advertisers are probably ok with 16 queries per day, but definitely not with 1 query per day.
Conclusion
 For exponential mechanisms, forget sensitivity. Use range.
 You can buy a factor of 4 in privacy parameters from classical analysis in composition of exponential mechanisms.
To find out more, check out this paper and its predecessor.
Notes

This relies on the neighboring model being add/remove. Notation in the post is for replace model for simplicity. ↩

In reality, queries come adaptively, i.e. the next query comes after the previous is answered. To keep the presentation accessible we assume the batch setting. ↩

Part of the achievement here comes from the previous paper of my collaborators David Durfee and Ryan Rogers. Comparison to this result is necessarily less accessible because of the complicated formula involved, which is why I choose to compare to the classical analysis in this post. ↩