I can't find the answer through googling, so here goes: I would think that attacking (finding a pre-image of) a hash digest would be NP-complete in one sense: it's easy to verify a hash inversion (putting the problem in NP), but difficult to find that pre-image. This is because, for a digest of n bits, it seems to take on the order of 2^n guesses to invert, making it as hard as other NP-complete problems.
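The "easy to verify" half of that intuition is easy to see with a real hash function; a minimal sketch in Python (the function name `verify_preimage` is just an illustration, not a standard API):

```python
import hashlib

def verify_preimage(candidate: bytes, target_digest_hex: str) -> bool:
    """Check a claimed pre-image in a single hash evaluation (polynomial time)."""
    return hashlib.sha256(candidate).hexdigest() == target_digest_hex

# Verifying costs one hash call; *finding* a pre-image for an arbitrary
# digest is what is believed to require on the order of 2^n work
# for an n-bit digest.
target = hashlib.sha256(b"hello").hexdigest()
print(verify_preimage(b"hello", target))   # True
print(verify_preimage(b"world", target))   # False
```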
On the other hand, I never see it in lists of NP-complete problems:
There is the possibility that it's NP-hard, which (I think) would mean that NP-complete problems can be reduced to it, but not vice versa (i.e., an oracle for hash inversion would speed up the problem of solving NP-complete problems but not vice versa), as suggested by this Euler diagram (focusing on the P != NP case here):
I realize that any specific cryptographic hash function has a fixed-size search space, even if it is infeasibly large (and across unbroken cryptographic hash functions in general, the inversion difficulty varies exponentially with digest length). So I suggest the following revision to my question to better capture what I'm asking:
1) Say there's a family of cryptographic hash functions for which you can set a parameter that determines the digest length n, which can be arbitrarily large. In that case, it's clear that the difficulty of hash inversion scales exponentially with n, which (I think) makes it at least as hard as the hardest problems in NP.
In that case, would there be a reduction (in any direction) between (variable-digest-length) hash inversion and NP-complete problems?
2) Alternatively, make the comparison between a specific cryptographic hash function and a fixed-length version of an NP-complete problem. Is there a reduction (in either direction) between the two problems? If you had an oracle for SHA-256 inversion, for instance, would that help you solve a version of the 3-SAT problem with an upper-bounded number of clauses (for some upper bound)?
This is a commonly misunderstood area of hash functions.
Hash functions, such as SHA-1, are fixed algorithms - there is no key involved. On a given input, they return an output deterministically. So NP-hardness questions break down really quickly, because there is no security parameter that can give asymptotic behavior.
So, if you are looking at this from formal complexity theory, there exists a simple, constant-time algorithm that inverts most inputs for any known hash function. That algorithm has hard-coded access to a table of the inverted hash function - and such a table can be computed in "constant" time, e.g. 2^256 steps, which counts as a constant precisely because the digest length is fixed - and it just looks up answers in its table.
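The lookup-table argument can be demonstrated on a deliberately tiny toy "hash" (SHA-256 truncated to 16 bits - a hypothetical illustration, not a real hash function): building the table takes time exponential in the digest length, but since that length is fixed, the cost is a constant, and inversion afterwards is a single dictionary lookup.

```python
import hashlib

def toy_hash(x: bytes) -> str:
    # Toy 16-bit "hash": first 4 hex chars of SHA-256 (illustrative only).
    return hashlib.sha256(x).hexdigest()[:4]

# Precompute an inversion table over a chunk of the input space. For a
# real 256-bit hash the analogous table would take ~2^256 steps to
# build, but 2^256 is still a *constant* in complexity-theoretic terms.
table = {}
for i in range(2**17):
    m = i.to_bytes(3, "big")
    table.setdefault(toy_hash(m), m)

# Now "inversion" is a constant-time lookup:
digest = toy_hash(b"\x00\x01\x02")
preimage = table[digest]
assert toy_hash(preimage) == digest
```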
There are a few holes in there, since we don't know whether hashing 2^256 inputs actually covers the full range of the hash function. But if we change the argument to collisions, it holds up very well. Collisions exist in any hash function whose input space is larger than its output space, by the pigeonhole principle, and the constant-time algorithm just prints out one such hard-coded collision.
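For the same toy 16-bit truncation (again a hypothetical illustration, not a real hash), a collision can actually be found by brute force in a fraction of a second - which is exactly the kind of object the pigeonhole principle guarantees exists for full-size hashes, even though nobody can feasibly search for one there:

```python
import hashlib

def toy_hash(x: bytes) -> str:
    # Toy 16-bit "hash": first 4 hex chars of SHA-256 (illustrative only).
    return hashlib.sha256(x).hexdigest()[:4]

# Brute-force search: with a 16-bit digest, the pigeonhole principle
# guarantees a collision among any 2^16 + 1 distinct inputs, and the
# birthday bound says we expect one after only ~2^8 tries.
seen = {}
i = 0
while True:
    m = str(i).encode()
    d = toy_hash(m)
    if d in seen:
        a, b = seen[d], m
        break
    seen[d] = m
    i += 1

assert a != b and toy_hash(a) == toy_hash(b)
print(f"collision: {a!r} and {b!r} both hash to {toy_hash(a)}")
```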
----------------
That said, if no one can *find* such a collision-printing algorithm in *practical* time, then we can, and do, rely on hash function security. Phillip Rogaway wrote an interesting paper ("Formalizing Human Ignorance") trying to formalize this notion and put hash functions on more solid theoretical footing: https://docs.google.com/viewer?u...