In many ways this is was is required by the Bayesian paradigm, namely to find a prior that "encodes" your subjective belief about the parameter.
Example let's again the the coin example from before. Here is a very different, but probably more realistic, prior: Before we flip the coin we reason as follows: either the coin is fair, and we think that is most likely the case, or if it is not, we don't have any idea what it might be. We can "encode" this belief in the following prior:
Let d1/2 be the point mass at 1/2, that is a random variable which always takes the value 1/2, or P(d1/2 = 1/2) = 1. Let U~U[0,1] and let Z~Ber(α). Now p = Zd1/2 + (1-Z)U
So with probability α p is just 1/2 (and the coin is fair), and with probability 1-α p is uniform on [0,1] (and we don't have any idea what's going on).
This prior is a mixture of a discrete and a continuous r.v. so we will need to be a bit careful with the calculations.
The cdf and the prior "density" are as follows

because if x≠½ the cdf is a straight line with slope 1-α and P(p=½)=α+½(1-α)-½(1-α)=α
see bayescoin(5)
The joint density of p and y is:
and the marginal distribution of y:
because the uniform is also a Beta(1,1) and we can use the result above.
Now we find the posterior distribution of p given y:
Finally as above we can find the mean of the posterior distribution to get an estimate of p for a given y:

in bayescoin with which==3 we study the effect of the sample size on the estimate of p.
in bayescoin with which==4 we study the effect of alpha on the estimate of p. A larger alpha means a prior that puts more weight on the coin being fair.
Generally finding a good subjective prior can be done using utility theory.
these are priors which together with the sampling distribution lead to a posterior distribution of the same type as the prior. We already saw one example:

Here are two more:
where τ and θ are assumed to be known.
Example X1, .., Xn~Ber(p), then p~U[0,1] is a non-informative prior.
Example X1, .., Xn~N(μ,1) now μ can be any real number, so a prior has to be a density on the whole real line. But π(μ)=c is not possible because it integrates out to infinity for any c>0!
There are two solutions to this problem:
a) Allow improper priors, that is priors with an infinite integral. This is generally ok as long as the posterior is a proper density:
One justification for this is that we usually can express the improper prior as the limit of a sequence of proper priors: we saw already that if X~N(μ,σ) and μ~N(τ,θ), then μ|X=x is again normal. Now
b) A second solution is to think a little more about what "non-informative" really means, Say we have X1, .., Xn iid N(μ,σ) and we want to estimate σ. At first it seems we should use p(σ)=1, σ>0 It turns out, though, that this is not really "competely non-informative" because of the following: say we estimate σ=2.7, then there is small interval (0,2.7) "below" our estimate but a very large interval "(2.7, ∞) "above" it. There is a class of priors that were developed explicitely to express this idea of "complete lack of knowledge" called Jeffrey's priors:
Let's say X1, .., Xn~f(x|θ). Jeffrey considered one-to-one transformations of the parameter δ=u(θ). From our discussions in probability we know that the prior density of δ is given by
f(δ)=f(θ)|u-1(θ)|'
Jeffrey's argued that true lack of knowledge means that any transformation of the parameter should yield the equivalent result. Essentially it should not matter whether we measure temperature in Fahrenheit or Centigrade. He showed that this implies that the prior should be proportional to the square root of the Fisher information:
Example say X1, .., Xn~Pois(λ) and λ~Gamma(α,β), then we know that λ|X=x~Gamma(α+∑xi, β+n). But how do we choose α and β? In a subjective Bayesian analysis we would need to use prior knowledge to estimate them. The idea of empirical Bayes is to use the data itself to estimate the "hyper-parameters" α and β. For example, we know that the mean of a Gamma(α,β) is αβ and the variance is αβ2. So we choose α and β as the solutions of the system of non-linear equations
=αβ
s2=αβ2
or
s2=αβ·β=
·β
so β=s2/
and α=
/β=
2/s2
One can go even further and use the empirical distribution function as the cdf of the prior itself. In many ways, though, empirical Bayes goes against the spirit of Bayesian analysis.
These are just some of the methods for finding priors, the are many others.