I am trying to create a sample of data that I do not have. What I do have is the following:

P25 = 6

Median = 8

P75 = 12

Is there a way to regenerate the data so I can produce a distribution curve?

Hi Denis, welcome to the community!

The problem with your question is that Q1, Q2 and Q3 do not uniquely define a distribution. There are infinitely many distributions in which 25% of values lie below 6, 50% lie below 8 and 75% lie below 12.

If that is not important to you, i.e. you are satisfied with any sample of `n` observations which satisfies these conditions, what you can do is simply this:

- Calculate `m = n / 4`, assuming `n` is a multiple of `4`.
- Generate `m` observations less than 6.
- Generate `m` observations greater than 6 and less than 8.
- Generate `m` observations greater than 8 and less than 12.
- Generate `m` observations greater than 12.
- Combine these 4 sets of `m` observations and it'll satisfy the constraints.

You may wonder how to generate the observations in each case. It doesn't matter how you do it: you can use the same value for all of them, draw from a uniform distribution, or use any other distribution you like.
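Here is a minimal Python sketch of the steps above, using uniform draws within each band. The function name `sample_from_quartiles` and the outer bounds `lower=0` and `upper=20` are my own assumptions: the quartiles say nothing about how far the tails extend, so you would need to pick bounds that make sense for your data.

```python
import random


def sample_from_quartiles(n, q1=6.0, q2=8.0, q3=12.0,
                          lower=0.0, upper=20.0, seed=None):
    """Return n values with a quarter of them in each band cut by q1, q2, q3.

    `lower` and `upper` are assumed bounds for the two outer bands;
    they are not implied by the quartiles themselves.
    """
    if n % 4 != 0:
        raise ValueError("n must be a multiple of 4")
    if not lower < q1 < q2 < q3 < upper:
        raise ValueError("need lower < q1 < q2 < q3 < upper")

    rng = random.Random(seed)
    m = n // 4  # observations per band

    # One (low, high) pair per quarter of the sample.
    bands = [(lower, q1), (q1, q2), (q2, q3), (q3, upper)]

    # Uniform draws inside each band; any other generator that stays
    # inside the band would satisfy the constraints equally well.
    values = [rng.uniform(low, high) for low, high in bands for _ in range(m)]

    rng.shuffle(values)  # remove the band-by-band ordering
    return values


sample = sample_from_quartiles(400, seed=1)
```

You could then pass `sample` to whatever histogram or density function you prefer. Keep in mind that the shape you see, especially in the two outer bands, comes from the assumed bounds and the uniform draws, not from your original data.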

Hope this helps.
