Reverse formulating

baddtude said:
… it doesn't show the "-1" expressed in DrMike's equation …


I'm discussing the exponential function y = a*b^x, not y = a*b^(x-1).

Galactus did it for you by using exponential regression.

galactus said:
I generated an exponential equation of \(\displaystyle y=51.719559\cdot 1.509891^{x}\)


I tried the regression applet at your first referenced site, and it worked for me.

You don't need to understand correlation coefficients to use this applet; just proofread your entries, and make sure that there are no typographical errors.

Enter only xValues and yValues.

xValue = 1 and yValue = 78
xValue = 2 and yValue = 118
xValue = 3 and yValue = 178
xValue = 4 and yValue = 269
et cetera …

NOTE: Since that particular applet does not provide more than 12 fields for (x,y) data points, I spread the points out when I entered the data. In other words, I entered levels 1, 3, 5, … 15, 17, 19.

After you finish entering the data, click the [y=a(b^x)] button.
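For anyone curious what the applet is doing behind the scenes, here is a minimal sketch in Python (assuming numpy is available). It fits a straight line to the logarithms of the y-values, which is one common way such tools compute y = a(b^x); it is an illustration, not necessarily the applet's exact method.

```python
import numpy as np

# Levels and Pig Iron costs from the table posted in this thread.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.array([78, 118, 178, 269, 406, 613, 925, 1397, 2109, 3184], dtype=float)

# For y = a*b^x, taking logs gives ln(y) = ln(a) + x*ln(b), a straight line.
# Fit that line, then undo the logs to recover a and b.
slope, intercept = np.polyfit(x, np.log(y), 1)
a = np.exp(intercept)  # ln(a) is the intercept
b = np.exp(slope)      # ln(b) is the slope

print(a, b)  # should come out close to galactus' 51.7... and 1.509...
```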

baddtude said:
… my biggest obstacle seems to be that I get lost in what I'm actually reading in the far too simple explanations …

… I suspect it will also require a little reading of material that creates the foundations of this particular math …


Yeah, well those simple explanations are not meant to serve as an online classroom.

Most people come to understand these sorts of things only after several weeks of math instruction, in a classroom setting.

Self-instruction is certainly not a crime; nonetheless, you gotta do the time! 8-)

 
I wonder if I'm asking the right question for the answer I want, or if I'm asking the right question, but not comprehending its answer. (Ugh, I feel stupid doing this, but ....)

I am trying to find the formula that will fit my data. I thought DrMike had it. He provided an equation that flawlessly fit the data I provided.
DrMike said:
cost = ceil(78 * 1.51^(level-1))

So, here's where it falls apart for me. I really didn't understand galactus's answer:

galactus said:
I generated an exponential equation of \(\displaystyle y=51.719559\cdot 1.509891^{x}\)

The context was completely lost on me, as, at that point, I didn't know what y represented, and it didn't provide me with a complete answer to what I was seeking. Probably because I didn't ask correctly. Also, when I tried to copy and paste that equation, it appeared as "y=51.719559\cdot 1.509891" when pasted. At first, it confused me more. "What's a 'cdot'?", I thought. And what does the "\" mean? When I read this, I didn't have a whole lot of time to give it, either. It also seemed unnecessary once DrMike provided his formula.

On the site applets, I could get them to work as they are, but they didn't give me the values in the equation except y. So, the equation:

cost = ceil(78 * 1.51^(level-1))

translates in Excel (the spreadsheet I'm trying to use) to:

=ROUNDUP(78*(1.51^(level-1)),0)

level represents the (column, row) coordinates of the level associated with the cost.

When looking at equations to use for regression, y=a(b^x) seems to fit the best.

y = cost
a = 78
b = 1.51
x= level-1

But, not completely. From my table of the cost/level, the only thing I know is y. I don't have the other variables (or in this case, constants) a or b, and the actual variable x; we can call these unknowns. Of course, the applets weren't as accurate as DrMike, partly because they don't give me the "-1" as part of x. From your last response:
mmm4444bot said:
I'm discussing the exponential function y = a*b^x, not y = a*b^(x-1).
I think I can assume there are other equations that aren't part of the applets.

So, back to my original purpose: trying to figure out the formula that matches my information. I do have a couple other tables I want to do the same for and the applets don't come close.

Anyway, I thank you all for your help. I am sorry that I wasn't able to grasp the information you gave me. I had expected a simpler answer, like a formula that would allow me to reverse calculate, and provide me with the formula that I needed to enter into the spreadsheet and, thus, be given the ability to expand my available information.
 
Did you realize that \(\displaystyle ab^x = (ab)b^{x-1}?\)

Without the ceiling function, Dr. Mike's and galactus' answers are identical, save for rounding:

\(\displaystyle 51.72 \cdot 1.51^x = (51.72 \cdot 1.51) \cdot 1.51^{x-1} \approx 78.1 \cdot 1.51^{x-1}\)
 
baddtude said:
… I had expected a simpler answer, like a formula that would allow me to reverse calculate, and provide me with the formula that I needed to enter into the spreadsheet …


No such puppy.

But, we could "reverse" the formula cost = a * b^level to get the inverse formula level = log_b(cost/a) !
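As a quick illustration of that reversal, here is a small Python sketch using DrMike's constants 78 and 1.51 (his exponent carries a "level - 1", so the inverse picks up a "+ 1"; the function names are only for illustration):

```python
import math

a, b = 78, 1.51  # DrMike's fitted constants

def cost(level):
    # cost = ceil(78 * 1.51^(level - 1))
    return math.ceil(a * b ** (level - 1))

def level_from_cost(c):
    # the "reversed" formula: level = log_b(c/a) + 1, rounded to a whole level
    return round(math.log(c / a, b) + 1)

print(cost(5))               # 406, matching the table
print(level_from_cost(406))  # 5, recovering the level from the cost
```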

Post your other data, and I'll give you equations in the form cost = a * b^level.

 
To find the exponential function you could do the following:

Find the slope of the line segment between each adjacent pair of given points.

In calculus (I'm not sure if you've taken it), the derivative of a function gives the instantaneous slope of a curve at any x value. If the data fits an exponential regression, then the slopes should be something close to the derivative. I would use the midpoints between each pair of points (x1, x2), (x2, x3), etc.

The derivative of a function a*b^x is a*ln(b) * b^x.

Let's try the first two points:

x1 = 1, x2 = 2, so let's pick x = 1.5

We need a*ln(b) * b^1.5 = (y2 - y1)/(x2 - x1) = 40

And the next two:

a*ln(b) * b^2.5 = 60

a*ln(b) * b^3.5 = 91

Dividing the second by the first we get b = 1.5
Dividing the third by the second we get b = 1.5166...
... You could continue this for every set of points you have... and you should, as a data set can have outliers.

This will at least give you an IDEA for b; hopefully most of the estimates are close to one another. Say you wish to test your finding that b = 1.51.

Then we would like to see some consistency for a in our data points, right? That would mean our choice was "close."

If ab^1 = 78, then a(1.51)=78. Which means a =51.65
If ab^2 = 118 then we get a(1.51)^2 = 118 => a = 51.75
Seems pretty close to galactus' answer of a~51.72 and b~1.509, doesn't it?

Once you find an a and b that makes the data fit "closest" you can start playing with things like the ceiling and floor functions or whatever else you have in your tool box.

I'm not sure if this is how DrMike or galactus did it though.


edit to add: To find the best b from your given data, you should calculate every potential b as I started above. Throw out any outliers. Then, say you have \(\displaystyle b_1,b_2,...,b_n\) good values; choose \(\displaystyle b = \sqrt[n]{b_1b_2\cdots b_n}\). That will give you the average growth rate of an exponential function (it's the geometric mean).

Of course, it's possible that none of your data seems to fit, or some seems to but most doesn't. That simply means you've probably chosen the wrong regression curve.
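Here is a rough Python sketch of that whole procedure applied to the Pig Iron costs (an illustration of the steps above only: slopes at midpoints, slope ratios for b, the geometric mean, then backing out a; it skips the outlier check for brevity, and it is not necessarily how DrMike or galactus did it):

```python
import math

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [78, 118, 178, 269, 406, 613, 925, 1397, 2109, 3184]

# Slope of each line segment between adjacent points.
slopes = [(y[i + 1] - y[i]) / (x[i + 1] - x[i]) for i in range(len(x) - 1)]

# For f(x) = a*b^x, f'(x) = a*ln(b)*b^x, so slopes taken one unit apart
# have ratio b.  Each adjacent pair of slopes gives an estimate of b.
b_estimates = [slopes[i + 1] / slopes[i] for i in range(len(slopes) - 1)]

# Geometric mean of the estimates (the average growth rate).
b = math.prod(b_estimates) ** (1 / len(b_estimates))

# With b fixed, each data point gives an estimate of a; average those.
a_estimates = [yi / b ** xi for xi, yi in zip(x, y)]
a = sum(a_estimates) / len(a_estimates)

print(a, b)  # should come out near 51.7 and 1.51
```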
 
daon said:
Did you realize that \(\displaystyle ab^x = (ab)b^{x-1}?\)

Without the ceiling function, Dr. Mike's and galactus' answers are identical, save for rounding:

\(\displaystyle 51.72 \cdot 1.51^x = (51.72 \cdot 1.51) \cdot 1.51^{x-1} \approx 78.1 \cdot 1.51^{x-1}\)

Things like this are kind of what I mean. From what I do remember from Algebra (this could be why I had unrealistic expectations), there is usually more than one way to calculate variables. For instance, with a right triangle, a^2 + b^2 = c^2. As long as you have a majority of the variables, you can calculate the unknowns.

mmm4444bot said:


baddtude, are you feigning ignorance (sometimes)? 8-)


8-) I had to laugh a little at this. No, I'm not feigning ignorance. Regardless of my lack of education, I am of reasonable intelligence, and, when in school, was fairly adept in math and algebra. But, I never studied past that (actually never studied at all in school), with the exception of taking a college entry level statistics class, where I was denied the formulas and told to get a scientific calculator.

daon said:
To find the exponential function you could do the following:

Find the slope of the line segment between each adjacent pair of given points.

In calculus (I'm not sure if you've taken it), the derivative of a function gives the instantaneous slope of a curve at any x value. If the data fits an exponential regression, then the slopes should be something close to the derivative. I would use the midpoints between each pair of points (x1, x2), (x2, x3), etc.

The derivative of a function a*b^x is a*ln(b) * b^x.

Let's try the first two points:

x1 = 1, x2 = 2, so let's pick x = 1.5

We need a*ln(b) * b^1.5 = (y2 - y1)/(x2 - x1) = 40

And the next two:

a*ln(b) * b^2.5 = 60

a*ln(b) * b^3.5 = 91

Dividing the second by the first we get b = 1.5
Dividing the third by the second we get b = 1.5166...
... You could continue this for every set of points you have... and you should, as a data set can have outliers.

This will at least give you an IDEA for b; hopefully most of the estimates are close to one another. Say you wish to test your finding that b = 1.51.

Then we would like to see some consistency for a in our data points, right? That would mean our choice was "close."

If ab^1 = 78, then a(1.51)=78. Which means a =51.65
If ab^2 = 118 then we get a(1.51)^2 = 118 => a = 51.75
Seems pretty close to galactus' answer of a~51.72 and b~1.509, doesn't it?

Once you find an a and b that makes the data fit "closest" you can start playing with things like the ceiling and floor functions or whatever else you have in your tool box.

I'm not sure if this is how DrMike or galactus did it though.


edit to add: To find the best b from your given data, you should calculate every potential b as I started above. Throw out any outliers. Then, say you have \(\displaystyle b_1,b_2,...,b_n\) good values; choose \(\displaystyle b = \sqrt[n]{b_1b_2\cdots b_n}\). That will give you the average growth rate of an exponential function (it's the geometric mean).

Of course, it's possible that none of your data seems to fit, or some seems to but most doesn't. That simply means you've probably chosen the wrong regression curve.

I can read a bit of this, but not all. I'm sure if I looked up a few definitions and the theories, I could understand more.
mmm4444bot said:
baddtude said:
… I had expected a simpler answer, like a formula that would allow me to reverse calculate, and provide me with the formula that I needed to enter into the spreadsheet …


No such puppy.

But, we could "reverse" the formula cost = a * b^level to get the inverse formula level = log_b(cost/a) !

Post your other data, and I'll give you equations in the form cost = a * b^level.

The first set of numbers I provided was for only one cost, as there are actually two. The costs look more like this:

Level, Pig iron, Kryptonite
1..... 78........ 25
2..... 118....... 38
3..... 178....... 58
4..... 269....... 87
5..... 406....... 130
6..... 613....... 197
7..... 925....... 297
8..... 1,397..... 448
9..... 2,109..... 676
10.... 3,184..... 1,021

DrMike's equation worked on the second cost as well, but only by changing one value. Instead of "cost = ceil(78 *(1.51^(level-1)))", I simply changed the 78 to 25, as such: "cost = ceil(25 *(1.51^(level-1)))". This worked for nearly every construct's building cost. Unfortunately, that is as far as that formula would hold up. The full table those numbers belong to is:

Pig Iron Mine
Level, Pig iron, Kryptonite, Construction time, Output/h, Energy/h
1.... 78...... 25..... 0 00:01:33..... 27.... -4
2.... 118..... 38..... 0 00:02:21..... 63.... -10
3.... 178..... 58...... 0 00:03:33 ..... 109... -18
4.... 269..... 87...... 0 00:05:21 ..... 167... -27
5.... 406..... 130..... 0 00:08:03 ..... 241... -40
6.... 613..... 197..... 0 00:12:10 ..... 333... -55
7.... 925..... 297..... 0 00:18:21 ..... 446... -74
8.... 1,397... 448..... 0 00:27:42...... 587... -97
9.... 2,109... 676...... 0 00:41:49...... 759... -126
10... 3,184... 1,021.... 0 01:03:08...... 970... -161

Level is the level of the construct.
The next 2 columns are construction costs (where DrMike's formula worked)
The 4th column is construction time in the format (d = day, h = hour, m = minute, s = second) d hh:mm:ss
5th column represents the production per hour
6th column represents energy consumption per hour (or upkeep cost)

Most of these I thought would be easy to figure out (I never wanted to bother figuring out the time growth, but seeing the complexity of just finding the curve on plain numbers has certainly discouraged any thoughts of doing so on time values).

What started out as wanting to put together a simple (not totally simple) spreadsheet, where I could enter a level for the construct and have the stats display for me and possibly even calculate the values to give cumulative information (i.e. the amount of production over 2 hours, or the cost of production of multiple levels, like the cost to build both levels 1 and 2), has had its own growth to the power of egads. :shock:
 


For "Pig Iron", we can take the ceiling of cost using: cost = 78 * 1.51^(Level - 1)

For "Kryptonite", we can take the ceiling of cost using: cost = 25 * 1.51^(Level - 1)

For "Construction Time", I first converted all times to seconds, using the following.

1 hour = 3600 seconds
1 minute = 60 seconds

Given an amount of time in h:m:s, the total number of seconds is: S = 3600*h + 60*m + s

Here's one way to convert total seconds S back into h:m:s: first use S to find h, then use S and h to find m, then use S, h, and m to find s.

h = floor(S/3600)

m = floor(S/60 - 60h)

s = S - 3600h - 60m

NOTE: The floor(x) function rounds x down to the nearest Integer; in some applications, the floor(x) function is called trunc(x) or int(x).
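In code, that round trip might look like the small Python sketch below (the same arithmetic, just to make the bookkeeping concrete):

```python
def to_seconds(h, m, s):
    # S = 3600*h + 60*m + s
    return 3600 * h + 60 * m + s

def from_seconds(S):
    h = S // 3600              # floor(S/3600)
    m = (S - 3600 * h) // 60   # floor(S/60 - 60h)
    s = S - 3600 * h - 60 * m  # whatever seconds remain
    return h, m, s

print(to_seconds(1, 3, 8))   # 3788
print(from_seconds(3788))    # (1, 3, 8)
```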

So, for "Construction Time", we can round time using: time = 93.2846 * 1.5091^(L - 1)

Some of the times generated by this function are off by +1 second or -1 second. I hope that's not a critical error! 8-)

Here are the times, actual and (generated):

00:01:33 --> S = 0093 sec (0093)
00:02:21 --> S = 0141 sec (0141)
00:03:33 --> S = 0213 sec (0212)
00:05:21 --> S = 0321 sec (0321)
00:08:03 --> S = 0483 sec (0484)
00:12:10 --> S = 0730 sec (0730)
00:18:21 --> S = 1101 sec (1102)
00:27:42 --> S = 1662 sec (1663)
00:41:49 --> S = 2509 sec (2509)
01:03:08 --> S = 3788 sec (3789)

The "Output/h" and "Energy/h" data do not follow exponential growth, so I used a different regression.

Quartic Regression fits the data to a fourth-degree polynomial: y = a*x^4 + b*x^3 + c*x^2 + d*x + e.

For "Output/h", we can round output using: output = 0.0293*Level^4 + 0.0121*Level^3 + 4.4716*Level^2 + 21.6857*Level + 0.9167

For "Energy/h", we can round energy using: energy = -0.0052*Level^4 + 0.0051*Level^3 - 0.7815*Level^2 - 3.5796*Level + 0.3333

Cheers,

~ Mark

 
mmm4444bot said:


For "Construction Time", I first converted all times to seconds, using the following.

Some of the times generated by this function are off by +1 second or -1 second. I hope that's not a critical error! 8-)

The "Output/h" and "Energy/h" data do not follow exponential growth, so I used a different regression.

Quartic Regression fits the data to a fourth-degree polynomial: y = a*x^4 + b*x^3 + c*x^2 + d*x + e.

For "Output/h", we can round output using: output = 0.0293*Level^4 + 0.0121*Level^3 + 4.4716*Level^2 + 21.6857*Level + 0.9167

For "Energy/h", we can round energy using: energy = -0.0052*Level^4 + 0.0051*Level^3 - 0.7815*Level^2 - 3.5796*Level + 0.3333


Wow, thanks. It wasn't necessary to do the time, and therefore it's not critical to be exact. I, too, had the idea that the time would first need to be converted into seconds.

On the output and energy equations, I had been toying with the applets' functions; I might have even tried the formula you provided. To me, as I look at the answers given by the applets, I didn't get how to use them. I believe that's why I felt I was getting incomplete information. I think if I had seen an answer that read something like "a = 0.0293, b = 0.0121, etc." along with the formula needed, "y = a*x .... d*x + e", I'd have understood the answer enough to use it.

I installed a 'Data Analysis' add-on to Excel and entered the output numbers and level while running the numbers through each of the applet's various regressions to find correlation that could make it easier to find the correct formula. Here's what that looked like:

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.912200356
R Square 0.83210949
Adjusted R Square 0.822233577
Standard Error 818.4166139
Observations 19

ANOVA
df... SS... MS... F... Significance F
Regression... 1... 56435466.71... 56435466.71... 84.25646747... 5.35127E-08
Residual... 17... 11386697.82... 669805.7539...
Total... 18... 67822164.53

Coefficients... Standard Error... t Stat... P-value... Lower 95%... Upper 95%... Lower 95.0%... Upper 95.0%...
Intercept... -1382.736842... 390.8487122... -3.537780218... 0.002528187... -2207.356695... -558.116989... -2207.356695... -558.116989
Level... 314.6578947... 34.27969973... 9.179132174... 5.35127E-08... 242.3339492... 386.9818402... 242.3339492... 386.9818402



RESIDUAL OUTPUT...

Observation... Predicted Output /h... Residuals... Standard Residuals...
1... -1068.078947... 1095.078947... 1.376837813...
2... -753.4210526... 816.4210526... 1.026482501...
3... -438.7631579... 547.7631579... 0.688700144...
4... -124.1052632... 291.1052632... 0.366005332...
5... 190.5526316... 50.44736842... 0.063427248...
6... 505.2105263... -172.2105263... -0.216519517...
7... 819.8684211... -373.8684211... -0.470063077...
8... 1134.526316... -547.5263158... -0.688402363...
9... 1449.184211... -690.1842105... -0.86776549...
10... 1763.842105... -793.8421053... -0.998094092...
11... 2078.5... -850.5... -1.069329807...
12... 2393.157895... -853.1578947... -1.072671566...
13... 2707.815789... -788.8157895... -0.991774528...
14... 3022.473684... -645.4736842... -0.811551147...
15... 3337.131579... -408.1315789... -0.51314199...
16... 3651.789474... -58.78947368... -0.07391574...
17... 3966.447368... 423.5526316... 0.53253081...
18... 4281.105263... 1064.894737... 1.338887341...
19... 4595.763158... 1893.236842... 2.380358128...

PROBABILITY OUTPUT

Percentile... Output /h
2.631578947... 27
7.894736842... 63
13.15789474... 109
18.42105263... 167
23.68421053... 241
28.94736842... 333
34.21052632... 446
39.47368421... 587
44.73684211... 759
50... 970
55.26315789... 1228
60.52631579... 1540
65.78947368... 1919
71.05263158... 2377
76.31578947... 2929
81.57894737... 3593
86.84210526... 4390
92.10526316... 5346
97.36842105... 6489

Sorry, I haven't figured out a better way to display a table on this forum.

Anyway, as you can see, a lot of information is displayed, but no formula. Applets have formulas to use, and I was starting to compare the answers to find one that had enough in common to use in my spreadsheet. Am I just making it harder than it is? Or was I actually onto something there? :?

Regardless, I do appreciate your help. I hope that is all the growth curves I'll need to calculate.
 
baddtude said:
So, may I ask, how did you discover the formula? For future reference, it could come in handy to know how to do this.

Step 1 : I typed all the numbers into a spreadsheet, and plotted them on a graph with a logarithmic scale for the y-axis. This confirmed that it's pretty much exponential (the line was straight).

Step 2 : I made some spreadsheet columns with formulae like
* cost(level+1) = round($B$1*cost(level))
* cost(level) = round(78*$C$1^(level-1))
AND (most important) columns showing the errors between my formulae and the given data

Step 3 : Then, I adjusted the parameters in the formulae, trying to minimise the number of errors.

Step 4 : When I couldn't make the number of errors zero, I tried cost(level) = ceil(78*$C$1^(level-1)) instead. This gave zero errors when $C$1 was between about 1.509999 and 1.510001.

Step 5 : Post the formula to the forum... :)
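The same "adjust the parameter until the errors vanish" idea can also be scripted rather than done by hand. Here is a rough Python sketch of that search on the ten Pig Iron rows posted above (an illustration of steps 3 and 4 only, not the spreadsheet described here; with more levels of data, the surviving band of values narrows):

```python
import math

costs = [78, 118, 178, 269, 406, 613, 925, 1397, 2109, 3184]

def errors(b):
    # count the levels where ceil(78 * b^(level-1)) misses the given cost
    # (enumerate starts at 0, which plays the role of level-1)
    return sum(math.ceil(78 * b ** k) != c for k, c in enumerate(costs))

# Scan candidate growth factors between 1.5 and 1.52 and keep the exact matches.
candidates = (1.5 + i / 1_000_000 for i in range(20_001))
matches = [b for b in candidates if errors(b) == 0]
print(min(matches), max(matches))  # the surviving band should bracket 1.51
```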
 
DrMike said:
Step 1 : I typed all the numbers into a spreadsheet, and plotted them on a graph with a logarithmic scale for the y-axis. This confirmed that it's pretty much exponential (the line was straight).

Step 2 : I made some spreadsheet columns with formulae like
* cost(level+1) = round($B$1*cost(level))
* cost(level) = round(78*$C$1^(level-1))
AND (most important) columns showing the errors between my formulae and the given data

Step 3 : Then, I adjusted the parameters in the formulae, trying to minimise the number of errors.

Step 4 : When I couldn't make the number of errors zero, I tried cost(level) = ceil(78*$C$1^(level-1)) instead. This gave zero errors when $C$1 was between about 1.509999 and 1.510001.

Step 5 : Post the formula to the forum... :)

Ok, I tried this, but I get an output of 'FALSE' instead of a number.
 
baddtude said:
DrMike said:
Step 1 : I typed all the numbers into a spreadsheet, and plotted them on a graph with a logarithmic scale for the y-axis. This confirmed that it's pretty much exponential (the line was straight).

Step 2 : I made some spreadsheet columns with formulae like
* cost(level+1) = round($B$1*cost(level))
* cost(level) = round(78*$C$1^(level-1))
AND (most important) columns showing the errors between my formulae and the given data

Step 3 : Then, I adjusted the parameters in the formulae, trying to minimise the number of errors.

Step 4 : When I couldn't make the number of errors zero, I tried cost(level) = ceil(78*$C$1^(level-1)) instead. This gave zero errors when $C$1 was between about 1.509999 and 1.510001.

Step 5 : Post the formula to the forum... :)

Ok, I tried this, but I get an output of 'FALSE' instead of a number.

Sorry for the confusion. The stuff I typed above is not the literal formulae I used, but an indication of what they meant. Literally, it would have been more like

=ceil(78*$c$1^(a3-1)) in cell C3
 
DrMike said:
Step 1 : I typed all the numbers into a spreadsheet, and plotted them on a graph with a logarithmic scale for the y-axis. This confirmed that it's pretty much exponential (the line was straight).

What would it have meant had the line not been straight, but, instead, had a curve?
 
baddtude said:
What would it have meant had the line not been straight, but, instead, had a curve?


I'm thinking that it means the original data (i.e., not the logarithmic counterparts) do not represent exponential growth. In other words, the data points don't lie close to any exponential curve y = a*b^x.

I've never taken the time to learn about the benefits of switching to logarithmic scales, so I'm not 100% sure; but, I'm 100% sure that Mike will post comments, if there's more to your question than what I've answered. 8-)

 
My applied engineering math professor explained to us -

If a data-set is not linear - take log. If it is not linear still - take log again - keep on repeating - eventually your data set may become just a POINT. Log-Log scale will hide all the sins of your data collection scheme. Now you can analyze the heck out of it - the way you want it .... :twisted:

But that aside - a logarithmic scale contracts the large values (or expands them, depending on the nature of the scale), whereas the opposite is done to the smaller values.
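To make the "straight line on a log scale" idea concrete without any plotting: taking logs of y = a*b^x gives ln(y) = ln(a) + x*ln(b), which is linear in x, so consecutive differences of ln(y) should be roughly constant. A quick check on the Pig Iron costs (illustration only):

```python
import math

costs = [78, 118, 178, 269, 406, 613, 925, 1397, 2109, 3184]
logs = [math.log(c) for c in costs]

# For exponential data, these differences all equal ln(b);
# here they hover around ln(1.51), roughly 0.412.
diffs = [round(logs[i + 1] - logs[i], 3) for i in range(len(logs) - 1)]
print(diffs)
```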
 

mmm4444bot said:
baddtude said:
What would it have meant had the line not been straight, but, instead, had a curve?


I'm thinking that it means the original data (i.e., not the logarithmic counterparts) do not represent exponential growth. In other words, the data points don't lie close to any exponential curve y = a*b^x.

Exactimundo... :)
 

DrMike said:
mmm4444bot said:
baddtude said:
What would it have meant had the line not been straight, but, instead, had a curve?


I'm thinking that it means the original data (i.e., not the logarithmic counterparts) do not represent exponential growth. In other words, the data points don't lie close to any exponential curve y = a*b^x.

Exactimundo... :)

Ok, I kind of understood that when it was said that the line being straight means exponential growth. I was actually hoping for an answer explaining what a curved line would represent.
 
baddtude said:
… hoping for an answer explaining what a curved line did represent.


I'm not sure if we can always deduce from such a curve (obtained after switching to logarithmic scales) the nature of the original data.

Mike or Subhotosh will need to elaborate on this, but I see at least two possibilities.

(1) The resulting curve could indicate that the original data does not fall into any basic family of functions (eg: power, exponential, quadratic, cubic, quartic, root)

(2) The resulting curve would need to be analyzed or identified, in order to determine in which basic family of functions the original data belongs

I'll add learning about logarithmic scales to my "rainy-day list", but I never seem to get around to spending much time with that list. (Can I blame that on global warming?)

 

mmm4444bot said:
baddtude said:
… hoping for an answer explaining what a curved line did represent.


I'm not sure if we can always deduce from such a curve (obtained after switching to logarithmic scales) the nature of the original data.

Mike or Subhotosh will need to elaborate on this, but I see at least two possibilities.

(1) The resulting curve could indicate that the original data does not fall into any basic family of functions (eg: power, exponential, quadratic, cubic, quartic, root)

(2) The resulting curve would need to be analyzed or identified, in order to determine in which basic family of functions the original data belongs


either of these would do...

I'll add learning about logarithmic scales to my "rainy-day list", but I never seem to get around to spending much time with that list. (Can I blame that on global warming?)

not if you live in the tropics
 