If the OP really wants to calculate this efficiently on a low-end part, the best way would be to normalize everything into a custom floating-point format, i.e. a 32-bit mantissa representing 0.5 up to just under 1.0 (0x80000000 to 0xFFFFFFFF) and an exponent (maybe from 2^-32768 to 2^32767).
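To make the format concrete, here is a minimal sketch of how such a value could be held and decoded. The names `cfloat` and `cfloat_to_double` are made up for illustration, and the sketch assumes a signed 16-bit exponent and that the mantissa is read as the fraction mantissa / 2^32, so the value is mantissa / 2^32 * 2^exponent. The conversion to `double` is only meant for checking results on a PC, not for running on the low-end part.

```c
#include <stdint.h>

/* Hypothetical container for the custom format (not from the original post). */
typedef struct {
    uint32_t mantissa;  /* 0x80000000..0xFFFFFFFF when normalized, i.e. [0.5, 1.0); 0 means zero */
    int16_t  exponent;  /* power of two applied to the fraction (assumed signed) */
} cfloat;

/* Decode to double for host-side checking: value = mantissa / 2^32 * 2^exponent.
   Exponents beyond the range of double simply run to infinity or zero here. */
double cfloat_to_double(cfloat v)
{
    double d = (double)v.mantissa / 4294967296.0;  /* fraction part */
    int e = v.exponent;

    while (e > 0) { d *= 2.0; e--; }
    while (e < 0) { d /= 2.0; e++; }
    return d;
}
```

For instance, mantissa 0xC0000000 with exponent 0 decodes to 0.75, and mantissa 0x80000000 with exponent 1 decodes to 1.0.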
Write a routine to normalize the numbers...
void normalize(u32 *mantissa, u16 *exponent) {
    if(*mantissa == 0) {
        *exponent = 0;
        return;
    }
    /* shift left until the top bit is set, i.e. mantissa is in 0x80000000..0xFFFFFFFF */
    while((*mantissa & 0x80000000) == 0) {
        *mantissa <<= 1;
        (*exponent)--;
    }
}
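Then multiplication becomes:
void multiply(u32 *mantissa_1, u16 *exponent_1, u32 *mantissa_2, u16 *exponent_2) {
    /* 32x32 -> 64-bit product, keep the top 32 bits; the result replaces operand 1 */
    *mantissa_1 = (u32)(((u64)(*mantissa_1) * (u64)*mantissa_2) >> 32);
    *exponent_1 = *exponent_1 + *exponent_2;
    /* the product of two values in [0.5, 1) lies in [0.25, 1), so at most one shift is needed */
    normalize(mantissa_1, exponent_1);
}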
Addition is a bit trickier, as you have to bring both numbers to a common exponent (whichever is higher), then add the mantissas, and allow for the sum overflowing 32 bits.
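One possible way to write that `add` routine is sketched below. It is only a sketch under a few assumptions that the post does not spell out: both inputs are already normalized (or zero), only non-negative values are represented, the `u16` exponents are read as signed 16-bit values so that "highest" compares correctly, and bits shifted out during alignment are simply truncated. The `u16`/`u32`/`u64` typedefs are assumed to map onto the `<stdint.h>` types.

```c
#include <stdint.h>

typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

/* Sketch of add: the result replaces operand 1, operand 2 is left untouched.
   Exponent overflow is not checked. */
void add(u32 *mantissa_1, u16 *exponent_1, u32 *mantissa_2, u16 *exponent_2)
{
    /* Read the u16 exponents as signed (assumption, see above). */
    int e1 = (int16_t)*exponent_1;
    int e2 = (int16_t)*exponent_2;
    u64 m1 = *mantissa_1;
    u64 m2 = *mantissa_2;
    u64 sum;
    int e;

    if (m2 == 0)                  /* x + 0 = x */
        return;
    if (m1 == 0) {                /* 0 + y = y */
        *mantissa_1 = *mantissa_2;
        *exponent_1 = *exponent_2;
        return;
    }

    /* Shift the operand with the smaller exponent right to line it up. */
    if (e1 >= e2) {
        e = e1;
        m2 = (e1 - e2 < 32) ? (m2 >> (e1 - e2)) : 0;
    } else {
        e = e2;
        m1 = (e2 - e1 < 32) ? (m1 >> (e2 - e1)) : 0;
    }

    sum = m1 + m2;                /* may carry into bit 32 */
    if (sum >> 32) {              /* overflow: halve the mantissa, bump the exponent */
        sum >>= 1;
        e++;
    }

    *mantissa_1 = (u32)sum;
    *exponent_1 = (u16)e;
}
```

Because the larger operand already has its top bit set and the values are non-negative, the sum needs no left-shift normalization afterwards, only the single right shift on overflow. As a quick check, 0.5 + 0.5 (both 0x80000000 with exponent 0) comes out as mantissa 0x80000000 with exponent 1, which is 1.0.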
With that infrastructure the calculation becomes:
void calc(u32 *m_x, u16 *e_x) {
    u32 m_temp1, m_temp2;
    u16 e_temp1, e_temp2;
    /* m_const1/e_const1 and m_const2/e_const2 are assumed to hold the two constants in the same format */
    m_temp1 = *m_x;
    e_temp1 = *e_x; /* temp1 = x */
    multiply(&m_temp1, &e_temp1, m_x, e_x); /* temp1 = x^2 */
    multiply(&m_temp1, &e_temp1, m_x, e_x); /* temp1 = x^3 */
    multiply(&m_temp1, &e_temp1, m_x, e_x); /* temp1 = x^4 */
    m_temp2 = m_temp1; /* temp2 = x^4 */
    e_temp2 = e_temp1;
    multiply(&m_temp1, &e_temp1, m_x, e_x); /* temp1 = x^5 */
    multiply(&m_temp1, &e_temp1, &m_const1, &e_const1); /* temp1 = const1 * x^5 */
    multiply(&m_temp2, &e_temp2, &m_const2, &e_const2); /* temp2 = const2 * x^4 */
    add(&m_temp1, &e_temp1, &m_temp2, &e_temp2); /* temp1 = const1 * x^5 + const2 * x^4 */
    /* K is now in m_temp1 & e_temp1 */
}